Tagged PDF documents
The objective of this technique is to specify the language of a passage,
phrase, or word using the /Lang entry to provide information in the
PDF document that user agents need to present text and other linguistic
content correctly. This is normally accomplished using a tool for authoring
PDF.
Both assistive technologies and conventional user agents can render text more accurately when the language is identified. Screen readers can load the correct pronunciation rules. As a result, users with disabilities are better able to understand the content.
This technique can also set the default language for the entire document, provided the whole document is contained in the container or tag that carries the /Lang entry. In that case, the technique applies to Success Criterion 3.1.1.
Adding a /Lang entry to specify the language for a paragraph using Adobe Acrobat Pro
This example is shown with Adobe Acrobat Pro. There are other software tools that perform similar functions.
Acrobat includes numerous preset language selections. To specify a language that is not on the list, such as Russian, type the ISO 639 code for the language rather than its name.
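One way to catch a language name typed where a code is expected is a simple shape check. The following Python sketch is an illustration only and is not part of Acrobat. The function name `looks_like_language_code` is hypothetical, and the pattern accepts any well-shaped value without confirming that it is a registered code.

```python
import re

# Rough shape of a language tag: a 2- or 3-letter primary subtag,
# optionally followed by hyphen-separated subtags such as a region.
# Illustrative assumption: this is not a full BCP 47 validator.
_TAG_SHAPE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*")


def looks_like_language_code(value: str) -> bool:
    """Return True if value is shaped like a language code rather than a name."""
    return _TAG_SHAPE.fullmatch(value.strip()) is not None


print(looks_like_language_code("ru"))       # True
print(looks_like_language_code("es-MX"))    # True
print(looks_like_language_code("Russian"))  # False
```

A check like this only flags obvious mistakes, such as typing "Russian" in place of a code. It does not replace inspecting the resulting /Lang value in the document.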
Adding a /Lang entry to specify the language for a specific word or phrase using Adobe Acrobat Pro
This example is shown with Adobe Acrobat Pro. There are other software tools that perform similar functions.
When you tag a word or phrase, Acrobat splits the original content into three document content tags: one for the text that precedes your selection, one for the selection, and one for the text that follows the selection. As needed, drag the document content tag for the selected text into position between the other two tags, so that the text reads in the proper order. All three tags must also be at the same level beneath their parent tag. Drag them into place if they are not.
This example is shown in operation in the working example of marking a specific word or phrase in Acrobat Pro (PDF).
Specifying the language with a /Lang entry
Below the level of the default document language, the language for a passage may be specified for the following items:
- A marked-content sequence, through a /Lang entry in the property list attached to a Span tag.
- A structure element, through a /Lang entry in the structure element dictionary.
The following code fragment is typical of using the /Lang entry to override the default document language for a marked-content sequence within a page's content stream:
/P                            % Start of marked-content sequence (no property list, so BMC)
BMC
  (See you later, or in Spanish you would say, ) Tj
  /Span << /Lang (es-MX) >>   % Start of nested marked-content sequence
  BDC
    (Hasta la vista.) Tj
  EMC                         % End of nested marked-content sequence
EMC                           % End of marked-content sequence
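To make the nesting rule concrete, the following Python sketch scans a content stream like the one above and reports which language applies to each text run. It is a minimal illustration under stated assumptions, not a PDF parser. It assumes an uncompressed stream, comments that begin with %, string literals that contain neither % nor nested parentheses, and text shown only with Tj. The function name `text_runs_with_language` and the default language en-US are hypothetical.

```python
import re

# Assumption: a % always starts a comment (no % inside string literals).
COMMENT = re.compile(r"%[^\n]*")

# Three token kinds: the start of a marked-content sequence (BMC or BDC,
# with an optional property list), a text-showing operation, and EMC.
TOKEN = re.compile(
    r"/(?P<tag>\w+)\s*(?:<<(?P<props>.*?)>>)?\s*(?P<begin>BMC|BDC)\b"
    r"|\((?P<text>(?:\\.|[^\\()])*)\)\s*Tj\b"
    r"|\b(?P<end>EMC)\b",
    re.S,
)
LANG = re.compile(r"/Lang\s*\(([^)]*)\)")


def text_runs_with_language(stream: str, default_lang: str):
    """Return (language, text) pairs; the innermost /Lang wins."""
    stack = [default_lang]
    runs = []
    for match in TOKEN.finditer(COMMENT.sub("", stream)):
        if match.group("begin"):
            lang = LANG.search(match.group("props") or "")
            # A sequence without /Lang inherits the enclosing language.
            stack.append(lang.group(1) if lang else stack[-1])
        elif match.group("end"):
            if len(stack) > 1:
                stack.pop()
        else:
            runs.append((stack[-1], match.group("text")))
    return runs


SAMPLE = r"""
/P % Start of marked-content sequence
BMC
(See you later, or in Spanish you would say, ) Tj
/Span << /Lang (es-MX) >> % Start of nested marked-content sequence
BDC
(Hasta la vista.) Tj
EMC % End of nested marked-content sequence
EMC % End of marked-content sequence
"""

for lang, text in text_runs_with_language(SAMPLE, "en-US"):
    print(lang, repr(text))
```

The stack mirrors the nesting of marked-content sequences: each start operator pushes the language in effect, and each EMC restores the enclosing one. The scanner accepts either BMC or BDC as a start operator.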
The following code fragment illustrates code that is typical for using the /Lang entry in the structure element dictionary. In this case, the /Lang entry applies to the marked-content sequence having an MCID (marked-content identifier) value of 0 within the indicated page's content stream.
1 0 obj                       % Structure element
<< /Type /StructElem
   /S /Span                   % Structure type
   /P …                       % Parent in structure hierarchy (indirect reference)
   /K << /Type /MCR
         /Pg 2 0 R            % Page containing marked-content sequence
         /MCID 0              % Marked-content identifier
      >>
   /Lang (es-MX)              % Language specification for this element
>>
endobj
2 0 obj                       % Page object
<< /Type /Page
   /Contents 3 0 R            % Content stream
   …
>>
endobj
3 0 obj                       % Page's content stream
<< /Length … >>
stream
BT
  /P                          % Start of marked-content sequence (no property list, so BMC)
  BMC
    (See you later, or in Spanish you would say, ) Tj
    /Span << /MCID 0 >>       % Start of nested marked-content sequence
    BDC
      (Hasta la vista.) Tj
    EMC                       % End of nested marked-content sequence
  EMC                         % End of marked-content sequence
ET
endstream
endobj
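The link between a marked-content identifier and its language can be sketched in Python by modelling the structure elements as plain dictionaries. This is a simplified illustration, not a reader for real PDF files. It assumes each element has at most one marked-content reference in /K, represents the page reference as the string "2 0 R", and uses en-US as a hypothetical document default. The parent paragraph element is a stand-in, since the fragment above does not show it, and the function names are hypothetical.

```python
def effective_lang(elem, default_lang):
    """Walk up the structure hierarchy until an element carries Lang."""
    node = elem
    while node is not None:
        if "Lang" in node:
            return node["Lang"]
        node = node.get("P")  # parent element, or None at the top
    return default_lang


def lang_for_mcid(elements, page, mcid, default_lang):
    """Find the element that references (page, mcid) and resolve its language."""
    for elem in elements:
        kid = elem.get("K")
        if (isinstance(kid, dict) and kid.get("Type") == "MCR"
                and kid.get("Pg") == page and kid.get("MCID") == mcid):
            return effective_lang(elem, default_lang)
    return default_lang


# Hypothetical parent paragraph with no Lang entry of its own.
paragraph = {"S": "P", "P": None}

# Mirrors object 1 0 in the fragment above.
span = {
    "S": "Span",
    "P": paragraph,
    "K": {"Type": "MCR", "Pg": "2 0 R", "MCID": 0},
    "Lang": "es-MX",
}

print(lang_for_mcid([paragraph, span], "2 0 R", 0, "en-US"))  # es-MX
print(lang_for_mcid([paragraph, span], "2 0 R", 1, "en-US"))  # en-US
```

Content that is not referenced by any element with a Lang entry, directly or through an ancestor, falls back to the document default.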
Verify that the language of a passage, phrase, or word that differs
from the language of the surrounding text is correctly specified
by a /Lang entry on an enclosing tag or container:
- Use a tool capable of showing the /Lang entry value to open the PDF document and view the language settings.