Digital Content Primer
How Files Structure Data
HTML is plain text. A human can read it directly. Tags like <h1> and <p> give information explicit, standardized labels.
<!DOCTYPE html>
<html>
<body>
<h1>Lesson Title</h1>
<p>This is a paragraph.</p>
</body>
</html>
Because HTML is an open standard, maintained by the WHATWG in collaboration with the W3C, the set of labels is public. You can read the file in any text editor, though it is normally rendered by a web browser.
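Because the labels are plain text, a program can read them as easily as a person can. The following is a minimal sketch, assuming Python's standard `html.parser` module; the `LabelCollector` class and `collect_labels` function are illustrative names, not part of any standard. It pairs each piece of text in the sample above with the tag that labels it.

```python
from html.parser import HTMLParser

HTML_SOURCE = """<!DOCTYPE html>
<html>
<body>
<h1>Lesson Title</h1>
<p>This is a paragraph.</p>
</body>
</html>"""


class LabelCollector(HTMLParser):
    """Records (tag, text) pairs for every element that directly contains text."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open tags
        self.labeled = []    # (tag, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open tag.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.open_tags:
            self.labeled.append((self.open_tags[-1], text))


def collect_labels(source):
    parser = LabelCollector()
    parser.feed(source)
    parser.close()
    return parser.labeled


labels = collect_labels(HTML_SOURCE)
print(labels)
```

The result is a list of label and text pairs: the heading is known to be a heading and the paragraph a paragraph, without any guessing from appearance.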
Office files are compressed archives. Inside a .docx or .pptx is actually a folder of XML files, zipped together.
PK\x03\x04... [Compressed Header]
word/document.xml
word/styles.xml
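[Content as XML inside the archive]
The structure exists (headings, paragraphs, images), but it is expressed in Microsoft's Office Open XML schema. The schema is published as a standard (ECMA-376 and ISO/IEC 29500), yet it is large and Microsoft-specific, so other software supports it unevenly. Its labels don't map one-to-one onto the labels of other formats. Opening the file in a text editor shows only compressed bytes; it is meant to be rendered and edited inside an office application, usually Microsoft's.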
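To make the "zipped folder of XML" idea concrete, here is a minimal sketch using Python's standard `zipfile` module. It builds a tiny archive in memory and then inspects it. The contents of `PARTS` are a hypothetical, heavily simplified stand-in: a real .docx has more parts and a far richer XML vocabulary.

```python
import io
import zipfile

# Hypothetical, simplified stand-in for the parts inside a .docx file.
PARTS = {
    "word/document.xml": (
        "<w:document><w:body><w:p><w:r><w:t>Lesson Title</w:t></w:r></w:p>"
        "</w:body></w:document>"
    ),
    "word/styles.xml": "<w:styles/>",
}


def build_archive(parts):
    """Zip the given name -> XML mapping into bytes, the way an Office file is packed."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, xml in parts.items():
            archive.writestr(name, xml)
    return buffer.getvalue()


def inspect_archive(data):
    """Return the first four bytes, the list of parts, and the main document XML."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        document_xml = archive.read("word/document.xml").decode("utf-8")
    return data[:4], names, document_xml


signature, names, document_xml = inspect_archive(build_archive(PARTS))
print(signature)     # the "PK" marker that a text editor shows as gibberish
print(names)         # the files packed inside the archive
print(document_xml)  # readable XML, but only after unzipping
```

The same `inspect_archive` approach should work on the bytes of a real .docx, though the XML it returns will be far longer and harder to read than this sample.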
PDFs are primarily page descriptions. They describe where ink goes on a page, not what the content means. They can accommodate semantic structure via “tags,” a schema Adobe added to the format later.
%PDF-1.7
1 0 obj << /Type /Catalog /Pages 2 0 R >>
stream
[Binary image and font data]
%%EOF
“Tagged PDFs” add a semantic layer, but it is bolted on, not fundamental to the format. Many PDFs lack it entirely, and you need special software to tell whether that layer is present. PDFs can be rendered by many applications, but editing them requires special programs and is limited.
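As a rough illustration of what "checking for the tag layer" means, the sketch below scans raw bytes for the two catalog entries that tagged PDFs declare, `/MarkInfo` and `/StructTreeRoot`. This is a heuristic only, and the function name `looks_tagged` is made up for this example. The two byte strings are simplified fragments, not valid PDF files, and in real files the catalog may sit inside a compressed object stream where a byte scan cannot see it. A dedicated accessibility checker remains the reliable option.

```python
def looks_tagged(pdf_bytes):
    """Heuristic: True if the raw bytes mention both tagged-PDF markers."""
    return b"/StructTreeRoot" in pdf_bytes and b"/MarkInfo" in pdf_bytes


# Simplified fragments for illustration only; these are not complete PDFs.
UNTAGGED = b"%PDF-1.7\n1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj\n%%EOF"
TAGGED = (
    b"%PDF-1.7\n1 0 obj << /Type /Catalog /Pages 2 0 R "
    b"/MarkInfo << /Marked true >> /StructTreeRoot 3 0 R >> endobj\n%%EOF"
)

print(looks_tagged(UNTAGGED))
print(looks_tagged(TAGGED))
```

Even when the markers are present, this says nothing about whether the tags are correct or complete, only that a semantic layer was declared.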
What Markup Actually Does
Enables Logical Structures
Headings can provide outline structures and thematic sections.
Establishes Relationships
Items can be parts of sets, like tables or lists.
Allows Navigation in Time
The structure and relationships of markup turn a stream of raw text into an interactive structure that can be navigated in time. Instead of being read like an audio cassette, in one uninterrupted stream, the document becomes operable information.
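One way to picture this is the outline that assistive technology can build from headings. The sketch below, again assuming Python's standard `html.parser`, collects heading levels and titles and prints them as an indented table of contents. `OutlineBuilder`, `build_outline`, `render_outline`, and the `SAMPLE` markup are all illustrative names and content, not a description of how any particular screen reader works.

```python
from html.parser import HTMLParser

HEADING_LEVELS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class OutlineBuilder(HTMLParser):
    """Collects (level, title) pairs from heading elements, ignoring other text."""

    def __init__(self):
        super().__init__()
        self.current_level = None  # level of the heading being read, if any
        self.current_text = []
        self.outline = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_LEVELS:
            self.current_level = HEADING_LEVELS[tag]
            self.current_text = []

    def handle_data(self, data):
        if self.current_level is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_LEVELS and self.current_level is not None:
            title = "".join(self.current_text).strip()
            self.outline.append((self.current_level, title))
            self.current_level = None


def build_outline(source):
    builder = OutlineBuilder()
    builder.feed(source)
    builder.close()
    return builder.outline


def render_outline(outline):
    """Indent each title by two spaces per heading level below the first."""
    return "\n".join("  " * (level - 1) + title for level, title in outline)


# Hypothetical sample page.
SAMPLE = """<h1>Digital Content Primer</h1>
<p>Intro text.</p>
<h2>How Files Structure Data</h2>
<p>More text.</p>
<h2>What Markup Actually Does</h2>
<h3>Enables Logical Structures</h3>
<p>Even more text.</p>"""

outline = build_outline(SAMPLE)
print(render_outline(outline))
```

If the same headings were only bold paragraphs, `build_outline` would return an empty list, and a reader would have no way to jump between sections.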
Why Conversions Break Things
When you export to another format, software translates between markup systems. Each format has its own vocabulary, its own rules.
Example: What happens to equations
A common question: “I thought equations were accessible in Word?”
They are—when viewed in Word. But here’s what happens when that file travels:
- In Word: Microsoft understands its own equation markup (OMML). Screen readers can read it.
- Exported to HTML: Word doesn’t speak MathML. The equation gets rendered as an image.
- Exported to PDF: Same problem. The equation becomes a picture.
The equation looks fine, but the structural layer is gone: it is only perceivable visually.
This is why we recommend keeping your original files. The .docx contains the richest version. But what students receive depends on how it travels to them.
This pattern repeats everywhere:
| What you had | What you get |
|---|---|
| Headings | Bold text |
| Lists | Lines with bullet characters |
| Table headers | Plain cells |
| Alt text | Nothing |
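The pattern in the table can be modeled as a vocabulary lookup. The following toy sketch is not how any real converter is implemented; the labels, the `TARGET_EQUIVALENT` mapping, and the `convert` function are hypothetical. It shows the mechanism: each source label is swapped for the closest thing the target format can express, and anything without an equivalent is dropped.

```python
# Hypothetical source document: each item is (label, content).
SOURCE = [
    ("heading", "Lesson Title"),
    ("list_item", "First point"),
    ("table_header", "Week"),
    ("alt_text", "Diagram of a cell"),
]

# Hypothetical target vocabulary. None means the target format has no place
# to store that information at all.
TARGET_EQUIVALENT = {
    "heading": "bold_text",
    "list_item": "line_with_bullet_character",
    "table_header": "plain_cell",
    "alt_text": None,
}


def convert(items, mapping):
    """Translate labels through the mapping; collect what cannot be carried over."""
    converted, dropped = [], []
    for label, content in items:
        target_label = mapping.get(label)
        if target_label is None:
            dropped.append((label, content))
        else:
            converted.append((target_label, content))
    return converted, dropped


converted, dropped = convert(SOURCE, TARGET_EQUIVALENT)
print(converted)
print(dropped)
```

The text of the heading survives, but its label is now "bold_text", so nothing downstream can tell it was ever a heading. The alt text does not survive at all.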
Browsers
Web browsers are increasingly complex applications that render dynamic documents. Some of those documents are complex enough to be applications in their own right, like D2L/Brightspace. Under the hood, web browsers do three important things:
- Render HTML documents (headings, paragraphs, lists, links)
- Render SVG (scalable vector graphics, like the artwork made in Adobe Illustrator) from markup
- Programmatically alter documents via JavaScript
Other Files in Browsers
Browsers are increasingly monolithic and attempt to render and preview everything within themselves, assuming most users want a seamless flow. This can create issues:
- Because each format uses different labels, direct conversion to HTML is not possible.
- The files are loaded in a preview that attempts to render them accurately but cannot replicate an entire program.
- These previews lose text reflow and pose at least subtle usability issues (and sometimes WCAG failures) for everyone.
- The user must now decide whether to view the preview or download the file and open it in another program.
- The desktop applications aren’t always available to the user.
- Different browsers render previews differently.
Takeaways
Consider Authoring New Documents in HTML
Unlike in other software, accessibility features in HTML aren’t proprietary or plug-ins. They are open, inherent in HTML documents, and built by a much broader community.
Take Care When Converting or Exporting
Share original files if possible. Verify PDFs with accessibility checkers.
Express How You Intend Your Documents to Be Viewed
If you authored accessibility features into your PowerPoints, make sure your students know they should use the PowerPoint desktop application to view them.
When in Doubt, Try to Picture Your Students’ Digital Experience
What devices are they using? How are they studying? Are the files you are using creating a richer experience, or potential barriers?