26 June 2026 15:00 UTC via Zoom
Present:
Regrets:
This meeting was the inaugural session of a new W3C community group focused on defining use cases and developing accessible solutions for EPUB media overlays, which synchronize audio and video with document content. The group members, including representatives from DAISY Consortium, Benetech, Celia, Swedish Agency for Accessible Media (MTM), EDRLab, and other development organizations, introduced themselves and discussed how they would work together using the group’s GitHub issue tracker and wiki for requirements gathering. Key discussion points included the need for skippability and escapability features in media overlays, challenges with TTS language support for non-Western languages, requirements for speaking structural elements like page numbers and table descriptions, and the potential use of CSS selectors and text fragments as alternatives to CFI (Canonical Fragment Identifier), which has not been deprecated but now sees little use in EPUB. The group also explored the gap between TTS read-aloud functionality and media overlay synchronization, particularly around providing semantic information that isn’t present in the text content.
Marisa introduced the purpose of the group, which is to define use cases and work on accessible solutions for EPUB media overlays, particularly focusing on audio and video synchronization with document content. She noted that the group will be guided by existing accessibility standards and will liaise with the Publishing Maintenance Working Group. Marisa and Daniel provided brief introductions, with Marisa identifying herself as a developer at DAISY who has worked on media overlays since their introduction to EPUB in 2011, and Daniel introducing himself as a software developer with the DAISY Consortium.
EPUB Media Overlay Working Group
The meeting focused on introductions from participants including representatives from DAISY libraries and others working on EPUB production and media overlays. Marisa outlined the working process for the group, which will involve regular meetings, use of a GitHub issue tracker for requirements and use cases, and adherence to the W3C code of conduct. The group agreed to begin the first phase by focusing on defining problems and collecting use cases related to media overlays and synchronized media in EPUB.
Avneesh described how the eBraille working group successfully used GitHub’s issue tracker and wiki features to accommodate participants with disabilities, avoiding complex GitHub features like pull requests. George raised concerns about capturing Hiroshi’s requirements for sign language video use cases, which Avneesh confirmed are being addressed through ongoing discussions with Hiroshi. The group identified skippability and escapability in EPUB media overlays as an under-specified area that needs discussion, with Avneesh noting that this was addressed in DAISY but remains unresolved in EPUB Accessibility 1.1. Shane highlighted a gap in the current media overlay specification regarding continuous listening sessions with aligned highlighting or underlining, similar to Amazon Whispersync, and mentioned that he and Adrian have opened issues on the W3C repository regarding these specifications.
Hadrien discussed industry use cases for media overlay specifications, particularly highlighting Nota’s need to offer separate text and audio downloads rather than combining them in EPUB format. He explained that many libraries face challenges with EPUB’s current design, which requires both formats to be included together, leading to performance issues when users only need text or audio. Hadrien also mentioned ongoing work to explore synchronization beyond HTML IDs and the potential for image fragment-level synchronization in EPUB.
The group discussed challenges with screen reader functionality, particularly regarding read-aloud features and the synchronization between screen readers and text positions. George raised concerns about screen readers not knowing where read-aloud content ends, while Avneesh noted this is a complex problem they previously worked on in the DAISY reading software AMIS. The discussion also touched on CFI (Canonical Fragment Identifier) selectors in EPUB, with Hadrien clarifying that while CFI itself hasn’t been deprecated, its use in EPUB has been largely discontinued because very little real-world usage was found.
CFI Usage in EPUB Specifications
The group discussed the current usage and potential applications of CFI in EPUB specifications. Hadrien explained that CFI usage is limited, with Google finding only one instance out of millions of files searched, though some implementations like Colibrio and Cirrus do use it. Marisa suggested that Lars’s interest in CFI might be related to synchronizing media external to an EPUB package, which could be addressed through alternative methods like text fragments or CSS selectors. The group agreed that identifying specific use cases should be prioritized before determining the best technical solution, with Avneesh emphasizing that finding the least common denominator approach that addresses most use cases should be the focus.
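For readers unfamiliar with text fragments, the following is a minimal Python sketch of how a text fragment reference into an EPUB content document could be built. It is an illustration only, not something the group proposed or agreed on. The function names (encode_term, text_fragment) and the file name chapter1.xhtml are hypothetical, and the sketch assumes the `#:~:text=` directive syntax used for text fragments on the web.

```python
from urllib.parse import quote


def encode_term(term: str) -> str:
    """Percent-encode one term of a text directive.

    quote() leaves "-" unescaped, but "-" marks the optional prefix and
    suffix in a text directive, so it is escaped by hand here.
    """
    return quote(term, safe="").replace("-", "%2D")


def text_fragment(url, start, end=None, prefix=None, suffix=None):
    """Build url#:~:text=[prefix-,]start[,end][,-suffix] (hypothetical helper)."""
    parts = []
    if prefix:
        parts.append(encode_term(prefix) + "-")
    parts.append(encode_term(start))
    if end:
        parts.append(encode_term(end))
    if suffix:
        parts.append("-" + encode_term(suffix))
    return f"{url}#:~:text={','.join(parts)}"


# A single phrase, addressed without needing an id attribute in the markup.
print(text_fragment("chapter1.xhtml", "Call me Ishmael"))

# A range of text, from a start phrase to an end phrase.
print(text_fragment("chapter1.xhtml", "well-known", end="fact"))
```

Unlike an ID-based reference, a reference like this does not require the publisher to add identifiers to the content document, which is one reason it came up alongside CSS selectors as a candidate for synchronizing external media.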
The group discussed challenges and requirements around audio synchronization in EPUB, particularly in comparison with DAISY standards. Avneesh noted that while media overlays for navigation documents are possible, they haven’t been explicitly specified in EPUB standards, leading to uncertainty about implementation. Jonas raised important use cases, including the need for audio to announce semantic elements like page numbers and tables, as well as handling alternative text for non-visible content. Hadrien explained that technical implementation isn’t the problem; the main challenge lies in expectations and best practices. He suggested that reading system vendors could do more to shape an audio-first experience similar to DAISY’s.
The group agreed to schedule their next meeting for mid-July, with Marisa planning to send date options to the mailing list.