<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>266724</bug_id>
          
          <creation_ts>2023-12-20 12:45:24 -0800</creation_ts>
          <short_desc>AX: Feature request: Support WebVTT-based synthesized audio description in video</short_desc>
          <delta_ts>2026-02-02 10:42:05 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>Accessibility</component>
          <version>Safari 17</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>InRadar</keywords>
          <priority>P2</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Adrian Roselli">aroselli</reporter>
          <assigned_to name="Eric Carlson">eric.carlson</assigned_to>
          <cc>andresg_22</cc>
    
    <cc>eric.carlson</cc>
    
    <cc>webkit-bug-importer</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>2001151</commentid>
    <comment_count>0</comment_count>
    <who name="Adrian Roselli">aroselli</who>
    <bug_when>2023-12-20 12:45:24 -0800</bug_when>
    <thetext>WHATWG HTML specifies `kind=&quot;descriptions&quot;` for `&lt;track&gt;` so authors can provide a script for audio descriptions for associated videos:
https://html.spec.whatwg.org/multipage/media.html#attr-track-kind-keyword-descriptions

WHATWG HTML also provides parsing rules (including for making them extended audio descriptions):
https://html.spec.whatwg.org/multipage/media.html#playing-the-media-resource

The current WCAG advisory technique guides authors to use `&lt;track&gt;` with `kind=&quot;descriptions&quot;` to provide a source for the browser to provide synthesized audio description:
https://www.w3.org/WAI/WCAG22/Techniques/html/H96

A caveat is that the latest WebVTT CG Report suggests WHATWG HTML may have jumped the gun:
&gt; The majority of the current version of this specification is dedicated to describing how to use WebVTT files for captioning or subtitling. There is minimal information about chapters and time-aligned metadata and nothing about video descriptions at this stage.
https://w3c.github.io/webvtt/#introduction

In other words, I would love to see this supported, but if WHATWG HTML is punting to WebVTT and WebVTT has not fully specified it, then perhaps WebKit can either:
1. take the lead and implement this, helping cement behaviors; or
2. push for clarity in WHATWG HTML using its outsized position of influence.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2001153</commentid>
    <comment_count>1</comment_count>
    <who name="Radar WebKit Bug Importer">webkit-bug-importer</who>
    <bug_when>2023-12-20 12:45:33 -0800</bug_when>
    <thetext>&lt;rdar://problem/119949632&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2085938</commentid>
    <comment_count>2</comment_count>
    <who name="Eric Carlson">eric.carlson</who>
    <bug_when>2025-01-08 17:29:06 -0800</bug_when>
    <thetext>WebKit added experimental support for Standard and Extended Audio Descriptions in 254266@main (332cd88dafc8) and 254502@main (c2f7594742ab).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2086108</commentid>
    <comment_count>3</comment_count>
    <who name="Adrian Roselli">aroselli</who>
    <bug_when>2025-01-09 07:46:43 -0800</bug_when>
    <thetext>Thanks, Eric. I am copying over the instructions you added to a related WHATWG discussion (https://github.com/whatwg/html/issues/10866#issuecomment-2578996539), partly so I can track this in one place:

&gt; WebKit has experimental support for both Standard and Extended Audio Descriptions. It uses the text to speech engine to render cues from a WebVTT track whose kind is descriptions.

&gt; When Extended AD is enabled, playback is automatically paused if it takes longer to speak a cue&apos;s text than the cue&apos;s specified duration. If playback is paused, it is automatically resumed when the utterance finishes.

&gt; When only Standard AD is enabled, playback is not automatically paused even if the text to speech engine has not finished speaking the cue text after the cue&apos;s end time.

&gt; These features are not enabled by default yet, but you can try them by selecting Settings from the &quot;Safari&quot; menu, selecting the Feature Flags pane, typing &quot;description&quot; in the Search box, and checking one or both features.
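
To make the quoted pausing rule concrete, here is a minimal sketch of the difference between Standard and Extended AD. It is an illustration only, not WebKit code: the `DescriptionCue` type, the function names, and the fixed speaking rate are all assumptions, and a real text to speech engine would report when the utterance actually finishes rather than rely on an estimate.

```typescript
// Hypothetical model of the behavior described above; not the WebKit implementation.
type DescriptionCue = { startMs: number; endMs: number; text: string };

// Assumed speaking rate, used only to estimate utterance length for this sketch.
const ASSUMED_MS_PER_WORD = 400;

// Estimate how long it would take to speak the cue text.
function estimateSpeechMs(text: string): number {
  const words = text.trim().split(/\s+/).filter(Boolean);
  return words.length * ASSUMED_MS_PER_WORD;
}

// How long playback would be held so the utterance can finish.
// Standard AD never pauses; Extended AD pauses for the part of the
// utterance that does not fit inside the cue duration.
function pauseMsForCue(cue: DescriptionCue, extendedEnabled: boolean): number {
  if (!extendedEnabled) {
    return 0;
  }
  const cueDurationMs = cue.endMs - cue.startMs;
  return Math.max(0, estimateSpeechMs(cue.text) - cueDurationMs);
}
```

With the assumed rate, a seven-word cue in a one-second window would hold playback for 1800 ms under Extended AD and not at all under Standard AD.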

Using Safari 18.2 on macOS 15.2 I did the following:

1. Safari &gt; Settings...
2. chose &quot;Feature Flags&quot;
3. checked &quot;Audio descriptions for video - Extended&quot;
4. checked &quot;Audio descriptions for video - Standard&quot;
5. Visited the video at this anchor: https://adrianroselli.com/2023/12/ad-support-in-html-video.html#Synthesized
6. Played the video.
7. Heard no synthesized AD.

I closed and reopened Safari. I rebooted my Mac. I reloaded the page. I tried with only one of the two options checked, then with only the other.

So either the feature does not work or my code is broken.

Do you have a sample page posted somewhere that works so I can try it?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2086125</commentid>
    <comment_count>4</comment_count>
    <who name="Eric Carlson">eric.carlson</who>
    <bug_when>2025-01-09 08:50:38 -0800</bug_when>
    <thetext>My apologies, I forgot one important step - you need to enable Audio Descriptions in system preferences: 

 - Open System Settings
 - Open the Accessibility panel
 - Click on &quot;Audio Descriptions&quot;
 - Toggle the &quot;Play audio descriptions when available&quot; button</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2086192</commentid>
    <comment_count>5</comment_count>
    <who name="Adrian Roselli">aroselli</who>
    <bug_when>2025-01-09 14:37:07 -0800</bug_when>
    <thetext>I have enabled those settings and confirm it works. Thank you!

I see this feature has existed for over 2 years (a year before I filed this issue). When will this feature move out of experimental?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2087418</commentid>
    <comment_count>6</comment_count>
    <who name="Eric Carlson">eric.carlson</who>
    <bug_when>2025-01-15 05:23:54 -0800</bug_when>
    <thetext>Pull request: https://github.com/WebKit/WebKit/pull/39076</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>2177221</commentid>
    <comment_count>7</comment_count>
    <who name="EWS">ews-feeder</who>
    <bug_when>2026-02-02 10:42:03 -0800</bug_when>
    <thetext>Committed 306643@main (fb031eee9437): &lt;https://commits.webkit.org/306643@main&gt;

Reviewed commits have been landed. Closing PR #39076 and removing active labels.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>