Bug 10363 - Better VoiceOver rendering of web pages
Summary: Better VoiceOver rendering of web pages
Alias: None
Product: WebKit
Classification: Unclassified
Component: Accessibility
Version: 420+
Hardware: Macintosh OS X 10.4
Importance: P2 Normal
Assignee: Nobody
Depends on:
Reported: 2006-08-12 01:51 PDT by Nicholas Shanks
Modified: 2008-07-24 11:30 PDT

xhtml2ssml.xsl (7.85 KB, application/xsl+xml)
2006-08-12 01:52 PDT, Nicholas Shanks
ssml2macintalk.xsl (7.89 KB, application/xsl+xml)
2006-08-12 01:53 PDT, Nicholas Shanks

Description Nicholas Shanks 2006-08-12 01:51:27 PDT
I've been looking at how to render pages better aurally, and have written two XSLT stylesheets that improve the spoken rendering of elements. I don't have the time to figure out why my XSLT calls aren't transforming anything, or to work out how VoiceOver works (e.g. words like "link" and "button" are spoken in one voice, while the main page content is read in another), so I am just going to upload them and hope someone can do something funky with them.

The first is xhtml2ssml: a web page's (or element's) DOM tree is serialised into XHTML and transformed by this stylesheet. The output SSML (http://www.w3.org/TR/speech-synthesis/) can then be sent to an SSML-capable text-to-speech renderer such as the "Swift" engine (cepstral.com), or further transformed via the second stylesheet, ssml2macintalk, and sent to Apple's TTS engine via the Speech Manager.

The authoritative URL for each of these XSLT documents is:
http://web.nickshanks.com/stylesheets/xhtml2ssml.xsl &

This method is beneficial in that it preserves much of the semantic nature of documents: <html:em> elements become <ssml:emphasis> elements, which in turn become [[emph +]] MacinTalk commands. However, its most powerful feature, and the reason it's a two-step process, is that it allows people to include SSML commands directly in their XHTML documents and have them understood by the Speech Manager.

Example serialisation of H1 element:
<h1 xml:lang="en"><ssml:phoneme alphabet="x-apple-macintalk" ph="r1>EHzyUWm2>EY">Résumé</ssml:phoneme></h1>

Would send this to MacinTalk:
[[inpt PHON]]r1>EHzyUWm2>EY[[inpt TEXT]]
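
For anyone who wants to experiment without the stylesheets, the second step can be roughly sketched in Python. This is not what ssml2macintalk.xsl actually does internally; it is a minimal illustration that covers only the two mappings mentioned above (emphasis and x-apple-macintalk phonemes). The function name to_macintalk is hypothetical, and emitting [[emph +]] once before the emphasised content is an assumption. The ssml prefix is bound to the SSML 1.0 namespace so that the H1 example parses on its own.

```python
import xml.etree.ElementTree as ET

# SSML 1.0 namespace; the "ssml" prefix in the example above is assumed to map to it.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def to_macintalk(elem):
    """Flatten an element into text with MacinTalk embedded commands (hypothetical helper)."""
    if elem.tag == f"{{{SSML_NS}}}phoneme" and elem.get("alphabet") == "x-apple-macintalk":
        # Speak the ph attribute as phonemes and skip the element's text content.
        return "[[inpt PHON]]" + elem.get("ph", "") + "[[inpt TEXT]]"
    # Assumption: one [[emph +]] command is emitted before emphasised content.
    out = "[[emph +]]" if elem.tag == f"{{{SSML_NS}}}emphasis" else ""
    out += elem.text or ""
    for child in elem:
        # Recurse into the child, then keep any text that follows it.
        out += to_macintalk(child) + (child.tail or "")
    return out


h1 = ET.fromstring(
    '<h1 xmlns:ssml="http://www.w3.org/2001/10/synthesis" xml:lang="en">'
    '<ssml:phoneme alphabet="x-apple-macintalk" ph="r1>EHzyUWm2>EY">Résumé</ssml:phoneme>'
    "</h1>"
)
print(to_macintalk(h1))
```

Any other SSML element simply falls through to its text content in this sketch, so it is only a starting point.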

(When serialising the DOM tree of a specific node, any inherited xml:lang attribute should always be added, so the TTS engine can pick an appropriate voice.)
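
The inherited-language point can be sketched as follows, again in Python and only as an illustration. The function name serialise_with_lang is hypothetical, and the approach (walk up to the nearest ancestor carrying xml:lang, then copy that value onto a clone of the node before serialising) is one possible way to do it, not necessarily how WebKit's serialiser would.

```python
import copy
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fixed XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def serialise_with_lang(root, node):
    """Serialise node, adding the nearest inherited xml:lang if it has none (hypothetical helper)."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    lang = None
    current = node
    while current is not None:
        lang = current.get(XML_LANG)
        if lang is not None:
            break
        current = parents.get(current)
    clone = copy.deepcopy(node)  # leave the original tree untouched
    clone.tail = None            # do not serialise text that follows the node
    if lang is not None:
        clone.set(XML_LANG, lang)
    return ET.tostring(clone, encoding="unicode")


doc = ET.fromstring('<html xml:lang="fr"><body><p>Bonjour</p></body></html>')
print(serialise_with_lang(doc, doc.find("body/p")))
```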

For anyone wondering, here are a few Mac TTS providers for voices in languages other than US English:
http://www.assistiveware.com/proloquo.php — UK English, German, Dutch, Flemish, French, Spanish, Venezuelan Spanish, Polish, Swedish, Norwegian, Brazilian Portuguese, Russian
http://cepstral.com/ — UK English, Americas Spanish, Canadian French, German, Italian
http://www.speechissimo.com/ — French, German, Spanish, Italian

Some TTS engines for WebKit on Windows are listed here: http://www.nextup.com/TextAloud/SpeechEngine/voices.html including Aussie/Indian English, Chinese, Japanese, and Korean. I don't know how many of these support SSML.
Comment 1 Nicholas Shanks 2006-08-12 01:52:48 PDT
Created attachment 10001
Comment 2 Nicholas Shanks 2006-08-12 01:53:17 PDT
Created attachment 10002
Comment 3 Nicholas Shanks 2006-08-12 02:03:36 PDT
Nota bene: output of the DOCTYPE and the <ssml:metadata> element is suppressed because Swift v4.1 (June 2006) actually speaks them out loud.
Comment 4 chris fleizach 2008-07-24 11:30:50 PDT
I don't think WebKit or VoiceOver can do anything with these style sheets. Providing them to developers of web pages may be beneficial.