Text-to-speech and language issues

vincenthell · ‎01-03-2025

I recently gave feedback to the Now Learning platform via the 'open case' feature, and I'd like to give the community a chance to share their input on the matter in the comments. Here is my original feedback:

I'm a fairly new consultant and learning ServiceNow for the first time. I'm doing the IT Service Management (ITSM) Implementation On Demand course and have some feedback.
I'm not a native speaker, but my English skills have regularly been praised, unprompted, by British and American native speakers. I am confident in my language and comprehension skills.

I am having issues with the text-to-speech format of the courses:

The most irritating thing to me is that most of the sentences used by the text-to-speech are not meant to be listened to. They are structured and written in a way suitable for reading. It's super dense technical documentation, with explanations sometimes spanning only one or two sentences. Written and spoken language are very different, and for a good reason! It's not possible to follow what is being 'said' if the language used is meant to be read. I'll quote from a video:

"As discussed, when creating an incident from a problem, the problem will have a lot of the information automatically populated with the information from the incident. The information copy can also be configured using the problem properties. So you don't have to make any changes to the code in the platform to modify which fields are being transported over from the incident to the problem."
IT Service Management (ITSM) Implementation On Demand - From detection to diagnosis

When reading this, it's not easy to understand, but it is completely manageable. Now try reading it aloud to someone while showing them a graphic, and test how much they understand and remember. And that's just one paragraph! The way the sentences are structured makes them harder to understand when heard than when read, even if they are read cleanly... but they're not, which brings me to my next point.
The text-to-speech bot has bad pronunciation and vocal emphasis. Perhaps that's where my being a non-native speaker comes into play, but the pronunciation of some words is just plain wrong ("reopened" is pronounced like "re-oppennd"), which makes it even harder to understand sentences written as if every word matters (language meant to be read). Having to decipher mispronounced words draws on the limited resources that I need to grasp the meaning behind the spoken words.
Throw in vocal emphasis on the wrong word or part of the sentence and the whole thing becomes unintelligible. Emphasis plays a VITAL role in vocal communication. The text-to-speech messes that up in almost every sentence, and this compounds every other issue I've mentioned so far.
Last point - filler words, examples and repetition:
When teaching, the use of examples and repetition is essential to give the student time to process the information-heavy parts of what is being said. People do that naturally, as they themselves need time to prepare their next thoughts and to confirm that what they just said is correct. Adding filler words is also necessary to make complex sentences more digestible.
The language used in the lessons with text-to-speech does not acknowledge these cornerstones of good communication, which contributes to its failure to teach in an efficient (student-friendly) manner.

Conclusion: The way that Now Learning currently deploys text-to-speech places an unjustifiable burden on its students. Of course it's a cost-cutting measure, and I understand that, but what I mean by unjustifiable is that the fixes can be implemented easily and, this being a learning platform, I expect the people running it to be aware of these issues and not to create them in the first place.

Here are my recommendations:

1: Deploy a state-of-the-art text-to-speech program. Better programs have been demonstrated regularly in the past 2 years, and I have had no issue understanding bot speech from any of the AI companies. Alternatively: let humans talk again (I would personally prefer that!)
2: When scripting videos, respect the requirements of vocal communication and teaching, as mentioned above. If you need further information about this topic (I'm giving you a stern look right now), you can find very good advice on this YouTube channel: https://www.youtube.com/@askvinh

I don't think I'm asking for too much and I believe you can (and hopefully will) use my feedback in the constructive way that it is intended.

Thank you for taking the time to read my words.

Best regards,
Vincent Hellsing from Agineo-Germany