Speech and Audio to be Transformed Into Usable Data

Nuance Communications, Inc. today introduced the Nuance Transcription Engine (NTE), a powerful, fast and accurate engine that can quickly transform massive amounts of recorded audio into actionable data across a wide range of industries.

With decades of experience in voice and language solutions, Nuance will now support the transcription needs of a wide range of organisations and applications, including freeform audio in enterprises, broadcast media, and analytics, among others. For example, with NTE an organisation can: transcribe large volumes of audio content within its call centre to provide rich customer insights and improve service; produce rapid transcription of broadcast media in order to understand and analyse what is being said in near-real time worldwide; or fully transcribe corporate audio and video assets for rapid searching and indexing. Additionally, NTE can capture and document what is being said in a conference room or interview to create accurate automated archives of critical conversations.

NTE can produce valuable text output from multi-speaker audio files by applying highly accurate (88 per cent) transcription capabilities. By unlocking the data from audio, actionable insight can be achieved through big data analytics. NTE supports 15 languages and 30 dialects, with additional languages rolling out regularly, to increase efficiency and streamline operations around the world.

“In a number of real-world situations, the ability to generate fast and accurate transcription can have an immeasurable impact,” said Robert Weideman, executive vice president and general manager, Nuance Enterprise. “Whether it’s to provide deeper voice-of-the-customer insights, to understand discussions in health and human services interviews, or to provide insights for law enforcement, we’re proud to introduce Nuance Transcription Engine, which leverages our vast experience in speech and natural language understanding to unlock, capture, and mine data, providing the valuable insights that organisations desire.”

Nuance Transcription Engine transcribes speech content with the highest accuracy in the industry at a remarkable speed, making previously inaccessible unstructured data available for analysis, searching and indexing. Pre-recorded audio can be processed at speeds ranging from 1x Real Time Factor (RTF), where 60 seconds of audio takes 60 seconds to process with maximum accuracy, up to 10x RTF, where 60 seconds of audio is processed in just 6 seconds. NTE can also transcribe live conversations, streaming results after approximately ten seconds, so that transcription is available almost immediately after the words are spoken.
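The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration: the function name `processing_seconds` and the parameter `speed_factor` are made up here, and the sketch assumes "Nx RTF" in this announcement means audio is processed N times faster than real time.

```python
def processing_seconds(audio_seconds: float, speed_factor: float) -> float:
    """Estimate wall-clock processing time for a recording.

    Assumes speed_factor is a multiple of real time (1 = real time,
    10 = ten times faster than real time), as the figures above suggest.
    """
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_seconds / speed_factor


# The two cases quoted in the announcement.
print(processing_seconds(60, 1))   # 60.0
print(processing_seconds(60, 10))  # 6.0
```

Under the same assumption, a one-hour recording (3,600 seconds) would take about six minutes at the 10x setting.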

To provide organisations with the most meaningful insights and to suit a wide range of applications, NTE has two output formats: one designed to power analytics and data mining, and one for applications that require formatted text. The first output format, optimised for search and analytics use cases, produces word-level timestamps and n-best lists. In pre-recorded audio, NTE can identify and assign transcript elements to specific speakers, even if the audio file was recorded in mono, and it flags non-speech audio such as noise and silence. The alternative output style generates a human-readable format. In both formats, customers can easily customise the vocabulary for their application (e.g. adding product names or person names at runtime), or create a set of specialised language models offline and activate them selectively at runtime.
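To make the search-oriented output more concrete, here is a minimal Python sketch of how word-level timestamps, speaker labels, n-best alternatives and non-speech flags might be used for keyword search. The `Segment` class, its field names and the `find_word` helper are entirely hypothetical and are not NTE's actual output schema, which this announcement does not describe.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Segment:
    """One transcript element (hypothetical layout, not NTE's real schema)."""
    start: float                          # seconds from start of audio
    end: float
    kind: str                             # "word", "noise" or "silence"
    speaker: Optional[str] = None         # speaker label, if assigned
    alternatives: Tuple[str, ...] = ()    # n-best hypotheses, best first


def find_word(segments, query):
    """Return (start, speaker) for each word whose n-best list contains query."""
    q = query.lower()
    return [
        (s.start, s.speaker)
        for s in segments
        if s.kind == "word" and q in (a.lower() for a in s.alternatives)
    ]


# Made-up sample data for illustration only.
transcript = [
    Segment(0.0, 0.4, "word", "speaker_1", ("refund", "refunds")),
    Segment(0.4, 1.1, "silence"),
    Segment(1.1, 1.5, "word", "speaker_2", ("account", "a count")),
    Segment(1.5, 1.9, "word", "speaker_1", ("refunds", "refund")),
]

hits = find_word(transcript, "refund")
print(hits)
```

Searching the n-best alternatives rather than only the top hypothesis is one reason such lists suit analytics: a keyword can still be found when the recogniser's first choice was a near miss.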

Accurately capturing massive amounts of unstructured data that was previously inaccessible is rapidly creating new, disruptive business opportunities to analyse and act on transient content. For example, Nuance partner Veritone is already using NTE to enable unsurpassed efficiency and accuracy in transcribing global media. “We are thrilled to work with Nuance to incorporate the powerful NTE engine into our Cognitive Media Platform,” said Chad Steelberg, CEO, Veritone. “Through this partnership, we are able to transcribe global media in near-real time to unlock actionable intelligence. As a result, Veritone’s customers and development partners are finding that streaming and recorded media are increasingly valuable assets for cognitive analysis and decision making.”

Author: Dylan Jones
