AssemblyAI
Transform speech into meaning with one seamless API
Meredith Rauch
Universal-1 — Multilingual speech AI model trained on 12.5M hours of data
Try AssemblyAI's most capable speech recognition model, trained on 12.5M hours of multilingual audio data. Universal-1 achieves best-in-class speech-to-text accuracy, reduces word error rate and hallucinations, and improves timestamps.
Replies
Meredith Rauch
The Universal-1 Speech-to-Text model was created to focus on the nuances of spoken language across accents, tone, dialect, faithfulness, and more. We hope the new capabilities of Universal-1 will help power the next generation of AI products and features built with voice data. Give it a try and let us know what you're building!
André J
@meredith_rauch Awesome! How close is it to human-level interpretation? Are we 75% there, or? What's the benchmark? Or is there a benchmark?
Meredith Rauch
@sentry_co Great question! Universal-1 in English came in at 92.7% accuracy across multiple datasets. We display all benchmarks here: https://www.assemblyai.com/bench... If you'd like to see a more in-depth analysis of Universal-1, I highly recommend checking out our research, which includes Word Error Rate by language, timestamp accuracy, hallucinations, and more: https://www.assemblyai.com/resea... Hope this helps! :)
André J
@meredith_rauch I think there could be a clearer benchmark: 92.7% against what? I.e., humans can interpret gibberish, or very chopped-up or noisy audio, by filling in the blanks or using context.
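For readers asking the same "against what?" question: speech-to-text accuracy is usually reported as word error rate (WER) against a human-written reference transcript, meaning the word-level edit distance divided by the number of reference words. The sketch below illustrates that standard definition only; it is not AssemblyAI's benchmark code. The function name `word_error_rate`, the sample sentences, and the reading of "accuracy" as 1 - WER are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (one row back) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sample: the hypothesis differs from the reference in one word.
reference = "the quick brown fox jumps over the lazy sleeping dog"
hypothesis = "the quick brown fox jumped over the lazy sleeping dog"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}, accuracy (1 - WER): {1 - wer:.1%}")
```

WER can exceed 100% when the hypothesis contains many inserted words, and the result depends on the reference transcript and on how text is normalized before comparison, so a single accuracy figure is only meaningful alongside the datasets it was measured on.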
Kirill Markin
Congratulations on the launch! How did you manage to ensure high accuracy for languages with less training data?
William Jin
Great launch! How does Universal-1 handle multilingual support and accuracy across different languages?
Albert
Congratulations on the launch of Universal-1, Dylan, Britney, and Meredith. Your focus on reducing word error rates and enhancing dialect recognition is impressive. Could you share how Universal-1 handles low-resource languages and whether there are plans to expand its linguistic dataset?
Bon
Congrats on the launch of Universal-1!
Benjamin Sloutsky
Seems like a great product for all the devs. Congrats on the launch!
Brandon
Congratulations on the launch! It's fantastic to see your product achieving a higher accuracy in voice recognition than many big companies. It's truly impressive and admirable. In the future, if my product development involves voice recognition, I will definitely consider using your API!
Alain Goldman
How good is it at Korean? Because ElevenLabs is bad at that.
Meredith Rauch
@alain_gold1 Our Best tier supports Korean and includes speaker labels, custom spelling, automatic punctuation, etc. We encourage you to try our API for free - more on supported languages and account creation here: https://www.assemblyai.com/docs/...
Salar Davari
Loved it. Seems to be one of a kind.
Ena Gluhakovic
The Universal-1 Speech-to-Text model sounds incredibly promising! Since I use tools like this often for my work, I was wondering if you could share some examples of the unique challenges you faced in achieving high accuracy across different tones and speaking styles.
Brett Hibbler
Excited to test this. Does the playground currently allow us to access this model? Or will it soon? I'm assuming part of what makes this work well is the same process you've been using, with word prediction based on context, as opposed to straight word for word phonetic reproduction of what it "hears"?
Meredith Rauch
@1lastshot Yes, the playground is currently utilizing Universal-1! You will be able to select tiers, and Universal-1 currently falls under [Best]. More on our tiers and research here: https://www.assemblyai.com/blog/...
Brett Hibbler
@meredith_rauch Appreciate the reply and confirmation. I assume this works well for cleaner audio, but after testing I'm finding that newer transcription iterations, including Universal-1, are actually regressing on muddier audio. Their tendency to rewrite and predict what was supposedly said, as opposed to trying their best to phonetically reproduce it word for word, results in compelling-looking but pretty far-off output. Is there a way to adjust for this on our end?
Zenda
Cool!👍 The free version can transcribe up to 100 hours of audio. That's sufficient for the initial stage of a project.
maruhum siagian
Joy to have this app!!!
Gopal Jangid
Congratulations on the launch 🎊 I think this can be used for converting podcasts into blogs.🤔
Amy Yan
Great launch, AssemblyAI! 🚀 Universal-1's focus on multilingual speech recognition and improving accuracy is exactly what the industry needs. This model will make a huge difference in developing global voice applications. Excellent job!