We (Bee Partners) have not yet gone deep on audio processing in generative AI (it’s on our list!), but we have written code against Whisper, OpenAI's open-source neural net for speech recognition.
According to OpenAI, Whisper approaches human-level robustness and accuracy on English speech recognition.
More information about Whisper can be found here. Whisper is also available through an API, which we have experimented with as a potential input option for interrogating LLMs. Independent developers have begun incorporating Whisper into a wide array of applications, so we expect to see many more genAI NLP deployments soon.
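To make the "speech as an input option for LLMs" idea concrete, here is a minimal sketch. It assumes the open-source `openai-whisper` Python package (which also needs `ffmpeg` installed) rather than the hosted API, and it is not the code we wrote internally. The `build_prompt` helper and its prompt layout are hypothetical, included only to show where a transcript would be handed to an LLM.

```python
def transcribe(audio_path: str, model_name: str = "base") -> str:
    """Transcribe an audio file locally with the open-source Whisper model."""
    # Imported lazily so build_prompt stays usable without the package installed.
    import whisper  # assumption: installed via `pip install openai-whisper`

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path)   # dict that includes a "text" field
    return result["text"].strip()


def build_prompt(
    transcript: str,
    instruction: str = "Answer the spoken question below.",
) -> str:
    """Wrap a transcript in a plain-text prompt (hypothetical layout)."""
    # Collapse the stray whitespace and newlines that transcripts often carry.
    cleaned = " ".join(transcript.split())
    if not cleaned:
        raise ValueError("transcript is empty")
    return f"{instruction}\n\nTranscript:\n{cleaned}"
```

In use, something like `build_prompt(transcribe("question.m4a"))` would produce the text to send to whichever LLM you are querying; the file name here is a placeholder. Keeping transcription and prompt building separate means the speech-to-text step could be swapped for the hosted API without touching the rest.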
<aside> 💡 If you are on a Mac, you can try MacWhisper, a utility that Tim uses frequently for super quick and highly accurate speech-to-text transcription into any app. More about MacWhisper here.
</aside>