Groq Whisper API

Groq Whisper API: An Innovative Communication Interface

While we were exploring Claude 3.5 Sonnet, Groq released an API for OpenAI’s Whisper large-v3 model. It converts speech to text very quickly and is priced at $0.03 per hour of transcribed audio. If you want to call Groq’s Whisper API through the OpenAI Python client, here is how:

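from openai import OpenAI

# Replace the placeholder with Groq's OpenAI-compatible base URL (see Groq's documentation).
groq = OpenAI(api_key="YOUR_GROQ_API_KEY",
              base_url="YOUR_GROQ_BASE_URL")

# Open the audio in binary mode; the context manager closes the file afterwards.
with open("/path/to/your_audio.mp3", "rb") as audio_file:
    transcript = groq.audio.transcriptions.create(
        model="whisper-large-v3", file=audio_file, response_format="text"
    )
print(transcript)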
Illustration of using Groq’s Whisper API with the OpenAI client
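
To put the quoted price in perspective, here is a rough sketch of the cost arithmetic. It assumes the $0.03 per transcription hour rate mentioned in this post and simple pro-rata billing by audio duration. The function name `estimate_cost_usd` is hypothetical, and actual billing (minimum charges, rounding) may differ.

```python
# Rate quoted in this post; check Groq's current pricing before relying on it.
PRICE_PER_AUDIO_HOUR_USD = 0.03


def estimate_cost_usd(audio_seconds: float,
                      price_per_hour: float = PRICE_PER_AUDIO_HOUR_USD) -> float:
    """Estimate transcription cost, assuming pro-rata billing by audio duration."""
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return audio_seconds / 3600 * price_per_hour


# Print the estimated cost for a few audio lengths.
for minutes in (5, 60, 600):
    cost = estimate_cost_usd(minutes * 60)
    print(f"{minutes:>4} min of audio -> ${cost:.4f}")
```

Under these assumptions, ten hours of audio would cost roughly thirty cents.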

Understanding Whisper

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. It was trained on 680,000 hours of labeled audio, which lets it generalize to many datasets and domains without fine-tuning.

The model was introduced in the paper “Robust Speech Recognition via Large-Scale Weak Supervision” by Alec Radford et al. from OpenAI. Its source code is openly available.

Whisper large-v3 keeps the architecture of the earlier large models, with two changes:

  • An upgrade to 128 Mel frequency bins for its input, up from the previous 80.
  • The addition of a language token specifically for Cantonese.
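
To see what the change in Mel bins means for the model input, the sketch below computes the shape of the log-Mel spectrogram that would be fed to the encoder. The 16 kHz sample rate, 30-second window, and 160-sample hop come from the open-source Whisper reference implementation rather than from this post, so treat them as assumptions. The helper name `mel_input_shape` is hypothetical.

```python
# Audio framing constants (assumed, based on the Whisper reference implementation).
SAMPLE_RATE = 16_000   # samples per second
CHUNK_SECONDS = 30     # audio is processed in 30-second windows
HOP_LENGTH = 160       # samples between spectrogram frames (10 ms at 16 kHz)


def mel_input_shape(n_mels: int) -> tuple[int, int]:
    """Return (mel_bins, frames) for one 30-second window."""
    n_frames = SAMPLE_RATE * CHUNK_SECONDS // HOP_LENGTH
    return (n_mels, n_frames)


print(mel_input_shape(80))    # earlier large models -> (80, 3000)
print(mel_input_shape(128))   # large-v3 -> (128, 3000)
```

The number of time frames stays the same; only the frequency resolution of each frame grows, from 80 to 128 values.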

Whisper large-v3 was trained on 1 million hours of weakly labeled audio plus 4 million hours of pseudo-labeled audio, with the pseudo-labels generated by Whisper large-v2. Training ran for 2.0 epochs over the combined dataset.
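
As a quick sanity check on the scale of these figures, the following sketch works through the arithmetic. It uses only the numbers stated in this post; the variable names are illustrative.

```python
# Training data figures as stated in this post.
WEAKLY_LABELED_HOURS = 1_000_000
PSEUDO_LABELED_HOURS = 4_000_000
EPOCHS = 2.0

total_hours = WEAKLY_LABELED_HOURS + PSEUDO_LABELED_HOURS
pseudo_share = PSEUDO_LABELED_HOURS / total_hours   # fraction labeled by large-v2
hours_seen = total_hours * EPOCHS                   # audio processed during training

print(f"Combined dataset: {total_hours:,} hours")
print(f"Pseudo-labeled share: {pseudo_share:.0%}")
print(f"Audio seen over {EPOCHS} epochs: {hours_seen:,.0f} hours")
```

In other words, about four fifths of the training audio carries labels produced by the previous model.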

Compared with large-v2, the large-v3 model reduces recognition errors by 10% to 20%.
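
A 10% to 20% reduction is most naturally read as a relative improvement, not a drop in percentage points. The sketch below illustrates the difference under that assumption, using a made-up baseline word error rate (WER) of 10% for large-v2. The baseline value and the function name `improved_error_rate` are hypothetical and not measured results.

```python
def improved_error_rate(baseline: float, relative_reduction: float) -> float:
    """Apply a relative error reduction to a baseline error rate."""
    return baseline * (1 - relative_reduction)


baseline_wer = 0.10  # hypothetical large-v2 error rate, for illustration only

for reduction in (0.10, 0.20):
    new_wer = improved_error_rate(baseline_wer, reduction)
    print(f"{reduction:.0%} relative reduction: "
          f"{baseline_wer:.1%} -> {new_wer:.1%}")
```

With this illustrative baseline, the error rate would land between 8% and 9%, rather than dropping to zero or below.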

Learn more: Comparing Whisper large-v3 and large-v2 models
