Elon Musk Monitor
OpenAI Introduces New Audio Models in API, Can Be Used for Agentic Workflows

By elonmusk | March 21, 2025 | 3 Mins Read
OpenAI on Thursday introduced new audio models in its application programming interface (API) that it says offer improved accuracy and reliability. The San Francisco-based AI firm released three new artificial intelligence (AI) models covering speech-to-text transcription and text-to-speech (TTS). The company claimed that these models will enable developers to build applications with agentic workflows, and said the API can help businesses automate operations such as customer support. Notably, the new models are based on the company’s GPT-4o and GPT-4o mini AI models.

OpenAI Brings New Audio Models in API

In a blog post, the AI firm detailed the new API-specific AI models. The company highlighted that in recent months it has released several agent-focused products, such as Operator, Deep Research, Computer-Using Agents, and the Responses API with built-in tools. However, it added that the true potential of agents can only be unlocked when they can perform intuitively and interact across mediums beyond text.

There are three new audio models. GPT-4o-transcribe and GPT-4o-mini-transcribe are the speech-to-text models, and GPT-4o-mini-tts is, as the name suggests, a TTS model. OpenAI claims that these models outperform its existing Whisper models, the first of which was released in 2022. However, unlike the Whisper models, the new ones are not open-source.

OpenAI stated that GPT-4o-transcribe achieves a lower “word error rate” (WER) than its Whisper models on the Few-shot Learning Evaluation of Universal Representations of Speech (FLEURS) benchmark, which tests AI models on multilingual speech across more than 100 languages. WER measures the share of words a model transcribes incorrectly, so a lower value is better. OpenAI said the improvements were a result of targeted training techniques such as reinforcement learning (RL) and extensive midtraining with high-quality audio datasets.
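For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of words in the reference. The following is a minimal sketch of that standard definition. It is not OpenAI's evaluation code, the function name is illustrative, and the text normalisation (lowercasing and whitespace splitting) is a simplifying assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Normalisation here (lowercase + whitespace split) is an assumption;
    real benchmarks usually apply more careful text normalisation.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(round(word_error_rate("the cat sat on the mat", "the cat sit on mat"), 3))
```

Because the denominator is the reference length, a transcript with many inserted words can score above 1.0.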

According to OpenAI, these speech-to-text models transcribe speech more reliably in challenging scenarios such as heavy accents, noisy environments, and varying speech speeds.

The GPT-4o-mini-tts model also comes with significant improvements. The AI firm claims that the model can speak with customisable inflections, intonations, and emotional expressiveness. This will enable developers to build applications that can be used for a wide range of tasks, including customer service and creative storytelling. Notably, the model only offers artificial, preset voices.
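As a rough illustration of how a developer might ask the TTS model for a particular speaking style, the sketch below builds (but does not send) an HTTP request to OpenAI's speech endpoint using only the Python standard library. The helper name, the example voice "coral", and the sample instruction text are assumptions made for illustration; the current field names and available voices should be checked against OpenAI's API reference before use.

```python
import json
import urllib.request

# Speech endpoint as documented by OpenAI at the time of writing (assumption:
# verify against the current API reference).
SPEECH_ENDPOINT = "https://api.openai.com/v1/audio/speech"


def build_speech_request(api_key: str, text: str, instructions: str,
                         voice: str = "coral") -> urllib.request.Request:
    """Build a POST request asking gpt-4o-mini-tts to speak `text`.

    `instructions` carries the free-form style guidance (tone, pacing,
    emotion). This helper is hypothetical and only prepares the request.
    """
    payload = {
        "model": "gpt-4o-mini-tts",
        "input": text,
        "voice": voice,                # one of the preset voices
        "instructions": instructions,  # how the text should be spoken
    }
    return urllib.request.Request(
        SPEECH_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_speech_request(
        api_key="YOUR_API_KEY",  # placeholder, not a real key
        text="Your order has shipped and should arrive on Friday.",
        instructions="Speak like a calm, friendly customer service agent.",
    )
    print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` with a valid key would be expected to return the generated audio in the response body.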

OpenAI’s API pricing page highlights that the GPT-4o-based audio model will cost $40 (roughly Rs. 3,440) per million input tokens and $80 (roughly Rs. 6,880) per million output tokens. On the other hand, the GPT-4o mini-based audio models will be charged at the rate of $10 (roughly Rs. 860) per million input tokens and $20 (roughly Rs. 1,720) per million output tokens.
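To make the per-million-token rates concrete, here is a small sketch that estimates a bill from token counts. The rates are the figures quoted in this article, not values read from OpenAI's pricing page, and the tier labels, function name, and sample token counts are illustrative assumptions.

```python
# Rates in US dollars per one million tokens, as quoted in the article above.
RATES_USD_PER_MILLION = {
    "gpt-4o": {"input": 40.0, "output": 80.0},
    "gpt-4o-mini": {"input": 10.0, "output": 20.0},
}


def estimate_cost_usd(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD for a given number of input and output tokens."""
    rates = RATES_USD_PER_MILLION[tier]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 250,000 input tokens and 100,000 output tokens.
    for tier in RATES_USD_PER_MILLION:
        print(tier, estimate_cost_usd(tier, 250_000, 100_000))
    # gpt-4o:      (250,000 * 40 + 100,000 * 80) / 1,000,000 = 18.0
    # gpt-4o-mini: (250,000 * 10 + 100,000 * 20) / 1,000,000 = 4.5
```

At the quoted rates, the same workload costs four times as much on the GPT-4o-based tier as on the mini tier.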

All of the audio models are now available to developers via API. OpenAI is also releasing an integration with its Agents software development kit (SDK) to help users build voice agents.


