whisper
Automatic Speech Recognition • OpenAI
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.
| Model Info | |
|---|---|
| More information | link ↗ | 
| Unit Pricing | $0.00045 per audio minute | 
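As a rough illustration of the unit price, the sketch below estimates the cost of transcribing a clip of a given length. The helper name `estimateCostUsd` is hypothetical, and the calculation assumes billing scales linearly with duration with no rounding to whole minutes, which this page does not state.

```typescript
// Unit price taken from the Model Info table.
const PRICE_PER_AUDIO_MINUTE_USD = 0.00045;

// Hypothetical helper: estimates the cost of transcribing a clip.
// Assumption: cost is proportional to duration, with no minimum charge
// and no rounding to whole minutes.
function estimateCostUsd(durationSeconds: number): number {
  if (!Number.isFinite(durationSeconds) || durationSeconds < 0) {
    throw new RangeError("durationSeconds must be a non-negative finite number");
  }
  return (durationSeconds / 60) * PRICE_PER_AUDIO_MINUTE_USD;
}

// One hour of audio under the linear-billing assumption.
const oneHourEstimate = estimateCostUsd(60 * 60);
console.log(`Estimated cost for one hour: $${oneHourEstimate.toFixed(4)}`);
```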
Usage
Workers - TypeScript
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    // Download a sample WAV file to transcribe.
    const res = await fetch(
      "https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/samples/cpp/windows/console/samples/enrollment_audio_katie.wav"
    );
    const blob = await res.arrayBuffer();

    // The model takes the audio as an array of 8-bit unsigned integers.
    const input = {
      audio: [...new Uint8Array(blob)],
    };

    const response = await env.AI.run(
      "@cf/openai/whisper",
      input
    );

    // Return an empty audio array in place of the raw bytes.
    return Response.json({ input: { audio: [] }, response });
  },
} satisfies ExportedHandler<Env>;
curl
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/openai/whisper \
  -X POST \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --data-binary "@talking-llama.mp3"
Parameters
* indicates a required field
Input
- 0 (string)
- 1 (object)
  - audio (array, required): An array of integers that represent the audio data constrained to 8-bit unsigned integer values
    - items (number): A value between 0 and 255
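The object form of the input can be built from raw bytes as in the sketch below. The interface and function names (`WhisperInput`, `toWhisperInput`, `isValidAudioArray`) are hypothetical and not part of any Cloudflare API. The guard reflects the documented constraint that every item is an integer between 0 and 255.

```typescript
// Object form of the input, as described in the Input section.
interface WhisperInput {
  audio: number[];
}

// Hypothetical helper: converts raw bytes into the object form of the input.
function toWhisperInput(bytes: ArrayBuffer | Uint8Array): WhisperInput {
  const view = bytes instanceof Uint8Array ? bytes : new Uint8Array(bytes);
  // Every element of a Uint8Array is already an integer in [0, 255].
  return { audio: Array.from(view) };
}

// Hypothetical guard: checks that an arbitrary value is an array of
// integers between 0 and 255, matching the documented constraint.
function isValidAudioArray(value: unknown): value is number[] {
  return (
    Array.isArray(value) &&
    value.every((n) => Number.isInteger(n) && n >= 0 && n <= 255)
  );
}

const example = toWhisperInput(new Uint8Array([0, 128, 255]));
console.log(example.audio.length, isValidAudioArray(example.audio));
```

A guard like this is mainly useful when the array comes from somewhere other than a `Uint8Array`, for example from parsed JSON.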
Output
- text (string, required): The transcription
- word_count (number)
- words (array)
  - items (object)
    - word (string)
    - start (number): The second this word begins in the recording
    - end (number): The ending second when the word completes
- vtt (string)
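To show how the per-word timestamps might be consumed, the sketch below types the output fields listed above and selects the words spoken within a time window. The interface names, the `wordsBetween` helper, and the sample data are all hypothetical. The sample is made up for illustration and is not real model output.

```typescript
// Types mirroring the documented output fields; only `text` is required.
interface WhisperWord {
  word?: string;
  start?: number; // second at which the word begins
  end?: number; // second at which the word completes
}

interface WhisperOutput {
  text: string;
  word_count?: number;
  words?: WhisperWord[];
  vtt?: string;
}

// Hypothetical helper: returns the words that start and end within
// the window [fromSec, toSec]. Entries missing any field are skipped.
function wordsBetween(output: WhisperOutput, fromSec: number, toSec: number): string[] {
  const result: string[] = [];
  for (const w of output.words ?? []) {
    if (w.word === undefined || w.start === undefined || w.end === undefined) {
      continue;
    }
    if (w.start >= fromSec && w.end <= toSec) {
      result.push(w.word);
    }
  }
  return result;
}

// Made-up sample data for illustration only.
const sample: WhisperOutput = {
  text: "hello there world",
  word_count: 3,
  words: [
    { word: "hello", start: 0, end: 0.4 },
    { word: "there", start: 0.5, end: 0.9 },
    { word: "world", start: 1.0, end: 1.5 },
  ],
};

console.log(wordsBetween(sample, 0, 1.0));
```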
API Schemas
The following schemas are based on JSON Schema
Input
{
    "oneOf": [
        {
            "type": "string",
            "format": "binary"
        },
        {
            "type": "object",
            "properties": {
                "audio": {
                    "type": "array",
                    "description": "An array of integers that represent the audio data constrained to 8-bit unsigned integer values",
                    "items": {
                        "type": "number",
                        "description": "A value between 0 and 255"
                    }
                }
            },
            "required": [
                "audio"
            ]
        }
    ]
}
Output
{
    "type": "object",
    "contentType": "application/json",
    "properties": {
        "text": {
            "type": "string",
            "description": "The transcription"
        },
        "word_count": {
            "type": "number"
        },
        "words": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "word": {
                        "type": "string"
                    },
                    "start": {
                        "type": "number",
                        "description": "The second this word begins in the recording"
                    },
                    "end": {
                        "type": "number",
                        "description": "The ending second when the word completes"
                    }
                }
            }
        },
        "vtt": {
            "type": "string"
        }
    },
    "required": [
        "text"
    ]
}
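The output schema can also be checked at runtime before a response is used. The following is a minimal sketch of such a check, written by hand instead of using a JSON Schema library. The function name `isWhisperResponse` is hypothetical, and it only checks the top-level property types.

```typescript
// Hypothetical runtime guard based on the output schema: `text` is the only
// required property; the others are checked only when present.
function isWhisperResponse(
  value: unknown
): value is { text: string; word_count?: number; words?: unknown[]; vtt?: string } {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  if (typeof v.text !== "string") {
    return false;
  }
  if (v.word_count !== undefined && typeof v.word_count !== "number") {
    return false;
  }
  if (v.words !== undefined && !Array.isArray(v.words)) {
    return false;
  }
  if (v.vtt !== undefined && typeof v.vtt !== "string") {
    return false;
  }
  return true;
}

// Usage with a parsed JSON body of unknown shape.
const parsed: unknown = JSON.parse('{"text":"hello","word_count":1}');
if (isWhisperResponse(parsed)) {
  console.log(parsed.text);
}
```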