Transcribe any audio or video file to SRT, VTT, or plain text in seconds. Powered by OpenAI's open-source Whisper model running locally in your browser — no upload, no watermark, no sign-up.
Drop an audio or video file, or click to browse. MP4, MP3, WAV, M4A, MOV and more are supported.
Whisper runs in your browser. First use downloads the model (~80 MB) once and caches it for instant offline reuse.
Preview segments with timestamps. Download SRT for video players, VTT for the web, or plain text — or copy to clipboard.
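For readers curious how the three download formats differ, here is a minimal sketch of turning timestamped segments into SRT, VTT, and plain text. The `Segment` shape and the function names are assumptions made for illustration, not the tool's actual code. The key difference is that SRT numbers each cue and writes milliseconds after a comma, while WebVTT starts with a `WEBVTT` header and uses a period.

```typescript
// Hypothetical segment shape: start and end are in seconds.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to at least `width` digits.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as HH:MM:SS<sep>mmm. SRT uses "," and WebVTT uses ".".
function formatTimestamp(seconds: number, msSeparator: "," | "."): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${msSeparator}${pad(ms, 3)}`;
}

// SRT: numbered cues separated by a blank line.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) => {
      const range = `${formatTimestamp(seg.start, ",")} --> ${formatTimestamp(seg.end, ",")}`;
      return `${i + 1}\n${range}\n${seg.text.trim()}\n`;
    })
    .join("\n");
}

// WebVTT: a "WEBVTT" header line, a blank line, then the cues.
function toVtt(segments: Segment[]): string {
  const cues = segments.map((seg) => {
    const range = `${formatTimestamp(seg.start, ".")} --> ${formatTimestamp(seg.end, ".")}`;
    return `${range}\n${seg.text.trim()}\n`;
  });
  return ["WEBVTT\n", ...cues].join("\n");
}

// Plain text: segment text only, one segment per line.
function toPlainText(segments: Segment[]): string {
  return segments.map((seg) => seg.text.trim()).join("\n");
}
```

Calling `toSrt(segments)` or `toVtt(segments)` on the same segment list yields the two subtitle files; only the header, the cue numbering, and the millisecond separator change.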
Yes — 100% free with no sign-up, no watermark, and no upload limits.
No. The Whisper model runs entirely in your browser. Your audio and video never leave your device.
Whisper supports 99 languages. The UI exposes common ones (English, Chinese, Japanese, Korean, Spanish, French, German, Portuguese, Russian, Arabic, Hindi) plus auto-detect.
The default is Precise mode (whisper-base), which gives very good accuracy on clear speech and suits production captions. Switch to Fast mode (whisper-tiny) for shorter clips or simple English.
First use downloads the Whisper model (~80 MB) and caches it. Subsequent runs start in seconds and work offline.
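The Fast and Precise modes described above could be wired to model names with a small lookup like the one below. This is only a sketch: the model names come from the text, but the `Mode` type, the `MODEL_FOR_MODE` table, and `resolveModel` are hypothetical names, and the fallback to Precise assumes whisper-base is the default as stated.

```typescript
// Hypothetical mode identifiers for the two quality settings.
type Mode = "fast" | "precise";

// Model names as given in the text above.
const MODEL_FOR_MODE: Record<Mode, string> = {
  fast: "whisper-tiny",
  precise: "whisper-base",
};

// Assumption: whisper-base is the default, so unknown input falls back to Precise.
const DEFAULT_MODE: Mode = "precise";

// Resolve a possibly missing or invalid mode string to a model name.
function resolveModel(mode?: string): string {
  if (mode === "fast" || mode === "precise") {
    return MODEL_FOR_MODE[mode];
  }
  return MODEL_FOR_MODE[DEFAULT_MODE];
}
```

Keeping the mapping in one table means adding another model size later is a one-line change.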