Techeons

Imagine | Explore | Innovate

Transcribe audio locally with Whisper.cpp

Posted on February 23, 2025

Whisper.cpp is an open-source C/C++ implementation of the Whisper speech recognition system. Whisper is an automatic speech recognition (ASR) system developed by OpenAI that can transcribe speech in many languages and translate it into English.

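It is a lightweight, highly optimized port for running the original Whisper models. It has no required external dependencies and runs well on CPU alone, with optional GPU acceleration on some platforms, which makes it suitable for deployment on edge devices such as smartphones, tablets, and single-board computers.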

This makes it practical for developers to build applications that understand human speech without sending audio to a remote service.

Some key features of Whisper.cpp include:

  • High accuracy: Whisper.cpp runs the same Whisper models as the original implementation, so accuracy depends mainly on the model size you choose.
  • Multi-language support: the multilingual models transcribe many languages, including English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese, Japanese, and Korean, and can translate speech into English. Models with the .en suffix are English-only.
  • Low latency: Whisper.cpp is optimized for fast inference, and with the smaller models it can transcribe audio streams in near real time on suitable hardware.
  • Small footprint: Whisper.cpp has a small binary size, making it suitable for deployment on resource-constrained devices.

How to set it up

1) Clone the repository:

git clone https://github.com/ggerganov/whisper.cpp.git

2) Navigate into the directory:

cd whisper.cpp

3) Download one of the Whisper models converted to ggml format. For example:

sh ./models/download-ggml-model.sh base.en

4) Now build the whisper-cli example and transcribe an audio file like this:

# build the project
cmake -B build
cmake --build build --config Release

# transcribe an audio file
./build/bin/whisper-cli -f samples/jfk.wav

For a quick demo, simply run make base.en.

This command downloads the base.en model, already converted to the custom ggml format, and runs inference on all .wav samples in the samples folder.

For detailed usage instructions, run: ./build/bin/whisper-cli -h
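As a minimal sketch of how a few common options fit together, the wrapper below passes the model (-m), the input file (-f), a thread count (-t), and plain-text output (-otxt) to whisper-cli. The function name transcribe and the variables WHISPER_BIN and WHISPER_MODEL are hypothetical names chosen for this example, and the default paths assume the build and download steps above were run from the repository root.

```shell
#!/bin/sh
# Hypothetical defaults; override them in the environment if your paths differ.
WHISPER_BIN="${WHISPER_BIN:-./build/bin/whisper-cli}"
WHISPER_MODEL="${WHISPER_MODEL:-models/ggml-base.en.bin}"

# transcribe WAV [THREADS]
# Runs whisper-cli on one 16-bit WAV file and asks for a plain-text transcript.
transcribe() {
    wav="$1"
    threads="${2:-4}"   # assumed default of 4 threads for this sketch

    # Fail early with a readable message if the binary has not been built yet.
    if [ ! -x "$WHISPER_BIN" ]; then
        echo "whisper-cli not found at $WHISPER_BIN (build it first)" >&2
        return 1
    fi

    "$WHISPER_BIN" -m "$WHISPER_MODEL" -f "$wav" -t "$threads" -otxt
}
```

After sourcing the file, a call such as transcribe samples/jfk.wav 8 would run the bundled sample with eight threads. Check ./build/bin/whisper-cli -h for the options available in your build, since they can change between versions.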

Note that the whisper-cli example currently runs only with 16-bit WAV files, so make sure to convert your input before running the tool. For example, you can use ffmpeg to produce 16 kHz, mono, 16-bit PCM audio like this:

ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
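If you have a whole folder of recordings, the same ffmpeg command can be wrapped in a loop. The sketch below is one way to do that, assuming the inputs are .mp3 files; wav_name and convert_dir are hypothetical helper names, not part of whisper.cpp or ffmpeg.

```shell
#!/bin/sh
# Map an input path to its output path: same location, .wav extension.
wav_name() {
    printf '%s.wav\n' "${1%.*}"
}

# convert_dir [DIR]
# Converts every .mp3 in DIR (default: current directory) to 16 kHz mono 16-bit WAV.
convert_dir() {
    dir="${1:-.}"
    for src in "$dir"/*.mp3; do
        # If the glob matched nothing, the pattern is left as-is; skip it.
        [ -e "$src" ] || continue
        # -nostdin keeps ffmpeg from reading the terminal inside the loop,
        # -y overwrites an existing output file without asking.
        ffmpeg -nostdin -y -i "$src" -ar 16000 -ac 1 -c:a pcm_s16le "$(wav_name "$src")"
    done
}
```

Calling convert_dir recordings would then leave a .wav file next to each .mp3 in a (hypothetical) recordings folder, ready to pass to whisper-cli with -f.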

More audio samples

If you want some extra audio samples to play with, simply run:

make -j samples

This will download a few more audio files from Wikipedia and convert them to 16-bit WAV format via ffmpeg.

You can download and run the other models as follows:

make -j tiny.en
make -j tiny
make -j base.en
make -j base
make -j small.en
make -j small
make -j medium.en
make -j medium
make -j large-v1
make -j large-v2
make -j large-v3
make -j large-v3-turbo
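Once several models are downloaded, it can be useful to run them on the same file and compare the transcripts. The following is a rough sketch under the assumption that each model is stored as models/ggml-NAME.bin, which is where the download script places them; model_path and compare_models are hypothetical helper names.

```shell
#!/bin/sh
# Path where a downloaded model is expected to live (assumed naming scheme).
model_path() {
    printf 'models/ggml-%s.bin\n' "$1"
}

# compare_models WAV MODEL...
# Runs whisper-cli on the same WAV file once per model, skipping models
# that have not been downloaded.
compare_models() {
    wav="$1"
    shift
    for name in "$@"; do
        model=$(model_path "$name")
        if [ ! -f "$model" ]; then
            echo "skipping $name: $model not downloaded" >&2
            continue
        fi
        echo "== $name =="
        ./build/bin/whisper-cli -m "$model" -f "$wav"
    done
}
```

For example, compare_models samples/jfk.wav tiny.en base.en small.en prints one transcript per available model. Larger models are generally slower and need more memory, so it is worth starting with the small ones.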

Links

  • https://github.com/ggerganov/whisper.cpp
