🧩 Model Card: whisper-large-v3-turbo

  • Type: Speech-to-Text (ASR: Automatic Speech Recognition)
  • Think: No
  • Base Model: openai/whisper-large-v3-turbo
  • Max Context Length: NA
  • Default Context Length: NA

▶️ Run with FastFlowLM in PowerShell:

In CLI Mode, the ASR model must be used together with an LLM (loaded concurrently). In Server Mode (flm v0.9.21 and later), the ASR model can also be used standalone.

CLI Mode

Start with ASR enabled:

Load the ASR model (whisper-v3:turbo) in the background, alongside the LLM (gemma3:4b):

flm run gemma3:4b --asr 1 

or

flm run gemma3:4b -a 1 

Then, type (replace path\to\audio_sample.mp3 with your audio file path):

/input "path\to\audio_sample.mp3" summarize it

Server Mode

Start with ASR enabled:

  • Load the ASR model (whisper-v3:turbo) in the background, alongside the LLM (gemma3:4b):
    flm serve gemma3:4b --asr 1 
    

    or

    flm serve gemma3:4b -a 1 
    
  • Load the ASR model (whisper-v3:turbo) as a standalone model (no LLM):
    flm serve --asr 1 
    

    or

    flm serve -a 1 
    

Send audio to POST /v1/audio/transcriptions via any OpenAI-compatible client or Open WebUI.

See more API details here → /v1/audio/

Example 1: OpenAI Client


# Import the official OpenAI Python SDK (FastFlowLM mirrors the OpenAI API schema)
from openai import OpenAI

# Initialize the client to point at your local FastFlowLM server
# - base_url: FastFlowLM's local OpenAI-compatible REST endpoint
# - api_key: Dummy token; FastFlowLM typically doesn't enforce auth, but the client requires a string
client = OpenAI(
    base_url="http://localhost:52625/v1",  # FastFlowLM local API endpoint
    api_key="flm",                         # Placeholder key
)

# Open the audio file in binary mode and create a transcription request
# - model: name of the speech-to-text model exposed by FLM (e.g., "whisper-v3")
# - file: file-like object pointing to your audio
with open("audio.mp3", "rb") as f:
    resp = client.audio.transcriptions.create(
        model="whisper-v3",
        file=f,
    )

# Print the transcribed text returned by the server
print(resp.text)
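
The same request can also be made without the SDK. Below is a minimal raw-HTTP sketch using the requests library, assuming FastFlowLM mirrors OpenAI's multipart schema (a file upload plus a model form field) and returns the transcript in a JSON "text" field:

# Raw HTTP alternative to the SDK call above (sketch; schema assumed from the OpenAI API)
import requests

url = "http://localhost:52625/v1/audio/transcriptions"

# Send the audio as multipart/form-data, with the model name as a form field
with open("audio.mp3", "rb") as f:
    resp = requests.post(
        url,
        headers={"Authorization": "Bearer flm"},         # placeholder key; auth typically not enforced
        files={"file": ("audio.mp3", f, "audio/mpeg")},  # audio upload
        data={"model": "whisper-v3"},                    # model name form field
    )

resp.raise_for_status()
print(resp.json()["text"])  # default JSON response carries the transcript in "text"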

Example 2: Open WebUI

  • Follow the Open WebUI setup guide.
  • In the bottom-left corner, click the user icon, then select Settings.
  • In the bottom panel, open Admin Settings.
  • In the left sidebar, navigate to Audio.
  • Set Speech-to-Text Engine to OpenAI.
  • Enter:

    API Base URL: http://localhost:52625/v1 (Open WebUI Desktop) or http://host.docker.internal:52625/v1 (Open WebUI in Docker)
    API Key: flm (any value works)
    STT Model: whisper-large-v3-turbo (type in the model name; a different model name also works)

  • Save the settings.
  • You’re ready to upload audio files! (Choose an LLM to load and use concurrently)
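
To verify the endpoint Open WebUI calls, you can run a quick check in Python. This sketch reuses the base URL and model name entered above:

# Quick check that the endpoint Open WebUI will call is reachable
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:52625/v1",  # same base URL as entered in Open WebUI
    api_key="flm",                         # placeholder key
)

with open("audio.mp3", "rb") as f:
    resp = client.audio.transcriptions.create(
        model="whisper-large-v3-turbo",  # same model name as entered in Open WebUI
        file=f,
    )

print(resp.text)  # transcript returned by FastFlowLM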