
🧩 Model Card: LiquidAI/LFM2-1.2B

  • Type: Text-to-Text
  • Thinking Mode: No
  • Base Model: LiquidAI/LFM2-1.2B
  • Max Context Length: 32k tokens
  • Default Context Length: 32k tokens (change default)
  • Set Context Length at Launch (see the sketch below)
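
The context length can be set when the model is launched rather than by editing the default. A minimal sketch, assuming a hypothetical --ctx-len flag; the actual option name may differ, so check the FastFlowLM CLI help or the docs page linked above:

  # Launch LFM2-1.2B with a 16k context window instead of the 32k default.
  # NOTE: --ctx-len is an assumed flag name, not confirmed FastFlowLM syntax.
  flm run lfm2:1.2b --ctx-len 16384

A smaller context window shrinks the KV cache held in memory during inference, which can help on machines with limited memory.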

▶️ Run with FastFlowLM in PowerShell:

flm run lfm2:1.2b
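
FastFlowLM can also run as a local server instead of the interactive CLI (see Server Basics in the docs), which is how integrations like Open WebUI connect to it. A minimal sketch, assuming a flm serve subcommand and an Ollama-style endpoint on localhost:11434; both are assumptions to verify against the Server Basics page:

  # Start the server with the model loaded (assumed subcommand; see Server Basics).
  flm serve lfm2:1.2b

  # From another PowerShell window, send a prompt over HTTP.
  # The port (11434) and the /api/generate route follow the Ollama
  # convention that clients like Open WebUI expect; confirm both in the docs.
  $body = @{ model = "lfm2:1.2b"; prompt = "Hello!"; stream = $false } | ConvertTo-Json
  Invoke-RestMethod -Uri "http://localhost:11434/api/generate" `
    -Method Post -ContentType "application/json" -Body $body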
