Roadmap

What's coming next

Drawing on deep expertise in hardware-accelerated parallel processing and LLM internals, FastFlowLM is advancing the frontier of on-device AI.

Development priorities

  • Performance

    Ongoing kernel optimizations and memory management improvements.

  • Model support

    Expanding support for new architectures and quantization formats.

  • Developer tools

    Enhanced CLI, better debugging, and improved documentation.

Future directions

  • Comprehensive NPU Support

    FastFlowLM aims to be the go-to runtime for Ryzen™ AI NPUs, offering broad model compatibility, top-tier performance, and a robust developer ecosystem.

  • Expanding to New Architectures

We are actively extending platform support to additional NPU architectures, including those from Qualcomm, Intel, and Broadcom.

  • Inference at Scale

Building advanced inference optimization software designed to scale seamlessly across multiple chips and cards, and to enable rack-level parallelism.

Get involved

Roadmap priorities are discussed openly in our community, and we are actively seeking strategic partners and hardware collaborators to accelerate this work. Join the conversation to help shape FastFlowLM’s future.