Demos
See FastFlowLM running on real hardware
Explore live demos and recordings of FastFlowLM running LLMs, VLMs, and embedding models entirely on the Ryzen™ AI NPU. From interactive chat to system integrations, these examples show what is possible when the NPU serves as the primary inference engine.
GPT-OSS on NPU
GPT-OSS-20B streaming fully on the Ryzen™ AI NPU
Stream GPT-OSS-20B locally with FastFlowLM: the NPU does the heavy lifting while CPU and GPU usage stays minimal.