Realtime AI avatar
Track the stack for low-latency face-to-face video agents, from speech and LLMs to avatar streaming.

Wan-Streamer v0.1 research tracker
Follow Alibaba's realtime video AI model, API availability, Hugging Face updates, and production-ready alternatives for interactive AI avatars and video agents.
Public API: Not available yet
Open weights: Not released
Hugging Face: Paper only
Best current path: Avatar API + realtime LLM
API status
Wan-Streamer is different from normal text-to-video tools: the paper describes a model that can listen, watch, reason, speak, and generate video responses in a realtime loop. Public access has not launched, so this page tracks what is usable now and what changes.
No public API endpoint yet.
No official downloadable weights.
Available now: project page, arXiv paper, and Hugging Face paper page.
Alternatives
Most production systems combine a realtime LLM, speech, avatar rendering, and WebRTC instead of using one end-to-end model.
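The cascaded approach described above can be sketched as a chain of stages that each add latency to a conversational turn. This is a minimal illustration, not any vendor's SDK: every function name here (`speech_to_text`, `realtime_llm`, `text_to_speech`, `render_avatar`, `run_turn`) is a hypothetical stub, the sleeps stand in for real model and network latency, and the WebRTC transport is left out entirely.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class Turn:
    """One conversational turn as it moves through the cascaded stack."""
    audio_in: bytes
    transcript: str = ""
    reply_text: str = ""
    audio_out: bytes = b""
    video_frames: int = 0


# Every stage below is a hypothetical stub. A real system would call a
# speech-to-text service, a realtime LLM, a TTS engine, and an avatar renderer.
async def speech_to_text(turn: Turn) -> Turn:
    await asyncio.sleep(0.01)  # stand-in for model/network latency
    turn.transcript = turn.audio_in.decode("utf-8")
    return turn


async def realtime_llm(turn: Turn) -> Turn:
    await asyncio.sleep(0.01)
    turn.reply_text = f"You said: {turn.transcript}"
    return turn


async def text_to_speech(turn: Turn) -> Turn:
    await asyncio.sleep(0.01)
    turn.audio_out = turn.reply_text.encode("utf-8")
    return turn


async def render_avatar(turn: Turn) -> Turn:
    await asyncio.sleep(0.01)
    # Pretend one video frame per 4 bytes of audio, with at least one frame.
    turn.video_frames = max(1, len(turn.audio_out) // 4)
    return turn


STAGES = [speech_to_text, realtime_llm, text_to_speech, render_avatar]


async def run_turn(audio_in: bytes) -> tuple[Turn, dict[str, float]]:
    """Run the stages in order and record each stage's wall-clock latency."""
    turn = Turn(audio_in=audio_in)
    timings: dict[str, float] = {}
    for stage in STAGES:
        start = time.perf_counter()
        turn = await stage(turn)
        timings[stage.__name__] = time.perf_counter() - start
    return turn, timings


def demo() -> tuple[Turn, dict[str, float]]:
    return asyncio.run(run_turn(b"hello"))


if __name__ == "__main__":
    turn, timings = demo()
    print(turn.reply_text)
    for name, seconds in timings.items():
        print(f"{name}: {seconds * 1000:.1f} ms")
```

Timing each stage separately makes it easier to see where a cascaded stack spends its latency budget, which is the main trade-off against a single end-to-end model. Production pipelines typically stream partial results between stages rather than waiting for each one to finish, so treat the sequential loop here as a simplification.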
Compare APIs for website sales agents, product explainers, onboarding guides, and customer support.
Follow tools that can power interactive AI anchors for education, shopping, livestreams, and role play.
Wan-Streamer v0.1 is not publicly available yet. It is currently a research project and demo, not a public commercial API or downloadable model.
Wan-Streamer is not the same as the Wan 2.7 APIs. Wan 2.7 APIs generate or edit videos asynchronously, while Wan-Streamer is about realtime audio-video interaction.
Teams usually combine OpenAI Realtime or another realtime LLM with Tavus, HeyGen, D-ID, Simli, Azure Avatar, LiveKit, or Pipecat.
This tracker covers public API status, Hugging Face updates, papers, GitHub releases, alternatives, latency claims, pricing, and integration notes.