Airi is an interactive digital companion engine designed to bridge large language models with local animation rendering. It functions as a middleware platform that synchronizes conversational text streams with skeletal and facial movements to drive virtual avatars in real time.
The framework distinguishes itself by integrating desktop context awareness: characters can follow a user's on-screen activity in both desktop and web environments. It uses a hybrid execution model that runs language processing in the cloud and rendering locally on the client, which keeps interactions responsive and low-latency.
The system manages character behavior through a modular state machine that maps emotional tokens from language models to specific animation clips. It supports bidirectional event streaming to coordinate concurrent tasks, including audio synthesis, lip-sync generation, and character movement.
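The mapping from emotional tokens to animation clips can be illustrated with a minimal sketch. It is not Airi's actual implementation: the `<|emotion|>` token format, the `CLIPS` table, the clip names, and the `CharacterState` class are all assumptions made up for this example.

```python
import re
from dataclasses import dataclass, field

# Assumed token format (hypothetical): the model wraps an emotion name in <|...|>.
EMOTION_TOKEN = re.compile(r"<\|(\w+)\|>")

# Hypothetical emotion-to-clip table; the clip names are placeholders.
CLIPS = {
    "neutral": "idle_loop",
    "happy": "smile_bounce",
    "sad": "head_down",
    "surprised": "step_back",
}


@dataclass
class CharacterState:
    """Tiny state machine: the current emotion selects the active clip."""

    emotion: str = "neutral"
    history: list = field(default_factory=list)  # (old, new) transitions

    @property
    def clip(self) -> str:
        return CLIPS[self.emotion]

    def feed(self, chunk: str) -> str:
        """Apply any emotion tokens in a text chunk and return the speakable text."""
        for name in EMOTION_TOKEN.findall(chunk):
            # Unknown emotions are ignored so the character keeps its current state.
            if name in CLIPS and name != self.emotion:
                self.history.append((self.emotion, name))
                self.emotion = name
        # Strip the tokens so only plain text reaches speech synthesis.
        return EMOTION_TOKEN.sub("", chunk).strip()


state = CharacterState()
spoken = state.feed("<|happy|> Nice to see you again!")
print(spoken, "->", state.clip)
```

In this sketch the stripped text would go on to audio synthesis and lip-sync while the selected clip drives the avatar. A token split across two streamed chunks is not handled here; a real stream parser would need to buffer partial tokens.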