omlx is a local inference server designed to run large language models, vision models, and embedding models on Apple Silicon. It provides a private alternative to industry-standard AI endpoints by hosting a local API gateway that mirrors OpenAI and Anthropic specifications. The system distinguishes itself through specialized hardware optimizations, including continuous batching for high throughput and a tiered caching system that offloads memory blocks to SSD. It also functions as a Model Context Protocol host, enabling the integration of local models with external tools, agents, and structur
GLM-4 is an open weights large language model designed as a multimodal chat system. It functions as a reasoning-focused and multilingual model capable of processing and generating responses across text and visual data types. The model is distinguished by its function-calling capabilities, allowing it to interface with external tools and APIs to execute tasks and retrieve real-time information. It is optimized for complex logical reasoning, mathematical problem solving, and deep research involving long-form content generation. Broad capabilities include multilingual text generation, the creat
Open-claude-cowork is an LLM agent workflow orchestrator and multi-agent collaborative workspace. It serves as a SaaS tool integration framework and a real-time AI chat interface designed to connect large language models with external software applications and browser tools to automate complex business processes. The platform functions as a headless browser automation tool, enabling AI agents to navigate websites and interact with web-based interfaces automatically. It allows for the creation of shared environments where multiple agents coordinate using external tools and shared memory to com
This project is a development framework for building edge-based AI agents that perform multimodal inference and system-level automation directly on mobile devices. By prioritizing local-first execution, the platform ensures data privacy and offline functionality, allowing developers to run large language models on hardware without requiring external server connectivity. The framework distinguishes itself through an integrated orchestration layer that connects language models to custom tools, scripts, and native device intents. It provides a structured registry for mapping natural language ins