# Desktop Computer Control Agents

> Search results for `give agents a computer to control the desktop and apps` on awesome-repositories.com. 114 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/give-agents-a-computer-to-control-the-desktop-and-apps

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/give-agents-a-computer-to-control-the-desktop-and-apps).**

## Results

- [bytedance/ui-tars-desktop](https://awesome-repositories.com/repository/bytedance-ui-tars-desktop.md) (36,445 ⭐) — UI-TARS-desktop is a cross-platform desktop application designed to automate software interface interactions. It functions as a local agent environment that interprets graphical user interfaces through multimodal visual-language model reasoning, allowing it to navigate and manipulate software by simulating human-like mouse and keyboard inputs.

The platform distinguishes itself by executing all visual recognition and decision-making logic directly on the host machine. This local inference model ensures that screen data and sensitive information remain private, as no processing is offloaded to
- [pot-app/pot-desktop](https://awesome-repositories.com/repository/pot-app-pot-desktop.md) (17,110 ⭐) — This application is a cross-platform desktop utility designed for automated translation, optical character recognition, and speech synthesis. It functions as a modular client that integrates various local and remote language services, allowing users to process text through hotkeys, clipboard monitoring, or direct input.

The software distinguishes itself through a plugin-based architecture and a built-in automation framework. By exposing a local network interface, it enables external applications and scripts to programmatically trigger its translation and recognition workflows. Users can furth
- [desktop/desktop](https://awesome-repositories.com/repository/desktop-desktop.md) (21,587 ⭐) — This project is a graphical desktop client for managing version control repositories. It provides a visual interface that translates complex command-line operations into intuitive workflows for tracking local code changes and synchronizing them with remote servers.

The application distinguishes itself through integrated credential management and network configuration tools. It utilizes secure authentication flows to handle remote service logins and includes a network layer that automatically detects and applies system-wide proxy settings. These capabilities ensure that version control operati
- [forem/forem](https://awesome-repositories.com/repository/forem-forem.md) (22,726 ⭐) — Forem is an open-source platform designed for building and managing technical communities. It functions as a social publishing engine that enables members to share long-form content, participate in threaded discussions, and engage through social interactions. The platform provides tools for organizations to maintain branded profiles, host community hackathons, and facilitate collaborative learning through structured educational tracks.

Beyond its social features, Forem integrates advanced capabilities for AI agent workflow orchestration and codebase knowledge graphing. It allows developers to
- [e2b-dev/open-computer-use](https://awesome-repositories.com/repository/e2b-dev-open-computer-use.md) (2,084 ⭐) — Open-computer-use is a framework designed to connect vision-capable language models to isolated cloud-based desktop environments. It functions as an agentic interface that enables autonomous systems to interact with graphical user interfaces by simulating mouse movements, keyboard keystrokes, and shell commands. By bridging language models with remote workspaces, the platform facilitates the execution of complex, long-running tasks within secure, sandboxed environments.

The platform distinguishes itself through its ability to orchestrate thousands of concurrent, isolated instances, making it
- [bytebot-ai/bytebot](https://awesome-repositories.com/repository/bytebot-ai-bytebot.md) (10,413 ⭐) — Bytebot is an LLM desktop automation framework and virtual Linux desktop environment. It enables AI agents to plan and execute mouse and keyboard actions on a virtual computer using natural language, allowing for autonomous desktop automation and the integration of legacy systems that lack native APIs.

The system operates as an LLM API gateway and a Model Context Protocol server, routing requests across multiple language model providers with integrated load balancing and rate limiting. It provides isolated, containerized environments where agents use visual reasoning to interpret screenshots
- [leanote/desktop-app](https://awesome-repositories.com/repository/leanote-desktop-app.md) (1,559 ⭐) — Leanote Desktop App, based on Electron (atom-shell). http://leanote.org
- [andrewyng/aisuite](https://awesome-repositories.com/repository/andrewyng-aisuite.md) (14,692 ⭐) — This project is a framework for managing generative AI services through a unified provider interface and adapter layer. It provides a standardized API for calling multiple cloud-based and locally hosted models, translating provider-specific parameters and responses into a uniform format.

The system includes an agent orchestrator designed for long-running tasks, featuring state persistence for resuming runs and execution tracing to monitor decision-making processes. It integrates the Model Context Protocol to connect models to external servers and filesystems and employs a policy-based executi
- [hiddify/hiddify-app](https://awesome-repositories.com/repository/hiddify-hiddify-app.md) (30,948 ⭐) — Hiddify is a cross-platform proxy client designed to manage secure network connections and traffic routing across desktop and mobile operating systems. It functions as a unified proxy manager, providing a centralized interface to configure and control various network proxy protocols for encrypted and private internet access.

The application distinguishes itself by integrating local loopback interception, which configures the operating system network stack to route traffic through a local port for granular filtering. It also serves as a self-hosted infrastructure tool, enabling users to automa
- [lobehub/lobehub](https://awesome-repositories.com/repository/lobehub-lobehub.md) (78,736 ⭐) — LobeHub is a comprehensive multi-agent orchestration platform designed for building, configuring, and deploying specialized AI agents. It provides a unified chat-based gateway that allows users to manage autonomous agent teams across web, desktop, and mobile environments. By utilizing a framework that supports persistent memory and granular tool integration, the platform enables the execution of complex, multi-step workflows and domain-specific tasks.

The platform distinguishes itself through an interactive artifact renderer that injects dynamic, visual UI elements directly into the chat stre
- [getpaseo/paseo](https://awesome-repositories.com/repository/getpaseo-paseo.md) (9,118 ⭐) — Paseo is an LLM coding agent orchestrator and multi-agent workflow manager designed to coordinate multiple AI agents across isolated git worktrees. It provides a unified control interface for managing these agents and their associated environments to execute complex programming tasks.

The system distinguishes itself through a remote agent daemon that enables secure access to local coding agents via encrypted relays. It employs a git worktree environment manager to isolate parallel tasks into dedicated directories and branch-based server URLs, preventing file collisions and network port confli
- [samypesse/how-to-make-a-computer-operating-system](https://awesome-repositories.com/repository/samypesse-how-to-make-a-computer-operating-system.md) (0 ⭐) — How to Make a Computer Operating System
- [ghuntley/how-to-build-a-coding-agent](https://awesome-repositories.com/repository/ghuntley-how-to-build-a-coding-agent.md) (5,145 ⭐) — This repository is a reference implementation and guided tutorial for building an AI coding agent that combines conversational interaction with file system manipulation and sandboxed shell execution. The agent uses a large language model as its core decision-making component, operating within a turn-based conversational loop where it can generate responses or invoke tools, and tool results are fed back into the dialogue. It provides primitives for reading, writing, and listing files on the local filesystem, as well as searching code using regular expressions.

The agent’s capabilities are exte
- [openai/openai-agents-python](https://awesome-repositories.com/repository/openai-openai-agents-python.md) (27,191 ⭐) — This project is a Python framework for building autonomous, event-driven agent systems. It provides a unified runtime for orchestrating multi-agent workflows, managing persistent conversation state, and executing code within secure, isolated sandbox environments. The framework is designed to handle complex task delegation, allowing agents to invoke other agents as tools while maintaining context across multi-turn interactions.

The framework distinguishes itself through its deep integration with the Model Context Protocol, enabling agents to connect to external data sources and remote services
- [amitshekhariitbhu/rxjava2-android-samples](https://awesome-repositories.com/repository/amitshekhariitbhu-rxjava2-android-samples.md) (4,917 ⭐) — This repository is a collection of practical implementation patterns and reference samples for using RxJava 2 to manage asynchronous data streams in Android applications. It serves as a reactive programming implementation guide, providing code examples for handling complex event-driven logic and asynchronous patterns within mobile environments.

The project distinguishes itself by providing specific reference implementations for common mobile challenges, such as real-time search with debouncing, on-demand list pagination for infinite scrolling, and cache-first data streaming to reduce perceive
- [getstation/desktop-app](https://awesome-repositories.com/repository/getstation-desktop-app.md) (1,684 ⭐) — One app to rule them all!
- [different-ai/openwork](https://awesome-repositories.com/repository/different-ai-openwork.md) (10,046 ⭐) — Openwork is an LLM agent orchestration platform and cross-platform desktop application designed for building and running automated workflows. It serves as a local AI agent host and session manager, allowing users to connect local project folders to various large language models and remote cloud workers.

The project distinguishes itself through a local-first execution model that enables agents to process files directly on a host machine. It implements human-in-the-loop permissioning to intercept agent resource requests, requiring explicit user approval before accessing specific local system fi
- [amitshekhariitbhu/fast-android-networking](https://awesome-repositories.com/repository/amitshekhariitbhu-fast-android-networking.md) (5,906 ⭐) — 🚀 A Complete Fast Android Networking Library that also supports HTTP/2 🚀
- [amitshekhariitbhu/androidnetworking](https://awesome-repositories.com/repository/amitshekhariitbhu-androidnetworking.md) (5,906 ⭐) — AndroidNetworking is an HTTP networking library for Android that handles the full lifecycle of network communication, from sending requests to parsing responses and caching data. It provides a unified interface for executing GET, POST, PUT, DELETE, HEAD, and PATCH requests, with support for both synchronous and asynchronous execution, and includes built-in JSON response parsing that converts server responses directly into Java objects or lists.

The library distinguishes itself through a set of integrated capabilities that go beyond basic request execution. It manages file downloads and upload
- [the-control-group/voyager](https://awesome-repositories.com/repository/the-control-group-voyager.md) (11,819 ⭐) — Voyager is a Laravel administration panel and PHP database manager that provides a web-based dashboard for managing application data and administrative user privileges. It functions as a BREAD (browse, read, edit, add, delete) manager, a variant of CRUD, allowing users to browse, read, edit, add, and delete database records through a graphical interface.

The system enables database content management without the need to write custom controller code or execute raw SQL. It includes tools for role-based access control to define and manage administrative permissions, restricting access to backend tools based on assigned user roles.
- [microsoft/omniparser](https://awesome-repositories.com/repository/microsoft-omniparser.md) (24,377 ⭐) — OmniParser is a multimodal interaction engine designed to function as a desktop automation agent. It interprets visual screen information to execute complex, multi-step tasks across operating system environments by bridging visual interface perception with language models. Through a continuous cycle of observation and command execution, the system grounds high-level natural language instructions into precise, coordinate-based actions.

The project distinguishes itself by utilizing vision-based parsing to interact with software interfaces without requiring access to underlying application progr
- [the-pr-agent/pr-agent](https://awesome-repositories.com/repository/the-pr-agent-pr-agent.md) (11,637 ⭐) — PR-Agent is an AI code review automation system that uses large language models to evaluate code quality and suggest improvements within the version control workflow. It functions as an automated pull request reviewer and summarizer, analyzing code changes to provide logic explanations and concise descriptions of pending merges.

The system includes a context compressor that shrinks large file patches to fit within the token limits of language models. It supports custom coding standard enforcement by allowing users to adjust review categories and prompting logic via configuration files to alig
- [bloxstraplabs/bloxstrap](https://awesome-repositories.com/repository/bloxstraplabs-bloxstrap.md) (3,034 ⭐) — Bloxstrap is a custom game bootstrapper and configuration tool for Roblox. It replaces the standard launcher to enable advanced startup configurations, inject internal engine flags, and manage a specialized installation directory.

The project provides a client mod manager that allows users to override local assets, such as sounds, textures, and fonts, and ensures these customizations persist across game updates. It includes a configuration utility to unlock hidden graphics settings and engine parameters, alongside a server tracker that identifies the geographic location of active game servers
- [anthropics/claude-code](https://awesome-repositories.com/repository/anthropics-claude-code.md) (132,728 ⭐) — Anthropic's terminal-native AI coding agent.
- [liquidgalaxylab/lg-gesture-and-voice-control](https://awesome-repositories.com/repository/liquidgalaxylab-lg-gesture-and-voice-control.md) (0 ⭐) — LG Gesture and Voice Control: an app that provides gesture and voice control for Liquid Galaxy.
- [galaxy-s10/billd-desk](https://awesome-repositories.com/repository/galaxy-s10-billd-desk.md) (5,418 ⭐) — This project is a remote desktop software suite and administration tool designed for controlling remote devices via web browsers or desktop applications across different operating systems. It functions as a secure remote access gateway and device manager, providing a centralized backend for auditing sessions and deploying private infrastructure to target machines.

The system distinguishes itself through the use of GPU-accelerated video streaming and hardware encoding to reduce latency. It enables multi-device monitoring via a screen wall and supports the creation of virtual display emulations
- [tigervnc/tigervnc](https://awesome-repositories.com/repository/tigervnc-tigervnc.md) (6,884 ⭐) — TigerVNC is a remote desktop software system consisting of a server and client implementation. It enables the streaming of graphical desktop environments across different operating systems by implementing the RFB protocol to exchange pixel data and input events.

The software provides secure remote access through password authentication and the use of cryptographic certificates to verify remote server identities. It facilitates remote system management by capturing a local display on a server and forwarding it to a viewer application for remote control.

The system includes capabilities for di
- [fyne-io/fyne](https://awesome-repositories.com/repository/fyne-io-fyne.md) (27,941 ⭐) — Fyne is a cross-platform graphical user interface toolkit for the Go programming language. It provides a comprehensive framework for building native applications that run on desktop, mobile, and web environments from a single codebase. The toolkit centers on a canvas-based rendering engine and a device-independent layout engine, ensuring that visual elements maintain consistent dimensions and behavior across diverse operating systems and screen densities.

The project distinguishes itself through a reactive data-binding system that automatically synchronizes application state with interface co
- [air-controller/air-controller-desktop](https://awesome-repositories.com/repository/air-controller-air-controller-desktop.md) (581 ⭐) — Chinese documentation
- [mack-a/v2ray-agent](https://awesome-repositories.com/repository/mack-a-v2ray-agent.md) (19,081 ⭐) — V2Ray-agent is a shell-based orchestration tool designed to automate the deployment, configuration, and lifecycle management of network proxy services. It provides a structured framework for setting up encrypted tunnels and managing proxy processes as persistent background services through system initialization managers.

The project distinguishes itself through a modular architecture that integrates automated security certificate management and multi-user access control into the deployment workflow. By utilizing template-driven configuration generation and reverse proxy traffic multiplexing,
- [httpie/cli](https://awesome-repositories.com/repository/httpie-cli.md) (38,228 ⭐) — This project is a terminal-based HTTP client designed for interacting with web services, debugging APIs, and automating network requests. It provides a specialized command-line interface that simplifies the construction of complex HTTP exchanges, allowing users to test and inspect web services directly from the shell.

The tool distinguishes itself through a declarative syntax engine that translates shorthand command-line tokens into fully formed HTTP requests, including headers, parameters, and body payloads. It features a modular, plugin-based architecture that enables users to extend core f
- [amidaware/tacticalrmm](https://awesome-repositories.com/repository/amidaware-tacticalrmm.md) (4,161 ⭐) — TacticalRMM is a remote monitoring and management platform designed for overseeing endpoints and automating IT administration. It functions as an endpoint management tool and IT automation framework, providing a centralized dashboard for executing scripts, monitoring system health, and managing remote devices across multiple tenants.

The platform distinguishes itself through a comprehensive remote administration suite that includes real-time shell access, remote file management, and registry editing. It integrates with third-party remote desktop software and provides a hierarchical policy inh
- [kunkundi/crossdesk](https://awesome-repositories.com/repository/kunkundi-crossdesk.md) (3,776 ⭐) — Crossdesk is a cross-platform remote desktop software used for streaming and controlling remote computers. It consists of a hardware-accelerated video streamer and a web-based client that allows users to operate remote devices through a browser without installing platform-specific software.

The system utilizes a self-hosted connection relay deployed via containers to manage remote sessions and forward traffic when direct peer-to-peer connections fail. To maintain performance, it employs hardware-accelerated video encoding and streaming to reduce latency and CPU load.

The software provides cr
- [dokploy/dokploy](https://awesome-repositories.com/repository/dokploy-dokploy.md) (34,901 ⭐) — Dokploy is a self-hosted platform-as-a-service designed to simplify the deployment and management of containerized applications and databases. It provides a centralized control plane that decouples administrative management from application workloads, allowing users to oversee infrastructure across multiple server nodes through a unified web interface or a command-line tool.

The platform distinguishes itself through an extensive library of pre-configured application templates, enabling the rapid deployment of databases, identity providers, and various productivity or development tools. It sup
- [cube-js/cube](https://awesome-repositories.com/repository/cube-js-cube.md) (20,251 ⭐) — Cube is a semantic data layer that provides a unified framework for defining business metrics, dimensions, and relationships across diverse data sources. By acting as a headless business intelligence engine, it transforms raw data into a governed model that can be queried via SQL, REST, and GraphQL interfaces. This architecture ensures consistent data definitions and logic across all downstream analytical applications and reporting tools.

The platform distinguishes itself through its integrated conversational AI capabilities, which allow users to explore data using natural language. It orches
- [google-research/google-research](https://awesome-repositories.com/repository/google-research-google-research.md) (38,139 ⭐) — This repository serves as a comprehensive research platform and toolkit for advancing machine learning, quantum computing, and large-scale scientific data analysis. It provides foundational frameworks for developing complex algorithmic systems, offering the necessary infrastructure for distributed training, computational graph execution, and high-performance model development.

The project distinguishes itself by integrating specialized research domains with robust, privacy-preserving methodologies. It supports diverse scientific discovery through tools for quantum simulation, physics-informed
- [the-open-agent/openagent](https://awesome-repositories.com/repository/the-open-agent-openagent.md) (5,303 ⭐) — OpenAgent is an autonomous AI agent framework designed to orchestrate language models and retrieved context to execute complex user goals. It functions as a platform for building autonomous agents that utilize iterative loops to select tools and process information.

The project features a multi-model gateway that abstracts various large language model providers, allowing users to switch between models on a per-conversation basis without modifying code. It also includes a RAG knowledge base system that ingests documents and generates embeddings to provide semantic context during inference.

Th
- [liquidgalaxylab/gesture-controller](https://awesome-repositories.com/repository/liquidgalaxylab-gesture-controller.md) (1 ⭐) — A cheap and easy way to add a new control method to Liquid Galaxy. The main goal of the project is a gesture controller for the platform, built entirely from an Android app and a server that listens for the commands.
- [hoppscotch/hoppscotch](https://awesome-repositories.com/repository/hoppscotch-hoppscotch.md) (79,618 ⭐) — Hoppscotch is an open-source API development ecosystem designed for building, testing, and debugging REST, GraphQL, and real-time APIs. It provides a unified platform that functions across web browsers, desktop applications, and command-line interfaces, allowing developers to manage the entire API lifecycle from a single environment.

The platform distinguishes itself through a highly interactive, command-driven interface that utilizes a global spotlight palette and keyboard shortcuts to streamline complex workflows. It supports advanced request manipulation and validation by executing JavaScr
- [appium/appium-desktop](https://awesome-repositories.com/repository/appium-appium-desktop.md) (4,809 ⭐) — Appium Desktop is an app for Mac, Windows, and Linux which gives you the power of the Appium automation server in a beautiful and flexible UI. It is basically a graphical interface for the Appium Server. You can set options, start/stop the server, see logs, etc... You also don't need to use…
- [appium/appium](https://awesome-repositories.com/repository/appium-appium.md) (21,647 ⭐) — Appium is a cross-platform automation server that enables user interface testing across mobile, desktop, and web environments. It functions as a unified server architecture that translates automation scripts into platform-specific actions using the W3C WebDriver protocol.

The project distinguishes itself through a modular architecture that decouples core server logic from platform-specific implementations. This design allows for the integration of custom drivers and plugins, enabling support for specialized hardware, unique application environments, and non-standard interaction patterns that
- [e2b-dev/awesome-ai-agents](https://awesome-repositories.com/repository/e2b-dev-awesome-ai-agents.md) (25,903 ⭐) — This project is a curated repository and directory focused on the artificial intelligence agent ecosystem. It serves as a centralized knowledge base for developers and researchers to discover frameworks, platforms, and autonomous software entities designed for reasoning, planning, and executing complex tasks.

The directory distinguishes itself through a community-driven curation model, where contributors maintain and update the collection via a distributed version control system. This collaborative approach ensures that the index remains current with the latest academic resources, open-source
- [cjpais/handy](https://awesome-repositories.com/repository/cjpais-handy.md) (15,515 ⭐) — Handy is a local speech-to-text automation tool designed to convert spoken audio into text and inject it directly into active desktop applications. By running machine learning models entirely on the host hardware, it provides a private, offline-first environment for dictation and command execution. The system functions as a background service that manages microphone input, transcription state, and text output, enabling hands-free typing across various software environments.

The project distinguishes itself through a modular pipeline that integrates local language models for post-transcription
- [allegroai/clearml](https://awesome-repositories.com/repository/allegroai-clearml.md) (6,733 ⭐) — ClearML is a comprehensive MLOps platform designed to manage the entire machine learning lifecycle. It functions as an experiment tracking tool, a data versioning system, and a pipeline orchestrator, while providing infrastructure for GPU cluster management and model serving.

The platform is distinguished by its ability to handle hybrid-cloud compute scheduling and fractional GPU allocation, allowing multiple workloads to share a single hardware accelerator. It employs a metadata-based approach to data versioning, using virtual views to track large datasets and artifacts without duplicating r
- [metauto-ai/agent-as-a-judge](https://awesome-repositories.com/repository/metauto-ai-agent-as-a-judge.md) (785 ⭐) — 👩‍⚖️ Agent-as-a-Judge: The Magic for Open-Endedness
- [clearml/clearml](https://awesome-repositories.com/repository/clearml-clearml.md) (6,740 ⭐) — ClearML is a comprehensive MLOps platform designed to manage the end-to-end machine learning lifecycle, from initial experimentation to production deployment. It provides a suite of integrated tools including a pipeline orchestrator for automating workflows, an experiment tracking tool for logging hyperparameters and metrics, and a metadata-driven data versioning system for managing large-scale datasets and model artifacts.

The platform is distinguished by its advanced compute management and serving capabilities. It features a GPU compute manager that supports fractional resource slicing and
- [elibroftw/modern-desktop-app-template](https://awesome-repositories.com/repository/elibroftw-modern-desktop-app-template.md) (370 ⭐) — Tauri v2 & React v19 boilerplate for a modern desktop application. Not a project nor a substitute for my Tauri video tutorials.
- [qtile/qtile](https://awesome-repositories.com/repository/qtile-qtile.md) (5,273 ⭐) — Qtile is a programmable tiling window manager and compositor written and configured in Python. It organizes application windows into non-overlapping tiles or floating modes to maximize screen real estate and supports both X11 and Wayland display server protocols.

The environment is defined by executing Python scripts, allowing the programmatic customization of keybindings, visual styles, and system behaviors. This approach enables a personalized workspace where the entire user interface and layout logic are managed through a script-based configuration.

The project covers broad capability are
- [podman-desktop/podman-desktop](https://awesome-repositories.com/repository/podman-desktop-podman-desktop.md) (7,722 ⭐) — Podman Desktop is a graphical user interface for building, managing, and deploying containers and Kubernetes clusters from a local workstation. It serves as a container engine manager and a Kubernetes cluster dashboard, providing a visual environment for tasks typically handled via the command line.

The project includes a container extension framework that allows users to integrate additional tools and capabilities into the management environment through a plugin system and extension catalog.

The software covers the full container lifecycle, including image building and pushing to registries
- [bytedance/ui-tars](https://awesome-repositories.com/repository/bytedance-ui-tars.md) (9,622 ⭐) — UI-TARS is an LLM GUI automation framework and multimodal action grounding system. It functions as a GUI agent orchestrator and cross-platform device controller that uses large language models to interpret graphical interfaces and execute actions across desktop and mobile operating systems.

The system translates model-generated coordinates into precise screen positions to interact with visual user interface elements. It employs a multimodal approach to interpret screen layouts and decomposes complex goals into multi-step trajectories through reasoning and error correction.

The project provid
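
Several entries above (UI-TARS, OmniParser) describe turning model-generated coordinates into on-screen positions. The sketch below illustrates that one step only, and it is not taken from any of the listed projects. It assumes the model emits coordinates on a hypothetical 0 to 1000 grid that is independent of screen resolution; the function name `to_screen_position` and the `grid` parameter are made up for illustration.

```python
from typing import Tuple


def to_screen_position(
    x_rel: float,
    y_rel: float,
    screen_width: int,
    screen_height: int,
    grid: int = 1000,  # assumed model coordinate range, not a documented value
) -> Tuple[int, int]:
    """Map a model coordinate on a [0, grid] scale to a pixel on the screen."""
    if grid <= 0 or screen_width <= 0 or screen_height <= 0:
        raise ValueError("grid and screen dimensions must be positive")

    # Scale each axis independently, then truncate to a whole pixel.
    x = int(x_rel / grid * screen_width)
    y = int(y_rel / grid * screen_height)

    # Clamp so out-of-range model output never lands off-screen.
    x = min(max(x, 0), screen_width - 1)
    y = min(max(y, 0), screen_height - 1)
    return x, y


if __name__ == "__main__":
    # Centre of the assumed grid on a 1920x1080 display.
    print(to_screen_position(500, 500, 1920, 1080))
```

Clamping is a deliberate choice here: a model can produce values slightly outside its nominal range, and an input-simulation layer is usually safer clicking the nearest edge pixel than failing or clicking off-screen.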
