7 repos

Awesome GitHub Repositories: Multimodal Processing Tools

Systems for ingesting and synthesizing non-textual data types, including vision, audio, and speech, within AI pipelines.

Explore 7 awesome GitHub repositories matching artificial intelligence & ml · Multimodal Processing Tools.

josephmisiti/awesome-machine-learning
71,702 · GitHub
This project is a comprehensive, community-driven directory of machine learning resources, software libraries, and educational materials. It serves as a centralized knowledge base for developers and researchers, organizing tools and frameworks by their primary programming language and technical domain to simplify disco
Python
OpenHands/OpenHands
67,974 · GitHub
OpenHands is an autonomous agent framework designed for software engineering workflows. It provides a modular platform for orchestrating AI agents that reason, plan, and execute tasks within isolated, containerized development environments. By integrating with standard version control and development tools, the system
Python · agent · artificial-intelligence · chatgpt
xtekky/gpt4free
65,720 · GitHub
This project provides a unified interface for interacting with a wide range of artificial intelligence services, acting as a central orchestration layer for text and image generation. It standardizes access to diverse AI backends, allowing developers to integrate multiple language and vision models through a single, co
Python · chatbot · chatbots · chatgpt
CorentinJ/Real-Time-Voice-Cloning
59,355 · GitHub
This project is a neural text-to-speech engine and voice cloning toolkit designed to generate synthetic speech that mimics the vocal characteristics of a target speaker. It functions as a real-time audio synthesizer, utilizing a deep learning pipeline to convert written text into high-fidelity speech output with minima
Python · deep-learning · python · pytorch
AntonOsika/gpt-engineer
55,201 · GitHub
GPT-Engineer is an autonomous agent and framework designed for AI-assisted software development. It functions as a generative codebase architect that translates natural language requirements into complete, functional software projects by reading and writing files directly to the local file system. The platform disting
Python · ai · autonomous-agent · code-generation
RVC-Boss/GPT-SoVITS
55,111 · GitHub
GPT-SoVITS is a text-to-speech synthesis engine and voice cloning toolkit designed for generating natural-sounding human speech. It functions as a neural audio processing pipeline that maps input text to high-fidelity audio waveforms, utilizing conditional variational autoencoders and flow-based decoders to ensure expr
Python · text-to-speech · tts · vits
appwrite/appwrite
54,884 · GitHub
Appwrite is a backend-as-a-service platform that provides a unified development environment for building full-stack applications. It integrates essential infrastructure components—including authentication, databases, storage, and serverless functions—into a single, centralized interface to simplify application developm
TypeScript · android · appwrite · backend
