1 repo

Awesome GitHub Repositories: Multimodal Learning Frameworks

Libraries for training and deploying models that process and relate multiple data types like text and images.

Distinguishing note: This category specifically covers mapping disparate data types into shared latent spaces.



openai/CLIP
32,614 · View on GitHub
CLIP is a neural network architecture designed to map visual and textual data into a shared latent vector space. By utilizing transformer-based feature extraction and multi-modal tokenization, the system aligns images and natural language strings, enabling cross-modal similarity analysis and semantic classification. The project functions as a zero-shot classification engine, identifying image content by calculating the cosine similarity between visual features and arbitrary text labels without requiring task-specific retraining. Beyond inference, it serves as a research toolkit for evaluating
Mapping visual and textual data into a shared mathematical space to enable advanced cross-modal search and analytical reasoning tasks.
Jupyter Notebook · deep-learning · machine-learning
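
The zero-shot classification step described above can be illustrated with a minimal sketch in plain Python. It does not use the CLIP library or a trained model: the three-dimensional embeddings are made-up toy values, and the names `cosine_similarity`, `zero_shot_classify`, and the `scale` parameter are hypothetical, chosen only for this illustration. The sketch assumes an image and a set of text labels have already been encoded into the same vector space, then picks the label whose embedding has the highest cosine similarity to the image embedding.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, label_embeddings, scale=100.0):
    """Rank text labels by cosine similarity to an image embedding.

    `scale` is an assumed temperature factor that sharpens the softmax;
    it is an illustrative choice, not a value taken from this page.
    """
    labels = list(label_embeddings)
    sims = [cosine_similarity(image_embedding, label_embeddings[name])
            for name in labels]
    probs = softmax([scale * s for s in sims])
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], dict(zip(labels, probs))


# Toy embeddings (hypothetical values standing in for real encoder outputs).
image_embedding = [0.9, 0.1, 0.2]
label_embeddings = {
    "a photo of a dog": [1.0, 0.0, 0.1],
    "a photo of a cat": [0.1, 1.0, 0.0],
    "a photo of a car": [0.0, 0.2, 1.0],
}

predicted_label, probabilities = zero_shot_classify(image_embedding, label_embeddings)
print(predicted_label)
```

Because the labels are arbitrary strings encoded at query time, changing the label set changes the classifier without any retraining, which is the property the description refers to as zero-shot.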