Gemma

Gemma - run multimodal LLMs locally | Awesome Repos

Features

Open Multimodal Model Deployers - Deploys a family of open and efficient multimodal models that accept text and image input and generate text.
Open-Weights Models - Provides a family of large language models with publicly available weights for local or cloud execution.
Decoder Architectures - Implements a transformer-based architecture utilizing causal attention mechanisms for autoregressive sequence generation.
LLM Fine-Tuning - Fine-tunes large language models on custom datasets using parameter-efficient methods like LoRA.
GPU Memory Optimizers - Optimizes VRAM usage for large models through quantization and parameter sharding to fit on limited GPUs.
Multi-Modal Inference Engines - Provides an inference engine capable of processing and analyzing combined text and vision inputs within a single model.
Multi-modal Language Models - Provides a model capable of processing and generating responses from both textual and visual input sequences.
Cross-Attention Mechanisms - Integrates visual and textual data by mapping different input modalities into a shared latent space for joint processing.
Model Deployments - Provides open weights that can be downloaded and deployed on local or cloud hardware.
Decoder-Only Architectures - Utilizes a decoder-only transformer architecture for autoregressive sequence generation.
Fine-tunable Models - Supports weight adjustment and low-rank adaptation to specialize performance for particular tasks.
Multi-Modal AI - Processes and generates responses based on both textual and visual input sequences within a single workflow.
External Tool Integrations - Connects AI assistants to external utilities and APIs to perform actions and retrieve real-time data.
Text-to-Image Generators - Implements pipelines that generate high-resolution images from natural language text prompts.
Language Model Fine-Tuning - Provides frameworks and utilities for adjusting pre-trained language models using memory-efficient training methods.
Model Fine-Tuning - Provides procedures for adapting pre-trained models to specific datasets or tasks.
Low-Rank Adaptation - Supports training a small number of additional weight matrices while keeping base weights frozen to reduce hardware requirements.
Quantized Models - Includes support for quantized weights to reduce memory usage and increase inference speed on limited hardware.
Weight Quantization - Compresses model weights into lower-precision formats to reduce memory footprint and accelerate inference.
Tool Use And Integration - Enables the model to call external functions and APIs to retrieve real-time data and perform actions.
Distributed Parameter Sharding - Partitions large-scale model tensors across multiple compute nodes to facilitate parallel processing and overcome memory limits.
Deep Learning Frameworks - Provides open-weights large language models developed from Google DeepMind research.
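
Two of the features listed above, weight quantization and low-rank adaptation, come down to simple arithmetic. The sketch below is illustrative only and is not taken from the Gemma codebase: the function names, the symmetric int8 scheme, the 4096 x 4096 layer size, and the rank of 8 are all assumptions chosen for the example.

```python
# Illustrative sketch only: not Gemma's actual implementation or API.
# All names and sizes below are hypothetical.

def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 codes plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]


def lora_trainable_params(d_out, d_in, rank):
    """Parameter count of the low-rank factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


# Quantization: each weight is stored as one small integer instead of a float.
weights = [0.5, -1.27, 0.03, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Low-rank adaptation: the frozen base matrix has d_out * d_in weights,
# while only the two small factors are trained.
full = 4096 * 4096
lora = lora_trainable_params(4096, 4096, rank=8)

print("int8 codes:", codes)
print("max round-trip error:", max_error)
print("full params:", full, "LoRA params:", lora, "ratio:", lora / full)
```

In this toy setting the round-trip error per weight stays within half a quantization step, and the adapter trains 65,536 parameters against 16,777,216 in the frozen matrix, which is where the memory savings described in the feature list come from.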

Open-source alternatives to Gemma

Similar open-source projects, ranked by how many features they share with Gemma.

nndl/llm-beginner (6,421 GitHub stars)
This project is a collection of educational resources and technical guides focused on the development and implementation of large language models. It provides a comprehensive curriculum covering transformer architectures, training methods, and deployment strategies. The materials provide detailed instructions for building autonomous agents using reasoning loops and tool integration, as well as guides for fine-tuning models through supervised learning and preference optimization. It also includes tutorials for constructing retrieval augmented generation pipelines and implementing transformer m
Python (tags: agent, fudannlp, llm)
QwenLM/Qwen-7B (21,343 GitHub stars)
Qwen-7B is a pretrained causal language model designed for natural language generation, text processing, and complex reasoning tasks. It is available as an instruction-tuned model optimized for conversational interactions and a tool-use model capable of executing function calls and interacting with external APIs. The project provides a quantized version of the model to reduce GPU memory usage and supports the development of autonomous agents that can execute code and perform functions to complete complex goals. The system covers a wide range of capabilities including model fine-tuning throug
Python
THUDM/ChatGLM3 (13,676 GitHub stars)
ChatGLM3 is an open-weights large language model designed for bilingual conversational interactions in English and Chinese. It functions as a tool-augmented system capable of calling external functions and executing internal code to resolve complex tasks. The model utilizes four-bit quantization to reduce memory requirements, enabling inference on consumer hardware and diverse processing units including GPUs and CPUs. It features an expanded context window for processing and summarizing long documents and includes a supervised fine-tuning pipeline for adapting the model to specialized domains
Python
THUDM/GLM-4 (7,059 GitHub stars)
GLM-4 is an open-weights large language model designed as a multimodal chat system. It functions as a reasoning-focused and multilingual model capable of processing and generating responses across text and visual data types. The model is distinguished by its function-calling capabilities, allowing it to interface with external tools and APIs to execute tasks and retrieve real-time information. It is optimized for complex logical reasoning, mathematical problem solving, and deep research involving long-form content generation. Broad capabilities include multilingual text generation, the creat
Python

See all 30 alternatives to Gemma

google-deepmind/gemma
