DCDmllm/Cheetah

Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions

Features

Multimodal Benchmarks - Benchmark for evaluating interleaved vision-language instructions.

Open-source alternatives to Cheetah

open-compass/VLMEvalKit

VLMEvalKit is a vision-language model evaluation framework and inference engine designed to run standardized benchmarks and measure model accuracy across diverse visual datasets. It serves as a multimodal model benchmark and performance toolkit for calculating metrics and comparing model responses. The toolkit includes a specialized visual reasoning evaluator that uses adversarial samples to distinguish actual image understanding from reliance on language patterns. It also provides capabilities for image generation evaluation, testing a model's ability to create or modify visuals based on tex

GitHub stars: 3,824

BradyFU/Video-MME

GitHub stars: 779

✨✨CVPR 2025 Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

bytedance/lynx-llm

GitHub stars: 272

paper: https://arxiv.org/abs/2307.02469
page: https://lynx-llm.github.io/

alenai97/MiCEval

GitHub stars: 6

An automatic evaluation framework for Multimodal Chain-of-Thought.
