30 open-source projects similar to sakanaai/ai-scientist-v2, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best AI Scientist V2 alternative.
AutoResearchClaw is an agentic system designed to automate the scientific research process. It functions as an autonomous research agent and workflow automator that manages the entire lifecycle of a project, from initial hypothesis generation and literature review to experimental execution and the production of LaTeX-formatted academic papers. The system distinguishes itself through a multi-agent research pipeline that utilizes structured debates for hypothesis refinement and peer review. It employs a branch-and-merge architecture to explore parallel research directions and integrates human-i
AI-Researcher is an LLM research automation framework and scientific workflow orchestrator designed to automate the end-to-end discovery process. It employs autonomous AI research agents to identify research gaps, formulate hypotheses, and execute scientific discovery workflows independently. The system integrates an automated literature review tool for gathering and analyzing academic papers and code repositories with an AI-driven manuscript generator that synthesizes research motivations and experimental results into full-length academic papers. The framework covers a modular research pipe
AI-Scientist is an autonomous research pipeline and framework for scientific discovery. It employs large language model agents to manage the full lifecycle of a scientific project, from initial hypothesis generation to the production of formal academic papers. The system operates through a repeating research loop that integrates automated experimental execution, data analysis, and literature novelty verification. It queries external academic databases to validate the originality of ideas and retrieve citations, then translates experimental findings into LaTeX manuscripts. To refine these outp
This project is a scientific agent framework and workflow orchestrator designed to extend large language models with specialized tools for genomic, chemical, and biological research. It provides a system for planning research hypotheses and executing automated workflows by integrating scientific databases with dynamic code execution. The framework includes a cheminformatics modeling suite for predicting molecular bioactivity and performing virtual screening, alongside a bioinformatics analysis toolkit for processing genomic sequences and single-cell data. It also features an academic document
This project is an LLM research orchestrator and autonomous AI agent framework designed to automate the scientific lifecycle. It functions as an end-to-end research pipeline and model training toolkit, managing everything from initial literature reviews and hypothesis testing to the final drafting of academic papers. The system is distinguished by its ability to convert unstructured academic PDFs into machine-executable knowledge layers, allowing agents to reproduce and extend research findings. It employs a two-loop orchestration architecture and a specialized research engineering skill libr
This project is a machine learning research automation system designed to manage the full research lifecycle, from idea discovery to final paper submission. It utilizes markdown-based skill templates to execute autonomous research tasks and manage iterative loops of deep review and experimentation. The system distinguishes itself through integrated capabilities for academic communication and integrity auditing. It can automate the generation of LaTeX papers, conference slide decks, and evidence-grounded peer review rebuttals. To ensure rigor, it employs cross-model review routing and adversar
Autoresearch is an autonomous machine learning research agent and architecture search framework. It employs a closed-loop system to programmatically rewrite training and architecture source code to discover optimal language model configurations. The system iteratively modifies code and evaluates performance metrics to improve model quality based on a target objective. It optimizes model performance and training efficiency by tracking validation bits per byte, which allows for a fair comparison of architectural changes independently of vocabulary size. The framework manages the full training
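For readers unfamiliar with the bits-per-byte metric named in the Autoresearch entry, here is a minimal sketch of how it is commonly computed. It assumes the summed validation loss is a negative log-likelihood in nats; the function name and the sample numbers are illustrative and are not taken from the Autoresearch code.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (in nats) into bits per byte.

    Dividing by the byte length of the evaluated text, rather than by the
    token count, keeps the score comparable across tokenizers.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_nll_nats / (math.log(2) * total_bytes)


# Hypothetical numbers: the same 1000-byte text scored by two models.
# Model A uses a large vocabulary: 250 tokens at 2.0 nats per token.
# Model B uses a small vocabulary: 500 tokens at 1.0 nats per token.
bpb_a = bits_per_byte(250 * 2.0, 1000)
bpb_b = bits_per_byte(500 * 1.0, 1000)
print(round(bpb_a, 4), round(bpb_b, 4))
```

Per-token loss would rank Model B as twice as good, while bits per byte shows the two models compress the text equally well.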
MiroThinker is an autonomous research system that uses large language models to perform deep research and predictions through iterative reasoning. It functions as a web-search AI framework capable of retrieving real-time internet data and scraping web content to provide verifiable sources for complex queries. The system includes a multimodal content processor that converts images, audio, and video into text descriptions for analysis by text-based models. To ensure computational accuracy, it utilizes a sandboxed code executor for running Python code and data analysis. Performance is managed th
This is a Python AutoML framework that automates the design and optimization of machine learning pipelines. It functions as a genetic programming pipeline optimizer and an automated feature selection tool, using evolutionary search to discover the most effective sequences of data processing and model steps. The project focuses on multi-objective optimization to balance competing performance metrics simultaneously. It employs a genetic selection process to identify impactful variables and remove noise from raw datasets, ensuring the resulting machine learning solution
Miroflow is an agent orchestration framework designed to coordinate multiple large language models and autonomous agents to perform complex research and reasoning tasks. It functions as a hierarchical workflow manager that distributes workloads across specialized agents using intent recognition and structured planning to gather deep information and solve challenging queries. The system distinguishes itself through a multi-model integration gateway and a provider-agnostic interface, allowing it to unify various language model providers. It extends these models via a tool-augmented framework th
Open Deep Research is an artificial intelligence framework designed to automate complex, multi-step research workflows. It functions as an autonomous agent that performs iterative web searches, analyzes retrieved data, and synthesizes information into structured reports. By decomposing broad queries into smaller sub-tasks, the system builds a comprehensive knowledge base to address open-ended questions. The platform distinguishes itself through an agentic loop that dynamically refines research strategies based on previous findings. It manages long-form data by compressing and summarizing cont
GPT Researcher is an autonomous agent framework designed to automate the process of gathering, synthesizing, and documenting information from diverse web and local sources. It functions as a research-oriented execution environment that orchestrates specialized agents to perform complex, multi-branch research tasks, transforming raw data into structured, factual, and cited reports. The project distinguishes itself through a graph-based orchestration layer that manages state transitions and information flow between specialized agents. It employs recursive tree-search execution to explore comple
ClearML is a comprehensive MLOps platform designed to manage the entire machine learning lifecycle. It functions as an experiment tracking tool, a data versioning system, and a pipeline orchestrator, while providing infrastructure for GPU cluster management and model serving. The platform is distinguished by its ability to handle hybrid-cloud compute scheduling and fractional GPU allocation, allowing multiple workloads to share a single hardware accelerator. It employs a metadata-based approach to data versioning, using virtual views to track large datasets and artifacts without duplicating r
Feynman is an open-source AI research agent that coordinates multi-agent workflows to search papers, run experiments, and produce cited research briefs. It orchestrates parallel researcher agents that independently investigate subtopics, then synthesizes and verifies findings through a multi-step orchestration loop, enabling deep research across academic papers, web sources, and code. The tool distinguishes itself through several specialized capabilities, including paper claim verification that audits research paper claims against actual code implementations to identify mismatches and validat
This project is a comprehensive AI research workflow framework and skill library designed to transform general large language models into specialized AI research agents. It provides an agentic toolset for academic writing, a knowledge base for AI engineering, and a system for analyzing research artifacts by converting documents and repositories into structured claims and evidence graphs. The framework employs a two-loop orchestration architecture to manage the research lifecycle from ideation and literature surveys to final paper drafting. It distinguishes itself through a modular skill injec
This project is a multi-label classification pipeline designed for genre prediction. It implements a machine learning workflow that assigns multiple category labels to a single item by processing both textual and visual input data. The system utilizes multimodal feature extraction to transform images and text descriptions into semantic vectors. This process includes using pre-trained networks for visual feature extraction and semantic word averaging for text analysis, allowing the model to integrate different data types into a unified input. The pipeline covers the full machine learning life
Local Deep Research is an autonomous research system consisting of an LLM research agent, a local model orchestrator, and a multi-engine search aggregator. It is designed to execute deep research by decomposing complex questions into atomic facts and synthesizing cited reports from academic, technical, and private document sources. The system features an encrypted research workspace that ensures zero-knowledge privacy through isolated, per-user encrypted databases. It utilizes a local RAG knowledge base to index research sources into searchable vector stores, allowing for retrieval-augmented
PyCaret is a Python AutoML platform and MLOps lifecycle manager designed to automate machine learning workflows. It functions as a low-code environment that leverages a scikit-learn native engine to execute preprocessing, training, and evaluation for tabular data. The platform distinguishes itself as an LLM-powered ML copilot, using large language model agents to analyze datasets, design experiment configurations, and explain model results. It also serves as a Kubernetes ML orchestrator and model registry, enabling the versioning of trained pipelines and their promotion to production API endp
mctx is a framework for executing high-performance tree search and state simulations to generate policy targets for neural networks. It functions as a compiled search engine and neural dynamics simulator that predicts state transitions and rewards using learned representations. The project implements a vectorised tree search capable of running parallel search operations across input batches. It utilizes a policy target generator to convert search results into action weights used for training and refining neural network policies. The system covers reinforcement learning workflows by integrati
AutoGluon is an automated machine learning framework designed to optimize model selection and hyperparameter tuning across tabular, text, image, and time series data. It functions as an ensemble learning library and a tabular data prediction engine, aiming to build high-accuracy predictive models without manual algorithm selection. The framework integrates multimodal machine learning pipelines that combine disparate data types into a single representation using specialized encoders. It also includes a probabilistic time series forecaster that fits multiple statistical and deep learning models
LLocalSearch is a privacy-focused search engine and agent framework that uses locally hosted large language models to search the internet and aggregate answers. It functions as a retrieval-augmented generation interface where all queries and processing remain on the user's own hardware to ensure data privacy and remove dependency on external cloud API providers. The system employs a chain of autonomous agents that perform recursive internet searches, calling search tools multiple times to gather and synthesize information. It coordinates these models to reason through complex queries, providi
MLOps-Basics is a collection of implementation guides and blueprints for automating the machine learning lifecycle. It provides practical workflows for managing the transition of models from training to production deployment, focusing on the integration of operational tools into the machine learning pipeline. The project features specific architectural patterns for deploying containerized models using serverless infrastructure and cloud registries. It includes frameworks for tracking large datasets and model artifacts via remote storage, as well as guides for converting models into standardiz
UltraRAG is an LLM RAG orchestration platform and AI agent research framework designed to coordinate complex retrieval-augmented generation workflows. It functions as a multimodal RAG engine capable of retrieving and generating responses using text, images, and diverse data types, while providing tools for vector database management and RAG performance evaluation. The platform features a visual RAG pipeline builder that uses a canvas interface to construct and debug data flows, synchronizing visual designs directly with underlying code. It distinguishes itself through an autonomous research s
pi-autoresearch is an autonomous research extension that automates iterative code-editing and performance-measurement loops driven by large language models. It functions as an experiment lifecycle automator, executing repetitive cycles of changes and benchmarks until a specific goal is reached. The system distinguishes itself by organizing successful experimental trials into independent git branches for review and merging. It includes a real-time research dashboard for monitoring metrics and status, and utilizes median absolute deviation to calculate confidence scores that filter benchmark no
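As a rough illustration of the median-absolute-deviation idea in the pi-autoresearch entry, the sketch below scores a new benchmark result against a set of baseline runs. The scoring formula, the function names, and the threshold of 3 are assumptions chosen for illustration; the project's actual confidence calculation is not described here.

```python
from statistics import median


def mad(samples: list[float]) -> float:
    """Median absolute deviation: the median distance from the median."""
    center = median(samples)
    return median([abs(x - center) for x in samples])


def robust_score(baseline: list[float], candidate: float) -> float:
    """Distance of `candidate` from the baseline median, in MAD units.

    The 1.4826 factor makes the MAD comparable to a standard deviation
    for normally distributed data, so the score reads like a z-score.
    """
    center = median(baseline)
    spread = 1.4826 * mad(baseline)
    if spread == 0:
        return 0.0 if candidate == center else float("inf")
    return abs(candidate - center) / spread


# Hypothetical benchmark timings (ms) from repeated runs of unchanged code.
baseline_ms = [100, 102, 99, 101, 100]

for candidate_ms in (101, 90):
    score = robust_score(baseline_ms, candidate_ms)
    verdict = "likely real change" if score > 3 else "within noise"
    print(candidate_ms, round(score, 2), verdict)
```

Because the median and MAD ignore extreme values, a single slow outlier run in the baseline does not inflate the noise estimate the way a mean and standard deviation would.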
Promptbase is a prompt engineering framework for designing, testing, and optimizing prompts for large language models. It provides a system for measuring model accuracy and performance through an evaluation toolkit that compares outputs against ground-truth datasets. The project also includes an orchestration pipeline for automating multi-component machine learning tasks across cloud-based endpoints and a utility for preparing retrieval-augmented generation datasets. The framework distinguishes itself through advanced response quality optimization, utilizing chain-of-thought generato
This project is a comprehensive AI infrastructure that combines an LLM agent orchestration framework, an autonomous research system, and a local AI environment. It centers on the creation of a personal knowledge graph and a programmatic prompt engineering library to provide long-term memory and optimized reasoning for artificial intelligence tasks. The system is distinguished by its ability to compose multi-agent teams using specialized personas and deterministic skills to execute complex workflows. It features an autonomous research pipeline capable of deep investigations and adversarial ana
Potpie is an LLM codebase analysis platform and multi-agent orchestration framework designed to act as an AI software engineer. It parses repositories into a structured code knowledge graph, enabling AI agents to perform multi-hop reasoning, dependency tracing, and grounded technical analysis across large codebases. The system distinguishes itself through a spec-driven development framework where agents generate detailed technical specifications and architecture plans before implementing multi-file code changes. It utilizes a durable execution engine to coordinate specialized AI personas for
SuperClaude Framework is an autonomous agent development platform designed for orchestrating complex software development lifecycles. It functions as a Python-based toolkit that enables the deployment of specialized, domain-specific agents capable of coordinating tasks, conducting multi-hop web research, and managing end-to-end project requirements through a unified command interface. The framework distinguishes itself through its iterative planning loops and persistent memory state, which allow agents to evaluate progress in real-time and refine their reasoning strategies across multiple ses
Conductor is a durable workflow engine designed to orchestrate complex, long-running business processes and autonomous agent loops. It functions as a stateful execution platform that persists the entire history of a process, ensuring that workflows remain reliable and recoverable across infrastructure failures, system restarts, and transient network errors. By managing task lifecycles, worker polling, and state transitions, it provides a centralized coordination layer for distributed systems. The platform distinguishes itself through its specialized support for AI agent orchestration, allowin
LangChain.js is a framework for building, executing, and monitoring stateful agentic applications. It provides an orchestration engine that models workflows as directed graphs, allowing developers to connect language models, data sources, and external tools into modular, multi-step processes. The platform distinguishes itself through its focus on stateful execution and human-in-the-loop control. It manages agent lifecycles by persisting execution state across threads, enabling fault tolerance and the ability to pause workflows at designated breakpoints for manual review or modification. This