# Semantic code search engine

> Search results for `semantic code search llm` on awesome-repositories.com. 116 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/semantic-code-search-llm

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/semantic-code-search-llm).**

## Results

- [opensemanticsearch/open-semantic-search](https://awesome-repositories.com/repository/opensemanticsearch-open-semantic-search.md) (1,181 ⭐) — Open Semantic Search is an open-source enterprise discovery platform designed to index, analyze, and explore large, diverse document collections. It functions as a comprehensive search engine and analytics suite that transforms unstructured data into structured information through automated processing pipelines.

The platform distinguishes itself by integrating semantic exploration with traditional retrieval methods. It utilizes knowledge graph entity linking and thesaurus-driven query expansion to connect related concepts, allowing users to navigate datasets beyond simple keyword matching. Th
- [github/semantic](https://awesome-repositories.com/repository/github-semantic.md) (9,041 ⭐) — Semantic is a Haskell-based library and command-line tool designed for polyglot source code analysis. It functions as a static program analysis framework and a polyglot abstract syntax tree parser that converts multiple programming languages into structured syntax trees based on grammar definitions.

The system distinguishes itself through a semantic code comparison engine that detects structural and meaningful changes between code versions rather than relying on textual differences. It further enables analysis across different programming syntaxes by translating surface languages into a unifi
- [kilo-org/kilocode](https://awesome-repositories.com/repository/kilo-org-kilocode.md) (15,616 ⭐) — Kilocode is an autonomous engineering platform designed to orchestrate AI agents for complex software development tasks. It functions as a comprehensive system for automating coding, testing, and repository management by integrating directly with your codebase and terminal. The platform provides a unified gateway for model orchestration, allowing for the management of agentic workflows, event-driven automation, and persistent session state across distributed development environments.

The platform distinguishes itself through its federated task management and policy-based access control, which
- [hound-search/hound](https://awesome-repositories.com/repository/hound-search-hound.md) (5,846 ⭐) — Hound is a self-hosted code search engine that indexes source code repositories and provides fast regular expression search results using a trigram-based index. It is designed to be deployed on your own infrastructure, enabling you to search across multiple public and private code repositories simultaneously.

The engine builds its search index by decomposing source code into three-character trigrams, which allows for fast substring matching with regular expressions. It supports searching across multiple repositories in parallel, returning results from the pre-built trigram index. Hound can in
- [aider-ai/aider](https://awesome-repositories.com/repository/aider-ai-aider.md) (46,305 ⭐) — Aider is a command-line interface tool that enables large language models to directly edit, refactor, and manage source code within a local repository. It functions as an AI-powered coding assistant that integrates into the developer workflow, allowing users to apply code changes through natural language prompts while maintaining repository context and version control.

The tool distinguishes itself through a specialized diff-based patching engine that parses model-generated search-and-replace blocks to modify specific file segments without rewriting entire files. It features a provider-agnost
- [asyncfuncai/deepwiki-open](https://awesome-repositories.com/repository/asyncfuncai-deepwiki-open.md) (14,362 ⭐) — This platform is an automated documentation and codebase analysis system designed to generate structured wikis, technical guides, and interactive diagrams from source code repositories. It functions as a retrieval-augmented generation framework that connects codebases to language models, enabling context-aware answers, deep research, and automated documentation updates through semantic vector search.

The system distinguishes itself through a self-hosted, containerized architecture that supports both cloud-based and local AI model execution. It provides sophisticated model orchestration, allow
- [sst/opencode](https://awesome-repositories.com/repository/sst-opencode.md) (175,436 ⭐) — OpenCode is an autonomous software developer and LLM coding agent designed to write code and manage development workflows. It functions as an AI development automator that executes multi-step coding tasks and modifies project files to build software automatically from high-level instructions.

The system employs a task orchestrator to decompose goals into sequences of tool calls and autonomous execution steps. It features a recursive research loop for conducting deep technical searches and a restricted read-only mode for analyzing and exploring large codebases to plan changes without modifying
- [blakeblackshear/frigate](https://awesome-repositories.com/repository/blakeblackshear-frigate.md) (33,778 ⭐) — Frigate is a self-hosted network video recorder that functions as a private, local AI-powered vision engine. It manages video streams by performing real-time object detection, tracking, and classification directly on local hardware, ensuring that security monitoring and activity recording remain independent of cloud services.

The system distinguishes itself through a modular, hardware-accelerated video pipeline that offloads intensive decoding and machine learning inference to dedicated GPUs, NPUs, or specialized accelerators like Coral TPUs and Hailo modules. It utilizes state-based object t
- [guxd/deep-code-search](https://awesome-repositories.com/repository/guxd-deep-code-search.md) (283 ⭐) — DeepCS: Deep Code Search
- [pair-code/llm-comparator](https://awesome-repositories.com/repository/pair-code-llm-comparator.md) (528 ⭐) — LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR team.
- [yusufkaraaslan/skill_seekers](https://awesome-repositories.com/repository/yusufkaraaslan-skill-seekers.md) (9,641 ⭐) — Skill Seekers is a toolset for generating large language model knowledge bases, featuring a multi-source content scraper and a dedicated RAG data pipeline. It extracts technical data from documentation, code, and video to create structured assets and configuration files for AI-powered IDE extensions.

The project distinguishes itself through the ability to transform raw data into polished tutorials and specialized skills for AI plugin marketplaces. It utilizes abstract syntax tree parsing and optical character recognition to analyze GitHub repositories, PDFs, and video frames, converting these
- [krisk/fuse](https://awesome-repositories.com/repository/krisk-fuse.md) (20,347 ⭐) — Fuse is a JavaScript fuzzy search library and client-side search engine designed to index and query JSON data. It provides utilities for approximate string matching and ranking results by relevance, allowing applications to perform fast filtering and searching of datasets without a dedicated backend.

The library distinguishes itself through a token-based search implementation that supports word-order independence and relevance weighting. It utilizes edit-distance scoring to handle typos and insertions, and employs a system of field weighting to prioritize matches in high-value data keys.
- [wolfia-app/gpt-code-search](https://awesome-repositories.com/repository/wolfia-app-gpt-code-search.md) (208 ⭐) — gpt-code-assistant is an open-source coding assistant leveraging language models to search, retrieve, explore and understand any codebase.
- [github/codeql](https://awesome-repositories.com/repository/github-codeql.md) (9,252 ⭐) — CodeQL is a semantic code analysis engine and vulnerability scanning tool that treats source code as data. It utilizes a static analysis query language to define complex patterns and security vulnerabilities within a code graph database.

The system represents source code as a relational database, enabling the execution of structural queries and data flow analysis. This approach allows for the detection of security flaws and coding errors across large-scale repositories.

The tool provides capabilities for automated code auditing, static analysis security testing, and custom vulnerability dete
- [meilisearch/meilisearch](https://awesome-repositories.com/repository/meilisearch-meilisearch.md) (58,118 ⭐) — Meilisearch is a Rust-based search engine providing typo-tolerant full-text and vector-based semantic search with real-time conversational capabilities.
- [qwenlm/codeqwen1.5](https://awesome-repositories.com/repository/qwenlm-codeqwen1-5.md) (16,654 ⭐) — CodeQwen1.5 is a large language model designed for generating, completing, and analyzing code. It functions as an AI code generator capable of writing programming logic across hundreds of different languages.

The model is distinguished by its long-context capabilities, allowing it to process up to one million tokens to reason across entire software repositories. It also operates as a function calling model, utilizing specialized formats to execute complex coding tasks and browser-based automation.

The system supports intelligent code completion through fill-in-the-middle capabilities, which
- [semantic-ui-vue/semantic-ui-vue](https://awesome-repositories.com/repository/semantic-ui-vue-semantic-ui-vue.md) (924 ⭐) — Semantic UI integration for Vue
- [microsoft/vscode-copilot-chat](https://awesome-repositories.com/repository/microsoft-vscode-copilot-chat.md) (9,493 ⭐) — This project is an AI-powered IDE extension and LLM coding assistant that provides a conversational interface for generating, refactoring, and debugging code. It functions as an AI agent framework and a Model Context Protocol client, connecting AI models to external data sources and tools to automate complex development tasks.

The system is distinguished by its use of autonomous AI agents capable of multi-step task execution, including the ability to read files, modify code, and run terminal commands iteratively. It supports recursive agent orchestration through subagent delegation and employ
- [protectai/llm-guard](https://awesome-repositories.com/repository/protectai-llm-guard.md) (2,561 ⭐) — LLM Guard is a security firewall and guardrail framework designed to scan and sanitize inputs and outputs for large language models. It functions as a proxy gateway and security layer to block prompt injections, toxicity, and sensitive data leakage while ensuring that model interactions remain compliant with organizational policies.

The system distinguishes itself through a modular scanner pipeline that utilizes local model orchestration to eliminate external network dependencies. It supports real-time security filtering via streaming chunk analysis and implements a fail-fast execution model
- [mastra-ai/mastra](https://awesome-repositories.com/repository/mastra-ai-mastra.md) (21,221 ⭐) — Mastra is an orchestration framework designed for building, deploying, and managing autonomous AI agents and multi-agent systems. It provides a comprehensive suite of primitives for creating resilient AI applications, including durable workflow orchestration, event-driven agent loops, and semantic memory management. By integrating these core components, the platform enables developers to build complex, multi-step processes that can reason about goals and execute tasks without manual intervention.

The framework distinguishes itself through its focus on observability and secure, isolated execut
- [ktorio/ktor](https://awesome-repositories.com/repository/ktorio-ktor.md) (14,444 ⭐) — Ktor is a framework for building asynchronous server applications and cross-platform network clients using the Kotlin programming language. It provides a lightweight, modular architecture that allows developers to construct services and communication layers by composing independent components and plugins.

The framework is defined by its pipeline-based plugin system, which enables the injection of custom logic into request processing stages, and a type-safe domain-specific language for defining application routing. By utilizing an asynchronous execution model, it handles concurrent network ope
- [wizi-ai/code-search](https://awesome-repositories.com/repository/wizi-ai-code-search.md) (0 ⭐)
- [semantic-org/semantic-ui-react](https://awesome-repositories.com/repository/semantic-org-semantic-ui-react.md) (13,218 ⭐) — Semantic UI React is a declarative component library that provides native React bindings for the Semantic UI design language. It enables the construction of complex user interfaces through a modular, component-based architecture that maps directly to established design patterns, allowing developers to build consistent web application layouts without manual HTML markup.

The library distinguishes itself through a shorthand property system that automatically generates and populates nested child components from data objects, significantly reducing the need for verbose code. It also supports polym
- [potpie-ai/potpie](https://awesome-repositories.com/repository/potpie-ai-potpie.md) (5,161 ⭐) — Potpie is an LLM codebase analysis platform and multi-agent orchestration framework designed to act as an AI software engineer. It parses repositories into a structured code knowledge graph, enabling AI agents to perform multi-hop reasoning, dependency tracing, and grounded technical analysis across large codebases.

The system distinguishes itself through a spec-driven development framework where agents generate detailed technical specifications and architecture plans before implementing multi-file code changes. It utilizes a durable execution engine to coordinate specialized AI personas for
- [appsilon/semantic.dashboard](https://awesome-repositories.com/repository/appsilon-semantic-dashboard.md) (256 ⭐) — semantic.dashboard
- [smol-ai/developer](https://awesome-repositories.com/repository/smol-ai-developer.md) (12,188 ⭐) — This project is an AI software engineering tool and framework for building autonomous coding agents. It provides a system for automating program synthesis and bug fixing by integrating large language models with codebase analysis and iterative refinement loops.

The framework features an agentic development server that exposes task execution interfaces to remote agents through a structured protocol. This allows for the remote execution of development tasks and the embedding of autonomous program synthesis capabilities into external software projects.

The toolset covers AI-driven project scaff
- [chainlit/chainlit](https://awesome-repositories.com/repository/chainlit-chainlit.md) (12,213 ⭐) — Chainlit is a Python framework designed for building and deploying interactive, stateful conversational AI interfaces. It provides a backend-driven platform that connects language models and agent frameworks to a web-based chat frontend, managing the complexities of session state, message history, and real-time communication.

The framework distinguishes itself by offering a component-based UI builder that allows developers to inject interactive widgets, rich media, and data visualizations directly into the chat stream. It supports the visualization of complex agent workflows, enabling users t
- [jamubc/gemini-mcp-tool](https://awesome-repositories.com/repository/jamubc-gemini-mcp-tool.md) (2,246 ⭐) — This tool functions as a Model Context Protocol server that bridges artificial intelligence models with local development environments. It enables AI assistants to perform codebase analysis, execute command-line utilities, and apply automated code modifications directly to local project files. By integrating with the Gemini API, the system facilitates deep interaction between external models and local system resources.

The project distinguishes itself through a robust security and reliability framework designed for automated development workflows. It enforces strict path-based access controls
- [semantic-org/semantic-ui](https://awesome-repositories.com/repository/semantic-org-semantic-ui.md) (51,064 ⭐) — Semantic-UI is an HTML and CSS UI framework consisting of a themed component library and a responsive layout framework. It provides a collection of reusable interface components and a grid-based system of columns and containers designed to build responsive websites.

The framework is distinguished by its use of natural-language class naming, which maps human-readable CSS classes to specific visual styles. It also functions as a right-to-left UI toolkit, utilizing directional mirroring to adjust visual flow and element alignment for languages read from right to left.

The system covers frontend
- [simstudioai/sim](https://awesome-repositories.com/repository/simstudioai-sim.md) (28,796 ⭐) — This project is an AI agent orchestration platform that provides a visual environment for building, testing, and deploying complex automation workflows. It functions as a low-code development interface where users can chain discrete functional blocks into dependency-aware pipelines to integrate artificial intelligence with external data and services. The platform supports the creation of intelligent conversational agents, automated business processes, and multi-service API orchestrations within a unified workspace.

The platform distinguishes itself through its event-driven integration engine,
- [nvidia/semantic-segmentation](https://awesome-repositories.com/repository/nvidia-semantic-segmentation.md) (1,823 ⭐) — Nvidia Semantic Segmentation monorepo
- [nashsu/llm_wiki](https://awesome-repositories.com/repository/nashsu-llm-wiki.md) (12,563 ⭐) — This project is an LLM knowledge base builder and personal knowledge management tool. It is a desktop application designed to transform diverse documents into a persistent, interlinked wiki through LLM analysis and incremental ingestion.

The system distinguishes itself with a knowledge graph visualizer that uses community detection algorithms to map relationships between concepts and identify topical clusters. It features a hybrid retrieval system that combines keyword matching, vector embeddings, and graph relevance to locate information.

The platform covers a wide range of capabilities inc
- [helicone/helicone](https://awesome-repositories.com/repository/helicone-helicone.md) (5,830 ⭐) — Helicone is an AI gateway and observability platform designed to intercept, manage, and monitor interactions with large language models. By acting as a reverse-proxy, it provides a centralized layer for routing requests across multiple AI providers, allowing developers to maintain consistent application logic while gaining deep visibility into model performance, usage, and costs.

The platform distinguishes itself through a robust suite of traffic management and prompt engineering tools. It enables policy-driven control, including automatic failover between providers, rate limiting, and edge-b
- [semantic-release/semantic-release](https://awesome-repositories.com/repository/semantic-release-semantic-release.md) (23,332 ⭐) — Semantic-release is an automated release management tool that determines version increments, generates changelogs, and publishes software packages by analyzing commit history against standardized conventions. It functions as a plugin-based orchestrator that integrates directly into continuous integration pipelines to manage the entire release lifecycle, from verifying environment conditions to distributing artifacts.

The project distinguishes itself through its commit-message-driven approach, which enforces consistent versioning standards and automates the creation of release notes based on t
- [qwenlm/qwen3-coder](https://awesome-repositories.com/repository/qwenlm-qwen3-coder.md) (15,615 ⭐) — Qwen3-Coder is a specialized large language model designed for software development, technical reasoning, and automated code synthesis. Built on transformer-based sequence modeling, it functions as a multilingual programming assistant capable of generating, completing, and debugging source code across more than one hundred programming languages.

The model distinguishes itself through its capacity to process and maintain logical coherence across massive datasets, supporting context windows of up to one million tokens. This allows for repository-scale reasoning, enabling the model to analyze co
- [javafxpert/llm-grovers-search-party](https://awesome-repositories.com/repository/javafxpert-llm-grovers-search-party.md) (0 ⭐)
- [567-labs/instructor](https://awesome-repositories.com/repository/567-labs-instructor.md) (13,176 ⭐) — Instructor is a framework designed for structured data extraction, validation, and language model integration. It functions as a library that transforms unstructured text into validated, type-safe objects by leveraging schema definitions and model-specific tool-calling capabilities. By acting as a validation middleware, the project ensures that language model outputs strictly conform to defined data structures.

The library distinguishes itself through a robust validation-based retry loop that automatically re-submits failed responses with error feedback to iteratively correct schema complianc
- [qwenlm/qwen-code](https://awesome-repositories.com/repository/qwenlm-qwen-code.md) (19,078 ⭐) — Qwen-code is an AI-powered development framework designed for orchestrating intelligent coding agents within terminal and IDE environments. It provides a comprehensive infrastructure for automating software maintenance, code generation, and complex refactoring tasks by managing multi-agent workflows and persistent session states. The system is built to handle both interactive development and automated background processes, ensuring that agents can execute shell commands and file operations safely within isolated, sandboxed environments.

What distinguishes this project is its focus on granular
- [appsilon/shiny.semantic](https://awesome-repositories.com/repository/appsilon-shiny-semantic.md) (512 ⭐) — With this library it is easy to wrap Shiny with Fomantic UI (previously Semantic). Add a few simple lines of code to give your UI a fresh, modern and highly interactive look.
- [gitbookio/gitbook](https://awesome-repositories.com/repository/gitbookio-gitbook.md) (28,902 ⭐) — Gitbook is a documentation-as-code platform designed for centralized technical knowledge management. It functions as a knowledge management system that synchronizes documentation files directly with version control repositories, allowing teams to maintain content alongside their source code.

The platform distinguishes itself through an integrated artificial intelligence layer that provides context-aware search assistance and automated content suggestions. By utilizing block-based content modeling, it enables the construction of structured, modular documentation that can be compiled into stati
- [coatisoftware/sourcetrail](https://awesome-repositories.com/repository/coatisoftware-sourcetrail.md) (16,471 ⭐) — Sourcetrail is an interactive source code explorer and visualizer designed for indexing and navigating relationships between symbols and structures across large, multi-language codebases. It functions as a static analysis indexer and code dependency visualizer that maps calls and dependencies between source files to help reveal project architecture.

The tool enables multi-language project analysis by using a language-agnostic indexing system to track symbols across different programming languages within a single interface. It allows for the discovery of software architecture and the explorati
- [nndl/llm-beginner](https://awesome-repositories.com/repository/nndl-llm-beginner.md) (6,421 ⭐) — This project is a collection of educational resources and technical guides focused on the development and implementation of large language models. It provides a comprehensive curriculum covering transformer architectures, training methods, and deployment strategies.

The materials provide detailed instructions for building autonomous agents using reasoning loops and tool integration, as well as guides for fine-tuning models through supervised learning and preference optimization. It also includes tutorials for constructing retrieval augmented generation pipelines and implementing transformer m
- [harana/search](https://awesome-repositories.com/repository/harana-search.md) (235 ⭐) — Search everything, instantly.
- [jujumilk3/leaked-system-prompts](https://awesome-repositories.com/repository/jujumilk3-leaked-system-prompts.md) (14,134 ⭐) — This project is a research-oriented repository that serves as a centralized database for system-level prompts and internal behavioral instructions extracted from various large language models. Its primary purpose is to provide a transparent, accessible reference for researchers and developers to study how artificial intelligence models are configured, constrained, and governed.

The repository distinguishes itself by cataloging the hidden directives and operational guidelines that define model personas and safety boundaries. By archiving these instruction sets, it enables comparative analysis
- [datawhalechina/llm-cookbook](https://awesome-repositories.com/repository/datawhalechina-llm-cookbook.md) (24,263 ⭐) — This repository is a comprehensive set of tutorials and examples for building software powered by large language models. It serves as an application development guide and a prompt engineering framework, providing instructional content for integrating model logic with user interfaces and external data sources.

The project provides technical walkthroughs for specialized workflows, including the implementation of retrieval augmented generation using vector databases and semantic search. It includes guidance on adapting pre-trained model weights through fine-tuning with private datasets and the o
- [tabbyml/tabby](https://awesome-repositories.com/repository/tabbyml-tabby.md) (33,605 ⭐) — Tabby is a self-hosted AI coding assistant designed to provide real-time code completion and interactive chat capabilities within development environments. By functioning as a private server application, it allows teams to maintain control over their infrastructure and data while integrating intelligent code generation directly into their existing workflows.

The platform distinguishes itself through its repository-aware knowledge retrieval and multi-model orchestration. It indexes local and remote source code repositories and technical documentation into a searchable vector-based knowledge gr
- [codefuse-ai/awesome-code-llm](https://awesome-repositories.com/repository/codefuse-ai-awesome-code-llm.md) (3,385 ⭐) — [TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.
- [karpathy/llm.c](https://awesome-repositories.com/repository/karpathy-llm-c.md) (30,230 ⭐) — This project is a low-dependency engine designed for training large language models using native C and CUDA. It provides a bare-metal environment for tensor computation, allowing for the execution of neural network operations directly on hardware accelerators without the overhead of high-level software abstractions.

The framework distinguishes itself by implementing manual gradient backpropagation and custom hardware-specific kernels, providing granular control over memory mapping and computational precision. It supports distributed training across multiple graphics processors and compute nod
- [reflex-search/reflex](https://awesome-repositories.com/repository/reflex-search-reflex.md) (60 ⭐) — Reflex - The instant, code-aware local search engine.
- [smallcloudai/refact](https://awesome-repositories.com/repository/smallcloudai-refact.md) (3,490 ⭐) — Refact is an autonomous AI software engineering system and code assistant. It functions as an agent orchestrator capable of planning, executing, and managing multi-step development workflows to complete complex software tasks independently.

The system distinguishes itself through agentic state management, using isolated worktrees and versioned checkpoints to allow autonomous agents to experiment with code changes and roll back to stable states if tasks fail. It further extends its capabilities via the Model Context Protocol, connecting the AI engine to external databases, version control syst
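
Several entries above describe index-based code search. As a rough illustration of the trigram approach mentioned in the Hound entry (decomposing source text into three-character trigrams to speed up substring matching), here is a minimal sketch. It is not Hound's actual code: the names `trigrams` and `TrigramIndex` are hypothetical, and the sketch handles only literal substrings. Supporting regular expressions would additionally require extracting the trigrams that a pattern must contain, which is omitted here.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all three-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Hypothetical in-memory trigram index (illustrative only)."""

    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> ids of docs containing it
        self.docs = {}                    # doc id -> full text

    def add(self, doc_id, text):
        """Index one document under every trigram it contains."""
        self.docs[doc_id] = text
        for gram in trigrams(text):
            self.postings[gram].add(doc_id)

    def search(self, literal):
        """Return sorted ids of documents that contain the literal substring."""
        grams = trigrams(literal)
        if grams:
            # A match must contain every trigram of the query, so intersect
            # the posting sets to get a small candidate list.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        else:
            # Queries shorter than three characters cannot be filtered.
            candidates = set(self.docs)
        # Trigram overlap is necessary but not sufficient, so confirm each
        # candidate against the full text.
        return sorted(d for d in candidates if literal in self.docs[d])


idx = TrigramIndex()
idx.add("a.py", "def parse_config(path):")
idx.add("b.py", "def render(template):")
print(idx.search("parse_"))  # ['a.py']
```

The index only narrows the candidate set. The final `literal in text` check is what guarantees correctness, which is why the intersection step can afford to return false positives.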