30 open-source projects similar to google/codesearch, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Codesearch alternative.
Hound is a self-hosted code search engine that indexes source code repositories and provides fast regular expression search results using a trigram-based index. It is designed to be deployed on your own infrastructure, enabling you to search across multiple public and private code repositories simultaneously. The engine builds its search index by decomposing source code into three-character trigrams, which allows for fast substring matching with regular expressions. It supports searching across multiple repositories in parallel, returning results from the pre-built trigram index. Hound can in
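The trigram decomposition described above can be sketched in a few lines. This is a minimal illustration, not Hound's actual implementation: the function names (`trigrams`, `build_index`, `search`) and the in-memory dictionary index are assumptions made for the example, and it handles only literal substrings rather than full regular expressions.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(docs):
    """Map each trigram to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for tri in trigrams(text):
            index[tri].add(name)
    return index


def search(index, docs, needle):
    """Find documents containing needle, using the index to prune candidates."""
    tris = trigrams(needle)
    if not tris:
        # Needles shorter than three characters cannot be pruned; scan everything.
        candidates = set(docs)
    else:
        # A document can only match if it contains every trigram of the needle.
        candidates = set.intersection(*(index.get(t, set()) for t in tris))
    # Candidates may still be false positives, so confirm with a real scan.
    return sorted(name for name in candidates if needle in docs[name])


# Hypothetical sample data for illustration only.
docs = {
    "a.go": "func main() {}",
    "b.go": "package main",
    "c.go": "var x = 1",
}
index = build_index(docs)
```

The index narrows the search to a handful of candidate files, and only those candidates are scanned in full, which is why the approach avoids reading every file for each query.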
Hound is a self-hosted source code search engine designed to index multiple repositories for high-performance regular expression queries. It serves as a multi-repository code indexer that provides a centralized interface for searching across large-scale, private, and versioned codebases. The system utilizes trigram-based indexing to enable fast pattern matching and regular expression lookups without scanning every file. To maintain current search results, it employs automatic synchronization through a combination of periodic polling of remote version control systems and monitoring of local di
OpenGrok is a Java-based source code search engine and indexer designed to process large source trees and binaries into a searchable index. It functions as a version control browser, allowing for the exploration and searching of revision histories integrated with version control systems. The system provides symbol-based cross-referencing to link code definitions and usages, enabling navigation across a codebase. It utilizes an inverted-index search engine to perform full-text retrieval of source code. The application supports periodic source synchronization and reindexing to keep local data
This is a Rust regular expression library that provides a finite automata engine for searching and matching text patterns. It functions as a Unicode-compliant text scanner designed to guarantee linear time execution on all inputs to prevent catastrophic backtracking. The engine supports both single and multi-pattern search capabilities, allowing it to scan a piece of text for multiple regular expressions simultaneously. It operates on both strings and raw byte slices to identify matching text segments. The library covers text parsing, string validation, and pattern searching. It includes cap
Super-expressive is a zero-dependency JavaScript library and domain-specific language used to construct complex regular expressions. It functions as a pattern generator that uses natural language syntax to produce native regular expression objects or strings, supporting international text standards through Unicode property matching. The library replaces manual string manipulation and escaping with a method-chaining fluent interface. It allows for modular expression composition, enabling the creation of reusable pattern hierarchies where existing expression instances can be nested as subexpres
pysheeet is a technical reference library providing a curated collection of code snippets and implementation patterns for advanced Python development, system integration, and high-performance computing. It serves as a comprehensive guide for implementing low-level network programming, native C extensions, and asynchronous and concurrent programming. The project provides specialized frameworks for the development and deployment of large language models, including tools for distributed GPU inference and high-performance serving. It also includes detailed patterns for high-performance computing
Universal Ctags is a multi-language symbol indexer and regex-based parsing engine used to extract and catalog functions, classes, and variables from source code. It functions as a source code indexer that scans files across diverse programming languages to create searchable catalogs of definitions and declarations. The project is distinguished by its extensible parser framework, which allows users to define new language rules using regular expressions and configuration files. It supports complex parsing scenarios through state-based parsing, stack-oriented scope tracking, and guest-parser del
This project is a Vim IDE configuration and plugin suite designed to transform the Vim text editor into a full development environment. It focuses on C++ development by integrating source code indexing and automated plugin management. The environment utilizes compiler backends and abstract syntax trees for semantic code completion and static code analysis. It employs tag files for symbol indexing, enabling rapid navigation between function definitions, class headers, and implementation files. The workspace includes productivity tools such as shorthand snippet expansion, line bookmarking, and
sourcekit-lsp is a language server protocol implementation and IDE code intelligence backend. It functions as a Swift language server and source code indexer, providing semantic analysis, autocomplete, and navigation features to compatible text editors. The project integrates directly with the Swift toolchain and SourceKit framework to deliver precise type information. It also leverages Clang-integrated parsing to provide semantic analysis for C-family languages, enabling cross-language navigation across project files. The server manages source code indexing and symbol lookups while using a
ripgrep-all is a command-line utility that extends ripgrep to perform regular expression searches across binary files, compressed archives, and media formats. It functions as a universal text extractor that converts non-plain-text formats, such as PDFs, e-books, and Office documents, into searchable text. The tool uses a system of adapters to transform binary data into plain text and caches the extracted text in a local database, which speeds up repeated searches. It identifies file types by analyzing header magic bytes rather than relying on file extensions. The project
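Detecting a file type from its header bytes, as mentioned above, might look like the following. This is a sketch under stated assumptions, not ripgrep-all's own detection logic: the `MAGIC` table and the `sniff` function are hypothetical names, and only three well-known signatures (PDF, ZIP, gzip) are included.

```python
# Hypothetical signature table: (leading bytes, label).
MAGIC = [
    (b"%PDF-", "pdf"),       # PDF files start with "%PDF-"
    (b"PK\x03\x04", "zip"),  # ZIP local file header (also used by many Office and e-book formats)
    (b"\x1f\x8b", "gzip"),   # gzip member header
]


def sniff(header: bytes) -> str:
    """Guess a file type from its first bytes, ignoring any file extension."""
    for magic, kind in MAGIC:
        if header.startswith(magic):
            return kind
    return "unknown"
```

Because the check looks only at content, a PDF renamed to `notes.txt` would still be classified as `pdf`.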
The Silver Searcher is a high-performance text search utility and regex code search tool designed to locate strings and regular expressions within plain text and source code. It functions as a codebase pattern matcher that provides highlighted results with surrounding line context and respects standard ignore files. The utility includes specialized capabilities for searching inside zlib and lzma compressed archives. It implements high-throughput processing via parallel-threaded file scanning and just-in-time regular expression compilation. The tool's search and indexing surface covers output
This project is a full text search engine and enterprise search infrastructure designed for indexing and retrieving large sets of documents. It provides a comprehensive framework for information discovery using ranked results and linguistic analysis. The system integrates high-dimensional vector similarity search for semantic retrieval alongside traditional full-text capabilities. It distinguishes itself through support for geospatial data retrieval, multilingual text processing, and a search suggestion workflow that includes typo-tolerant query completion and spellchecking. The platform cov
Wukong is a distributed full-text search engine designed for indexing and retrieving text documents. It functions as a customizable search backend that employs a BM25 relevance ranker to order search results based on term frequency and inverse document frequency. The system includes a specialized Chinese text segmenter to break continuous character strings into meaningful words for accurate indexing and retrieval. To handle large datasets and high request volumes, it utilizes a distributed search index that employs hash-based sharding to split documents across multiple nodes. The engine prov
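As a rough illustration of BM25 ranking by term frequency and inverse document frequency, the sketch below scores a small list of documents against a query. It is not Wukong's code: the function name `bm25_scores`, the whitespace tokenizer, the smoothed IDF variant, and the defaults `k1=1.5` and `b=0.75` are all assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for a whitespace-tokenized query."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Fall back to 1 so that a corpus of empty documents does not divide by zero.
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs or 1
    terms = query.lower().split()

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in terms:
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokenized if term in other)
            if df == 0:
                continue
            # Rarer terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = counts[term]
            # Length normalization dampens scores of longer-than-average documents.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores
```

A document that repeats a query term scores higher than one that mentions it once, while a document lacking the term scores zero.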
Apkleaks is a static analysis tool and security auditor designed to extract hardcoded secrets, API endpoints, and sensitive data from Android application packages. It operates as a secret scanner that analyzes compiled binaries without executing them to identify potential information leaks and insecure endpoints. The tool utilizes a regex-based data extraction engine to identify sensitive strings within decompiled code. It supports customization through JSON-defined search patterns and provides configuration flags to tune the behavior of the underlying disassembler. The analysis pipeline enc
This project is a specialized instruction set for AI coding agents designed to perform structured, language-specific code reviews. It functions as an automated tool that evaluates source code against predefined checklists to identify security, performance, and architectural inconsistencies across diverse technology stacks. The system distinguishes itself by employing a multi-phase analysis pipeline that moves from high-level architectural assessments to granular, line-by-line inspections. It utilizes a severity-based taxonomy to categorize findings, clearly separating blocking security issues
Open Semantic Search is an open-source enterprise discovery platform designed to index, analyze, and explore large, diverse document collections. It functions as a comprehensive search engine and analytics suite that transforms unstructured data into structured information through automated processing pipelines. The platform distinguishes itself by integrating semantic exploration with traditional retrieval methods. It utilizes knowledge graph entity linking and thesaurus-driven query expansion to connect related concepts, allowing users to navigate datasets beyond simple keyword matching. Th
Toshi is a full-text search engine and library implemented in Rust, designed to manage and query large-scale structured datasets. It functions as a distributed search platform that enables high-speed information retrieval across massive collections of data. The system distinguishes itself through an architecture built for high-throughput ingestion and parallel query execution. It utilizes an actor-model concurrency framework to coordinate worker processes and employs distributed sharding to partition index segments across multiple nodes. To maintain performance and data integrity, the engine
This project is a vulnerability search engine and security knowledge base designed to collect and index public security disclosures. It functions as a vulnerability database crawler that extracts technical reports and security flaws from websites to create a searchable local archive. The system utilizes a security knowledge indexer and a full-text inverted index to convert unstructured crawled data into a structured format. This allows for keyword-based information retrieval, enabling the location of specific security flaws and technical details through a dedicated search interface. The plat
RediSearch is a Redis module that adds secondary indexing, full-text search, aggregation, and vector similarity search directly into the in-memory data store. It operates as an in-process search engine, extending the core key-value store with capabilities for indexing hash and JSON documents, enabling fast field-level lookups beyond primary key access. The module provides a full-text search engine built on inverted indexes, supporting stemming, fuzzy matching, and relevance scoring via tf-idf. It also includes a vector similarity search engine using a Hierarchical Navigable Small World graph
You-Dont-Need-GUI is a curated reference of terminal commands that replace common graphical interface operations with equivalent shell one-liners. It maps everyday GUI actions—file management, archive handling, system monitoring, and network diagnostics—to standard POSIX utilities like find, grep, and awk, all composed as self-contained shell pipelines. The project distinguishes itself by requiring no external dependencies or installations; every solution runs with built-in shell commands and coreutils. Its documentation follows Unix man-page conventions, presenting each command with a
This project is a comprehensive educational resource and technical guide for Bash shell programming and command-line operations. It serves as a programming guide, scripting reference, and tutorial for navigating Unix-like terminal environments. The documentation covers a broad range of system administration and automation tasks, including remote server administration via secure shell connections and the management of system processes and resources. It provides detailed instructions on executing remote commands and performing secure file transfers between hosts. The guide details core scripti
elasticsearch-dump is a command line tool for importing, exporting, and transferring data between Elasticsearch and OpenSearch instances. It functions as an index dump utility that saves documents, mappings, and analyzers to local files or standard output. The tool enables the movement of data between clusters using local files as an intermediary and can flatten nested JSON documents into CSV files for external analysis. It allows for the modification or anonymization of documents during the transfer process through the use of custom JavaScript functions. The utility covers data extraction a
Elasticsearch is a distributed search engine and NoSQL document store designed for full-text search and real-time data retrieval. It functions as a RESTful data indexer and vector database, allowing for the storage and management of structured JSON documents across multiple nodes. The system distinguishes itself through its ability to serve as a log analytics platform for monitoring system health and security events. It also supports vector search over embeddings for generative AI and retrieval-augmented generation applications. The platform covers a broad range of c
fzf-lua is a fuzzy finder integration for Neovim that utilizes fzf to search files, buffers, and project symbols. It serves as a code navigation framework providing a dynamic result generator that populates search windows using real-time shell commands or custom Lua functions. The project distinguishes itself through specialized integrations for Git and the Language Server Protocol (LSP). It includes a Git search interface for navigating commits, branches, stashes, and diffs, alongside an LSP integration that queries language servers to locate definitions and references across a cod
NeDB is a JavaScript embedded NoSQL document store designed for Node.js and the browser. It functions as an in-memory data store with the option to persist documents to a local file system, ensuring data survives application restarts. The project utilizes a MongoDB-compatible API to perform data operations, allowing it to serve as a lightweight document indexing system and a persistent file database without requiring a separate database server. Capabilities include querying, inserting, updating, and deleting documents, as well as the ability to create indexes on specific fields to accelerate
Payload is a headless content management system and application framework that uses a code-first approach to define data schemas and administrative interfaces. By utilizing a centralized, type-safe configuration object, it automatically generates database schemas, API endpoints, and a fully customizable admin panel. The system is built on a database-agnostic architecture, allowing it to interface with various storage engines while providing a unified, type-safe API for server-side operations, REST, and GraphQL. What distinguishes Payload is its deep extensibility and developer-centric design.
This project is a high-performance command-line utility designed for rapid filesystem navigation and file discovery. It enables users to locate files and directories within large project structures using recursive search, pattern matching, and metadata-aware filtering. By employing multi-threaded parallel traversal, it provides an efficient way to explore complex directory trees. What distinguishes this tool is its ability to integrate directly into terminal workflows and automate file management tasks. It automatically respects version control ignore files and hidden file settings, ensuring
mini.nvim is a comprehensive library of independent modules designed to extend Neovim with a wide array of navigation, user interface, and text manipulation tools. It serves as a modular plugin collection, a UI toolkit for creating custom statuslines and notifications, and a package manager for installing and pinning external plugins from Git. The project provides a specialized fuzzy picker framework for filtering files and symbols, an LSP completion engine with interactive snippet expansion, and a dedicated plugin test framework that uses headless editor instances and remote procedure calls
Vis is a terminal-based modal text editor that utilizes vi keybindings and a system of structural regular expressions. It functions as a scriptable environment where Lua is used for configuration, custom key mappings, and plugin development. The editor distinguishes itself through a syntax highlighting system based on Parsing Expression Grammars and a pattern matching engine that treats text as a structure for complex search and replace operations. It also integrates directly with the system shell, allowing users to pipe text ranges to external commands and capture the resulting output. The
Eloquent-JavaScript is a comprehensive JavaScript programming textbook and interactive coding tutorial designed for web development education. It serves as both a language reference and a practical guide, combining theoretical lessons with an environment where learners can execute and modify code examples. The project focuses on the fundamental principles of the JavaScript language, including lexical scoping, prototype-based inheritance, and asynchronous patterns. It provides detailed instruction on object-oriented programming, functional programming, and the use of the browser DOM to create