22 repos

Awesome GitHub Repositories · Search and Indexing Technologies

Specialized tools for indexing, searching, and retrieving information across diverse data stores.

Explore 22 awesome GitHub repositories matching data & databases · Search and Indexing Technologies.

EbookFoundation/free-programming-books
382,801 · GitHub
This project is a centralized, open-access repository that serves as a structured directory for technical education and professional development. It functions as a community-driven knowledge base, aggregating high-quality learning materials to support global accessibility to computer science and software engineering re
Python · books · education · hacktoberfest
openclaw/openclaw
211,971 · GitHub
Openclaw is a platform for managing agent execution environments, providing the infrastructure to control agent lifecycles, session state, and workspace persistence. It features a centralized gateway that handles model loops, tool invocation, and streaming events, while supporting multi-agent routing and persistent mem
TypeScript · ai · assistant · crustacean
Significant-Gravitas/AutoGPT
181,891 · GitHub
AutoGPT is an orchestration platform designed for building, managing, and deploying autonomous agents. It provides a visual canvas-based environment where users can assemble agents by connecting modular blocks that represent actions, data flows, and conditional logic. The platform supports the entire agent lifecycle, i
Python · ai · artificial-intelligence · autonomous-agents
f/prompts.chat
145,637 · GitHub
Prompts.chat is a community-driven repository and management platform for AI prompts and agent skills. It provides a centralized interface for users to search, retrieve, and save prompts, while offering structured storage for multi-file agent skills that include documentation and supporting assets. The platform distin
HTML · ai · artificial-intelligence · awesome-list
521xueweihan/HelloGitHub
143,312 · GitHub
HelloGitHub is a centralized discovery platform and technical knowledge repository designed to help developers identify high-quality open-source projects, libraries, and infrastructure. It functions as a structured directory that aggregates specialized development tools and educational materials, organizing them by tec
Python · awesome · github · hellogithub
ripienaar/free-for-dev
118,073 · GitHub
This project is a community-maintained directory of technical resources, tools, and services that offer free tiers for developers. It serves as a centralized reference point for discovering infrastructure, software, and educational materials, helping individuals and teams minimize operational costs while building and s
HTML · awesome-list · free-for-developers
Anduin2017/HowToCook
98,028 · GitHub
HowToCook is a structured culinary knowledge base and computational engine designed for the management and scaling of instructional cooking content. It provides a framework for organizing technical preparation procedures and ingredient data, allowing users to maintain consistent culinary standards across various meal s
Dockerfile · chinese · cookbook · cooking
supabase/supabase
97,908 · GitHub
This project provides an integrated backend platform built around a relational database. It automatically generates REST and GraphQL APIs from database schemas, allowing for direct data interaction through standard requests and client libraries. The platform includes a comprehensive authentication system that manages u
TypeScript · ai · alternative · auth
firecrawl/firecrawl
84,034 · GitHub
Firecrawl is a web data extraction platform designed to convert unstructured web content into clean, LLM-ready formats like markdown or JSON. It functions as an autonomous web crawler and scraper, capable of mapping entire domains, performing recursive navigation, and executing complex data gathering tasks. By leveragi
TypeScript · ai · ai-agents · ai-crawler
bregman-arie/devops-exercises
81,169 · GitHub
This project is a comprehensive educational curriculum designed to build proficiency across modern infrastructure, cloud-native technologies, and systems administration. It functions as a reference library and interview preparation resource, offering a structured collection of conceptual questions, practical coding cha
Python · ansible · aws · azure
DopplerHQ/awesome-interview-questions
81,035 · GitHub
This project is a comprehensive, community-sourced repository of technical interview questions and study materials. It serves as a centralized index for software engineers to prepare for technical assessments, benchmark their personal knowledge, and identify gaps in their expertise across a wide range of programming la
android-interview-questions · angularjs-interview-questions · awesome
junegunn/fzf
77,987 · GitHub
This project is a general-purpose command-line filter that provides an interactive interface for processing standard input streams. It enables real-time fuzzy searching, data selection, and transformation, allowing users to navigate complex information or file systems directly within their terminal. By utilizing a pipe
Go · bash · cli · fish
elastic/elasticsearch
76,163 · GitHub
Elasticsearch is a distributed search engine and document store designed for the high-performance indexing and retrieval of massive volumes of unstructured data. It functions as a centralized analytics platform, providing a schema-flexible architecture that organizes information into searchable indices while maintainin
Java · elasticsearch · java · search-engine
infiniflow/ragflow
73,425 · GitHub
This project is a comprehensive retrieval-augmented generation platform designed for building, managing, and deploying knowledge-based AI applications. It provides a unified environment for organizing datasets, configuring conversational chat assistants, and developing autonomous agents that execute multi-step reasonin
Python · agent · agentic · agentic-ai
redis/redis
73,096 · GitHub
Redis is an in-memory, key-value database designed to provide sub-millisecond latency for read and write operations. It functions as a versatile data platform, serving as a distributed cache, a message broker, a NoSQL document store, and a vector database. The system utilizes an event-driven, single-threaded loop to pr
C · cache · caching · database
awesomedata/awesome-public-datasets
72,846 · GitHub
This project is a community-maintained, open-access directory of high-quality public datasets. It serves as a centralized reference point for researchers, developers, and data scientists to locate reliable information sources across a wide spectrum of industries and scientific fields. By providing a structured index, t
aaron-swartz · awesome-public-datasets · datasets
binhnguyennus/awesome-scalability
68,707 · GitHub
This project is a curated knowledge repository that aggregates high-quality resources, technical documentation, and expert insights focused on distributed systems engineering. It serves as a community-driven learning resource designed to help developers navigate the complexities of building and maintaining large-scale
architecture · awesome · awesome-list
pathwaycom/pathway
59,684 · GitHub
Pathway is a high-performance data processing framework designed for building unified batch and streaming pipelines. It functions as an orchestrator for complex data transformations, utilizing a differential dataflow engine to process updates incrementally. By treating static datasets and continuous event streams with
Python · batch-processing · data-analytics · data-pipelines
zylon-ai/private-gpt
57,116 · GitHub
This project is a privacy-first backend service designed to facilitate retrieval-augmented generation by processing local documents into searchable vector representations. It provides a modular architecture that allows users to ingest diverse file formats, manage document metadata, and perform semantic searches to prov
Python
pathwaycom/llm-app
56,311 · GitHub
This project is a data processing engine and AI application platform designed for building production-grade machine learning workflows. It provides a unified programming model that handles both historical batch data and live stream ingestion, enabling the development of real-time ETL pipelines and scalable data transfo
Jupyter Notebook · chatbot · hugging-face · llm


Awesome Search and Indexing Technologies GitHub Repositories

EbookFoundation/free-programming-books

openclaw/openclaw

Significant-Gravitas/AutoGPT

f/prompts.chat

521xueweihan/HelloGitHub

ripienaar/free-for-dev

Anduin2017/HowToCook

supabase/supabase

firecrawl/firecrawl

bregman-arie/devops-exercises

DopplerHQ/awesome-interview-questions

junegunn/fzf

elastic/elasticsearch

infiniflow/ragflow

redis/redis

awesomedata/awesome-public-datasets

binhnguyennus/awesome-scalability

pathwaycom/pathway

zylon-ai/private-gpt

pathwaycom/llm-app
