awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Data Storage · Awesome GitHub Repositories

18 repos


Components and utilities that facilitate the saving, retrieving, and managing of data within an application environment.

Explore 18 awesome GitHub repositories matching data & databases · Data Storage. Refine with filters or upvote what's useful.


Awesome Data Storage GitHub Repositories

  • vinta/awesome-python

    283,687 · GitHub

    This project is a comprehensive, community-curated directory that organizes a vast landscape of Python software libraries, frameworks, and tools. It serves as a centralized knowledge base designed to facilitate ecosystem navigation and accelerate developer discovery across the entire software development lifecycle. Th

    Python · awesome · collections · python
  • langchain-ai/langchain

    127,015 · GitHub

    LangChain is an orchestration framework designed for building, managing, and deploying applications powered by large language models. It provides a unified integration layer that normalizes disparate model provider APIs into a consistent set of primitives, enabling developers to build complex, multi-step AI workflows t

    Python · agents · ai · ai-agents
  • excalidraw/excalidraw

    117,138 · GitHub

    This project is a virtual whiteboard component and vector graphics editor designed for creating diagrams with a hand-drawn aesthetic. It provides a canvas-based drawing engine that can be embedded directly into web applications, allowing users to manipulate shapes, upload images, and export visual data into standard fo

    TypeScript · canvas · collaboration · diagrams
  • pytorch/pytorch

    97,601 · GitHub

    PyTorch is a machine learning framework centered on a GPU-ready tensor library that supports multi-dimensional array operations across both CPU and accelerator hardware. It provides a foundational infrastructure for mathematical computation and dynamic neural network construction, utilizing a tape-based automatic diffe

    Python · autograd · deep-learning · gpu
  • ChatGPTNextWeb/NextChat

    87,317 · GitHub

    NextChat is a self-hosted web application that provides a unified interface for interacting with multiple large language models. It functions as a conversational platform where users can manage and switch between diverse AI providers through configurable API backends, maintaining full control over their data and infras

    TypeScript · calclaude · chatgpt · claude
  • netdata/netdata

    77,812 · GitHub

    Netdata is a distributed observability platform designed for real-time infrastructure monitoring and performance tracking. It functions as a high-frequency agent that collects system, container, and application metrics with per-second precision, providing both local visualization and centralized aggregation across comp

    C · ai · alerting · cncf
  • elastic/elasticsearch

    76,163 · GitHub

    Elasticsearch is a distributed search engine and document store designed for the high-performance indexing and retrieval of massive volumes of unstructured data. It functions as a centralized analytics platform, providing a schema-flexible architecture that organizes information into searchable indices while maintainin

    Java · elasticsearch · java · search-engine
  • redis/redis

    73,096 · GitHub

    Redis is an in-memory, key-value database designed to provide sub-millisecond latency for read and write operations. It functions as a versatile data platform, serving as a distributed cache, a message broker, a NoSQL document store, and a vector database. The system utilizes an event-driven, single-threaded loop to pr

    C · cache · caching · database
  • awesomedata/awesome-public-datasets

    72,846 · GitHub

    This project is a community-maintained, open-access directory of high-quality public datasets. It serves as a centralized reference point for researchers, developers, and data scientists to locate reliable information sources across a wide spectrum of industries and scientific fields. By providing a structured index, t

    aaron-swartz · awesome-public-datasets · datasets
  • Eugeny/tabby

    68,976 · GitHub

    Tabby is a cross-platform terminal emulator and desktop application suite designed for managing command-line workflows and remote infrastructure. It provides a comprehensive environment for terminal session orchestration, allowing users to organize multiple active sessions through split panes and custom layouts. The ap

    TypeScript · serial · ssh-client · telnet-client
  • OpenHands/OpenHands

    67,974 · GitHub

    OpenHands is an autonomous agent framework designed for software engineering workflows. It provides a modular platform for orchestrating AI agents that reason, plan, and execute tasks within isolated, containerized development environments. By integrating with standard version control and development tools, the system

    Python · agent · artificial-intelligence · chatgpt
  • localstack/localstack

    64,423 · GitHub

    LocalStack is an infrastructure development environment that provides a local simulation of cloud services. By leveraging container-orchestrated service lifecycles, it allows developers to build, test, and debug cloud-native applications on their local machines without requiring remote connectivity or incurring cloud p

    Python · aws · cloud · continuous-integration
  • prometheus/prometheus

    62,853 · GitHub

    Prometheus is a comprehensive monitoring and alerting platform designed to track infrastructure health and application performance. It functions as a time series database that ingests, indexes, and queries high-frequency numerical data points. By utilizing a pull-based model, the system periodically collects multi-dime

    Go · alerting · graphing · hacktoberfest
  • Solido/awesome-flutter

    59,015 · GitHub

    This project is a community-curated directory of resources, libraries, and tools designed to support developers working with the Flutter framework. It functions as a centralized knowledge base, organizing high-quality external references into a structured, human-readable format to assist in the discovery of technical m

    Dart · android · awesome · awesome-list
  • zylon-ai/private-gpt

    57,116 · GitHub

    This project is a privacy-first backend service designed to facilitate retrieval-augmented generation by processing local documents into searchable vector representations. It provides a modular architecture that allows users to ingest diverse file formats, manage document metadata, and perform semantic searches to prov

    Python
  • pmndrs/zustand

    57,057 · GitHub

    Zustand is a state management library that provides a centralized store for managing shared application data. It functions as a reactive container that connects application state to components, allowing them to subscribe to specific slices of data and trigger updates automatically. By utilizing selector-based data acce

    TypeScript · hacktoberfest · hooks · react
  • laurent22/joplin

    53,497 · GitHub

    Joplin is an open-source, cross-platform note-taking application designed for secure, private knowledge management. It functions as a local-first productivity platform, maintaining a complete relational database on the user's device to ensure offline availability and high-performance data retrieval. The application pri

    TypeScript · android · dropbox · electron
  • TryGhost/Ghost

    51,857 · GitHub

    Ghost is an open-source publishing platform and headless content management system designed for professional publishers. It provides a decoupled architecture that separates the content management backend from the front-end delivery layer, allowing users to manage editorial workflows and site data through structured web

    JavaScript · blogging · cms · ghost

Explore sub-tags

  • Application Caching: Mechanisms for storing frequently accessed data in memory.
  • Cache Adapters: Components that allow replacing default memory storage with external database solutions for improved performance.
  • Client-Side Persistence (4 sub-tags): Mechanisms for managing data storage directly on user devices or within browser environments, distinct from server-side infrastructure.
  • Data Access Abstractions (2 sub-tags): Middleware and interface layers that decouple application logic from specific underlying storage engines or physical backends.
  • File-Based Storage Systems (4 sub-tags): Persistence strategies that utilize local or structured file systems for organizing state, logs, and configuration data.
  • High-Availability Configurations: Automated replication and multi-region clustering for data durability.
  • Metadata and State Management (2 sub-tags): Systems focused on the persistence of application configuration, relational metadata, and synchronized state logs.
  • Serialization Utilities: Tools for converting complex objects and tensor structures into persistent storage formats.
  • Specialized Database Engines (2 sub-tags): Purpose-built database systems optimized for specific data models like vectors, time-series, or document-oriented structures.
  • Transportation Protocols: Libraries and tools for data movement, message queuing, and transport-layer communication.