171 repos
This category covers data storage, management, processing, analysis, and various database technologies and their operations.
Explore 171 awesome GitHub repositories matching data & databases · Data & Databases. Refine with filters or upvote what's useful.
Pocketbase is a backend-as-a-service platform that provides a self-contained, single-binary server for building full-stack applications. It integrates a relational database, authentication, and file storage into one executable process, eliminating the need for external infrastructure or complex server management. The
Meilisearch is a Rust-based search engine providing typo-tolerant full-text and vector-based semantic search with real-time conversational capabilities.
This project is a command-line storage manager that provides a unified interface for performing file operations across local filesystems and diverse cloud storage providers. It functions as a cross-platform storage abstraction, utilizing a modular backend architecture to map heterogeneous cloud storage APIs into a stan
Rich is a comprehensive library for building sophisticated command-line interfaces and terminal applications. It provides a robust console formatting engine and a layout framework that enables developers to render rich text, syntax-highlighted code, and complex data structures directly in the terminal. By utilizing a r
This project is a community-driven knowledge base that serves as a comprehensive reference guide for Git and GitHub. It functions as both a command-line cheat sheet for terminal-based version control operations and a collaborative workflow resource detailing platform-specific conventions for managing repositories, issu
Vaultwarden is a self-hosted password management server designed to store and synchronize sensitive credentials, identities, and organizational data across multiple client devices. It functions as a database-backed web application that provides an API layer for secure client-server communication, enabling users to mana
GPT-SoVITS is a text-to-speech synthesis engine and voice cloning toolkit designed for generating natural-sounding human speech. It functions as a neural audio processing pipeline that maps input text to high-fidelity audio waveforms, utilizing conditional variational autoencoders and flow-based decoders to ensure expr
This project is a comprehensive cybersecurity tool collection designed to support security research, penetration testing, and vulnerability assessment. It functions as a unified penetration testing suite, providing a centralized environment where professionals can access a wide range of offensive security utilities to
Faceswap is a comprehensive framework for automated media manipulation and neural face synthesis. It provides a modular pipeline that manages the entire lifecycle of facial feature extraction, deep learning model training, and image conversion. By coordinating complex computer vision workflows, the system enables users
Appwrite is a backend-as-a-service platform that provides a unified development environment for building full-stack applications. It integrates essential infrastructure components—including authentication, databases, storage, and serverless functions—into a single, centralized interface to simplify application developm
This platform serves as a comprehensive environment for managing private language models, document knowledge bases, and automated agent workflows within secure local infrastructure. It functions as a document-aware workspace that enables users to ingest diverse file formats into searchable repositories, ensuring that a
MinerU is a document parsing pipeline designed to transform unstructured files into machine-readable, structured data. It utilizes deep learning models to perform layout analysis, identifying document regions and extracting complex content such as mathematical expressions. By combining these neural network inferences w
Maybe is a self-hosted financial platform designed for private deployment, providing a centralized interface to track investments, budgets, and net worth. By running the application on your own infrastructure, you maintain full control over your sensitive financial data and privacy. The platform is delivered as a cont
Marktext is a cross-platform desktop application designed for markdown document authoring and structured note-taking. It functions as a WYSIWYG text processor, providing a distraction-free interface that renders formatted content in real-time while hiding the underlying markup syntax. The application utilizes a multi-
Requests is a high-level HTTP client library designed to simplify web communication and API integration. It provides an intuitive, human-readable interface for performing standard network operations, including request execution, connection pooling, and stateful session management. By encapsulating raw network data into
Gitea is a self-hosted service designed for managing version control repositories, project issue tracking, and software artifact distribution. It provides a collaborative platform that enables teams to host their own source code, manage development tasks through integrated project boards, and store container images or
Docling is a modular framework designed for document parsing, layout analysis, and structured data extraction. It transforms unstructured files and web content into a unified, hierarchical data model that preserves the spatial and semantic relationships between text, tables, images, and layout elements. By normalizing
Joplin is an open-source, cross-platform note-taking application designed for secure, private knowledge management. It functions as a local-first productivity platform, maintaining a complete relational database on the user's device to ensure offline availability and high-performance data retrieval. The application pri
nanoGPT is a lightweight engine for training and fine-tuning transformer-based language models from scratch. It provides a minimalist codebase designed for educational exploration and rapid experimentation with neural network architectures, utilizing self-attention and feed-forward layers to process sequences and predi
This project provides a deep learning architecture designed to identify and isolate distinct objects within images by generating precise pixel-level masks. It functions as a browser-based inference engine, enabling the execution of complex machine learning models directly within web environments without requiring serve