awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Data & Databases · Awesome GitHub Repositories

506 repos

Explore 506 awesome GitHub repositories matching category · Data & Databases. Refine with filters or upvote what's useful.

Awesome Data & Databases GitHub Repositories

  • ray-project/ray

    41,400 · View on GitHub↗

    Ray is a distributed computing framework designed to scale Python and Java applications across clusters by abstracting task scheduling and resource management. It functions as a resource-aware execution engine that manages task dependencies, placement, and fault tolerance across networked compute nodes. At its core, the system provides a stateful actor model, allowing developers to define classes that run in dedicated processes to maintain and mutate internal state across remote method calls. The framework distinguishes itself through a robust cross-language interoperability layer, enabling f

    Python · data-science · deep-learning · deployment
  • psf/black

    41,395 · View on GitHub↗

    This project is an uncompromising, deterministic code formatter for Python. It functions by parsing source code into an abstract syntax tree and regenerating it according to a rigid, opinionated set of style rules. By automating the formatting process, it eliminates manual style debates and configuration overhead, ensuring that code remains consistent across entire projects regardless of the original input. The tool distinguishes itself through its focus on speed and seamless integration into development workflows. It utilizes content-based file caching and parallel processing to maintain hig

    Python · autopep8 · code · codeformatter
  • dylanaraps/pure-bash-bible

    41,355 · View on GitHub↗

    This project is a curated knowledge base and technical reference for shell scripting, focused on performing common system tasks using only built-in shell features. It serves as a guide for implementing logic and automation without relying on external binaries or dependencies, ensuring scripts remain portable across standard Unix-like environments. The repository distinguishes itself by emphasizing native shell functions and syntax to minimize process forking and improve execution performance. It provides idiomatic patterns for complex string transformations, pattern matching, and data flow ma

    Shell · bash · bible · book
  • siyuan-note/siyuan

    41,351 · View on GitHub↗

    Siyuan is a self-hosted knowledge management platform designed for private note-taking and information organization. It functions as a local-first application that stores all user content as plain text files on the local file system, ensuring data ownership and offline availability. The platform utilizes a block-based document model, which structures information as a tree of independent content blocks to facilitate granular manipulation and bidirectional linking. Users can extend the core functionality through a sandboxed plugin architecture, allowing for the development of custom themes and

    TypeScript · anki · chatgpt · deepseek
  • hpcaitech/ColossalAI

    41,349 · View on GitHub↗

    ColossalAI is a deep learning system designed to facilitate the training and inference of large-scale artificial intelligence models. It provides a unified framework for distributed computing, enabling the scaling of model parameters and data across multiple hardware accelerators. The project focuses on optimizing memory usage and computational efficiency through advanced parallelization strategies. By integrating techniques for data, pipeline, and tensor parallelism, it allows for the management of models that exceed the capacity of individual devices. The system includes a comprehensive su

    Python · ai · big-model · data-parallelism
  • zhayujie/chatgpt-on-wechat

    41,334 · View on GitHub↗

    This project is a multi-platform chatbot framework designed to connect large language models to social messaging services for automated interaction and task execution. It functions as an autonomous agent orchestrator that decomposes complex user goals into sequential, multi-step plans, iteratively invoking external tools and managing long-term memory to achieve objectives. The system distinguishes itself through a modular skill architecture and a knowledge-driven memory store. Developers can extend functionality by creating and installing custom logic modules, while the agent maintains contin

    Python · ai · ai-agent · chatgpt
  • hexojs/hexo

    41,251 · View on GitHub↗

    Hexo is a command-line static site generator designed for content-driven blogging and website creation. It functions as a structured framework that transforms plain text files and markdown into production-ready static websites, utilizing a template-based rendering engine to separate site content from visual presentation. The project is distinguished by its event-driven build pipeline, which manages the entire site lifecycle through a series of hooks for file processing, asset generation, and deployment. Developers can extend the system’s core capabilities through a modular plugin architecture

    TypeScript · hacktoberfest · hexo · javascript
  • zai-org/ChatGLM-6B

    41,232 · View on GitHub↗

    ChatGLM-6B is a generative AI inference engine designed for local execution of transformer-based language models. It provides a comprehensive runtime environment that allows users to load and run pre-trained neural network weights directly on their own hardware, ensuring data privacy and independence from external cloud services. The project distinguishes itself through a hardware-agnostic execution backend that supports deployment across diverse environments, including standard processors, Apple Silicon, and multi-GPU configurations. It incorporates advanced optimization techniques such as w

    Python
  • logseq/logseq

    41,118 · View on GitHub↗

    Logseq is a privacy-focused, local-first knowledge base designed for personal information management and networked thought mapping. It functions as a bi-directional graph editor that organizes content into hierarchical, outliner-based structures, allowing users to connect related concepts through automated backlinking and visual relationship mapping. The platform distinguishes itself by maintaining all user data in plain text markdown files stored directly on the local device, ensuring offline availability and long-term portability. It employs a logic-based query engine to perform complex rel

    Clojure · clojure · clojurescript · git
  • ccxt/ccxt

    40,990 · View on GitHub↗

    This library provides a unified interface for interacting with hundreds of global cryptocurrency exchanges. It serves as a standardized framework for building automated trading systems, allowing developers to fetch real-time market data, manage account balances, and execute orders across multiple financial platforms through a single, predictable set of methods. The project distinguishes itself by abstracting the complexities of diverse exchange-specific application programming interfaces into a consistent internal schema. It includes a modular authentication layer that automatically handles c

    Python · altcoin · api · arbitrage
  • chubin/cheat.sh

    40,960 · View on GitHub↗

    Cheat.sh is a command line knowledge base that provides instant access to programming syntax, code snippets, and technical documentation. Designed to minimize context switching, it functions as a developer productivity tool that allows users to retrieve information directly within their terminal or code editor. The service distinguishes itself through a terminal-agnostic interface that relies on standard input and output streams, ensuring compatibility across various shell environments and operating systems. It supports persistent query sessions to maintain workflow continuity and offers a co

    Python · cheatsheet · cli · command-line
  • curl/curl

    40,877 · View on GitHub↗

    Curl is a command-line tool and portable library for transferring data across a wide range of network protocols. It functions as a unified engine that abstracts diverse communication standards, allowing users and developers to move files and information between servers using a consistent interface. The project provides both a versatile command-line client for terminal-based automation and a stable programmatic interface for integrating complex network operations into applications. The system is distinguished by its protocol-agnostic core and its ability to manage both synchronous and asynchro

    C · c · client · curl
  • Aider-AI/aider

    40,753 · View on GitHub↗

    Aider is a command-line interface tool that enables large language models to directly edit, refactor, and manage source code within a local repository. It functions as an AI-powered coding assistant that integrates into the developer workflow, allowing users to apply code changes through natural language prompts while maintaining repository context and version control. The tool distinguishes itself through a specialized diff-based patching engine that parses model-generated search-and-replace blocks to modify specific file segments without rewriting entire files. It features a provider-agnost

    Python · anthropic · chatgpt · claude-3
  • LC044/WeChatMsg

    40,544 · View on GitHub↗

    WeChatMsg is a database forensic parser and local data processor designed to extract and reconstruct structured message data from raw binary files. By operating entirely on the host machine, the tool ensures data sovereignty and privacy, performing all decryption and transformation tasks without requiring network access or external dependencies. The project distinguishes itself through a static analysis-based extraction method that reconstructs message threads by matching unique identifiers and timestamps across fragmented database tables. Its decoupled architecture separates low-level binary

    chatgpt · llms · pyqt
  • karanpratapsingh/system-design

    40,519 · View on GitHub↗

    This project is a comprehensive educational resource focused on the principles, patterns, and trade-offs required to design scalable, reliable, and high-performance distributed systems. It provides a structured curriculum that covers the fundamental architectural strategies necessary for building modern software infrastructure, ranging from high-level system decomposition to low-level networking and data management. The repository distinguishes itself by offering deep dives into complex architectural patterns, such as microservices-based decomposition, event-driven communication, and command-

    architecture · distributed-systems · engineering
  • janhq/jan

    40,489 · View on GitHub↗

    Jan is a desktop application that functions as a local artificial intelligence model runtime and an open-standard API server. It enables the execution of large language models directly on local hardware, ensuring that data remains private and accessible offline while providing a unified interface for managing model weights and inference runtimes. The platform distinguishes itself by offering a modular inference backend that allows users to swap execution engines based on hardware compatibility and performance needs. It acts as a cross-platform orchestrator, providing the ability to switch bet

    TypeScript · chatgpt · gpt · llamacpp
  • bradtraversy/50projects50days

    40,441 · View on GitHub↗

    This project is an educational code repository containing a collection of over 50 mini web development exercises. It serves as a front-end learning resource designed to help developers practice foundational skills by building small, interactive projects using standard HTML, CSS, and JavaScript. The repository distinguishes itself by focusing on standalone interactive component prototyping and the implementation of client-side logic. Each project is organized into a decoupled directory structure, allowing users to explore individual interface patterns and visual effects in isolation. These exe

    CSS
  • chakra-ui/chakra-ui

    40,318 · View on GitHub↗

    Chakra UI is a design system component library and styling framework that provides a foundation for building consistent, accessible web interfaces. It functions as a centralized theme configuration engine, using a design-token-driven architecture to manage visual properties like color palettes and spacing rules as a single source of truth across an entire application. The framework distinguishes itself through a type-safe styling utility that automatically generates TypeScript definitions from theme configurations, ensuring accurate property referencing and editor autocompletion. It employs a

    TypeScript · a11y · accessible · ark-ui
  • saadeghi/daisyui

    40,300 · View on GitHub↗

    This project is a utility-first component library that provides a comprehensive suite of pre-styled, reusable interface elements. It functions as a build-time engine that generates design-system-compliant styles by mapping semantic tokens to standard HTML elements and utility classes. By compiling all component styles into static CSS at build time, the library eliminates the need for client-side style calculation, ensuring efficient performance. The library distinguishes itself through a configuration-driven architecture that manages color palettes and visual styles, enabling dynamic switchin

    Svelte · component · component-library · components
  • calcom/cal.com

    40,288 · View on GitHub↗

    Cal.com is a comprehensive scheduling infrastructure platform designed to manage availability, booking workflows, and calendar synchronization across multiple users and external services. It provides a backend service for automated appointment scheduling, enabling the creation, confirmation, and management of booking lifecycles through a centralized state machine. The platform also offers embeddable user interface components that allow developers to integrate interactive booking experiences directly into third-party websites. What distinguishes the platform is its extensible app ecosystem and

    TypeScript · next-auth · nextjs · open-source