321 repos
Awesome GitHub repositories, curated.
A community-curated directory of interesting public GitHub repositories. Ask in plain English — AI ranks by relevance. Save what you find.
Browse repositories
- binhnguyennus/awesome-scalability
68,707 stars
This project is a curated knowledge repository that aggregates high-quality resources, technical documentation, and expert insights focused on distributed systems engineering. It serves as a community-driven learning resource designed to help developers navigate the complexities of building and maintaining large-scale software applications. The repository distinguishes itself through a hierarchical taxonomy that organizes vast amounts of technical information into a structured, searchable format. By utilizing markdown-based content curation and static indexing, the collection remains version-controlled and accessible without the need for complex database queries. This structure relies on distributed contributions to ensure the materials remain aligned with current industry standards. The collection covers a broad range of engineering domains, including system architecture design, performance optimization strategies, and organizational practices for technical teams. It also provides a comprehensive index of materials intended to support professional growth and preparation for technical interviews, encompassing principles of availability, stability, and scalability.
Tags: architecture, awesome, awesome-list
- pathwaycom/pathway
59,684 stars
Pathway is a high-performance data processing framework designed for building unified batch and streaming pipelines. It functions as an orchestrator for complex data transformations, utilizing a differential dataflow engine to process updates incrementally. By treating static datasets and continuous event streams with identical logic, the platform ensures exactly-once processing semantics and consistent results across diverse data sources. The framework distinguishes itself through its specialized support for real-time artificial intelligence and retrieval-augmented generation. It features integrated vector-aware data ingestion, which automates the creation and maintenance of searchable document indexes that update instantly as new data arrives. Developers can connect language models directly into their pipelines, utilizing built-in capabilities for document chunking, embedding generation, and result reranking to maintain synchronized, context-aware information retrieval. Beyond its core processing capabilities, the platform provides a robust infrastructure for deploying data applications. It supports the transition from batch to streaming workflows by simply updating input connectors, while its containerized deployment model allows for scaling services across local and cloud environments. The system is designed to handle large-scale event-driven tasks, providing a consistent programming model for both analytics and automated content generation workflows.
Tags: batch-processing, data-analytics, data-pipelines
- PaddlePaddle/PaddleOCR
70,931 stars
PaddleOCR is a comprehensive optical character recognition framework designed for detecting and transcribing text from images and documents into structured, machine-readable formats. It provides a modular computer vision pipeline that decouples image preprocessing, text detection, and character recognition into independent, configurable stages. This architecture supports automated document digitization and multilingual text recognition, capable of identifying text in over one hundred languages across diverse environments ranging from scanned documents to industrial scenes. The framework distinguishes itself through a hardware-agnostic inference layer and a high-performance execution engine that enables consistent model deployment across CPUs, GPUs, and mobile hardware. It facilitates high-throughput production environments by utilizing static graph execution and distributed device orchestration, which allow for the scaling of recognition tasks across multiple hardware accelerators and network services. To support flexible integration, the system includes a cross-platform deployment toolkit and utilities for exporting models into universal formats. It provides granular control over resource utilization through multi-process parallelism and custom inference distribution, ensuring efficient performance for both local processing and remote network service deployment.
Tags: ai4science, chineseocr, document-parsing
- typst/typst
51,468 stars
Typst is a programmable, markup-based typesetting engine designed for professional document creation. It functions as a scriptable publishing toolchain that transforms plain text and code into complex, paginated outputs. By utilizing a high-performance compiler, the system automates document assembly, mathematical rendering, and dynamic content generation, providing a unified workflow for academic and technical authoring. The engine distinguishes itself through a declarative layout framework that uses cascading rules to manage document structure and visual styling. Unlike traditional systems, it employs an incremental layout engine that performs multiple passes to resolve cross-references, counters, and dynamic content placement. This is supported by a sandboxed functional scripting runtime, which allows users to define custom logic for data processing and layout manipulation, ensuring that document state remains consistent throughout the compilation process. The system provides a comprehensive suite of tools for managing document elements, including automated bibliography generation, structured table creation, and hierarchical sectioning. It supports precise control over page geometry and typography, while its introspection capabilities allow for advanced querying of document state and element locations. These features are complemented by a robust set of foundational data management primitives, enabling users to handle complex collections, numeric data, and time-based logic within their documents. The project provides a command-line interface for compiling source files into portable formats like PDF, with built-in support for accessibility standards. Detailed documentation, including syntax references and architectural overviews, is available to guide users through the installation and implementation of the typesetting environment.
Tags: compiler, markup, typesetting
- hiyouga/LlamaFactory
67,386 stars
LlamaFactory is a unified framework for fine-tuning and adapting large language models. It provides a comprehensive platform that standardizes training workflows across diverse machine learning architectures, allowing users to execute both full-tuning and parameter-efficient methods through a single interface. The project distinguishes itself by offering a low-code visual dashboard that enables users to configure experiments and monitor performance metrics in real time without writing extensive custom scripts. It also features a configuration-driven orchestration system that decouples experiment logic from the underlying execution engine, alongside an OpenAPI-compliant server that exposes trained models as standard network endpoints for integration with external software. Beyond its core training capabilities, the platform supports real-time experiment tracking by streaming performance data to external monitoring services. This allows for the evaluation of model progress and the optimization of parameters throughout the development lifecycle. The software is designed to be installed and configured as a standalone environment for managing the end-to-end lifecycle of language model adaptation.
Tags: agent, ai, deepseek
- tensorflow/tfjs-examples
6,783 stars
This repository provides a collection of practical demonstrations and implementation guides for machine learning tasks using TensorFlow.js. It serves as a resource for developers to explore model architectures, training workflows, and data manipulation techniques across domains such as computer vision, natural language processing, and reinforcement learning. The project covers the full lifecycle of machine learning development, including tensor-based mathematical operations, model construction via high-level layer APIs or low-level tensor logic, and model serialization for various storage mediums. It includes utilities for converting models into browser-compatible formats and provides infrastructure for executing these models across diverse backends, including WebGL, WebAssembly, and plain CPU execution. Documentation and examples are organized by task type, allowing users to browse implementations for regression, object detection, and generative models. The repository also includes deployment guides for hosting server-side applications on cloud platforms, alongside tools for managing tensor memory and asynchronous training processes.
- LadybirdBrowser/ladybird
58,620 stars
Ladybird is an independent, cross-platform web browser built from the ground up with a modular architecture. It functions as a standalone application that fetches, processes, and renders web content directly from the internet. At its core, the project serves as a research platform for browser architecture, focusing on the development of a custom rendering engine and a high-performance JavaScript runtime designed to interpret modern web standards. The browser distinguishes itself through a multi-process architecture that isolates the user interface, network requests, and web content rendering to enhance stability and security. It utilizes a custom layout engine for geometry calculation and hardware-accelerated painting to manage visual composition. To maintain responsiveness, the engine employs event-driven networking and a just-in-time compilation pipeline that translates script code into machine instructions during runtime. The project supports a broad range of web platform capabilities, including media playback, vector graphics rendering, and advanced style sheet processing. It incorporates automated memory management through reference counting and garbage collection to ensure long-term stability. Comprehensive documentation is available to assist with building the application, configuring development environments, and extending the internal rendering pipeline.
Tags: browser, browser-engine
- pmndrs/zustand
57,057 stars
Zustand is a state management library that provides a centralized store for managing shared application data. It functions as a reactive container that connects application state to components, allowing them to subscribe to specific slices of data and trigger updates automatically. By utilizing selector-based data access and immutable state updates, the library ensures that components only re-render when their observed data changes, maintaining a predictable and efficient data flow. The library distinguishes itself through a pluggable, middleware-based architecture that allows for the extension of store functionality, such as integrating diagnostic tools or handling complex state transitions. It supports asynchronous action handling and reactive side effect management, enabling developers to orchestrate background tasks and external updates directly within the store. Furthermore, it provides built-in utilities for automatic state persistence to browser storage and diagnostic interfaces for monitoring and reverting state changes during development. Beyond its core reactive capabilities, the project includes comprehensive support for type-safe state definitions, ensuring consistent data access throughout the development lifecycle. It simplifies the management of deeply nested data structures through an immutable interface that handles object copying and replacement. The library is designed to be installed as a dependency and provides hooks to facilitate the binding of functional components to the global state.
Tags: hacktoberfest, hooks, react
- localstack/localstack
64,423 stars
LocalStack is an infrastructure development environment that provides a local simulation of cloud services. By leveraging container-orchestrated service lifecycles, it allows developers to build, test, and debug cloud-native applications on their local machines without requiring remote connectivity or incurring cloud provider costs. The platform distinguishes itself through sophisticated traffic redirection and request routing, which intercept cloud service calls at the network layer and redirect them to local handlers. This enables seamless integration with existing development workflows, allowing users to mock cloud resources, replicate infrastructure states, and execute ephemeral testing environments within continuous integration pipelines. Beyond core emulation, the platform includes a comprehensive suite of developer tools for managing service lifecycles, monitoring activity, and configuring runtime environments. It supports complex distributed architectures through event-driven simulation, persistent storage mapping, and dynamic configuration injection, ensuring that local environments accurately mirror production requirements. The system is designed for integration into automated build and deployment workflows, providing visual dashboards and terminal-based interfaces for real-time resource management and infrastructure troubleshooting.
Tags: aws, cloud, continuous-integration
- angular/angular.js
58,970 stars
AngularJS is a structural framework for building dynamic web applications by extending standard HTML with custom tags and attributes. It operates as a client-side template engine that transforms declarative markup into interactive components, organizing application logic through a model-view-controller pattern. By utilizing a centralized dependency injection container, the framework manages the lifecycle of services and components to ensure modularity and maintainable architecture. The framework is defined by its two-way data binding mechanism, which automatically synchronizes data models with the user interface. It achieves this through dirty-checking, where each digest cycle compares the current value of every watched expression against the value recorded on the previous pass and propagates any changes between the view and the underlying data. This process is supported by hierarchical scope inheritance, allowing nested components to access and modify parent data models, and expression-based evaluation that enables dynamic logic directly within the document markup. Beyond its core rendering and binding capabilities, the project provides a comprehensive suite of tools for application development. This includes a service-oriented architecture for encapsulating business logic, built-in data transformation filters, and extensive support for automated testing, covering both isolated unit tests and end-to-end browser workflows. The framework also offers granular control over document elements, including conditional rendering, event handling, and input validation.
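The dirty-checking idea in the AngularJS entry can be illustrated with a small sketch. This is a toy model written in Python for illustration, not AngularJS code: the `Scope` class, its `watch` and `digest` methods, and the `max_passes` limit are hypothetical names chosen here, loosely modeled on the watcher and digest concepts the description mentions.

```python
_UNSET = object()  # sentinel meaning "this watcher has never been evaluated"


class Scope:
    """Toy scope: holds model data plus a list of registered watchers."""

    def __init__(self):
        self.data = {}
        self._watchers = []

    def watch(self, getter, listener):
        """Register a getter to observe and a listener to call when it changes."""
        self._watchers.append({"get": getter, "fn": listener, "last": _UNSET})

    def digest(self, max_passes=10):
        """Re-evaluate all watchers until a full pass finds no change.

        Returns the number of passes taken. Listeners may change the model,
        so the loop repeats until values are stable (assumed limit: max_passes).
        """
        for passes in range(1, max_passes + 1):
            dirty = False
            for w in self._watchers:
                new = w["get"](self)
                if w["last"] is _UNSET or new != w["last"]:
                    old = None if w["last"] is _UNSET else w["last"]
                    w["last"] = new  # remember the value for the next comparison
                    w["fn"](new, old)
                    dirty = True
            if not dirty:
                return passes
        raise RuntimeError("model did not stabilize within max_passes")


# Usage: the "view" is just a list that records what would be rendered.
scope = Scope()
rendered = []
scope.data["name"] = "Ada"
scope.watch(lambda s: s.data["name"],
            lambda new, old: rendered.append(f"Hello, {new}!"))

scope.digest()                 # first evaluation counts as a change
scope.data["name"] = "Grace"   # model changes outside the digest
scope.digest()                 # the change is detected and propagated
```

The sketch compares values with `!=`, so it only detects reassignment of simple values; watching a mutable object in place would require storing a copy of the previous value.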
- zylon-ai/private-gpt
57,116 stars
This project is a privacy-first backend service designed to facilitate retrieval-augmented generation by processing local documents into searchable vector representations. It provides a modular architecture that allows users to ingest diverse file formats, manage document metadata, and perform semantic searches to provide context-aware responses for chat and completion requests. The system distinguishes itself through a database-agnostic abstraction layer that supports various storage backends, ranging from local disk storage to enterprise-grade vector databases. It offers flexible deployment options, enabling users to run language models entirely on private hardware or connect to external cloud-based providers through a unified interface. To improve the quality of generated output, the engine incorporates reranking logic that refines retrieved document chunks before they are processed by the language model. The platform includes a comprehensive suite of tools for managing document intelligence pipelines, including automated parsing, text chunking, and embedding generation. Users can configure the system through environment-based profiles to match specific hardware capabilities, such as CPU or GPU-accelerated setups, and stream responses in real time to reduce latency. The application is configured via runtime settings files and environment variables, with support for building custom container images to suit specific deployment requirements.
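The chunk, embed, retrieve, and rerank stages named in the private-gpt entry can be sketched in a few lines. This is a generic illustration under loud assumptions, not the project's code: the function names are hypothetical, the "embedding" is a plain bag-of-words count standing in for a learned embedding model, and the reranker is a simple query-term coverage score standing in for a real reranking model.

```python
import math
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of at most `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding (assumption): lowercase bag-of-words term counts."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """First stage: keep the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def rerank(query, candidates):
    """Second stage (stand-in scorer): order by fraction of query terms covered."""
    q_terms = set(embed(query))

    def coverage(c):
        return len(q_terms & set(embed(c))) / (len(q_terms) or 1)

    return sorted(candidates, key=coverage, reverse=True)


# Usage with three hand-written chunks.
chunks = [
    "Invoices are due within thirty days of receipt.",
    "The office is closed on public holidays.",
    "Late invoices incur a two percent fee.",
]
query = "when are invoices due"
candidates = retrieve(query, chunks, k=2)
context = rerank(query, candidates)  # what would be handed to the language model
```

The two-stage split is the point of the sketch: a cheap similarity search narrows the pool, and a second scorer reorders only the survivors before they reach the model.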
- deepfakes/faceswap
54,974 stars
Faceswap is a comprehensive framework for automated media manipulation and neural face synthesis. It provides a modular pipeline that manages the entire lifecycle of facial feature extraction, deep learning model training, and image conversion. By coordinating complex computer vision workflows, the system enables users to map facial identities between source and destination datasets while maintaining structural alignment and lighting consistency across video frames. The project distinguishes itself through a highly extensible plugin-based architecture that handles hardware-accelerated processing and multi-stage image post-processing. It includes specialized tools for manual alignment verification, allowing users to refine detected facial data through a graphical interface to ensure high-quality results. The system also features robust batch-oriented data processing, which partitions media into standardized chunks to optimize memory usage and throughput during intensive neural network operations. Beyond its core synthesis capabilities, the framework covers a broad range of computer vision tasks including facial landmark detection, pose estimation, and mask generation. It integrates sophisticated model management utilities, such as automated loss calculation, gradient clipping, and snapshot recovery, to ensure stable training sessions. The system also provides extensive diagnostic tools for hardware performance monitoring and environment validation, ensuring compatibility across various compute accelerators. The software is managed through a centralized command-line and graphical toolkit that supports persistent configuration and session state management. It is designed to run on diverse hardware configurations by dynamically querying available compute resources and routing tensor operations to the optimal processor.
Tags: deep-face-swap, deep-learning, deep-neural-networks
- sindresorhus/awesome-nodejs
65,038 stars
This project is a community-driven directory that aggregates essential software projects and educational content for the Node.js ecosystem. It functions as a centralized knowledge base and discovery index, designed to simplify the navigation of a fragmented technical landscape by providing a structured collection of high-quality links, tools, and learning materials. The repository distinguishes itself through a decentralized, peer-reviewed curation model. By utilizing standard version control workflows and pull requests, the community ensures that all listed resources undergo human verification to maintain relevance and quality. This approach transforms a vast array of external links into a single, searchable, and maintainable static document. The collection covers a broad spectrum of development needs, ranging from backend application infrastructure and web frameworks to command-line tooling and testing utilities. Beyond software packages, it serves as a comprehensive reference for developer skill advancement, offering access to curated articles, books, courses, and newsletters that support ongoing technical proficiency.
Tags: awesome, awesome-list, javascript
- anthropics/claude-code
125,543 stars
Anthropic's terminal-native AI coding agent.
Tags: ai, cli, developer-tools
- toeverything/AFFiNE
63,081 stars
AFFiNE is a collaborative knowledge base and productivity suite designed as a privacy-first, local-first platform. It provides an integrated workspace that combines structured documents with an infinite digital canvas, allowing users to organize complex information through a block-based model. By prioritizing local data persistence, the platform ensures immediate responsiveness and data sovereignty while maintaining a distributed state for real-time synchronization across multiple devices. The platform distinguishes itself through a canvas-integrated database engine that enables transitions between free-form whiteboarding and structured tabular views. It utilizes conflict-free replicated data types to manage concurrent edits, ensuring consistent collaboration. Users can extend the workspace with modular artificial intelligence integrations, which use natural language prompts to generate, summarize, and transform content into various visual or structured formats. The software is built for self-hosting, allowing teams to maintain full control over their data and infrastructure. It supports container-orchestrated deployment, providing tools for managing private workspaces, authentication, and production-ready environments. The system is designed to be installed and configured on personal or team-managed infrastructure, ensuring that all sensitive information remains within a private, secure, and scalable environment.
Tags: app, crdt, editor
- ytdl-org/youtube-dl
139,680 stars
This project is a command-line utility for downloading media from various online platforms. It provides comprehensive tools for selecting specific video and audio formats, managing playlist downloads, and filtering content based on metadata such as upload dates and file sizes. The software includes extensive filesystem and output controls, allowing users to define custom naming templates, manage subtitle tracks, and retrieve thumbnails. An automated post-processing pipeline supports tasks like audio extraction, format conversion, and metadata embedding. To ensure reliable operation, the tool offers configurable network settings, including proxy support, retry logic, and rate limiting. It also provides authentication management for restricted content and simulation modes that allow users to extract metadata or URLs without initiating a full download.
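The custom naming templates mentioned in the youtube-dl entry use Python-style `%(field)s` placeholders filled from a media item's metadata. The sketch below shows only that expansion step with plain Python string formatting. The `render_filename` helper and the sample metadata are hypothetical, and replacing `/` with `_` is a simplification assumed here, not the tool's actual filename sanitization.

```python
# Template in the style youtube-dl documents for output filenames.
DEFAULT_TEMPLATE = "%(title)s-%(id)s.%(ext)s"


def render_filename(template, info):
    """Expand a %(field)s template from a metadata dict (hypothetical helper).

    String values have path separators replaced so a title cannot
    accidentally create subdirectories (simplified assumption).
    """
    safe = {
        key: value.replace("/", "_") if isinstance(value, str) else value
        for key, value in info.items()
    }
    return template % safe


# Usage with made-up metadata for one playlist entry.
info = {"id": "abc123", "title": "Intro / Part 1", "ext": "mp4",
        "playlist_index": 7}

default_name = render_filename(DEFAULT_TEMPLATE, info)
numbered_name = render_filename("%(playlist_index)03d-%(title)s.%(ext)s", info)
```

Because the placeholders are ordinary `%` format specifiers, numeric fields accept padding such as `%(playlist_index)03d`, which keeps playlist files sorted in listing order.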
- gorhill/uBlock
61,640 stars
uBlock is a browser-based content blocker that functions as a declarative filtering engine to intercept network requests and modify web page content. It operates by parsing standardized filter lists into optimized data structures, allowing it to block network hosts, enforce security policies, and prevent unauthorized data transmission. The extension provides a comprehensive security layer that monitors outgoing traffic and disables intrusive browser features to enhance user privacy. What distinguishes this project is its granular control over filtering behavior through a dynamic rule orchestrator. Users can manage custom rules, apply site-specific overrides, and toggle filtering settings on a per-domain basis. The engine also employs advanced techniques such as CNAME uncloaking, IP address filtering, and response body modification to identify and neutralize trackers that attempt to bypass standard blocking methods. Furthermore, it supports enterprise-grade deployment, enabling organizations to enforce consistent security and filtering configurations across managed environments. The project covers a broad capability surface including cosmetic page modification, which uses CSS injection and sandboxed scriptlets to remove visual clutter and neutralize anti-blocking scripts. It also provides interactive tools for real-time network traffic inspection and manual element removal, ensuring users can debug and customize their browsing experience. The extension is designed to maintain high performance by synchronizing its initialization at startup, ensuring that all security rules are active before any network requests are processed.
Tags: blocker, browser-extension, chromium
- Solido/awesome-flutter
59,015 stars
This project is a community-curated directory of resources, libraries, and tools designed to support developers working with the Flutter framework. It functions as a centralized knowledge base, organizing high-quality external references into a structured, human-readable format to assist in the discovery of technical materials for cross-platform application development. The directory distinguishes itself through a comprehensive index of the global Flutter ecosystem, including local user groups, meetups, and communication channels that connect developers to international support networks. It maintains a version-controlled, community-driven taxonomy that categorizes diverse technical resources into logical domains, ensuring that developers can efficiently locate relevant packages, architectural guides, and best practices. The collection covers a broad capability surface, ranging from foundational development tools and state management patterns to advanced topics like graphics rendering, testing frameworks, and backend integration. It also provides access to structured learning paths, including roadmaps, tutorials, and expert-led interviews, to help developers advance their technical proficiency. The repository is maintained as a static document, relying on distributed contributions and pull requests from the community to keep the index of tools and community groups current.
Tags: android, awesome, awesome-list
- opencv/opencv
86,238 stars
OpenCV is a comprehensive computer vision library designed for real-time performance and cross-platform deployment. It provides a native execution environment that leverages multi-threaded operations and automated memory management to handle intensive computational tasks, including image processing and machine learning model inference. The library distinguishes itself through a data-oriented matrix framework that utilizes proxy-based array abstractions to provide a consistent interface for multidimensional data. By employing factory-pattern algorithm interfaces and runtime type dispatching, it ensures long-term API stability and enables cross-language bindings, allowing developers to integrate high-performance vision capabilities into diverse hardware and software environments. The project covers a broad range of functional requirements, including automated memory allocation, saturation-aware arithmetic for pixel-level operations, and standardized error handling. It maintains a clean integration surface through namespace-encapsulated structures and rigorous coding standards. Technical documentation is generated from standardized inline comments, and the codebase is supported by a comprehensive suite of unit tests to ensure reliability across versions.
Tags: c-plus-plus, computer-vision, deep-learning
- AppFlowy-IO/AppFlowy
68,167 stars
AppFlowy is a local-first knowledge base and collaborative workspace platform designed for structured information management. It functions as a modular productivity suite where users organize content through a block-based document model, allowing for flexible nesting and granular manipulation of data. The system prioritizes data sovereignty by enabling self-hosted storage, ensuring that sensitive information remains under user control while maintaining offline accessibility. The platform distinguishes itself through a decoupled architecture that separates its high-performance, memory-safe core logic from the user interface. This design supports an event-driven synchronization engine that maintains consistency across local caches and collaborative sessions. Users can extend the system via a modular plugin architecture, which facilitates the integration of external or local intelligence models to automate content creation, summarize datasets, and assist with complex organizational tasks. Beyond its core document capabilities, the platform provides tools for structured data management, including relational tables that allow for the categorization, filtering, and visualization of information. The interface is built on a cross-platform rendering framework to ensure consistent performance across desktop and mobile environments.
Tags: blog, confluence-alternative, content-management