521 repos
Browse by top-level category.
Explore 521 GitHub repositories by category.
Crawl4AI is an AI-powered web crawling and data extraction engine designed to transform complex web content into structured formats. It functions as a headless browser orchestrator, enabling the navigation of dynamic websites, the execution of custom scripts, and the capture of visual assets like screenshots and PDFs. By integrating language models directly into the extraction workflow, the system converts raw HTML into clean, structured data or Markdown files optimized for downstream ingestion. The platform distinguishes itself through a distributed, self-hosted infrastructure that manages l
This project is a comprehensive algorithmic interview resource and coding practice repository. It provides a structured curriculum of programming challenges and source code implementations designed to help software engineers master efficient problem-solving techniques and prepare for technical assessments. The repository functions as a curated roadmap, organizing computer science fundamentals by data structure and algorithm topic to facilitate systematic skill development. By moving away from random practice, it supports career advancement training for those seeking to improve their professio
MinIO is a software-defined, cloud-native object storage server designed to manage large volumes of unstructured data. It functions as a distributed storage cluster that aggregates multiple independent nodes into a unified, scalable pool, providing a high-performance infrastructure compatible with standard cloud storage protocols and application programming interfaces. The system utilizes a shared-nothing architecture that eliminates central metadata servers, relying instead on a decentralized hash table to map objects across the cluster. Data availability and resilience are maintained throug
Markdown Here is a browser extension that enables rich text composition within web-based editors that lack native formatting support. By transforming plain text markdown syntax into rendered HTML, it allows users to draft professional emails and documents using standard markup, including headers, tables, and footnotes, directly inside their browser. The tool distinguishes itself through a bidirectional transformation engine that supports both the conversion of markdown to HTML and the reversion of rendered content back into its original source code. This state-preserving functionality allows
jQuery is a library designed for document object model manipulation and cross-browser interaction. It provides a unified interface for selecting, traversing, and modifying web page elements, ensuring consistent behavior across different rendering engines by abstracting away underlying browser inconsistencies. The library distinguishes itself through a dedicated CSS selector engine that parses strings into executable functions for element location. It incorporates a state machine for managing asynchronous operations and a feature-detection strategy that probes the environment to execute code p
Scrapy is a comprehensive framework designed for automated web data extraction and large-scale crawling. It operates on an asynchronous, event-driven engine that manages non-blocking network requests and data processing tasks, allowing for the efficient retrieval of structured information from web documents using path-based selectors. The system distinguishes itself through a highly modular architecture that supports complex data collection workflows. Users can implement custom middleware and signal handlers to intercept and modify request flows, while a priority-based scheduler manages concu
Pathway is a high-performance data processing framework designed for building unified batch and streaming pipelines. It functions as an orchestrator for complex data transformations, utilizing a differential dataflow engine to process updates incrementally. By treating static datasets and continuous event streams with identical logic, the platform ensures exactly-once processing semantics and consistent results across diverse data sources. The framework distinguishes itself through its specialized support for real-time artificial intelligence and retrieval-augmented generation. It features in
Nuxt is a universal web framework designed for building full-stack applications that seamlessly transition between server-side rendering and client-side interactivity. It provides a comprehensive development environment that automates routing, dependency injection, and type generation, allowing developers to focus on application logic rather than manual configuration. By executing code in a platform-agnostic server engine, it supports deployment across diverse environments, including edge networks, serverless functions, and traditional Node.js runtimes. The framework distinguishes itself thro
This project is a cross-platform desktop application designed for creating, editing, and managing structured diagrams and technical workflows. It provides a visual modeling environment that allows users to construct complex charts through a drag-and-drop interface, supporting the documentation of processes, software architectures, and system flows. The application distinguishes itself by utilizing a layered canvas composition that enables independent manipulation of diagram components, paired with a keyboard-driven workflow that reduces reliance on the mouse. It employs scalable vector graphics fo
This project is a neural text-to-speech engine and voice cloning toolkit designed to generate synthetic speech that mimics the vocal characteristics of a target speaker. It functions as a real-time audio synthesizer, utilizing a deep learning pipeline to convert written text into high-fidelity speech output with minimal latency. The system employs a transfer learning framework that leverages pre-trained speaker verification models to adapt synthesis to new, unseen vocal identities. By using an encoder-based speaker embedding process, the toolkit maps variable-length audio samples into a laten
Shadowsocks-Windows is a desktop proxy manager that provides a graphical interface for configuring system-wide network routing. It functions as a local SOCKS5 or HTTP proxy server, intercepting outbound traffic through system-level injection to route requests through secure, encrypted remote tunnels. The application distinguishes itself through a modular architecture that supports plugin-based transport extensibility, allowing users to integrate external binaries for custom traffic obfuscation and specialized cryptographic protocols. It also enables high-availability networking by automatical
Git is a distributed version control system and command-line tool designed for tracking changes in source code and coordinating collaborative software development. It functions as a content-addressable storage platform where project data is maintained as immutable objects indexed by cryptographic hashes, ensuring data integrity and efficient deduplication. The system organizes project history as a directed acyclic graph, where each commit serves as a snapshot linked to its parent to create a verifiable timeline of modifications. The architecture distinguishes itself through an index-based sta
Llama is a computational framework and runtime environment designed for executing transformer-based neural networks locally. It functions as a generative AI inference engine, enabling the processing of input sequences through pre-trained model weights to produce text completions and structured data outputs directly on the user's own hardware. The system distinguishes itself through specialized memory and computation management techniques, including memory-mapped weight loading and quantization-aware inference, which allow for efficient execution on standard consumer hardware. It utilizes a stateles
This project is a community-curated directory of resources, libraries, and tools designed to support developers working with the Flutter framework. It functions as a centralized knowledge base, organizing high-quality external references into a structured, human-readable format to assist in the discovery of technical materials for cross-platform application development. The directory distinguishes itself through a comprehensive index of the global Flutter ecosystem, including local user groups, meetups, and communication channels that connect developers to international support networks. It m
Magisk is an Android rooting framework designed to manage system-level modifications and grant administrative access to mobile devices. It functions by patching boot and recovery images to inject custom code into the operating system initialization sequence, allowing for system-wide control while maintaining compatibility with the underlying hardware. The project distinguishes itself through a systemless modification layer that overlays a virtual file system on top of read-only partitions, enabling changes without altering core system files. It includes a policy daemon to manage security cont
AngularJS is a structural framework for building dynamic web applications by extending standard HTML with custom tags and attributes. It operates as a client-side template engine that transforms declarative markup into interactive components, organizing application logic through a model-view-controller pattern. By utilizing a centralized dependency injection container, the framework manages the lifecycle of services and components to ensure modularity and maintainable architecture. The framework is defined by its two-way data binding mechanism, which automatically synchronizes data models wit
Daytona is an open-source development environment manager designed to automate the creation and orchestration of standardized workspaces. It provides a centralized platform for developers to provision, manage, and share consistent coding environments across various infrastructure providers. The platform focuses on environment reproducibility by enabling the definition of workspace configurations as code. It supports integration with existing version control systems and local development tools, allowing teams to maintain uniform setups that reduce configuration drift and onboarding time. The
Ladybird is an independent, cross-platform web browser built from the ground up with a modular architecture. It functions as a standalone application that fetches, processes, and renders web content directly from the internet. At its core, the project serves as a research platform for browser architecture, focusing on the development of a custom rendering engine and a high-performance JavaScript runtime designed to interpret modern web standards. The browser distinguishes itself through a multi-process architecture that isolates the user interface, network requests, and web content rendering
This project is a community-driven library of structured text inputs designed to guide large language models into specific roles, behaviors, and operational modes. It functions as a comprehensive repository of prompt engineering resources, providing reusable templates that allow users to override default model tendencies and enforce domain-specific response patterns through instruction-following logic. The collection distinguishes itself by offering specialized persona-based directives that constrain model output to simulate professional experts or functional technical environments. By utiliz
This project is a full-stack web framework designed for building database-backed applications through a standardized architectural pattern. It provides a comprehensive suite of integrated libraries that manage the entire request-response lifecycle, from routing incoming web traffic to rendering dynamic server-side templates. By utilizing an object-relational mapping layer, the framework allows developers to define domain models that map database tables directly to application objects, simplifying data persistence, schema migrations, and complex relationship management. The framework is distin