12 repositories
Infrastructure configurations designed to isolate sensitive data processing within secure, private network boundaries.
Distinguishing note: Focuses on the deployment of isolated processing services for privacy compliance rather than general encryption libraries.
Explore 12 awesome GitHub repositories matching security & cryptography · Private Data Processing Environments.
Shannon is an integrated security platform designed for autonomous penetration testing, static and dynamic analysis, and automated vulnerability remediation within self-hosted, private infrastructure. It functions as a unified security suite that orchestrates the entire lifecycle of vulnerability management, from initial discovery and reachability prioritization to the generation and verification of code-level patches. The platform distinguishes itself through its agentic approach to security, deploying autonomous agents to execute both black-box and white-box exploits against running applica
Deploys security testing tools within isolated environments to keep sensitive source code and analysis data within the local perimeter.
This repository serves as a comprehensive research platform and toolkit for advancing machine learning, quantum computing, and large-scale scientific data analysis. It provides foundational frameworks for developing complex algorithmic systems, offering the necessary infrastructure for distributed training, computational graph execution, and high-performance model development. The project distinguishes itself by integrating specialized research domains with robust, privacy-preserving methodologies. It supports diverse scientific discovery through tools for quantum simulation, physics-informed
Collects and combines sensitive information from multiple devices using secure, privacy-preserving cryptographic protocols.
Marker is a comprehensive document processing platform designed to automate the conversion, extraction, and structuring of data from complex files. It functions as an orchestration engine that chains modular processing steps into versioned, reusable pipelines, allowing organizations to standardize document handling and automate repetitive business tasks at scale. The platform distinguishes itself through its support for secure, private infrastructure deployment, enabling users to run containerized services within their own environments to maintain strict data privacy. It features specialized
Deploys containerized processing services within private environments to maintain data privacy and control over sensitive document workflows.
Prefect is a workflow orchestration platform designed to define, schedule, and monitor complex data pipelines as Python code. It functions as a container-native engine that wraps individual tasks in isolated environments, ensuring consistent dependencies and resource allocation across diverse infrastructure. By utilizing a state-machine-based orchestration model, the system tracks execution progress through discrete transitions and persistent event logs to maintain reliable and observable task processing. The platform distinguishes itself through a decoupled worker-API architecture, which sep
Stores tokens, usernames, and passwords to enable secure access to private repositories.
DocsGPT is a retrieval-augmented generation platform and private knowledge base used to build AI agents that perform grounded search and analysis. It functions as a multi-model AI orchestrator and enterprise agent builder, allowing for the integration of various local and cloud language models to customize reasoning and text generation. The project provides a visual environment for developing automated assistants using conditional logic and third-party API connectivity. It enables the creation of private AI agents capable of performing enterprise search and detailed document analysis using pr
Enables detailed analysis and insight extraction from private PDFs, office files, and images.
Unstructured is an enterprise-grade data orchestration engine designed to transform raw, unstructured files into structured, machine-readable formats. It functions as a comprehensive platform for document ingestion, partitioning, and enrichment, specifically engineered to prepare complex data for retrieval-augmented generation and agentic AI workflows. The platform distinguishes itself through its sophisticated document processing strategies, which combine rule-based extraction with vision-language models to handle diverse file layouts, tables, and images. It provides a modular architecture t
Hosts data processing pipelines within dedicated or private cloud infrastructure to ensure data security, regulatory compliance, and environment isolation.
Petals is a decentralized framework and inference engine for running large language models across a peer-to-peer network. It enables the execution of models that exceed the memory of any single machine by splitting computations and model layers across a collaborative swarm of GPUs. The system functions as a collaborative compute network where participants share local GPU resources and host model weights. It supports distributed prompt-tuning to adapt massive models to specific tasks and allows for the establishment of private compute swarms to process sensitive data within restricted, trusted
Enables the creation of restricted networks of trusted hardware to process sensitive data in isolation.
Kreuzberg is a document extraction engine that converts PDFs, Office files, images, and over 90 other formats into clean, structured text and metadata. It is built around a compiled Rust core that can be used as a native library, a command-line tool, a REST API server, or a WebAssembly module for browser-based processing. The system is designed to run entirely on self-hosted infrastructure, with no data leaving the user's environment. What distinguishes Kreuzberg is its breadth of integration surfaces and its pipeline architecture. It exposes extraction capabilities through native bindings fo
Processes documents entirely on self-hosted infrastructure with no data leaving the environment.
The Snyk CLI is a command-line security scanner that detects known vulnerabilities across open-source dependencies, proprietary application code, container images, and infrastructure-as-code configuration files. It also serves as a platform management tool, allowing users to configure organizations, users, SSO, and reporting from the terminal rather than the web dashboard. The CLI integrates directly into development workflows, enabling scanning within IDEs, build pipelines, and version control systems. It implements static analysis with interfile data flow analysis to find complex security f
Analyzes private Git repositories by deploying proxies that bridge scanning services and internal code.
Costrict is an AI-powered software engineering agent and coding assistant designed for enterprise-level development. It functions as a multi-model AI orchestrator that generates, completes, and reviews code, and as a remote development environment that connects browser interfaces to remote code directories for file management and terminal command execution. The platform distinguishes itself through an AI code review system that uses multi-model verification and repository indexing to ensure code quality. It employs a structured agent approach that breaks complex natural-language requirements into a sequential workflow of analysis, planning, and testing to maintain architectural control. The system covers broad capability areas, including remote workspace management, custom AI model integration, and automated code review for repositories and git diffs. It also offers extensibility through a skill-based agent system and integration with external tools via a standard context protocol. The project is implemented in TypeScript and provides editor integration through extensions that unify the workflow inside supported editors.
Implements isolated infrastructure configurations and end-to-end encryption to ensure data privacy.
Starred is a utility that automates the management and documentation of starred repositories. It functions by fetching repository metadata through the GitHub API and organizing these projects into structured, categorized lists based on programming language or topic. The tool distinguishes itself by maintaining these lists through automated, scheduled workflows that synchronize data directly to a dedicated repository. It supports the inclusion of private repositories in the generated output, ensuring that a user's complete collection is documented and backed up. The project provides a configu
Securely processes private codebases using temporary authentication tokens.
This project is a private document analysis tool that enables conversational interaction with PDF files by running all language model inference and processing entirely on the local machine. By running models directly in the browser or local environment, it ensures that sensitive user data stays offline and inaccessible to external servers or third-party cloud providers. The system uses retrieval-augmented generation (RAG) to deliver context-aware answers, backed by local document text extraction and vector embedding indexing. This architecture allows semantic search and information retrieval without relying on external database services or an internet connection. Beyond the core conversational capabilities, the tool includes observability features that log the internal steps of the model's logic and retrieval chains. This execution tracing makes it possible to debug performance issues and improve response quality during document analysis.
Processes sensitive PDF files locally to answer questions without sending data to external servers.