What are the best open-source GitHub repositories for Analytics, Dataframes, and Notebooks?

jakevdp/PythonDataScienceHandbook is the closest match. This project is an interactive data science environment that combines code execution, rich media visualization, and narrative documentation into a persistent, browser-based platform. It serves as a comprehensive educational resource for scientific computing, providing a framework for iterative data analysis and machine learning prototyping. The environment is distinguished by its focus on high-performance numerical computing, utilizing v…

Why does jakevdp/PythonDataScienceHandbook match “Analytics, Dataframes, and Notebooks”?

This project is an interactive data science environment that combines code execution, rich media visualization, and narrative documentation into a persistent, browser-based platform. It serves as a comprehensive educational resource for scientific computing, providing a framework for iterative data…

Why does fastai/fastbook match “Analytics, Dataframes, and Notebooks”?

This project is an interactive educational textbook and comprehensive machine learning resource designed for deep learning education. It provides a structured curriculum that combines narrative prose with executable code, utilizing literate programming to create reproducible learning experiences wi…

Why does Vonng/ddia match “Analytics, Dataframes, and Notebooks”?

This project serves as a comprehensive technical reference for the architecture and design of data-intensive applications. It provides a structured analysis of the fundamental principles required to build reliable, scalable, and maintainable software systems, covering the core trade-offs inherent i…

Why does metabase/metabase match “Analytics, Dataframes, and Notebooks”?

Metabase is a business intelligence platform designed to connect to various storage systems and relational databases for data exploration, visualization, and reporting. It provides a centralized environment where users can build queries through a graphical interface or raw code, transforming raw in…

Why does lancedb/lancedb match “Analytics, Dataframes, and Notebooks”?

LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The syst…

Analytics, Dataframes, and Notebooks

Explore open-source tools for data processing, statistical analysis, and interactive computational notebook environments.


jakevdp/PythonDataScienceHandbook
48,561 · View on GitHub
This project is an interactive data science environment that combines code execution, rich media visualization, and narrative documentation into a persistent, browser-based platform. It serves as a comprehensive educational resource for scientific computing, providing a framework for iterative data analysis and machine learning prototyping. The environment is distinguished by its focus on high-performance numerical computing, utilizing vectorized array operations and memory-mapped data structures to handle large-scale computations efficiently. It features a unified estimator interface that st
Jupyter Notebook · Interactive Data Science Environments · Interactive Notebooks · Interactive Shells
fastai/fastbook
24,587 · View on GitHub
This project is an interactive educational textbook and comprehensive machine learning resource designed for deep learning education. It provides a structured curriculum that combines narrative prose with executable code, utilizing literate programming to create reproducible learning experiences within a collection of Jupyter Notebooks. The repository distinguishes itself by teaching machine learning through applied research and modular design. It demonstrates a callback-driven training loop, a declarative data-block pipeline, and a layered abstraction API that allows users to transition betw
Jupyter Notebook · Computational Notebooks · Deep Learning Education · Interactive Textbooks
Vonng/ddia
22,648 · View on GitHub
This project serves as a comprehensive technical reference for the architecture and design of data-intensive applications. It provides a structured analysis of the fundamental principles required to build reliable, scalable, and maintainable software systems, covering the core trade-offs inherent in modern data infrastructure. The repository explores the mechanics of distributed data management, including strategies for replication, partitioning, and achieving consensus across multiple nodes. It details the design of storage engines, indexing techniques, and transaction management models, whi
Python · Data System Design Principles · System Architecture Guides · Architectural Trade-offs
metabase/metabase
47,696 · View on GitHub
Metabase is a business intelligence platform designed to connect to various storage systems and relational databases for data exploration, visualization, and reporting. It provides a centralized environment where users can build queries through a graphical interface or raw code, transforming raw information into interactive dashboards and charts. The platform is built to support self-service analytics, allowing non-technical team members to extract insights without requiring deep knowledge of database syntax. The platform distinguishes itself through a metadata-driven modeling layer that abst
Clojure · Business Intelligence Platforms · Data Query Builders · Interactive Dashboards
lancedb/lancedb
9,031 · View on GitHub
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
HTML · Dataset Versioning Platforms · Vector Databases · Vector Similarity Search
pola-rs/polars
38,855 · View on GitHub
Polars is a high-performance columnar data processing library designed for efficient analytical workflows. It functions as a structured data library that organizes information into typed columns, utilizing the Apache Arrow memory format to enable zero-copy data sharing and cache-friendly, vectorized operations. The engine is built to handle large-scale tabular datasets, providing both local and distributed analytical runtimes that scale from single-machine environments to multi-node clusters. The project distinguishes itself through a sophisticated lazy query engine that constructs abstract e
Rust · Analytical Data Engines · Columnar Data Processors · Distributed Query Engines
modin-project/modin
10,389 · View on GitHub
Modin is a distributed dataframe library and parallel data processing engine designed to handle large datasets that exceed system memory. It functions as a distributed computing framework that parallelizes data manipulation tasks across multiple CPU cores or clusters to increase throughput and avoid memory errors. The project mirrors the Pandas API, allowing for the distribution of data workflows without changing core code logic. It utilizes a pluggable backend interface, which enables users to switch between different distributed execution engines to optimize performance based on available h
Python · Distributed Compute Frameworks · Distributed Data Processing Frameworks · API Compatibility Layers
ClickHouse/ClickHouse
48,229 · View on GitHub
ClickHouse is a high-performance, columnar analytical database designed for real-time query execution and large-scale data aggregation. It functions as a distributed data warehouse capable of processing petabytes of information, while also providing an embedded engine that integrates directly into applications for native query capabilities without external dependencies. The system is built to handle high-throughput ingestion and complex analytical workloads, delivering millisecond-level latency for interactive dashboards and operational monitoring. The platform distinguishes itself through ad
C++ · Access Control Systems · Agent Analytics · Agentic Architectures
wesm/pydata-book
24,668 · View on GitHub
This project serves as a comprehensive textbook and educational resource for data analysis using the Python ecosystem. It provides a structured guide to manipulating, cleaning, and processing datasets, focusing on the core tools required for numerical computing and statistical analysis. The repository distinguishes itself by offering a collection of practical code examples and workflows that demonstrate how to perform complex data tasks. It covers the application of vectorized numerical computations, the management of time-indexed data, and the creation of statistical visualizations to commun
Jupyter Notebook · Data Analysis Guides · Data Analysis Libraries · Data Science Tutorials
OpenBB-finance/OpenBB
69,583 · View on GitHub
OpenBB is a financial data platform and investment research terminal designed to aggregate, normalize, and distribute market data across analytical workflows. It functions as a comprehensive ecosystem that bridges disparate financial data providers with custom applications, spreadsheets, and internal modeling infrastructure. The platform distinguishes itself through a provider-based data abstraction layer that normalizes heterogeneous financial APIs into a consistent, schema-driven format. This architecture supports quantitative research automation and the construction of interactive, widget-
Python · Financial Data Platforms · Investment Research Terminals · Data Access & Abstraction
nushell/nushell
39,743 · View on GitHub
Nushell is a cross-platform shell and programming language designed to treat all input and output as structured data rather than raw text streams. By enforcing data types and command signatures, it provides a consistent environment for building robust, pipeline-oriented workflows. The shell allows users to chain commands that pass structured objects between stages, enabling complex data processing and automation tasks that remain predictable across different operating systems. What distinguishes the project is its focus on interactive data exploration and modular extensibility. Users can quer
Rust · Data Pipelines · Data Structure Definitions · Interactive Data Querying Tools
plausible/analytics
24,245 · View on GitHub
This project is an open-source, privacy-focused web analytics platform designed for high-throughput data ingestion and multi-tenant data management. It provides a cookie-less tracking engine that captures visitor interactions using ephemeral request metadata, ensuring comprehensive traffic visibility while maintaining strict privacy standards. The architecture utilizes an event-driven ingestion pipeline and aggregated metric storage to decouple data collection from processing, enabling efficient long-term retrieval and responsive dashboard performance. What distinguishes this platform is its
Elixir · Privacy-Preserving Analytics · Analytics Proxying · First-Party Collection
bukosabino/ta
4,890 · View on GitHub
This is a pandas-based technical analysis library and financial feature engineering tool. It serves as a vectorized indicator calculator that transforms raw price and volume data into derived metrics for time series analysis. The library uses a NumPy-based engine to perform mathematical operations across entire arrays, avoiding iterative loops to maintain high performance. It organizes technical indicators into a modular class hierarchy with a consistent interface, allowing for bulk feature generation and the direct appending of results as new columns to a pandas DataFrame. The system covers
Jupyter Notebook · Feature Engineering Tools · Momentum Indicators · Pandas Financial Frameworks
umami-software/umami
37,285 · View on GitHub
Umami is a self-hosted, privacy-focused web analytics platform designed to provide full control over infrastructure and user data. It captures website traffic and visitor behavior through anonymous tracking methods that avoid cookies, browser fingerprinting, and the storage of personally identifiable information. The platform distinguishes itself through a comprehensive suite of behavioral analysis tools, including session replays, heatmaps, and cohort-based retention reporting. It features a multi-tenant architecture that allows teams to manage multiple websites within a single, collaborativ
TypeScript · Privacy-Focused Analytics · Privacy-Preserving Analytics · Analytics Tracking
polakowo/vectorbt
6,720 · View on GitHub
VectorBT is a vectorized trading strategy backtesting framework that simulates thousands of strategy configurations in a single pass over historical price data. It operates as a parameter optimization engine, a portfolio performance analyzer, a technical indicator calculator, and a financial data fetcher, all built around a DataFrame-centric data model that uses NumPy broadcasting for signal alignment and compiled code acceleration for performance. The framework distinguishes itself through its ability to run large-scale parameter sweeps by constructing every combination of strategy parameter
Python · Trading Strategy Backtesters · Vectorized Backtesters · Crossover Signal Generators
duckdb/duckdb
38,805 · View on GitHub
DuckDB is an in-process analytical database engine designed to run directly within an application process. As a zero-dependency, embedded system, it provides enterprise-grade SQL data processing capabilities without the overhead of managing a dedicated database server. It is built to handle complex analytical and aggregation tasks by storing and retrieving information in columns, allowing for high-performance relational data manipulation. The engine distinguishes itself through a columnar vectorized execution model that maximizes CPU cache efficiency during query operations. It employs adapti
C++ · Analytical Databases · Columnar Engines · Embedded Databases
saulpw/visidata
8,834 · View on GitHub
VisiData is a terminal-based interactive data analysis tool and browser designed for exploring, filtering, and sorting large tabular datasets. It functions as a structured data inspector that loads and flattens complex formats like JSON, XML, and PCAP into interactive sheets, as well as a terminal file manager for navigating directories and performing staged filesystem operations. The project distinguishes itself by rendering data visualizations, such as scatter plots and histograms, directly in the terminal using Unicode Braille characters. It provides a Python-based data wrangling environme
Python · Dataset Explorers · Tabular Data Wrangling · Terminal Data Visualizations
pandas-dev/pandas
49,039 · View on GitHub
Pandas is a high-performance data analysis library that provides a comprehensive framework for manipulating, cleaning, and transforming structured datasets. It centers on labeled one-dimensional and two-dimensional data structures, allowing users to construct, filter, and reshape tabular information while performing complex arithmetic and logical operations. The library distinguishes itself through a sophisticated indexing engine that enables automatic data alignment during calculations and relational merges. By utilizing a block-based memory layout, it optimizes cache locality for vectorized
Python · Data Analysis Libraries · Data Manipulation Frameworks · Dataframe Constructors
Kanaries/pygwalker
15,628 · View on GitHub
Pygwalker is a library that transforms tabular data into interactive, drag-and-drop interfaces for exploratory analysis and visualization. It functions as a grammar-based framework that translates user interactions into declarative chart definitions, allowing for the creation of dynamic data exploration environments directly within notebooks or embedded web applications. The system distinguishes itself by offloading heavy analytical computations to backend kernels, which maintains responsiveness when visualizing large datasets. It supports the serialization of visual states into portable conf
Python · Data Exploration · Data Visualization · Dataframe Visualizers
jujumilk3/leaked-system-prompts
14,134 · View on GitHub
This project is a research-oriented repository that serves as a centralized database for system-level prompts and internal behavioral instructions extracted from various large language models. Its primary purpose is to provide a transparent, accessible reference for researchers and developers to study how artificial intelligence models are configured, constrained, and governed. The repository distinguishes itself by cataloging the hidden directives and operational guidelines that define model personas and safety boundaries. By archiving these instruction sets, it enables comparative analysis
System Prompt Archives · AI System Instructions · Instructional
ambv/black
41,560 · View on GitHub
Black is a deterministic Python code formatter and style guide enforcer. It automatically reformats source code and Jupyter notebook cells into a consistent style to eliminate manual debates over code layout and reduce noise in version control diffs. The tool uses abstract syntax tree analysis to restructure code layout while ensuring that the underlying functional logic remains unchanged. It employs a deterministic engine that produces a single consistent output for any given input, removing subjective styling choices. The system provides capabilities for in-place file mutation, automated s
Python · Deterministic Formatters · AST Transformation Tools · AST-Based Formatters
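
The claim that reformatting leaves the underlying logic unchanged can be illustrated with a small check built on Python's standard `ast` module. This is only a sketch of the general idea, not Black's own implementation; the helper name `same_ast` and the sample strings are made up for illustration.

```python
import ast


def same_ast(before: str, after: str) -> bool:
    """Return True when two source strings parse to identical syntax trees."""
    # ast.dump() omits line/column attributes by default, so pure layout
    # differences (whitespace, quote style, line breaks) do not show up.
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


# Hypothetical "before" and "after" versions of the same snippet.
messy = "x = {  'a':1,\n   'b':2 }\nprint( x['a']+x['b'] )\n"
tidy = 'x = {"a": 1, "b": 2}\nprint(x["a"] + x["b"])\n'

layout_only = same_ast(messy, tidy)                      # only layout differs
logic_changed = same_ast(messy, tidy.replace("+", "-"))  # operator differs
```

Comparing dumped trees rather than raw text is what lets a check like this ignore spacing and quoting while still catching a changed operator.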
grafana/grafana
74,456 · View on GitHub
Grafana is an observability data platform designed to aggregate metrics, logs, and traces from diverse sources into a unified environment. It functions as a centralized interface for visualizing complex telemetry data, transforming raw streams into interactive dashboards that support real-time system health tracking and performance monitoring. The platform distinguishes itself through a plugin-based modular architecture that integrates disparate databases, cloud services, and monitoring tools via a standardized data abstraction layer. This framework allows for the dynamic loading of external
TypeScript · Observability Data Platforms · Observability Dashboards · Telemetry Collection and Aggregation
psf/black
41,578 · View on GitHub
This project is an uncompromising, deterministic code formatter for Python. It functions by parsing source code into an abstract syntax tree and regenerating it according to a rigid, opinionated set of style rules. By automating the formatting process, it eliminates manual style debates and configuration overhead, ensuring that code remains consistent across entire projects regardless of the original input. The tool distinguishes itself through its focus on speed and seamless integration into development workflows. It utilizes content-based file caching and parallel processing to maintain hig
Python · Code Formatters · Python Development Tools · Automated Formatting Frameworks
microsoft/qlib
44,490 · View on GitHub
This project is a comprehensive platform for quantitative investment research, machine learning, and algorithmic trading. It provides an end-to-end environment for developing, testing, and executing financial strategies, supporting the entire lifecycle from data ingestion and feature engineering to model training and backtesting. The system is distinguished by its configuration-driven workflow orchestration, which allows researchers to automate complex pipelines and manage experiments through declarative files. It features a high-performance data infrastructure that utilizes custom binary for
Python · Algorithmic Trading Frameworks · Algorithmic Trading Platforms · Algorithmic Trading Simulators
PrefectHQ/prefect
21,640 · View on GitHub
Prefect is a workflow orchestration platform designed to define, schedule, and monitor complex data pipelines as Python code. It functions as a container-native engine that wraps individual tasks in isolated environments, ensuring consistent dependencies and resource allocation across diverse infrastructure. By utilizing a state-machine-based orchestration model, the system tracks execution progress through discrete transitions and persistent event logs to maintain reliable and observable task processing. The platform distinguishes itself through a decoupled worker-API architecture, which sep
Python · Data Pipeline Orchestration · Workflow Orchestration · Container-Native Infrastructure
anuraghazra/github-readme-stats
79,661 · View on GitHub
This project is a serverless service that generates dynamic, themeable visual summaries of software development activity. It functions as an automated metadata visualizer, transforming raw platform logs and repository metrics into resolution-independent vector graphics that can be embedded directly into markdown environments. The service distinguishes itself by offering highly configurable, query-parameter-driven rendering that allows users to customize the visual presentation of their coding patterns, language proficiency, and repository details. It supports both real-time generation via ser
JavaScript · GitHub Stats Cards · Language Distribution Cards · Profile Personalization Suites
recommenders-team/recommenders
21,769 · View on GitHub
This project is a recommendation system framework designed for building, evaluating, and operationalizing personalized item suggestion engines. It provides a comprehensive toolkit for implementing collaborative filtering and content-based algorithms, supported by an end-to-end machine learning pipeline for preparing datasets and deploying predictive models. The framework distinguishes itself through the integration of knowledge graphs to provide richer context for recommendations and the use of industry-specific patterns to accelerate system deployment. It also includes a specialized model ev
Python · Recommender Systems · Collaborative Filtering Models · Collaborative Filtering Utilities
wshobson/agents
36,830 · View on GitHub
This project is an automated trading and agentic workflow platform designed to orchestrate complex financial tasks through state-based graphs. It provides a comprehensive framework for building, deploying, and managing autonomous agents that execute multi-step analytical processes, monitor real-time market conditions, and perform high-speed trade execution. The platform distinguishes itself through a robust agentic plugin ecosystem that integrates directly with popular AI-powered development environments and command-line interfaces. It features a specialized financial analysis engine capable
Python · Algorithmic Trading Engines · Automated Trading Platforms · Financial Analysis Tools
Kaggle/kaggle-cli
7,417 · View on GitHub
The Kaggle API command line interface is a suite of utilities for managing datasets, machine learning models, and competition entries from a terminal. It functions as a command line wrapper that translates user input into API calls to control remote cloud resources. The project differentiates itself by providing specialized tools for automating the execution of notebook kernels and managing the lifecycle of machine learning models, including version iteration and performance tracking. It also includes a utility for executing evaluation tasks against large language models and downloading the r
Python · Command Line Interfaces · Kaggle API Clients · Competition Management Systems
pathwaycom/llm-app
59,341 · View on GitHub
This project is a data processing engine and AI application platform designed for building production-grade machine learning workflows. It provides a unified programming model that handles both historical batch data and live stream ingestion, enabling the development of real-time ETL pipelines and scalable data transformation workflows. The framework distinguishes itself through differential dataflow execution, which propagates only changes through a pipeline rather than recomputing entire datasets. It supports distributed state management across worker nodes and utilizes incremental stream p
Jupyter Notebook · Data Processing Frameworks · Differential Dataflow Engines · Distributed State Management
jupyter/notebook
13,204 · View on GitHub
This project is a browser-based interactive computing environment and data science IDE. It serves as a literate programming tool that allows users to create documents combining live code, mathematical equations, visualizations, and narrative text. As a polyglot notebook interface, it connects to various language kernels to execute code and render output within a single interface. The application distinguishes itself by separating the frontend interface from a remote compute engine through a language-agnostic kernel interface. This allows it to support multiple programming languages while main
Jupyter Notebook · Execution Kernels · Interactive Data Science Environments · Block-Based Document Models
dbeaver/dbeaver
50,678 · View on GitHub
DBeaver is a universal database client and administration environment designed for managing diverse relational and non-relational database systems. It provides a unified graphical interface that enables users to perform data manipulation, schema migration, and performance monitoring across multiple platforms. By utilizing a standardized driver abstraction layer, the application translates generic requests into database-specific commands, ensuring consistent interaction regardless of the underlying technology. The project distinguishes itself through an extensible, plugin-based architecture th
Java · Database Management Clients · Database Management Systems · Database Administration Tools
sansan0/TrendRadar
59,513 · View on GitHub
TrendRadar is a market intelligence tool designed to aggregate and analyze external information sources for monitoring shifts in consumer behavior and industry patterns. It functions as a visual data analytics dashboard, transforming raw market data into interactive charts and insights through a component-based interface. The platform utilizes a declarative state management system where application behavior is governed by a centralized configuration object. This architecture supports interactive dashboard development, allowing users to manipulate data sets and visualize emerging trends over t
Python · Market Intelligence Platforms · Analytics Dashboards · Component Architectures
finos/perspective
10,967 · View on GitHub
Perspective is a columnar data analytics library and streaming data visualization engine. It provides an interactive data grid component and notebook analytics widgets designed for processing high-volume data and rendering interactive charts and grids. The system utilizes a high-performance query engine to enable real-time data analysis and streaming dataset visualization. It supports the creation of customizable dashboards and reports that update automatically as new data arrives without requiring full dataset reloads. The project covers large-scale dataset analytics through a schema-driven
C++ · Columnar Data Processors · Real-Time Charting Engines · Client-Side Incremental State Updates
pathwaycom/pathway
62,959 · View on GitHub
Pathway is a high-performance data processing framework designed for building unified batch and streaming pipelines. It functions as an orchestrator for complex data transformations, utilizing a differential dataflow engine to process updates incrementally. By treating static datasets and continuous event streams with identical logic, the platform ensures exactly-once processing semantics and consistent results across diverse data sources. The framework distinguishes itself through its specialized support for real-time artificial intelligence and retrieval-augmented generation. It features in
Python · Data Processing Frameworks · Data Stream Processors · Declarative Pipeline Construction
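
Processing updates incrementally, as the description above puts it, can be pictured as feeding change records into an aggregate instead of recomputing it from scratch. The sketch below is plain Python and does not use Pathway's API; the class `IncrementalSum` and the `(key, value, diff)` record shape are assumptions made for illustration, with `diff` set to `+1` for an insertion and `-1` for a retraction.

```python
from collections import defaultdict


class IncrementalSum:
    """Keeps per-key totals up to date from (key, value, diff) change records."""

    def __init__(self):
        self.totals = defaultdict(int)

    def apply(self, changes):
        """Apply a batch of changes and return only the totals that moved."""
        touched = set()
        for key, value, diff in changes:
            # diff = +1 adds the value, diff = -1 retracts an earlier one.
            self.totals[key] += value * diff
            touched.add(key)
        return {key: self.totals[key] for key in touched}


agg = IncrementalSum()
first = agg.apply([("a", 10, +1), ("b", 5, +1)])
# Correct the earlier row for "a" from 10 to 7: one retraction, one insertion.
second = agg.apply([("a", 10, -1), ("a", 7, +1)])
```

Only key `"a"` is reported by the second batch; the total for `"b"` is never touched, which is the point of propagating changes rather than whole datasets.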
altair-viz/altair
10,410 · View on GitHub
Altair is a declarative data visualization library for Python based on the Vega-Lite grammar. It allows users to create statistical visualizations by mapping data fields to visual properties rather than writing imperative drawing code. The library focuses on interactive charting through a system of linked selections and filters that update multiple visualizations based on user input. It renders charts as JSON and HTML for display in web browsers and interactive notebooks. The project covers statistical data analysis and interactive data exploration, providing capabilities to export visuals a
Python · Declarative Visualization Languages · Declarative Visualization Grammars · Interactive Data Charting
aymericdamien/TensorFlow-Examples
43,749 · View on GitHub
This repository serves as a structured educational resource for machine learning and deep learning, providing a library of executable scripts and notebooks. It is designed to help users master the practical application of data processing, model evaluation, and neural network construction through annotated code samples and guided tutorials. The collection focuses on translating theoretical mathematical concepts into functional code, offering proven patterns for common tasks such as classification and regression. By providing curated examples of layer construction and training loops, the reposi
Jupyter Notebook · Automatic Differentiation Engines · Deep Learning Code Libraries · Tensor Processing Libraries
virattt/ai-hedge-fund
60,143 · View on GitHub
This project is an algorithmic trading platform designed to automate financial market analysis and the execution of investment strategies. It provides an end-to-end environment for processing real-time market data through automated decision models, allowing for the triggering of financial transactions based on predefined quantitative signals and risk parameters without manual intervention. The platform distinguishes itself through a modular pipeline architecture that decouples data ingestion, signal generation, and trade execution, facilitating the iterative refinement of investment models. I
Python · Algorithmic Trading Platforms · Algorithmic Trading · Backtesting Engines
marimo-team/marimo
21,468 · View on GitHub
Marimo is a reactive Python notebook environment and data science integrated development environment. It functions as a scripting tool that maintains state consistency by automatically tracking variable dependencies and re-executing downstream code blocks whenever upstream inputs are modified. The platform distinguishes itself by storing notebooks as standard, portable Python scripts rather than proprietary formats, ensuring compatibility with version control systems. It integrates artificial intelligence to assist with code generation and debugging based on the current execution context, whi
Python · Notebook Environments · Interactive Data Science Environments · Reactive Execution Models
scikit-learn/scikit-learn
66,344 · View on GitHub
Scikit-learn is a machine learning library for predictive data analysis that provides a collection of algorithms for supervised and unsupervised learning. It functions as a comprehensive toolkit for data preprocessing, dimensionality reduction, and model selection, allowing users to classify data objects, predict continuous values, and cluster similar items based on historical patterns. The project is defined by a unified interface design where objects either learn from data, transform data, or chain these operations into sequential workflows. To ensure performance on large or high-dimensiona
Python · Dimensionality Reduction Engines · Frameworks · Pipeline Patterns
aishwaryanr/awesome-generative-ai-guide
24,755 · View on GitHub
This project is a community-driven knowledge repository and technical learning resource focused on the field of generative artificial intelligence. It serves as a centralized hub for developers and practitioners to access curated research, tutorials, and foundational concepts necessary for building and deploying modern artificial intelligence applications. The platform distinguishes itself through a collaborative, distributed contribution model that aggregates diverse learning materials into a structured, searchable knowledge base. It covers a wide range of specialized topics, including retri
HTML · Awesome List · Generative AI Skill Paths · Large Language Model Tutorials
alibaba/druid
28,221 · View on GitHub
Druid is a database connection management and monitoring framework designed to maintain persistent, high-performance links between applications and relational databases. It functions as a resource manager that automates the lifecycle of connection pools, reducing the overhead associated with repeatedly opening and closing network connections. The project distinguishes itself through an integrated query analysis engine that decomposes database statements into structured components. This capability enables real-time security auditing, syntax validation, and metadata extraction, allowing for the
Java · Connection Pools · Database Abstraction Layers · Query Analyzers
Pierian-Data/Complete-Python-3-Bootcamp
29,604 · View on GitHub
This project is a beginner coding bootcamp and Python programming curriculum. It provides a structured set of educational materials and exercise files designed to guide students through the Python language from basic to advanced levels. The curriculum is delivered as Jupyter Notebook courseware, combining live code execution with explanatory text for technical demonstrations. It also functions as a project repository, offering a collection of milestone coding exercises and source files for practicing software development and core syntax. The materials are organized into sequential modules an
Jupyter Notebook · Python Exercises · Courseware · Educational Code Notebooks
shap/shap
25,049 · View on GitHub
SHAP is an explainable AI toolkit that provides a game theoretic framework for interpreting machine learning model predictions. It functions as a feature attribution engine, decomposing model outputs into the sum of individual feature effects to clarify how specific input variables influence a final decision. By assigning importance values to these inputs, the library enables users to understand the logic behind complex predictive models. The project distinguishes itself through its versatility and specialized calculation methods. It operates as a model-agnostic diagnostic library, capable of
Jupyter Notebook · Explainable AI Toolkits · Feature Attribution Methods · Game Theoretic Explainability
ageron/handson-ml
25,608 · View on GitHub
This is a machine learning educational repository consisting of a collection of notebooks and code examples. It provides practical implementations of diverse machine learning algorithms and workflows, ranging from traditional scientific computing to deep learning. The project features specific implementations of Scikit-Learn models, such as decision trees, random forests, and support vector machines, as well as TensorFlow examples for building neural networks, convolutional layers, and recurrent architectures. It also includes tutorials on reinforcement learning development and the creation o
Jupyter Notebook · Educational Code Notebooks · Machine Learning Education · Computational Graphs
SheetJS/sheetjs
36,278 · View on GitHub
SheetJS is a comprehensive library for parsing, manipulating, and generating complex spreadsheet file formats. It functions as a universal data processor that maps diverse binary, XML, and text-based file structures into a unified internal object model, allowing developers to create, read, and transform workbook data programmatically. The library distinguishes itself through a portable logic layer that provides a consistent execution environment across web browsers, server-side runtimes, and native desktop or mobile applications. By utilizing stream-based processing, it handles large files in
Spreadsheet Generation Libraries · Spreadsheet Processing Engines · Browser-Based Data Processing
elastic/elasticsearch
77,012 · View on GitHub
Elasticsearch is a distributed search engine and document store designed for the high-performance indexing and retrieval of massive volumes of unstructured data. It functions as a centralized analytics platform, providing a schema-flexible architecture that organizes information into searchable indices while maintaining global cluster state through a distributed consensus mechanism. The platform distinguishes itself through its integrated approach to observability, security, and advanced analytics. It combines full-text, vector, and hybrid search capabilities with machine learning-driven insi
Java · Distributed Search Engines · Data Analytics Engines · Distributed Document Stores
d3/d3
113,118 · View on GitHub
D3 is a modular library providing low-level primitives for creating data-driven visualizations. It functions as a flexible framework that allows for direct control over visual presentation by mapping abstract data dimensions to graphical properties, such as position, color, and size, without imposing predefined chart abstractions. The library distinguishes itself by offering specialized tools for complex data representation, including algorithmic layouts for hierarchical structures and geographic projection utilities for mapping spherical coordinates. It also includes a comprehensive suite fo
Shell · Data Visualization Libraries · Data Visualization Scales · DOM Manipulation Libraries

Analytics, Dataframes, and Notebooks

jakevdp/PythonDataScienceHandbook

fastai/fastbook

Vonng/ddia

metabase/metabase

lancedb/lancedb

pola-rs/polars

modin-project/modin

ClickHouse/ClickHouse

wesm/pydata-book

OpenBB-finance/OpenBB

nushell/nushell

plausible/analytics

bukosabino/ta

umami-software/umami

polakowo/vectorbt

duckdb/duckdb

saulpw/visidata

pandas-dev/pandas

Kanaries/pygwalker

jujumilk3/leaked-system-prompts

ambv/black

grafana/grafana

psf/black

microsoft/qlib

PrefectHQ/prefect

anuraghazra/github-readme-stats

recommenders-team/recommenders

wshobson/agents

Kaggle/kaggle-cli

pathwaycom/llm-app

jupyter/notebook

dbeaver/dbeaver

sansan0/TrendRadar

finos/perspective

pathwaycom/pathway

altair-viz/altair

aymericdamien/TensorFlow-Examples

virattt/ai-hedge-fund

marimo-team/marimo

scikit-learn/scikit-learn

aishwaryanr/awesome-generative-ai-guide

alibaba/druid

Pierian-Data/Complete-Python-3-Bootcamp

shap/shap

ageron/handson-ml

SheetJS/sheetjs

elastic/elasticsearch

d3/d3