Why is pandas-dev/pandas a recommended Tabular Data Frameworks repository?

Provides a robust environment for managing heterogeneous tabular data.

Why is apache/spark a recommended Tabular Data Frameworks repository?

Performs distributed relational transformations on structured data using SQL and programmatic interfaces.

Why is google-research/google-research a recommended Tabular Data Frameworks repository?

Forecasts riverine and flash floods using hydrologic models and satellite-derived datasets to provide early warnings.

Why is kodecocodes/swift-algorithm-club a recommended Tabular Data Frameworks repository?

Provides systems for predicting data categories based on features and training sets.

Why is d2l-ai/d2l-en a recommended Tabular Data Frameworks repository?

Provides recursive multistep forecasting by feeding model-generated predictions back into the input window.

Why is dmlc/xgboost a recommended Tabular Data Frameworks repository?

Provides a framework for building predictive models on structured tabular data using boosted trees and random forests.

Why is fastai/fastai a recommended Tabular Data Frameworks repository?

Processes structured datasets with missing value imputation, categorical encoding, and embedding layers for predictive modeling.

Why is bvaughn/react-virtualized a recommended Tabular Data Frameworks repository?

Provides framework-level support for managing and sorting heterogeneous two-dimensional tabular data.

Why is Cinnamon/kotaemon a recommended Tabular Data Frameworks repository?

Isolates table structures from raw CSV content for document integration.

105 repositories

Awesome GitHub Repositories · Tabular Data Frameworks

Environments for managing heterogeneous two-dimensional arrays.

Distinguishing note: Focuses on the framework level for tabular data rather than specific algorithms.

Explore 105 awesome GitHub repositories matching data & databases · Tabular Data Frameworks.


pandas-dev/pandas
49,039 stars
Pandas is a high-performance data analysis library that provides a comprehensive framework for manipulating, cleaning, and transforming structured datasets. It centers on labeled one-dimensional and two-dimensional data structures, allowing users to construct, filter, and reshape tabular information while performing complex arithmetic and logical operations. The library distinguishes itself through a sophisticated indexing engine that enables automatic data alignment during calculations and relational merges. By utilizing a block-based memory layout, it optimizes cache locality for vectorized
Provides a robust environment for managing heterogeneous tabular data.
Python · alignment · data-analysis · data-science
apache/spark
43,467 stars
Apache Spark is a unified distributed data processing engine designed for large-scale data analysis and computation graphs. It functions as a distributed machine learning framework, a graph processing system, a real-time stream processor, and a SQL analytics engine. The system enables the execution of distributed SQL querying, large-scale graph analysis, and real-time stream analytics across clusters of machines. It also provides a scalable environment for implementing machine learning algorithms and predictive model development on massive datasets. The engine incorporates relational query e
Performs distributed relational transformations on structured data using SQL and programmatic interfaces.
Scala · big-data · java · jdbc
google-research/google-research
38,139 stars
This repository serves as a comprehensive research platform and toolkit for advancing machine learning, quantum computing, and large-scale scientific data analysis. It provides foundational frameworks for developing complex algorithmic systems, offering the necessary infrastructure for distributed training, computational graph execution, and high-performance model development. The project distinguishes itself by integrating specialized research domains with robust, privacy-preserving methodologies. It supports diverse scientific discovery through tools for quantum simulation, physics-informed
Forecasts riverine and flash floods using hydrologic models and satellite-derived datasets to provide early warnings.
Jupyter Notebook · ai · machine-learning · research
donnemartin/data-science-ipython-notebooks
29,166 stars
This project is a collection of interactive Python notebooks and educational resources designed for mastering data science, machine learning, and numerical computing. It provides a series of practical guides and tutorials covering deep learning, big data processing, and statistical analysis. The repository features specialized instructional suites for implementing classical machine learning algorithms, building deep learning model architectures, and managing AWS cloud infrastructure. It includes dedicated notebooks for data visualization and numerical computing exercises. The project covers
Provides instructional material on managing heterogeneous two-dimensional arrays for data manipulation using pandas.
Python · aws · big-data · caffe
kodecocodes/swift-algorithm-club
29,099 stars
This project is a comprehensive collection of common computer science algorithms and data structures implemented in Swift. It serves as an educational reference and library for studying computational complexity, algorithmic logic, and data structure engineering through practical code examples. The repository provides a wide suite of data structure implementations, including various types of linked lists, heaps, hash tables, and an extensive range of hierarchical trees such as Red-Black, B-Tree, and Splay trees. It also covers diverse sorting and searching techniques, from basic bubble sort to
Provides systems for predicting data categories based on features and training sets.
Swift · algorithms · data-structures · swift
d2l-ai/d2l-en
29,001 stars
This project is an educational platform and research toolkit designed to teach deep learning through a combination of mathematical theory, visual diagrams, and executable code. It provides a comprehensive environment for building, training, and evaluating neural networks, grounding complex concepts in interactive computational notebooks that allow for hands-on experimentation. The framework distinguishes itself by interleaving theoretical foundations—including linear algebra, calculus, and probability—with practical implementations across multiple industry-standard libraries. It supports flex
Provides recursive multistep forecasting by feeding model-generated predictions back into the input window.
Python · book · computer-vision · data-science
dmlc/xgboost
28,471 stars
XGBoost is a distributed machine learning library for implementing scalable gradient boosting decision trees used for regression, classification, and ranking. It functions as a predictive model framework and a cross-language toolkit, providing a core implementation with native bindings for Python, R, Java, Scala, and C++. The system is designed as a GPU-accelerated library that utilizes CUDA and NCCL to speed up the training of decision tree ensembles. It operates as a distributed framework capable of scaling training and prediction across multi-node clusters and GPU environments to process m
Provides a framework for building predictive models on structured tabular data using boosted trees and random forests.
C++ · distributed-systems · gbdt · gbm
fastai/fastai
27,862 stars
Fastai is a high-level deep learning library built on PyTorch that provides a unified interface for managing the entire machine learning lifecycle. It functions as a comprehensive training toolkit, abstracting hardware management and automating complex training loops to simplify the construction and execution of neural network models. The framework is distinguished by its notebook-centric development environment and a type-dispatching data pipeline that automatically applies transformations based on input data formats. It emphasizes transfer learning through discriminative layer-wise optimiza
Processes structured datasets with missing value imputation, categorical encoding, and embedding layers for predictive modeling.
Jupyter Notebook · colab · deep-learning · fastai
bvaughn/react-virtualized
27,072 stars
react-virtualized is a library of components for rendering massive lists and tables by drawing only the elements visible in the viewport. It provides specialized layout managers including a windowed grid component and a dynamic height list manager. The project includes a masonry layout engine for packing items of varying heights and widths, as well as an infinite scroll interface for incrementally fetching and appending data. The library covers a broad range of virtualization capabilities, including frozen grid elements, reverse list rendering, and synchronized viewport scrolling. It also su
Provides framework-level support for managing and sorting heterogeneous two-dimensional tabular data.
JavaScript · grid · list · listview
Cinnamon/kotaemon
25,139 stars
Kotaemon is an orchestration framework designed for building modular, agentic workflows that integrate document processing, retrieval-augmented generation, and multi-step reasoning. It provides a comprehensive platform for developing document-based question answering systems, allowing users to chain language models, prompt templates, and external tools into complex, automated pipelines. The system distinguishes itself through a highly modular architecture that emphasizes component-based composition and schema-driven data exchange. It supports autonomous agents capable of decomposing complex q
Isolates table structures from raw CSV content for document integration.
Python · chatbot · llms · open-source
wesm/pydata-book
24,668 stars
This project serves as a comprehensive textbook and educational resource for data analysis using the Python ecosystem. It provides a structured guide to manipulating, cleaning, and processing datasets, focusing on the core tools required for numerical computing and statistical analysis. The repository distinguishes itself by offering a collection of practical code examples and workflows that demonstrate how to perform complex data tasks. It covers the application of vectorized numerical computations, the management of time-indexed data, and the creation of statistical visualizations to commun
Provides frameworks for loading and processing structured tabular data to extract insights.
Jupyter Notebook
krayin/laravel-crm
21,404 stars
This project is a modular, open-source customer relationship management platform built on the Laravel framework. It serves as a comprehensive business application framework designed for tracking sales pipelines, managing business entities, and automating marketing workflows. By providing a self-hosted solution, it enables organizations to maintain full control over their contact data, sales leads, and communication history. The platform distinguishes itself through a highly extensible architecture that allows developers to modify core behavior without altering the underlying source code. It u
Ships sortable, paginated tabular data grids that utilize AJAX for efficient server-side record management.
PHP · crm · crm-multi-tenant-saas · crm-platform
apache/mxnet
20,829 stars
This project is a deep learning framework designed for constructing, training, and deploying neural networks across diverse hardware environments. It functions as a high-performance tensor computation library that provides both imperative and symbolic programming interfaces, allowing developers to balance flexible, step-by-step model building with the efficiency of compiled computation graphs. The framework distinguishes itself through a hybrid execution engine that integrates declarative graph compilation with imperative runtime logic. It supports scalable, distributed training across multip
Compiles inference engines into single source files to simplify deployment across platforms.
C++ · mxnet
fengdu78/lihang-code
19,548 stars
This repository is a collection of foundational machine learning models and predictive analysis tools designed for the study of statistical learning methods. It serves as an educational resource that demonstrates the mathematical principles of classic algorithms through direct, first-principles implementation. The project distinguishes itself by constructing models from the ground up, relying on fundamental linear algebra and calculus operations rather than high-level abstraction frameworks. Each algorithm is organized into modular, standalone scripts that mirror the sequence of mathematical
Provides a toolkit of modular scripts for predictive data modeling using fundamental mathematical operations.
Jupyter Notebook
tensorflow/tfjs
19,134 stars
TensorFlow.js is a JavaScript machine learning library used for training and deploying models in web browsers and server-side environments. It functions as a browser-based model trainer, a WebAssembly inference engine, and a WebGPU accelerated tensor library for low-level linear algebra. The project also includes a model converter to transform Python-based models into optimized formats for JavaScript execution. The library distinguishes itself through a pluggable backend architecture that allows mathematical operations to be executed via CPU, WebGL, or WebGPU. It supports the conversion of Py
Imports datasets from disk or web sources in various formats for machine learning use.
TypeScript
google/libphonenumber
18,077 stars
This project is an international phone number library used for parsing, formatting, and validating phone numbers based on the E.164 standard. It provides a validation engine and parser to convert raw strings into structured objects and verify if numbers conform to regional numbering rules. The library includes a metadata provider that maps phone numbers to geographic locations, time zones, and network carriers. It can distinguish between line types, such as fixed-line or mobile, to verify SMS compatibility and identify original network operators. Additional capabilities include extracting ph
Implements a metadata engine that loads regional phone number rules from CSV files.
C++
fivethirtyeight/data
17,394 stars
This repository serves as a public archive for the raw datasets and analytical code used to support journalistic reporting. It functions as a platform for reproducible research, providing the necessary materials for users to verify published findings and conduct independent statistical analysis. The collection utilizes a versioned storage model to track historical changes to both data and processing scripts. By organizing information into a structured directory hierarchy, the repository maps specific journalistic projects to their corresponding inputs and outputs, ensuring that the methodolog
Delivers structured information in lightweight, human-readable CSV formats for broad analytical compatibility.
Jupyter Notebook · data
lllyasviel/FramePack
17,028 stars
FramePack is a neural video synthesis engine and generation framework designed to produce long, temporally consistent video sequences. It functions as a diffusion model optimizer, providing a suite of techniques to manage the computational demands of high-parameter video models while maintaining visual stability during extended generation tasks. The system distinguishes itself through a hierarchical approach to frame prediction, which plans distant anchor frames before filling in intermediate content to prevent cumulative temporal drift. By utilizing constant-length context compression and to
Implements hierarchical anchor frame prediction to prevent temporal drift and ensure visual stability.
Python
refactoringhq/tolaria
16,851 stars
Tolaria is a markdown knowledge base manager and bidirectional note linking system. It functions as an integrated environment for organizing notes and structured data, utilizing YAML frontmatter and wikilinks to establish relational mappings between documents. The project distinguishes itself by integrating language model capabilities directly into the editor for content generation and analysis. It further combines prose with structured data through a markdown spreadsheet editor that renders CSV-formatted files as interactive grids with formula support and cross-sheet referencing. The platfo
Provides an interface for managing and editing two-dimensional numeric data stored as CSV in markdown files.
TypeScript
ddbourgin/numpy-ml
16,275 stars
This library is a collection of machine learning algorithms and neural network components implemented from scratch using only NumPy. It serves as an educational toolkit for constructing and experimenting with machine learning architectures, emphasizing a modular approach where algorithms are organized into self-contained, object-oriented classes. The project distinguishes itself by relying exclusively on array-oriented programming to perform mathematical operations, ensuring that all computations are vectorized for performance. By utilizing a standardized interface for forward and backward pa
Implements flexible nonparametric predictive models like kernel regression and Gaussian processes.
Python · attention · bayesian-inference · gaussian-mixture-models
