# hosseinmoein/dataframe

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/hosseinmoein-dataframe).**

2,917 stars · 352 forks · C++ · bsd-3-clause

## Links

- GitHub: https://github.com/hosseinmoein/DataFrame
- Homepage: https://hosseinmoein.github.io/DataFrame/
- awesome-repositories: https://awesome-repositories.com/repository/hosseinmoein-dataframe.md

## Topics

`ai` `cpp` `data-analysis` `data-science` `dataframe` `financial-data-analysis` `financial-engineering` `large-data` `machine-learning` `multidimensional-data` `numerical-analysis` `statistical` `statistical-analysis` `tensor` `tensorboard` `trading-algorithms` `trading-strategies`

## Description

DataFrame is a C++ tabular data library and manipulation engine designed for managing heterogeneous data in contiguous memory. It functions as a statistical analysis framework and time series analysis toolkit, providing the means to store, index, and transform multidimensional datasets.

The project distinguishes itself through a high-performance execution model that uses column-major storage, SIMD-aligned memory allocation, and a thread pool for parallel computations. It employs a visitor-based algorithm dispatch system and policy-driven transformations to decouple data processing logic from the underlying storage.
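
The visitor-based dispatch over column-major storage described above can be illustrated with a minimal standard-library sketch. This is not the library's actual API: `ColumnTable`, `MeanSketchVisitor`, and the `pre`/`post`/`get_result` protocol used here are hypothetical names chosen for illustration, and the table holds only `double` columns to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical column-major table: every named column is one contiguous vector.
struct ColumnTable {
    std::vector<std::size_t> index;
    std::unordered_map<std::string, std::vector<double>> columns;

    // Walk one column and hand each (index, value) pair to the visitor.
    // The algorithm lives in the visitor; the table only knows how to iterate.
    template <typename Visitor>
    Visitor& visit(const std::string& name, Visitor& v) const {
        const std::vector<double>& col = columns.at(name);
        assert(col.size() == index.size());
        v.pre();
        for (std::size_t i = 0; i < col.size(); ++i)
            v(index[i], col[i]);
        v.post();
        return v;
    }
};

// Illustrative visitor that accumulates an arithmetic mean.
struct MeanSketchVisitor {
    void pre() { sum_ = 0.0; count_ = 0; result_ = 0.0; }
    void operator()(std::size_t /*idx*/, double value) { sum_ += value; ++count_; }
    void post() { result_ = count_ ? sum_ / static_cast<double>(count_) : 0.0; }
    double get_result() const { return result_; }

private:
    double sum_ = 0.0;
    std::size_t count_ = 0;
    double result_ = 0.0;
};
```

With a `price` column holding `{1.0, 2.0, 3.0, 6.0}`, calling `table.visit("price", visitor).get_result()` yields `3.0`. A new statistic only requires a new visitor type; the storage class stays unchanged.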

The library covers multivariate data analysis, signal processing workflows based on Fast Fourier Transforms, and machine learning tasks such as clustering and dimensionality reduction. It also provides tools for data cleaning, preprocessing, descriptive statistics, and hypothesis tests.

The system supports data serialization and import/export via CSV, JSON, and high-performance binary formats.
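
As a rough illustration of what a binary round trip for a single column can look like, the sketch below writes a length header followed by the raw values and reads them back. The layout and the function names `to_binary` and `from_binary` are assumptions made for this example; they do not describe the library's real file format, and the buffer uses host byte order, so it is not portable across architectures.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical layout: [uint64 element count][count * double], host byte order.
std::string to_binary(const std::vector<double>& col) {
    const std::uint64_t n = col.size();
    const std::size_t payload = col.size() * sizeof(double);
    std::string buf(sizeof n + payload, '\0');
    std::memcpy(&buf[0], &n, sizeof n);
    if (payload != 0)
        std::memcpy(&buf[sizeof n], col.data(), payload);
    assert(buf.size() == sizeof n + payload);
    return buf;
}

// Rebuild the column, rejecting buffers whose size does not match the header.
std::vector<double> from_binary(const std::string& buf) {
    std::uint64_t n = 0;
    if (buf.size() < sizeof n)
        throw std::invalid_argument("buffer too short for header");
    std::memcpy(&n, buf.data(), sizeof n);

    const std::size_t payload = buf.size() - sizeof n;
    if (payload % sizeof(double) != 0 || payload / sizeof(double) != n)
        throw std::invalid_argument("header does not match payload size");

    std::vector<double> col(static_cast<std::size_t>(n));
    if (payload != 0)
        std::memcpy(col.data(), buf.data() + sizeof n, payload);
    return col;
}
```

Copying whole columns with `std::memcpy` is only possible because each column is contiguous, which is one reason binary formats tend to be faster than text formats such as CSV or JSON.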

## Tags

### Data & Databases

- [Column-Major Storage](https://awesome-repositories.com/f/data-databases/column-major-storage.md) — Uses column-major storage to maintain heterogeneous data in contiguous memory for high cache efficiency. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DataFrame.html))
- [In-Memory Data Stores](https://awesome-repositories.com/f/data-databases/in-memory-data-stores.md) — Provides a high-performance in-memory data store for heterogeneous tabular data using contiguous memory allocation. ([source](https://cdn.jsdelivr.net/gh/hosseinmoein/dataframe@master/README.md))
- [Tabular Data Frames](https://awesome-repositories.com/f/data-databases/tabular-data-frames.md) — Implements high-performance in-memory structured grids for manipulating tabular data and performing matrix operations.
- [Column Data Appending](https://awesome-repositories.com/f/data-databases/column-data-appending.md) — Adds a single value or range of values to the end of an existing named column. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/append_column.html))
- [Conditional Data Filters](https://awesome-repositories.com/f/data-databases/conditional-data-filters.md) — Generates boolean vectors based on whether column elements satisfy user-defined predicates. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/FactorizeVisitor.html))
- [Custom Column Functions](https://awesome-repositories.com/f/data-databases/custom-column-functions.md) — Applies custom user-defined functions across multiple columns of a tabular dataset. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/apply.html))
- [Data Concatenations](https://awesome-repositories.com/f/data-databases/data-concatenations.md) — Appends rows from one data structure to another while handling missing columns via filling or filtering. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/concat.html))
- [Data Joins](https://awesome-repositories.com/f/data-databases/data-joins.md) — Implements general purpose merging of data structures based on shared keys or indices. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/FirstVisitor.html))
- [Data Manipulation Frameworks](https://awesome-repositories.com/f/data-databases/data-manipulation-frameworks.md) — Implements a comprehensive framework for slicing, joining, merging, grouping, sorting, and filtering structured tabular data. ([source](https://cdn.jsdelivr.net/gh/hosseinmoein/dataframe@master/README.md))
- [Data Serialization](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-serialization.md) — Reconstructs data frames from binary buffers or strings via serialization. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/from_string.html))
- [View Data Assignment](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-transformation/array-tensor-manipulation/array-filtering/array-view-creation/view-data-assignment.md) — Provides read-only and read-write views of contiguous or disjoint data slices. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DataFrame.html))
- [Array View Creation](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-transformation/array-tensor-manipulation/array-filtering/array-view-creation/view-data-assignment/array-view-creation.md) — Implements memory-efficient data views for reading or modifying subsets without copying. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_data_by_idx.html))
- [Record Counting](https://awesome-repositories.com/f/data-databases/database-record-management/record-counting.md) — Calculates the number of rows that satisfy a user-defined predicate across specified columns. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/count.html))
- [Positional Data Extraction](https://awesome-repositories.com/f/data-databases/distributed-data-processing/positional-data-extraction.md) — Extracts subsets of data using a range of indices or a list of locations. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_data_by_loc.html))
- [File-Based Data Exports](https://awesome-repositories.com/f/data-databases/file-based-data-exports.md) — Writes multidimensional data to files using CSV, JSON, and high-performance binary formats. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/write.html))
- [Byte Alignment Optimizations](https://awesome-repositories.com/f/data-databases/high-performance-data-infrastructures/memory-optimized-processing/image-optimized-memory-regions/byte-alignment-optimizations.md) — Allocates data on custom byte boundaries to enable SIMD instructions and prevent cache-line sharing. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DataFrame.html))
- [Incremental Data Appending](https://awesome-repositories.com/f/data-databases/incremental-data-appending.md) — Allows adding new records across multiple named columns to grow the tabular dataset. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/append_row.html))
- [Index Accessors](https://awesome-repositories.com/f/data-databases/index-accessors.md) — Accesses the internal container holding row or column labels to inspect the indexing structure. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_index.html))
- [Index-Based Data Alignment](https://awesome-repositories.com/f/data-databases/index-based-data-alignment.md) — Uses a primary metadata column to synchronize rows across multiple heterogeneous columns during joins and merges.
- [Index Resampling](https://awesome-repositories.com/f/data-databases/index-resampling.md) — Adjusts the sampling interval of the index and fills data when changing frequency. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/change_freq.html))
- [Index-Based Extraction](https://awesome-repositories.com/f/data-databases/list-index-locations/index-based-reorganizations/index-based-extraction.md) — Retrieves subsets of data and indices based on specified ranges or location lists. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_data_by_idx.html))
- [Data Column Extraction](https://awesome-repositories.com/f/data-databases/market-data-providers/data-column-extraction.md) — Creates new data structures containing only specified lists of columns. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_data.html))
- [Extremum Value Detectors](https://awesome-repositories.com/f/data-databases/maximum-value-calculators/extremum-value-detectors.md) — Calculates the maximum or minimum values of a specific column while handling missing data. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/MaxVisitor.html))
- [Missing Data Imputation](https://awesome-repositories.com/f/data-databases/missing-data-imputation.md) — Fills gaps in datasets using interpolation or defined policies to ensure data continuity. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/FirstVisitor.html))
- [Missing Value Imputation](https://awesome-repositories.com/f/data-databases/missing-value-imputation.md) — Replaces missing values using static substitutes, forward/backward fills, or interpolation. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/drop_missing.html))
- [Tabular Data Preprocessing](https://awesome-repositories.com/f/data-databases/tabular-data-frameworks/tabular-predictive-models/tabular-data-preprocessing.md) — Implements tools for handling missing values, removing outliers, and normalizing continuous columns in structured data. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/self_shift.html))
- [Tabular Data Sorting](https://awesome-repositories.com/f/data-databases/tabular-data-sorting.md) — Implements efficient row ordering based on specified columns in ascending or descending directions. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/sort.html))
- [Time Series Analysis](https://awesome-repositories.com/f/data-databases/time-series-analysis.md) — Processes temporal data using smoothing, detrending, and seasonality detection to identify cycles and trends.
- [Time Series Analysis Toolkits](https://awesome-repositories.com/f/data-databases/time-series-analysis-tools/time-series-analysis-toolkits.md) — Offers a toolkit for decomposing temporal trends and identifying seasonality in scalar and multidimensional datasets.
- [Sequential Time Index Appenders](https://awesome-repositories.com/f/data-databases/time-series-indexing/columnar-time-series-indexing/time-index-definitions/sequential-time-index-appenders.md) — Appends values to the primary index column to extend the dataset's identifier sequence. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/append_index.html))
- [Unique Value Counting](https://awesome-repositories.com/f/data-databases/unique-value-counting.md) — Identifies all unique entries in a column and calculates the frequency of each occurrence. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/value_counts.html))
- [Anomaly Detection](https://awesome-repositories.com/f/data-databases/anomaly-detection.md) — Identifies unusual data points using statistical methods and replaces them via fill policies. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/detect_and_change.html))
- [Anomaly Detection Algorithms](https://awesome-repositories.com/f/data-databases/anomaly-detection-algorithms.md) — Flags outlier values that exceed a defined number of standard deviations from the mean. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/AnomalyDetectByZScoreVisitor.html))
- [Binary Serialization](https://awesome-repositories.com/f/data-databases/binary-serialization.md) — Implements tools for converting datasets into compact binary formats for efficient storage and transmission. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/to_string.html))
- [Categorical Encodings](https://awesome-repositories.com/f/data-databases/categorical-encodings.md) — Transforms categorical columns into binary numerical indicator columns using one-hot encoding. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/load_indicators.html))
- [Column Data Merging](https://awesome-repositories.com/f/data-databases/column-data-merging.md) — Merges contents of the same named column from multiple dataframes using a custom functor for logic. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/combine.html))
- [Multi-Column Processing](https://awesome-repositories.com/f/data-databases/column-mappings/dynamic-column-references/multi-column-processing.md) — Passes references of the index and multiple named columns to a visitor functor for complex calculations. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/single_act_visit.html))
- [Column Value Aggregations](https://awesome-repositories.com/f/data-databases/column-value-extraction/column-value-aggregations.md) — Provides column-based aggregation to compute total sums while optionally ignoring missing data. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/SumVisitor.html))
- [Column Value Enumerators](https://awesome-repositories.com/f/data-databases/column-value-extraction/column-value-enumerators.md) — Retrieves a list of all distinct values from a specific column while preserving order. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_col_unique_values.html))
- [Dataset Comparators](https://awesome-repositories.com/f/data-databases/data-collections-datasets/dataset-comparators.md) — Identifies differing values between two data structures with identical indices to produce a divergent set. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/difference.html))
- [Predicate-Based Joins](https://awesome-repositories.com/f/data-databases/data-joins/predicate-based-joins.md) — Supports merging two data structures using user-defined predicate functions to determine which rows to include. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/gen_join.html))
- [Numerical Scaling](https://awesome-repositories.com/f/data-databases/data-normalization/numerical-scaling.md) — Scales data using Z-score, min-max, or Euclidean normalization to standardize values across a dataset. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/NormalizeVisitor.html))
- [String Converters](https://awesome-repositories.com/f/data-databases/data-serialization-formats/output-formatting-systems/string-converters.md) — Provides functions to cast tabular data and indices into string representations for storage or transmission. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/to_string.html))
- [Transformation Chains](https://awesome-repositories.com/f/data-databases/data-transformation-functions/transformation-chains.md) — Sequences modular functions into a transformation pipeline to process datasets incrementally. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/pipe.html))
- [Mutable Data Views](https://awesome-repositories.com/f/data-databases/data-type-managers/mutable-types/mutable-data-views.md) — Provides lightweight read-only or mutable views of data slices to enable subset manipulation without copying underlying memory.
- [Upsert by Key](https://awesome-repositories.com/f/data-databases/data-update-apis/upsert-by-key.md) — Modifies values in a dataset by matching indices from a source container, effectively performing an upsert. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/modify_by_idx.html))
- [File-Based Data Import](https://awesome-repositories.com/f/data-databases/file-based-data-import.md) — Provides capabilities for loading heterogeneous data from CSV, JSON, and binary files into the data structure. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/FirstVisitor.html))
- [Generated Columns](https://awesome-repositories.com/f/data-databases/generated-columns.md) — Creates new columns by computing values from existing data using a visitor functor. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/load_column.html))
- [Group-By Aggregations](https://awesome-repositories.com/f/data-databases/group-by-aggregations.md) — Aggregates data into a new structure based on equal values in one or more columns. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/groupby.html))
- [Quantile-Based Row Filters](https://awesome-repositories.com/f/data-databases/grouped-aggregations/quantile-digest-aggregators/empirical-quantile-calculators/quantile-based-row-filters.md) — Extracts rows that fall above or below a specific quantile of a named column. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_above_quantile_data.html))
- [Heterogeneous Data Loading](https://awesome-repositories.com/f/data-databases/heterogeneous-data-loading.md) — Populates the data structure with indices and columns of varying types by moving external vectors. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/load_data.html))
- [K-Nearest Neighbor Retrieval](https://awesome-repositories.com/f/data-databases/k-nearest-neighbor-retrieval.md) — Implements algorithms to identify the k most similar data points using configurable distance functions. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/knn.html))
- [Long-to-Wide Reshaping](https://awesome-repositories.com/f/data-databases/long-to-wide-reshaping.md) — Provides the ability to rotate long-format data into wide matrix-like structures by converting column values into headers. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/explode.html))
- [Matrix Transposition](https://awesome-repositories.com/f/data-databases/matrix-transposition.md) — Provides the ability to swap rows and columns for datasets with a single data type. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/transpose.html))
- [Missing Data Removal](https://awesome-repositories.com/f/data-databases/missing-data-removal.md) — Filters out rows containing missing values based on customizable thresholds. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/drop_missing.html))
- [Policy-Driven Transformations](https://awesome-repositories.com/f/data-databases/policy-driven-transformations.md) — Employs configurable strategy objects to standardize how missing values, interpolation, and ranking logic are handled.
- [Range-Based Column Filters](https://awesome-repositories.com/f/data-databases/range-based-column-filters.md) — Identifies elements in a column that fall between two bound values and returns a boolean vector. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/in_between.html))
- [Rolling Linear Regressions](https://awesome-repositories.com/f/data-databases/rolling-linear-regressions.md) — Computes rolling means, slopes, and intercepts over specified periods to analyze temporal trends. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/LinregMovingMeanVisitor.html))
- [Row Deletions](https://awesome-repositories.com/f/data-databases/row-deletions.md) — Supports removing data rows within specified index ranges to maintain dataset consistency. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/remove_data_by_idx.html))
- [Date-Based Filters](https://awesome-repositories.com/f/data-databases/search-indexing-technologies/search-indexing/search-and-indexing/content-search-filters/date-based-filters.md) — Selects rows based on specific days of the month using a date-time index. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_data_on_days_in_month.html))
- [String Pattern Filters](https://awesome-repositories.com/f/data-databases/string-pattern-filters.md) — Filters column data by checking if items start or end with a specific string pattern. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/starts_with.html))
- [Clusterings](https://awesome-repositories.com/f/data-databases/time-series-analysis/clusterings.md) — Groups time-series sequences into clusters based on pattern similarity using normalized cross-correlation. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/kshape_groups.html))
- [Time-Series Statistical Profiling](https://awesome-repositories.com/f/data-databases/time-series-data-modeling/time-series-statistical-profiling.md) — Evaluates time-series stationarity using KPSS and Augmented Dickey-Fuller tests to identify statistical properties. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/StationaryCheckVisitor.html))
- [Time Bucketing](https://awesome-repositories.com/f/data-databases/time-series-data-modeling/time-series-statistical-profiling/time-series-aggregations/time-bucketing.md) — Aggregates data and indices into fixed intervals based on distance, count, or time. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/bucketize.html))
- [Statistical Bucketing](https://awesome-repositories.com/f/data-databases/time-series-data-modeling/time-series-statistical-profiling/time-series-aggregations/time-bucketing/statistical-bucketing.md) — Aggregates data into fixed-size intervals and calculates summary statistics for each bucket. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/FirstVisitor.html))
- [Time-Series Index Generators](https://awesome-repositories.com/f/data-databases/time-series-index-generators.md) — Generates a vector of timestamps between two dates based on a specified frequency. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/gen_datetime_index.html))
- [Time Series Decomposition](https://awesome-repositories.com/f/data-databases/time-series-toolkits/time-series-decomposition.md) — Separates observed time series data into trend, seasonal, and residual components. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DecomposeVisitor.html))
- [Time Series Transformations](https://awesome-repositories.com/f/data-databases/time-series-transformations.md) — Provides tools for removing trends and seasonality from temporal sequences to prepare them for analysis. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/make_stationary.html))
- [Vector Magnitude Calculators](https://awesome-repositories.com/f/data-databases/vector-search/vector-magnitude-calculators.md) — Computes dot products, magnitudes, and Euclidean or Manhattan distances to determine vector similarity. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DotProdVisitor.html))
- [SIMD-Aligned Data Structures](https://awesome-repositories.com/f/data-databases/vectorized-arithmetic/simd-accelerated-arithmetic/simd-aligned-data-structures.md) — Allocates data on specific byte boundaries to support SIMD hardware acceleration and prevent cache-line sharing.
- [Visitor-Based Algorithm Dispatch](https://awesome-repositories.com/f/data-databases/visitor-based-algorithm-dispatch.md) — Decouples data processing logic from storage by using visitor objects that traverse data points or sequences.
- [Wide-to-Long Reshaping](https://awesome-repositories.com/f/data-databases/wide-to-long-reshaping.md) — Implements unpivoting of wide-format data into long-format by melting multiple value columns into a single pair. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/explode.html))
- [Window Functions](https://awesome-repositories.com/f/data-databases/window-functions.md) — Executes custom operations over a sliding window of data across a column to produce sequenced results. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/SimpleRollAdopter.html))
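
Two of the entries above, forward-fill imputation of missing values and sliding-window functions, can be sketched with the standard library alone. The functions `forward_fill` and `rolling_mean` below are hypothetical helpers written for this illustration, not library calls, and they assume missing values are encoded as NaN.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Forward fill: replace each NaN with the most recent preceding value.
// Leading NaNs have nothing to copy from and are left as they are.
std::vector<double> forward_fill(std::vector<double> col) {
    for (std::size_t i = 1; i < col.size(); ++i)
        if (std::isnan(col[i]))
            col[i] = col[i - 1];
    return col;
}

// Rolling mean over a fixed window, kept as a running sum.
// Slots that do not yet have a full window are NaN.
std::vector<double> rolling_mean(const std::vector<double>& col, std::size_t window) {
    assert(window > 0);
    std::vector<double> out(col.size(), std::numeric_limits<double>::quiet_NaN());
    double sum = 0.0;
    for (std::size_t i = 0; i < col.size(); ++i) {
        sum += col[i];
        if (i >= window)
            sum -= col[i - window];  // drop the value that left the window
        if (i + 1 >= window)
            out[i] = sum / static_cast<double>(window);
    }
    return out;
}
```

Filling gaps before applying the window matters here: a single NaN inside the running sum would otherwise propagate to every later result.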

### Artificial Intelligence & ML

- [Statistical Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/statistical-analysis.md) — Provides a comprehensive framework for descriptive and inferential statistical analysis of datasets. ([source](https://cdn.jsdelivr.net/gh/hosseinmoein/dataframe@master/README.md))
- [Analysis Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/statistical-analysis/statistical-inference-frameworks/analysis-frameworks.md) — Provides a collection of algorithms for computing descriptive statistics, hypothesis tests, and probability distributions.
- [Affinity Propagation Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/affinity-propagation-clustering.md) — Divides a named column into clusters using the Affinity Propagation algorithm. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_data_by_affin.html))
- [Hampel Filter Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/anomaly-detection/median-absolute-deviation/hampel-filter-pipelines.md) — Implements the Hampel filter for removing outliers using a sliding window and median absolute deviation. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/remove_data_by_hampel.html))
- [C++ Machine Learning Libraries](https://awesome-repositories.com/f/artificial-intelligence-ml/c-machine-learning-libraries.md) — Provides a high-performance C++ framework for dimensionality reduction, clustering, and vector similarity analysis.
- [BIRCH Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/clustering-algorithms/birch-clustering.md) — Divides columns into clusters using the BIRCH algorithm and K-Means to group similar points. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_data_by_birch.html))
- [Spectral Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/clustering-algorithms/spectral-clustering.md) — Divides scalar or multidimensional data into clusters using a spectral clustering algorithm. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_data_by_spectral.html))
- [Density-Based Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/density-based-clustering.md) — Groups data into clusters based on local density using the Mean-Shift algorithm. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_data_by_mshift.html))
- [Cluster-Based Outlier Detection](https://awesome-repositories.com/f/artificial-intelligence-ml/density-based-clustering/cluster-based-outlier-detection.md) — Detects anomalies by measuring local density deviations relative to k-nearest neighbors. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/AnomalyDetectByLOFVisitor.html))
- [Independent Component Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/independent-component-analysis.md) — Decomposes multivariate datasets into additive independent subcomponents to identify hidden structures. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/fast_ica.html))
- [K-Means Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/k-means-clustering.md) — Divides named columns into a specified number of clusters using centroid-based partitioning. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/get_data_by_kmeans.html))
- [k-Nearest Neighbors Outlier Detection](https://awesome-repositories.com/f/artificial-intelligence-ml/k-nearest-neighbor-classifiers/k-nearest-neighbors-outlier-detection.md) — Identifies outlier data points by measuring distance to the k-th nearest neighbor using kd-trees. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/AnomalyDetectByKNNVisitor.html))
- [Outlier Detection](https://awesome-repositories.com/f/artificial-intelligence-ml/kernel-density-estimation/outlier-detection.md) — Identifies anomalies in time-series data using sliding windows and absolute deviation. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/HampelFilterVisitor.html))
- [Linear Regression](https://awesome-repositories.com/f/artificial-intelligence-ml/linear-regression.md) — Provides statistical modeling of linear relationships between two columns, including slope and intercept calculations. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/SLRegressionVisitor.html))
- [Cumulative Count Calculators](https://awesome-repositories.com/f/artificial-intelligence-ml/tensor-computation-primitives/cumulative-sum-calculators/cumulative-aggregate-calculations/cumulative-count-calculators.md) — Produces a vector of running counts of non-missing data points across a column. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/CountVisitor.html))
- [Cumulative Extremum Calculators](https://awesome-repositories.com/f/artificial-intelligence-ml/tensor-computation-primitives/cumulative-sum-calculators/cumulative-aggregate-calculations/cumulative-extremum-calculators.md) — Calculates the running maximum or minimum of a data column and returns the results as a vector. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/CumMaxVisitor.html))
- [Cumulative Product Calculations](https://awesome-repositories.com/f/artificial-intelligence-ml/tensor-computation-primitives/cumulative-sum-calculators/cumulative-product-calculations.md) — Computes a vector of running products for a given column with missing-value handling. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/CumProdVisitor.html))
- [Cumulative Sums](https://awesome-repositories.com/f/artificial-intelligence-ml/tensor-computation-primitives/cumulative-sum-calculators/cumulative-sums.md) — Computes a vector of running sums for a data column with options to preserve missing values. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/CumSumVisitor.html))
- [Range-Based Truncation](https://awesome-repositories.com/f/artificial-intelligence-ml/tensor-indexing/element-modification/list-element-replacements/element-removals-by-index/range-based-truncation.md) — Removes rows from the index and columns that fall outside a specified range. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/truncate.html))
- [Training Data Outlier Removers](https://awesome-repositories.com/f/artificial-intelligence-ml/training-data-transformations/training-data-outlier-removers.md) — Allows removing outlier rows based on the Inter-Quartile Range statistical method. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/remove_data_by_iqr.html))
- [FFT Outlier Removers](https://awesome-repositories.com/f/artificial-intelligence-ml/training-data-transformations/training-data-outlier-removers/fft-outlier-removers.md) — Detects and removes anomalous data points using Fast Fourier Transform analysis. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/remove_data_by_fft.html))
- [Vector Similarity Search](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-similarity-search.md) — Calculates mathematical closeness between vectors using metrics like Euclidean distance and Cosine similarity. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/VectorSimilarityVisitor.html))
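
The distance and similarity metrics named in the last entry have short closed forms. The sketch below is a plain standard-library illustration of Euclidean distance and cosine similarity; the function names are chosen for this example and are not taken from the library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dot product of two equally sized vectors.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += a[i] * b[i];
    return sum;
}

// Euclidean distance: square root of the summed squared differences.
double euclidean_distance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Cosine similarity: dot product divided by the product of the magnitudes.
// Returns 0 when either vector has zero magnitude (a convention of this sketch).
double cosine_similarity(const std::vector<double>& a, const std::vector<double>& b) {
    const double na = std::sqrt(dot(a, a));
    const double nb = std::sqrt(dot(b, b));
    if (na == 0.0 || nb == 0.0)
        return 0.0;
    return dot(a, b) / (na * nb);
}
```

Euclidean distance is sensitive to magnitude, while cosine similarity compares direction only, so two vectors that differ by a positive scale factor have a cosine similarity of about 1 but a non-zero distance.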

### Part of an Awesome List

- [Data Manipulation](https://awesome-repositories.com/f/awesome-lists/data/data-manipulation.md) — Ships an engine for slicing, joining, pivoting, and filtering tabular datasets with parallel execution.
- [C++ Implementations](https://awesome-repositories.com/f/awesome-lists/data/data-tables/tabular-data-models-with-metadata/c-implementations.md) — Provides a C++ library for managing heterogeneous tabular data in contiguous memory with indexed columns.
- [Time and Date](https://awesome-repositories.com/f/awesome-lists/data/time-and-date.md) — Handles date and time data with nanosecond precision and support for multiple time zones. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DateTime.html))

### Scientific & Mathematical Computing

- [Correlation Coefficient Calculators](https://awesome-repositories.com/f/scientific-mathematical-computing/correlation-coefficient-calculators.md) — Calculates mathematical relationships and statistical metrics such as Pearson correlation and rolling exponentially weighted correlations. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/FirstVisitor.html))
- [Signal Processing](https://awesome-repositories.com/f/scientific-mathematical-computing/data-modeling-processing/signal-processing.md) — Transforms data into the frequency domain using Fast Fourier Transforms to analyze periodicity and remove noise.
- [Descriptive Statistics Summaries](https://awesome-repositories.com/f/scientific-mathematical-computing/descriptive-statistics-summaries.md) — Computes a bundle of summary metrics including mean, standard deviation, and quantiles for distribution analysis. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/describe.html))
- [Multivariate Analysis](https://awesome-repositories.com/f/scientific-mathematical-computing/multivariate-analysis.md) — Provides dimensionality reduction, clustering, and matrix factorization to uncover structures in large datasets.
- [Statistical Analysis Libraries](https://awesome-repositories.com/f/scientific-mathematical-computing/numerical-mathematical-foundations/statistics-probability/statistical-analysis-libraries/statistical-metric-calculators/statistical-analysis-libraries.md) — Provides a comprehensive toolset for calculating descriptive statistics and correlations across multidimensional datasets.
- [Statistical Moment Calculation](https://awesome-repositories.com/f/scientific-mathematical-computing/numerical-mathematical-foundations/statistics-probability/statistical-analysis-libraries/statistical-metric-calculators/statistical-moment-calculation.md) — Computes essential statistical moments including mean, variance, standard deviation, skew, and kurtosis using a visitor pattern. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/StatsVisitor.html))
- [Parallel Computation Thread Pools](https://awesome-repositories.com/f/scientific-mathematical-computing/parallel-computation-thread-pools.md) — Distributes data processing tasks across a thread pool to accelerate the analysis of large datasets. ([source](https://cdn.jsdelivr.net/gh/hosseinmoein/dataframe@master/README.md))
- [Autocorrelation Calculators](https://awesome-repositories.com/f/scientific-mathematical-computing/autocorrelation-calculators.md) — Determines the partial correlation of a stationary time series with its lagged values. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/AutoCorrVisitor.html))
- [Average Deviation Calculators](https://awesome-repositories.com/f/scientific-mathematical-computing/average-deviation-calculators.md) — Computes mean or median absolute deviations around a center point to measure data dispersion. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/MADVisitor.html))
- [Standard Deviation Calculators](https://awesome-repositories.com/f/scientific-mathematical-computing/average-deviation-calculators/standard-deviation-calculators.md) — Implements numerically stable standard deviation and mean calculations with bias correction options. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/StdVisitor.html))
- [Autocorrelation Analysis](https://awesome-repositories.com/f/scientific-mathematical-computing/correlation-coefficient-calculators/autocorrelation-analysis.md) — Computes the correlation of a data column with its own lagged values. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/AutoCorrVisitor.html))
- [Cross-Correlation Analysis](https://awesome-repositories.com/f/scientific-mathematical-computing/correlation-coefficient-calculators/cross-correlation-analysis.md) — Computes a series of correlations between two time-series by shifting them across lag periods. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/CorrVisitor.html))
- [Covariance Calculators](https://awesome-repositories.com/f/scientific-mathematical-computing/covariance-calculators.md) — Calculates the variance-covariance or correlation matrix for specified columns with z-score normalization. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/covariance_matrix.html))
- [Fourier Transforms](https://awesome-repositories.com/f/scientific-mathematical-computing/data-modeling-processing/signal-processing/fourier-transforms.md) — Implements fast Fourier transforms to convert signals between time and frequency domains for periodicity analysis. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/FastFourierTransVisitor.html))
- [Fourier-Series Seasonality](https://awesome-repositories.com/f/scientific-mathematical-computing/data-modeling-processing/signal-processing/fourier-transforms/fourier-series-seasonality.md) — Identifies repeating patterns by detrending data and applying Fast Fourier Transform to find dominant frequencies. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/SeasonalPeriodVisitor.html))
- [Data Ranking Utilities](https://awesome-repositories.com/f/scientific-mathematical-computing/data-ranking-utilities.md) — Assigns ranks to items within a sortable data vector using configurable tie-breaking strategies. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/RankVisitor.html))
- [Discrete Difference Calculations](https://awesome-repositories.com/f/scientific-mathematical-computing/discrete-difference-calculations.md) — Computes the discrete difference between shifted data points in a column. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DiffVisitor.html))
- [Distribution Comparison Tests](https://awesome-repositories.com/f/scientific-mathematical-computing/distribution-comparison-tests.md) — Calculates the Kolmogorov-Smirnov test statistic to determine if two samples share a distribution. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/KolmoSmirnovTestVisitor.html))
- [Dynamic Time Warping](https://awesome-repositories.com/f/scientific-mathematical-computing/dynamic-time-warping.md) — Calculates the distance between temporal sequences using Dynamic Time Warping to align signals. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DynamicTimeWarpVisitor.html))
- [Exponential Moving Averages](https://awesome-repositories.com/f/scientific-mathematical-computing/exponential-moving-averages.md) — Implements zero-lag moving averages based on de-lagged exponential moving averages. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/ZeroLagMovingMeanVisitor.html))
- [Generic Mean Calculations](https://awesome-repositories.com/f/scientific-mathematical-computing/filtered-average-calculations/generic-mean-calculations.md) — Computes arithmetic, geometric, harmonic, quadratic, and weighted averages across data while optionally ignoring missing values. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/MeanVisitor.html))
- [Functor-Based Array Mapping](https://awesome-repositories.com/f/scientific-mathematical-computing/functor-based-array-mapping.md) — Uses visitor functors to execute custom logic across named columns and indices sequentially or asynchronously. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/visit.html))
- [Goodness-of-Fit Tests](https://awesome-repositories.com/f/scientific-mathematical-computing/goodness-of-fit-tests.md) — Assesses how well a sample dataset fits a theoretical normal distribution using the Cramér-von Mises test. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/CramerVonMisesTestVisitor.html))
- [T-Test Mean Comparisons](https://awesome-repositories.com/f/scientific-mathematical-computing/group-mean-significance-tests/t-test-mean-comparisons.md) — Calculates t-statistics and degrees of freedom to compare means between paired or unpaired samples. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/TTestVisitor.html))
- [High-Performance and Parallel Computing](https://awesome-repositories.com/f/scientific-mathematical-computing/high-performance-execution-environments/high-performance-and-parallel-computing.md) — Provides a high-performance execution model that distributes computationally intensive data tasks across a thread pool. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DataFrame.html))
- [Inter-Quartile Range Outlier Detectors](https://awesome-repositories.com/f/scientific-mathematical-computing/inter-quartile-range-outlier-detectors.md) — Identifies anomalies by calculating the Inter-Quartile Range and returning indices of out-of-bounds points. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/AnomalyDetectByIQRVisitor.html))
- [Principal Component Analysis](https://awesome-repositories.com/f/scientific-mathematical-computing/linear-algebra-routines/principal-component-analysis.md) — Implements principal component analysis to reduce dimensionality by extracting structural patterns from matrices. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/pca_by_eigen.html))
- [Matrix Decompositions](https://awesome-repositories.com/f/scientific-mathematical-computing/matrix-decompositions.md) — Provides singular value decomposition to factorize matrices into singular vectors and values. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/pca_by_eigen.html))
- [Median Calculators](https://awesome-repositories.com/f/scientific-mathematical-computing/median-calculators.md) — Finds the median value of a data column in linear time using a selection-based approach. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/MedianVisitor.html))
- [Exponentially Weighted Statistics](https://awesome-repositories.com/f/scientific-mathematical-computing/numerical-mathematical-foundations/statistics-probability/statistical-analysis-libraries/statistical-metric-calculators/exponentially-weighted-statistics.md) — Computes moving statistics by applying a decay factor that prioritizes recent observations. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/ExponentiallyWeightedMeanVisitor.html))
- [Statistical Significance Testing](https://awesome-repositories.com/f/scientific-mathematical-computing/numerical-mathematical-foundations/statistics-probability/statistical-analysis-libraries/statistical-metric-calculators/statistical-significance-testing.md) — Performs Chi-Squared tests to determine if differences between observed and expected frequencies are statistically significant. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/ChiSquaredTestVisitor.html))
- [Power Transformations](https://awesome-repositories.com/f/scientific-mathematical-computing/power-transformations.md) — Transforms scalar or multidimensional data toward an approximately normal distribution using the Box-Cox power transformation. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/BoxCoxVisitor.html))
- [Product Calculations](https://awesome-repositories.com/f/scientific-mathematical-computing/product-calculations.md) — Provides general utilities for calculating the product of all values within a data column. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/ProdVisitor.html))
- [Quantile and Percentile Calculators](https://awesome-repositories.com/f/scientific-mathematical-computing/quantile-and-percentile-calculators.md) — Calculates dataset quantiles using configurable policies for linear interpolation or midpoint averaging. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/QuantileVisitor.html))
- [Rank-Based Statistical Tests](https://awesome-repositories.com/f/scientific-mathematical-computing/rank-based-statistical-tests.md) — Compares two independent groups using rank-based analysis via the Mann-Whitney U test. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/MannWhitneyUTestVisitor.html))
- [Rate of Change Calculations](https://awesome-repositories.com/f/scientific-mathematical-computing/rate-of-change-calculations.md) — Computes the rate of change between a dataset and its moving average for scalar and multidimensional data. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/BiasVisitor.html))
- [Population and Sample Variance Calculators](https://awesome-repositories.com/f/scientific-mathematical-computing/sample-statistics-calculators/population-and-sample-variance-calculators.md) — Computes both population and sample variance with options for numerical stability and bias correction. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/VarVisitor.html))
- [Z-Score Calculators](https://awesome-repositories.com/f/scientific-mathematical-computing/sample-statistics-calculators/z-score-calculators.md) — Computes the z-score of a sample column against a population column to measure deviation. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/SampleZScoreVisitor.html))
- [Standard Error Calculators](https://awesome-repositories.com/f/scientific-mathematical-computing/standard-error-calculators.md) — Computes the standard error of the mean for scalar or multidimensional data columns. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/SEMVisitor.html))
- [Time Series Smoothing](https://awesome-repositories.com/f/scientific-mathematical-computing/time-series-smoothing.md) — Detrends and smooths time series data using oscillator-type filters to identify cycles in datasets. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/EhlersHighPassFilterVisitor.html))
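
Several entries above mention numerically stable mean, variance, and standard deviation with bias correction. One common way to get that stability is Welford's single-pass update, sketched below. Whether DataFrame's `StdVisitor` or `VarVisitor` use this exact algorithm is not stated in the source; the `Moments` type and `accumulate_moments` function are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical accumulator using Welford's single-pass algorithm.
struct Moments {
    std::size_t count = 0;
    double mean = 0.0;
    double m2 = 0.0;  // running sum of squared deviations from the mean

    // Fold one observation into the running mean and m2.
    void push(double x) {
        ++count;
        const double delta = x - mean;
        mean += delta / static_cast<double>(count);
        m2 += delta * (x - mean);  // uses the already-updated mean
    }

    // biased == true divides by n (population variance);
    // biased == false divides by n - 1 (sample variance, Bessel's correction).
    double variance(bool biased = false) const {
        // Need at least one observation (population) or two (sample).
        assert(count >= (biased ? 1u : 2u));
        const double denom = static_cast<double>(biased ? count : count - 1);
        return m2 / denom;
    }

    double stddev(bool biased = false) const {
        return std::sqrt(variance(biased));
    }
};

// Runs the accumulator over a whole column.
Moments accumulate_moments(const std::vector<double>& column) {
    Moments m;
    for (double x : column) m.push(x);
    return m;
}
```

The update avoids subtracting two large, nearly equal sums, which is where the naive "mean of squares minus square of mean" formula loses precision.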

### Software Engineering & Architecture

- [Memory Alignment Utilities](https://awesome-repositories.com/f/software-engineering-architecture/memory-alignment-utilities.md) — Specifies byte alignment boundaries and custom allocators to optimize hardware-level vector instruction performance. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DataFrameTypes.html))
- [Tabular Duplicate Removers](https://awesome-repositories.com/f/software-engineering-architecture/in-place-array-manipulations/duplicate-removal/tabular-duplicate-removers.md) — Implements utilities to identify and eliminate duplicate records from tabular datasets. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/remove_duplicates.html))
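
To illustrate the idea behind the Memory Alignment Utilities entry, here is a minimal sketch of a C++17 allocator that places a vector's storage on a chosen byte boundary, for example 64 bytes for wide SIMD loads. It is not DataFrame's own allocator; `AlignedAllocator` and `is_aligned` are hypothetical names, and the sketch relies on the standard aligned `operator new`.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical allocator (C++17) returning storage aligned to `Alignment` bytes.
template <typename T, std::size_t Alignment>
struct AlignedAllocator {
    static_assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0,
                  "Alignment must be a power of two");
    static_assert(Alignment >= alignof(T),
                  "Alignment must be at least alignof(T)");

    using value_type = T;

    // Explicit rebind is required: the non-type Alignment parameter
    // prevents std::allocator_traits from deducing it automatically.
    template <typename U>
    struct rebind { using other = AlignedAllocator<U, Alignment>; };

    AlignedAllocator() noexcept = default;
    template <typename U>
    AlignedAllocator(const AlignedAllocator<U, Alignment>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T))
            throw std::bad_array_new_length();
        void* p = ::operator new(n * sizeof(T),
                                 static_cast<std::align_val_t>(Alignment));
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(p, static_cast<std::align_val_t>(Alignment));
    }
};

// All instances are interchangeable because the allocator holds no state.
template <typename T, typename U, std::size_t A>
bool operator==(const AlignedAllocator<T, A>&,
                const AlignedAllocator<U, A>&) noexcept { return true; }

template <typename T, typename U, std::size_t A>
bool operator!=(const AlignedAllocator<T, A>&,
                const AlignedAllocator<U, A>&) noexcept { return false; }

// True if `p` sits on an `alignment`-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

A column could then be declared as `std::vector<double, AlignedAllocator<double, 64>>`, which keeps the usual vector interface while guaranteeing the alignment of the first element.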

### Education & Learning Resources

- [Custom Analytical Algorithms](https://awesome-repositories.com/f/education-learning-resources/educational-resources/algorithms-theory-academics/cs-theory-foundations/algorithms/general-collections-and-study/algorithm-implementations/custom-analytical-algorithms.md) — Enables the creation of custom visitor objects that process data points individually or as sequences to extend analysis. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/DataFrame.html))
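
The visitor idea described above can be sketched as a small functor that receives one data point at a time and exposes a result afterwards. The shape below (`pre`, `operator()`, `post`, `result`) and the `visit_column` driver are assumptions chosen for illustration; DataFrame's real visitor interface is documented on the linked page and may differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical visitor that tracks the range (max - min) of a column.
class RangeVisitor {
public:
    // Reset state before a pass over the data.
    void pre() {
        seen_ = false;
        min_ = max_ = 0.0;
    }

    // Called once per data point with its position and value.
    void operator()(std::size_t /*index*/, double value) {
        if (!seen_) {
            min_ = max_ = value;
            seen_ = true;
            return;
        }
        if (value < min_) min_ = value;
        if (value > max_) max_ = value;
    }

    // Hook for finalization; nothing to do for this visitor.
    void post() {}

    // Range of the visited values, or 0 if nothing was visited.
    double result() const { return max_ - min_; }

private:
    bool seen_ = false;
    double min_ = 0.0;
    double max_ = 0.0;
};

// Hypothetical driver: feeds every element of a column to the visitor.
template <typename Visitor>
Visitor& visit_column(const std::vector<double>& column, Visitor& visitor) {
    visitor.pre();
    for (std::size_t i = 0; i < column.size(); ++i) {
        visitor(i, column[i]);
    }
    visitor.post();
    return visitor;
}
```

Because the driver only depends on the visitor's call signature, new analyses can be added without touching the storage or the traversal code.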

### System Administration & Monitoring

- [Expanding Window Functions](https://awesome-repositories.com/f/system-administration-monitoring/health-monitoring/rolling-window-analysis/expanding-window-functions.md) — Executes functions over a data vector using a rolling window that expands based on specified increments. ([source](https://hosseinmoein.github.io/DataFrame/docs/HTML/ExpandingRollAdopter.html))
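
An expanding window starts at the first element, covers an initial number of points, and grows by a fixed increment on each step. The sketch below shows that behaviour with the standard library only. It is an assumption about the general technique, not a reproduction of DataFrame's `ExpandingRollAdopter`; `expanding_apply` and `expanding_mean` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Applies `func` to the windows [0, init_size), [0, init_size + increment), ...
// for as long as the window fits inside `data`. One result per window.
template <typename Func>
std::vector<double> expanding_apply(const std::vector<double>& data,
                                    std::size_t init_size,
                                    std::size_t increment,
                                    Func func) {
    assert(init_size > 0 && increment > 0);
    std::vector<double> results;
    for (std::size_t end = init_size; end <= data.size(); end += increment) {
        const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
        results.push_back(func(data.begin(), last));
    }
    return results;
}

// Convenience wrapper: mean of each expanding window.
std::vector<double> expanding_mean(const std::vector<double>& data,
                                   std::size_t init_size,
                                   std::size_t increment) {
    using Iter = std::vector<double>::const_iterator;
    return expanding_apply(data, init_size, increment,
        [](Iter first, Iter last) {
            return std::accumulate(first, last, 0.0) /
                   static_cast<double>(last - first);
        });
}
```

Unlike a fixed-size rolling window, every window here shares the same starting point, so later results summarize progressively more of the series.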
