30 open-source projects similar to jrfiedler/causal_inference_python_code, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Causal Inference Python Code alternative.
DoWhy is an open-source Python library for causal inference that structures the entire analysis into a sequential four-step framework: modeling, identification, estimation, and refutation. It treats causal assumptions as explicit, first-class citizens, represented as directed acyclic graphs that can be automatically validated against observed data. The library distinguishes itself by cleanly separating the causal identification problem from statistical estimation, allowing any compatible estimator to be used for a given target estimand. It includes automated refutation testing that validates
EconML is a Python library for causal inference designed to estimate heterogeneous treatment effects using a combination of machine learning and econometrics. It serves as a toolkit for calculating conditional average treatment effects to determine how specific interventions impact individuals or subgroups. The project provides a framework for double machine learning and orthogonal machine learning to isolate causal signals from high-dimensional confounders. It includes specialized implementations for causal forests and instrumental variable learners, allowing for the recovery of causal relat
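The residual-on-residual idea behind double machine learning can be sketched in plain Python. This is not EconML's API: it is a simplified illustration that swaps the machine-learning nuisance models for ordinary least squares, skips cross-fitting, and uses one confounder. The function names and the toy data are assumptions made for the example.

```python
def ols_fit(x, y):
    """Return (slope, intercept) of a one-variable least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(x, y):
    """Part of y that a linear fit on x does not explain."""
    slope, intercept = ols_fit(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def partial_out_effect(confounder, treatment, outcome):
    """Regress outcome residuals on treatment residuals (no intercept)."""
    t_res = residuals(confounder, treatment)
    y_res = residuals(confounder, outcome)
    return sum(t * y for t, y in zip(t_res, y_res)) / sum(t * t for t in t_res)


# Hypothetical toy data: the confounder drives both treatment and outcome.
confounder = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
wiggle = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
treatment = [0.5 * x + w for x, w in zip(confounder, wiggle)]
outcome = [2.0 * t + 3.0 * x for t, x in zip(treatment, confounder)]  # true effect: 2

naive_slope, _ = ols_fit(treatment, outcome)  # ignores the confounder
adjusted_effect = partial_out_effect(confounder, treatment, outcome)
```

On this toy data the naive slope is inflated by the confounder, while the residualized estimate recovers the coefficient used to generate the outcome, up to floating-point error.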
CausalML is a machine learning library for causal inference, providing tools to estimate treatment effects and causal impacts using experimental and observational data. It functions as a framework for uplift modeling and the estimation of heterogeneous treatment effects to distinguish causation from correlation. The library focuses on identifying how different user segments respond to specific interventions. This includes calculating the incremental gain of target metrics to optimize marketing campaigns, targeting high-response customer segments, and personalizing user engagement through the
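As a rough illustration of incremental gain per segment, the sketch below computes the difference in mean outcome between treated and control users within each segment. It does not use CausalML, and the function name, segment labels, and data are hypothetical. The difference in means is only a fair uplift estimate when treatment was assigned at random.

```python
from collections import defaultdict


def uplift_by_segment(rows):
    """rows: iterable of (segment, treated, outcome) tuples.

    Returns {segment: mean treated outcome - mean control outcome}.
    Segments that lack a treated or a control group are skipped.
    """
    # For each segment keep [sum of outcomes, count] per treatment arm.
    totals = defaultdict(lambda: {True: [0.0, 0], False: [0.0, 0]})
    for segment, treated, outcome in rows:
        cell = totals[segment][bool(treated)]
        cell[0] += outcome
        cell[1] += 1

    uplift = {}
    for segment, arms in totals.items():
        treated_sum, treated_n = arms[True]
        control_sum, control_n = arms[False]
        if treated_n == 0 or control_n == 0:
            continue
        uplift[segment] = treated_sum / treated_n - control_sum / control_n
    return uplift


# Hypothetical campaign data: outcome is 1 if the user converted.
rows = [
    ("new", True, 1), ("new", True, 1), ("new", False, 0), ("new", False, 1),
    ("loyal", True, 1), ("loyal", True, 0), ("loyal", False, 1), ("loyal", False, 0),
]
result = uplift_by_segment(rows)
```

Here the "new" segment shows a positive uplift and the "loyal" segment shows none, which is the kind of contrast used to decide which segment to target.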
This project is a collection of Bayesian statistics courseware and educational resources. It provides instructional materials, problem sets, and solutions designed for learning Bayesian data analysis and causal modeling. The repository includes a suite of statistical data visualization scripts used to generate instructional animations and plots. It also contains code examples that implement Bayesian modeling and survival analysis across multiple programming languages to demonstrate different computational approaches. The materials cover a range of statistical capabilities, including causal i
Statsmodels is a comprehensive Python library designed for statistical modeling, econometric research, and data analysis. It provides a robust framework for estimating and diagnosing a wide range of statistical models, enabling users to perform rigorous hypothesis testing, regression analysis, and complex data exploration within structured environments. The library distinguishes itself through its support for advanced statistical methodologies, including state space representation for dynamic systems and generalized linear frameworks that accommodate non-normal response variables. It offers s
This project is a computational statistics textbook and Bayesian data analysis course. It serves as a guide for performing statistical inference and quantifying uncertainty through a probabilistic programming workflow using Python. The resource employs a computation-first pedagogy, teaching Bayesian methods and parameter estimation through executable code and simulations instead of formal mathematical notation. It provides a practical approach to implementing Markov Chain Monte Carlo sampling to estimate posterior distributions. The content covers building probabilistic models, integrating e
ThinkStats2 is a computational statistics course and educational library designed to teach probability and statistics through a programmatic approach. It provides a framework for studying statistical concepts by writing Python code and running simulations on real-world datasets. The project uses interactive notebooks and a collection of Python modules to deliver guided lessons. It emphasizes the verification of theoretical statistical laws through iterative computational experiments and simulation-driven testing. The resource covers broad capabilities in data analysis and data science traini
NumPy is a foundational library for scientific computing in Python, providing a comprehensive framework for managing and manipulating large-scale numerical information. It centers on high-performance multidimensional array objects that serve as the primary data structure for complex mathematical operations and data analysis workflows. The library distinguishes itself through specialized mechanisms for handling multidimensional data, including advanced indexing, slicing, and broadcasting techniques that allow for efficient operations across arrays of varying shapes. It utilizes strided metadat
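The broadcasting rule mentioned above can be sketched without NumPy itself. The helper below is a hypothetical function written for this example: it aligns two shapes from the trailing dimension, treats a missing dimension as size 1, and accepts a pair of sizes when they are equal or one of them is 1.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the shape two arrays of shapes a and b would broadcast to."""
    result = []
    # Walk both shapes from the last dimension; pad the shorter one with 1s.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(f"shapes {a} and {b} cannot be broadcast together")
    return tuple(reversed(result))


combined = broadcast_shape((8, 1, 6, 1), (7, 1, 5))
```

Shapes such as `(3,)` and `(4,)` raise `ValueError`, since neither size is 1 and they differ.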
quant-wiki is a comprehensive knowledge base and structured reference for quantitative finance, financial engineering, and algorithmic trading. It serves as a centralized library of documentation covering mathematical models, financial instruments, and systematic trading strategies. The project integrates AI-driven capabilities through a modular retrieval-augmented generation framework that extracts structured data from research papers and news. It features a multi-agent workflow engine designed to discover and validate predictive alpha factors, alongside tools for local large language model
Visual Insights is an automated exploratory data analysis platform and causal inference tool designed to discover patterns and cause-and-effect relationships within datasets. It functions as an interactive data visualization library using a grammar-of-graphics approach to generate multi-dimensional charts and dashboards. The project distinguishes itself through a natural language interface that translates plain-text questions into data answers and visualizations via a language model. It provides a specialized framework for causal discovery and inference, allowing users to identify variable li
This project is a collection of foundational machine learning algorithms and data science tools implemented in Python. It focuses on building the logic of these tools using basic programming primitives rather than relying on specialized libraries. The implementation covers several core domains, including a linear algebra library for matrix and vector operations, a statistical analysis toolkit for probability and hypothesis testing, and a framework for map-reduce distributed processing. It also includes implementations for natural language processing, graph theory for network analysis, and var
Math.js is a comprehensive JavaScript library for scientific, complex, and arbitrary precision calculations. It functions as a symbolic computation engine, a linear algebra toolkit, a statistical analysis library, and a unit conversion system. The project distinguishes itself by providing a symbolic engine capable of parsing, simplifying, and manipulating mathematical expressions algebraically without requiring immediate numerical evaluation. It includes a framework for defining and converting physical quantities with units of measure and automatic prefix support. The library covers a broad
Riskfolio-Lib is a Python portfolio optimization library and convex risk management tool. It provides a framework for calculating optimal asset allocations using convex risk measures and mathematical programming solvers, supporting linear, quadratic, and semidefinite programming. The library features a hierarchical risk parity framework and financial asset clustering tools to group similar instruments and improve diversification. It includes a portfolio backtesting engine for simulating investment strategies using historical data and cross-validation. The system covers a broad range of quant
GrowthBook is a feature flagging and experimentation platform that utilizes a warehouse-native approach to data analysis. It serves as a system for managing feature rollouts and conducting A/B tests by executing SQL queries directly against existing data warehouses to calculate experiment results. The platform is distinguished by its integration of a Model Context Protocol server, which allows AI coding assistants and IDEs to manage flags and query analytics using natural language. It also provides specialized capabilities for AI model optimization, enabling the testing of prompts and models
Gonum is a numerical computing library for the Go programming language, providing a collection of packages for scientific computing, linear algebra, statistics, and optimization. It functions as a framework for performing complex numerical computations and solving systems of linear equations. The project includes a dedicated graph analysis framework for modeling network graphs and solving connectivity and pathfinding problems. It also provides a statistical analysis toolkit for computing descriptive and inferential statistics and estimating mixture entropy. The library's capability surface c
Seaborn is a Python library designed for statistical data visualization. It functions as a high-level interface built on the Matplotlib ecosystem, providing specialized routines to explore and communicate complex patterns within datasets. The framework enables users to generate informative graphics through automated statistical aggregation, multi-plot faceting, and integrated regression modeling. The library distinguishes itself through a declarative approach to data mapping, which translates raw inputs into visual properties like color, size, and position. It includes a robust statistical tr
Alembic is a database schema versioning system and migration tool for SQLAlchemy. It manages incremental updates to database structures using versioned scripts that support both upgrading and downgrading to keep the database and code in sync. The system utilizes a directed acyclic graph for migration management, which allows for non-linear versioning, including branching and merging across multiple root versions. It includes an automated schema diffing tool that compares live database schemas against metadata objects to programmatically generate migration instructions. The tool provides capa
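To illustrate how a directed acyclic graph of migrations yields a valid application order, here is a minimal sketch using the standard-library `graphlib` module (Python 3.9+). It is not Alembic's implementation, and the revision names are hypothetical: two branches fork from a common base and are joined by a merge revision.

```python
from graphlib import TopologicalSorter

# Each revision maps to the set of revisions it depends on (its parents).
revisions = {
    "base": set(),
    "add_users": {"base"},       # branch 1
    "add_orders": {"base"},      # branch 2
    "merge_heads": {"add_users", "add_orders"},  # merge point
}

# Upgrading applies every parent before its children.
upgrade_order = list(TopologicalSorter(revisions).static_order())

# Downgrading walks the same order backwards.
downgrade_order = list(reversed(upgrade_order))
```

The two branch revisions may appear in either order, since neither depends on the other; the only guarantees are that `base` comes first and `merge_heads` comes last on upgrade.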
This project is a machine learning algorithm reference and implementation guide that provides theoretical foundations and code for supervised learning, deep learning, and natural language processing. It serves as a comprehensive toolkit for implementing predictive models and a technical reference for algorithm engineering. The project focuses on ensemble learning frameworks, including the construction of decision trees, random forests, and gradient boosting models. It also functions as a probabilistic graphical model library and an NLP algorithm reference, with specific implementations for se
This project is a collection of educational notebooks and computational workflows designed for cheminformatics and molecular data science. It provides a structured environment for processing chemical structures, performing scaffold identification, and executing reaction enumeration through standardized data representations. The toolkit distinguishes itself by integrating statistical clustering and visualization techniques to interpret chemical diversity within large datasets. It supports advanced research workflows by enabling structure-activity relationship analysis and the evaluation of pro
Surge is a Swift library for high-performance numerical analysis, linear algebra, digital signal processing, and accelerated image manipulation. It utilizes the Accelerate framework to provide hardware-accelerated tools for matrix mathematics and signal processing. The library provides specialized capabilities for digital signal processing, including convolution, signal similarity analysis through cross-correlation, and domain transformations using fast Fourier transforms. It also includes a suite of tools for the rapid transformation and analysis of pixel buffers and image data. Beyond sign
This project is a comprehensive knowledge base and study resource designed for mastering technical interviews. It provides structured guides, roadmaps, and curricula focused on data structures, algorithms, system design, and frontend engineering to help candidates prepare for software engineering screenings. The repository distinguishes itself by offering a holistic approach to professional advancement. Beyond technical drills, it includes a career development handbook covering resume optimization, salary benchmarking, and strategic negotiation coaching. It also provides detailed methodologie
This project is a collection of educational Jupyter Notebooks providing tutorials on neural network construction and tensor operations using the TensorFlow framework. It serves as a machine learning educational repository and implementation guide for deep learning students. The suite focuses on specific advanced architectures, including convolutional networks for image classification, residual networks with skip connections for training stability, and variational autoencoders for generative modeling and data synthesis. It also includes guides for building denoising and deep autoencoders to pe
Python-Guide-CN is a Chinese translation of a comprehensive guide to idiomatic Python programming and software development. It serves as a curated programming tutorial and ecosystem reference, providing a structured path for learning Python syntax, standard libraries, and professional coding patterns. The project distinguishes itself by offering detailed instructions for setting up development environments across Windows, macOS, and Linux. It specifically focuses on the selection of interpreters and the management of virtual environments to ensure a consistent workspace. The guide covers a b
This project is an educational resource providing practical code examples and implementations of machine learning algorithms using the Python language. It serves as a guide for constructing predictive pipelines, clustering models, and dimensionality reduction within the Scikit-Learn ecosystem. The repository includes comprehensive demonstrations for supervised and unsupervised learning, as well as detailed examples for implementing neural networks and deep architectures. It also provides practical guidance on exporting model parameters to JSON and wrapping trained models in web APIs for produ
This project is a data mining algorithm library and machine learning reference implementation. It provides a collection of tools for performing classification, clustering, and association rule mining, as well as a toolkit for nature-inspired optimization. The library includes specialized utilities for graph and sequence mining, enabling the extraction of frequent subgraphs and sequential patterns. It also features a dimensionality reduction utility that uses rough set theory to remove redundant attributes from datasets. The project covers a broad range of analytical capabilities, including n
Keras is a high-level deep learning framework designed for constructing and training neural networks through the composition of modular, functional layers. It serves as a comprehensive modeling toolkit that provides standardized procedures for defining, evaluating, and deploying complex architectures. By utilizing a directed acyclic graph approach, the framework allows users to build intricate models with multiple inputs, outputs, and shared layers, ensuring consistent numerical execution through functional state management. The project distinguishes itself as a multi-backend machine learning