30 open-source projects similar to sql-machine-learning/sqlflow, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best SQLFlow alternative.
This library is a web-native engine designed to execute pretrained machine learning models directly within the browser. It functions as a client-side inference framework, enabling developers to run complex neural networks for natural language processing, computer vision, and audio tasks without requiring a backend server or external API calls. The framework distinguishes itself by providing a unified pipeline-based abstraction that handles the entire lifecycle of model execution. It manages the dynamic retrieval of model weights and configurations from remote registries, while simultaneously
MindsDB is an AI-native database engine that treats machine learning models and autonomous agents as virtual tables. By mapping external data sources, predictive models, and third-party services directly into the database schema, it enables users to perform inference, data retrieval, and complex orchestration using standard SQL syntax. The platform distinguishes itself through an autonomous agent orchestrator that executes iterative reasoning loops, allowing agents to plan data access and synthesize natural language responses from connected knowledge bases. It functions as a federated data ga
Seldon Core is a Kubernetes-based machine learning model server and MLOps inference framework. It functions as a multi-model serving engine and pipeline orchestrator, packaging models as scalable microservices that are exposed via standardized REST and gRPC APIs. The project distinguishes itself through graph-based inference pipelines that chain models and data transformers into sequential workflows. It optimizes hardware utilization via multi-model shared serving and dynamic memory overcommit strategies, while supporting production experimentation through weighted traffic routing, A/B testin
PostgresML is a machine learning database extension for PostgreSQL that integrates model training and inference directly into the database. It functions as an in-database AI platform and vector database, enabling the execution of large language models and natural language processing tasks on stored records without exporting data to external services. The system distinguishes itself by utilizing GPU acceleration to minimize latency during model predictions and employing a hybrid storage engine that maintains relational data alongside high-dimensional vectors. It allows for the building and fin
This project is a structured learning curriculum and technical reference for mastering deep learning with TensorFlow. It provides a comprehensive guide for building, training, and deploying neural networks, combining theoretical fundamentals with practical implementation examples. The repository distinguishes itself by covering the end-to-end machine learning workflow, from low-level tensor mathematics and linear algebra to the creation of complex model architectures. It includes specific guidance on developing data pipelines for diverse data types, such as images, text, and time-series seque
This project provides Rust bindings for the TensorFlow C API, serving as a tensor computation interface and machine learning library. It enables the construction and execution of machine learning models and neural networks by bridging a systems language to high-performance backends. The framework supports GPU-accelerated computing to increase the speed of model training and inference by offloading mathematical operations to graphics processing units. It offers both graph-based computation for defining static network architectures and an eager execution mode for immediate operation calls durin
CatBoost is a gradient boosting machine learning library used to train decision tree ensembles for regression, classification, and ranking tasks. It functions as a high-performance framework that provides a categorical data processor for transforming non-numeric features, a distributed trainer for large-scale datasets, and GPU acceleration to speed up model construction. The library distinguishes itself through native handling of categorical data and text features, removing the need for manual encoding. It includes a specialized model interpretability tool that leverages SHAP values and featu
Deep Java Library is a Java deep learning framework and JVM model inference engine. It provides a high-level API for building and deploying deep learning models within the Java ecosystem, acting as a cross-platform runtime for executing models across CPUs, GPUs, and mobile devices. The library is engine-agnostic, allowing users to switch between different deep learning engines such as PyTorch, TensorFlow, and MXNet while maintaining a single unified API. This enables the deployment of the same model across different backends without changing the application code. The framework supports the f
Orange3 is a visual data mining platform that provides an interactive canvas for building data analysis workflows without writing code. At its core, it offers a widget-based visual programming environment where users connect configurable components to perform data preprocessing, machine learning model training, statistical evaluation, and interactive visualization. The platform is built on NumPy-backed data tables with domain descriptors that define variable names, types, and roles, and includes a lazy SQL query proxy for working with database tables without loading all data into memory. The
This repository is a collection of Jupyter notebooks providing reference implementations and templates for building, training, and deploying machine learning models using Amazon SageMaker. It serves as an example library for implementing model architectures and automating the machine learning lifecycle. The library provides practical patterns for machine learning training, data engineering, and model deployment. It includes implementation guides for MLOps, including workflows for model monitoring, lineage tracking, and hyperparameter tuning. The examples cover a broad range of capabilities i
Tensor2Tensor is a deep learning library built on TensorFlow designed for training and evaluating complex machine learning models. It provides a unified framework for managing the entire model lifecycle, including data ingestion, training execution, and performance evaluation across diverse hardware environments. The library distinguishes itself through a modular architecture that supports multimodal data processing, allowing for the simultaneous analysis of text, audio, and image inputs. It features a central registry system that enables developers to extend the framework with custom models,
This project is a PyTorch transformer model library and pre-trained model framework. It serves as a deep learning model hub and multimodal inference engine, providing a centralized system for loading, executing, and fine-tuning state-of-the-art model checkpoints. The library focuses on multimodal machine learning, enabling predictions across text, vision, and audio data. It provides specialized capabilities for model framework interoperability, allowing the conversion of weights and definitions between different deep learning libraries. The platform covers the full model lifecycle, including
PyCaret is a Python AutoML platform and MLOps lifecycle manager designed to automate machine learning workflows. It functions as a low-code environment that leverages a scikit-learn native engine to execute preprocessing, training, and evaluation for tabular data. The platform distinguishes itself as an LLM-powered ML copilot, using large language model agents to analyze datasets, design experiment configurations, and explain model results. It also serves as a Kubernetes ML orchestrator and model registry, enabling the versioning of trained pipelines and their promotion to production API endp
This is a cross-platform framework for building, training, and deploying custom machine learning models within the .NET ecosystem. It provides a predictive modeling engine for classification, regression, and forecasting tasks, alongside an inference runtime to generate predictions across different hardware architectures. The framework includes a gradient boosting library and supports interoperability with external models via a standardized open format. It features tools for prediction explainability, allowing the analysis of feature importance to debug model behavior and identify bias. The p
PredictionIO is a machine learning server designed for the deployment of predictive models to transform raw data into actionable predictions. It manages the full lifecycle of machine learning operations, from ingesting event data via APIs to hosting production-ready predictive services for real-time inference. The system supports distributed model training by spreading computational workloads across a cluster of nodes to increase processing speed. It enables the implementation of custom prediction engines using programming languages or the application of pre-built model templates for common t
ai-edu is a comprehensive AI education curriculum and machine learning courseware collection. It provides theoretical tutorials, deep learning lab exercises, and project blueprints designed to teach artificial intelligence fundamentals through a combination of study and practical implementation. The project focuses on a learning-by-doing approach, guiding users from Python programming and neural network basics to advanced topics. It includes specialized instructional content on distributed AI training, MLOps educational guides for model quantization and pruning, and detailed frameworks for im
This project is a modular PyTorch framework for training and evaluating object detection and instance segmentation models. It serves as a computer vision research tool and a deep learning inference engine designed to identify object locations, classes, and pixel-level masks within images. The framework implements a two-stage inference pipeline that utilizes region proposal networks and a symmetric mask-head architecture. It provides specialized capabilities for instance segmentation, object bounding box detection, and human pose estimation via anatomical keypoint detection. The system includ
Moltworker is an AI agent sandbox and model orchestrator designed for the secure execution of untrusted code and shell commands generated by large language models. It functions as a gateway proxy that routes requests to multiple AI providers through a unified interface, integrating a container runtime backed by S3-compatible object storage to persist state across ephemeral lifecycles. The system distinguishes itself by combining an AI model orchestrator with a headless browser controller for automated web scraping and screenshot capture. It manages the full lifecycle of AI agents, including m
MLOps-Basics is a collection of implementation guides and blueprints for automating the machine learning lifecycle. It provides practical workflows for managing the transition of models from training to production deployment, focusing on the integration of operational tools into the machine learning pipeline. The project features specific architectural patterns for deploying containerized models using serverless infrastructure and cloud registries. It includes frameworks for tracking large datasets and model artifacts via remote storage, as well as guides for converting models into standardiz
SynapseML is an Apache Spark machine learning library designed for building and scaling machine learning workflows and data pipelines across distributed clusters. It serves as a distributed machine learning pipeline framework and a distributed inference engine for executing hardware-accelerated predictions and deep learning tasks on large-scale datasets. The project functions as a cloud AI integration layer, allowing users to apply pretrained artificial intelligence services for text, vision, and speech within distributed pipelines. It also includes a dedicated suite of tools for distributed
Deeplearning4j is a JVM-based deep learning framework and tensor computing library. It provides a computational graph engine for defining and executing deep learning workflows and mathematical operations within the Java Virtual Machine. The project includes a dedicated importer for loading and running pretrained models exported from Keras, TensorFlow, and ONNX formats. Its tensor computing capabilities are driven by a modular native C++ math core to execute high-performance linear algebra operations. The framework covers neural network training, deep learning model inference, and the constru
ConvNetJS is a JavaScript deep learning library and neural network training engine designed for client-side machine learning. It functions as a framework for building, training, and running convolutional neural networks directly within a web browser without the need for a backend server. The library specializes in image recognition and pattern analysis using convolutional and pooling layers. It enables the creation of models for classification and regression tasks, as well as the development of reinforcement learning agents that optimize behavior through trial and error in simulated environme
Kubeflow is a Kubernetes machine learning platform and containerized toolkit designed to orchestrate the entire machine learning lifecycle. It functions as an MLOps workflow orchestrator and infrastructure layer for building, training, and deploying models within containerized environments. The project provides specialized infrastructure for scaling compute resources and managing GPU workloads for large-scale distributed training. It automates the transition of models from experimental development to production through workflow orchestration and model deployment services. The platform covers
Label Studio is a multi-modal data annotation platform designed to create and manage high-quality training datasets for machine learning. It functions as a self-hosted, containerized environment that supports secure, private deployments, including air-gapped configurations. The platform provides a centralized workspace for labeling diverse media types, such as images, text, audio, and time-series data, to support supervised and reinforcement learning workflows. The platform distinguishes itself through deep integration with machine learning backends, enabling active learning loops, automated
LiteLLM is a unified gateway and proxy server designed to centralize access to over one hundred language model providers. It provides a standardized API interface that abstracts vendor-specific schemas, allowing developers to interact with diverse models through a single, consistent format. By acting as a central traffic management layer, it enables organizations to route, secure, and govern model interactions across multiple deployments. The platform distinguishes itself through its policy-driven architecture, which uses configuration-based routing to manage traffic distribution, load balanc
Feast is an open-source feature store for machine learning that provides a central platform for defining, storing, and serving features across both training and inference workflows. It operates as a declarative system where feature definitions are written as code in Python files, synchronized to a central registry, and made available for low-latency online retrieval or point-in-time correct historical joins for training datasets. The project abstracts storage behind a pluggable architecture, allowing offline and online backends to be swapped without changing retrieval logic, and coordinates ma
BentoML is a machine learning model serving framework and GPU-accelerated inference server designed to package, deploy, and scale AI models as production-ready REST APIs. It functions as an AI model lifecycle manager and an inference graph orchestrator, enabling the chaining of multiple models and custom logic into complex pipelines for advanced task sequences. The framework distinguishes itself through a dynamic batching engine that optimizes GPU throughput and an artifact-based packaging system that bundles model weights and dependencies into immutable archives for consistent deployment. It
Wandb is a centralized platform for machine learning experiment tracking, model registry management, and workflow orchestration. It provides a comprehensive suite of tools for logging, visualizing, and versioning training metrics, model artifacts, and hyperparameter sweeps to ensure reproducibility across development cycles. The platform also functions as an observability tool for large language model applications, enabling the tracing of execution steps, token usage, and reasoning processes. The project distinguishes itself through its event-driven automation capabilities, which allow users
pgai is a PostgreSQL AI toolkit and framework designed to integrate large language models and vector embeddings directly into a database. It serves as a bridge for executing machine learning model requests and performing text-to-SQL translations within standard database queries. The project provides an automated vector embedding pipeline that handles the loading, parsing, and chunking of text from tables and unstructured documents. This system utilizes a background worker to synchronize embeddings automatically as source data changes and includes specialized tools for building retrieval-augme