5 repositories
Using programming scripts to clean, transform, and compute metrics on datasets.
Distinct from Code Analysis and Metrics: that category covers static analysis of code (linting, quality checks), whereas this one covers using code to analyze data.
Explore 5 GitHub repositories matching Data & Databases · Scripted Data Analysis.
DB-GPT is an AI-driven database management system that uses agentic reasoning to execute data tasks. It converts natural language prompts into executable database queries and combines structured database records with unstructured knowledge bases to provide grounded analysis. The system orchestrates multi-step reasoning chains that integrate database queries, custom scripts, and external tool calls. It allows for the packaging of domain knowledge into reusable analysis skills and executes generated code within sandboxed environments for system safety. The platform covers data orchestration ac
Executes script-based logic to clean datasets and compute complex metrics.
DesktopCommanderMCP is a Model Context Protocol (MCP) server that gives AI agents direct access to local files, shell commands, and system processes through natural language instructions. It acts as a unified bridge between conversational commands and desktop operations, enabling an AI to translate plain English into file management, code editing, system command execution, data analysis, and software scaffolding tasks without needing its own API. The server exposes these capabilities as structured tools via MCP, so any compatible agent can interact with the local environment in a
Executes Python, Node.js, or R code in memory to analyze CSV, JSON, and Excel files instantly.
Beancount is a plain-text double-entry accounting system. It enforces zero-sum transactions, organizes accounts into a hierarchical five-type tree, and verifies balances at specific dates using precision-derived tolerances. Transactions are recorded in plain-text files with a strict syntax that supports currency-specific rounding, automatic interpolation of missing amounts, and comprehensive metadata including tags, links, and payee annotations. Beyond core bookkeeping, Beancount offers investment portfolio tracking with lot-based cost basis management, configurable booking strategies (FIFO,
Loads accounting directives into custom Python scripts for analysis beyond the built-in reports.
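As a rough illustration of the kind of check a custom analysis script might run on double-entry data, the sketch below verifies the zero-sum rule per currency. It does not use Beancount's own API: the tuple layout, the `residuals` and `is_balanced` helpers, the sample postings, and the fixed tolerance are all assumptions made for this example (the description above says Beancount derives its tolerances from precision rather than using a constant).

```python
from collections import defaultdict
from decimal import Decimal


def residuals(postings):
    """Sum posting amounts per currency.

    Each posting is a hypothetical (account, amount, currency) tuple.
    A balanced transaction leaves a residual of zero in every currency.
    """
    totals = defaultdict(Decimal)  # Decimal() starts each currency at 0
    for _account, amount, currency in postings:
        totals[currency] += Decimal(amount)
    return dict(totals)


def is_balanced(postings, tolerance=Decimal("0.005")):
    """Return True if every currency's residual is within the tolerance.

    The fixed tolerance is an assumption for this sketch only.
    """
    return all(abs(total) <= tolerance for total in residuals(postings).values())


# Hypothetical sample transactions.
groceries = [
    ("Expenses:Food", "42.10", "USD"),
    ("Assets:Checking", "-42.10", "USD"),
]
broken = [
    ("Expenses:Food", "42.10", "USD"),
    ("Assets:Checking", "-40.00", "USD"),
]

print(is_balanced(groceries))
print(is_balanced(broken), residuals(broken))
```

Using `Decimal` built from strings rather than floats keeps the sums exact, which matters when the check is an equality against zero.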
This project is a collection of educational resources and study materials focused on scientific computing and data analysis with Python. It consists of translated notes and Jupyter notebooks designed to guide learners through the Python data ecosystem. The content covers specialized workflows, including numerical computing, data cleaning, and time series analysis. These materials serve as a reference for performing complex data manipulations and processing sequential data to identify patterns. The resource is organized as a series of static files and markdown documents in a flat directory structure. It embeds executable code cells within document blocks and uses git version control to manage updates to translations and code snippets.
Provides scripted examples for cleaning, transforming, and computing metrics on study datasets.
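A minimal sketch of what a clean, transform, and compute-metrics script can look like, using only the Python standard library. The dataset, column names, and helper functions (`clean`, `moving_average`) are invented for this illustration and are not taken from the repository.

```python
import csv
import io
import statistics

# Hypothetical raw data: stray whitespace, inconsistent casing,
# one missing value, and one duplicated row.
RAW_CSV = """date,city,temp_c
2024-01-01, Paris ,4.5
2024-01-02,Paris,
2024-01-03,paris,6.5
2024-01-03,paris,6.5
2024-01-04,Paris,5.0
"""


def clean(rows):
    """Normalize text fields, drop rows with a missing reading, remove duplicates."""
    seen = set()
    for row in rows:
        city = row["city"].strip().title()
        temp = row["temp_c"].strip()
        if not temp:  # missing measurement
            continue
        key = (row["date"], city)
        if key in seen:  # duplicate (date, city) pair
            continue
        seen.add(key)
        yield {"date": row["date"], "city": city, "temp_c": float(temp)}


def moving_average(values, window):
    """Simple trailing moving average over a list of numbers."""
    return [
        statistics.fmean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


records = list(clean(csv.DictReader(io.StringIO(RAW_CSV))))
temps = [r["temp_c"] for r in records]

summary = {
    "count": len(temps),
    "mean": statistics.fmean(temps),
    "max": max(temps),
}
smoothed = moving_average(temps, window=2)

print(summary)
print(smoothed)
```

The same three stages scale up naturally to pandas or NumPy; the standard library version is shown here only to keep the example dependency-free.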
mimic-code is a clinical data analysis framework and toolset for processing deidentified electronic health records and intensive care unit data. It provides a healthcare SQL query library and a processing tool to transform raw health records into formats suitable for longitudinal analysis and machine learning. The project features a medical research notebook environment that integrates with cloud-hosted datasets, allowing for remote querying and analysis. It includes a DICOM imaging pipeline to retrieve chest radiographs and link medical imaging with structured clinical metadata. The framewo
Uses reproducible scripts to clean, transform, and analyze electronic health record data from critical care databases.