5 repositories
Using programming scripts to clean, transform, and compute metrics on datasets.
Distinct from Code Analysis and Metrics: that category covers static code analysis (linting and quality checks), whereas this one covers using code to analyze data.
Explore 5 awesome GitHub repositories matching data & databases · Scripted Data Analysis.
DB-GPT is an AI-driven database management system that uses agentic reasoning to execute data tasks. It converts natural language prompts into executable database queries and combines structured database records with unstructured knowledge bases to provide grounded analysis. The system orchestrates multi-step reasoning chains that integrate database queries, custom scripts, and external tool calls. It allows for the packaging of domain knowledge into reusable analysis skills and executes generated code within sandboxed environments for system safety. The platform covers data orchestration ac
Provides the ability to execute script-based logic for cleaning datasets and computing complex metrics.
DesktopCommanderMCP is a Model Context Protocol (MCP) server that gives AI agents direct access to local files, shell commands, and system processes through natural language instructions. It acts as a unified bridge between conversational commands and desktop operations, enabling an AI to translate plain English into file management, code editing, system command execution, data analysis, and software scaffolding tasks without needing its own API. The server exposes these capabilities as structured tools via MCP, so any compatible agent can interact with the local environment in a
Executes Python, Node.js, or R code in memory to analyze CSV, JSON, and Excel files instantly.
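As a rough illustration of what an in-memory scripted analysis of a CSV file can look like, here is a minimal Python sketch using only the standard library. It does not use DesktopCommanderMCP's tools or API; the sample data, the column names, and the `summarize` helper are all hypothetical.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for a CSV file on disk.
RAW_CSV = """region,revenue
north,120.5
south,
north,99.5
south,80
"""

def summarize(csv_text, group_col, value_col):
    """Return {group: mean of value_col}, skipping blank or non-numeric values."""
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw = (row.get(value_col) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            continue  # cleaning step: drop rows whose value cannot be parsed
        groups.setdefault(row[group_col], []).append(value)
    # metric step: one mean per group
    return {name: statistics.mean(values) for name, values in groups.items()}

summary = summarize(RAW_CSV, "region", "revenue")
print(summary)
```

The whole file is parsed and aggregated in memory, which suits small ad hoc datasets; larger files would likely call for streaming or a dataframe library.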
Beancount is a plain-text double-entry accounting system. It enforces zero-sum transactions, organizes accounts into a hierarchical five-type tree, and verifies balances at specific dates using precision-derived tolerances. Transactions are recorded in plain-text files with a strict syntax that supports currency-specific rounding, automatic interpolation of missing amounts, and comprehensive metadata including tags, links, and payee annotations. Beyond core bookkeeping, Beancount offers investment portfolio tracking with lot-based cost basis management, configurable booking strategies (FIFO,
Loads accounting directives into custom Python scripts for analysis beyond the built-in reports.
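The zero-sum rule mentioned in the description can be sketched in a few lines of plain Python. This is an illustration of the double-entry idea only, not Beancount's API or its implementation: the posting tuples, account names, and helper functions are hypothetical, and the sketch checks for an exact zero rather than applying Beancount's precision-derived tolerances.

```python
from collections import defaultdict
from decimal import Decimal

def residuals(postings):
    """Sum posting amounts per currency and return the non-zero leftovers."""
    totals = defaultdict(Decimal)  # Decimal() starts each currency at 0
    for _account, amount, currency in postings:
        totals[currency] += Decimal(amount)
    return {cur: total for cur, total in totals.items() if total != 0}

def is_balanced(postings):
    """A transaction balances when every currency sums to exactly zero."""
    return not residuals(postings)

# Hypothetical transactions as (account, amount, currency) tuples.
groceries = [
    ("Expenses:Food", "42.10", "USD"),
    ("Assets:Checking", "-42.10", "USD"),
]
broken = [
    ("Expenses:Food", "42.10", "USD"),
    ("Assets:Checking", "-42.00", "USD"),
]

print(is_balanced(groceries), is_balanced(broken), residuals(broken))
```

Amounts are parsed as `Decimal` rather than `float` so that sums of currency values stay exact.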
This project is a collection of educational resources and study materials focused on scientific computing and data analysis using Python. It consists of translated notes and Jupyter notebooks designed to guide learners through the Python data ecosystem. The content covers specialized workflows including numerical computation, data cleaning, and time series analysis. These materials provide a reference for performing complex data manipulations and processing sequential data to identify patterns. The resource is organized as a series of static files and markdown documents using a flat-file dir
Provides scripted examples for cleaning, transforming, and computing metrics on study datasets.
mimic-code is a clinical data analysis framework and toolset for processing deidentified electronic health records and intensive care unit data. It provides a healthcare SQL query library and a processing tool to transform raw health records into formats suitable for longitudinal analysis and machine learning. The project features a medical research notebook environment that integrates with cloud-hosted datasets, allowing for remote querying and analysis. It includes a DICOM imaging pipeline to retrieve chest radiographs and link medical imaging with structured clinical metadata. The framewo
Uses reproducible scripts to clean, transform, and analyze electronic health record data from critical care databases.