2 repositories
Tools for assigning categorical tags to database columns based on schema metadata.
Distinct from Data Categorization: focuses specifically on schema-level column classification for sensitive data identification, rather than on general data categorization.
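As a rough illustration of schema-level column classification, the sketch below tags columns using only their names and declared types, never the row data. The `Column` class, the `RULES` table, and the tag names are all hypothetical and chosen for this example; they are not taken from any repository listed here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Schema metadata for one column (hypothetical structure)."""
    name: str
    data_type: str


# Hypothetical rules: (tag, name pattern, allowed data types or None for any).
RULES = [
    ("email", re.compile(r"e[-_]?mail", re.IGNORECASE), None),
    ("phone", re.compile(r"phone|mobile", re.IGNORECASE), None),
    ("birth_date", re.compile(r"birth|dob", re.IGNORECASE), {"DATE", "TIMESTAMP"}),
]


def classify(column: Column) -> str:
    """Return the first tag whose name pattern and type constraint both match."""
    for tag, pattern, allowed_types in RULES:
        type_ok = allowed_types is None or column.data_type.upper() in allowed_types
        if type_ok and pattern.search(column.name):
            return tag
    return "unclassified"


schema = [
    Column("user_email", "VARCHAR"),
    Column("dob", "DATE"),
    Column("order_total", "DECIMAL"),
]
tags = {col.name: classify(col) for col in schema}
```

Name-based rules like these are cheap to run but prone to false positives and misses, so a real classifier would likely combine them with other metadata or manual review.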
This project serves as a comprehensive educational resource and technical handbook for engineers building applications powered by large language models. It provides a structured framework for mastering the principles of artificial intelligence engineering, covering the full lifecycle of model development from initial design to production deployment. The repository distinguishes itself by offering a deep dive into the practical implementation of advanced design patterns, including retrieval-augmented generation, agentic tool orchestration, and parameter-efficient model adaptation. It emphasize
Assigns categorical tags to database columns based on schema metadata to organize and identify sensitive or functional information.
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Assigns columns as dimensions, metrics, or time fields to optimize storage and enable specific aggregation operations.
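To make the dimension, metric, and time roles concrete, here is a minimal sketch that groups columns by role into a Pinot-style schema document. The key names (`dimensionFieldSpecs`, `metricFieldSpecs`, `dateTimeFieldSpecs`) and the epoch-milliseconds format string follow the Pinot schema JSON layout as best recalled, so check them against the Pinot documentation before use. The `build_schema` helper, the table name, and the column names are hypothetical.

```python
import json


def build_schema(schema_name, columns):
    """Group (name, data_type, role) tuples into a Pinot-style schema dict.

    role must be 'dimension', 'metric', or 'time'.
    """
    schema = {
        "schemaName": schema_name,
        "dimensionFieldSpecs": [],
        "metricFieldSpecs": [],
        "dateTimeFieldSpecs": [],
    }
    for name, data_type, role in columns:
        spec = {"name": name, "dataType": data_type}
        if role == "dimension":
            schema["dimensionFieldSpecs"].append(spec)
        elif role == "metric":
            schema["metricFieldSpecs"].append(spec)
        elif role == "time":
            # Assumption: the time column stores epoch milliseconds.
            spec["format"] = "1:MILLISECONDS:EPOCH"
            spec["granularity"] = "1:MILLISECONDS"
            schema["dateTimeFieldSpecs"].append(spec)
        else:
            raise ValueError(f"unknown role for column {name!r}: {role!r}")
    return schema


# Hypothetical table with one column per role.
schema = build_schema(
    "pageViews",
    [
        ("country", "STRING", "dimension"),
        ("viewCount", "LONG", "metric"),
        ("eventTimeMs", "LONG", "time"),
    ],
)
schema_json = json.dumps(schema, indent=2)
```

Keeping the role as an explicit third field makes the classification decision visible per column, which is the step the description above refers to.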