Tools that use machine learning models to parse, structure, and extract information from unstructured data sources.
Distinguishing note: Focuses on the extraction of structured data from text rather than general model inference.
1 GitHub repository in Artificial Intelligence & ML · Extraction Engines.
Langextract is a framework for transforming unstructured text into structured, machine-readable data through language model orchestration. Its pipeline processes large volumes of narrative text using parallel execution and sequential extraction passes. The library is built for complex extraction tasks and includes specialized support for clinical information and medical entity relationship recognition. The project distinguishes itself through a plugin-based architecture that supports both local hardware execution and cloud-hosted model endpoi
The library can process data entirely on local hardware, using integrated model runners to perform extraction tasks without external API keys or cloud-based infrastructure.
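The idea of combining parallel execution with sequential extraction passes can be illustrated with a minimal Python sketch. This is not Langextract's API. Every name here (`Extraction`, `PASS_PATTERNS`, `split_sentences`, `run_pass`, `extract_document`) is hypothetical, and regular expressions stand in for the language model calls so that the example runs locally with only the standard library.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    label: str
    text: str
    start: int  # character offset in the full document


# Hypothetical stand-in for model calls: each pass looks for one kind of entity.
PASS_PATTERNS = [
    ("dosage", re.compile(r"\b\d+\s?mg\b")),
    ("medication", re.compile(r"\b(?:aspirin|ibuprofen)\b", re.IGNORECASE)),
]


def split_sentences(text):
    """Return (offset, sentence) pairs so chunks can be processed independently."""
    return [(m.start(), m.group()) for m in re.finditer(r"[^.]+\.?", text)]


def run_pass(pass_index, chunks, max_workers):
    """Run one extraction pass over all chunks in parallel."""
    label, pattern = PASS_PATTERNS[pass_index]

    def extract(item):
        offset, chunk = item
        return [
            Extraction(label, m.group(), offset + m.start())
            for m in pattern.finditer(chunk)
        ]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = list(pool.map(extract, chunks))
    return [e for batch in batches for e in batch]


def extract_document(text, passes=2, max_workers=4):
    """Run passes one after another and merge results, dropping duplicates."""
    chunks = split_sentences(text)
    seen = set()
    merged = []
    for pass_index in range(min(passes, len(PASS_PATTERNS))):
        for extraction in run_pass(pass_index, chunks, max_workers):
            if extraction not in seen:
                seen.add(extraction)
                merged.append(extraction)
    return sorted(merged, key=lambda e: e.start)


if __name__ == "__main__":
    note = "Patient took aspirin 100 mg. Later given ibuprofen 200 mg."
    for item in extract_document(note):
        print(item)
```

Within a pass, chunks are handled concurrently. The passes themselves run in sequence, and their results are merged by character offset so each extraction can be traced back to its position in the source text.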