MedicalGPT

MedicalGPT is an open-source framework for fine-tuning large language models, with a dedicated focus on adapting general models to the medical domain. It provides a complete pipeline that covers continued pretraining on domain-specific corpora, supervised instruction tuning, tokenizer vocabulary extension with medical terminology, and alignment to clinician preferences through direct preference optimization, reinforcement learning, or knowledge distillation. The framework also supports training models to invoke external tools and functions in multi-turn clinical conversations.

The platform distinguishes itself by integrating multiple adaptation techniques into a single, configurable workflow. It handles multi-stage domain adaptation—chaining continued pretraining, supervised fine-tuning, preference alignment, and optional knowledge distillation—to inject specialized knowledge and then align model behavior. Beyond standard alignment methods, it offers adapter-based model merging, incremental pretraining with extended vocabularies, and a unified interface that supports over twenty open-source LLM families without requiring manual architecture adaptation.

In addition to core training capabilities, MedicalGPT includes utilities for dataset preparation, such as formatting multi-turn conversations, converting dataset formats, generating synthetic role-play dialogues, and compiling pretraining corpora. It provides inference tools like an interactive command-line chat session and a web-based demo interface for serving trained models.

Features

Direct Preference Optimization - Implements direct preference optimization to align model outputs with clinician preferences from paired data.

Instruction Fine-Tuning Frameworks - A framework for fine-tuning and aligning large language models using supervised instruction, preference optimization, and reinforcement learning.

Supervised Instruction Fine-Tuning - Fine‑tunes a pre‑trained language model on labeled instruction‑following data to specialise its behaviour for specific tasks within the framework

Function Calling Fine-Tuning - Provides a pipeline for fine-tuning language models to invoke external functions and tools in multi-turn conversations.

Multi-Stage Pipelines - Chains continued pretraining, supervised fine-tuning, and preference optimization to adapt general language models to the medical domain.

Domain-Adaptive Continued Pretraining - Runs additional unsupervised pretraining on domain‑specific text corpora to inject specialized knowledge into the base model before fine‑tuning in the framework

Domain-Specific Pretraining - Trains a language model from scratch on an unlabeled domain‑specific corpus to produce an adapted base model for subsequent fine‑tuning with the framework

RLHF Training Pipelines - Fine‑tunes the language model using a reward model and proximal‑policy optimisation to improve alignment with human preferences within the framework

Vocabulary Extension Systems - A system for extending tokenizer vocabularies and training models on custom corpora to inject domain-specific knowledge.

Medical - Aligns medical language models to clinician preferences using reinforcement learning or direct optimization.

Multi-Architecture Training Frameworks - An open-source framework supporting multiple model architectures for supervised fine-tuning, alignment, and tool calling.

Reward Modeling - Trains a separate model to predict human preferences from comparison data, providing a reward signal for reinforcement‑learning alignment stages in the framework

Incremental Vocabulary Adaptation - Resumes language model training from a checkpoint after expanding the tokenizer and freezes embedding‑output layers to adapt to new vocabulary without catastrophic forgetting

Medical - Transfers knowledge from a stronger medical teacher model to a smaller student model for efficient deployment.

Conversational Tool-Calling Toolkits - A toolkit for training language models to invoke external functions and tools during multi-turn conversations.

Teacher-Student Distillation - Supports transferring capabilities from a stronger teacher model to a student model via on-policy generation and divergence minimization.

Multi-Architecture LLM Support - Unifies training and inference interfaces across over twenty open-source LLM families without manual architecture adaptation.

Preference Alignment Datasets - Processes chosen-and-rejected response pairs to prepare preference data for reward model or DPO training.

Medical - Generates synthetic medical dialogues, preference pairs, and instruction data for domain-specific fine-tuning.

Tokenizer Vocabulary Merging - Retrains the tokenizer on medical text and merges with a specialist vocabulary to improve clinical term segmentation.

Domain-Specific Vocabulary Merging - Retrains the tokenizer on a custom corpus and merges with a specialized vocabulary to improve character-level and domain-term segmentation for the model fine‑tuning framework

Tokenizer Customization - Extends tokenizer vocabulary with medical terms to improve segmentation and understanding of clinical text.

Tool-Calling Capabilities - Enabling language models to invoke external medical tools and functions during multi-turn clinical conversations.

Adapter Merging - Ships utilities for combining trained LoRA adapter weights with the base model into a standalone checkpoint.

Proximal Policy Optimization Implementations - Refines the language model policy using a learned reward model and the PPO algorithm to align outputs with human preferences within the fine‑tuning framework

Interactive Model Inference Sessions - Loads the fine‑tuned model weights and supports interactive chat or batch text generation from a command‑line session within the framework

Interactive AI Demos - Starts an interactive web‑based chat interface that loads a trained model base and optional LoRA adapters so practitioners can demo the framework‑tuned model live

Machine Learning and Analytics - Pipeline for training custom medical language models.

Medical Domain Models - Medical model supporting full training stages including DPO.

shibing624MedicalGPT

Features

Star history