Obliteratus is a weight ablation framework and refusal removal tool designed to identify and delete the internal representations responsible for content refusals in large language models without retraining. It functions as a circuit analysis suite that maps the geometric structure of model guardrails to isolate the specific layers and attention heads that enforce refusals.
The project enables the removal of these behaviors through geometric projection, rank-1 adapter ablation for reversible modifications, and the application of steering vectors to alter behavior during inference. It includes automation for configuring projection strengths and layer selection based on real-time analysis of model geometry.
The system covers distributed GPU processing for weight sharding and remote pipeline execution via SSH. It also provides observability tools for model coherence evaluation, measuring perplexity and refusal rates, and benchmarking removal strategies using topology charts and angular drift.