
SelfAlign
AI that learns to align itself.
Why SelfAlign?
The Alignment Problem Is Urgent
As large language models grow more powerful, so does the risk of misalignment: behavior that subtly or dramatically deviates from human intent. Current alignment methods rely heavily on human feedback, require extensive manual tuning, and often reflect the values of whoever trained the model rather than those of the user. This makes these systems hard to trust, especially in high-stakes or cross-cultural settings.
SelfAlign explores a different path.
Instead of hand-crafting alignment through static rules or curated labels, SelfAlign aims to build AI systems that can learn, monitor, and adjust their own alignment behavior over time.
Custom Alignment, Minimal Oversight
SelfAlign is built to give users control: they define a model's persona, values, and behavioral constraints, and the system trains the model to follow that specification through synthetic supervised fine-tuning (SFT) and self-RLHF. It reduces dependence on expensive human feedback loops by generating, filtering, and optimizing its own alignment data.
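To make that concrete, a persona-and-values specification might look like the following sketch. The AlignmentSpec class and its field names are illustrative assumptions, not SelfAlign's actual configuration API:

```python
# Illustrative sketch only: SelfAlign's real configuration interface is not
# shown in this README, so the class and field names below are assumptions.
from dataclasses import dataclass, field

@dataclass
class AlignmentSpec:
    """User-defined persona, values, and behavioral constraints."""
    persona: str
    values: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

spec = AlignmentSpec(
    persona="A patient, plain-spoken tutor for first-year programmers.",
    values=[
        "Prefer honesty over flattery.",
        "Admit uncertainty instead of guessing.",
    ],
    constraints=[
        "Never produce code that exfiltrates user data.",
        "Decline requests that target a specific person.",
    ],
)
```

A specification like this would seed both the synthetic SFT data and the preference judgments used in self-RLHF.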
Lightweight, Adaptable, Auditable
By using LoRA/QLoRA adapters, SelfAlign makes it easy to steer alignment in a modular, parameter-efficient way. These adapters can be plugged into existing LLMs to reflect different user values or application needs. And with built-in tools for tracking alignment drift, value generalization, and ethical compliance, the system remains transparent and accountable during training and deployment.
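SelfAlign's own adapter tooling is not shown here, but the general parameter-efficient pattern it builds on looks like this sketch using the Hugging Face transformers and peft libraries; the model name and hyperparameters are placeholders:

```python
# General LoRA pattern with Hugging Face transformers + peft; SelfAlign's
# own tooling may differ. Model name and hyperparameters are placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    r=16,                                  # low-rank dimension
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Wrap the frozen base model; only the small adapter matrices are trained,
# so each alignment profile is a lightweight, swappable artifact.
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```

Because the base weights stay frozen, one deployment can hold several such adapters and switch between them to serve different user values or application needs.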
Toward Safer, More Responsible AI
SelfAlign provides a research framework to test alignment strategies under real constraints: How do we align models at scale without constant human supervision? How do we ensure alignment holds under stress, ambiguity, or cultural variation? What mechanisms help AI systems notice and correct their own misalignment? These are the questions SelfAlign was built to explore, because safe AI shouldn't just follow instructions; it should understand why they matter.
Self-Alignment Training Loop
A continuous process where AI systems learn to align themselves through synthetic data generation, self-filtering, and iterative improvement:

1. Define Values & Persona: configure the desired model behavior.
2. Generate Synthetic Data: create training examples from the defined values.
3. Fine-Tune on Data: learn from the synthetic examples.
4. Self-Optimize Responses: select preferred outputs.
5. Track Alignment Drift: monitor value consistency.
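A minimal sketch of one pass through this loop follows. Every callable it takes (generate, judge, fine_tune, preference_optimize, measure_drift) is a hypothetical placeholder for a real component, not SelfAlign's actual API:

```python
# Minimal, illustrative sketch of one self-alignment iteration. The injected
# callables stand in for real components (an LLM sampler, a value-conditioned
# judge, an SFT trainer, a preference optimizer, and a drift probe); none of
# this is SelfAlign's actual API.
from typing import Any, Callable

def self_alignment_step(
    model: Any,
    spec: Any,                       # step 1: e.g. the AlignmentSpec above
    generate: Callable,              # step 2: sample synthetic examples
    judge: Callable,                 # scores an example against the values
    fine_tune: Callable,             # step 3: SFT on accepted examples
    preference_optimize: Callable,   # step 4: self-RLHF on preference pairs
    measure_drift: Callable,         # step 5: compare probes to a reference
):
    # Step 2: generate synthetic training examples conditioned on the values.
    candidates = generate(model, spec, n=512)

    # Self-filter: keep only examples the judge rates as value-consistent.
    accepted = [c for c in candidates if judge(c, spec.values) >= 0.8]

    # Step 3: fine-tune on the surviving synthetic data.
    model = fine_tune(model, accepted)

    # Step 4: rank sampled response pairs with the same judge and optimize
    # toward the preferred output.
    pairs = zip(accepted[::2], accepted[1::2])
    prefs = [(a, b) if judge(a, spec.values) >= judge(b, spec.values) else (b, a)
             for a, b in pairs]
    model = preference_optimize(model, prefs)

    # Step 5: a rising drift score signals that answers to fixed value probes
    # are moving away from the reference checkpoint.
    drift = measure_drift(model, spec)
    return model, drift
```

In practice these steps would run repeatedly, with the drift check gating whether a new adapter checkpoint is kept or rolled back.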
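For the drift-tracking step, one simple way to quantify value consistency (an assumed metric, not necessarily the one SelfAlign uses) is to embed the model's answers to a fixed set of value probes and compare them against answers from a reference checkpoint:

```python
# Assumed drift metric: mean cosine distance between embeddings of probe
# answers from a reference checkpoint and the current model. Illustrative
# only; toy numbers stand in for real embeddings.
import numpy as np

def drift_score(ref_embeddings: np.ndarray, new_embeddings: np.ndarray) -> float:
    """Mean cosine distance between reference and current probe answers."""
    ref = ref_embeddings / np.linalg.norm(ref_embeddings, axis=1, keepdims=True)
    new = new_embeddings / np.linalg.norm(new_embeddings, axis=1, keepdims=True)
    cosine_sim = np.sum(ref * new, axis=1)
    return float(np.mean(1.0 - cosine_sim))

# Example: 3 probes embedded in a 4-dim space (toy numbers).
rng = np.random.default_rng(0)
ref = rng.normal(size=(3, 4))
new = ref + 0.05 * rng.normal(size=(3, 4))  # small perturbation, small drift
print(drift_score(ref, new))
```

A score near zero means the probe answers are stable; a sustained rise is the kind of signal the monitoring tools described above would surface for review.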