Akshara is a truly open-source, bilingual 1B foundation model — every datapoint, line of code, and checkpoint released and reproducible. We’re a small team at the start of the build, looking for collaborators who want their work to be fully in the open.
Grade-A-only manifests, A1/A2 sub-grading, dedup & decontamination over Dolma 3 + Sangraha verified.
Custom BPE, Indic fertility benchmarking across scripts, vocab/embedding budget.
OLMo-core trainer, ladder validation, the bilingual pretrain → midtrain → long-context flow.
OLMES + an Indic eval layer; SFT/DPO/RLVR for the Instruct & Think variants.
The ledger, manifest emitter, repro tooling — the part that makes "truly open" real.
IndiaAI empanelment, cluster bring-up, the things that unblock everything else.
Because the project is fully open, we keep the contributor circle real. Give us one public, resolvable link that establishes your research background. We resolve it ourselves; we don’t share it.
Takes ~5 minutes.
We resolve your research link(s).
A conversation about fit and a first task.
Email magic-link, repo access, a Phase task.