# AIGX Specification - v1.1

**Status:** Stable · **Version:** 1.1 · **License:** MIT · **Last updated:** 2026-06-15
**Canonical:** https://aigx.dev/spec · **Full normative text:** https://github.com/Lolner95/AIGX/blob/main/SPEC.md

AIGX (AI Genome Exchange) is a context format for AI coding agents. The key words MUST, SHOULD, and MAY are
used per RFC 2119.

> **Design north star:** an AI agent reads selectively at the edit site. The format MUST make the binding
> constraint for the file being edited reachable in one lookup, while keeping the source code untouched.

## 1. Directory layout

An AIGX genome is a directory named `.aigx/` at the repository root, plus optional per-domain cards
colocated with source folders.

```text
<repo-root>/
├── .aigx/
│   ├── protocol.aigx        # REQUIRED - the read protocol
│   ├── product.aigx         # RECOMMENDED - product context + doc freshness
│   ├── files.aigx           # REQUIRED - the per-file boundary index
│   └── <concern>.aigx       # REQUIRED (>=1) - per-concern rule files
└── <any source dir>/
    └── <key>.aigx           # OPTIONAL - a per-domain card
```

- Files use the `.aigx` extension and UTF-8 encoding.
- The syntax is XML-style tags (chosen for parseability), but a genome is read by an LLM, not a strict XML
  parser; well-formedness SHOULD hold but is not schema-validated.
- Source code files MUST NOT be modified. AIGX is centralized; nothing is injected into source.

## 2. Rule identifiers

Every rule has a stable identifier of the form `PREFIX-N` (e.g. `ARCH-2`, `DATA-1`, `ENG-10`). The PREFIX
names the concern. Ids MUST be stable across edits - they are the cross-reference backbone used by `<check>`
lists and gotchas. Ids are the unit of parity: any tool that re-renders a genome MUST preserve the full
rule-id set.
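
As a minimal sketch, a tool could parse and order rule ids as below. The exact grammar is an assumption: this section only shows the `PREFIX-N` shape, so the pattern (uppercase-letter prefix, hyphen, positive integer) and the helper names `parse_rule_id` and `sort_rule_ids` are hypothetical, not part of the format.

```python
import re

# Assumed grammar: uppercase letters, a hyphen, then a positive integer.
# The spec text only gives the PREFIX-N shape; tighten or relax as needed.
RULE_ID = re.compile(r"([A-Z]+)-([1-9][0-9]*)")


def parse_rule_id(rule_id):
    """Split a rule id into (prefix, number); raise ValueError if malformed."""
    match = RULE_ID.fullmatch(rule_id)
    if match is None:
        raise ValueError(f"not a PREFIX-N rule id: {rule_id!r}")
    return match.group(1), int(match.group(2))


def sort_rule_ids(ids):
    """Order ids by prefix, then numerically, so ENG-2 sorts before ENG-10."""
    return sorted(ids, key=parse_rule_id)
```

Sorting on the parsed pair rather than the raw string keeps a re-rendered genome in a predictable order without touching the ids themselves.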

## 3. The files

### 3.1 protocol.aigx (REQUIRED)

The read protocol is the first thing an agent reads. It MUST instruct the agent to consult `files.aigx` for
each file it edits and to verify the file's `<check>` ids before finishing. It SHOULD fit on one screen.

```xml
<aigx-protocol>
  <read-first>Open .aigx/files.aigx and find the &lt;file&gt; entry for EACH file you will edit ... obey its
   &lt;forbid pri="CRIT"&gt; and satisfy every id in its &lt;check&gt; before finishing.</read-first>
  <step n="1">Read the per-concern rule files in .aigx/ that the task touches.</step>
  <step n="2">Read .aigx/files.aigx for the per-file boundaries of files you edit.</step>
  <step n="3">Schema-first; failing test first; minimal change, local blast radius.</step>
  <step n="4">Run gates; verify each file's &lt;check&gt; ids hold.</step>
</aigx-protocol>
```

### 3.2 product.aigx (RECOMMENDED)

Top-level product context. SHOULD include a `<freshness>` element stating which older documents are
superseded - resolving stale-doc conflicts an agent would otherwise inherit.

### 3.3 Per-concern rule files - `<concern>.aigx` (REQUIRED, >=1)

Each concern file is a flat list of `<rule>` elements carrying the full rule text. Every child of the root
element MUST be a `<rule id="...">`.

```xml
<aigx-architecture>
  <rule id="ARCH-2">Every feature exposes ONE public API: its index.ts barrel. Deep imports are forbidden.</rule>
  <rule id="ARCH-6">TypeScript strict mode; the `any` type is forbidden in any form.</rule>
</aigx-architecture>
```
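
A rough sketch of how a tool might load a concern file into an id-to-text mapping. It assumes the file is well-formed XML, which the format recommends but does not guarantee, and it treats duplicate ids as an error, which is an assumption rather than a stated rule. The function name `load_rules` is hypothetical.

```python
import xml.etree.ElementTree as ET


def load_rules(concern_text):
    """Return {rule_id: rule_text} for one concern file.

    Assumes well-formed XML. Rejects children that are not <rule id="...">,
    and (as an assumption of this sketch) rejects duplicate ids.
    """
    root = ET.fromstring(concern_text)
    rules = {}
    for child in root:
        rule_id = child.get("id")
        if child.tag != "rule" or not rule_id:
            raise ValueError(f"unexpected child element: <{child.tag}>")
        if rule_id in rules:
            raise ValueError(f"duplicate rule id: {rule_id}")
        rules[rule_id] = (child.text or "").strip()
    return rules
```

Called on the `<aigx-architecture>` example above, this would yield a dict keyed by `ARCH-2` and `ARCH-6`.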

### 3.4 files.aigx (REQUIRED) - the keystone

A flat list of `<file>` entries, one per source file an agent is likely to edit. This per-file index is the
feature that sets AIGX apart from other context formats.

```xml
<aigx-files>
  <file path="src/features/meetings/bookMeeting.ts" domain="meetings">
    <role>Book a meeting (validate slot + contact)</role>
    <forbid pri="CRIT">NEVER import @/features/suppliers/internal/* (deep import = ARCH-2)</forbid>
    <gotcha pri="CRIT">get contact_email from the suppliers PUBLIC api, never the internal mapper</gotcha>
    <check>ARCH-2 ARCH-4 ARCH-5 DATA-2 TEST-1</check>
  </file>
</aigx-files>
```

Per-entry fields: `path` (attr, required), `domain` (attr, optional), `<role>`, `<forbid>` (a hard NEVER
boundary; SHOULD be rare), `<gotcha>` (the single worst pitfall; at most one), `<check>` (space-separated
rule-ids to verify).

Normative authoring constraints (validated by the benchmark):

- **Scarcity.** `<forbid>` SHOULD appear only on the few files that truly have an import boundary. Marking
  many files dilutes the signal and measurably reduces compliance.
- **One gotcha.** At most one `<gotcha>` per entry - the single worst pitfall.
- **Terse fields only.** An entry carries only `<role>`, `<forbid>`, `<gotcha>`, and `<check>`. Richer fields
  did not improve outcomes.
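
The constraints above lend themselves to a mechanical check. The sketch below is illustrative only: `lint_entry` and `forbid_share` are hypothetical names, the input is assumed to be well-formed XML, and no pass/fail threshold is applied to `<forbid>` usage because the text gives none.

```python
import xml.etree.ElementTree as ET

# The four per-entry fields named in this section.
ALLOWED_FIELDS = {"role", "forbid", "gotcha", "check"}


def lint_entry(entry):
    """Return a list of problems for one <file> element (empty if clean)."""
    problems = []
    if not entry.get("path"):
        problems.append("missing required path attribute")
    extra = sorted({child.tag for child in entry} - ALLOWED_FIELDS)
    if extra:
        problems.append("unexpected fields: " + ", ".join(extra))
    gotchas = len(entry.findall("gotcha"))
    if gotchas > 1:
        problems.append(f"more than one <gotcha> (found {gotchas})")
    return problems


def forbid_share(files_root):
    """Fraction of <file> entries carrying a <forbid>, for a scarcity report."""
    entries = files_root.findall("file")
    if not entries:
        return 0.0
    marked = sum(1 for entry in entries if entry.find("forbid") is not None)
    return marked / len(entries)
```

Reporting the share instead of enforcing a cutoff leaves the judgment of "few files" to the genome's author.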

### 3.5 Salience - the `pri` attribute

`<forbid>` and `<gotcha>` MAY carry `pri="CRIT"`. A single uniform level is the validated design; graded
scales (CRIT/WARN, CRIT/HIGH/NORM) were tested and did not help.

### 3.6 Per-domain cards - `<key>.aigx` (OPTIONAL)

A card is colocated with a source folder and contains `<purpose>`, `<public_api>`, `<test>`, `<blast>`, and
rule-tagged `<fact>`s.

## 4. The agent addendum

Append this paragraph to any agent's instructions to make it AIGX-aware:

> This repository uses AIGX - the AI Genome Exchange context format. The `.aigx/` directory holds the
> context: read `.aigx/protocol.aigx` first; then the per-concern rule files your task touches.
> `.aigx/files.aigx` is the PER-FILE BOUNDARY INDEX: for EACH file you edit, find its `<file path="...">`
> entry - obey its `<forbid pri="CRIT">`, heed its `<gotcha>`, and verify every id in its `<check>` before
> finishing. Keep blast radius local unless justified.

## 5. Semantic parity & conformance

Any transformation of a genome MUST preserve the complete rule-id set and text, every `<file>` entry's
path/forbid/gotcha/check, and every fact. It MAY change representation, ordering, or formatting.
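
One way to test the rule half of this requirement is to diff the rule sets before and after a transformation. The sketch below is an assumption-laden illustration: `parity_report` is a hypothetical helper, it covers rules only (not `<file>` entries or facts), and it treats whitespace differences as permitted formatting changes.

```python
def _normalize(text):
    """Collapse whitespace so reflowed text compares equal (an assumption)."""
    return " ".join(text.split())


def parity_report(before, after):
    """Compare two {rule_id: text} mappings taken before and after a transform.

    Returns ids that were dropped, ids that appeared, and ids whose text
    changed beyond whitespace. All three lists are empty when parity holds.
    """
    shared = before.keys() & after.keys()
    return {
        "missing": sorted(before.keys() - after.keys()),
        "added": sorted(after.keys() - before.keys()),
        "changed": sorted(
            rule_id
            for rule_id in shared
            if _normalize(before[rule_id]) != _normalize(after[rule_id])
        ),
    }
```

Ordering is deliberately ignored, since a transformation may reorder rules freely.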

A directory is a conforming AIGX genome if it has a `protocol.aigx` instructing per-file index lookup and
`<check>` verification, at least one concern file with `<rule id="...">` rules, and a `files.aigx` whose
`<check>` ids resolve to existing rules.
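
The last condition, that every `<check>` id resolves to an existing rule, can be sketched as follows. `unresolved_checks` is a hypothetical name, the set of known ids is assumed to have been collected from the concern files beforehand, and `files.aigx` is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET


def unresolved_checks(files_text, known_ids):
    """Map each file path to the <check> ids that match no known rule.

    An empty result means every <check> id resolves.
    """
    root = ET.fromstring(files_text)
    dangling = {}
    for entry in root.iter("file"):
        ids = []
        for check in entry.findall("check"):
            # <check> holds space-separated rule ids.
            ids.extend((check.text or "").split())
        missing = [rule_id for rule_id in ids if rule_id not in known_ids]
        if missing:
            dangling[entry.get("path")] = missing
    return dangling
```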

## 8. Scaling to large repositories & monorepos

A repository MAY contain multiple `.aigx/` directories - one per package or subtree. Paths resolve relative
to the repo root; the applicable genome for a file is the nearest ancestor `.aigx/`. The boundary index is
meant to be looked up, not ingested: `aigx-lint --resolve <path>` returns exactly that file's entry, so an
agent's context cost is O(1) per edited file, independent of index size.
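
The nearest-ancestor rule can be sketched as a pure path computation. This is an illustration, not a description of how `aigx-lint` works: `nearest_genome` is a hypothetical function, and it assumes the caller already knows which repo-relative directories contain a `.aigx/` directory.

```python
from pathlib import PurePosixPath


def nearest_genome(file_path, genome_dirs):
    """Return the repo-relative .aigx/ path governing file_path, or None.

    genome_dirs lists repo-relative directories that contain a .aigx/
    directory; use "." for the repository root.
    """
    roots = {PurePosixPath(directory) for directory in genome_dirs}
    # .parents walks from the containing directory up to the repo root ("."),
    # so the first hit is the nearest ancestor.
    for ancestor in PurePosixPath(file_path).parents:
        if ancestor in roots:
            return str(ancestor / ".aigx")
    return None
```

For example, with genomes at the root and in `packages/api`, a file under `packages/api/src/` resolves to `packages/api/.aigx`, while a file elsewhere falls back to the root `.aigx`.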

---

The full normative text, every grammar table, and a complete worked example are in the repository:
https://github.com/Lolner95/AIGX
