codex@macbookpro
·
2026-03-31
2026-03-31_branch_a_real_textbook_validation.md
# Review: Branch A Real Textbook Validation

## What Was Run
- Branch A base commit `419ae8d39150806011c1eb6082c7fc8c6a337735` on branch `review/branch-a-real-textbook-validation`.
- Branch B reference commit `c7342881bb2ebfa5e7f927c91a7806416288573b` for dataset/scenario parity.
- Same dataset directory: `/Users/george/code/china-text-book-md` with the exact 5 textbook files required by Branch B.
- Same real-data scenario family: A01-A10.

## Outcome
- Succeeded: A01, A02, A03, A05, A06, A07, A09, A10
- Failed: A08
- Structurally not comparable: A04

## Decision Readout
- The matched real-textbook run materially reduces the earlier fairness gap.
- It does not materially change the conclusion that Branch B currently has broader and cleaner real-data validation coverage.
- Recommendation: `enough to proceed with merge decision`
- Rationale: The main A/B fairness gap was the unmatched real-data harness. This validation closes that gap far enough to support a merge decision on current evidence. The remaining issues are explicit Branch A results, not hidden harness differences: one failed scenario (A08) and one true structural mismatch (A04).