codex@macbookpro · 2026-03-31 · `2026-03-31_branch_a_real_textbook_validation.md`

# Branch A Real Textbook Validation

## Purpose
Run Branch A on the same five real textbook files and the same A01-A10 real-data scenario family used by Branch B, then report the results as observed, without changing Branch A runtime behavior.

## Base Commits
- Branch A base commit: `419ae8d39150806011c1eb6082c7fc8c6a337735`
- Branch B reference commit: `c7342881bb2ebfa5e7f927c91a7806416288573b` (`c734288`)
- Branch under test: `review/branch-a-real-textbook-validation`

## Dataset Path And File Check
- Dataset path: `/Users/george/code/china-text-book-md`
- Directory exists: `True`
- All 5 required files present: `True`
- OK `/Users/george/code/china-text-book-md/小学_语文_统编版_义务教育教科书·语文一年级上册.md`
- OK `/Users/george/code/china-text-book-md/小学_数学_人教版_义务教育教科书 · 数学一年级上册.md`
- OK `/Users/george/code/china-text-book-md/初中_语文_统编版-人民教育出版社_七年级_义务教育教科书·语文七年级上册.md`
- OK `/Users/george/code/china-text-book-md/初中_数学_人教版-人民教育出版社_七年级_义务教育教科书·数学七年级上册.md`
- OK `/Users/george/code/china-text-book-md/高中_语文_统编版-人民教育出版社_普通高中教科书·语文必修 上册.md`

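A check like the one reported above can be reproduced with a few lines of Python. This is a minimal sketch, not the harness that produced this report: the names `REQUIRED_FILES`, `check_dataset`, and `format_report` are illustrative assumptions, and only the dataset path and file names are taken from the list above.

```python
from pathlib import Path

# Path and file names are copied from the report; everything else is illustrative.
DATASET_DIR = Path("/Users/george/code/china-text-book-md")

REQUIRED_FILES = [
    "小学_语文_统编版_义务教育教科书·语文一年级上册.md",
    "小学_数学_人教版_义务教育教科书 · 数学一年级上册.md",
    "初中_语文_统编版-人民教育出版社_七年级_义务教育教科书·语文七年级上册.md",
    "初中_数学_人教版-人民教育出版社_七年级_义务教育教科书·数学七年级上册.md",
    "高中_语文_统编版-人民教育出版社_普通高中教科书·语文必修 上册.md",
]


def check_dataset(dataset_dir, required):
    """Return directory/file presence flags for the required textbook files."""
    dataset_dir = Path(dataset_dir)
    files = [(name, (dataset_dir / name).is_file()) for name in required]
    return {
        "directory_exists": dataset_dir.is_dir(),
        "all_present": all(present for _, present in files),
        "files": files,
    }


def format_report(result, dataset_dir):
    """Render the result in the same bullet style as the report section."""
    lines = [
        f"- Dataset path: `{dataset_dir}`",
        f"- Directory exists: `{result['directory_exists']}`",
        f"- All {len(result['files'])} required files present: `{result['all_present']}`",
    ]
    for name, present in result["files"]:
        status = "OK" if present else "MISSING"
        lines.append(f"- {status} `{Path(dataset_dir) / name}`")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(check_dataset(DATASET_DIR, REQUIRED_FILES), DATASET_DIR))
```

Running the same check before both branches' harnesses is one way to confirm that they read the identical file set.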
## Scenario Results
| ID | Textbook | Status | Branch B Reference | Branch A Observed | Reason |
| --- | --- | --- | --- | --- | --- |
| A01 | 小学语文一上 (primary Chinese, grade 1, first volume) | PASS | Branch B stage report: PASS; 126 nodes, 166 flows on 小学语文一上 pipeline. | phi=40, mu=13, J=35, mode=degraded | Pipeline ran on the required real-data slice and produced non-empty phi/mu/J state. |
| A02 | 小学数学一上 (primary math, grade 1, first volume) | PASS | Branch B stage report: PASS; 58 nodes, has_cn=True on 小学数学一上 mixed text. | phi=15, mu=7, J=18, mode=minimal | Chinese-bearing nodes exist on the mixed textbook slice; digit-bearing nodes are reported separately. |
| A03 | 初中语文七上 (junior-high Chinese, grade 7, first volume) | PASS | Branch B stage report: PASS; 276 nodes, 20 sedimentation traces on 初中语文七上. | phi=36, mu=6, J=28, mode=minimal | Sedimentation and experience-region observables are present on the required real-data slice. |
| A04 | 初中数学七上 (junior-high math, grade 7, first volume) | STRUCTURAL MISMATCH | Branch B stage report: PASS; 294 edges, asymmetry ratio 1.00 on 初中数学七上. | phi=35, J=32, flow-asym-proxy=0.7721 | The scenario ran, but the primary Branch B asymmetry-ratio metric does not map cleanly onto Branch A. |
| A05 | 高中语文必修上 (senior-high Chinese, compulsory, first volume) | PASS | Branch B stage report: PASS; 397 nodes, phi range [-0.13, 0.15] on 高中语文必修上. | phi=55, mu=13, J=53, mode=degraded | Long-text run stayed finite and showed no obvious overflow or divergence symptom. |
| A06 | 小学语文一上, 小学数学一上 | PASS | Branch B stage report: PASS; 8 new nodes after 语文→数学 (Chinese→math) subject switch. | phi=17, mu=6, J=9, mode=degraded | Active region changes under subject switch while some earlier structures remain alive. |
| A07 | 初中语文七上 | PASS | Branch B stage report: PASS; 182/189 phi entries preserved after reset. | phi=46, mu=0, J=41, mode=minimal | `reset_session()` clears session activation while preserving long-term graph/potential structure. |
| A08 | 小学语文一上 | FAIL | Branch B stage report: PASS; confidence 0.333→0.889→0.381 after positive/negative feedback. | mode=minimal, emit=minimal: idle, active=0 | `emit()` returned no activated output target on the required slice, so the positive/negative feedback loop could not be meaningfully exercised. |
| A09 | 小学语文一上 | PASS | Branch B stage report: PASS; sedimentation gradient (20,4)→(20,10). | phi=15, mu=3, J=11, mode=minimal | Repeated rounds show incremental stage progression, even though several observable lists are capped. |
| A10 | 初中数学七上 | PASS | Branch B stage report: PASS; 16 snapshot fields present on real textbook state. | phi=32, mu=10, J=47, mode=degraded | All Branch A locked snapshot fields are present on real textbook-driven state. |

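The Status column can be tallied mechanically. The sketch below transcribes the ten statuses from the table; the names `STATUSES` and `summarize` are illustrative and are not part of either branch's harness.

```python
from collections import Counter

# Statuses transcribed from the Scenario Results table.
STATUSES = {
    "A01": "PASS",
    "A02": "PASS",
    "A03": "PASS",
    "A04": "STRUCTURAL MISMATCH",
    "A05": "PASS",
    "A06": "PASS",
    "A07": "PASS",
    "A08": "FAIL",
    "A09": "PASS",
    "A10": "PASS",
}


def summarize(statuses):
    """Count each status and list the scenarios that did not pass."""
    counts = Counter(statuses.values())
    non_pass = {sid: status for sid, status in statuses.items() if status != "PASS"}
    return counts, non_pass


if __name__ == "__main__":
    counts, non_pass = summarize(STATUSES)
    for status, n in counts.most_common():
        print(f"{status}: {n}")
    print("Needs attention:", non_pass)
```

The tally gives 8 PASS, 1 FAIL (A08), and 1 STRUCTURAL MISMATCH (A04).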
## Explicit Structural Mismatch
- `A04`: Branch B's A04 metric is based on forward/backward graph edge weights. Branch A exposes only directed J flow, not a directly comparable directed graph-edge surface, so the asymmetry ratio cannot be compared fairly; Branch A reports `flow-asym-proxy` instead.
- `A01/A08`: Branch A `emit()` returns a plain string, not Branch B's structured payload with activated nodes and `active_count`.
- `A05/A10`: Branch A does not expose attention used/total; `free_capacity` is the locked comparable field instead.
- `A08/A10`: Branch A `commit_feedback()` is queued and becomes observable on the next step, unlike Branch B's more immediate feedback probes.

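The queued-feedback difference affects how a probe has to be written. The toy model below is a hypothetical stand-in, not Branch A code: `QueuedFeedbackToy`, its `confidence` field, its `step()` method, and the starting value 0.5 are all assumptions made for illustration. It shows only the ordering issue: a probe that reads state immediately after `commit_feedback()` sees no change until one more step has run.

```python
class QueuedFeedbackToy:
    """Hypothetical toy: feedback is queued and applied on the next step()."""

    def __init__(self):
        self.confidence = 0.5  # arbitrary starting value for the illustration
        self._pending = []

    def commit_feedback(self, delta):
        # Queue only; visible state is unchanged until step() runs.
        self._pending.append(delta)

    def step(self):
        # Apply queued feedback, clamping confidence to [0, 1].
        for delta in self._pending:
            self.confidence = min(1.0, max(0.0, self.confidence + delta))
        self._pending.clear()
        return self.confidence


def probe_feedback(model, delta):
    """Read confidence before feedback, right after commit, and after one step."""
    before = model.confidence
    model.commit_feedback(delta)
    right_after_commit = model.confidence
    after_step = model.step()
    return before, right_after_commit, after_step


if __name__ == "__main__":
    print(probe_feedback(QueuedFeedbackToy(), 0.25))
```

Under these assumptions, a probe written for immediate feedback would record the middle value and wrongly conclude that feedback had no effect, so a comparable harness needs one extra step before reading state.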
## Concise Fairness Interpretation
- This run materially reduces the main A/B fairness gap because Branch A was executed on the same dataset, same file set, and same A01-A10 slice family as Branch B.
- It does not erase Branch A's current disadvantages: A08 fails on the mandated slice, A04 is not directly comparable, and most Branch A state sizes remain much smaller than Branch B's reference values.

## Does This Reduce The Main A/B Fairness Gap?
- Yes. The earlier fairness concern was unmatched real-data coverage. That concern is now materially reduced because Branch A was run on the same real textbooks and scenario family.

## Recommendation
- Decision: `enough to proceed with merge decision`
- Reason: The main A/B fairness gap was the unmatched real-data harness. This validation closes that gap enough to make a merge decision on current evidence. The remaining issues are explicit Branch A results, not hidden harness differences: one failed scenario (A08) and one true structural mismatch (A04).