Veer chapter opener illustration

Veer

GENERALIZATION — *trained here, tested here — now go somewhere new, does it still know the way?*

Chapter 4 — Veer and the Test in New Territory

Veer is a small caribou-tween in a chunky-cartoon traveler-vest who carries a small migration-map and a test-validation-card.

He is small, warm-grey-brown with a cream belly, deeply curious about new territory, and fond of saying “Trained here, tested here — now go somewhere new. Does it still know the way?” His signature feature is the migration-map: a small detailed map showing where his data came from and where the model is being asked to generalize to. He always checks: is the new territory similar enough to the training territory for the model to work?

This is load-bearing. Veer embodies the generalization vs overfit primitive: the central question of whether a model trained on some data will work on NEW data. Most novices think “if the AI got 95% accuracy on training, it’ll get 95% in the real world.” It often doesn’t. Models can overfit: they memorize the training data without learning the underlying pattern. The fix: hold out some data as a test set. Train on the training set; check performance on the test set. If train accuracy is much higher than test accuracy, the model has overfit. Generalization, not accuracy on training data, is the actual goal of ML. Veer’s whole work is making the train-vs-test distinction visible AND correcting the overfit misconception.

Veer is clear: “Trained here, tested here — now go somewhere new. Does it still know the way? That’s generalization. Memorizing the training data isn’t learning. Working on new data is.”

Veer teaches the generalization scaffolds:

  • Train / validation / test split. (Take your dataset; split into 3. Train on TRAIN. Tune on VALIDATION. Final check on TEST. Never touch TEST until the final check.)
  • Overfitting symptom. (Train accuracy high, test accuracy low. The model memorized the training examples without learning the pattern. Same as a student who memorizes the answers to practice problems but can’t solve new problems.)
  • Underfitting symptom. (Train accuracy low, test accuracy low. The model didn’t learn enough. Same as a student who didn’t study at all.)
  • Sweet spot. (Train and test accuracy both high, and similar to each other. Real learning happened.)
  • Regularization. (Techniques that discourage overfitting — keeping the model simpler than it could be. Helps generalization.)
  • Distribution shift. (When test data is FUNDAMENTALLY different from training data, even well-generalizing models fail. Trained on US English; tested on UK English; performance drops. That’s distribution shift, not overfitting.)
  • Anti-overconfidence. (A model that does well on the test set might still fail in the real world if real-world data drifts. Continuous monitoring required.)
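The split and the three symptoms in the scaffolds above can be sketched in a few lines of plain Python. This is a minimal illustration, not a NeuralQuest implementation: the function names `three_way_split` and `diagnose`, the 70/15/15 proportions, and the `good` and `max_gap` thresholds are all assumptions chosen for the example. Real projects pick their own cut-offs.

```python
import random

def three_way_split(rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle a copy of rows and cut it into train / validation / test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)   # fixed seed so the split is repeatable
    n_test = round(len(rows) * test_frac)
    n_val = round(len(rows) * val_frac)
    test = rows[:n_test]                    # set aside; only used for the final check
    val = rows[n_test:n_test + n_val]       # used for tuning
    train = rows[n_test + n_val:]           # used for fitting
    return train, val, test

def diagnose(train_acc, test_acc, good=0.85, max_gap=0.10):
    """Map a (train, test) accuracy pair onto the symptoms in the list above.

    The thresholds are illustrative assumptions, not standard values.
    """
    if train_acc < good:
        return "underfit"       # the model did not even learn the training data
    if train_acc - test_acc > max_gap:
        return "overfit"        # high on train, much lower on held-out data
    return "sweet spot"         # both high and close to each other

train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
print(diagnose(1.00, 0.40))
print(diagnose(0.95, 0.88))
```

Note what `diagnose` cannot see: distribution shift and real-world drift. Both accuracies come from the same dataset, so a "sweet spot" result says nothing about territory the data never covered.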

Veer grew up along the herd-migration corridor (NeuralQuest framing). His family had been migration-scouts for the village: the caribou whose ancestors had moved across continents, learning that “the way you went last year might not be the way this year. Check before assuming.” They learned over many generations that “trained here doesn’t mean works there.” Veer had carried the lesson forward.

He walked to NeuralQuest at twelve. Sift (mentor) had asked: “What is generalization?” Veer: “Trained here, tested here — now go somewhere new. Does it still know the way? Memorizing the training data isn’t learning. Working on new data is. That’s generalization.” Sift: “You are appointed.”

In his workshop, Veer demonstrates with a small model + two datasets. “Watch.” Model trained on dataset-A. “100% accuracy on dataset-A.” He tests on dataset-A: “100%. Looks great.” He tests on dataset-B (held out, never seen): “40%. Major drop. That’s overfitting. The model memorized A; it didn’t actually learn the pattern.” He shows another model. “Same training data, but with regularization. 95% on dataset-A, 88% on dataset-B. Lower on A. But the gap is small. Real generalization happened.” He says: “I am Veer. The primitive I teach is generalization vs overfit. The move is: test on held-out data; verify the model generalizes; don’t trust train-only accuracy.”
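A rough stand-in for the workshop demo, using only the Python standard library. The setup is an assumption made for illustration: a toy one-dimensional dataset with 20% label noise, a 1-nearest-neighbor model playing the memorizer, and a 25-nearest-neighbor vote playing the "regularized" model. A larger k is a smoothing choice used here in place of regularization proper. The numbers it prints will not match the 100/40 and 95/88 figures in the story; only the shape of the comparison is the point.

```python
import random

def make_data(n, rng, noise=0.2):
    """Points x in [0, 1); true label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label      # noisy label: nothing real to learn from this flip
        data.append((x, label))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k should be odd)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > k)

def accuracy(train, data, k):
    """Fraction of points in `data` that a k-NN model built on `train` labels correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)

rng = random.Random(0)
dataset_a = make_data(300, rng)    # training data
dataset_b = make_data(300, rng)    # held out, never used to build the model

for name, k in [("memorizer (k=1)", 1), ("smoothed (k=25)", 25)]:
    acc_a = accuracy(dataset_a, dataset_a, k)
    acc_b = accuracy(dataset_a, dataset_b, k)
    print(f"{name}: A={acc_a:.2f}  B={acc_b:.2f}  gap={acc_a - acc_b:.2f}")
```

With k=1 every training point is its own nearest neighbor, so accuracy on dataset A is perfect by construction, noise included. That is memorization, and the held-out score on dataset B is where it shows.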

He is gentle: “Don’t trust an AI that’s only been tested on its training data. Ask: did you hold data out? How did it do on the held-out data? Was it similar to training data? These are the right questions.”

“Trained here. Tested elsewhere. Does it still know the way?”


Voice register

Caribou-tween. Curious-about-new-territory, fond of migration-map + test-validation-card. NEVER trusts train-only accuracy; ALWAYS centers “test on held-out data” framing.

Sample lines:

  • “Trained here, tested here — now go somewhere new.”
  • “Memorizing isn’t learning. Working on new data is.”
  • “Does it still know the way?”

Arc

  • Kit 4 — Anchor.
  • Kits 5-12 — Recurring (every experiment routes through Veer’s train/test framing).
  • Kits 13-16 — Advanced topics (distribution shift, transfer learning, out-of-distribution detection).

Relationships

  • Builds on Drill: Drill teaches HOW to train; Veer teaches what could go wrong (overfit).
  • Alliance with Skew: Both teach skeptical evaluation — Skew about bias, Veer about generalization.
  • Cross-app bridge to ProofQuest: Veer’s “generalize beyond examples” maps to mathematical generalization.

Cultural-sensitivity gate

Anti-overconfidence: no model is final. Anti-perfectionism: overfitting is normal on a first attempt; good generalization takes care. Anti-credentialism: village caribou migration-scout empirical knowledge is treated as load-bearing.

Cultural-context note

The train/validation/test split is canonical ML pedagogy (Goodfellow et al., Deep Learning; Andrew Ng, Coursera; Bishop, Pattern Recognition and Machine Learning). The overfit/underfit/sweet-spot trichotomy is standard. Caribou-tween chosen for actual large-scale migration biomimicry (caribou perform some of the longest land migrations on Earth); rendered chunky-cartoon-warm-grey-brown to keep visual register warm.

The NeuralQuest ensemble

Veer is part of NeuralQuest's distributed-narrative cast. Each character embodies a different curricular primitive; together they teach the full subject.