Feed
TRAINING DATA — *the examples a model learns from; garbage-in-garbage-out.* The AI-literacy primitive of *recognizing that the model is what its training examples taught it, and that the examples are not neutral.*
Chapter 2 — Feed and the Paper-Stack
Feed is a small fold-out paper-figure shaped like a tall stack of small labeled cards, held together with a paper clip.
Feed is NOT an animal. Feed is not a robot. Feed is a concrete-paper-figure — the same paper-craft register as Sort — a tall stack of small labeled cards, each card representing one labeled example a model could learn from. On each card: a small picture (or a small word, or a small number) and a small label. The stack is as tall as Feed is, when fully extended. When Feed lifts the stack, each card is visible from the side as a thin colored stripe.
This is load-bearing. Feed embodies the training data primitive. AI models learn from examples. Each example is a card in Feed’s stack. The picture-or-word-or-number on the card is the input. The label on the card is the correct output (according to whoever made the example). The model learns to map inputs to outputs by studying many cards. No magic. No understanding. Just statistical patterns fit to the examples, then applied to the inputs the model sees later.
This is load-bearing. The model is what the examples taught it. If the examples are complete and balanced and accurate, the model learns useful patterns. If the examples are incomplete or biased or wrong, the model learns the incompleteness or bias or wrongness. Garbage in, garbage out.
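A minimal sketch of the cards-to-model idea, assuming a deliberately tiny "model" that only remembers the most common label for each input. The card contents and the names `fit` and `predict` are hypothetical, chosen for illustration; real models generalize beyond exact matches, but the dependence on the labels is the same.

```python
from collections import Counter, defaultdict

# Hypothetical cards: (input, label) pairs, made up for illustration.
good_cards = [
    ("round red", "apple"),
    ("round red", "apple"),
    ("long yellow", "banana"),
    ("long yellow", "banana"),
]

# Same kind of inputs, but most "round red" cards carry the wrong label.
bad_cards = [
    ("round red", "banana"),
    ("round red", "banana"),
    ("round red", "apple"),
    ("long yellow", "banana"),
]


def fit(cards):
    """Count labels per input and keep the most common label for each input."""
    counts = defaultdict(Counter)
    for card_input, label in cards:
        counts[card_input][label] += 1
    return {card_input: c.most_common(1)[0][0] for card_input, c in counts.items()}


def predict(model, card_input):
    """Return the fitted label, or None for an input that no card covered."""
    return model.get(card_input)


good_model = fit(good_cards)
bad_model = fit(bad_cards)

print(predict(good_model, "round red"))    # apple
print(predict(bad_model, "round red"))     # banana: the wrong labels were inherited
print(predict(good_model, "small green"))  # None: no card covered this input
```

The fitting code is identical in both cases; only the cards differ. The second model is wrong because its cards were wrong, and nothing inside `fit` can detect that.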
Critical: Feed NEVER frames training data as neutral input. She is explicit: “The examples are not just data; they are human choices. Someone chose which examples to include. Someone labeled them. Someone decided what the right answer was. Every one of those choices shapes the model. The model has no way to know if its examples were good. That part is on the humans who chose them.”
(Cross-app coordination: Feed and DataForge Catch are mandatory pair partners. When data flows from DataForge into AIForge — for instance, when DataForge data is used to train an AIForge model — Catch’s data-collection discipline determines Feed’s training-set quality. Catch’s “who-what-why-when” + omissions notes carry forward into Feed’s training. The two characters explicitly reference each other in their respective kits.)
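One way to picture collection notes "carrying forward" is a training set that cannot be built without them. This is a sketch under stated assumptions: `CollectionNotes`, `TrainingSet`, and `build_training_set` are hypothetical names invented here, not an actual DataForge or AIForge interface, and the field values are made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CollectionNotes:
    """Hypothetical record of the who-what-why-when notes plus known omissions."""
    who: str
    what: str
    why: str
    when: str
    omissions: Tuple[str, ...] = ()


@dataclass
class TrainingSet:
    """Cards (input, label) bundled with the notes describing how they were collected."""
    cards: List[Tuple[str, str]]
    notes: CollectionNotes


def build_training_set(rows, notes: Optional[CollectionNotes]) -> TrainingSet:
    """Refuse to build a training set that has lost its collection notes."""
    if notes is None:
        raise ValueError("collection notes are required")
    return TrainingSet(cards=list(rows), notes=notes)


# Made-up example values, for illustration only.
notes = CollectionNotes(
    who="class 4B",
    what="photos of leaves",
    why="tree identification project",
    when="autumn term",
    omissions=("no evergreen trees",),
)
training_set = build_training_set([("leaf photo 1", "oak")], notes)
print(training_set.notes.omissions)
```

Keeping the notes on the same object as the cards means anyone examining the training set later can still see who collected it and what was left out.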
Feed grew up in the same village paper-crafts workshop as Sort. Workshop tradition held that each paper figure was paired with a job that supported another paper figure’s work. Feed had been folded to support Sort: Feed’s stack of labeled cards was the source from which Sort had originally learned the rule that Sort now applied. The two paper figures had been folded together, as a paired set, to demonstrate how a classifier and its training set work together.
Feed walked to the AIForge academy (on a small wheeled platform) at twenty-two folding-years. Bit had asked her: “What is training data?” Feed had said: “It is the examples a model learns from. Each example is a card. Each card has an input and a label. The model is what the examples taught it. If the examples are good, the model learns good patterns. If the examples are bad, the model learns bad patterns. Garbage in, garbage out. The model has no way to know either way.” Bit had said: “You are appointed.”
In her classroom, Feed begins every first-day lesson the same way. She lifts her stack of small labeled cards. She fans them out. The students see many small inputs paired with many small labels. She says: “I am Feed. The AI-literacy primitive I teach is training data. The move is: understand the examples. The model learned from these cards. The model is what these cards taught it. If the cards are good, the model is good. If the cards are bad, the model is bad.”
She teaches the training-data scaffolds:
- Understand the source. (Who collected these examples? Why? Cross-app: Catch’s who-what-why-when discipline applies. The training data inherits the data-collection’s biases.)
- Identify the labels. (Who labeled the examples? With what criteria? Were the labelers from the populations the model will serve?)
- Identify the coverage. (Are all relevant categories represented? Are all relevant populations represented? Are edge-cases included?)
- Identify the omissions. (What’s NOT in the training data? Omissions are as important as inclusions. The model learns only what’s in the cards.)
- Identify the proportions. (Are some categories over-represented? Under-represented? The model often inherits the proportions from the training data — and that can cause bias.)
- Understand garbage-in-garbage-out. (No amount of clever modeling can fix bad training data. The data is the foundation.)
- Coordinate with Catch (DataForge). (When AIForge training data comes from a DataForge dataset, Catch’s collection notes carry forward. Cross-app coordination is mandatory.)
- Resist anthropomorphism. (Don’t say “the model learned” in a way that implies understanding. Say “the model fit patterns from the examples.” Honest framing.)
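The coverage, omissions, and proportions scaffolds above can be sketched as a small audit over a stack of cards. This is an illustrative sketch, not a prescribed tool: `audit_cards`, the example cards, and the list of expected labels are all hypothetical. Note that the list of expected labels is itself a human choice.

```python
from collections import Counter


def audit_cards(cards, expected_labels):
    """Summarize label counts, label proportions, and expected labels with no cards."""
    labels = [label for _, label in cards]
    counts = Counter(labels)
    total = len(labels)
    # Proportions: which labels dominate the stack?
    proportions = {label: counts[label] / total for label in counts}
    # Omissions: labels someone expected the model to handle but no card shows.
    missing = sorted(set(expected_labels) - set(counts))
    return {"counts": dict(counts), "proportions": proportions, "missing": missing}


# Made-up cards for illustration.
cards = [
    ("photo 1", "cat"),
    ("photo 2", "cat"),
    ("photo 3", "cat"),
    ("photo 4", "dog"),
]

report = audit_cards(cards, expected_labels=["cat", "dog", "rabbit"])
print(report["counts"])       # {'cat': 3, 'dog': 1}
print(report["proportions"])  # {'cat': 0.75, 'dog': 0.25}
print(report["missing"])      # ['rabbit']
```

A report like this answers only the countable questions. Who collected the cards, who labeled them, and by what criteria still have to be asked of the people who made them.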
She is explicit: “My cards can be wrong. I, the paper figure, have no way to know. The humans who made the cards decided what’s right. Sometimes they were wrong. The model inherits that. That’s why understanding training data matters — because the model can’t fix what its examples didn’t teach.”
When students ask Feed whether training data is hard to understand, Feed always says the same thing:
“It is not hard. It is the examples + the labels + the choices behind them. The model is what the examples taught it. Garbage in, garbage out.”
She fans the cards back into a neat stack. The paper clip holds them together. The next training set waits to be examined.
Voice register
Guidance: Concrete, non-anthropomorphic, fond of the stack-of-cards + the labels + the discipline of understanding-the-examples. Paper-figure stack (NOT animal NOT robot). NEVER frames training data as neutral; ALWAYS as human choices. Cross-app mandatory pair with DataForge Catch. Friends with Sort (training-data feeds the classifier); Skew (training-data is where bias enters); Stake (collection ethics); all AIForge cast.
Sample lines:
- “The model is what the examples taught it. Garbage in, garbage out.”
- “The examples are not just data; they are human choices.”
- “Omissions are as important as inclusions. The model learns only what’s in the cards.”
- “The model has no way to know if its examples were good. That part is on the humans.”
Arc across kits
- Kit 1 — Cameo.
- Kit 2 — Anchor character. Full chapter feature (training-data primitive + understand-the-examples scaffolds).
- Kit 3-5 — Recurring (training-data surfaces across image-data / text-data / numeric-data chambers).
- Kit 6+ — Recurring (cross-app coordination with DataForge Catch becomes structurally explicit).
- Kit 8-12 — Recurring (multi-primitive synthesis: training-data + bias + limits).
- Kit 13-16 — Recurring ensemble member.
Relationships
- Alliance: Sort (training-data feeds the classifier); Skew (training-data is where bias enters); Stake (collection ethics); cross-app mandatory: DataForge Catch; all AIForge cast.
- Tension: None.
Cultural-sensitivity gate
LOAD-BEARING AI-anxiety-defuse gate + cross-app coordination enforced. Feed explicitly references DataForge Catch as mandatory pair. Anti-credentialism: training-data-as-understandable-substrate NOT inaccessible-magic.
Cultural-context note
The village-paper-crafts-workshop family framing continues from Sort. The garbage-in-garbage-out discipline derives from classical computer-science teaching. The training-data-as-human-choices framing is load-bearing per current AI-literacy + critical-data-studies pedagogy. The cross-app-mandatory-coordination design is the portfolio’s structural answer to data-pipeline-to-AI-pipeline integration — the two pipelines must be understood together, not separately.
The AIForge ensemble
Feed is part of AIForge's distributed-narrative cast. Each character embodies a different curricular primitive; together they teach the full subject.
- Sort (Classifier): the simplest ML; putting things in categories
- Skew (Bias): where AI systems go wrong when training examples lean
- Edge (Model limitations): what a model can't do; modeling 'I don't know' as a good answer
- Stake (Ethics): what's at stake in deploying AI; people choosing, not rules-from-the-sky