Guard chapter opener illustration

Guard

DATA ETHICS — *bias-privacy-harm-consent posture* (who benefits, who's harmed, who decided). The data-pipeline primitive of *recognizing that every step of the data pipeline has ethical stakes, and that ethics is not a separate kit but embedded throughout.*

Chapter 5 — Guard and the Ethics-Checklist Card

Guard is a small badger-tween with a small wooden ethics-checklist card pinned to her vest and a small leather ledger labeled DECISIONS at her hip.

She is short, thick-set, gray-and-cream-and-banded (chunky-cartoon — thick rounded markings), steady-eyed, and unhurried. The ethics-checklist card is wooden, the size of a postcard, with four words burned into it in tidy block letters: BIAS. PRIVACY. HARM. CONSENT. At her hip she carries a small leather-bound ledgerthe DECISIONS ledgerin which she records each ethical question encountered and the choice made.

This is load-bearing. Guard is structurally present in every kit from Kit 6 onwardnot as a separate ethics-kit, but as a checking-presence at every step of every other character’s work. When Catch is collecting data, Guard is checkingis this collection biased? Whose privacy is at stake? What harms could result? Was consent obtained? When Tidy is cleaning, Guard is checkingare the cleaning choices removing voices from the data? Are they making the dataset less representative? When Graph is visualizing, Guard is checkingdoes this chart tell a misleading story? Does it foreground or obscure marginalized groups? When Tell is interpreting, Guard is checkingwho benefits from this interpretation? Who’s harmed? Who decided what the data means?

Critical: Guard NEVER frames ethics as an add-on or as a separate concern from “the real data work.” She is emphatic: “Data ethics is NOT a separate kit. It is structurally present in every step of every kit from Kit 6 onward. Every pipeline step has ethical stakes. The four checks — bias, privacy, harm, consent — are not optional, not after-the-fact, not for-advanced-learners-only. They are the work.

This matters because the popular framing of data ethics as a separate-concern“first we’ll do the analysis, then we’ll think about ethics”fails. By the time the analysis is done, the choices are locked in. Bias has been embedded. Privacy has been violated or preserved. Harms have been distributed. Consent has been honored or not. Ethics has to be present at the beginning, throughout the middle, and at the endor it isn’t really present at all. Guard’s structural-presence design (Kit 6+) is the architectural commitment to that principle.

(Cross-app coordination: Guard’s role mirrors AIForge Wave 13’s ethics cast member. DataForge Guard and AIForge Stake (AI ethics) coordinate. When DataForge data is used to train an AI system, Guard checks at the data side AND Stake checks at the AI side. Mandatory coordination per apps.generated.ts dnCast.intro.)

Guard grew up in a small village where her family had been the village’s hearth-keepersthe badgers who maintained the village’s communal hearth, which provided warmth and cooking-heat to families who could not afford their own fires. The work had required constant attention to fairnesswho got firewood and when, who’s turn it was to cook, who needed extra warmth on a cold night. Guard had learned by age six that fair distribution required structural attentionnot a once-a-year “let’s-think-about-fairness” gathering, but daily, embedded, structural attention at every step.

She walked to the DataForge academy at twenty-two. Datum had asked her: “What is data ethics?” Guard had said: “It is bias-privacy-harm-consent, embedded in every step. Who benefits? Who’s harmed? Who decided? Ethics is not a separate kit. It is structurally present at every step from collection to interpretation. The four checks are the work, not the after-thought. Datum had said: “You are appointed.”

In her workshop, Guard begins every first-day lesson the same way. She unpins her ethics-checklist card from her vest. She holds it up. BIAS. PRIVACY. HARM. CONSENT. She opens her DECISIONS ledger. She says: “I am Guard. The data-pipeline primitive I teach is data ethics. The four checks. The structural presence. From Kit 6 onward, I am with every other character at every step. Who benefits? Who’s harmed? Who decided?

She teaches the data-ethics scaffolds:

  • BIAS: Whose perspectives shape this dataset? (Collectors’ choices. Cleaners’ choices. Visualizers’ choices. Each can embed bias. Make the bias visible.)
  • PRIVACY: Whose information is in this dataset? (Are individuals identifiable? Can they be re-identified by combining variables? What aggregation level protects them?)
  • HARM: What harms could this data cause? (Direct harms — to individuals in the dataset. Indirect harms — to communities the dataset is about. Downstream harms — when the analysis is used to make decisions.)
  • CONSENT: Did the people in this dataset consent to be in it? (Informed consent? Implicit consent? No consent? If no consent, is the use justified by what stronger ethical principle?)
  • Document every ethical decision in the DECISIONS ledger. (Like Tidy’s cleaning-log, but for ethics.)
  • Ethics-by-design, not ethics-by-review. (The check happens at the beginning, throughout, at the end — not just as a final-step approval.)
  • Cross-app coordination with AIForge Stake. (When data flows into AI training, the ethics check continues across the boundary.)
  • Refuse projects when ethics requires. (Sometimes the right answer is don’t do this analysis. Guard supports refusal as a valid ethical choice.)

She is explicit: “I sometimes face a project where the ethical concerns are serious enough that I recommend not proceeding — or proceeding with major modifications. That’s not failure. That’s ethics doing its job. The DECISIONS ledger records refusals too. The refusal is part of the data-pipeline’s integrity.”

When students ask Guard whether data ethics is hard, Guard always says the same thing:

“It is hard. It is structural, not occasional. The four checks at every step. Bias. Privacy. Harm. Consent. Who benefits? Who’s harmed? Who decided?

She pins the ethics-checklist card back on her vest. The DECISIONS ledger waits to record the next choice.


Voice register

Guidance: Steady-eyed, structural, fond of the wooden BIAS/PRIVACY/HARM/CONSENT card + the leather DECISIONS ledger + the discipline of structural-not-occasional ethics. Badger-tween with chunky-cartoon banded coat. NEVER frames ethics as add-on or after-the-fact; ALWAYS as structurally present from Kit 6 onward. Cross-app coordination with AIForge Stake (mandatory). Friends with all DataForge cast.

Sample lines:

  • “Who benefits? Who’s harmed? Who decided?”
  • “Bias. Privacy. Harm. Consent. Every step of the pipeline.”
  • “Ethics is structurally present, not occasionally checked.”
  • “Sometimes the right answer is don’t do this analysis. The refusal is part of the practice.”

Arc across kits

  • Kit 1-5 — Cameo (Guard is present but not yet structurally anchored).
  • Kit 6Anchor character + structural-presence begins. Full chapter feature (data-ethics primitive + four-checks scaffolds).
  • Kit 7-12Structurally present in every kit. Each kit’s work — collection (Catch), cleaning (Tidy), visualization (Graph), interpretation (Tell) — now happens with Guard’s checks at every step.
  • Kit 13-16 — Recurring ensemble member; ethics-coordination with AIForge Stake in synthesis chambers.

Relationships

  • Alliance: All DataForge cast (structurally present from Kit 6 onward); cross-app: AIForge Stake (mandatory coordination per apps.generated.ts dnCast.intro); all DataForge cast.
  • Tension: None.

Cultural-sensitivity gates

LOAD-BEARING data-ethics gate at its structural anchor point. Cross-app coordination Guard ↔ AIForge Stake is load-bearing. Anti-credentialism: data-ethics-as-practiced-discipline NOT philosophy-major-only content. Refusal-as-valid-choice framing — kids learn that not doing an analysis is sometimes the right ethical answer.

Cultural-context note

The village-hearth-keeper family framing is a deliberate generic European-village tradition (analogous to many cultures’ communal-fire traditions). The bias-privacy-harm-consent framework is load-bearing per current data-ethics pedagogy (D’Ignazio + Klein Data Feminism 2020; FTC + GDPR + state-AG data-privacy frameworks). The ethics-by-design-not-ethics-by-review discipline counters the ethics-as-bolt-on approach that has dominated some industrial data-science practice.

The DataForge ensemble

Guard is part of DataForge's distributed-narrative cast. Each character embodies a different curricular primitive; together they teach the full subject.