Summary

An edtech vendor deploying AI to minors operates under FERPA, COPPA, state student-privacy laws, and rising efficacy-claim scrutiny. Governance is not a compliance afterthought; it is the license to sell into schools and to parents. This playbook covers student-data privacy for AI training and inference, defensible learning-outcome claims, bias and equity testing across learner populations, safety controls for content shown to children, and the transparency records districts now require in procurement. The vendors that formalize these controls close enterprise deals faster and survive the incident that eventually tests every AI product.

Context

Governance is the edtech vendor's license to operate

Selling AI into education means selling to a buyer legally responsible for children. FERPA governs education records, COPPA governs data collection from children under 13, and more than 120 state student-privacy laws add contractual demands on top. Districts now send AI-specific security and privacy questionnaires before a pilot, and a single mishandled data flow can end a multi-year contract. For a vendor, governance is not overhead. It is the gate that decides whether procurement even starts.

The scrutiny has widened from privacy to efficacy and safety. Regulators and buyers challenge unsubstantiated learning claims, districts ask how models were tested for bias across student demographics, and parents ask what a chatbot will and will not say to a fourteen-year-old. A vendor that cannot answer these in writing, with logs and test results, loses to one that can. The governance stack below is what an enterprise education buyer expects to see, and increasingly what a state contract requires.

Districts also increasingly require a named data protection contact, a signed student privacy pledge or equivalent, and a clear statement of which subprocessors touch student data, so governance is now a documented contractual artifact rather than a verbal assurance during a sales call. Vendors that treat these as living controls, reviewed each renewal and each model change, keep contracts through leadership turnover at the district; vendors that file them once and forget them fail the next audit.

The framework

Five governance domains every AI edtech vendor must own

Treat each domain as a control with an owner, a written policy, and evidence you can hand a district in procurement.

| Governance domain | Core requirement | Evidence districts expect |
| --- | --- | --- |
| Student data privacy | FERPA and COPPA compliance; no training on student PII without a lawful basis and contract terms | Data processing agreement, data flow map, retention and deletion policy, subprocessor list |
| Efficacy claims | Learning-outcome claims backed by controlled evidence, not testimonials | Study design, control cohort results, ESSA evidence tier where applicable |
| Bias and equity | Model outputs tested across race, language, disability, and socioeconomic segments | Disaggregated performance results, remediation log, ongoing monitoring plan |
| Safety for minors | Content filters and refusal behavior tuned for children by age band | Safety policy, red-team results, incident and escalation procedure |
| Transparency and explainability | Disclose model use, data sources, and how automated decisions are made | Model cards, provenance on AI outputs, human-review points documented |

Recommended actions

Build the governance stack procurement asks for

  • Publish a data processing agreement and data flow map that state plainly whether student data ever trains a model, and default to not training on student PII.
  • Attach provenance to every AI output shown in the product: the source content, the model and version, and whether a human reviewed it, so nothing is a black box.
  • Run disaggregated bias testing before launch and on a schedule after, checking accuracy and tone across language, disability, and demographic segments, and log every fix.
  • Tune safety behavior by age band and red-team the tutor with adversarial prompts a real student would try, then document refusal and escalation paths.
  • Hold every consequential claim, especially learning-growth numbers, to a controlled study standard before marketing or sales can use it.
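
The provenance item in the list above can be sketched as a small record attached to each output. This is a minimal illustration under stated assumptions, not a prescribed schema: `ProvenanceRecord`, `attach_provenance`, and every field name are hypothetical, and a real product would align them with its own data model and with what its districts ask for.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical provenance metadata for one AI output (field names are assumptions)."""
    output_id: str
    source_content_ids: tuple[str, ...]  # the source content the output drew on
    model_name: str
    model_version: str
    human_reviewed: bool
    reviewer_id: Optional[str] = None

    def is_complete(self) -> bool:
        """True when source, model/version, and review status are all recorded."""
        has_sources = len(self.source_content_ids) > 0
        has_model = bool(self.model_name) and bool(self.model_version)
        # A reviewed output must name its reviewer; an unreviewed one must not.
        review_consistent = self.human_reviewed == (self.reviewer_id is not None)
        return has_sources and has_model and review_consistent


def attach_provenance(output_text: str, record: ProvenanceRecord) -> dict:
    """Bundle learner-facing text with its provenance; refuse if anything is missing."""
    if not record.is_complete():
        raise ValueError(f"incomplete provenance for output {record.output_id}")
    return {"text": output_text, "provenance": asdict(record)}
```

Raising on an incomplete record, rather than logging and continuing, is one way to make "nothing is a black box" an enforced property instead of a convention. Whether that strictness suits a given product is a design decision.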

Common pitfalls

Governance failures that end contracts

  • Quietly training or fine-tuning on student interaction data without contract terms or parental notice, which can violate COPPA and FERPA obligations and destroys district trust the moment it surfaces.
  • Publishing outcome claims from a self-selected pilot with no control, then facing an efficacy or advertising challenge you cannot substantiate.
  • Testing the model only on aggregate accuracy and missing that it underperforms for English-language learners or students with disabilities.
  • Treating safety as a launch checkbox and skipping ongoing red-teaming, so a jailbreak that exposes a minor to harmful content emerges in production instead of in review.

Metrics that matter

Track governance as measurable controls

  • Share of AI outputs carrying full provenance metadata, targeting 100 percent for anything shown to a learner.
  • Disaggregated performance gap: maximum accuracy or safety difference across demographic and language segments.
  • Safety incident rate and mean time to remediate a reported unsafe or harmful output.
  • Procurement pass rate: percentage of district security and privacy reviews cleared without remediation.
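
The four metrics above could be computed along these lines. This is a sketch under assumptions: the dictionary keys (`shown_to_learner`, `has_full_provenance`, `segment`, `correct`, `reported_at`, `remediated_at`, `remediation_required`) are hypothetical, and the incident rate is expressed per 1,000 sessions only because the text does not fix a denominator.

```python
from datetime import datetime
from statistics import mean


def provenance_coverage(outputs: list[dict]) -> float:
    """Share of learner-facing outputs that carry full provenance metadata."""
    shown = [o for o in outputs if o["shown_to_learner"]]
    if not shown:
        return 1.0  # nothing shown to a learner, so nothing is missing
    return sum(o["has_full_provenance"] for o in shown) / len(shown)


def max_segment_gap(results: list[dict]) -> float:
    """Largest accuracy difference between the best and worst segment."""
    by_segment: dict[str, list[bool]] = {}
    for r in results:
        by_segment.setdefault(r["segment"], []).append(r["correct"])
    accuracies = [sum(v) / len(v) for v in by_segment.values()]
    return max(accuracies) - min(accuracies)


def incidents_per_1000_sessions(incident_count: int, session_count: int) -> float:
    """Safety incident rate, normalized per 1,000 sessions (assumed denominator)."""
    return 1000 * incident_count / session_count


def mean_hours_to_remediate(incidents: list[dict]) -> float:
    """Mean hours from report to fix, over incidents that have been remediated."""
    durations = [
        (i["remediated_at"] - i["reported_at"]).total_seconds() / 3600
        for i in incidents
        if i["remediated_at"] is not None
    ]
    return mean(durations)  # raises StatisticsError if nothing is remediated yet


def procurement_pass_rate(reviews: list[dict]) -> float:
    """Share of district reviews cleared without a remediation request."""
    return sum(not r["remediation_required"] for r in reviews) / len(reviews)
```

Open incidents are excluded from the remediation mean here, which flatters the number while a backlog exists; reporting the open-incident count alongside it would be one way to offset that.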

FAQ

Frequently asked questions

Can we train our models on student data?

Default to no. Under COPPA and FERPA, using student PII to train models needs a clear lawful basis, contract terms with the school, and often parental notice. Many districts flatly prohibit it. Design so your product improves through vetted content, synthetic data, and consented non-student data, and state your training position plainly in the data processing agreement.
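
One way to make "default to no" concrete is a default-deny gate in the training data pipeline. The sketch below is an illustration, not legal guidance: `TrainingCandidate`, `DataSource`, and `eligible_for_training` are hypothetical names, and requiring all three conditions (lawful basis, contract terms, parental notice) is a deliberately strict assumption that goes beyond the "often" in the answer above.

```python
from dataclasses import dataclass
from enum import Enum


class DataSource(Enum):
    VETTED_CONTENT = "vetted_content"
    SYNTHETIC = "synthetic"
    CONSENTED_NON_STUDENT = "consented_non_student"
    STUDENT_INTERACTION = "student_interaction"


@dataclass(frozen=True)
class TrainingCandidate:
    """Hypothetical record considered for a training set (fields are assumptions)."""
    record_id: str
    source: DataSource
    contains_student_pii: bool
    lawful_basis_documented: bool = False
    contract_permits_training: bool = False
    parental_notice_given: bool = False


def eligible_for_training(record: TrainingCandidate) -> bool:
    """Default-deny: student data is excluded unless every permission flag is set."""
    is_student_data = (
        record.source is DataSource.STUDENT_INTERACTION
        or record.contains_student_pii
    )
    if not is_student_data:
        return True
    # Assumption: treat all three conditions as required, the strictest reading.
    return (
        record.lawful_basis_documented
        and record.contract_permits_training
        and record.parental_notice_given
    )
```

Because the permission flags default to `False`, a record that arrives without explicit metadata is excluded, so a missing field fails closed rather than open.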

What efficacy evidence do education buyers actually accept?

Controlled evidence, not testimonials. Buyers increasingly reference the ESSA evidence tiers, which run from a demonstrated rationale (Tier 4) through correlational studies with statistical controls (Tier 3) and quasi-experimental studies (Tier 2) up to randomized controlled trials (Tier 1). A matched control cohort showing your AI-assisted group outperformed a comparable group is the credible floor. Testimonials and uncontrolled before-after numbers invite challenge.
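
A comparison against a control cohort is usually reported as a standardized effect size rather than a raw difference. The sketch below computes Cohen's d with a pooled standard deviation, one common choice that the text itself does not name. The function name `cohens_d` and its inputs are illustrative. It says nothing about whether the cohorts were properly matched and does not by itself place a study in any ESSA tier.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment: list[float], control: list[float]) -> float:
    """Standardized mean difference between two cohorts, using pooled variance."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)
```

The inputs would be per-student learning gains (post-test minus pre-test) for the AI-assisted cohort and the comparison cohort. Each list needs at least two values, since sample variance is undefined for a single observation.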

How do we prove our AI is not biased against certain students?

Test disaggregated, not just aggregate. Measure accuracy, tone, and safety separately across race, primary language, disability status, and socioeconomic proxy, document the gaps, remediate, and re-test on a schedule. Hand the district the disaggregated results and your monitoring plan.