A four-quarter AI roadmap for a software company sequences from foundation to governed AI-native product. Quarter one builds the data and eval foundation and ships an internal coding assistant. Quarter two adds support deflection and a governance spine of approval gates and provenance. Quarter three ships the first customer-facing AI feature behind eval and human-review gates. Quarter four scales AI-native capabilities with inference cost under control and EU AI Act classification complete. The through-line is that each quarter earns the next: no customer-facing AI ships until the eval harness, tenant isolation, and provenance kernel are in place and proven.
Sequence foundation before customer-facing AI
The most common AI roadmap failure in software companies is shipping a customer-facing feature before the foundation exists to make it safe or measurable. A demo built on an unlabeled dataset and a shared retrieval index looks impressive and then leaks data or degrades quietly in production. A durable roadmap inverts that: it spends the first two quarters on the data foundation, the eval harness, tenant-isolated retrieval, and the governance spine, and only then ships to customers. The internal surfaces, coding assistants and support deflection, come first precisely because a mistake there does not reach a customer while the team builds the muscles it needs.
The payoff of sequencing is compounding. By the time the first customer-facing AI feature ships in quarter three, the eval harness already measures quality, provenance already attaches to every output, and inference cost is already modeled as COGS. Quarter four then scales AI-native capabilities on a base that is governed, measurable, and margin-aware, with EU AI Act classification complete. Each quarter is designed to earn the next, so the company never ships an AI output it cannot explain, isolate, or afford.
The governance spine built in quarter two is the pivot of the whole plan. Approval gates separate drafts from approved outputs, provenance attaches source documents, retrieval IDs, model, and prompt version to every recommendation, and audit logs make engagement activity queryable by tenant, actor, and time. Building this before the first customer-facing release means the spine gates that release rather than being retrofitted onto an incident. Teams that invert the order almost always end up rebuilding under pressure after a leak or a wrong answer reaches a customer, at far higher cost than doing it in sequence.
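To make the governance spine concrete, here is a minimal sketch of a provenance record and an approval gate. The class and function names (`Provenance`, `Recommendation`, `approve`) are hypothetical and chosen for illustration; only the four provenance fields come from the description above. A real system would persist these records rather than hold them in memory.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Status(Enum):
    DRAFT = "draft"
    APPROVED = "approved"


@dataclass(frozen=True)
class Provenance:
    # The four fields the governance spine attaches to every recommendation.
    source_documents: Tuple[str, ...]
    retrieval_ids: Tuple[str, ...]
    model: str
    prompt_version: str

    def is_complete(self) -> bool:
        # Every field must be non-empty for the output to be explainable.
        return bool(
            self.source_documents
            and self.retrieval_ids
            and self.model
            and self.prompt_version
        )


@dataclass
class Recommendation:
    tenant_id: str
    text: str
    provenance: Provenance
    status: Status = Status.DRAFT  # every output starts as a draft
    approved_by: Optional[str] = None


def approve(rec: Recommendation, reviewer: str) -> Recommendation:
    """Approval gate: promote a draft only if its provenance is complete."""
    if not rec.provenance.is_complete():
        raise ValueError("cannot approve an output with incomplete provenance")
    rec.status = Status.APPROVED
    rec.approved_by = reviewer
    return rec
```

The design choice worth noting is that the gate refuses approval instead of warning, so an output without provenance can never reach the approved state by accident.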
The four-quarter AI-native roadmap
Each quarter has a theme, a shipped capability, and a gate that must pass before the next quarter begins.
| Quarter | Theme and capability shipped | Gate to advance |
|---|---|---|
| Q1 | Data and eval foundation, internal coding assistant | Eval harness live, telemetry cleaned |
| Q2 | Support deflection, governance spine | Approval gates and provenance in place |
| Q3 | First customer-facing AI feature | Passes eval, tenant isolation proven |
| Q4 | Scale AI-native features, cost control | Inference COGS modeled, EU AI Act classification complete |
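One way to keep the table honest is to encode it as data and let the gates, not the calendar, decide which quarter is active. The sketch below is an assumption about how a team might track this; the check names are hypothetical labels derived from the "Gate to advance" column.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Quarter:
    name: str
    gate: Dict[str, bool]  # gate check name -> evidence of passing exists

    def gate_passed(self) -> bool:
        return all(self.gate.values())


def build_roadmap() -> List[Quarter]:
    # Check names are illustrative labels for the gates in the table.
    return [
        Quarter("Q1", {"eval_harness_live": False, "telemetry_cleaned": False}),
        Quarter("Q2", {"approval_gates_in_place": False, "provenance_in_place": False}),
        Quarter("Q3", {"passes_eval": False, "tenant_isolation_proven": False}),
        Quarter("Q4", {"inference_cogs_modeled": False, "ai_act_classified": False}),
    ]


def active_quarter(roadmap: List[Quarter]) -> str:
    """Return the first quarter whose gate has not fully passed.

    Later quarters are never consulted before earlier ones, so work
    finished out of order cannot be used to skip an unfinished gate.
    """
    for quarter in roadmap:
        if not quarter.gate_passed():
            return quarter.name
    return "complete"
```

For example, marking every Q2 check as passed while Q1 is still open leaves `active_quarter` at `"Q1"`, which is the behavior the gating rule asks for.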
Execute the roadmap gate by gate
- In quarter one, clean product telemetry, curate a versioned eval dataset, and ship the internal coding assistant so the team learns on a low-risk surface.
- In quarter two, stand up support deflection on tenant-scoped retrieval and build the governance spine of approval gates and provenance before any customer-facing release.
- In quarter three, ship one customer-facing AI feature only after it passes the eval harness and tenant-isolation tests, with human review on consequential outputs.
- In quarter four, scale AI-native capabilities with inference modeled as COGS, model tiering in place, and EU AI Act classification completed for each feature.
- Hold each quarter to its advancement gate and refuse to skip ahead, so no surface ships without the foundation that makes it safe and measurable.
- Make audit logs queryable by tenant, actor, and time range from quarter two onward, so engagement-level accountability exists before the first customer-facing feature goes live.
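The audit-log requirement in the last step can be sketched with an in-memory SQLite table. The schema, the function names, and the choice of SQLite are assumptions made for illustration, and timestamps are stored as ISO 8601 UTC strings so that string comparison matches time order.

```python
import sqlite3
from typing import List, Tuple


def open_audit_log() -> sqlite3.Connection:
    # In-memory database for illustration; a real log would be durable.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit_log ("
        "tenant_id TEXT NOT NULL, actor TEXT NOT NULL, "
        "ts TEXT NOT NULL, action TEXT NOT NULL)"
    )
    # The index follows the three query dimensions: tenant, actor, time.
    conn.execute("CREATE INDEX idx_audit ON audit_log (tenant_id, actor, ts)")
    return conn


def record(conn: sqlite3.Connection, tenant_id: str, actor: str,
           ts: str, action: str) -> None:
    conn.execute(
        "INSERT INTO audit_log VALUES (?, ?, ?, ?)",
        (tenant_id, actor, ts, action),
    )


def query_audit_log(conn: sqlite3.Connection, tenant_id: str, actor: str,
                    start: str, end: str) -> List[Tuple[str, str]]:
    """Return (ts, action) rows for one tenant and actor in [start, end)."""
    cursor = conn.execute(
        "SELECT ts, action FROM audit_log "
        "WHERE tenant_id = ? AND actor = ? AND ts >= ? AND ts < ? "
        "ORDER BY ts",
        (tenant_id, actor, start, end),
    )
    return cursor.fetchall()
```

The tenant filter is a required argument, not an optional one, so a caller cannot query across tenants by omission.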
Roadmap traps to avoid
- Shipping a customer-facing AI feature in quarter one on an unlabeled dataset and shared retrieval, before any eval or isolation exists.
- Building the governance spine after launch as a retrofit, when it should gate the first customer-facing release.
- Scaling AI-native features before inference cost is modeled as COGS, letting margin erode as usage grows.
- Treating the roadmap as fixed dates rather than gates, so unfinished foundations get skipped to hit a deadline.
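The shared-retrieval trap is the easiest one to guard against in code. The following is a minimal sketch of tenant-scoped retrieval with an isolation check; the names `Chunk`, `TenantScopedIndex`, and `isolation_holds` are hypothetical, and simple keyword matching stands in for whatever vector or hybrid search a real system would use.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    doc_id: str
    text: str


class TenantScopedIndex:
    """Toy index: keyword matching is a stand-in for real retrieval."""

    def __init__(self) -> None:
        self._chunks: List[Chunk] = []

    def add(self, chunk: Chunk) -> None:
        self._chunks.append(chunk)

    def search(self, tenant_id: str, query: str) -> List[Chunk]:
        if not tenant_id:
            raise ValueError("tenant_id is required for every search")
        terms = query.lower().split()
        # The tenant filter is applied before any text matching.
        return [
            c for c in self._chunks
            if c.tenant_id == tenant_id
            and any(term in c.text.lower() for term in terms)
        ]


def isolation_holds(index: TenantScopedIndex, tenant_id: str,
                    probe_queries: List[str]) -> bool:
    """Isolation test: no probe query may return another tenant's chunk."""
    return all(
        hit.tenant_id == tenant_id
        for query in probe_queries
        for hit in index.search(tenant_id, query)
    )
```

A useful probe set includes terms that only appear in other tenants' documents, since those are the queries most likely to expose a leak.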
Roadmap health indicators
- Gate pass status per quarter, so advancement is earned by evidence rather than calendar.
- Eval coverage of shipped AI features, targeting full coverage before any customer-facing release.
- Inference cost per active user by quarter four, held under the ceiling that pricing supports.
- Share of AI features with completed EU AI Act classification and attached provenance in production.
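Two of these indicators reduce to simple ratios, sketched below. The function names and the convention that an empty feature set counts as fully covered are assumptions for illustration, and the cost ceiling is whatever figure the pricing model supplies.

```python
from typing import Set


def eval_coverage(shipped_features: Set[str],
                  features_with_evals: Set[str]) -> float:
    """Fraction of shipped AI features that have an eval suite."""
    if not shipped_features:
        return 1.0  # assumption: nothing shipped means nothing uncovered
    covered = shipped_features & features_with_evals
    return len(covered) / len(shipped_features)


def inference_cost_per_active_user(total_inference_cost: float,
                                   active_users: int) -> float:
    """Inference spend for a period divided by active users in that period."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return total_inference_cost / active_users


def within_cost_ceiling(cost_per_user: float, ceiling: float) -> bool:
    """True when per-user inference cost stays at or under the ceiling."""
    return cost_per_user <= ceiling
```

The same ratio shape as `eval_coverage` would work for the share of features with completed EU AI Act classification, by swapping in the set of classified features.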
Frequently asked questions
Why not ship a customer-facing AI feature in the first quarter?
Because the foundation that makes it safe and measurable does not exist yet. Without an eval harness you cannot prove it is correct, and without tenant-isolated retrieval a feature can leak data. Internal surfaces first let the team build those muscles where a mistake does not reach a customer.
What has to be true before we ship AI to customers?
The eval harness must gate the feature, retrieval must be tenant-isolated and tested, provenance must attach to every output, and consequential outputs must pass a human review gate. That is the quarter-three gate, and skipping it is how demos become incidents.
How long does it take to become AI-native?
Roughly four quarters if sequenced deliberately: two quarters on data, eval, and governance foundations, one to ship the first governed customer-facing feature, and one to scale with inference cost controlled and EU AI Act classification complete. Each quarter earns the next.