When AI Becomes Labour, Not Software
Context
Most AI products are still designed as tools. Smarter tools, faster tools, but tools all the same.
At the same time, a different reality is emerging. AI systems are executing work end-to-end, operating continuously, and being evaluated on outcomes rather than usage. In practice, they are functioning as employees.
This case study explores what changes when AI is treated as labour rather than software, and how product decisions must evolve as a result.
The Problem
Treating AI as “just another feature” creates three systemic failures.
1. Broken ownership
Software has users. Labour has accountability. Most AI products define neither clearly.
2. Misaligned value measurement
Feature adoption metrics fail to capture whether work is actually getting done.
3. Organisational friction
Teams bolt AI onto workflows without redesigning handoffs, escalation paths, or governance.
The result is predictable. Impressive demos, stalled pilots, and limited real-world impact.
After several pilots showed strong model performance but weak operational adoption, I pushed to reframe the core question:
How should products be designed when AI is expected to behave like a member of the workforce?
Reframing the Product
The critical shift was conceptual, not technical.
Instead of asking:
“What tasks can AI assist with?”
The product lens became:
“What work can this AI own, and under what conditions should it stop?”
That reframing drove every downstream decision, from system boundaries to pricing and compliance.
Key Product Decisions
AI Requires an Employment Model, Not a Feature Spec
Once AI is treated as labour, it needs the same structural primitives as a human worker:
A defined scope of responsibility
Clear authority boundaries
Performance expectations
Escalation rules
Offboarding mechanisms
This led to designing AI agents with:
Explicit job definitions rather than open-ended capabilities
Task ownership that could be audited
Hard stop conditions instead of silent failure modes
This reduced operational risk and materially increased trust in regulated deployment environments.
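To make those primitives concrete, here is a minimal sketch of what an explicit job definition for an agent could look like. The structure and field names (JobDefinition, stop_conditions, and so on) are illustrative assumptions, not the production schema behind this case study.

```python
from dataclasses import dataclass
from typing import List

# Illustrative only: names and fields are assumptions, not the real schema.
@dataclass
class JobDefinition:
    """An explicit 'employment contract' for an AI agent."""
    role: str                      # defined scope of responsibility
    allowed_actions: List[str]     # authority boundaries
    performance_targets: dict      # expectations the agent is measured against
    escalation_rules: List[str]    # conditions that hand work to a human
    stop_conditions: List[str]     # hard stops instead of silent failure
    offboarding: str = "disable agent and return open tasks to the human queue"

triage_agent = JobDefinition(
    role="First-line triage of inbound cases",
    allowed_actions=["classify", "draft_response", "request_information"],
    performance_targets={"resolution_completeness": 0.9},
    escalation_rules=["risk class is high", "requester asks for a human"],
    stop_conditions=["confidence below threshold", "required data missing"],
)
```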
Human–AI Handoffs Must Be Designed, Not Assumed
Most failed AI implementations collapse at the handoff.
The product explicitly separated work into:
Low-risk autonomous execution
Conditional execution with approval
Mandatory human control
Rather than generic “human-in-the-loop” assumptions, handoffs were triggered by:
Confidence thresholds
Risk classification
Contextual signals such as ambiguity or emotional volatility
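As a rough illustration of how these triggers can combine, the sketch below routes a task into one of the three execution tiers. The signal names and cut-offs are hypothetical; the real classification logic is not reproduced here.

```python
# Hypothetical routing logic: signal names and cut-offs are assumptions,
# shown only to illustrate trigger-based handoffs rather than a generic
# "human-in-the-loop" toggle.
def execution_tier(confidence: float, risk_class: str,
                   ambiguous: bool, emotionally_charged: bool) -> str:
    """Map task signals to one of the three autonomy tiers."""
    if risk_class == "high" or emotionally_charged:
        return "mandatory_human_control"
    if confidence < 0.8 or ambiguous or risk_class == "medium":
        return "conditional_execution_with_approval"
    return "autonomous_execution"

assert execution_tier(0.95, "low", False, False) == "autonomous_execution"
assert execution_tier(0.95, "low", True, False) == "conditional_execution_with_approval"
assert execution_tier(0.99, "high", False, False) == "mandatory_human_control"
```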
In one healthcare workflow, early versions that optimised for throughput ended up increasing downstream clinical review time. Reclassifying the agent as a “junior worker” with mandatory escalation thresholds reduced total human time per case, despite slower raw execution.
In a revenue workflow, the same handoff principle was codified as explicit decision thresholds, ownership, and override rules.
Decision thresholds
Below 40 percent risk: informational only
40–65 percent: manager review recommended
Above 65 percent: mandatory human review with action logging
Ownership and review
Thresholds were owned by RevOps
Reviewed quarterly with Sales and Finance leadership
Adjusted as sales motion or market conditions changed
Overrides
All recommendations could be overridden
Overrides required a reason
Override data fed back into evaluation and process review
This preserved judgement while enforcing accountability.
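A minimal sketch of how these rules could be enforced is shown below. The function and record names are assumptions made for illustration; the percentage cut-offs mirror the thresholds listed above.

```python
from datetime import datetime, timezone

# Illustrative enforcement of the review thresholds and override logging.
# Names and fields are assumptions, not the production implementation.
override_log = []  # fed back into evaluation and process review

def review_requirement(risk_pct: float) -> str:
    """Map a risk score (0-100) to the review tier defined above."""
    if risk_pct > 65:
        return "mandatory_human_review"      # with action logging
    if risk_pct >= 40:
        return "manager_review_recommended"
    return "informational_only"

def record_override(recommendation: str, decision: str, reason: str) -> None:
    """Any recommendation can be overridden, but a reason is required."""
    if not reason.strip():
        raise ValueError("An override must include a reason.")
    override_log.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "recommendation": recommendation,
        "decision": decision,
        "reason": reason,
    })

# Example: a manager overrides a flagged recommendation with a documented reason.
record_override("mandatory_human_review", "approved",
                "Existing customer, renewal terms unchanged")
```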
Performance Is Measured on Outcomes, Not Activity
Traditional software metrics were deliberately deprioritised.
Instead of:
Usage
Engagement
Feature adoption
The AI was evaluated like labour:
Cost per unit of work
Resolution completeness
Time to outcome
Human oversight load
This surfaced uncomfortable truths early, particularly where AI created downstream rework rather than genuine efficiency. It also made ROI discussions concrete rather than speculative.
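For illustration, the sketch below computes those four measures from a list of completed task records. The record fields and figures are hypothetical; the point is that every metric describes the work, not the interface.

```python
# Hypothetical task records; field names and values are placeholders.
tasks = [
    {"cost": 0.42, "completed": True,  "minutes_to_outcome": 18, "human_minutes": 2},
    {"cost": 0.38, "completed": False, "minutes_to_outcome": 55, "human_minutes": 12},
    {"cost": 0.45, "completed": True,  "minutes_to_outcome": 22, "human_minutes": 0},
]

n = len(tasks)
labour_metrics = {
    "cost_per_unit_of_work": sum(t["cost"] for t in tasks) / n,
    "resolution_completeness": sum(t["completed"] for t in tasks) / n,
    "time_to_outcome_minutes": sum(t["minutes_to_outcome"] for t in tasks) / n,
    "human_oversight_minutes_per_task": sum(t["human_minutes"] for t in tasks) / n,
}
print(labour_metrics)
```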
Pricing Must Reflect Labour Economics, Not SaaS Norms
Seat-based pricing breaks down when AI operates independently of humans.
The model shifted toward:
Outcome-based pricing where work completion could be measured
Consumption models tied to task volume and complexity
Clear comparison against equivalent human cost
This framing simplified procurement conversations and forced internal discipline around performance and value delivery.
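One way to keep that comparison honest is to express the AI's fully loaded cost per completed unit of work next to the human equivalent. The sketch below uses placeholder figures, not real pricing or benchmark data.

```python
# Placeholder figures for illustration only; not real pricing or benchmarks.
def cost_per_completed_unit(platform_fee: float, per_task_fee: float,
                            tasks_attempted: int, completion_rate: float,
                            human_review_hours: float, human_hourly_cost: float) -> float:
    """Fully loaded cost per completed unit, including residual human oversight."""
    completed = tasks_attempted * completion_rate
    total_cost = (platform_fee
                  + per_task_fee * tasks_attempted
                  + human_review_hours * human_hourly_cost)
    return total_cost / completed

ai_cost = cost_per_completed_unit(
    platform_fee=2_000, per_task_fee=0.50,
    tasks_attempted=10_000, completion_rate=0.85,
    human_review_hours=120, human_hourly_cost=60,
)

human_monthly_cost = 40 * 4.33 * 60   # hours per week * weeks per month * hourly cost
human_units_per_month = 450
human_cost = human_monthly_cost / human_units_per_month

print(f"AI: {ai_cost:.2f} per completed unit vs human: {human_cost:.2f}")
```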
Compliance Is a Product Capability, Not a Legal Afterthought
In regulated environments, AI that behaves like labour inherits labour-level scrutiny.
The product incorporated:
Auditability of decisions and actions
Clear attribution of responsibility
Predictable update and change-control paths
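As a rough sketch of how those three capabilities can live in one artefact, the record below captures a single agent action with attribution and version information. The fields are assumptions chosen for illustration, not the actual audit schema.

```python
# Illustrative audit record; fields are assumptions meant to show how
# decisions, attribution, and change control can be captured together.
audit_record = {
    "action_id": "case-10231-step-4",
    "timestamp": "2024-05-02T14:11:09Z",
    "agent": {"job": "first-line triage", "model_version": "v1.7.2"},  # change-control path
    "decision": "escalate_to_human_reviewer",
    "inputs_digest": "sha256:...",           # evidence of what the decision was based on
    "accountable_owner": "operations_lead",  # clear attribution of responsibility
    "human_reviewer": None,                  # filled in when a person signs off
}
```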
Rather than slowing adoption, this became a differentiator. Buyers were not looking for maximal autonomy. They were looking for controlled reliability.
What This Case Study Demonstrates
This work was not about building an AI agent.
It was about:
Recognising a category shift before it becomes obvious
Translating abstract AI capability into concrete product decisions
Designing for second-order effects inside real organisations
Treating governance, economics, and change management as first-class product concerns
Most AI PM portfolios stop at what a model can do.
This case study focuses on what organisations must be ready to live with.
Why This Matters Now
AI is collapsing the boundary between software and labour. Products that ignore this will continue to struggle with trust, scale, and value realisation.
The next generation of successful AI products will not win because they are smarter.
They will win because they are designed to work responsibly inside human systems.