AI Risk & Decision Intelligence System for High-Stakes Business Decisions
Summary
Designed and led an AI decision intelligence system that helped senior leaders in a £50–100m ARR B2B SaaS business anticipate revenue risk earlier, intervene deliberately, and make accountable decisions under uncertainty.
1. Context and problem
Business context
The organisation was a mid-size B2B SaaS company with:
£50–100m ARR
£100k–£250k ACV deals
90–120 day sales cycles
150–250 active deals per quarter
In this environment, a small number of late-stage decisions had outsized impact. Two or three large deals slipping could materially affect quarterly guidance, investor confidence, and leadership credibility.
The real problem
Despite sophisticated dashboards, including stage-based CRM forecasting and pipeline coverage views, leaders consistently discovered risk too late.
Decision-making relied on:
Lagging indicators from CRM stages
Sales-manager intuition
Retrospective explanations after targets were missed
The issue was not a lack of data. It was a decision-quality problem under uncertainty.
Leaders had information, but no reliable way to:
Quantify risk early
Understand why risk was changing
Decide when intervention was actually warranted
2. Why AI was appropriate, and why simpler approaches failed
Why rules and heuristics broke down
The organisation had previously tried:
Rules-based risk scoring
Velocity thresholds
Red-flag checklists
These approaches failed because:
They could not model non-linear interactions across signals
They degraded quickly as sales behaviour evolved
They were gamed once rules became predictable
They created confidence without robustness.
Why AI was justified
AI was appropriate because it could:
Combine weak signals across CRM, product usage, and customer behaviour
Produce probabilistic outputs rather than binary judgements
Adapt as patterns shifted over time
Explicit restraint
The system was not designed to:
Make autonomous decisions
Replace commercial judgement
Optimise blindly for short-term revenue
AI informed decisions. Humans owned outcomes.
That boundary shaped every design choice.
3. Product vision and user workflow
Product vision
An AI Risk & Decision Intelligence System that:
Predicts the probability of adverse outcomes
Explains the drivers behind rising or falling risk
Recommends interventions with confidence bounds
Enforces human-in-the-loop governance
Who used it and how
Deal owners
Received explanations and drivers, not raw scores
Used insights to adjust deal strategy
Sales managers
Reviewed flagged deals weekly
Focused attention on the highest risk-adjusted portion of the pipeline
CRO and leadership
Saw aggregate risk shifts and confidence bands
Used trends for forecasting and intervention planning
In practice, this operated as a hybrid system: managers and leaders reviewed a weekly dashboard of flagged risks, while deal owners received pushed explanations when their deals crossed defined risk thresholds. Scores were never surfaced in isolation. Explanations and drivers always accompanied them.
The system was embedded into existing weekly and quarterly rhythms. No new rituals were invented.
4. Data and modelling choices
Data sources
CRM: stage transitions, deal velocity, stakeholder churn
Product usage: adoption drop-offs, engagement anomalies
Customer signals: support volume, sentiment, NPS
Behavioural metadata: response latency, meeting cadence
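To ground these sources, here is a minimal sketch of how a single per-deal scoring row might have been assembled. Every field name is illustrative, not the production schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DealFeatures:
    """One feature row per deal per scoring run (all names hypothetical)."""
    deal_id: str
    # CRM signals
    days_in_current_stage: int
    stage_regressions: int            # times the deal moved backwards
    stakeholder_churn_90d: int        # contacts lost from the opportunity
    # Product usage signals
    usage_trend_6w: float             # slope of weekly active usage
    adoption_dropoff: bool
    # Customer signals
    support_tickets_30d: int
    latest_nps: Optional[int]         # None if never surveyed
    # Behavioural metadata
    median_response_latency_hours: float
    meetings_last_30d: int
```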
Modelling approach
Gradient-boosted tree models were selected because they:
Perform strongly on heterogeneous, tabular business data
Capture non-linear feature interactions
Support feature attribution for explainability
Deep learning approaches were rejected due to:
Marginal performance gains in this context
Reduced interpretability
Higher operational and governance complexity
All outputs were probabilistic and explicitly calibrated, with regular checks to ensure predicted probabilities matched observed outcomes over time.
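As a minimal sketch of what such a calibrated model and its routine check could look like, assuming scikit-learn and synthetic stand-in data (nothing below reproduces the production pipeline):

```python
# Sketch only: a calibrated gradient-boosted risk model in scikit-learn.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import brier_score_loss

# Synthetic, imbalanced labels stand in for real deal outcomes:
# most deals close, a minority slip.
X, y = make_classification(n_samples=2000, n_features=12,
                           weights=[0.8], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Isotonic calibration maps raw scores onto honest probabilities.
model = CalibratedClassifierCV(GradientBoostingClassifier(),
                               method="isotonic", cv=5)
model.fit(X_train, y_train)

# Brier score penalises confident-but-wrong probabilities, so it serves
# as a simple recurring check that predicted risk matches observed outcomes.
risk = model.predict_proba(X_test)[:, 1]
print(f"Brier score: {brier_score_loss(y_test, risk):.3f}")
```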
LLM usage
LLMs were used only to:
Translate model outputs into executive-readable explanations
Synthesise drivers already visible to the model
LLMs were not used for prediction.
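To illustrate that boundary, an explanation prompt can be built purely from outputs the model has already produced, so the LLM rephrases rather than predicts. The function below is a hypothetical sketch, not the production prompt:

```python
def build_explanation_prompt(deal_name: str, risk: float,
                             drivers: list[tuple[str, float]]) -> str:
    """Assemble an LLM prompt from model outputs (hypothetical helper).

    `drivers` are (feature, attribution) pairs the risk model has
    already produced; the LLM only translates, it never predicts.
    """
    driver_lines = "\n".join(
        f"- {name}: attribution {weight:+.2f}" for name, weight in drivers
    )
    return (
        f"Deal: {deal_name}\n"
        f"Predicted risk of slippage: {risk:.0%}\n"
        f"Top model drivers:\n{driver_lines}\n\n"
        "Write a two-sentence explanation for a sales leader. "
        "Reference only the drivers listed above; do not speculate."
    )
```

Constraining the prompt to the listed drivers keeps the narrative faithful to what the model actually saw.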
5. Human-in-the-loop governance
Decision thresholds
Below 40 percent risk: informational only
40–65 percent: manager review recommended
Above 65 percent: mandatory human review with action logging
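Encoded as logic, the tiers reduce to a few lines. The sketch below uses the thresholds above with illustrative names:

```python
from enum import Enum

class ReviewTier(Enum):
    INFORMATIONAL = "informational only"
    MANAGER_REVIEW = "manager review recommended"
    MANDATORY_REVIEW = "mandatory human review with action logging"

def route_deal(risk: float) -> ReviewTier:
    """Map a calibrated risk probability onto the governance tiers."""
    if risk > 0.65:
        return ReviewTier.MANDATORY_REVIEW
    if risk >= 0.40:
        return ReviewTier.MANAGER_REVIEW
    return ReviewTier.INFORMATIONAL
```

Keeping the thresholds in one small function makes the quarterly review a two-constant change.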
Ownership and review
Thresholds were owned by RevOps
Reviewed quarterly with Sales and Finance leadership
Adjusted as sales motion or market conditions changed
Overrides
All recommendations could be overridden
Overrides required a reason
Override data fed back into evaluation and process review
This preserved judgement while enforcing accountability.
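A plausible shape for the override audit record, assuming a simple append-only log (all field names are hypothetical):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class OverrideRecord:
    """Audit entry written whenever a recommendation is overridden."""
    deal_id: str
    recommended_action: str
    actual_action: str
    reason: str            # mandatory, enforced below
    decided_by: str
    logged_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self) -> None:
        # Overrides without a stated reason are rejected outright.
        if not self.reason.strip():
            raise ValueError("An override must include a reason")
```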
6. Failure modes and safeguards
Known risks
Data drift as sales tactics evolved
Over-confidence during low-volume periods
Bias towards historically “typical” deals
Safeguards in practice
Continuous calibration monitoring
Drift detection on key feature distributions
Automatic confidence suppression when uncertainty increased
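For illustration, drift detection on a single feature could use a population stability index (PSI) check, as sketched below. The 0.2 cut-off is a common rule of thumb, not the system's actual setting:

```python
import numpy as np

def population_stability_index(expected: np.ndarray,
                               observed: np.ndarray,
                               bins: int = 10) -> float:
    """PSI between training-time and live feature distributions.
    Rule of thumb: PSI > 0.2 suggests meaningful drift."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    observed = np.clip(observed, edges[0], edges[-1])
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    o_pct = np.histogram(observed, bins=edges)[0] / len(observed)
    # Guard against empty bins before taking logs
    e_pct = np.clip(e_pct, 1e-6, None)
    o_pct = np.clip(o_pct, 1e-6, None)
    return float(np.sum((o_pct - e_pct) * np.log(o_pct / e_pct)))

def suppress_high_confidence(psi: float, threshold: float = 0.2) -> bool:
    """When drift exceeds the threshold, downgrade outputs to low confidence."""
    return psi > threshold
```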
Concrete example
During a quarter with a new pricing structure, drift detection triggered on deal velocity features. The system automatically:
Suppressed high-confidence recommendations
Flagged outputs as low confidence
Prompted a temporary return to manual review
The decision to suppress or resume normal operation sat with RevOps in consultation with Data and Sales leadership. The model was recalibrated before normal operation resumed.
If the system was unsure, it said so.
7. Metrics and outcomes
What we did not optimise for
Raw prediction accuracy
Single-number model performance
What we measured instead
Over two quarters:
~30 percent reduction in deal escalations in the final week of the quarter, compared to the previous two quarters
At-risk deals identified on average 2–3 weeks earlier
~20 percent fewer surprise misses against quarterly revenue forecasts compared to the prior baseline
The most telling signal was behavioural. Leadership conversations shifted from “Why did this happen?” to “What should we do now?”
8. Trade-offs and judgement
Deliberate choices included:
Accepting lower recall to avoid false confidence
Prioritising explainability over marginal accuracy gains
Limiting automation despite pressure to “let the AI decide”
This was a decision system, not a leaderboard.
9. What this demonstrates as an AI PM
This case study demonstrates my ability to:
Frame AI problems around decision quality and accountability
Design probabilistic systems that respect uncertainty
Balance technical capability with organisational reality
Say no to inappropriate uses of AI and defend those decisions
The work required close, ongoing alignment across Sales, RevOps, Finance, and Data, particularly where model outputs intersected with forecasting and external guidance.