How we measure outcomes

We publish what changes, how we calculate it, and when we call it a success. The same rules apply to every deployment.

What we track

Latency (p95)

95th-percentile end-to-end request time for defined operations.

Security (critical CVEs)

Count of critical vulnerabilities open at go-live (target: zero).

Infra spend

Comparable monthly run-rate for compute, storage, and egress for the scoped system.

Adoption & engagement

Usage of shipped capabilities (eligible population, active users, events).

Accuracy & drift (ML/CV)

Precision/recall or class-wise accuracy vs. a labeled sample; drift deltas on key features.
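
As an illustration of the accuracy metric, the sketch below computes per-class precision and recall against a labeled sample. The function name, the label values, and the convention of returning 0.0 for a class with no predictions or no labels are assumptions made for this example, not part of the published method.

```python
from collections import Counter

def per_class_precision_recall(y_true, y_pred):
    """Return {class: (precision, recall)} for paired label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    true_pos, predicted, actual = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        actual[truth] += 1
        predicted[pred] += 1
        if truth == pred:
            true_pos[truth] += 1
    result = {}
    for cls in sorted(set(actual) | set(predicted)):
        # Assumption: report 0.0 when a denominator is empty.
        precision = true_pos[cls] / predicted[cls] if predicted[cls] else 0.0
        recall = true_pos[cls] / actual[cls] if actual[cls] else 0.0
        result[cls] = (precision, recall)
    return result

# Hypothetical labels for a two-class inspection task.
scores = per_class_precision_recall(
    ["ok", "ok", "defect", "defect"],
    ["ok", "defect", "defect", "defect"],
)
```

Site weighting, where used, would be applied on top of these per-class figures.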

Windows & sampling

Pre-window

Minimum 14 days of production baseline (exclude incident days).

Post-window

Minimum 14 days after cutover (exclude incident days; allow a 48-hour warm-up).
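
A minimal sketch of how the post-window filter could be applied to timestamped samples. It assumes that the 14 days are counted as distinct usable calendar days after the warm-up and incident days are removed; the text above does not spell out that counting rule, and all names here are hypothetical.

```python
from datetime import date, datetime, timedelta

MIN_DAYS = 14                  # minimum window length from the rule above
WARMUP = timedelta(hours=48)   # warm-up allowance after cutover

def usable_post_samples(samples, cutover, incident_days):
    """Keep (timestamp, value) pairs after warm-up and outside incident days."""
    start = cutover + WARMUP
    return [
        (ts, value)
        for ts, value in samples
        if ts >= start and ts.date() not in incident_days
    ]

def window_is_long_enough(samples, min_days=MIN_DAYS):
    """Assumption: the window length is the number of distinct usable days."""
    return len({ts.date() for ts, _ in samples}) >= min_days

# Hypothetical data: one sample per day at noon for 20 days after cutover.
cutover = datetime(2025, 1, 1)
raw = [(datetime(2025, 1, day, 12), float(day)) for day in range(1, 21)]
usable = usable_post_samples(raw, cutover, incident_days={date(2025, 1, 10)})
```

The same filter, without the warm-up offset, would apply to the pre-window.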

Like-for-like

Identical operation sets, identical time-of-day/day-of-week distribution.

Confidence

If the pre/post difference is not statistically significant (p-value > 0.1) or seasonality bias is detected, extend the windows or rerun.
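
The page does not name the statistical test behind the p-value. As one possible approach, the sketch below uses a one-sided permutation test on a summary statistic (the mean by default; a p95 function could be passed instead) and flags the window for extension when the p-value exceeds 0.1. The function names, the permutation count, and the choice of test are assumptions.

```python
import random
import statistics

def permutation_p_value(pre, post, stat=statistics.mean, n_perm=2000, seed=0):
    """One-sided p-value for 'pre is higher than post' under random relabeling."""
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    observed = stat(pre) - stat(post)
    pooled = list(pre) + list(post)
    n_pre = len(pre)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = stat(pooled[:n_pre]) - stat(pooled[n_pre:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

def needs_longer_window(p_value, threshold=0.1):
    """Apply the rule above: extend or rerun when p-value > threshold."""
    return p_value > threshold

# Hypothetical latency samples in ms: clearly separated vs. identical.
p_clear = permutation_p_value(list(range(800, 900)), list(range(400, 500)))
p_same = permutation_p_value(list(range(100)), list(range(100)))
```

A permutation test makes no distributional assumption, which suits skewed latency data, but it does not detect seasonality; that check would need a separate comparison of time-of-day and day-of-week mixes.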

Scope

Only the system(s) touched by the engagement; shared services allocated pro-rata.
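
A small sketch of pro-rata allocation for a shared service. The allocation key (each system's share of some usage measure, such as requests or CPU hours) is an assumption; the page does not say which basis is used, and the system names are hypothetical.

```python
def allocate_shared_cost(shared_cost, usage_by_system, scoped_systems):
    """Return the part of shared_cost attributed to the scoped systems."""
    total_usage = sum(usage_by_system.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    scoped_usage = sum(usage_by_system[name] for name in scoped_systems)
    return shared_cost * scoped_usage / total_usage

# Hypothetical shared gateway costing 10,000 per month across three systems.
scoped_share = allocate_shared_cost(
    10_000,
    {"orders": 300, "billing": 500, "search": 200},
    scoped_systems={"orders"},
)
```

Using the same allocation key in the pre- and post-window keeps the comparison like-for-like.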

Formulas & examples

Latency (p95 Δ%)

(p95_pre − p95_post) ÷ p95_pre

Example: 840 ms → 450 ms ⇒ (840−450)/840 ≈ 46% lower
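
The calculation can be sketched as follows. The nearest-rank percentile definition is an assumption (tracing and metrics backends often interpolate or use histogram buckets instead), and the function names are illustrative.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # ceil(0.95 * n) in integer arithmetic to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def pct_reduction(pre, post):
    """(pre - post) / pre as a percentage; positive means lower after cutover."""
    if pre <= 0:
        raise ValueError("pre must be positive")
    return (pre - post) / pre * 100.0

# The worked example above: 840 ms before, 450 ms after.
delta = pct_reduction(840, 450)
```

The same `pct_reduction` helper applies unchanged to the infra spend formula, since both compare a pre value with a post value relative to the pre value.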

Critical CVEs at go-live

count(severity = critical, status=open) on release branch at T0

Example: Target 0
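
A minimal sketch of the count, assuming scanner findings have already been exported as records with `severity` and `status` fields. That record shape and the identifiers are hypothetical; real scanner output formats differ.

```python
def open_critical_count(findings):
    """Count findings that are both critical and still open."""
    return sum(
        1
        for finding in findings
        if finding["severity"].lower() == "critical"
        and finding["status"].lower() == "open"
    )

# Hypothetical findings for the release branch at T0.
findings = [
    {"id": "FINDING-1", "severity": "critical", "status": "open"},
    {"id": "FINDING-2", "severity": "critical", "status": "fixed"},
    {"id": "FINDING-3", "severity": "high", "status": "open"},
]
blocking = open_critical_count(findings)
go_live_allowed = blocking == 0
```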

Infra spend Δ%

(run-rate_pre − run-rate_post) ÷ run-rate_pre

Example: $42k → $33k ⇒ (42−33)/42 ≈ 21% lower

Adoption rate

active_users_feature ÷ eligible_population (same window)

Example: 1,250 active / 2,000 eligible = 62.5%
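
In code, the ratio might look like the sketch below. Restricting active users to those inside the eligible population is an assumption made here so the rate cannot exceed 100%; the function name and the ID scheme are illustrative.

```python
def adoption_rate(active_user_ids, eligible_user_ids):
    """Share of the eligible population that used the feature in the window."""
    eligible = set(eligible_user_ids)
    if not eligible:
        raise ValueError("eligible population is empty")
    # Assumption: activity from users outside the eligible set is ignored.
    active = set(active_user_ids) & eligible
    return len(active) / len(eligible)

# The worked example above: 1,250 active out of 2,000 eligible.
rate = adoption_rate(range(1250), range(2000))
```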

CV accuracy

per-class precision/recall vs. labeled sample, with site weighting

Example: drift measured as the KS statistic or PSI on selected features, plus Δ accuracy vs. the gate
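
For reference, both drift measures can be sketched in plain Python. The bin edges, the small floor `eps` used to avoid taking the log of zero, and the sample values are assumptions for illustration; production drift monitors typically have their own binning rules.

```python
import bisect
import math

def ks_statistic(baseline, current):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between the ECDFs."""
    a, b = sorted(baseline), sorted(current)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

def psi(baseline, current, edges, eps=1e-6):
    """Population Stability Index over bins defined by sorted edges."""
    def shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[bisect.bisect_right(edges, v)] += 1
        # Floor each share at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]
    base, cur = shares(baseline), shares(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, cur))

# Hypothetical feature values before and after a shift.
before = [1, 2, 3, 4, 5, 6, 7, 8]
after = [5, 6, 7, 8, 9, 10, 11, 12]
ks = ks_statistic(before, after)
stability = psi(before, after, edges=[4, 8])
```

Both values are zero for identical samples and grow as the distributions diverge; the alert thresholds would be agreed per engagement.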

Instrumentation & tools

Latency

Distributed tracing/metrics (e.g., OpenTelemetry → Prometheus/Grafana), sampled by operation.

Security

SCA/SAST/DAST scanners plus OS package scanners; SBOM at release.

Infra

Cloud bills and usage (compute/storage/egress), plus on-prem meter data where applicable.

Adoption

App analytics + server events; anonymized where required.

Accuracy & drift

Eval harness (fixed seed), site-stratified samples, drift monitors on features and outputs.

Acceptance criteria

Performance

p95 lower by an agreed target (typ. 30–60%), sustained for the post-window, without a feature freeze.

Security

0 critical CVEs before go-live; high/medium tracked with owner and SLA.

Cost

Infra run-rate 20–40% lower for scoped workloads, with the same or better SLOs.

ML/CV

Accuracy at or above gate; drift bounded; reviewer load at target.

Adoption

Feature usage reaches agreed floor within the window.

Evidence we export

  • Before/after KPI chart pack (PNG/PDF)
  • Scanner reports + SBOM summary at release
  • Cost deltas with line items and allocation notes
  • Eval summary (confusion matrices, drift plots)
  • Change log and rollback plan snapshot

Last updated: October 28, 2025

Ready to ship with measurable outcomes?

Every sprint ends with verifiable metrics. Let's discuss your KPIs.