Lyft: Quality Engineer → Developer Experience PM

When to stop optimizing for conventional metrics — and start optimizing what actually matters.

Developer Experience · Internal Tools · Simulation · Systems Thinking · Validation Strategy · Platform Reliability

Challenge

  • Major driver workflow change touched pricing, dispatch, and earnings services across regions
  • Weekly SEVs tied to earnings/pricing logic exposed gaps in existing validation
  • Critical bugs slipped past manual regression suites and UI automation
  • Severe failures emerged only when multiple services interacted under real-world conditions
  • Needed to move beyond “maximize coverage” and prevent high-stakes system failures before production

Role

  • System Quality Engineer → Internal Tool Product Manager
  • Set quality strategy for complex, multi-service driver systems
  • Created alignment on where validation effort mattered most (failure modes over coverage)
  • Identified simulation + metrics as the highest-leverage solution
  • Owned roadmap, reliability, and adoption of the internal simulation tool

Approach & Decisions

UI automation → acceptance tests → still not enough → simulation + metrics

Used evidence to question UI automation ROI
Reviewed six months of Jira bugs and SEVs to see what automation could realistically prevent (triage sketched below).
  • UI tests would have caught only a small fraction of high-impact issues
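A minimal sketch of what that retrospective triage can look like, assuming each exported ticket carries a severity and the layer where the failure surfaced. The field names and values below are hypothetical, not Lyft’s actual Jira schema:

```python
from collections import Counter

# Hypothetical ticket records exported from Jira; fields and values are
# illustrative, not Lyft's real schema.
tickets = [
    {"id": "BUG-101", "severity": "sev1", "failure_layer": "multi_service"},
    {"id": "BUG-102", "severity": "sev2", "failure_layer": "ui"},
    {"id": "BUG-103", "severity": "sev1", "failure_layer": "pricing_edge_case"},
    # ... six months of exported bugs and SEVs
]

# A UI regression suite can realistically only catch failures that surface
# in the UI layer itself.
UI_CATCHABLE = {"ui"}

high_impact = [t for t in tickets if t["severity"] in {"sev1", "sev2"}]
caught = [t for t in high_impact if t["failure_layer"] in UI_CATCHABLE]

print(f"High-impact issues: {len(high_impact)}")
print(f"Preventable by UI automation: {len(caught)} "
      f"({100 * len(caught) / len(high_impact):.0f}%)")
print(Counter(t["failure_layer"] for t in high_impact))
```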
Named where automation breaks down
The worst failures came from real-world combinations (region logic, ride types, pricing edge cases) that scripted tests can’t cover reliably; the sketch after this list shows the scale of the problem.
  • Multi-service interactions
  • Edge-case condition explosions
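To make the condition explosion concrete: even a modest set of dimensions multiplies past what any scripted suite can enumerate. The dimensions and counts here are purely illustrative:

```python
import math

# Illustrative dimensions for a driver earnings/dispatch flow; counts are made up.
dimensions = {
    "regions": 40,
    "ride_types": 6,
    "pricing_conditions": 8,  # surge, tolls, promos, bonuses, ...
    "driver_states": 5,       # offline, online, en route, on trip, post-trip
    "dispatch_paths": 4,
}

total = math.prod(dimensions.values())
print(f"Distinct scenario combinations: {total:,}")  # 38,400
```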
Rebalanced the validation strategy
Shifted effort away from flaky UI tests toward deterministic backend acceptance tests (example sketched after this list).
  • Automate what’s deterministic
  • Acknowledge the limits of coverage
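A minimal sketch of the deterministic style, assuming pytest and a hypothetical `quote_fare` function standing in for a real pricing service call. Fixed inputs map to fixed expected outputs, with no UI timing or rendering flakiness:

```python
import pytest

def quote_fare(base: float, miles: float, surge: float = 1.0) -> float:
    """Hypothetical pricing logic standing in for a real backend endpoint."""
    return round((base + miles * 1.15) * surge, 2)

# Deterministic inputs -> deterministic expectations: a failure means the
# logic changed, not that a selector timed out.
@pytest.mark.parametrize("base, miles, surge, expected", [
    (2.50, 10.0, 1.0, 14.00),
    (2.50, 10.0, 1.5, 21.00),
    (2.50, 0.0, 1.0, 2.50),
])
def test_quote_fare_is_deterministic(base, miles, surge, expected):
    assert quote_fare(base, miles, surge) == expected
```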
Made simulation the center of gravity
Postmortems + metrics made it clear: system-level simulation paired with alerts was the only reliable early-warning mechanism (a minimal check is sketched after this list).
  • Surface failures before launch
  • Detect behavioral changes across services
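One way such a check can work, sketched under assumptions: run simulated rider–driver trips through a candidate build, aggregate a few behavioral metrics, and alert when any metric drifts from the production baseline beyond a tolerance. The metric names, baselines, and thresholds are illustrative, not Lyft’s real values:

```python
# Compare a simulation run's metrics against a production baseline and flag drift.
# Metric names, baselines, and tolerances are illustrative, not Lyft's real values.
BASELINE = {"avg_fare": 14.20, "dispatch_success_rate": 0.97, "earnings_per_trip": 11.80}
TOLERANCE = {"avg_fare": 0.05, "dispatch_success_rate": 0.01, "earnings_per_trip": 0.05}  # relative

def check_simulation(metrics: dict) -> list:
    """Return alert messages for any metric drifting past its tolerance."""
    alerts = []
    for name, baseline in BASELINE.items():
        drift = abs(metrics[name] - baseline) / baseline
        if drift > TOLERANCE[name]:
            alerts.append(f"{name}: {metrics[name]} vs baseline {baseline} ({drift:.1%} drift)")
    return alerts

# A candidate build that silently shifted earnings logic would trip the alert:
print(check_simulation(
    {"avg_fare": 14.25, "dispatch_success_rate": 0.96, "earnings_per_trip": 10.90}
))
```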
Productized the solution for adoption
Took ownership of the rider–driver simulation tool and improved reliability, usability, and adoption under constraints.
  • Roadmap from user feedback + usage data
  • Better support loops and documentation

Outcomes

  • Earlier detection: shifted discovery of high-risk failures from post-launch to pre-release
  • Prevented recurring SEVs with metrics, alerts, and simulation checks that caught what automation couldn’t
  • Improved reliability and adoption of internal validation tooling (+10%)
  • Reduced support load and on-call noise (−30%), validating ~$150K in quarterly savings

Learnings

  • Judgment beats tools: redirect effort when it stops creating value
  • Leverage comes from understanding system interactions, not optimizing a single layer
  • Evidence is the fastest way to influence and align without authority
  • Internal tools need product rigor: clear users, workflows, metrics, and positioning