Software Breaks in Mysterious Ways: Stories and Learnings from Netflix
Unexpected regressions and bugs with no apparent cause happen, especially if you work in a complex, fast-paced ecosystem with many moving parts. Is it possible to avoid failures if you don’t know what can fail? How do you even find out about such failures? The Netflix Growth Engineering team owns the infrastructure for user signup in over 190 countries. The team supports many device types and payment methods, and runs A/B experiments. It integrates with dozens of services owned by other teams as well as by external partners. It has legacy code. There are so many opportunities for things to go wrong. Marek Kiszkis will present techniques employed at Netflix Growth Engineering to prevent and detect unknown errors. Learn how the team embraced end-to-end testing (which now comprises 70% of the team's testing pyramid), and how it made those tests cheaper, faster, and more reliable. Understand how Netflix leveraged A/B testing for migrations from legacy systems and for large-scale refactorings. Understand how accepting that software will break in unknown ways changes your day-to-day work, from development workflow, to deployments, to monitoring and alerting. All accompanied, of course, by war stories about unknown errors.