Why Tracking Averaged Out Business Metrics is Just Not Good Enough
- Jinal Sanghavi
- Jun 23
Updated: Aug 14
I came across the concept of tail latency in an interview with Aravind Srinivas, CEO of Perplexity. Tail latency refers to the small percentage of a system's response times that take significantly longer than the average. It is typically expressed as a high-percentile latency, such as the 98th or 99th percentile response time. He talked about how he obsesses over the load time for each query not at the average but specifically at the tail, because it is that 1% of users who are frustrated by delays, and those delays shape their experience of the product.
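To make the gap between the average and the tail concrete, here is a minimal sketch in Python. The response times are made up for illustration, and `nearest_rank_percentile` is a hypothetical helper written for this post, not something from the interview or from any particular monitoring tool.

```python
import math
import statistics


def nearest_rank_percentile(values, pct):
    """Return the pct-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Nearest rank: the smallest value with at least pct% of the data at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [100] * 98 + [3000] * 2

mean_ms = statistics.mean(latencies_ms)             # 158: looks healthy
p50_ms = nearest_rank_percentile(latencies_ms, 50)  # 100: the typical request
p99_ms = nearest_rank_percentile(latencies_ms, 99)  # 3000: what the unlucky users see

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

In this toy data the average of 158 ms hides the fact that some users wait 3 seconds. Only the 99th percentile surfaces that.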

It struck me that most companies measure business performance in averages. Few measure or report the tail in customer service response times, order delivery times, NPS, etc.
But tail latency makes all the difference in:
- Reliability: High tail latency can be an early warning sign of underlying system issues.
- True performance insight: Tail latency provides a more accurate representation of worst-case scenarios and system performance under stress, whereas average latency can mask these issues.
- User experience impact: Tail latency has a more significant effect on user experience, especially in large-scale distributed systems where even rare performance hiccups can affect a substantial fraction of requests.
- Resource allocation: Understanding where the tail comes from helps direct resources to the slow cases instead of spreading them evenly.
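The point about large-scale distributed systems can be checked with a little arithmetic. The sketch below assumes that one user request fans out to several backend calls and that each call is slow independently with the same probability. Both are simplifying assumptions, and `prob_any_slow` is a hypothetical helper name.

```python
def prob_any_slow(per_call_slow_prob, fan_out):
    """Probability that at least one of fan_out independent calls is slow."""
    # Every call must be fast for the request to be fast, so take the complement.
    return 1 - (1 - per_call_slow_prob) ** fan_out


# Assumed numbers: each backend call has a 1% chance of landing in the slow tail.
for fan_out in (1, 10, 100):
    share = prob_any_slow(0.01, fan_out)
    print(f"fan-out {fan_out:>3}: {share:.1%} of requests hit a slow call")
```

Under these assumptions, a request that touches 100 backends hits at least one slow call roughly 63% of the time, even though each individual call is slow only 1% of the time.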
While the concept comes from computing architecture, I think its application isn't restricted to that function. Whether in performance marketing, operational excellence or product manufacturing, continuous improvements to the tail make a whole lot of difference.