Back-Of-The-Envelope-Calculations : Google’s approach to system designing
How do you know which is the “best” design for a given problem? If, for example,
“you were given the problem of generating an image search results page of 30 thumbnails, would you load images sequentially? In parallel? Would you cache? How would you decide?”
One way is to start implementing every possible option you could and see for max performance boost you could achieve.But that’s crazy impractical, isn’t it?. But there is other way around it, we can use “Back of the envelope calculations” method typically used by Google as a workaround for such problems. Let’s see how it works and be sure about what you are going to implement next.
Evaluate different designs : Numbers you should know before
Jeff Dean, Head of Google’s School of Infrastructure Wizardry — instrumental in many of Google’s key systems: ad serving, BigTable; search, MapReduce, ProtocolBuffers — advocates evaluating different designs using back-of-the-envelope calculations. He gives the full story in this Stanford video presentation.
Google always better at whatever it do.
Back-of-the-envelope calculations are estimates you create using a combination of thought experiments and common performance numbers to a get a good feel for which designs will meet your requirements. Dr. Dean thinks an important skill for every software engineer is the ability to estimate the performance of alternative systems, using back of the envelope calculations, without having to build them. To evaluate design alternatives you first need a good sense of how long typical operations will take. Here is a list given by him
- L1 cache reference 0.5 ns
- Branch mispredict 5 ns
- L2 cache reference 7 ns
- Mutex lock/unlock 100 ns
- Main memory reference 100 ns
- Compress 1K bytes with Zippy 10,000 ns
- Send 2K bytes over 1 Gbps network 20,000 ns
- Read 1 MB sequentially from memory 250,000 ns
- Round trip within same datacenter 500,000 ns
- Disk seek 10,000,000 ns
- Read 1 MB sequentially from network 10,000,000 ns
- Read 1 MB sequentially from disk 30,000,000 ns
- Send packet CA->Netherlands->CA 150,000,000 ns
There are a lot many facts to to see here:
- Notice the magnitude differences in the performance of different options.
- Datacenters are far away so it takes a long time to send anything between them.
- Memory is fast and disks are slow.
- By using a cheap compression algorithm a lot (by a factor of 2) of network bandwidth can be saved.
- Writes are 40 times more expensive than reads.
- Global shared data is expensive. This is a fundamental limitation of distributed systems. The lock contention in shared heavily written objects kills performance as transactions become serialized and slow.
- Architect for scaling writes.
- Optimize for low write contention.
- Optimize wide. Make writes as parallel as you can.
Since you get to know these numbers now we can get back our initial problem we were discussing.
Generate Image Results Page Of 30 Thumbnails
Design 1 — Serial
- Read images serially. Do a disk seek. Read a 256K image and then go on to the next image.
- Performance: 30 seeks * 10 ms/seek + 30 * 256K / 30 MB /s = 560ms
Design 2 — Parallel
- Issue reads in parallel.
- Performance: 10 ms/seek + 256K read / 30 MB/s = 18ms
- There will be variance from the disk reads, so the more likely time is 30–60ms
Which design is best? It depends on you requirements, but given the back-of-the-envelope calculations you have a quick way to compare them without building them.
Now you have a framework for asking yourself other design questions and comparing different design variations:
- Does it make sense to cache single thumbnail images?
- Should you cache a whole set of images in one entry?
- Does it make sense to precompute the thumbnails?
To make these estimates realistic you’ll have to know the performance of your services. If there is an unknown variable then perhaps you could rapidly prototype just that part to settle the question. To know if caching is a good design alternative, for example, you’ll have to know how long it takes to write into your cache.