Looking at the world from the CPU's perspective: just how slow everything else is

I often hear people say that the disk is slow or the network is laggy. Those complaints are expressed on a human scale: copying a file to the hard disk takes anywhere from a few minutes to tens of minutes, enough time for me to eat a meal; downloading a movie sometimes takes a few hours, enough time for a nap.

The best-known picture of the speed differences between computer components is the pyramid: the faster a layer is, the smaller its capacity and the higher its price. That picture gives only an intuitive feel; it offers no quantitative account of the speed and performance of each layer. In fact, the gaps between levels are far larger than the picture suggests. This article looks at the world from the CPU's perspective and talks about just how slow everything else is.

I hope you take away two things from this article:

Disks and networks are really slow

Performance optimization is complex, system-wide work

Note: All data is from this address. The exact values vary with machine configuration and hardware generation, but that does not affect the intuition. If you are interested in the numbers, this URL gives the values of some of these indicators for different years.

Data

Let's start with the speed of the CPU. Take my computer: the clock speed is 2.6 GHz, which means roughly 2.6*10^9 instructions can be executed per second (treating one clock cycle as one instruction), so each instruction takes only about 0.38 ns. Many personal computers now have a higher clock speed than this, and higher-end configurations reach 3.0 GHz and above. We take this 0.38 ns as our basic unit and call it 1 s, because 1 s is probably the smallest unit of time that humans readily perceive.
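The conversion used throughout this article is a single multiplication, and it can be sketched in a few lines. This is a minimal illustration that assumes the 2.6 GHz clock above; the function names `to_human_seconds` and `describe` are made up for this example.

```python
CLOCK_HZ = 2.6e9  # the article's 2.6 GHz clock: one cycle is about 0.38 ns


def to_human_seconds(latency_ns: float) -> float:
    """Scale a real latency so that one clock cycle becomes one 'human' second."""
    return latency_ns * 1e-9 * CLOCK_HZ


def describe(seconds: float) -> str:
    """Express a scaled duration in the largest unit that fits."""
    units = [("years", 365 * 86400), ("days", 86400), ("hours", 3600), ("minutes", 60)]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:.1f} seconds"


if __name__ == "__main__":
    # Sample latencies (in ns) taken from the figures quoted in this article.
    for label, ns in [("L1 cache read", 0.5), ("memory access", 100), ("SSD random read", 150_000)]:
        print(f"{label}: {describe(to_human_seconds(ns))}")
```

Every "human time" figure in the rest of the article is the real latency multiplied by the same factor of 2.6*10^9.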

A level 1 cache read takes 0.5 ns, which is about 1.3 s in human time: one or two heartbeats. This shows why the cache matters: it is fast enough to keep up with the CPU. Thanks to the locality of programs and optimizations at the instruction level, the cache hit rate is very high, which in the end greatly improves efficiency.

A branch misprediction costs 5 ns, about 13 s in human time. That is a long time, which is why you see so many articles on how to write code that reduces branch mispredictions, such as the very highly voted question on Stack Overflow.

A level 2 cache read takes longer, about 7 ns, or roughly 18.2 s in human time. If the level 1 cache misses and the data has to come from the level 2 cache, the time goes up by an order of magnitude.

Tip: Why does a CPU need multiple levels of cache? This article explains it with an easy-to-understand example.

Moving on: locking and unlocking a mutex takes 25 ns, about 65 s in human time, the first item to pass the one-minute mark. In concurrent programming we often hear that locks are expensive. A minute is about how long it takes to heat something in the microwave, and standing there doing nothing while you wait feels like quite a long time.

Next comes memory. Each memory access takes about 100 ns, which is 260 s in human time, more than 4 minutes. If you are reading an article that does not need much thought, that is enough time to read two to three thousand words (in this era of fast reading, few people can focus on their phone for longer than that). It does not look too bad: reading a piece of data from memory is not that much slower. Still, with memory the time has jumped by another order of magnitude, and the speed bottleneck between the CPU and memory is known as the von Neumann bottleneck.

A CPU context switch (system call) takes about 1500 ns, or 1.5 us (this number comes from this article, using the average time for threads on a single-core CPU). In human time that is about 65 minutes, in other words an hour. We all know that context switching is expensive; wasting an hour every time feels almost criminal. What makes it worse is that during this time the CPU does no useful computation at all. It only swaps the register and memory state of two different processes, and the switch also pollutes the cache, which makes the computation that follows slower as well.

Transferring 2K of data over a 1 Gbps network takes 20 us, which is 14.4 hours in human time. That is long enough to watch the whole "Star Wars" trilogy, with time left over for meals and bathroom breaks! Even a very small transfer over the network is a very long wait for the CPU. And this figure is the theoretical best case; in practice the process is even slower.
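As a rough sanity check on that figure, the bare serialization time of the payload can be computed directly. This sketch assumes "2K" means 2048 bytes and ignores propagation delay and protocol overhead; the helper name `wire_time_us` is invented for the example.

```python
def wire_time_us(payload_bytes: int, link_bits_per_s: float) -> float:
    """Time to push a payload onto the link, in microseconds.

    Only serialization is counted: no propagation delay, no headers,
    no queuing. Real transfers take longer.
    """
    return payload_bytes * 8 / link_bits_per_s * 1e6


if __name__ == "__main__":
    t = wire_time_us(2 * 1024, 1e9)  # assumption: "2K" is 2048 bytes
    print(f"2 KiB over 1 Gbps: about {t:.1f} us")
```

Under these assumptions the result is on the order of 16 us, the same range as the 20 us quoted above.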

An SSD random read takes 150 us, about 4.5 days in human time. In other words, while the SSD fetches a bit of data, the CPU could take a holiday and join a short group tour nearby. We know SSDs are much faster than mechanical hard drives, but to the CPU this is still a turtle's pace. From storage onward, I/O devices mean very long waits, and this is where we remember how good memory is. Minimizing reads and writes to I/O devices and keeping the most frequently used data in memory as a cache is common practice for all programs. Caching systems such as memcached and redis, which have appeared in recent years, address exactly this problem.

Reading 1 MB of sequential data from memory takes about 250 us, which is 7.5 days in human time. The holiday is now upgraded to a seven-day National Day trip abroad.

A round trip within the same data center network takes 0.5 ms, about 15 days in human time, or half a month. If your program has a piece of code that needs to talk to another server in the data center, the CPU could have been computing flat out for half a month in that time. Reducing network requests between service components is a major concern in performance optimization.
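To see why cutting the number of requests matters, here is a back-of-the-envelope sketch. It assumes the 0.5 ms round trip and 2.6 GHz clock used in this article, sequential requests, and no cost for payload size or server-side processing; `total_wait_ms` and `human_days` are names invented for the example.

```python
ROUND_TRIP_MS = 0.5  # same-data-center round trip, from the article
CLOCK_HZ = 2.6e9     # the article's 2.6 GHz clock


def total_wait_ms(requests: int, batch_size: int = 1) -> float:
    """Total wait when requests go out sequentially, batch_size per round trip."""
    trips = -(-requests // batch_size)  # ceiling division
    return trips * ROUND_TRIP_MS


def human_days(real_ms: float) -> float:
    """Convert a real wait in ms to 'human' days at one second per clock cycle."""
    return real_ms * 1e-3 * CLOCK_HZ / 86400


if __name__ == "__main__":
    for batch in (1, 10, 100):
        wait = total_wait_ms(100, batch)
        print(f"100 requests, {batch} per trip: {wait} ms, about {human_days(wait):.0f} human days")
```

With these simplifications the wait scales with the number of round trips, not the number of requests, which is the motivation for batching calls between services.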

Reading 1 MB of sequential data from an SSD takes about 1 ms, which is one month in human time. In other words, if the CPU has to wait for the SSD to finish reading an ordinary file, it sits idle for a month. Even so, the SSD is fast. If you don't believe it, look at the performance of the mechanical disk below.

A disk seek takes 10 ms, which is 10 months in human time, just long enough for a human to bring a new life into the world. If the CPU asks the disk to make it a cup of coffee, then from the CPU's point of view the disk goes off, has a child, and only then comes back to say the coffee is ready. Mechanical hard disks are rated in RPM (revolutions per minute): the higher the RPM, the shorter the average access time and the better the disk performs. Seeking only moves the head to the correct track, before the contents of the requested sector can be read. In other words, although seeking costs a lot of time, it does no useful work: no disk content is read.
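The link between RPM and waiting time can be made concrete. On average the platter has to turn half a revolution before the target sector passes under the head, so the rotational part of the delay follows directly from the spindle speed. The sketch below covers only that rotational component, not the head movement; the RPM values are common examples chosen for illustration.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the target sector to rotate under the head, in ms.

    One revolution takes 60_000 / rpm milliseconds; on average the
    sector is half a revolution away.
    """
    return 0.5 * 60_000 / rpm


if __name__ == "__main__":
    for rpm in (5400, 7200, 15000):  # example spindle speeds, not from the article
        print(f"{rpm} RPM: about {avg_rotational_latency_ms(rpm):.2f} ms")
```

The time to move the head to the right track comes on top of this figure.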

Reading 1 MB of sequential data from disk takes 20 ms, which is 20 months in human time. "I/O devices are the bottleneck of computer systems": I hope that sentence means more to you now. If it still doesn't, imagine that something you ordered online took nearly two years to arrive. How would you feel?

A network round trip between cities in different parts of the world takes about 150 ms on average (based on worldwide ping times), which is 12.5 years in human time. It is easy to see why every program and architecture tries to avoid network access across cities, let alone across countries. A CDN is one solution to this problem: it lets users talk to the server closest to them, which reduces the time packets spend on the network.

Restarting a virtual machine takes about 4 seconds, which is more than 300 years in human time. This reminds me of the story of Steve Jobs pushing hard to shorten the boot time of the Mac. If machines restart less often and start faster each time, that saves not only human lives but the CPU's as well.

Restarting a physical server takes 5 minutes, which is 25,000 years in human time, longer than the history of human civilization. Five minutes is already a noticeable wait for a human, let alone for the CPU. So don't restart a server without a good reason: every restart ends a whole cycle of civilization.
