Optimizing L1 Instruction Cache Access
Preparing code for good L1i use requires techniques similar to those for good L1d use.
The problem is, though, that the programmer usually does not directly influence the way L1i is used unless s/he ...
Cache Access
Programmers wishing to improve their programs’ performance will find it best to focus on changes affecting the level 1 cache, since those will likely yield the best results.
We will di...
What Programmers Can Do
After the descriptions in the previous sections it is clear that there are many, many opportunities for programmers to influence a program’s performance, positively or nega...
NUMA Support
In section 2 we saw that, on some machines, the cost of access to specific regions of physical memory differs depending on where the access originated.
This type of hardware requires...
Virtual Memory
The virtual memory (VM) subsystem of a processor implements the virtual address spaces provided to each process.
This makes each process think it is alone in the system.
P.S. This lets every ...
FSB Influence
The FSB plays a central role in the performance of the machine.
Cache content can only be stored and loaded as quickly as the connection to the memory allows.
We can show how much ...
Cache Placement
Where the caches are placed in relation to the hyperthreads, cores, and processors is not under the control of the programmer.
But programmers can determine where the th...
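As a rough illustration of how a program can choose where it runs, the sketch below restricts the calling thread to a single CPU. It assumes a Linux system, where Python exposes the scheduler-affinity calls as `os.sched_setaffinity` and `os.sched_getaffinity`; the helper name `pin_current_thread` is hypothetical and not part of the original text.

```python
import os


def pin_current_thread(cpu):
    """Restrict the caller to one CPU and return the resulting affinity set.

    Returns None on platforms without the Linux affinity calls
    (an assumption of this sketch, not something the text specifies).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, {cpu})   # 0 selects the caller
    return os.sched_getaffinity(0)   # read back what the kernel accepted


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        allowed = os.sched_getaffinity(0)       # CPUs we may currently use
        print(pin_current_thread(min(allowed)))
        os.sched_setaffinity(0, allowed)        # restore the original mask
```

Threads that share data can be pinned to CPUs that share a cache, while threads with independent working sets may be better placed on CPUs that do not; which CPUs share which cache depends on the machine and has to be looked up separately.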
Critical Word Load
Memory is transferred from the main memory into the caches in blocks which are smaller than the cache line size.
Today 64 bits are transferred at once and the cache line...
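The arithmetic behind this can be sketched as follows. The 64-bit transfer width comes from the text; the 64-byte cache line is an assumption of this sketch, since the excerpt is cut off before stating the size. The wrap-around delivery order is likewise a simplified model of requesting the critical word first, and both function names are hypothetical.

```python
def transfers_per_line(line_bytes=64, transfer_bits=64):
    """Number of bus transfers needed to fill one cache line.

    line_bytes=64 is an assumed cache line size; transfer_bits=64
    is the transfer width given in the text.
    """
    transfer_bytes = transfer_bits // 8
    if line_bytes % transfer_bytes:
        raise ValueError("line size must be a multiple of the transfer size")
    return line_bytes // transfer_bytes


def critical_word_first_order(byte_offset, line_bytes=64, transfer_bits=64):
    """Order in which the words of a line arrive if the word containing
    byte_offset is delivered first and the rest follow with wrap-around
    (a simplified model, not a description of any specific hardware).
    """
    n = transfers_per_line(line_bytes, transfer_bits)
    critical = (byte_offset % line_bytes) // (transfer_bits // 8)
    return [(critical + i) % n for i in range(n)]


if __name__ == "__main__":
    print(transfers_per_line())           # 8 transfers per 64-byte line
    print(critical_word_first_order(40))  # [5, 6, 7, 0, 1, 2, 3, 4]
```

Under these assumptions a load that touches byte 40 of a line needs word 5; delivering that word first lets the processor continue while the remaining seven transfers complete.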