Vector Operations
The multi-media extensions in today’s mainstream processors implement vector operations only in a limited fashion.
Vector instructions are characterized by large numbers of oper...
Increasing Latency
One thing about future development of memory technology is almost certain: latency will continue to creep up.
We already discussed, in section 2.2.4, that the upcomin...
Transactional Memory
In their groundbreaking 1993 paper, Herlihy and Moss propose to implement transactions for memory operations in hardware since software alone cannot deal with the problem ...
Upcoming Technology
In the preceding sections about multi-processor handling we have seen that significant performance problems must be expected if the number of CPUs or cores is scaled up.
But t...
Page Fault Optimization
On operating systems like Linux with demand-paging support, an mmap call only modifies the page tables.
It makes sure that, for file-backed pages, the underlying data can be found and, for a...
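To make the cost of demand paging visible, here is a minimal sketch in C, assuming a Linux-like system. It maps anonymous memory, touches each page once, and reports how many minor page faults the process took while doing so. The function name count_touch_faults is hypothetical, and the reported count depends on the kernel and the environment, so treat it as an indication rather than an exact figure.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `pages` anonymous pages, write one byte to
 * each, and return the number of minor page faults the process took
 * during the touch loop. Returns -1 if the mapping fails.
 */
long count_touch_faults(size_t pages)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    size_t len = pages * (size_t)page;
    struct rusage before, after;

    /* mmap only sets up the mapping; no physical pages are assigned yet. */
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    getrusage(RUSAGE_SELF, &before);
    /* The first write to each page is what triggers the page fault. */
    for (size_t off = 0; off < len; off += (size_t)page)
        p[off] = 1;
    getrusage(RUSAGE_SELF, &after);

    munmap(p, len);
    return after.ru_minflt - before.ru_minflt;
}
```

Calling count_touch_faults(64) from a small test program and printing the result shows roughly how many faults the touch loop caused on a given system.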
Improving Branch Prediction
In section 6.2.2, two methods to improve L1i use through branch prediction and block reordering were mentioned:
static prediction through __builtin_expect and profile ...
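As a minimal sketch of static prediction, the following C fragment wraps GCC's __builtin_expect in two macros and marks an error path as unlikely. The macro names likely and unlikely and the function sum_valid are illustrative assumptions. Whether the hint changes the generated code depends on the compiler and the optimization level.

```c
#include <stddef.h>

/* Illustrative wrappers; !! normalizes any nonzero value to 1. */
#define likely(expr)   __builtin_expect(!!(expr), 1)
#define unlikely(expr) __builtin_expect(!!(expr), 0)

/*
 * Hypothetical example: sum the non-negative entries of v and count
 * the negative ones in *rejected. Negative values are assumed to be
 * rare, so that branch is marked unlikely and the compiler is free to
 * move it out of the hot path.
 */
long sum_valid(const int *v, size_t n, size_t *rejected)
{
    long total = 0;
    *rejected = 0;

    for (size_t i = 0; i < n; ++i) {
        if (unlikely(v[i] < 0)) {
            ++*rejected;
            continue;
        }
        total += v[i];
    }
    return total;
}
```

The hint only pays off when the assumption holds for real inputs. If negative values turn out to be common, the annotation should be removed or reversed.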
Measuring Memory Usage
Knowing how much memory a program allocates and possibly where the allocation happens is the first step to optimizing its memory use.
There are, fortunately, some easy-to-u...