Direction: A CPU has a 32 kbyte direct mapped cache with 128-byte block size. Suppose A is a two-dimensional array of size 512 × 512 with elements that occupy 8-byte each. Consider the following two C code segments, P1 and P2
P1 : for (i = 0; i < 512; i + +) {
for (j = 0; j < 512; j + +) {
x + = A [i] [j];
}
}
P2 : for (i = 0; i < 512; i + +) {
for (j = 0; j < 512; j + +) {
x + = A [j] [i];
}
}
P1 and P2 are executed independently with the same initial state, namely, the array A is not in the cache and i, j, x are in registers. Let the number of cache misses experienced by P1 be M1 and that for P2 be M2.
-
The value of the ratio M1 is M2
-
- Zero
-
1 16
-
1 8
- 16
- Zero
Correct Option: B
Now M2 = ? In the case (P2 loop) the array is accessed column wise, so even the block brought for A [0][0]? A[0][15] would not be used for second column wise access i.e. A[1][0]. So new block need to swap, similarly for A[3][0] & so on. This would continue for every element, since memory is contiguous.
So M2 = 512 × 512 = 262144
And M1 / M2 =16384 / 262144 = 1 / 16