Now that we have seen what DRAM looks like, let's briefly look at how to connect the DRAM to the processor. So we have our processor. It sends its requests to the level-one cache, which sends its misses and write-back requests to the larger level-two cache, which in turn might send its misses and write-back requests to an even larger level-three cache, and let's say that these are all on the processor chip. What happens then is that the misses and write-back requests from the L3 cache have to be made over an external connection, so we need processor pins for that.
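To make that flow concrete, here is a minimal sketch in C of how a read request trickles down such a hierarchy. The hit-check functions are purely hypothetical stand-ins, not any real cache model:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Purely illustrative stand-ins for real tag lookups: pretend a level
 * hits whenever certain address bits happen to be zero. */
static bool l1_hit(uint64_t a) { return (a & 0xF00) == 0; }
static bool l2_hit(uint64_t a) { return (a & 0xC00) == 0; }
static bool l3_hit(uint64_t a) { return (a & 0x800) == 0; }

/* Each level only sees the misses of the level above it; only an
 * L3 miss has to leave the chip through the processor pins. */
static const char *service_level(uint64_t addr)
{
    if (l1_hit(addr)) return "L1";
    if (l2_hit(addr)) return "L2";
    if (l3_hit(addr)) return "L3";
    return "off-chip DRAM";
}

int main(void)
{
    uint64_t addrs[] = { 0x0040, 0x0240, 0x0C40 };
    for (int i = 0; i < 3; i++)
        printf("0x%04llx served by %s\n",
               (unsigned long long)addrs[i], service_level(addrs[i]));
    return 0;
}
```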
And traditionally, this data would go over what is called a front-side bus. Now, you want to design the processor chip so that you can connect it to many possible memories, so you don't design the front-side bus to supply the row address, the column address, and so on directly to the memory chips. What you have instead is another chip that contains the memory controller, and that memory controller has what's called a memory channel connecting it to a DRAM module. Over this channel, it issues things like: open a row; read something and get the data; write something and supply the data; close the row or open another one; and so on. It will usually have more than one such memory channel.
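Here is a minimal sketch in C of what travels over such a channel: a controller splits a physical address into channel, bank, row, and column fields (the field widths and the closed-page policy are assumptions for illustration) and issues the corresponding command sequence for one read.

```c
#include <stdint.h>
#include <stdio.h>

/* The basic commands a memory controller issues over a memory channel. */
typedef enum { ACTIVATE, READ, WRITE, PRECHARGE } dram_cmd;

/* Hypothetical split of a physical address into DRAM coordinates; real
 * controllers pick these field widths to spread traffic over channels
 * and banks. */
typedef struct { unsigned channel, bank, row, col; } dram_addr;

static dram_addr decode(uint64_t paddr)
{
    dram_addr d;
    d.channel = (paddr >> 6)  & 0x1;     /* 2 channels   */
    d.col     = (paddr >> 7)  & 0x3FF;   /* 1024 columns */
    d.bank    = (paddr >> 17) & 0x7;     /* 8 banks      */
    d.row     = (paddr >> 20) & 0xFFFF;  /* 64K rows     */
    return d;
}

/* One L3 read miss: open the row, read the column, and (with a
 * closed-page policy) precharge the bank again. */
static void issue_read(uint64_t paddr)
{
    dram_addr d = decode(paddr);
    printf("ch%u: ACTIVATE  bank %u, row %u\n", d.channel, d.bank, d.row);
    printf("ch%u: READ      bank %u, col %u\n", d.channel, d.bank, d.col);
    printf("ch%u: PRECHARGE bank %u\n",         d.channel, d.bank);
}

int main(void)
{
    issue_read(0x12345678);
    return 0;
}
```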
So the memory latency seen by the level-three cache is not only the access time of the memory itself. It also includes sending the request over the front-side bus, having the memory controller figure it out, sending the page-open command to the appropriate DRAM, sending the read request and supplying the column address, and getting the data back over the memory channel. Note that this is a level-three cache miss, which means we want a whole cache line's worth of data, so it takes a while to transfer all of it. The memory controller reads it from the memory channel, which usually runs at one frequency, and sends it at a different data rate over the front-side bus to the processor chip, which then puts the line together and puts it into the L3 cache. So the latency includes all of this in addition to just the memory access itself, and it can be a significant part of the overall memory latency.
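To get a feel for the proportions, here is a back-of-the-envelope tally in C; every latency number is invented for illustration and stands for no particular system:

```c
#include <stdio.h>

/* Invented latency components (in ns) for one L3 miss in the
 * traditional front-side-bus organization. */
int main(void)
{
    double fsb_request  =  8.0;  /* request crosses the front-side bus    */
    double ctrl_decode  =  5.0;  /* memory controller figures it out      */
    double row_activate = 15.0;  /* open the page in the DRAM             */
    double column_read  = 15.0;  /* column access until first data        */
    double burst_64B    =  5.0;  /* 8 beats of 8 bytes over the channel   */
    double fsb_return   =  8.0;  /* line crosses the FSB back to the chip */

    double dram_part = row_activate + column_read + burst_64B;
    double total     = fsb_request + ctrl_decode + dram_part + fsb_return;

    printf("DRAM access itself : %5.1f ns\n", dram_part);   /* 35.0 ns */
    printf("total miss latency : %5.1f ns\n", total);       /* 56.0 ns */
    printf("FSB+controller part: %3.0f %%\n",
           100.0 * (total - dram_part) / total);            /* 38 %    */
    return 0;
}
```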
So recent processor chips integrate the memory controller, which means that the memory controller is put on the same chip as the processor and the caches. Now we no longer need the front-side bus; we can use lots of on-chip wiring to communicate with the on-chip memory controller, so this connection can have plenty of bandwidth and is very, very close (the whole chip is something like two by two centimeters). Then we just send requests directly through the memory channels to the DRAM. So now the processor chip directly knows how to talk to DRAMs, open pages, and so on, which can dramatically reduce this part of the latency. And because that part is not a negligible piece of the overall latency, we get our data from DRAM a little bit faster: 10, 20, maybe 30% faster than before.
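Continuing with the invented numbers from the sketch above, just removing the two front-side-bus crossings already lands in that range:

```c
#include <stdio.h>

/* Illustrative numbers carried over from the previous sketch: an
 * integrated memory controller removes the two front-side-bus
 * crossings but leaves the DRAM access itself unchanged. */
int main(void)
{
    double fsb_total        = 56.0;  /* miss latency with a front-side bus */
    double integrated_total = 40.0;  /* the two 8 ns FSB crossings removed */

    printf("faster by %.0f %%\n",
           100.0 * (fsb_total - integrated_total) / fsb_total);  /* 29 % */
    return 0;
}
```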
But the cost of that is that we now design the processor chip to talk to a specific kind of DRAM. Because of that, we need a relatively high degree of standardization of DRAM modules, so that, for example, when we go from 2-gigabyte to 4-gigabyte memory modules, we don't have to redesign the whole processor chip. So the protocols here got a lot more standardized and uniform, and less flexible, than before. But in exchange, we get to transfer data very quickly between the processor and the memory controller, and we can make the memory controller very smart and access memory in a more efficient way.

Connecting DRAM To The Processor – Georgia Tech – HPCA: Part 4