
A new type of DRAM built into the CPU has zero latency

The new DRAM, made of oxide semiconductors and built in layers above the processor die (that is, integrated with it), retains bits hundreds or thousands of times longer than commercial DRAM and could provide significant area and power savings, especially when running large neural networks.

How Zero Latency DRAM Works

The DRAM cells in your PC are each made up of a single transistor and a single capacitor, a design called 1T1C. To write a bit to the cell, the transistor is turned on and charge is pushed into the capacitor (1) or removed from it (0). To read the bit, the charge (if any) is drawn out and measured. It is a very fast, cheap and low-power system, but it has some disadvantages. On the one hand, reading the bit depletes the capacitor, so the bit has to be written back to the cell. On the other, even if the bit is never read, the charge will eventually leak out of the capacitor through the transistor.

So all cells need to be refreshed periodically just to keep their data, and in modern chips this is done every 64 milliseconds.
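The destructive read and the periodic refresh described above can be sketched as a toy model. This is only an illustration of the behaviour, not a circuit simulation: the class `Cell1T1C`, its `write_backs` counter and the helper `refreshes_per_second` are hypothetical names invented for this example.

```python
class Cell1T1C:
    """Toy 1T1C cell: one capacitor holds the bit, one transistor gates access."""

    def __init__(self):
        self.charged = False   # True means the capacitor holds charge (bit 1)
        self.write_backs = 0   # restores forced by destructive reads

    def write(self, bit):
        # Push charge into the capacitor (1) or remove it (0).
        self.charged = bool(bit)

    def read(self):
        bit = int(self.charged)
        self.charged = False   # sensing drains the capacitor
        self.write(bit)        # so the value has to be written back
        self.write_backs += 1
        return bit


def refreshes_per_second(interval_ms=64):
    """Refresh passes per second for a given refresh interval in milliseconds."""
    return 1000 / interval_ms


cell = Cell1T1C()
cell.write(1)
values = [cell.read() for _ in range(3)]
print(values, cell.write_backs)   # [1, 1, 1] 3
print(refreshes_per_second())     # 15.625
```

Every read in this model costs one extra write, and with a 64 millisecond interval each cell is refreshed a little more than 15 times per second whether or not it is ever read.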

Embedding DRAM on the processor chip itself is done commercially, but it has its limits. The challenge with the monolithic 1T1C design has always been building a good capacitor, and a transistor that does not leak, using a manufacturing process created for logic transistors.

Instead, the new embedded DRAM is made of just two transistors and uses no capacitor (2T0C). This works because the gate of a transistor is a natural capacitor, albeit a small one, so the charge that represents the bit can be stored there. This design has some key benefits, especially for AI.

2T0C DRAM circuit

One of these advantages is that writing and reading are done through separate devices, so a 2T0C DRAM cell can be read without destroying the data and having to write it again. All you have to do is see whether current flows through the transistor whose gate holds the charge: if the charge is there, it turns the transistor on and current flows; if there is no charge, no current flows.
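A minimal sketch of that read path, under the simplifying assumption that the stored charge is either fully present or fully absent. The class `Cell2T0C` and its attribute names are hypothetical and exist only for this illustration.

```python
class Cell2T0C:
    """Toy 2T0C cell: the bit lives as charge on the read transistor's gate."""

    def __init__(self):
        self.gate_charged = False   # charge on the read transistor's gate
        self.write_backs = 0        # stays at 0: reads never disturb the charge

    def write(self, bit):
        # The write transistor moves charge onto or off the gate.
        self.gate_charged = bool(bit)

    def read(self):
        # A charged gate turns the read transistor on, so current flows.
        current_flows = self.gate_charged
        return 1 if current_flows else 0


cell_2t0c = Cell2T0C()
cell_2t0c.write(1)
reads = [cell_2t0c.read() for _ in range(3)]
print(reads, cell_2t0c.write_backs)   # [1, 1, 1] 0
```

Unlike a cell whose read drains a capacitor, `read` here only observes the state and leaves it unchanged, so no write-back is ever counted.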

This is a zero latency read in the sense that nothing has to be written back afterwards, and it is especially important for AI because neural networks tend to read at least three times for every write. However, a 2T0C arrangement does not work well with silicon logic transistors: any bit would drain away immediately because the transistor gate capacitance is too low and the leakage through the transistors is too high. So researchers have turned to devices made from amorphous oxide semiconductors, such as those used for the pixels of some displays.
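To see why the read-to-write ratio matters, here is a rough count of cell operations for a workload with three reads per write. It is a back-of-the-envelope sketch, not a latency or energy measurement: the function `cell_operations` is hypothetical, and it assumes every destructive read costs exactly one write-back.

```python
def cell_operations(writes, reads_per_write=3, destructive_read=True):
    """Count writes, reads and forced write-backs for a simple workload."""
    reads = writes * reads_per_write
    write_backs = reads if destructive_read else 0
    return writes + reads + write_backs


n_writes = 1000  # illustrative workload size
ops_destructive = cell_operations(n_writes, destructive_read=True)
ops_nondestructive = cell_operations(n_writes, destructive_read=False)
print(ops_destructive, ops_nondestructive)   # 7000 4000
```

Under these assumptions the destructive-read cell performs 7 operations for every 4 performed by the non-destructive one, and the gap widens as the read-to-write ratio grows.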

A new manufacturing method for a new DRAM

These types of materials have some especially useful qualities for DRAM. In particular, they can carry a lot of current, which makes writing faster, and when they are off they leak very little charge, which makes the bits last longer. The US-based team used indium oxide doped with about 1% tungsten (IWO) as its semiconductor. The on-state currents in this type of device provide read and write speeds sufficient for logic operations, while the off-state currents are really small, two to three orders of magnitude lower than with silicon.
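The link between leakage and how long a bit lasts can be estimated with a simple constant-current approximation: the retention time is roughly the stored charge margin divided by the off-state current, t ≈ C·ΔV / I_off. The sketch below uses that approximation only; the function name and every numeric value are assumptions chosen for illustration, not figures reported by the researchers.

```python
def retention_time_s(capacitance_f, voltage_margin_v, off_current_a):
    """Approximate retention time, assuming a constant leakage current."""
    return capacitance_f * voltage_margin_v / off_current_a


# Illustrative values only (assumed, not measured).
gate_capacitance_f = 1e-15    # 1 femtofarad of gate capacitance
voltage_margin_v = 0.5        # voltage the node can lose before the bit is unreadable
baseline_off_current_a = 1e-15

baseline = retention_time_s(gate_capacitance_f, voltage_margin_v,
                            baseline_off_current_a)
improved = retention_time_s(gate_capacitance_f, voltage_margin_v,
                            baseline_off_current_a / 1000)
print(improved / baseline)    # about 1000
```

With the capacitance and voltage margin held fixed, cutting the off-state current by some factor lengthens retention by the same factor, which is why a low-leakage transistor can make up for the small capacitance of a gate.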

Zero latency DRAM

In the image above we can see the transistors in the capacitorless DRAM that the US team has developed. It includes a tungsten-doped indium oxide semiconductor (orange), palladium upper and lower gates (yellow), nickel source and drain electrodes (green), and hafnium oxide dielectrics (blue).

Just as important, oxides like these can be processed at relatively low temperatures. That means devices made from these materials can be built in the interconnect layers on top of a processor’s silicon without damaging the devices underneath. Building memory cells in that position provides a direct, high-bandwidth path for data to reach the processing elements on the silicon, effectively breaking down the memory wall with near-zero latency.