David Patterson (who else) experimented with the idea almost 20 years ago. The idea was to put the CPU into the DRAM, remove the caches, and connect to the rest of the world with fast serial interfaces.
With wafer stacking, it's not unreasonable to have high interconnect density between two different highly optimized processes (e.g. DRAM and FinFET logic) with high yield. Latency to extremely large caches could be improved significantly. I expect it to happen sooner rather than later.
An in memory processor is not simply a processor in a memory array. It's more like a processor based on cellular automata where each memory cell also performs computation.
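A toy way to see the cellular-automaton framing: below is a purely illustrative Python sketch (not any real PIM design) where every "memory cell" stores one bit and its update rule *is* the computation, driven by its neighbors.

```python
# Hypothetical sketch: a memory array where each cell both stores a bit and
# computes from its neighbors, in the spirit of a cellular automaton.
def step(cells):
    """One compute step: each cell becomes the XOR of its two neighbors
    (elementary cellular automaton rule 90, with wrap-around)."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ cells[(i + 1) % n] for i in range(n)]

memory = [0, 0, 0, 1, 0, 0, 0, 0]   # the "stored" data
for _ in range(3):
    memory = step(memory)           # storage and computation are one operation
print(memory)
```

The point of the sketch is that there is no separate ALU reading from the array; the array's own update rule does the work.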
My thinking is that analog signals are a more natural and effective processing medium for hardware neural nets. The "memory" component of the chip might be constant signals representing the weight of each (initial) connection in the network, which amplify the downstream signal.
Contrast this with digital circuits, where the amplification of each transistor is driven by a high/low signal from an upstream transistor representing a 0/1, making it a switch.
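To make the contrast concrete, here is a minimal functional model (all names hypothetical, no real device physics) of an analog crossbar column: stored conductances act as the weights, input voltages are scaled by them, and the products sum on the shared output wire by Kirchhoff's current law.

```python
# Illustrative sketch: in an analog crossbar, the memory cells ARE the
# multipliers -- a stored conductance G scales an input voltage V, and
# the currents V*G sum on the column wire for free.
def crossbar_column(input_voltages, conductances):
    """Output current = sum of V_i * G_i for one column of the array."""
    return sum(v * g for v, g in zip(input_voltages, conductances))

weights = [0.5, -0.25, 1.0]    # constant signals stored "in memory"
inputs = [1.0, 2.0, 0.5]       # incoming activations as voltages
print(crossbar_column(inputs, weights))
```

A digital gate in the same position would only pass or block a 0/1; here the stored value continuously amplifies (or attenuates) the signal flowing through it.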
Currently, PIM (Processing In Memory) has no reasonable programming model. Hardware people don't seem to understand that without a programming model, hardware capability will remain unused. As long as this is not solved, PIM will continue to fail.
The most reasonable one I have seen is Ambit from Microsoft Research. Ambit looks nice for its proposed workloads, but it is still unclear how it can be extended to more general computation.
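For readers unfamiliar with Ambit: its core trick is that simultaneously activating three DRAM rows yields the bitwise majority of those rows, and a control row of all-0s or all-1s turns that majority into AND or OR. Below is a behavioral model only (function names are mine; real Ambit works via DRAM charge sharing, not Python).

```python
# Functional sketch of Ambit-style triple-row activation: the result is the
# bitwise majority of three DRAM rows, computed one full row per command.
def triple_row_activate(row_a, row_b, row_c):
    """Bitwise MAJ(a, b, c) for each cell position across three rows."""
    return [(a & b) | (b & c) | (a & c) for a, b, c in zip(row_a, row_b, row_c)]

A = [1, 0, 1, 1]
B = [1, 1, 0, 1]
print(triple_row_activate(A, B, [0, 0, 0, 0]))  # MAJ with all-0s row = A AND B
print(triple_row_activate(A, B, [1, 1, 1, 1]))  # MAJ with all-1s row = A OR B
```

The whole row is processed per command, which is where the bulk-bitwise throughput comes from, and also why it is unclear how to generalize beyond such operations.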
> How close would this be to a drop-in replacement for a GPU?
Very far away from it. A GPU is a clever bit of kit, but it still has a memory bus; what is described here does away with that bus and moves computational capability right next to the memory cells.
An FPGA used for this kind of application would incur significant overhead compared to an ASIC, because large chunks of the FPGA would sit idle.
If in-memory processors are going to be a thing, you could prototype them (very inefficiently) on an FPGA, which would be an interesting thing to do, but not interesting enough to give it a commercial edge in that particular form of packaging.
I meant more in spirit. What is an FPGA if not a very small computing device (a logical function block) next to a memory (the configuration of said function block)?
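That framing can be sketched in a few lines: a single FPGA logic cell is essentially a lookup table whose "memory" is its configuration bits and whose "compute" is just indexing into them. This is an illustrative model only (the `make_lut` helper is made up, not any vendor API).

```python
# Hypothetical model of one FPGA logic cell: the memory is the LUT's
# configuration bits; the computation is merely an index into that memory.
def make_lut(truth_table):
    """truth_table: the output bit for each input combination (config memory)."""
    def lut(*inputs):
        index = 0
        for bit in inputs:
            index = (index << 1) | bit   # pack input bits into an address
        return truth_table[index]
    return lut

xor2 = make_lut([0, 1, 1, 0])   # "configure" a 2-input cell as XOR
print(xor2(1, 0))
```

Reprogramming the chip is just rewriting `truth_table`, which is exactly the compute-next-to-memory spirit being discussed.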
The configuration part of an FPGA is not at all like normal memory; it is much more like a flash device, and one that can't be reconfigured as easily, due to the internal structure of the FPGA. It is more akin to a series of fuses that you can blow to create a circuit than to memory, and if it were to be seen as memory it would be (P)ROM rather than RAM.
In order to make part of the FPGA work as memory, you'd have to spend some of that capacity using the logical function blocks to emulate the memory.
Indeed. Now imagine what people already tried years ago on FPGAs: rewriting the memory configuration in a machine-learning or evolutionary fashion. With RAM instead of flash(-like) memory, the iterations could go much, much faster.
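A toy version of that evolutionary loop, to show why iteration speed on the configuration memory matters (everything here is illustrative: the "configuration" is just a bitstring and the fitness function is a stand-in):

```python
import random

# Toy sketch of "evolve the configuration": treat the fabric's config memory
# as a bitstring and hill-climb it, one bit-flip per iteration. Each iteration
# is one rewrite of the configuration -- fast if the config lives in RAM.
random.seed(0)

target = [1, 0, 1, 1, 0, 0, 1, 0]   # stand-in for the desired behavior
config = [0] * len(target)          # initial configuration memory

def fitness(c):
    return sum(a == b for a, b in zip(c, target))

for _ in range(1000):
    candidate = config[:]
    i = random.randrange(len(candidate))
    candidate[i] ^= 1               # mutate one configuration bit
    if fitness(candidate) >= fitness(config):
        config = candidate          # keep the mutation if it doesn't hurt
    if fitness(config) == len(target):
        break                       # converged

print(config)
```

Each loop iteration corresponds to one reconfiguration of the device, so config memory that rewrites in nanoseconds rather than milliseconds directly multiplies how many generations you can evaluate.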
http://iram.cs.berkeley.edu/
I'm guessing that mixing CMOS logic and DRAM in the same process is hard to do at large scale.