BiNary NEURal NetwOrks based oN CMOS/RRAM Hybrid archItecture and in Memory Computing Concept for Sensor Fusion Application – NEURONIC
NEURONIC aims to fabricate a binarized neural network hardware accelerator that combines CMOS circuitry with fully embedded Resistive Memory (RRAM), based on the logic-in-memory concept. This digital accelerator will perform inference using a revolutionary “XNOR-NET” concept and operate with record-low energy consumption.
Deep neural networks are currently the most widely investigated algorithms in Artificial Intelligence. Unfortunately, running them on processors or graphics cards consumes considerable energy, in particular because of the intensive data exchanges between processors and memory. However, a recent breakthrough in deep learning – XNOR-NET – has shown that the memory requirements of deep neural networks can be considerably reduced. In XNOR-NET, synaptic weights and neuronal activations are replaced by binary values, and 32- or 64-bit multiplications and storage are replaced by simple logic operations. Quite astonishingly, owing to the inherent redundancy of neural networks, the performance of such networks is only weakly degraded compared with conventional deep neural networks.
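To make the replacement of multiplications by logic operations concrete, the following minimal Python sketch computes the dot product of two binary (+1/-1) vectors with an XNOR followed by a bit count. It is an illustration only, not the NEURONIC circuit: the function names, the encoding of +1 as bit 1 and -1 as bit 0, and the example values are all assumptions made for this sketch.

```python
def binarize(values):
    """Map real values to +1/-1 by sign (zero is mapped to +1 by convention)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack +1/-1 values into an integer: +1 becomes bit 1, -1 becomes bit 0."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(w_bits, x_bits, n):
    """Dot product of two n-element +1/-1 vectors, given their packed bits."""
    mask = (1 << n) - 1
    agree = ~(w_bits ^ x_bits) & mask   # XNOR: bit is 1 where the signs match
    matches = bin(agree).count("1")     # popcount of matching positions
    return 2 * matches - n              # (matches) - (mismatches)


# Hypothetical example values, chosen only for illustration.
weights = binarize([0.7, -1.2, 0.1, -0.4, 2.0, -0.3, 0.9, -0.8])
inputs = binarize([1.5, 0.2, -0.6, -0.9, 0.4, 0.8, -0.1, -2.0])

# Reference result using ordinary multiplications on +1/-1 values.
reference = sum(w * x for w, x in zip(weights, inputs))
result = xnor_popcount_dot(pack_bits(weights), pack_bits(inputs), len(weights))
```

Here `result` equals `reference`: a multiply-accumulate over binary values reduces to one XNOR per bit plus a count of ones. A binarized neuron would then typically apply a sign or threshold to this sum to obtain its binary activation.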
In this context, NEURONIC aims at fabricating revolutionary ASICs that perform inference with such binarized neural networks. We will use a novel embedded Resistive Memory (RRAM) technology, which can be integrated on chip at the core of the CMOS and can therefore store synaptic weights locally and perform computation through logic in memory. Our solution will allow processing with minimal energy compared with classical processor-based solutions, which are constrained by memory capacity.
In terms of applications, NEURONIC targets in particular sensor fusion, a process that aggregates data from several different sensors to compute more accurate information.
Our ASIC implementation of XNOR-NETs will be able to perform inference with exceptionally low energy consumption because:
• Computing can be performed directly “in-memory”, avoiding the constant, energy-hungry transfer of synaptic weights.
• The memory requirements of XNOR-NET are minimal, allowing all the memory to be distributed on chip. Moreover, as RRAM is a non-volatile memory, the ASIC can be turned off at any time to save power and be instantly usable when turned back on.
• The memory needs to be programmed only once before the ASIC is used. Indeed, the ASIC is specialized for inference; training will be performed by software off-chip.
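As a rough illustration of the memory argument in the list above, the sketch below compares the weight storage of a small fully connected network at 32 bits per weight and at 1 bit per weight. The layer sizes are hypothetical and are not taken from the project; biases and any normalization parameters are ignored.

```python
def weight_storage_bytes(num_weights, bits_per_weight):
    """Bytes needed to store the weights, rounded up to a whole byte."""
    return (num_weights * bits_per_weight + 7) // 8


# Hypothetical layer shapes (inputs, outputs), assumed for illustration only.
layer_sizes = [(784, 512), (512, 512), (512, 10)]
num_weights = sum(n_in * n_out for n_in, n_out in layer_sizes)

full_precision = weight_storage_bytes(num_weights, 32)  # 32-bit weights
binarized = weight_storage_bytes(num_weights, 1)        # 1-bit weights

print(f"{num_weights} weights: {full_precision} bytes at 32 bits, "
      f"{binarized} bytes at 1 bit")
```

For these assumed sizes, the binarized weights occupy 32 times less storage than 32-bit weights, which is the kind of reduction that makes keeping all weights on chip plausible.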
We estimate that the combination of NEURONIC ideas could allow a 1,000x energy reduction for inference with regard to processor-based solutions for Internet-of-Things use cases.
Two generations of NEURONIC systems will be developed: a first generation using a transistor to select memory cells (“1T1R” RRAM structure), and a second generation using a back-end selector currently under development (“1S1R” structure), allowing considerable area reduction. The circuit architecture will rely on the full-custom design of basic blocks, including the RRAM array and associated logic, across various memory organizations. The global demonstrator architecture will then be defined using a classical “top-down” digital flow. This will allow the use of classical digital optimization techniques, such as virtualization, multiplexing and pipelining, to adapt the architecture to the XNOR-NET complexity.
We are not the only group with the idea of implementing XNOR-NETs with RRAM. Several Asian and American groups have started to present preliminary system simulation results. However, all the proposed approaches focus on RRAM cross-point architectures that perform inference in an analog manner, which can lead to conversion errors and low reliability. We propose a more robust concept, which gives us a significant edge in becoming the first group to implement a real XNOR-NET based on RRAM. Nevertheless, the international competition gives a sense of urgency to the funding of the NEURONIC project.
Project coordination
Jacques-Olivier KLEIN (Université Paris Sud, Centre de Nanosciences et de Nanotechnologies)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
Partnership
UPSud, C2N Université Paris Sud, Centre de Nanosciences et de Nanotechnologies
IM2NP Institut des Matériaux, de Microélectronique et des Nanosciences de Provence
CEA - LETI Commissariat à l'énergie atomique et aux énergies alternatives
ANR funding: 513,000 euros
Beginning and duration of the scientific project: March 2019 - 48 Months