A 65nm image processing SoC supporting multiple DNN models and real-time computation-communication trade-off via actor-critical neuro-controller (VLSI 2020)

This work presents a 65nm wireless image processing SoC for real-time computation-communication trade-off on resource-constrained edge devices. The test-chip includes (1) an all-digital, near-memory, reconfigurable and programmable neural-network (NN) based systolic image processor at 1.05TOPS/W (peak), (2) a digitally-adaptive RF-DAC based transceiver with Tx energy-efficiency of 768pJ/b, and (3) a mixed-signal, time-based, actor-critic neuro-controller with compute-in-memory (CIM) and in-place weight updates that provides online learning and adaptation at 0.59pJ/MAC for efficiently controlling the computation and communication blocks separately as well as jointly.

A 65nm 1.1-to-9.1 TOPS/W hybrid-digital-mixed-signal computing platform for accelerating model-based and model-free swarm robotics (ISSCC 2019)

Artificial swarm intelligence, inspired by biological studies of insects, ants, and other organisms, presents an emerging computing paradigm, where seemingly simple elements interact with each other to collectively solve challenging problems. In particular, swarm robotics, where multiple robots coordinate in real-time to solve diverse problems such as pattern-formation, cooperative reinforcement learning (RL), path-planning, etc. [1], finds extensive use in exploration, reconnaissance, and disaster relief. This is partly motivated by the robustness of swarm dynamics to failures and malfunctions of individual robots. Successful hardware demonstrations of neuro-inspired algorithms on edge devices [2]-[6] are now leading to the emergence of intelligence and control in swarms as the next frontier. Although certain swarm algorithms rely on real-time learning (e.g., cooperative RL), representing a model-free approach, many powerful algorithms that have been developed over the past two decades (e.g., pattern formation) rely on a mathematical structure and represent a more traditional model-based approach. The next generation of swarm hardware needs to support both of these approaches. In this paper, we identify the commonalities and shared compute primitives across a variety of model-based and model-free swarm algorithms and present a unified, fully programmable, energy-efficient, and scalable platform capable of real-time swarm intelligence.

A 55nm 50nJ/encode 13nJ/decode Homomorphic Encryption Crypto-Engine for IoT Nodes to Enable Secure Computation on Encrypted Data (CICC 2019)

This work presents a 55nm test-chip prototype of a cryptosystem (encryption and decryption) implementing Homomorphic Encryption, which enables computation on encrypted data. Test-chip measurements show 50nJ/encode and 13nJ/decode, making the cryptosystem suitable for sensor-nodes and IoT applications.

A 65nm thermometer-encoded time/charge-based compute-in-memory neural network accelerator at 0.735 pJ/MAC and 0.41 pJ/Update (TCAS-II 2020)

This work presents an in-memory compute macro for neural-network-based controllers, including inference and in-situ weight updates, featuring: (1) in-memory multi-bit matrix and transposed matrix multiplication; (2) thermometer-encoded, pulse-modulated storage element for in-memory weight update; (3) adaptive bit-line analog-to-digital (A/D) conversion for enhanced area/power efficiency. The chip was fabricated in 65nm CMOS technology and achieves a measured energy efficiency of 0.735pJ per multiply-accumulate operation (MAC) and 0.41pJ per weight update.
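
As a rough illustration of why thermometer encoding suits in-place weight updates, the sketch below models a weight as a run of 1s followed by 0s, so that a small update only flips the bits at the boundary. This is a software analogy under stated assumptions, not the chip's circuit: the function names, the bit-list representation, and the saturating behavior are all hypothetical and are not taken from the paper.

```python
def thermometer_encode(value: int, levels: int) -> list[int]:
    """Encode `value` as `levels` bits: the first `value` bits are 1, the rest 0."""
    if not 0 <= value <= levels:
        raise ValueError("value out of range for the given number of levels")
    return [1] * value + [0] * (levels - value)


def thermometer_decode(bits: list[int]) -> int:
    """The stored value is simply the number of 1s."""
    return sum(bits)


def update_in_place(bits: list[int], delta: int) -> None:
    """Apply a saturating update (an assumption) by flipping only boundary bits."""
    value = thermometer_decode(bits)
    target = max(0, min(len(bits), value + delta))
    lo, hi = sorted((value, target))
    fill = 1 if target > value else 0
    for i in range(lo, hi):  # touches exactly |target - value| cells
        bits[i] = fill


weight = thermometer_encode(3, levels=7)
update_in_place(weight, +2)
print(weight, thermometer_decode(weight))
```

In this model an update of magnitude k modifies exactly k storage cells, whereas a binary-coded weight can flip many bits for a unit step (for example, 3 to 4 flips three bits).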