Learning in Energy-Efficient Neuromorphic Computing: Algorithm and Architecture Co-Design

Name: Learning in Energy-Efficient Neuromorphic Computing: Algorithm and Architecture Co-Design
Brand: Wiley-ISTE
Price: 116.99 EUR
Availability: OnlineOnly

Nan Zheng, Pinaki Mazumder (Authors)

Wiley-ISTE (Publisher)

1st Edition

Published on 18 October 2019

296 pages

E-Book

ePUB with Adobe-DRM

978-1-119-50740-6 (ISBN)

€116.99 incl. 7% VAT

E-Book Single Licence

Available for download

Description

Explains current co-design and co-optimization methodologies for building hardware neural networks and algorithms for machine learning applications

This book focuses on how to build energy-efficient hardware for neural networks with learning capabilities, and provides co-design and co-optimization methodologies for building hardware neural networks that can learn. Presenting a complete picture from high-level algorithms to low-level implementation details, Learning in Energy-Efficient Neuromorphic Computing: Algorithm and Architecture Co-Design also covers many fundamentals and essentials of neural networks (e.g., deep learning), as well as the hardware implementation of neural networks.

The book begins with an overview of neural networks. It then discusses algorithms for utilizing and training rate-based artificial neural networks. Next comes an introduction to various options for executing neural networks, ranging from general-purpose processors to specialized hardware, and from digital accelerators to analog accelerators. A design example on building an energy-efficient accelerator for adaptive dynamic programming with neural networks is also presented. An examination of fundamental concepts and popular learning algorithms for spiking neural networks follows, along with a look at the hardware for spiking neural networks. Then comes a chapter offering readers three design examples (two of which are based on conventional CMOS, and one on emerging nanotechnology) that implement the learning algorithm found in the previous chapter. The book concludes with an outlook on the future of neural network hardware.

* Includes a cross-layer survey of hardware accelerators for neuromorphic algorithms

* Covers the co-design of architecture and algorithms with emerging devices for much-improved computing efficiency

* Focuses on the co-design of algorithms and hardware, which is especially critical for using emerging devices, such as traditional memristors or diffusive memristors, for neuromorphic computing

Learning in Energy-Efficient Neuromorphic Computing: Algorithm and Architecture Co-Design is an ideal resource for researchers, scientists, software engineers, and hardware engineers dealing with ever-tighter requirements on power consumption and response time. It is also excellent for teaching and training undergraduate and graduate students about the latest-generation neural networks with powerful learning capabilities.

Content

Preface

Acknowledgment

1 Overview

1.1 History of Neural Networks

1.2 Neural Networks in Software

1.2.1 Artificial Neural Network

1.2.2 Spiking Neural Network

1.3 Need for Neuromorphic Hardware

1.4 Objectives and Outlines of the Book

References

2 Fundamentals and Learning of Artificial Neural Networks

2.1 Operational Principles of Artificial Neural Networks

2.1.1 Inference

2.1.2 Learning

2.2 Neural Network Based Machine Learning

2.2.1 Supervised Learning

2.2.2 Reinforcement Learning

2.2.3 Unsupervised Learning

2.2.4 Case Study: Action-Dependent Heuristic Dynamic Programming

2.2.4.1 Actor-Critic Networks

2.2.4.2 On-Line Learning Algorithm

2.2.4.3 Virtual Update Technique

2.3 Network Topologies

2.3.1 Fully Connected Neural Networks

2.3.2 Convolutional Neural Networks

2.3.3 Recurrent Neural Networks

2.4 Dataset and Benchmarks

2.5 Deep Learning

2.5.1 Pre-Deep-Learning Era

2.5.2 The Rise of Deep Learning

2.5.3 Deep Learning Techniques

2.5.3.1 Performance-Improving Techniques

2.5.3.2 Energy-Efficiency-Improving Techniques

2.5.4 Deep Neural Network Examples

References

3 Artificial Neural Networks in Hardware

3.1 Overview

3.2 General-Purpose Processors

3.3 Digital Accelerators

3.3.1 A Digital ASIC Approach

3.3.1.1 Optimization on Data Movement and Memory Access

3.3.1.2 Scaling Precision

3.3.1.3 Leveraging Sparsity

3.3.2 FPGA-Based Accelerators

3.4 Analog/Mixed-Signal Accelerators

3.4.1 Neural Networks in Conventional Integrated Technology

3.4.1.1 In/Near-Memory Computing

3.4.1.2 Near-Sensor Computing

3.4.2 Neural Network Based on Emerging Non-volatile Memory

3.4.2.1 Crossbar as a Massively Parallel Engine

3.4.2.2 Learning in a Crossbar

3.4.3 Optical Accelerator

3.5 Case Study: An Energy-Efficient Accelerator for Adaptive Dynamic Programming

3.5.1 Hardware Architecture

3.5.1.1 On-Chip Memory

3.5.1.2 Datapath

3.5.1.3 Controller

3.5.2 Design Examples

References

4 Operational Principles and Learning in Spiking Neural Networks

4.1 Spiking Neural Networks

4.1.1 Popular Spiking Neuron Models

4.1.1.1 Hodgkin-Huxley Model

4.1.1.2 Leaky Integrate-and-Fire Model

4.1.1.3 Izhikevich Model

4.1.2 Information Encoding

4.1.3 Spiking Neuron versus Non-Spiking Neuron

4.2 Learning in Shallow SNNs

4.2.1 ReSuMe

4.2.2 Tempotron

4.2.3 Spike-Timing-Dependent Plasticity

4.2.4 Learning Through Modulating Weight-Dependent STDP in Two-Layer Neural Networks

4.2.4.1 Motivations

4.2.4.2 Estimating Gradients with Spike Timings

4.2.4.3 Reinforcement Learning Example

4.3 Learning in Deep SNNs

4.3.1 SpikeProp

4.3.2 Stack of Shallow Networks

4.3.3 Conversion from ANNs

4.3.4 Recent Advances in Backpropagation for Deep SNNs

4.3.5 Learning Through Modulating Weight-Dependent STDP in Multilayer Neural Networks

4.3.5.1 Motivations

4.3.5.2 Learning Through Modulating Weight-Dependent STDP

4.3.5.3 Simulation Results

References

5 Hardware Implementations of Spiking Neural Networks

5.1 The Need for Specialized Hardware

5.1.1 Address-Event Representation

5.1.2 Event-Driven Computation

5.1.3 Inference with a Progressive Precision

5.1.4 Hardware Considerations for Implementing the Weight-Dependent STDP Learning Rule

5.1.4.1 Centralized Memory Architecture

5.1.4.2 Distributed Memory Architecture

5.2 Digital SNNs

5.2.1 Large-Scale SNN ASICs

5.2.1.1 SpiNNaker

5.2.1.2 TrueNorth

5.2.1.3 Loihi

5.2.2 Small/Moderate-Scale Digital SNNs

5.2.2.1 Bottom-Up Approach

5.2.2.2 Top-Down Approach

5.2.3 Hardware-Friendly Reinforcement Learning in SNNs

5.2.4 Hardware-Friendly Supervised Learning in Multilayer SNNs

5.2.4.1 Hardware Architecture

5.2.4.2 CMOS Implementation Results

5.3 Analog/Mixed-Signal SNNs

5.3.1 Basic Building Blocks

5.3.2 Large-Scale Analog/Mixed-Signal CMOS SNNs

5.3.2.1 CAVIAR

5.3.2.2 BrainScaleS

5.3.2.3 Neurogrid

5.3.3 Other Analog/Mixed-Signal CMOS SNN ASICs

5.3.4 SNNs Based on Emerging Nanotechnologies

5.3.4.1 Energy-Efficient Solutions

5.3.4.2 Synaptic Plasticity

5.3.5 Case Study: Memristor Crossbar Based Learning in SNNs

5.3.5.1 Motivations

5.3.5.2 Algorithm Adaptations

5.3.5.3 Non-idealities

5.3.5.4 Benchmarks

References

6 Conclusions

6.1 Outlooks

6.1.1 Brain-Inspired Computing

6.1.2 Emerging Nanotechnologies

6.1.3 Reliable Computing with Neuromorphic Systems

6.1.4 Blending of ANNs and SNNs

6.2 Conclusions

References

A Appendix

A.1 Hopfield Network

A.2 Memory Self-Repair with Hopfield Network

References

Index

Preface

In 1987, when I was wrapping up my doctoral thesis at the University of Illinois, I had a rare opportunity to listen to Prof. John Hopfield of the California Institute of Technology describing his groundbreaking research in neural networks to spellbound students in the Loomis Laboratory of Physics at Urbana-Champaign. He didactically described how to design and fabricate a recurrent neural network chip to rapidly solve the benchmark Traveling Salesman Problem (TSP), whose decision version is provably NP-complete, so that no known algorithm solves the problem in polynomial time as the number of cities in the TSP grows very large.

This discovery of algorithmic hardware to solve intractable combinatorics problems was a major milestone in the field of neural networks, as the prior art of perceptron-type feedforward neural networks could merely classify a limited set of simple patterns. The founder of neural computing, Prof. Frank Rosenblatt of Cornell University, had built the Mark 1 Perceptron computer in the late 1950s, when the first waves of digital computers such as the IBM 650 had just been commercialized. Subsequent advancements in neural hardware designs, however, were stymied mainly because large synaptic networks could not be integrated using the technology of the time, which comprised vacuum tubes, relays, and passive components such as resistors, capacitors, and inductors. Therefore, in 1985, when AT&T Bell Labs fabricated the first solid-state proof-of-concept TSP chip by using MOS technology to verify Prof. John Hopfield's neural net architecture, it opened the way to non-Boolean, brain-like computing on silicon.

Prof. John Hopfield's seminal work established that if the "objective function" of a combinatorial algorithm can be expressed in quadratic form, the synaptic links in a recurrent artificial neural network can be programmed accordingly to reduce (i.e. locally minimize) the value of the objective function through massive interactions between the constituent neurons. Hopfield's neural network consists of laterally connected neurons that can be randomly initialized; the network then iteratively reduces its intrinsic Lyapunov energy function until it reaches a local minimum. Notably, the Lyapunov function decreases monotonically under the dynamics of recurrent neural networks in which neurons are not provided with self-feedback [1].
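The monotone energy decrease described above can be illustrated with a minimal sketch. It assumes a discrete Hopfield network with bipolar states, symmetric weights, a zero diagonal (no self-feedback), and one-neuron-at-a-time updates; the names `energy` and `run_hopfield`, the network size, and the random weights are illustrative choices and are not taken from the book.

```python
import numpy as np

def energy(W, s):
    """Lyapunov energy E = -1/2 * s^T W s for bipolar states s in {-1, +1}."""
    return -0.5 * float(s @ W @ s)

def run_hopfield(W, s, sweeps=10, seed=0):
    """Update one neuron at a time and record the energy after every update."""
    rng = np.random.default_rng(seed)
    s = s.copy()
    trace = [energy(W, s)]
    for _ in range(sweeps):
        for i in rng.permutation(len(s)):
            # Each neuron aligns with the sign of its weighted input.
            s[i] = 1 if W[i] @ s >= 0 else -1
            trace.append(energy(W, s))
    return s, trace

rng = np.random.default_rng(42)
A = rng.normal(size=(16, 16))
W = (A + A.T) / 2            # symmetric synaptic weights (assumed)
np.fill_diagonal(W, 0.0)     # no self-feedback
s0 = rng.choice([-1, 1], size=16)   # random initial state
s_final, trace = run_hopfield(W, s0)
print(trace[0], trace[-1])   # the final energy is never above the initial one
```

With symmetric weights and a zero diagonal, flipping neuron i changes the energy by minus the change in its state times its weighted input, which is never positive under this update rule, so the recorded trace never increases.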

Prof. Hopfield used a combination of four separate quadratic functions to represent the objective function of the TSP. The first part of the objective function ensures that the energy function is minimized only if the traveling salesman visits each city exactly once, the second part ensures that the traveling salesman visits all cities in the itinerary, the third part ensures that no two cities are visited simultaneously, and the fourth part is designed to determine the shortest route connecting all cities in the TSP. Because of massive simultaneous interactions between neurons through the connecting synapses, which are precisely adjusted to meet the constraints in the above quadratic functions, a simple recurrent neural network could rapidly generate a very good quality solution. However, unlike well-tested software procedures such as simulated annealing, dynamic programming, and the branch-and-bound algorithm, neural networks generally fail to find the best solution because of their simplistic connectionist structures.
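The four-part objective can be sketched in code. The version below assumes the commonly cited Hopfield-Tank style encoding, in which `V[x, i] = 1` means city `x` occupies stop `i` of the tour; the function name `tsp_energy`, the weights `A`, `B`, `C`, `D`, and the four-city example are hypothetical and serve only to illustrate the terms described in words above.

```python
import numpy as np

def tsp_energy(V, dist, A=1.0, B=1.0, C=1.0, D=1.0):
    """Quadratic TSP energy for an n x n assignment matrix V (city x stop)."""
    n = V.shape[0]
    per_city = V.sum(axis=1)   # how many stops each city occupies
    per_stop = V.sum(axis=0)   # how many cities share each stop
    # Part 1: a city must not appear at more than one stop.
    once = 0.5 * A * float((per_city ** 2).sum() - (V ** 2).sum())
    # Part 2: exactly n entries are active, so all cities are visited.
    all_visited = 0.5 * C * float((V.sum() - n) ** 2)
    # Part 3: no two cities occupy the same stop.
    one_at_a_time = 0.5 * B * float((per_stop ** 2).sum() - (V ** 2).sum())
    # Part 4: distance between cities at neighbouring stops (cyclic tour).
    nxt = np.roll(V, -1, axis=1)   # nxt[y, i] = V[y, i + 1]
    prv = np.roll(V, 1, axis=1)    # prv[y, i] = V[y, i - 1]
    length = 0.5 * D * float(np.sum(dist * (V @ (nxt + prv).T)))
    return once + all_visited + one_at_a_time + length

# Four cities on the corners of a unit square.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)

around = np.eye(4)                   # tour 0 -> 1 -> 2 -> 3 -> 0
crossed = np.eye(4)[[0, 2, 1, 3]]    # tour 0 -> 2 -> 1 -> 3 -> 0
invalid = np.eye(4)
invalid[0, 1] = 1.0                  # city 0 placed at two stops

print(tsp_energy(around, dist), tsp_energy(crossed, dist), tsp_energy(invalid, dist))
```

For a valid tour the three constraint parts vanish and the energy equals `D` times the tour length, so the perimeter tour scores lower than the crossed tour, while the invalid assignment is penalized by the constraint parts.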

Therefore, after listening to Prof. Hopfield's fascinating talk, I had mixed feelings about the potential benefit of his innovation. On the one hand, I was thrilled to learn from his lecture how computationally hard algorithmic problems could be solved very quickly by using simple neuromorphic CMOS circuits with very small hardware overheads. On the other hand, I thought that the TSP application that Prof. Hopfield selected to demonstrate the ability of neural networks to solve combinatorial optimization problems was not the right candidate, as well-crafted software algorithms obtain nearly optimal solutions that neural networks can hardly match. I started contemplating developing self-healing VLSI chips where the power of neural-inspired self-repair algorithms could be used to automatically restructure faulty VLSI chips. Low overheads and the ability to solve a problem concurrently through parallel interactions between neurons are two salient features that I thought could be elegantly deployed for automatically repairing VLSI chips by built-in neural net circuitry.

Soon after I joined the University of Michigan as an assistant professor, I worked with one of my doctoral students [2] to develop, at first, CMOS analog neural net circuitry with asynchronous state updates, which lacked robustness due to process variation within a die. In order to improve the reliability of the self-repair circuitry, an MS student [3] and I designed digital neural net circuitry with synchronous state updates. These neural circuits were designed to repair VLSI chips by formulating the repair problem in terms of finding the node cover, edge cover, or node pair matching in a bipartite graph. In our graph formalism, one set of vertices in the bipartite graph represented the faulty circuit elements, and the other set of vertices represented the spare circuit elements. In order to restructure a faulty VLSI chip into a fault-free operational chip, the spare circuit elements were automatically invoked through programmable switching elements after identifying the faulty elements through embedded built-in self-testing circuitry.

Most importantly, like the TSP, the two-dimensional array repair can be shown to be an NP-complete problem because the repair algorithm seeks the optimal number of spare rows and spare columns that can be assigned to bypass faulty components such as memory cells, word-line and bit-line drivers, and sense amplifier bands located inside the memory array. Therefore, simple digital circuits comprising counters and other blocks woefully fail to solve such intractable self-repair problems. Notably, one cannot use external digital computers to determine how to repair embedded arrays, as input and output pins of the VLSI chip cannot be deployed to access the fault patterns in the deeply embedded arrays.
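To make the spare-allocation problem concrete, the following sketch solves a tiny instance by exhaustive search. It is not the neural self-repair circuitry described in the preface; the function name `repair`, the fault coordinates, and the spare counts are hypothetical, and the search cost grows combinatorially with the number of faulty rows, which is the difficulty the text refers to.

```python
from itertools import combinations

def repair(faults, spare_rows, spare_cols):
    """Return (rows, cols) to replace so every faulty cell is covered, or None.

    faults is a set of (row, col) positions of faulty cells. Every subset of
    faulty rows that fits in the spare-row budget is tried; the faults left
    over must then fit in the spare-column budget.
    """
    faulty_rows = sorted({r for r, _ in faults})
    for k in range(min(spare_rows, len(faulty_rows)) + 1):
        for chosen in combinations(faulty_rows, k):
            remaining_cols = {c for r, c in faults if r not in chosen}
            if len(remaining_cols) <= spare_cols:
                return set(chosen), remaining_cols
    return None

# One row and one column with clustered faults: one spare of each suffices.
clustered = {(0, 1), (0, 3), (0, 5), (2, 2), (4, 2), (5, 2)}
print(repair(clustered, spare_rows=1, spare_cols=1))

# Three faults on a diagonal need three spares in total, so two are not enough.
diagonal = {(0, 0), (1, 1), (2, 2)}
print(repair(diagonal, spare_rows=1, spare_cols=1))
```

In the bipartite-graph view, rows and columns form the two vertex sets and each faulty cell is an edge, so a repair is a constrained vertex cover of that graph.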

In 1989 and 1992, I received two NSF grants to expand the neuromorphic self-healing design styles to a wider class of embedded VLSI modules such as memory arrays [4], processor arrays [5], programmable logic arrays, and so on [6]. However, this approach to improving VLSI chip yield by built-in self-testing and self-repair was a bit ahead of its time, as the state-of-the-art microprocessors in the early 1990s contained only a few hundred thousand transistors and the submicron CMOS technology of the time was relatively robust. Therefore, after developing the neural-net based self-healing VLSI chip design methodology for various types of embedded circuit blocks, I stopped working on CMOS neural networks. I was not particularly interested in pursuing applications of neural networks for other types of engineering problems, as I wanted to remain focused on solving emerging problems in VLSI research.

On the other hand, in the late 1980s there were mounting concerns among CMOS technology prognosticators about the impending red brick wall heralding the end of the shrinking era in CMOS. Therefore, to promote several types of emerging technologies that might push the frontier of VLSI technology, the Defense Advanced Research Projects Agency (DARPA) in the USA initiated, around 1990, the Ultra Electronics: Ultra Dense, Ultra Fast Computing Components Research Program. Concurrently, the Ministry of International Trade & Industry (MITI) in Japan launched the Quantum Functional Devices (QFD) Project. Early successes with a plethora of innovative non-CMOS technologies in both research programs led to the launching of the National Nanotechnology Initiative (NNI), a U.S. Government research and development (R&D) initiative involving 20 departments and independent agencies that aims to bring about a revolution in nanotechnology to impact industry and society at large.

Between 1995 and 2010, my research group at first focused on quantum-physics-based device and circuit modeling for quantum tunneling devices, and then worked extensively on cellular neural network (CNN) circuits for image and video processing by using one-dimensional (double barrier resonant tunneling device), two-dimensional (self-assembled nanowire), and three-dimensional (quantum dot array) constrained quantum devices. Subsequently, we developed learning-based neural network circuits by using resistive synaptic devices (commonly known as memristors) and CMOS neurons. We also developed analog voltage programmable nanocomputing architectures by hybridizing quantum tunneling and memristive devices in computing nodes of a two-dimensional processing element (PE) ensemble. Our research on nanoscale neuromorphic circuits will soon be published in our new book, titled Neuromorphic Circuits for Nanoscale Devices, River Publishers, U.K., 2019.

After spending a little over a decade developing...

