A Survey on System-Level Design of Neural Network Accelerators

  • Kenshu Seto, Tokyo City University, Japan
Keywords: Convolutional Neural Network, CNN, Inference, Accelerator, Optimization

Abstract

In this paper, we present a brief survey of the system-level optimizations used in convolutional neural network (CNN) inference accelerators. For the nested loops of convolutional (CONV) layers, we discuss the effects of loop optimizations such as loop interchange, tiling, unrolling, and fusion on CNN accelerators. We also explain memory optimizations that are effective in combination with these loop optimizations. In addition, we discuss streaming architectures and single computation engine architectures, both of which are commonly used in CNN accelerators. Optimizations for CNN models are briefly explained, followed by recent trends and future directions in CNN accelerator design.
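
As a rough illustration of the CONV-layer loop nest and the loop optimizations named in the abstract, the following minimal sketch compares a direct six-deep loop nest with a version whose channel loops are tiled and interchanged. It is not taken from the paper: the function names `conv_naive` and `conv_tiled`, the tile sizes `tm` and `tc`, the `[C][H][W]` / `[M][C][K][K]` layouts, and the stride-1, no-padding setting are all assumptions chosen for brevity.

```python
def conv_naive(inp, wgt):
    """Direct convolution, stride 1, no padding (illustrative assumption).

    inp: [C][H][W] input feature maps
    wgt: [M][C][K][K] weights
    returns: [M][H-K+1][W-K+1] output feature maps
    """
    C, H, W = len(inp), len(inp[0]), len(inp[0][0])
    M, K = len(wgt), len(wgt[0][0])
    OH, OW = H - K + 1, W - K + 1
    out = [[[0] * OW for _ in range(OH)] for _ in range(M)]
    for m in range(M):                      # output channels
        for c in range(C):                  # input channels
            for y in range(OH):             # output rows
                for x in range(OW):         # output columns
                    for ky in range(K):     # kernel rows
                        for kx in range(K): # kernel columns
                            out[m][y][x] += wgt[m][c][ky][kx] * inp[c][y + ky][x + kx]
    return out


def conv_tiled(inp, wgt, tm=2, tc=2):
    """Same computation with the channel loops tiled and moved inward.

    tm, tc are hypothetical tile sizes for output and input channels.
    """
    C, H, W = len(inp), len(inp[0]), len(inp[0][0])
    M, K = len(wgt), len(wgt[0][0])
    OH, OW = H - K + 1, W - K + 1
    out = [[[0] * OW for _ in range(OH)] for _ in range(M)]
    for m0 in range(0, M, tm):              # tile loop over output channels
        for c0 in range(0, C, tc):          # tile loop over input channels
            for y in range(OH):
                for x in range(OW):
                    # Loop interchange: the intra-tile channel loops now sit
                    # inside the spatial loops. In hardware, these two loops
                    # are the usual candidates for full unrolling into a
                    # tm x tc array of multipliers.
                    for m in range(m0, min(m0 + tm, M)):
                        for c in range(c0, min(c0 + tc, C)):
                            for ky in range(K):
                                for kx in range(K):
                                    out[m][y][x] += (
                                        wgt[m][c][ky][kx] * inp[c][y + ky][x + kx]
                                    )
    return out
```

Both functions accumulate the same products in a different order, so with integer data they return identical results. In an accelerator, the tile sizes would typically be chosen so that one tile of weights and feature maps fits in on-chip buffers.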

Published: 2021-08-18
Section: Special Issue on Intelligent, Energy-Efficient and Robust Computing Systems