
Author: Jinshan Yue (岳金山)
Pages: 280
Publisher: Tsinghua University Press
Publication date: 2024
ISBN: 9787302656005
E-book formats: pdf/epub/txt
About the Book
Neural network algorithms are widely applied across many domains. Targeting the key challenges of deploying neural networks on low-power intelligent devices, this book investigates energy-efficient neural network processors that combine digital circuits with computing-in-memory (CIM).
The main contributions of this book include: (1) a data-reuse architecture tailored to specific convolutional kernels, which improves energy efficiency; (2) to avoid the extra hardware overhead of irregular sparsity, a frequency-domain neural network processor supporting efficient FFT computation and frequency-domain two-dimensional data reuse; (3) a CIM system architecture supporting block-wise structured sparsity, data reuse, and dynamic ADC power-off; (4) to address the difficulty existing CIM chips have in supporting large-scale networks, a set-associative sparse CIM architecture supporting ping-pong weight update, verified through chip tape-out.
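Contribution (2) builds on CirCNN-style block-circulant compression, in which each weight block is a circulant matrix, so a block matrix-vector product reduces to FFTs and element-wise multiplies in the frequency domain. The following NumPy snippet is a functional sketch of that idea only, not the chip's bit-serial FFT circuits; the function name and block layout are assumptions for illustration:

```python
import numpy as np

def block_circulant_matvec(first_cols, x):
    """y = W @ x where W consists of circulant blocks.

    first_cols[i, j] holds the first column of circulant block W_ij,
    so each b-by-b block product becomes an FFT-domain element-wise
    multiply (circular convolution) -- a functional sketch of the
    frequency-domain compression idea, not a hardware model.
    """
    out_blocks, in_blocks, b = first_cols.shape
    X = np.fft.fft(x.reshape(in_blocks, b), axis=1)      # FFT of each input block
    Y = np.zeros((out_blocks, b), dtype=complex)
    for i in range(out_blocks):
        for j in range(in_blocks):
            # circulant block product = element-wise multiply in frequency domain
            Y[i] += np.fft.fft(first_cols[i, j]) * X[j]
    return np.fft.ifft(Y, axis=1).real.reshape(-1)       # back to the time domain
```

Storage drops from one weight per matrix entry to one first column per block, and each b-by-b block product costs O(b log b) instead of O(b^2), which is what makes dedicated FFT datapaths attractive.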
This book presents a technical route that combines digital circuits with computing-in-memory, demonstrating that fully exploiting the complementary strengths of the two, together with joint optimization across devices, circuits, architectures, and algorithms and applications, can achieve higher energy efficiency.
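The block-wise structured sparsity of contribution (3) can be illustrated functionally: weights are pruned in whole blocks, so an all-zero block generates no MAC operations and, on chip, allows the corresponding ADC to be powered off with a single per-block flag rather than irregular per-element bookkeeping. A minimal sketch, assuming a hypothetical block size and function name (not the chip's actual parameters):

```python
import numpy as np

def block_sparse_matvec(weights, x, block=16):
    """Matrix-vector product that skips all-zero column blocks.

    Functional model of block-wise structured sparsity: one flag per
    block is enough for hardware to skip it entirely (and gate the
    associated ADC in a CIM macro), unlike irregular sparsity, which
    needs per-element index tracking.
    """
    rows, cols = weights.shape
    y = np.zeros(rows)
    skipped = 0
    for j in range(0, cols, block):
        wb = weights[:, j:j + block]
        if not wb.any():          # whole block pruned: no compute for it
            skipped += 1
            continue
        y += wb @ x[j:j + block]
    return y, skipped
```

The result is identical to the dense product; only the work for pruned blocks disappears, which is why the energy savings scale with the fraction of blocks pruned.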
About the Author
Jinshan Yue received his B.S. and Ph.D. degrees from Tsinghua University in 2016 and 2021, respectively, and joined the Institute of Microelectronics as a postdoctoral researcher in 2021. His research interests include energy-efficient neural network accelerator chips, computing-in-memory, and circuits and systems based on emerging devices. He has published more than 30 journal and conference papers, including two first-author JSSC papers and three first-author ISSCC papers. He received a poster award at the ASP-DAC 2021 student research forum and Tsinghua University's doctoral dissertation award, and was selected for the Beijing Nova Program of Science and Technology.
Features
This English-language edition presents the technical route of combining digital circuits with computing-in-memory, demonstrating that fully exploiting the strengths of both, together with joint optimization across devices, circuits, architectures, and algorithms and applications, can realize highly energy-efficient neural network processors.
Contents
1 Introduction
1.1 Research Background and Significance
1.1.1 Development Trends of Neural Network
1.1.2 Requirements of NN Processor
1.1.3 Energy-Efficient NN Processors
1.2 Summary of the Research Work
1.2.1 Overall Framework of the Research Work
1.2.2 Main Contributions of This Book
1.3 Overall Structure of This Book
References
2 Basics and Research Status of Neural Network Processors
2.1 Basics of Neural Network Algorithms
2.2 Basics of Neural Network Processors
2.3 Research Status of Digital-Circuits-Based NN Processors
2.3.1 Data Reuse
2.3.2 Low-Bit Quantization
2.3.3 NN Model Compression and Sparsity
2.3.4 Summary of Digital-Circuits-Based NN Processors
2.4 Research Status of CIM NN Processors
2.4.1 CIM Principle
2.4.2 CIM Devices
2.4.3 CIM Circuits
2.4.4 CIM Macro
2.4.5 Summary of CIM NN Processors
2.5 Summary of This Chapter
References
3 Energy-Efficient NN Processor by Optimizing Data Reuse for Specific Convolutional Kernels
3.1 Introduction
3.2 Previous Data Reuse Methods and the Constraints
3.3 The KOP3 Processor Optimized for Specific Convolutional Kernels
3.4 Processing Array Optimized for Specific Convolutional Kernels
3.5 Local Memory Cyclic Access Architecture and Scheduling Strategy
3.6 Module-Level Parallel Instruction Set and the Control Circuits
3.7 Experimental Results
3.8 Conclusion
References
4 Optimized Neural Network Processor Based on Frequency-Domain Compression Algorithm
4.1 Introduction
4.2 The Limitations of Irregular Sparse Optimization and CirCNN Frequency-Domain Compression Algorithm
4.3 Frequency-Domain NN Processor STICKER-T
4.4 Global-Parallel Bit-Serial FFT Circuits
4.5 Frequency-Domain 2D Data-Reuse MAC Array
4.6 Small-Area Low-Power Block-Wise TRAM
4.7 Chip Measurement Results and Comparison
4.8 Summary of This Chapter
References
5 Digital Circuits and CIM Integrated NN Processor
5.1 Introduction
5.2 The Advantage of CIM Over Pure Digital Circuits
5.3 Design Challenges for System-Level CIM Chips
5.4 Sparse CIM Processor STICKER-IM
5.5 Structural Block-Wise Weight Sparsity and Dynamic Activation Sparsity
5.6 Flexible Mapping and Scheduling and Intra/Inter-Macro Data Reuse
5.7 Energy-Efficient CIM Macro with Dynamic ADC Power-Off
5.8 Chip Measurement Results and Comparison
5.9 Summary of This Chapter
References
6 A “Digital+CIM” Processor Supporting Large-Scale NN Models
6.1 Introduction
6.2 The Challenges of System-Level CIM Chips to Support Large-Scale NN Models
6.3 “Digital+CIM” NN Processor STICKER-IM2
6.4 Set-Associate Block-Wise Sparse Zero-Skipping Circuits
6.5 Ping-Pong CIM and Weight Update Architecture
6.6 Ping-Pong CIM Macro with Dynamic ADC Precision
6.7 Chip Measurement Results and Comparison
6.8 Summary of This Chapter
References
7 Summary and Prospect
7.1 Summary of This Book
7.2 Prospect of This Book
