國立中山大學,National Sun Yat-sen University,學位論文,thesis/dissertation,高線性和高解析之大範圍溫度偵測器與高效能、低功耗神經網路硬體架構,A High-Linearity and High-Resolution Wide Range Temperature Detector and A High-Performance Low-Power Neural Network Hardware Architecture

論文名稱 Title	高線性和高解析之大範圍溫度偵測器與高效能、低功耗神經網路硬體架構 A High-Linearity and High-Resolution Wide Range Temperature Detector and A High-Performance Low-Power Neural Network Hardware Architecture
系所名稱 Department	電機工程學系 Department of Electrical Engineering
畢業學年期 Year, semester	112 學年度第 1 學期 The fall semester of Academic Year 112	語文別 Language	中文 Chinese
學位類別 Degree	碩士 Master	頁數 Number of pages	104
研究生 Author	羅時亨 Shih-Heng Luo
指導教授 Advisor	王朝欽 Wang, Chua-Chin
召集委員 Convenor	李博明 Lee, Po-Ming
口試委員 Advisory Committee	郭可驥, 李宗哲 Kuo, Ko-Chi; Lee, Tzung-Je
口試日期 Date of Exam	2024-01-12	繳交日期 Date of Submission	2024-01-15
關鍵字 Keywords	高線性度溫度偵測器、正負溫度係數電流產生器、電流頻率轉換器、高效能神經網路加速器、光學卷積神經網路架構 high linearity, PTAT/CTAT current generator, current-to-frequency, hardware accelerator, photonic neural network

中文摘要
隨著神經網路的蓬勃發展，複雜的模型架構逐漸問世。傳統硬體架構無法兼具高資料吞吐與高操作頻率，因此專門設計給神經網路的硬體架構已成為開啟AI世代的一把鑰匙。且為了操作過程的穩定與可靠性，環境變異的偵測成為重要的角色。本論文首先針對溫度偵測器提出新型的架構，提高可偵測的溫度範圍並達到高線性度、高解析度的表現。本論文另外配合TSMC JDP 產學合作案，提出針對卷積網路優化的高效能硬體加速器與光學卷積神經網路架構。本論文第一個研究主題為高線性和高解析之大範圍溫度偵測器，係以正、負溫度係數電流產生器，搭配可調節差動對之延遲震盪單元組成的電流頻率轉換器，得到兩組因溫度呈現相反特性的頻率，經由頻率相除減少同相誤差和提高溫度間的差異。本設計可偵測之溫度範圍為-40 °C ~ 100 °C，並能偵測每0.5 °C 溫度變化，晶片量測的最大線性誤差為0.436 %。本論文第二個研究主題包含兩大電路系統，分別為高效能神經網路硬體加速器與光學神經網路架構。第一系統係提出專門用於卷積運算的新型硬體架構，係以高度平行化的運算單元陣列、記憶體資料處理的Reshape 模組、允許同步讀取與計算的設計，晶片量測可達54.61 GOPS 與96.35 mW 的高效能成果。第二系統提出新穎的光學卷積神經網路架構，係以光學晶片高速且低功耗的特點，處理大量繁瑣的卷積運算，配合電學電路進行特徵萃取，達成首例光電整合的神經架構。並在一萬筆MNIST 手寫辨識中，具有93.8% 的辨識率。
Abstract
As neural networks flourish, increasingly sophisticated models are emerging. Traditional hardware architectures struggle to achieve high throughput and low power at the same time. Consequently, hardware architectures designed specifically for neural networks have become key to resolving these issues. On-chip sensors, in turn, play a pivotal role in keeping operation stable and reliable. The first topic of this thesis presents a novel temperature sensing architecture that widens the detectable temperature range while achieving high linearity and high resolution. In addition, with the support of the TSMC JDP project, the second topic presents an efficient hardware accelerator optimized for convolutional neural networks (CNNs) and an innovative photonic neural network architecture. The high-linearity, high-resolution, wide-range temperature detector couples PTAT and CTAT current generators to a current-to-frequency converter composed of differential delay cells, yielding two sets of frequencies with opposite temperature characteristics. Dividing these frequencies reduces common-mode errors and enhances the differentiation between temperatures. The proposed sensor measures temperatures from -40 °C to 100 °C with 0.5 °C resolution. On-silicon measurement shows a maximum linearity error of 0.436 %. The second topic comprises two systems: a high-performance neural network hardware accelerator and a photonic neural network architecture. The first system presents a novel hardware architecture designed specifically for convolutional computation. In addition to a highly parallel PE array, the accelerator features a reshape module for memory data processing and a design that allows reading and computation to proceed simultaneously. On-silicon measurement shows a performance of 54.61 GOPS at 96.35 mW. The second system presents an innovative photonic neural network (PNN) architecture. Taking advantage of the high speed and low power consumption of optical devices, it carries out extensive convolutional computation combined with feature extraction. In testing on 10,000 MNIST samples, a recognition rate of 93.8% is achieved.
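The figures quoted in the abstract can be turned into a few derived quantities, and the frequency-division idea can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the linear PTAT/CTAT frequency coefficients in `frequency_ratio` are invented for demonstration and are not taken from the thesis. It shows that a multiplicative error shared by both frequencies cancels in their ratio; it does not model the actual circuit.

```python
# Derived figures from the numbers reported in the abstract, plus a toy
# model of the frequency-ratio readout. All names are hypothetical.

GOPS = 54.61              # reported accelerator throughput (GOPS)
POWER_MW = 96.35          # reported accelerator power (mW)
T_MIN_C, T_MAX_C = -40.0, 100.0   # reported temperature range (deg C)
RESOLUTION_C = 0.5        # reported resolution (deg C)
MNIST_SAMPLES = 10_000    # reported number of MNIST test samples
MNIST_ACCURACY = 0.938    # reported recognition rate


def energy_efficiency_gops_per_watt(gops, power_mw):
    """Throughput divided by power, with power converted from mW to W."""
    return gops / (power_mw / 1000.0)


def temperature_steps(t_min, t_max, resolution):
    """Number of resolution-sized steps spanning the temperature range."""
    return round((t_max - t_min) / resolution)


def correct_predictions(samples, accuracy):
    """Number of correctly classified samples implied by the accuracy."""
    return round(samples * accuracy)


def frequency_ratio(temp_c, common_gain=1.0):
    """Toy readout: ratio of a rising (PTAT) to a falling (CTAT) frequency.

    ASSUMPTION: both frequencies are linear in temperature with made-up
    coefficients, and `common_gain` models an error that scales both
    frequencies equally (for example a shared converter gain drift).
    """
    f_ptat = common_gain * (1.0e6 + 4.0e3 * temp_c)  # rises with temperature
    f_ctat = common_gain * (2.0e6 - 3.0e3 * temp_c)  # falls with temperature
    return f_ptat / f_ctat


if __name__ == "__main__":
    print("GOPS/W:", energy_efficiency_gops_per_watt(GOPS, POWER_MW))
    print("0.5 deg C steps:", temperature_steps(T_MIN_C, T_MAX_C, RESOLUTION_C))
    print("correct samples:", correct_predictions(MNIST_SAMPLES, MNIST_ACCURACY))
    # The shared gain error cancels in the ratio (up to rounding).
    print("ratio, gain 1.00:", frequency_ratio(25.0, 1.00))
    print("ratio, gain 1.05:", frequency_ratio(25.0, 1.05))
```

With the reported numbers, the accelerator works out to roughly 567 GOPS/W, the sensor range spans 280 steps of 0.5 °C, and the 93.8% recognition rate corresponds to 9,380 of the 10,000 samples.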

目次 Table of Contents
論文審定書 Thesis Approval Form
論文摘要 Abstract (Chinese)
Abstract
目錄 Table of Contents
圖目錄 List of Figures
表目錄 List of Tables
1 背景說明與研究動機 Background and Research Motivation
1.1 前言 Introduction
1.2 相關文獻探討 Literature Review
1.2.1 溫度偵測器架構介紹 Temperature Detector Architectures
1.2.2 類神經網路硬體加速器 Neural Network Hardware Accelerators
1.2.3 光學神經網路介紹 Photonic Neural Networks
1.3 研究動機 Research Motivation
1.3.1 高線性與高解析之大範圍溫度偵測器 High-Linearity and High-Resolution Wide Range Temperature Detector
1.3.2 高效能神經網路硬體加速器、光學卷積神經網路架構 High-Performance Neural Network Hardware Accelerator and Photonic Convolutional Neural Network Architecture
1.4 論文大綱 Thesis Outline
2 高線性與高解析之大範圍溫度偵測器 High-Linearity and High-Resolution Wide Range Temperature Detector
2.1 簡介 Overview
2.2 系統架構與原理說明 System Architecture and Operating Principle
2.3 系統之子電路設計與分析 Sub-circuit Design and Analysis
2.3.1 正溫度係數(PTAT) 頻率產生器 Positive-Temperature-Coefficient (PTAT) Frequency Generator
2.3.2 負溫度係數頻率產生器 Negative-Temperature-Coefficient Frequency Generator
2.3.3 電流頻率轉換器 Current-to-Frequency Converter
2.3.4 測試電路 Test Circuit
2.4 電路模擬結果與預計規格 Circuit Simulation Results and Target Specifications
2.5 電路佈局前後模擬比較 Pre-Layout and Post-Layout Simulation Comparison
2.6 晶片佈局與晶片照相 Chip Layout and Die Photo
2.7 晶片量測環境及結果 Chip Measurement Environment and Results
2.7.1 量測環境與設定 Measurement Environment and Setup
2.7.2 量測結果 Measurement Results
2.8 量測結果與討論 Measurement Results and Discussion
2.8.1 量測結果 Measurement Results
2.8.2 量測結果與佈局後模擬比較分析 Comparison of Measurement Results and Post-Layout Simulation
2.9 電路規格以及與先前文獻比較 Circuit Specifications and Comparison with Prior Work
2.9.1 電路規格比較與分析 Specification Comparison and Analysis
2.9.2 與先前文獻比較 Comparison with Prior Work
3 高效能神經網路硬體加速器與光學卷積神經網路 High-Performance Neural Network Hardware Accelerator and Photonic Convolutional Neural Network
3.1 簡介與架構 Overview and Architecture
3.2 第一電路系統-類神經網路原理說明 First System: Neural Network Principles
3.3 第一電路系統-硬體加速器電路原理與設計 First System: Hardware Accelerator Circuit Principles and Design
3.3.1 處理單元陣列(PE array) Processing Element Array (PE array)
3.3.2 Reshape and Line Buffer
3.3.3 Controller
3.3.4 靜態隨機存取記憶體(SRAM) Static Random-Access Memory (SRAM)
3.4 硬體加速器模擬與效能分析 Hardware Accelerator Simulation and Performance Analysis
3.4.1 類神經網路加速器系統模擬 Neural Network Accelerator System Simulation
3.4.2 Clock Tree 分析 Clock Tree Analysis
3.4.3 可測試性設計(Design for Testability, DFT) 模擬結果 Design for Testability (DFT) Simulation Results
3.5 DLA 晶片實現 DLA Chip Implementation
3.6 DLA 晶片量測 DLA Chip Measurement
3.6.1 DLA 量測環境 DLA Measurement Environment
3.6.2 DLA 之狀態機跳轉量測結果 Measured DLA State Machine Transitions
3.7 DLA 電路規格與文獻比較 DLA Specifications and Comparison with Prior Work
3.8 第二電路系統-光學卷積神經網路電路與架構 Second System: Photonic Convolutional Neural Network Circuits and Architecture
3.8.1 PNN 原理與光學晶片 PNN Principles and the Photonic Chip
3.8.2 全連接層之Hidden Layer 層模擬 Simulation of the Hidden Layer in the Fully Connected Layers
3.8.3 ZCU111 全連接層電路架構 ZCU111 Fully Connected Layer Circuit Architecture
3.8.4 DAC 與ADC 電路系統 DAC and ADC Circuits
3.9 PNN 模擬結果 PNN Simulation Results
3.9.1 FPGA 模擬與分析 FPGA Simulation and Analysis
3.9.2 PNN 實驗與結果 PNN Experiments and Results
3.10 兩大電路系統測試結果與討論 Test Results and Discussion of the Two Systems
3.10.1 DLA 結果與討論 DLA Results and Discussion
3.10.2 PNN 結果與討論 PNN Results and Discussion
4 結論與未來研究方向 Conclusion and Future Work
4.1 研究結果與結論 Results and Conclusion
4.2 未來研究方向 Future Work
4.2.1 高線性與高解析之大範圍溫度偵測器 High-Linearity and High-Resolution Wide Range Temperature Detector
4.2.2 高效能神經網路硬體加速器與光學卷積神經網路 High-Performance Neural Network Hardware Accelerator and Photonic Convolutional Neural Network
Reference

電子全文 Fulltext
本電子全文僅授權使用者為學術研究之目的，進行個人非營利性質之檢索、閱讀、列印。請遵守中華民國著作權法之相關規定，切勿任意重製、散佈、改作、轉貼、播送，以免觸法。 The electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research. Please comply with the Copyright Act of the Republic of China and do not reproduce, distribute, adapt, repost, or broadcast it without authorization.
論文使用權限 Thesis access permission	自定論文開放時間 User-defined release date
開放時間 Available	校內 Campus：2027-01-15	校外 Off-campus：2027-01-15
紙本論文 Printed copies
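紙本論文的公開資訊在102學年度以後相對較為完整。如果需要查詢101學年度以前的紙本論文公開資訊，請聯繫圖資處紙本論文服務櫃台。如有不便之處敬請見諒。 Public information on printed theses is relatively complete from Academic Year 102 onward. For printed theses from Academic Year 101 or earlier, please contact the printed thesis service desk of the Office of Library and Information Services. We apologize for any inconvenience.
開放時間 Available	2027-01-15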
