Design and Implementation of Low-Power, Energy-Efficient Neural Network Training Hardware
Accelerators Based on Brain Floating-Point Computing and Sparsity Awareness
Abstract:
In this work, we propose a training processor with high performance and high flexibility, which we name EESA.
The proposed training processor features low power consumption, high throughput, and high energy efficiency. EESA exploits the sparsity of neuron activations to reduce both the number of memory accesses and the memory storage footprint,
realizing an efficient training accelerator. The proposed processor uses a novel reconfigurable computing architecture that sustains high performance during both forward propagation (FP) and backpropagation (BP).
The processor is implemented in TSMC 40 nm process technology, operates at 294 MHz, consumes 87.12 mW for the whole chip, and runs at a core voltage of 0.9 V. Across the entire chip,
we use the 16-bit brain floating-point (bfloat16) precision format for all numerical computation, and the processor ultimately achieves a high energy efficiency of 1.72 TOPS/W.
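For reference, the following minimal C sketch (an illustration only, not part of the EESA hardware) shows how bfloat16 relates to IEEE-754 single precision: it keeps the sign bit and the full 8-bit exponent but truncates the mantissa to 7 bits, so conversion amounts to rounding away the low 16 bits of a float32 word.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* bfloat16 keeps float32's sign bit and 8 exponent bits but only the
 * top 7 mantissa bits, so a value converts by dropping the low 16 bits
 * of the IEEE-754 single-precision encoding (NaN handling omitted). */
typedef uint16_t bf16;

static bf16 f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits += 0x7FFF + ((bits >> 16) & 1);  /* round to nearest even */
    return (bf16)(bits >> 16);
}

static float bf16_to_f32(bf16 h) {
    uint32_t bits = (uint32_t)h << 16;    /* low mantissa bits are zero */
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

int main(void) {
    float x = 3.14159f;
    bf16 h = f32_to_bf16(x);
    printf("%.6f -> 0x%04X -> %.6f\n", x, (unsigned)h, bf16_to_f32(h));
    return 0;
}

Because the exponent width matches float32, bfloat16 preserves the same dynamic range at half the storage cost, which is why it suits training, where gradients span many orders of magnitude.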
Research Contributions:
• The proposed processor uses the novel 16-bit brain floating point (bfloat16) precision format for all numerical computation (see the sketch after the abstract above).
• A novel reconfigurable processing element (PE) architecture performs both the training and inference phases of fully connected layers (see the sketch under "Proposed Overall Architecture" below).
• The proposed processor exploits neuron sparsity, combined with our lab's optimized memory-access scheme, to reduce the memory footprint and the number of memory accesses required by the forward- and backward-propagation computations, improving energy efficiency (see the sketch after this list).
• The proposed hardware design is implemented in TSMC 40 nm process technology; at 294 MHz and a 0.9 V core voltage it achieves 87.12 mW power consumption and 1.72 TOPS/W energy efficiency.
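A minimal C sketch of the zero-skipping idea behind the sparsity optimization (a hypothetical illustration; the actual EESA memory-access scheme is implemented in hardware and is not reproduced here): activations after ReLU are largely zero, so storing only nonzero (value, index) pairs shrinks the activation buffer and skips the corresponding weight fetches entirely.

#include <stddef.h>
#include <stdio.h>

typedef struct {
    float value;   /* nonzero activation */
    size_t index;  /* its position in the dense vector */
} sparse_elem;

/* Dense dot product: always n weight reads. */
float dot_dense(const float *a, const float *w, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * w[i];
    return acc;
}

/* Sparse dot product: only nnz weight reads, one per nonzero activation. */
float dot_sparse(const sparse_elem *a, size_t nnz, const float *w) {
    float acc = 0.0f;
    for (size_t i = 0; i < nnz; i++)
        acc += a[i].value * w[a[i].index];
    return acc;
}

int main(void) {
    float act[8] = {0, 1.5f, 0, 0, 2.0f, 0, 0, 0.5f};  /* 5 of 8 are zero */
    float wgt[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    sparse_elem nz[] = {{1.5f, 1}, {2.0f, 4}, {0.5f, 7}};
    printf("dense  = %f\n", dot_dense(act, wgt, 8));
    printf("sparse = %f (3 weight reads instead of 8)\n",
           dot_sparse(nz, 3, wgt));
    return 0;
}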
Proposed Overall Architecture:
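The original architecture figures are not reproduced here. As a rough functional sketch of the reconfiguration idea (a C illustration under assumed behavior, not the actual PE-array RTL): the forward pass of a fully connected layer computes y = W·x, while the backward pass needs dx = Wᵀ·dy, so the same MAC datapath can serve both passes if the control logic swaps the row/column roles of the weight matrix.

#include <stddef.h>
#include <stdio.h>

#define IN  4
#define OUT 3

typedef enum { MODE_FP, MODE_BP } pe_mode;

/* One datapath, two wirings: FP walks rows of W, BP walks columns. */
void fc_pass(pe_mode mode, const float w[OUT][IN],
             const float *in, float *out) {
    size_t n_out = (mode == MODE_FP) ? OUT : IN;
    size_t n_in  = (mode == MODE_FP) ? IN  : OUT;
    for (size_t o = 0; o < n_out; o++) {
        float acc = 0.0f;
        for (size_t i = 0; i < n_in; i++)
            /* FP: acc += w[o][i]*x[i];  BP: acc += w[i][o]*dy[i] */
            acc += (mode == MODE_FP ? w[o][i] : w[i][o]) * in[i];
        out[o] = acc;
    }
}

int main(void) {
    float w[OUT][IN] = {{1,2,3,4},{5,6,7,8},{9,10,11,12}};
    float x[IN]   = {1, 0, -1, 2};
    float dy[OUT] = {1, 1, 1};
    float y[OUT], dx[IN];
    fc_pass(MODE_FP, w, x, y);    /* forward:  y  = W x    */
    fc_pass(MODE_BP, w, dy, dx);  /* backward: dx = W^T dy */
    printf("y  = %g %g %g\n", y[0], y[1], y[2]);
    printf("dx = %g %g %g %g\n", dx[0], dx[1], dx[2], dx[3]);
    return 0;
}

In hardware, this plausibly corresponds to re-routing operands rather than duplicating MAC units, which is consistent with the abstract's claim of sustained performance in both FP and BP.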
Implementation Results:
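The original results figures and tables are not reproduced here. As a rough cross-check of the headline numbers, assuming the 1.72 TOPS/W figure is measured at the full-chip power of 87.12 mW and the same operating point:

    1.72 TOPS/W × 0.08712 W ≈ 0.150 TOPS ≈ 150 GOPS peak throughput

i.e., roughly 150 giga-operations per second, which at 294 MHz is on the order of 510 operations per cycle.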
Made by 林定邦