Design and Implementation of Low-Power, Energy-Efficient Neural Network Training Hardware

Accelerators Based on Brain Floating-Point Computing and Sparsity Awareness

Abstract:

In this work, we propose a high-performance, highly flexible training processor, which we name EESA. The proposed training processor features low power consumption, high throughput, and high energy efficiency. EESA exploits the sparsity of neuron activations to reduce both the number of memory accesses and the amount of memory storage required, realizing an efficient training accelerator. The proposed processor uses a novel reconfigurable computing architecture that maintains high performance during both forward propagation (FP) and backward propagation (BP). The processor is implemented in a TSMC 40 nm process; it runs at an operating frequency of 294 MHz, the whole chip consumes 87.12 mW, and the core voltage is 0.9 V. Across the entire chip, all numerical computation is carried out in the 16-bit brain floating-point (bfloat16) precision format, and the processor ultimately achieves a high energy efficiency of 1.72 TOPS/W.
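For concreteness, bfloat16 keeps the sign bit and the full 8-bit exponent of IEEE-754 float32 but truncates the 23-bit mantissa to 7 bits, so it covers the same dynamic range in half the storage width. The C sketch below shows the usual software conversion with round-to-nearest-even; it is a minimal illustration (function names are ours, NaN/infinity handling is omitted), not the chip's actual datapath.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Truncate IEEE-754 float32 to bfloat16: keep the sign and the full 8-bit
 * exponent, round the 23-bit mantissa down to 7 bits (round-to-nearest-even).
 * NaN/infinity handling is omitted for brevity. */
static uint16_t float_to_bf16(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);              /* type-pun via memcpy */
    uint32_t round = 0x7FFFu + ((bits >> 16) & 1u);
    return (uint16_t)((bits + round) >> 16);
}

/* Expand bfloat16 back to float32 by zero-filling the dropped mantissa bits. */
static float bf16_to_float(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

int main(void)
{
    float x = 3.14159265f;
    uint16_t h = float_to_bf16(x);
    printf("%.8f -> 0x%04X -> %.8f\n", x, h, bf16_to_float(h));
    return 0;
}
```

Because only the mantissa shrinks, the conversion never overflows the exponent, which is one reason bfloat16 is attractive for training, where gradients span a wide dynamic range.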


Research Contributions:

• The proposed processor uses 16-bit brain floating point (bfloat16), a novel arithmetic precision format (see the conversion sketch under the abstract above).

• A novel reconfigurable processing element (PE) architecture carries out both the training and inference phases of fully connected layers (a software illustration of the idea appears after this list).

• The proposed processor exploits neuron sparsity, combined with an optimized memory-access method proposed by our laboratory, to reduce the memory space and the number of memory accesses required by the forward- and backward-propagation computations, thereby improving energy efficiency (see the sketch after this list).

• The proposed hardware design is implemented in a TSMC 40 nm process and achieves 87.12 mW power consumption and 1.72 TOPS/W energy efficiency at 294 MHz and a 0.9 V core voltage. (At these figures, the implied effective throughput is roughly 1.72 TOPS/W × 87.12 mW ≈ 150 GOPS.)

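As referenced in the contribution list above, the sketch below illustrates in plain C what activation-sparsity skipping buys during fully connected training: when a post-ReLU input activation is zero, the matching weight column is never fetched or multiplied, in either the forward pass y = W·a or the weight-gradient step dW = dy·aᵀ. Everything here (function names, dense array layout, float arithmetic instead of bfloat16) is an assumed software stand-in for exposition, not the EESA PE array or its memory-access scheme.

```c
#include <stddef.h>

/* Toy fully connected layer with activation-sparsity skipping. Whenever an
 * input activation a[j] is zero (common after ReLU), every multiply that
 * would consume it is skipped -- in the forward pass y = W*a and in the
 * weight-gradient step dW = dy * a^T -- along with the weight-column fetch
 * that would feed those multiplies. */

void fc_forward_sparse(const float *W, const float *a, float *y,
                       size_t out_dim, size_t in_dim)
{
    for (size_t i = 0; i < out_dim; i++)
        y[i] = 0.0f;
    for (size_t j = 0; j < in_dim; j++) {
        if (a[j] == 0.0f)
            continue;                  /* skip column j: no fetch, no MACs */
        for (size_t i = 0; i < out_dim; i++)
            y[i] += W[i * in_dim + j] * a[j];
    }
}

void fc_weight_grad_sparse(const float *dy, const float *a, float *dW,
                           size_t out_dim, size_t in_dim)
{
    for (size_t j = 0; j < in_dim; j++) {
        if (a[j] == 0.0f) {            /* entire column of dW is zero */
            for (size_t i = 0; i < out_dim; i++)
                dW[i * in_dim + j] = 0.0f;
            continue;
        }
        for (size_t i = 0; i < out_dim; i++)
            dW[i * in_dim + j] = dy[i] * a[j];
    }
}
```

Each skipped zero activation removes out_dim multiply-accumulates plus the matching weight traffic at once, which is why activation sparsity cuts both computation and memory accesses during training.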
Proposed Overall Architecture:


Implementation Results:


Made by 林定邦