Real-Time Semantic Segmentation with a Bilateral Segmentation Network Assisted by Coordinate Attention and Edge Detection
Abstract:
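Semantic segmentation has long been an important topic in computer vision. In recent years,
convolutional neural network (CNN) approaches have evolved from the early encoder-decoder architecture
into the wide variety of architectures in use today. For semantic segmentation, spatial information and the receptive field are both indispensable.
To speed up semantic segmentation, however, almost all methods compromise on image resolution and low-level detail, which causes a large drop in accuracy.
In this work, we propose a new architecture based on the Bilateral Segmentation Network (BiSeNet), called BiSeNet V3.
We introduce a new feature refinement module to optimize the feature maps and a feature fusion module to combine features effectively,
together with an attention mechanism that helps the model extract contextual information. To capture features better, we also use edge detection to enhance boundary features.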
Architecture design: We propose an improved bilateral segmentation network (BiSeNet V3), an efficient and novel architecture for semantic segmentation.
Our work builds on the two-path architecture of BiSeNet and on the encoder that STDC Net designed for semantic segmentation, both of which we hold in high regard, and we redesign the modules inside the network.
The original Attention Refinement Module (ARM) and Feature Fusion Module (FFM) in BiSeNet miss spatial information.
Based on the idea of coordinate attention, we design the Coordinate Feature Refinement Module (CFRM)
and the Coordinate Feature Fusion Module (CFFM) to strengthen the feature-map representation of the overall architecture.
The CFRM refines the feature maps, and we use an edge detection method to assist this module; the CFFM uses coordinate attention to fuse feature maps from different levels effectively.
These techniques all make the segmentation results more accurate, and with inference speed taken into account, the proposed architecture achieves a good trade-off between speed and accuracy.
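To make the idea behind CFRM and CFFM more concrete, the following is a minimal, dependency-free sketch of the general coordinate attention pattern: pool the feature map along each spatial axis separately, turn the pooled values into gates, and reweight every position by its row gate and column gate. It is not the actual CFRM or CFFM, whose internals are given only in Figure 1. The function names are hypothetical, and the learned 1×1 convolutions that normally sit between pooling and gating are replaced by the identity for brevity.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def coordinate_gates(x):
    """Compute per-channel gates for a feature map x shaped [C][H][W].

    Returns (a_h, a_w): a_h[c][i] gates row i, a_w[c][j] gates column j.
    """
    channels = len(x)
    height = len(x[0])
    width = len(x[0][0])
    # Average over the width: one value per (channel, row).
    z_h = [[sum(x[c][i]) / width for i in range(height)]
           for c in range(channels)]
    # Average over the height: one value per (channel, column).
    z_w = [[sum(x[c][i][j] for i in range(height)) / height
            for j in range(width)]
           for c in range(channels)]
    # ASSUMPTION: the learned 1x1 convolutions are omitted (identity),
    # so the gates are just the sigmoid of the pooled values.
    a_h = [[sigmoid(v) for v in row] for row in z_h]
    a_w = [[sigmoid(v) for v in row] for row in z_w]
    return a_h, a_w


def apply_coordinate_attention(x):
    """Reweight each position by its row gate and its column gate."""
    a_h, a_w = coordinate_gates(x)
    return [[[x[c][i][j] * a_h[c][i] * a_w[c][j]
              for j in range(len(x[c][i]))]
             for i in range(len(x[c]))]
            for c in range(len(x))]


# Toy feature map: 1 channel, 2 rows, 3 columns.
feature = [[[1.0, -1.0, 0.0],
            [2.0, 2.0, 2.0]]]
refined = apply_coordinate_attention(feature)
```

Because the two gates are indexed by row and by column, the reweighting keeps positional information that a single globally pooled channel weight would discard.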
Figure 1: (a) Architecture of the Coordinate Feature Refinement Module (CFRM).
(b) Architecture of the Coordinate Feature Fusion Module (CFFM).
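Figure 2: Overview of BiSeNet V3. Edge/Seg-Head denotes the segmentation head, which consists of a 3×3 Conv-batchnorm-ReLU layer and a 1×1 Conv-batchnorm-ReLU layer.
The output dimension of the segmentation head is the number of classes. The operations inside the blue dashed box are our BiSeNet V3, while those inside the orange dashed box are used only during training.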
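As a rough illustration of the head described in the Figure 2 caption, the sketch below traces the tensor shape through a 3×3 layer followed by a 1×1 layer. The stride of 1, the padding of 1 on the 3×3 layer, the intermediate channel count, and the example sizes are all assumptions chosen for illustration; the caption only fixes the kernel sizes and the final channel count. Batch normalization and ReLU do not change the shape, so only the convolutions matter here.

```python
def conv_bn_relu_shape(in_shape, out_channels, kernel_size, padding):
    """Output shape (C, H, W) of a stride-1 Conv-batchnorm-ReLU layer."""
    _, height, width = in_shape
    out_h = height + 2 * padding - kernel_size + 1
    out_w = width + 2 * padding - kernel_size + 1
    return (out_channels, out_h, out_w)


def head_output_shape(in_shape, mid_channels, num_classes):
    """Shape after the 3x3 layer followed by the 1x1 layer."""
    # ASSUMPTION: padding=1 on the 3x3 layer keeps the spatial size.
    mid = conv_bn_relu_shape(in_shape, mid_channels, kernel_size=3, padding=1)
    # The 1x1 layer maps to one output channel per class.
    return conv_bn_relu_shape(mid, num_classes, kernel_size=1, padding=0)


# Hypothetical sizes: 256 input channels, a 64x128 feature map, 19 classes.
out_shape = head_output_shape((256, 64, 128), mid_channels=64, num_classes=19)
```

Under these assumptions the head preserves the spatial size and changes only the channel dimension, ending with one channel per class.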
Made
by ´¿¬RÞ³