AI & Machine Learning

StripWinformer: Locally-Enhanced Transformer for Image Motion Deblurring

Single image motion deblurring is a critical low-level computer vision task, aiming to restore clear and sharp images from motion-corrupted counterparts. Traditional image motion deblurring methods often face challenges in handling complex motion patterns and preserving fine details. Inspired by the success of vision transformer models in various tasks, we propose an innovative framework.....

Design of 3D Hand Mesh Reconstruction from Monocular Image

Graph Convolutional Networks (GCNs) are well-suited for human action recognition using skeleton data, as they handle non-Euclidean structures like human joints and avoid issues with environmental noise affecting RGB images. However, GCNs often suffer from high latency and low power efficiency on CPU and GPU platforms due to computational complexity. To address this.....

Real-Time Semantic Segmentation Using a Bilateral Segmentation Network with Coordinate Attention and Edge-Detection Assistance

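Semantic segmentation has long been an important topic in computer vision. In recent years, convolutional neural network (CNN) approaches have evolved from the earlier encoder-decoder architecture to the wide variety of architectures in use today. For semantic segmentation, spatial information and the receptive field are both indispensable. Yet almost all methods choose to compromise on image resolution and low-level detail, which leads to a large drop in accuracy. In this paper.....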

Memory Access Optimization for On-Chip Transfer Learning

Training deep neural networks (DNNs) at the edge faces the challenge of high energy consumption, because gradient calculations require a large number of memory accesses....

Encoder-Decoder-Based Monocular Depth Estimation with a Non-Local Decoder Squeeze-and-Excitation Network and an Adaptive Depth List

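Monocular depth estimation is an important topic in computer vision. In recent years, encoder-decoder architectures based on convolutional neural networks have shown reasonable results. With a powerful encoder, even a simple upsampling process has been found to achieve good accuracy....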

Monocular 3D Based Human Pose Estimation with Refinement Block and Special Loss Function

We present an architecture for monocular 3D human pose estimation (HPE). It uses a multi-loss method based on 2D heatmaps and volumetric heatmaps, together with a refinement block, to locate the root-relative 3D human pose....

Self-Defined Text-dependent Wake-Up-Words Speaker Recognition System

In recent years, wake-up-word (WUW) technology has become highly developed in some speaker recognition systems. Speaker recognition is the process of verifying a person's claimed identity from their voice characteristics, and it can be efficiently deployed in some consumer applications....

Multitask Learning on 3D Hand Pose Estimation with Continuous Joints Heatmap

In recent years, deep learning algorithms have been accelerated with GPUs or other volume acceleration hardware, and deep neural networks have gained significant improvements in various tasks....

A Single-Stage Face Detection and Face Recognition Deep Neural Network Based on Feature Pyramid and Triplet Loss

A practical deep learning face recognition system can be divided into several tasks. These tasks can be time-consuming if each one is executed with the original image as its input data....

G2LGAN: Data Augmentation for Imbalanced Datasets Applied to Wafer Map Defect Classification

Semiconductor manufacturing requires multiple complex chemical processes. An error at any step can cause defects in the produced wafers. Therefore, wafer map defect classification is a key task for the semiconductor industry to maintain and improve yield.....

Speech Densely Connected Convolutional Networks for Small Footprint Keyword Spotting

In a society where human-computer interaction is becoming increasingly important, voice assistants that use voice recognition to drive or control devices are becoming more common....

Dual-Sequences Gated Attention Unit Architecture for Speaker Verification

Speaker verification (SV) is the process of verifying a person's claimed identity from their voice characteristics, which are recorded by a device such as a microphone. A speaker verification system can be text-dependent or text-independent....
