AI & Machine Learning
StripWinformer: Locally-Enhanced Transformer for Image Motion Deblurring
Single image motion deblurring is a critical low-level computer vision task, aiming to restore clear and sharp images from motion-corrupted counterparts. Traditional image motion deblurring methods often face challenges in handling complex motion patterns and preserving fine details. Inspired by the success of vision transformer models in various tasks, we propose an innovative framework.....
Design of 3D Hand Mesh Reconstruction from Monocular Image
Graph Convolutional Networks (GCNs) are well-suited for human action recognition using skeleton data, as they handle non-Euclidean structures like human joints and avoid issues with environmental noise affecting RGB images. However, GCNs often suffer from high latency and low power efficiency on CPU and GPU platforms due to computational complexity. To address this.....
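Real-Time Semantic Segmentation Based on a Bilateral Segmentation Network with Coordinate Attention and Edge Detection Assistance
Semantic segmentation has long been an important topic in computer vision. In recent years, convolutional neural network (CNN) approaches have evolved from the early encoder-decoder architecture to the wide variety of architectures in use today. For semantic segmentation, spatial information and the receptive field are both indispensable. To make semantic segmentation run in real time, most methods compromise on image resolution and low-level detail, which leads to a substantial drop in accuracy. In this paper.....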
Memory Access Optimization for On-Chip Transfer Learning
Training deep neural networks (DNNs) at the edge faces the challenge of high energy consumption, because gradient calculations require a large number of memory accesses....
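Encoder-Decoder-Based Monocular Depth Estimation with a Non-Local Decoder Squeeze-and-Excitation Network and an Adaptive Depth List
Monocular depth estimation is an important topic in computer vision. In recent years, encoder-decoder architectures based on convolutional neural networks have shown reasonable results. With a strong encoder, even a simple upsampling process has been found to achieve good accuracy....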
MONOCULAR 3D BASED HUMAN POSE ESTIMATION WITH REFINEMENT BLOCK AND SPECIAL LOSS FUNCTION
In this work, we present a monocular 3D human pose estimation (HPE) architecture. We use a multi-loss method based on 2D heatmaps and volumetric heatmaps, together with a refinement block, to locate the root-relative 3D human pose....
Self-Defined Text-dependent Wake-Up-Words Speaker Recognition System
In recent years, wake-up-word (WUW) technology has become highly developed in some speaker recognition systems. Such a system verifies a person's claimed identity from their voice characteristics and can be efficiently deployed in some consumer applications....
Multitask Learning on 3D Hand Pose Estimation with Continuous Joints Heatmap
In recent years, deep learning algorithms have been accelerated with GPUs or other volume acceleration hardware, and deep neural networks have improved significantly in various tasks....
A Single-Stage Face Detection and Face Recognition Deep Neural Network Based on Feature Pyramid and Triplet Loss
A practical deep learning face recognition system can be divided into several tasks. These tasks can be time-consuming if each one is executed with the original image as its input....
G2LGAN: Data Augmentation for Imbalanced Datasets Applied to Wafer Map Defect Classification
Semiconductor manufacturing requires multiple complex chemical processes. An error in any step will cause defects in the produced wafers. Therefore, wafer map defect classification is a key task for the semiconductor industry to maintain and improve yield.....
Speech Densely Connected Convolutional Networks for Small Footprint Keyword Spotting
In a society where human-computer interaction is becoming increasingly important, voice assistants that use voice recognition to drive or control devices are becoming more common....
Dual-Sequences Gated Attention Unit Architecture for Speaker Verification
Speaker verification (SV) is the process of verifying a person's claimed identity from their voice characteristics, which are recorded by a device such as a microphone. A speaker verification system can be either text-dependent or text-independent....