Authors
Jia-Ming Lin, Kuan-Ting Lai, Bin-Ray Wu, Ming-Syan Chen
Publication date
2021
Conference
Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
Pages
3076-3080
Description
Action recognition is an important research field with many applications in surveillance, video search, autonomous vehicles, and more. However, current state-of-the-art action classifiers are still not widely adopted in embedded applications. The major reason is that action recognition must process both spatial and temporal streaming data to precisely identify actions, which is compute-intensive and power-hungry. To address this issue, researchers have started using FPGAs to run action recognition models with minimal power. In this paper, we propose a new hardware architecture for action recognition on FPGA. Our model is based on the popular two-stream neural network. By optimizing the optical flow and convolution operations in the temporal domain, our method achieves similar accuracy with an order of magnitude fewer operations than C3D baseline models. We have implemented our model on a Xilinx ZCU102 and released the source code.
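The two-stream design mentioned above combines a spatial (RGB) stream and a temporal (optical-flow) stream, typically by fusing their per-class scores. A minimal late-fusion sketch, assuming hypothetical score vectors and a simple weighted average (not the authors' FPGA implementation):

```python
# Illustrative two-stream late fusion. The function name, weights, and
# example scores below are assumptions for demonstration, not taken from
# the paper's released code.

def fuse_two_streams(spatial_scores, temporal_scores, w_spatial=0.5):
    """Weighted average of per-class scores from the two streams."""
    assert len(spatial_scores) == len(temporal_scores)
    return [w_spatial * s + (1.0 - w_spatial) * t
            for s, t in zip(spatial_scores, temporal_scores)]

# Hypothetical per-class scores for three candidate actions.
spatial = [0.7, 0.2, 0.1]    # from RGB frames
temporal = [0.4, 0.5, 0.1]   # from stacked optical-flow fields
fused = fuse_two_streams(spatial, temporal)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Equal weighting is the simplest choice; in practice the temporal stream is often weighted more heavily when optical flow is reliable.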
Scholar articles
JM Lin, KT Lai, BR Wu, MS Chen - Proceedings of the IEEE/CVF conference on computer …, 2021