TY - GEN
T1 - DeepPyNet
T2 - 36th International Conference on Image and Vision Computing New Zealand, IVCNZ 2021
AU - Jeny, Afsana Ahsan
AU - Islam, Md Baharul
AU - Aydin, Tarkan
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Recent advances in optical flow prediction have been made possible by feature pyramids and iterative refinement. However, downsampling in feature pyramids may cause foreground objects to merge with the background, and iterative processing can introduce errors in optical flow estimation. In particular, the motion of narrow and tiny objects can become invisible in the flow scene. We introduce a novel method for optical flow estimation, DeepPyNet, which comprises a feature extractor, a multi-channel cost volume, and a flow decoder. In this method, we propose a deep recurrent feature pyramid-based network for end-to-end optical flow estimation. Feature extraction from each pixel of the feature map preserves essential information without modifying the feature receptive field. Then, a multi-scale 4D correlation volume is built from the visual similarity of each pair of pixels. Finally, we use the multi-scale correlation volumes to continuously update the flow field through an iterative recurrent method. Experimental results demonstrate that DeepPyNet significantly reduces flow errors and achieves state-of-the-art performance on several datasets. Moreover, DeepPyNet is less complex, using only 6.1M parameters, 81% and 35% fewer than the popular FlowNet and PWC-Net+, respectively.
AB - Recent advances in optical flow prediction have been made possible by feature pyramids and iterative refinement. However, downsampling in feature pyramids may cause foreground objects to merge with the background, and iterative processing can introduce errors in optical flow estimation. In particular, the motion of narrow and tiny objects can become invisible in the flow scene. We introduce a novel method for optical flow estimation, DeepPyNet, which comprises a feature extractor, a multi-channel cost volume, and a flow decoder. In this method, we propose a deep recurrent feature pyramid-based network for end-to-end optical flow estimation. Feature extraction from each pixel of the feature map preserves essential information without modifying the feature receptive field. Then, a multi-scale 4D correlation volume is built from the visual similarity of each pair of pixels. Finally, we use the multi-scale correlation volumes to continuously update the flow field through an iterative recurrent method. Experimental results demonstrate that DeepPyNet significantly reduces flow errors and achieves state-of-the-art performance on several datasets. Moreover, DeepPyNet is less complex, using only 6.1M parameters, 81% and 35% fewer than the popular FlowNet and PWC-Net+, respectively.
KW - Convolutional neural network
KW - Feature pyramid networks
KW - Iterative recurrent unit
KW - Multi-scale correlation volume
KW - Optical flow estimation
UR - http://www.scopus.com/inward/record.url?scp=85124390712&partnerID=8YFLogxK
U2 - 10.1109/IVCNZ54163.2021.9653193
DO - 10.1109/IVCNZ54163.2021.9653193
M3 - Conference contribution
AN - SCOPUS:85124390712
T3 - International Conference Image and Vision Computing New Zealand
BT - Proceedings of the 2021 36th International Conference on Image and Vision Computing New Zealand, IVCNZ 2021
A2 - Cree, Michael J.
PB - IEEE Computer Society
Y2 - 9 December 2021 through 10 December 2021
ER -