TY - GEN
T1 - Stereoscopic Video Quality Assessment Using Modified Parallax Attention Module
AU - Imani, Hassan
AU - Zaim, Selim
AU - Islam, Md Baharul
AU - Junayed, Masum Shah
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - Deep learning techniques are used for most computer vision tasks. In particular, Convolutional Neural Networks (CNNs) have shown strong performance in detection and classification tasks. Recently, in the field of Stereoscopic Video Quality Assessment (SVQA), 3D CNNs have been used to extract spatial and temporal features from stereoscopic videos, but disparity information, which is very important, has not been considered well. Most recently proposed deep learning-based methods use cost volumes to produce the stereo correspondence for large disparities. Because disparities can differ considerably for stereo cameras with different configurations, the Parallax Attention Mechanism (PAM) was recently proposed; it captures the stereo correspondence regardless of disparity variations. In this paper, we propose a new SVQA model that uses a base 3D CNN-based network and a modified PAM-based left and right feature fusion model. First, we use 3D CNNs and residual blocks to extract features from the left and right views of a stereo video patch. Then, we modify the PAM model to fuse the left and right features while taking the disparity information into account, and we use fully connected layers to calculate the quality score of a stereoscopic video. We divide the input videos into cube patches for data augmentation and remove cubes that confuse our model from the training dataset. Two standard SVQA benchmarks, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, are used to train and test our model. Experimental results indicate that our proposed model is very competitive with the state-of-the-art methods on the NAMA3DS1-COSPAD1 dataset and is the state-of-the-art method on the LFOVIAS3DPh2 dataset.
AB - Deep learning techniques are used for most computer vision tasks. In particular, Convolutional Neural Networks (CNNs) have shown strong performance in detection and classification tasks. Recently, in the field of Stereoscopic Video Quality Assessment (SVQA), 3D CNNs have been used to extract spatial and temporal features from stereoscopic videos, but disparity information, which is very important, has not been considered well. Most recently proposed deep learning-based methods use cost volumes to produce the stereo correspondence for large disparities. Because disparities can differ considerably for stereo cameras with different configurations, the Parallax Attention Mechanism (PAM) was recently proposed; it captures the stereo correspondence regardless of disparity variations. In this paper, we propose a new SVQA model that uses a base 3D CNN-based network and a modified PAM-based left and right feature fusion model. First, we use 3D CNNs and residual blocks to extract features from the left and right views of a stereo video patch. Then, we modify the PAM model to fuse the left and right features while taking the disparity information into account, and we use fully connected layers to calculate the quality score of a stereoscopic video. We divide the input videos into cube patches for data augmentation and remove cubes that confuse our model from the training dataset. Two standard SVQA benchmarks, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, are used to train and test our model. Experimental results indicate that our proposed model is very competitive with the state-of-the-art methods on the NAMA3DS1-COSPAD1 dataset and is the state-of-the-art method on the LFOVIAS3DPh2 dataset.
KW - 3D convolutional neural networks
KW - Deep learning
KW - Disparity
KW - Parallax attention mechanism
KW - Quality assessment
KW - Stereoscopic video
UR - http://www.scopus.com/inward/record.url?scp=85119829051&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-90421-0_4
DO - 10.1007/978-3-030-90421-0_4
M3 - Conference contribution
AN - SCOPUS:85119829051
SN - 9783030904203
T3 - Lecture Notes in Mechanical Engineering
SP - 39
EP - 50
BT - Digitizing Production Systems - Selected Papers from ISPR 2021
A2 - Durakbasa, Numan M.
A2 - Gençyılmaz, M. Güneş
PB - Springer Science and Business Media Deutschland GmbH
T2 - International Symposium for Production Research, ISPR2021
Y2 - 7 October 2021 through 9 October 2021
ER -