TY - GEN
T1 - Triplet Loss-based Convolutional Neural Network for Static Sign Language Recognition
AU - Sadeghzadeh, Arezoo
AU - Islam, Md Baharul
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Sign language (SL) is a non-verbal visual language used as the primary means of communication by the deaf and hearing-impaired community. Because many SLs exist, each with wide variation, it is not feasible for the general public to master interpreting them. Despite recent advances in automatic sign language recognition (SLR) systems, their performance degrades severely on low-resolution images with large intra-class and slight inter-class variations. To address these issues, a novel end-to-end Convolutional Neural Network (CNN) is proposed to extract features from low-resolution input images. This feature extractor is trained with the semi-hard triplet loss function so that images belonging to the same class are placed close to one another in a lower-dimensional embedding space, while the distance between samples from different classes is maximized. In addition to the efficient loss function, proper selection of the filter and kernel sizes, activation functions, and regularization methods in the proposed CNN yields effective feature vectors from small images while reducing the number of parameters. The embedded features, which have a small fixed vector length, are used to train a Support Vector Machine (SVM) classifier for final recognition. Experimental results on two datasets from two SLs, American (MNIST) and Arabic (ArSL2018), with accuracies of 100% and 97.54%, respectively, show that the proposed model outperforms existing approaches without any need to enlarge the dataset through augmentation, which demonstrates its feasibility.
KW - CNN
KW - SVM
KW - feature embedding
KW - semi-hard triplet loss
KW - static sign language recognition
UR - http://www.scopus.com/inward/record.url?scp=85142729226&partnerID=8YFLogxK
U2 - 10.1109/ASYU56188.2022.9925490
DO - 10.1109/ASYU56188.2022.9925490
M3 - Conference contribution
AN - SCOPUS:85142729226
T3 - Proceedings - 2022 Innovations in Intelligent Systems and Applications Conference, ASYU 2022
BT - Proceedings - 2022 Innovations in Intelligent Systems and Applications Conference, ASYU 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 Innovations in Intelligent Systems and Applications Conference, ASYU 2022
Y2 - 7 September 2022 through 9 September 2022
ER -