PoseTED: A Novel Regression-Based Technique for Recognizing Multiple Pose Instances

Afsana Ahsan Jeny, Masum Shah Junayed, Md Baharul Islam

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Citation (Scopus)

Abstract

Pose estimation for multiple people can be viewed as a hierarchical set prediction problem: algorithms must correctly associate body parts with each person. Pose estimation methods fall into two categories: (1) heatmap-based and (2) regression-based. Heatmap-based techniques depend on various heuristic designs and are not end-to-end trainable, while regression-based methods involve fewer intermediate non-differentiable stages. This paper presents a novel regression-based multi-instance human pose recognition network called PoseTED. It uses the well-known object detector YOLOv4 for person detection and a spatial transformer network (STN) as a cropping filter. A CNN-based backbone then extracts deep features, and positional encoding with an encoder-decoder transformer is applied for keypoint detection, which addresses the heuristic design problem of earlier regression-based techniques and improves overall performance. A prediction-based feed-forward network (FFN) predicts the positions of several keypoints as a group and outputs the body parts. The method is evaluated on two public datasets, COCO and MPII, and achieves an average precision (AP) of 73.7% on the COCO val. dataset, 72.7% on the COCO test dev. dataset, and 89.7% on the MPII dataset. These results are comparable to state-of-the-art methods.
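
As a minimal illustration of the regression-based, top-down idea described in the abstract (detect each person, crop, then regress keypoint coordinates directly rather than decoding heatmaps), the sketch below maps keypoints regressed in crop-normalized coordinates back to image pixels. It is not the PoseTED implementation: the function name `crop_to_image`, the `(x_min, y_min, width, height)` box format, and the [0, 1] normalization are assumptions made for illustration only.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, width, height) in image pixels.
Box = Tuple[float, float, float, float]
# A keypoint as an (x, y) pair.
Point = Tuple[float, float]


def crop_to_image(keypoints_norm: List[Point], box: Box) -> List[Point]:
    """Map keypoints from crop-normalized [0, 1] coordinates to image pixels.

    Hypothetical helper: a regression head is assumed to output (u, v) pairs
    relative to the person crop, where (0, 0) is the crop's top-left corner
    and (1, 1) is its bottom-right corner.
    """
    x_min, y_min, width, height = box
    return [(x_min + u * width, y_min + v * height) for u, v in keypoints_norm]


# One detected person box and two regressed keypoints (illustrative values).
person_box = (100.0, 50.0, 40.0, 80.0)
regressed = [(0.5, 0.5), (0.0, 1.0)]
image_keypoints = crop_to_image(regressed, person_box)
```

Applying the same mapping to every detected person box yields one set of image-space keypoints per person, which is the grouping that a top-down pipeline obtains without a separate part-association step.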

Original language: English
Title of host publication: Advances in Visual Computing - 16th International Symposium, ISVC 2021, Proceedings
Editors: George Bebis, Vassilis Athitsos, Tong Yan, Manfred Lau, Frederick Li, Conglei Shi, Xiaoru Yuan, Christos Mousas, Gerd Bruder
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 573-585
Number of pages: 13
ISBN (Print): 9783030904388
Publication status: Published - 2021
Externally published: Yes
Event: 16th International Symposium on Visual Computing, ISVC 2021 - Virtual Online
Duration: 4 Oct 2021 to 6 Oct 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13017 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Symposium on Visual Computing, ISVC 2021
City: Virtual Online
Period: 4/10/21 to 6/10/21

Keywords

  • FFN
  • Keypoints estimation
  • Person detection
  • Pose recognition
  • STN
  • Transformer encoder-decoder
