AN EFFICIENT END-TO-END IMAGE COMPRESSION TRANSFORMER

Afsana Ahsan Jeny, Masum Shah Junayed, Md Baharul Islam

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1 Citation (Scopus)

Abstract

Image and video compression received significant research attention and expanded their applications. Existing entropy estimation-based methods combine with hyperprior and local context, limiting their efficacy. This paper introduces an efficient end-to-end transformer-based image compression model, which generates a global receptive field to tackle the long-range correlation issues. A hyper encoder-decoder-based transformer block employs a multi-head spatial reduction self-attention (MHSRSA) layer to minimize the computational cost of the self-attention layer and enable rapid learning of multi-scale and high-resolution features. A Casual Global Anticipation Module (CGAM) is designed to construct highly informative adjacent contexts utilizing channel-wise linkages and identify global reference points in the latent space for end-to-end rate-distortion optimization (RDO). Experimental results demonstrate the effectiveness and competitive performance of the KODAK dataset.

Original languageEnglish
Title of host publication2022 IEEE International Conference on Image Processing, ICIP 2022 - Proceedings
PublisherIEEE Computer Society
Pages1786-1790
Number of pages5
ISBN (Electronic)9781665496209
DOIs
Publication statusPublished - 2022
Externally publishedYes
Event29th IEEE International Conference on Image Processing, ICIP 2022 - Bordeaux, France
Duration: 16 Oct 202219 Oct 2022

Publication series

NameProceedings - International Conference on Image Processing, ICIP
ISSN (Print)1522-4880

Conference

Conference29th IEEE International Conference on Image Processing, ICIP 2022
Country/TerritoryFrance
CityBordeaux
Period16/10/2219/10/22

Keywords

  • Image compression
  • encoder-decoder
  • entropy model
  • transformer

Fingerprint

Dive into the research topics of 'AN EFFICIENT END-TO-END IMAGE COMPRESSION TRANSFORMER'. Together they form a unique fingerprint.

Cite this