Deep Imbalanced Attribute Classification Using Visual Attention Aggregation

Nikolaos Sarafianos, Xiang Xu, Ioannis A. Kakadiaris

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

37 Scopus citations

Abstract

For many computer vision applications, such as image description and human identification, recognizing the visual attributes of humans is an essential yet challenging problem. The difficulty stems from the problem's multi-label nature, the large underlying class imbalance, and the lack of spatial annotations. Existing methods either follow a computer vision approach that fails to account for class imbalance, or explore machine learning solutions that disregard the spatial and semantic relations present in the images. With that in mind, we propose an effective method that extracts and aggregates visual attention masks at different scales. We introduce a loss function that handles class imbalance at both the class and the instance level, and further demonstrate that penalizing attention masks with high prediction variance accounts for the weak supervision of the attention mechanism. By identifying and addressing these challenges, we achieve state-of-the-art results with a simple attention mechanism on both the PETA and WIDER-Attribute datasets without additional context or side information.
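As a rough illustration of what handling imbalance "at the class level and at the instance level" can mean for a multi-label loss, the sketch below combines a per-attribute weight (class level) with a focal-style modulating factor (instance level). It is a generic example and not the loss from the paper: the weighting rule `exp(1 - p)`, the function names, and the default `gamma` are all assumptions chosen for illustration.

```python
import math


def class_weights(positive_ratios):
    """Return one weight per attribute; rarer attributes get larger weights.

    The rule exp(1 - p) is an assumption for illustration only.
    """
    return [math.exp(1.0 - p) for p in positive_ratios]


def weighted_focal_bce(probs, labels, weights, gamma=2.0, eps=1e-7):
    """Mean multi-label loss for a single sample.

    probs   -- predicted probabilities, one per attribute
    labels  -- 0/1 ground truth, one per attribute
    weights -- per-attribute weights (class-level balancing)
    gamma   -- focusing exponent (instance-level balancing); 0 disables it
    """
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)      # clamp to keep log() finite
        pt = p if y == 1 else 1.0 - p        # probability given to the true label
        # Easy instances (pt near 1) are down-weighted by (1 - pt) ** gamma.
        total += -w * (1.0 - pt) ** gamma * math.log(pt)
    return total / len(probs)


if __name__ == "__main__":
    ratios = [0.05, 0.60]                    # hypothetical positive rates per attribute
    w = class_weights(ratios)
    print(weighted_focal_bce([0.2, 0.9], [1, 1], w))
```

With `gamma=0` and unit weights the function reduces to ordinary binary cross-entropy, which makes it easy to check against a known baseline before turning on either balancing term.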

Original language: English (US)
Title of host publication: Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
Editors: Vittorio Ferrari, Cristian Sminchisescu, Yair Weiss, Martial Hebert
Publisher: Springer-Verlag
Pages: 708-725
Number of pages: 18
ISBN (Print): 9783030012519
DOIs
State: Published - 2018
Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: Sep 8 2018 to Sep 14 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11215 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th European Conference on Computer Vision, ECCV 2018
Country/Territory: Germany
City: Munich
Period: 9/8/18 to 9/14/18

Keywords

  • Deep imbalanced learning
  • Visual attention
  • Visual attributes

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
