Automatic Understanding of Image and Video Advertisements

Introduction

There is more to images than their objective physical content: for example, advertisements are created to persuade a viewer to take a certain action. We propose the novel problem of automatic advertisement understanding. To enable research on this problem, we create two datasets: an image dataset of 64,832 image ads, and a video dataset of 3,477 ads. Our data contains rich annotations encompassing the topic and sentiment of the ads, questions and answers describing what actions the viewer is prompted to take and the reasoning that the ad presents to persuade the viewer ("What should I do according to this ad, and why should I do it?"), and symbolic references ads make (e.g. a dove symbolizes peace). We also analyze the most common persuasive strategies ads use, and the capabilities that computer vision systems should have to understand these strategies. We present baseline classification results for several prediction tasks, including automatically answering questions about the messages of the ads.


Publications

Cross-Modality Personalization for Retrieval. Nils Murrugarra-Llerena and Adriana Kovashka. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2019. (Oral) [pdf] [supp] [poster] [data]

ADVISE: Symbolism and External Knowledge for Decoding Advertisements. Keren Ye and Adriana Kovashka. To appear, Proceedings of the European Conference on Computer Vision (ECCV), September 2018. [pdf] [supp] [related code]

Story Understanding in Video Advertisements. Keren Ye, Kyle Buettner, Adriana Kovashka. To appear, Proceedings of the British Machine Vision Conference (BMVC), September 2018. [pdf]

Persuasive Faces: Generating Faces in Advertisements. Christopher Thomas and Adriana Kovashka. To appear, Proceedings of the British Machine Vision Conference (BMVC), September 2018. [pdf] [supp]

Automatic Understanding of Image and Video Advertisements. Zaeem Hussain, Mingda Zhang, Xiaozhong Zhang, Keren Ye, Christopher Thomas, Zuha Agha, Nathan Ong, Adriana Kovashka. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017. (Spotlight) [pdf] [supplementary] [poster] [spotlight]


Funding

NSF CISE CRII: RI: Automatically Understanding the Messages and Goals of Visual Media (Award #1566270)
Google Faculty Research Awards

This material is based upon work supported by the National Science Foundation under Grant Number 1566270. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.


Image Data and Annotations

[interface with new object transformation annotations] Explore the image dataset visualization here (thanks to Mingda Zhang and Narges Honarvar Nazari)

[original interface] Explore the image dataset visualization here

Type	Count	Example
Topic	204,340	Electronics
Sentiment	102,340	Cheerful
Q+A	202,090	I should bike because it’s healthy.
Symbol	64,131	Danger (+ bounding box)
Strategy	20,000	Contrast
Slogan	11,130	Save the planet... save you.

Readme
Download images: Email us for image URLs
Download image annotations
Ad or not-ad classifier (thanks to Chris Thomas)


Video Data and Annotations

Type	Count	Example
Topic	17,345	Cars and automobiles, Safety
Sentiment	17,345	Cheerful, Amazed
Action/Reason	17,345	I should buy this car because it is pet-friendly.
Funny?	17,374	Yes/No
Exciting?	17,374	Yes/No
English?	15,380	Yes/No/Does not matter
Effective?	16,721	Not/.../Extremely Effective
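
The Q+A and Action/Reason examples in the tables above follow the pattern "I should [action] because [reason]." As a minimal sketch, assuming a free-text statement follows that pattern, the action and the reason could be separated as shown below. The helper name `split_action_reason` is hypothetical and is not part of the released annotations or any accompanying tooling; the format of the annotation files themselves is not described on this page, so this operates only on a single statement string.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: assumes statements look like
# "I should <action> because <reason>." as in the table examples.
_STATEMENT = re.compile(
    r"^\s*I should\s+(?P<action>.+?)\s+because\s+(?P<reason>.+?)\s*$",
    re.IGNORECASE,
)


def split_action_reason(statement: str) -> Optional[Tuple[str, str]]:
    """Split an 'I should X because Y.' statement into (X, Y).

    Returns None when the statement does not follow the assumed pattern.
    """
    match = _STATEMENT.match(statement)
    if match is None:
        return None
    # Drop a trailing period from the reason, if present.
    reason = match.group("reason").rstrip(".")
    return match.group("action"), reason


if __name__ == "__main__":
    print(split_action_reason("I should buy this car because it is pet-friendly."))
```

Statements that do not match the assumed pattern return `None` rather than a guess, so a caller can count or inspect them separately.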


Contact

For questions, issues, concerns, or comments, please email Adriana Kovashka at kovashka AT cs DOT pitt DOT edu.
