Recent advances in computer vision have set the stage for tackling increasingly challenging tasks. Many of these tasks, such as visual question answering and modeling physical forces, would until recently have been thought impossible for machines. These problems are challenging and interesting, but they all analyze images that arise naturally, i.e. ones taken by a photographer. In contrast, many images in the media, in particular image advertisements, are carefully and artificially constructed with a specific goal in mind: to convey a particular message to the target audience. This poses the interesting challenge of inferring not the physical content of the image, but its visual rhetoric. This rhetoric does rely on physical content, but the latter (1) is often portrayed in non-traditional ways, (2) must be involved in reasoning steps to infer what a certain juxtaposition of objects implies, and (3) must be understood in the context of cultural phenomena.

For example, consider the "Junk deer" ad. To infer its message, the viewer must first visually recognize a deer that is made of junk (a rather imaginative incarnation of a deer). Next, the viewer must reason about what this implies: the deer perhaps ate trash, which would be bad for the deer. Thus, the ad implies that pollution is harmful to wildlife. In "Melting earth", the viewer must recognize the melting process happening to Earth. In "Straws striving towards Pepsi can", the viewer must recognize that the straws are striving towards the can, and infer that this implies the contents of the can are desirable. In "Owl and coffee", the owl symbolizes wakefulness. In "Natural cow", the ice cream is natural because the "cow" is made of natural ingredients. In "Porcelain man", the man shares the texture of a porcelain vase and is similarly fragile. In "Zebra chasing lion", the viewer is surprised to see a zebra chasing a lion rather than vice versa. In "Heavy metal fries", the viewer must recognize a cultural symbol (the devil's horns popular in heavy metal culture). In "Boot crushing sandal", the viewer must "fill in" the presence of a man and a woman, and recognize an implied crushing action.
Ad images: Junk deer, Melting earth, Straws striving towards Pepsi can, Owl and coffee, Natural cow, Porcelain man, Zebra chasing lion, Heavy metal fries, Boot crushing sandal

Understanding advertisements poses many challenges and provides context for the types of problems we aim to solve in computer vision. Developing methods for automatically understanding the messages of ads requires participation from computer vision researchers with diverse backgrounds. Related topics include:

A large annotated dataset of image and video ads is available here. In this dataset, we provide over 64,000 ad images annotated with the topic of the ad (e.g. the product, or the topic in the case of public service announcements), the sentiment that the ad provokes, any symbolic references that the ad makes (e.g. an owl symbolizes wakefulness, ice symbolizes freshness, etc.), including bounding boxes containing the physical content that alludes symbolically to concepts outside of the ad, and questions and answers about the meaning of the ad ("What should I do according to the ad? Why should I do it, according to the ad?").
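To make the annotation types above concrete, here is a minimal sketch of how one annotated ad might be represented in code. It is an illustration only: the class names, field names, bounding-box convention, and example values are assumptions made for this sketch and do not describe the dataset's actual file format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical layout for illustration; NOT the dataset's real schema.
# Assumed box convention: (x_min, y_min, x_max, y_max) in pixels.
BoundingBox = Tuple[float, float, float, float]


@dataclass
class SymbolRegion:
    """A region of the image that alludes to a concept outside the ad."""
    symbol: str       # the concept referenced, e.g. "wakefulness"
    box: BoundingBox  # where the physical content appears in the image


@dataclass
class AdAnnotation:
    """One ad image with the four annotation types described above."""
    image_id: str
    topic: str                                           # product or PSA topic
    sentiments: List[str] = field(default_factory=list)  # sentiments provoked
    symbols: List[SymbolRegion] = field(default_factory=list)
    qa_pairs: List[Tuple[str, str]] = field(default_factory=list)


def symbols_in(ad: AdAnnotation) -> List[str]:
    """Return the distinct symbolic concepts annotated for an ad, sorted."""
    return sorted({region.symbol for region in ad.symbols})


# Made-up example values, loosely modeled on the "Owl and coffee" ad.
example = AdAnnotation(
    image_id="owl_and_coffee",
    topic="coffee",
    sentiments=["alert"],
    symbols=[SymbolRegion("wakefulness", (10.0, 20.0, 110.0, 140.0))],
    qa_pairs=[
        ("What should I do according to the ad?", "I should drink this coffee."),
        ("Why should I do it, according to the ad?", "Because it keeps me awake."),
    ],
)
```

Calling `symbols_in(example)` lists the concepts an ad refers to symbolically, which is the kind of lookup a symbolism-aware model would start from.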


Program and Speakers

Date: June 22, 2018
Location: Room 150 - DEF

Time Speaker/Topic
9am-9:30am welcome and brainstorming
9:30am-10am Jiebo Luo (Univ. of Rochester)
10am-10:30am Jesse Berent (Google) [slides]
10:30am-11am coffee break
11am-11:30am more brainstorming
11:30am-12pm challenge winner talk: Mayu Otani, Yuki Iwazaki, Kota Yamaguchi [slides]
1:30pm-2:15pm Jungseock Joo (UCLA) [slides]
2:15pm-3pm Lydia Chilton (Columbia Univ.) [slides]
3pm-4:30pm posters and coffee break:
  • Emotional Style Transfer for Stock Assets. Kazuhiro Ota and Kota Yamaguchi (CyberAgent, Inc.)
  • Understanding Visual Ads by Aligning Symbols and Objects using Co-Attention. Karuna Ahuja, Karan Sikka, Anirban Roy and Ajay Divakaran (SRI International)
  • Interpreting Visual Metaphors in Advertising. Savvas D Petridis and Lydia B Chilton (Columbia University)
4:30pm-5:15pm brainstorming and closing remarks



We are running a competition prior to the workshop, with results announced at the workshop. The tentative timeline is as follows: Please access the competition details and data here. We look forward to your submission!



We are looking for 3-page abstracts (work in progress, unpublished, or previously published work) on topics related to ad understanding (see example topics above).

Submission is now open: CVPRADS2018

Submission deadline: April 27, 2018 (extended)



Adriana Kovashka (Univ. of Pittsburgh)
James Hahn (Univ. of Pittsburgh)