Introduction

Recent advances in computer vision have set the stage for tackling increasingly challenging tasks. Many of these tasks, such as visual question answering and modeling physical forces, would until recently have been thought impossible for machines. These problems are challenging and interesting, but they all analyze images that arise naturally, i.e. ones taken by a photographer. In contrast, many images in the media, in particular image advertisements, are carefully constructed with a specific goal in mind: to convey a particular message to the target audience. This poses the interesting challenge of inferring not the physical content of the image but its visual rhetoric. This rhetoric does rely on physical content, but that content (1) is often portrayed in non-traditional ways, (2) must be reasoned about to infer what a certain juxtaposition of objects implies, and (3) must be understood in the context of cultural phenomena.

For example, consider the "Junk deer" ad. To infer its message, the viewer must first visually recognize a deer made of junk (a rather imaginative incarnation of a deer). Next, the viewer must reason about what this implies: the deer perhaps ate trash, which would be bad for the deer. Thus, the ad implies that pollution is harmful to wildlife. In "Melting earth", the viewer must recognize the melting process happening to Earth. In "Straws striving towards Pepsi can", the viewer must recognize that the straws are striving towards the can, and infer that this implies the contents of the can are desirable. In "Owl and coffee", the owl symbolizes wakefulness. In "Natural cow", the ice cream is natural because the "cow" is made of natural ingredients. In "Porcelain man", the man shares the texture of a porcelain vase and is similarly fragile. In "Zebra chasing lion", the viewer is surprised to see a zebra chasing a lion rather than vice versa. In "Heavy metal fries", the viewer must recognize a cultural symbol (the devil's horns popular in heavy metal culture). In "Boot crushing sandal", the viewer must "fill in" the presence of a man and a woman, and recognize an implied crushing action.
Example ads: Junk deer; Melting earth; Straws striving towards Pepsi can; Owl and coffee; Natural cow; Porcelain man; Zebra chasing lion; Heavy metal fries; Boot crushing sandal

Understanding advertisements poses many challenges and provides context for the kinds of problems we aim to solve in computer vision. Developing methods for automatically understanding the messages of ads requires participation from computer vision researchers with diverse backgrounds.

Related topics include:

A large annotated dataset of image and video ads is available here. The dataset provides over 64,000 ad images annotated with: the topic of the ad (e.g. the product, or the subject in the case of public service announcements); the sentiment that the ad provokes; any symbolic references that the ad makes (e.g. an owl symbolizes wakefulness, ice symbolizes freshness), including bounding boxes around the physical content that alludes symbolically to concepts outside of the ad; and questions and answers about the meaning of the ad ("What should I do according to the ad? Why should I do it, according to the ad?").
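
To make the annotation types described above concrete, here is a minimal Python sketch of how one annotated ad might be represented in memory. It is an illustration only: the class names, field names, bounding-box convention, and all example values are assumptions made for this sketch, not the dataset's actual file format or contents.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SymbolRegion:
    """A symbolic reference: an image region and the concept it alludes to."""
    concept: str                     # e.g. "wakefulness"
    # Hypothetical convention: (x_min, y_min, x_max, y_max) in pixels.
    box: Tuple[int, int, int, int]


@dataclass
class QAPair:
    """A question about the ad's meaning and one annotator answer."""
    question: str
    answer: str


@dataclass
class AdAnnotation:
    """All annotation types listed in the text, gathered for one ad image."""
    image_id: str                    # hypothetical identifier
    topic: str                       # product, or subject of a public service announcement
    sentiment: str                   # sentiment the ad provokes
    symbols: List[SymbolRegion] = field(default_factory=list)
    qa: List[QAPair] = field(default_factory=list)

    def concepts(self) -> List[str]:
        """Return the distinct symbolic concepts in this ad, sorted."""
        return sorted({s.concept for s in self.symbols})


# Made-up example values, loosely based on the "Owl and coffee" ad.
example = AdAnnotation(
    image_id="owl_and_coffee",
    topic="coffee",
    sentiment="alert",  # invented label, for illustration only
    symbols=[SymbolRegion(concept="wakefulness", box=(40, 30, 220, 260))],
    qa=[QAPair(
        question="What should I do according to the ad?",
        answer="(annotator's free-text answer)",
    )],
)
```

Grouping the symbol's concept with its bounding box in a single record mirrors the description above, in which each box marks the physical content that alludes to a concept outside the ad.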

Speakers

Speaker (affiliation): tentative topics
Shih-Fu Chang (Columbia Univ.): multimedia, knowledge discovery and representation, sentiment
Fei-Fei Li (Stanford Univ.) (tentative): reasoning and knowledge for question answering, language and vision
Jungseock Joo (UCLA): visual persuasion, political ads
Zoya Bylinskii (MIT): memorability, saliency, infographics
Ekaterina Shutova (Univ. of Cambridge) (tentative): metaphor understanding in NLP

Challenge

We will run a competition prior to the workshop and announce the results at the workshop. Details TBD.

Tentative deadline: April 27, 2018

Submission

In addition to challenge entries, we will solicit paper and abstract submissions on related topics. Details TBD.

Tentative deadline: April 27, 2018

Tentative schedule

9am-10am intro and brainstorming
10am-10:30am invited talk 1
10:30am-10:45am coffee break
10:45am-11:15am invited talk 2
11:15am-11:45am invited talk 3
11:45am-1:15pm lunch
1:15pm-1:45pm challenge winner talk
1:45pm-2:15pm invited talk 4
2:15pm-2:45pm invited talk 5
2:45pm-4:15pm posters and coffee break
4:15pm-5:15pm brainstorming and conclusion

Organizer

Adriana Kovashka (Univ. of Pittsburgh)

