Predicting Social Image Popularity Dynamics

ICIP 2020 Challenge: Predicting Social Image Popularity Dynamics

The level of engagement of an image posted on a social network is usually referred to as "Image Popularity" [1]. Social image popularity is a score of the level of engagement achieved by pictures shared through social media platforms, where engagement can be measured in terms of the number of views, likes or shares. In the state of the art, the popularity of an image is a normalized score of the cumulative engagement achieved up to the download time, which does not account for how the popularity evolves over time.
However, popularity is a dynamic quantity. As a consequence, two images with the same popularity dynamics could be ranked differently depending on the time of analysis (i.e., the download time). This is clear from Figure 1, which shows the popularity scores of three different photos over time.

Figure 1: popularity scores over time for three different photos. After the first period, the popularity score decreases with the post age. This effect is caused by the very low engagement around the photo after the early period.
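
The effect described in the caption can be illustrated with a minimal sketch. It assumes the log-scaled views-per-day normalization, log2(views / days + 1), which is the form commonly attributed to [1]; the text above does not state which normalization is used. The function name and the view counts are hypothetical and chosen only for illustration.

```python
import math


def popularity_score(cumulative_views: float, age_days: float) -> float:
    """Assumed normalization: log2(cumulative views / post age in days + 1)."""
    if age_days <= 0:
        raise ValueError("age_days must be positive")
    return math.log2(cumulative_views / age_days + 1)


# Hypothetical cumulative view counts for one photo:
# fast growth in the first days, then almost no new views.
cumulative_views = {1: 300, 5: 450, 10: 470, 30: 480}

# The same photo receives a different score depending on the day it is analysed.
scores = {day: popularity_score(views, day) for day, views in cumulative_views.items()}
for day, score in scores.items():
    print(f"day {day:2d}: score = {score:.2f}")
```

Because the view count barely grows after the first days while the post age keeps increasing, the score falls steadily, so the rank of a photo depends on when it is downloaded.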



This problem has been addressed in [2,3]. In particular, a new task and a related large-scale dataset of about 20 thousand Flickr images have been presented in [2]. The paper in [2] also presents a first approach to the task that considers a subset of the data, whereas a more extended analysis and experimental evaluation has been presented in [3], which can be considered a baseline for the challenge participants.
The challenge aims at the development of systems able to predict the popularity of social images over time (i.e., popularity dynamics), given only the information about the image post that is available at posting time. The task will be addressed by leveraging the first large-scale annotated dataset of Flickr image posts monitored for a period of 30 days after their upload. Given an image post, the output of the developed predictors will be the temporal sequence of 30 predicted engagement values (i.e., number of views).

Significance of the Challenge

In the context of social media analysis, several applications could benefit from assessing and predicting the level of engagement achieved by a post shared by a user on a social platform. Examples include social media marketing, brand monitoring, and tracking the popularity of political parties. Users’ engagement can be measured by considering their activities and interactions with the content published on the social platform (e.g., comments, likes, views or shares). This information is often available and is usually compared with the statistics of companies’ and advertisers’ websites and with users’ queries on web search engines, with the aim of assessing the correlation between social advertising campaigns and their desired outcomes (e.g., brand reputation, website/store visits, product dissemination and sales).
So far, the task of predicting the temporal popularity of social images has not been explored, mainly due to the lack of large-scale benchmark datasets. Indeed, social media platforms only provide the cumulative values of the posts’ engagement scores. For these reasons, a new large-scale dataset gives researchers the opportunity to address this challenging task.

The main contributions of the task proposed for this challenge are the following:
• it poses new questions around on-line behaviour, popularity, and social media content lifecycle;
• it addresses a very challenging task which finds several practical uses in the context of social media analysis applications such as recommender systems and advertisement campaigns analysis/placement;
• the developed systems will enable applications that support the publication and effective diffusion of content through social media, by forecasting how engagement evolves over time. For instance, such a system can indicate when old content should be replaced by new content before it becomes obsolete.

Rules of participation

A set of social features related to the training image posts will be given to participants. In addition, the ground-truth 30-day sequence of the number of views achieved by each picture on Flickr after its posting will be provided for the training set. In particular, for each user, the following information will be provided:

• number of contacts;
• whether the user is a professional photographer;
• number of photos;
• mean number of views;
• number of groups;
• the average number of people in the user’s groups;
• the average number of images in the user’s groups.

The information related to the photo will be:
• title length;
• description length;
• number of albums;
• number of groups;
• the average number of people in the groups in which the picture has been shared;
• the average number of photos in the groups in which the picture has been shared;
• the social tags associated with the photo.


For each social entity (i.e., user, photo and group), the IDs on the Flickr platform will also be included in the dataset. In addition, for each picture, the GPS coordinates (when available) as well as the timestamps of upload, download and acquisition will be available. This information can be exploited to extend the crawling with more specific data and analysis. For instance, the picture can be downloaded and suitable visual features can be extracted, as done in [2].
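
As an illustration of how the information listed above could be organised in memory, the following sketch groups it into simple records. All class and field names are hypothetical and do not reflect the actual file format of the dataset, which is not specified here; only the listed attributes and the 30-value length of the sequence come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

SEQUENCE_LENGTH = 30  # number of engagement values per photo stated in the challenge


@dataclass
class UserFeatures:
    contacts: int
    is_pro: bool  # whether the user is a professional photographer
    photo_count: int
    mean_views: float
    groups: int
    avg_group_members: float
    avg_group_photos: float


@dataclass
class PhotoFeatures:
    title_length: int
    description_length: int
    albums: int
    groups: int
    avg_group_members: float
    avg_group_photos: float
    tags: List[str] = field(default_factory=list)
    gps: Optional[Tuple[float, float]] = None  # (latitude, longitude), when available


@dataclass
class TrainingExample:
    photo_id: str  # Flickr ID of the photo
    user_id: str   # Flickr ID of the user
    user: UserFeatures
    photo: PhotoFeatures
    views: List[int]  # ground-truth sequence of view counts (training set only)

    def __post_init__(self) -> None:
        # Reject sequences that do not cover the full monitoring period.
        if len(self.views) != SEQUENCE_LENGTH:
            raise ValueError(f"expected {SEQUENCE_LENGTH} values, got {len(self.views)}")


# Made-up values, for illustration only.
example = TrainingExample(
    photo_id="photo-0",
    user_id="user-0",
    user=UserFeatures(120, False, 850, 43.5, 12, 1500.0, 32000.0),
    photo=PhotoFeatures(18, 140, 1, 3, 900.0, 15000.0, tags=["sunset", "sea"]),
    views=list(range(10, 40)),
)
```

Checking the sequence length when a record is built catches truncated or misaligned sequences before they reach a model.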

Test data will be released in a blind version, i.e., ground-truth information will not be available. Participants will be requested to predict the 30-day sequence of the number of views of each test photo. The prediction results will be uploaded for evaluation according to the provided format.

Criteria for judging a submission

Test results will be evaluated using the 25% trimmed RMSE (i.e., the interquartile mean of the test RMSE values) and the median RMSE, described below:

• Trimmed RMSE (tRMSE): the mean test RMSE is computed after discarding equal portions of the high and low tails of the error distribution. The 25% trimmed mean will be considered, also known as the interquartile mean;
• Median RMSE (RMSE MED): after performing the prediction of the test sequences, the median of the test RMSE is considered.
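
The two measures can be sketched as follows. This is a minimal illustration, not the official evaluation script: it assumes that one RMSE is computed per test sequence over its 30 values, and that the trimmed mean simply drops the lowest and highest 25% of these errors (rounded down) before averaging. The function names are hypothetical.

```python
import numpy as np


def sequence_rmse(y_true, y_pred):
    """RMSE of each predicted sequence; both inputs have shape (n_photos, 30)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=1))


def trimmed_mean(values, proportion=0.25):
    """Mean of the values left after cutting `proportion` of them from each tail."""
    values = np.sort(np.asarray(values, dtype=float))
    k = int(np.floor(proportion * values.size))  # number of values cut per tail
    return float(values[k:values.size - k].mean())


def evaluate(y_true, y_pred):
    """Return the 25% trimmed RMSE and the median RMSE over the test sequences."""
    errors = sequence_rmse(y_true, y_pred)
    return {"tRMSE": trimmed_mean(errors, 0.25), "RMSE_MED": float(np.median(errors))}


# Toy data: 8 sequences whose per-sequence errors are 0, 1, 2, 3, 4, 5, 6 and 100.
y_true = np.zeros((8, 30))
y_pred = np.repeat(np.array([0, 1, 2, 3, 4, 5, 6, 100], dtype=float)[:, None], 30, axis=1)
results = evaluate(y_true, y_pred)
print(results)
```

Both measures are insensitive to the single sequence with an error of 100, which would dominate a plain mean of the RMSE values.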

In addition to the ICIP 2020 challenge section, the authors of the best selected works will be invited to submit their contributions to a special issue of a reputable journal.


Dates

● Registration opening: April 15th
● Training data available: April 30th
● Testing data available: May 30th
● Submission deadline: June 7th
● Announcement of the results: August 15th

Contacts

For any further information, please contact us at ortis@dmi.unict.it.


Scientific Committee

  • Sebastiano Battiato, Università degli Studi di Catania, Italy
  • Giovanni Maria Farinella, Università degli Studi di Catania, Italy
  • Alessandro Ortis, Università degli Studi di Catania, Italy


Bibliography

[1] Khosla, Aditya, Atish Das Sarma, and Raffay Hamid. "What makes an image popular?." Proceedings of the 23rd international conference on World wide web. ACM, 2014.
[2] Ortis, Alessandro, Giovanni Maria Farinella, and Sebastiano Battiato. "Prediction of Social Image Popularity Dynamics." International Conference on Image Analysis and Processing. Springer, Cham, 2019.
[3] Ortis, Alessandro, Giovanni Maria Farinella, and Sebastiano Battiato. "Predicting Social Image Popularity Dynamics at Time Zero." IEEE Access.