CONTINT - Contenus numériques et interactions

Adaptive Learning for Intelligent Crowdsourcing and Information Access – ALICIA

Project duration: 36 months

Our project’s goal is to study models and algorithms that rely on adaptive learning techniques to improve the effectiveness, performance, and scalability of user-centric applications.

In this proposal, we are interested in two families of user-centric applications: information access and intelligent crowdsourcing.

Data management is becoming increasingly user-centric, with users shifting from their role of pure information consumers to generators and evaluators of content of all sorts. Indeed, we are witnessing the emergence of a plethora of systems, especially on the Web, in which users contribute, access and evaluate information, collaborate and interact in complex environments, either explicitly or implicitly. Prominent examples of such systems are social networking (Facebook), blogging (Skyrock), microblogging (Twitter), social bookmarking (Delicious), collaborative tagging and rating (Flickr, MovieLens), crowdsourcing evaluation tasks (Mechanical Turk), or Web advertising.

As users interact with those systems, they leave footprints that can be exploited to develop useful applications. A common aspect of the two application families is the need to define, collect, and maintain user profiles. User profiles are the cornerstone of successful applications and need to be continuously refined to maintain application quality. For information access, such as search and recommendation, preference profiles help personalize the content provided to users in response to a search query or as a recommendation. Indeed, when dealing with users as consumers of information, applications may have to satisfy very diverse preferences about how query result relevance is assessed. For intelligent crowdsourcing, such as data sourcing and micro-task completion, expertise profiles help assign tasks to users; when looking for expert users, expertise levels may be very diverse and, at the same time, difficult to understand.

In both scenarios, user preferences and user expertise cannot be known in advance; also, they can rarely be expected to be declared explicitly and reliably by users, or to remain stable over time. Consequently, preferences and expertise need to be discovered over time via mundane interactions with users, using a principled approach. Given the growth rate of rich and diverse content and of the user base, a learning approach is unavoidable. Our project’s goal is to study models and algorithms that rely on adaptive learning techniques to improve the effectiveness, performance, and scalability of user-centric applications.

Our thesis is that information access applications could greatly benefit from a learning process that “closes the loop”, continuously accounting for user feedback, actions, evaluations and interactions, in order to better analyze and extract data, index it, and address users’ information needs.

We obtained important results in several key areas of the project:

- Models for social data management
- Search in social media
- Crowd-augmented social-aware search
- Movie summarization
- Bandits with multiple plays
- Online nonparametric regression
- Mechanism design for intelligent crowdsourcing
- Adaptive recommender systems
- Adaptive movie recommendation

We are currently investigating several directions for improving, through adaptiveness, the way information needs are handled in social media. On the one hand, we strive to learn how to evaluate query relevance based on the social and textual dimensions, in a generic framework that is user-centric but independent of the queries being formulated. On the other hand, we strive to understand the “query nature” and to choose the right relevance ingredients for each incoming query. Furthermore, we consider other levels of service in the style of as-you-type search, which lean more towards the recommendation paradigm. First, in the initial stages of as-you-type search scenarios, the answers that can be produced are more amenable to a recommendation approach, since the input query is under-specified. Second, instead of showing actual documents as results, we can also suggest or recommend queries, again exploiting the social dimension of the data; we plan to study these problems both without learning mechanisms and with an adaptive approach based on multi-armed bandits.

[CIKM’15] Building Representative Composite Items. V. Leroy, E. Gaussier, S. Amer-Yahia, H. Mirisaee. CIKM 2015.
[CIKM-2’15] A Network-Aware Approach for Searching As-You-Type in Social Media. Paul Lagrée, Bogdan Cautis, Hossein Vahabi. CIKM 2015.
[COLT'15] A Chaining Algorithm for Online Nonparametric Regression. Pierre Gaillard and Sébastien Gerchinovitz. Proceedings of the 28th Conference on Learning Theory (COLT 2015), pp. 764–796, 2015.
[CORIA'15] Algorithmes de bandit pour la recommandation à tirages multiples. Jonathan Louëdec, Max Chevalier, Aurélien Garivier, and Josiane Mothe. 12ème Conférence en Recherche d’Information et Applications, Paris, March 2015.
[ECIR’15] Time-Sensitive Collaborative Filtering through Adaptive Matrix Completion. Julien Gaillard, Jean-Michel Renders. ECIR 2015.
[FLAIRS'15] A Multiple-Play Bandit Algorithm Applied to Recommender Systems. Jonathan Louëdec, Max Chevalier, Josiane Mothe, Aurélien Garivier, and Sébastien Gerchinovitz. The 28th International FLAIRS Conference.
[JDS'15] Systèmes de recommandations : algorithmes de bandits et évaluation expérimentale. Jonathan Louëdec, Max Chevalier, Aurélien Garivier, Josiane Mothe. 47èmes Journées de Statistique de la SfdS, Lille, June 2015.
[PVLDB’15] Worker Skill Estimation in Team-Based Tasks. H. Rahman, S.B. Roy, S. Thirumuruganathan, S. Amer-Yahia, G. Das. VLDB 2015.
[UPSud’15] CANTO: Crowd Augmented Network-Aware TOp-k Search. Bogdan Cautis, Soudip Roy Chowdhury. Under submission.
[VLDBJ’15] Task-Assignment Optimization in Knowledge Intensive Crowdsourcing. S.B. Roy, I. Lykourentzou, S. Thirumuruganathan, S. Amer-Yahia, G. Das. VLDB Journal 2015.
[WWW’15] From Complex Object Retrieval to Complex Crowdsourcing. S. Amer-Yahia, S.B. Roy. WWW Tutorial 2015.

Our thesis is that information access applications could greatly benefit from a learning process that “closes the loop”, continuously accounting for user feedback, actions, evaluations and interactions, in order to better analyze and extract data, index it, and address users’ information needs; similarly, large-scale crowdsourcing could be enhanced through continuous monitoring and categorization of workers according to their skills and expertise, in order to improve task assignment and completion.

Teaming up with providers of important user-centric applications (Skyrock – social networking and blogging, Xerox – crowdsourcing, Vodkaster – social networking and collaborative movie rating, AlephD – personalized Web advertising, image recommendation), our consortium includes researchers from the main areas in which the ALICIA project is rooted: data management, information retrieval, data mining, machine learning and distributed algorithms.

The project's goals are ambitious: we intend to contribute to the development of highly adaptive learning mechanisms for non-stationary, strongly contextualized information sources and needs – key features of social media and user-centric applications – while promoting information relevance, completeness and diversity in how content or users are selected in response to users’ needs. To deliver on this research goal, we intend to focus on adaptive learning algorithms, such as multi-armed bandit algorithms, that have the potential to perform well under the conditions that may arise in online, highly dynamic, user-centric environments.
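To make the idea of a bandit that adapts to non-stationary user behaviour more concrete, here is a minimal Python sketch. It is an illustration only: the summary does not say which bandit algorithms the project develops, so this uses a simplified discounted-UCB-style rule in which old feedback is down-weighted by a factor `gamma`. The class name `DiscountedUCB`, the function `simulate`, and all parameter values are hypothetical choices made for this example.

```python
import math
import random


class DiscountedUCB:
    """Simplified discounted-UCB sketch: old feedback decays, so drift can be tracked."""

    def __init__(self, n_arms, gamma=0.95, exploration=0.5):
        self.gamma = gamma              # discount in (0, 1]; 1.0 would mean no forgetting
        self.exploration = exploration  # scale of the confidence bonus (assumed value)
        self.counts = [0.0] * n_arms    # discounted number of plays per arm
        self.rewards = [0.0] * n_arms   # discounted sum of rewards per arm

    def select_arm(self):
        # Try every arm once before trusting any estimate.
        for arm, count in enumerate(self.counts):
            if count == 0.0:
                return arm
        total = sum(self.counts)

        def index(arm):
            mean = self.rewards[arm] / self.counts[arm]
            bonus = math.sqrt(self.exploration * math.log(total) / self.counts[arm])
            return mean + bonus

        return max(range(len(self.counts)), key=index)

    def update(self, arm, reward):
        # Decay all past evidence, then record the new observation.
        self.counts = [self.gamma * c for c in self.counts]
        self.rewards = [self.gamma * r for r in self.rewards]
        self.counts[arm] += 1.0
        self.rewards[arm] += reward


def simulate(probs_before, probs_after, rounds=1000, switch_at=500, seed=0):
    """Return the arms played when click probabilities change at round `switch_at`."""
    rng = random.Random(seed)
    bandit = DiscountedUCB(n_arms=len(probs_before))
    played = []
    for t in range(rounds):
        probs = probs_before if t < switch_at else probs_after
        arm = bandit.select_arm()
        reward = 1.0 if rng.random() < probs[arm] else 0.0  # simulated click
        bandit.update(arm, reward)
        played.append(arm)
    return played


if __name__ == "__main__":
    # Two recommendable items whose appeal swaps halfway through the run.
    played = simulate([0.8, 0.2], [0.2, 0.8])
    print("share of item 1 in the last 100 rounds:", played[-100:].count(1) / 100)
```

Because the discounted counts of an unplayed arm shrink over time, its confidence bonus grows and the arm is eventually retried; this is what lets the sketch notice that a previously poor item has become the better choice. A smaller `gamma` reacts faster to change but wastes more plays on exploration.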

Project coordination

Bogdan Cautis (UNIVERSITE DE PARIS SUD / Laboratoire de Recherche en Informatique)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

- PSUD/LRI: UNIVERSITE DE PARIS SUD / Laboratoire de Recherche en Informatique
- NAVER: NAVER France (Labs Europe)
- Vodkaster
- AlephD
- LIG: Laboratoire d'Informatique de Grenoble
- UPS/IMT: UNIVERSITE PAUL SABATIER / INSTITUT DE MATHEMATIQUES DE TOULOUSE
- XEROX: XEROX SAS
- Télécom ParisTech: Institut Mines Télécom / Télécom ParisTech

ANR funding: 808,318 euros
Start and duration of the scientific project: January 2014, 42 months
