Advanced Machine/Deep learning for Heterogeneous Large scale data – AML-HELAS
Professor M. Vazirgiannis has been involved in data science research since the 1990s and has worked in several research areas of this domain, maintaining significant scientific impact. He established and leads the Data Science and Mining group at LIX/École Polytechnique (ÉP), and has attracted significant research funding from important industrial partners, which demonstrates the innovative nature of his research. His achievements beyond the state of the art in machine learning and text mining include i. Graph-based text representations (Graph-of-Words), successfully applied to NLP tasks such as keyword extraction, summarization and event detection. The results (especially on summarization) are used in industrial projects (e.g. LINTO), ii. Machine learning for graphs with kernels and Deep Learning (DL), including a best paper award at the prestigious IJCAI 2018 conference and an open source Python library for graph kernels (GraKeL). On the side of DL for graphs, he already has active research with international publications on Graph CNNs and Structural Node Representations.
M. Vazirgiannis has significant teaching, supervision and scientific organization activity at ÉP. He introduced the first courses there on “Machine Learning” and “Text Mining and NLP”, which are very popular among students. He has supervised several postdocs and 19 completed Ph.D. theses. His long-term collaborations include Tsinghua and Columbia Universities.
The proposed chair addresses ambitious research topics connected to real-life applications and industrial needs. Graphs are emerging as a universal structure for information representation and learning across applications including social networks, NLP and biomedical/neuro-computing. The chair's main research axis focuses on DL for Graph Representations. Learning graph representations is central to many real-world applications, and Graph Neural Networks (GNNs) have emerged as a general framework for addressing graph-related machine learning tasks. We aim to design GNN architectures that advance the state of the art while remaining efficient in terms of computational complexity and capturing structural information and properties that existing models do not capture sufficiently.
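As a purely illustrative sketch of what a GNN layer computes, the following shows one round of mean-aggregation message passing followed by a sum readout. It is a generic textbook-style example and not the architecture proposed by the chair; the function names (`message_passing_layer`, `readout`), the toy graph and the identity weight matrix are all assumptions made for illustration.

```python
from typing import Dict, List

Graph = Dict[int, List[int]]        # adjacency list: node -> neighbours
Features = Dict[int, List[float]]   # node -> feature vector


def message_passing_layer(graph: Graph, features: Features,
                          weight: List[List[float]]) -> Features:
    """One round of mean-aggregation message passing followed by ReLU.

    Each node averages its own features with those of its neighbours,
    multiplies the result by `weight` (shape: in_dim x out_dim), and
    applies ReLU. Hypothetical helper, for illustration only.
    """
    updated: Features = {}
    out_dim = len(weight[0])
    for node, neighbours in graph.items():
        group = [node] + list(neighbours)   # include a self-loop
        in_dim = len(features[node])
        mean = [sum(features[n][d] for n in group) / len(group)
                for d in range(in_dim)]
        updated[node] = [
            max(0.0, sum(mean[d] * weight[d][j] for d in range(in_dim)))
            for j in range(out_dim)
        ]
    return updated


def readout(features: Features) -> List[float]:
    """Sum node vectors into a single graph-level representation."""
    dim = len(next(iter(features.values())))
    return [sum(vec[d] for vec in features.values()) for d in range(dim)]


# Toy path graph 0 - 1 - 2 with 2-dimensional node features (assumed data).
toy_graph: Graph = {0: [1], 1: [0, 2], 2: [1]}
toy_features: Features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]

hidden = message_passing_layer(toy_graph, toy_features, identity)
graph_vector = readout(hidden)
```

After one layer, each node's vector mixes in information from its direct neighbours; stacking several such layers lets structural information from larger neighbourhoods reach each node, which is where both the expressiveness and the computational cost of a GNN come from.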
These new graph representations will also contribute to advanced NLP methods in the context of DL for Spoken Language Understanding and Summarization, an open problem in NLP with countless applications. We will consider energy-based meta-architectures and sequence-to-sequence architectures to tackle the abstractive summarization challenge, with applications in French. In this context, to support French-language linguistics, we will create, in collaboration with industrial partners and based on large-scale data/corpora, resources including contextual and distributed word vectors, n-gram collections and dictionaries.
The impact of the chair for ÉP is manifold: i. Scientific influence and reputation, via publications in prestigious journals and conferences in the area of DL/AI and applications in graphs and NLP, ii. International collaborations and impact with prestigious academic partners including Tsinghua and Columbia Universities, iii. An increase in the knowledge capital of the University via industrial projects based on the research results and knowledge acquired in the context of the chair, as well as software protections and patents. The impact of the chair extends to the industrial partners, as the research is to a great extent based on their real-world needs and the results are transferred to them.
The chair's impact on teaching and training consists of new methods that will be transferred to advanced academic and executive master courses. As for the national AI programme, the chair contributes to the following objectives: GT 1.1 (maintain talent in France, launch large-scale infrastructures for AI aimed at the creation of French text corpora) and GT 1.2 (developing a French AI training ecosystem, … encourage … work for legal documents).
Mr. Michalis Vazirgiannis (Laboratoire d'Informatique de l'Ecole Polytechnique)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.
LIX Laboratoire d'Informatique de l'Ecole Polytechnique
ANR funding: 599,400 euros
Beginning and duration of the scientific project: August 2020 - 48 Months