AT2TA - Analogies: from Theory to Tools and Applications – AT2TA
The general objective of the AT2TA project is to propose a machine learning (ML) framework that integrates analogical reasoning (AR) and adapts it to different real-world use cases. The novelty of the project lies in this unification, which is also its main technical challenge.
Main issues raised & general objectives
The general objective of the project is broken down into four challenges:
(C1) Bridging the gap between ML and knowledge representation and reasoning (KRR). We believe that AR can bridge this gap and reveal the potential of both fields in terms of transparency and explainability.
(C2) Choosing the analogy model and learning appropriate representations for AR. This challenge involves studying formal analogy models and learning representation spaces adapted to the types of objects and application domains. The choice of model and representation is essential to adapt AR to various domains and to handle diverse types of objects, such as texts and tables, or more complex objects, such as patient data or knowledge graphs (KGs).
(C3) Adapting the AR framework to several domains. The project applies AR in the following domains: natural language processing (NLP) and natural language understanding, medical informatics, software engineering, and knowledge management and engineering.
(C4) Designing a platform for multi-domain AR. The goal is to develop an open science tool to create, solve and reason with analogies, integrating the different methods and architectures proposed by the consortium partners.
WP1: Theory and Practice of Analogy by Machine Learning
This WP is dedicated to the proposal and establishment of a fundamental and unifying framework of analogies for the different challenges of AT2TA that relies on a strong axiomatic framework to provide formal models of analogy adapted to different application domains. It will exploit recent methodologies of representation, learning and generation to address and solve the two main problems of analogy creation and resolution.
WP2: Platform
The main objective of the AT2TA platform is to serve as a centralized service for public demonstration and communication on the software and methods developed during the project. The target users of the platform will be researchers, teachers and industrialists in AI/ML who wish to exploit AR for their own use cases.
WP3: Use cases and applications
This WP is dedicated to various applications of the AR-based ML framework that we propose to develop in the project, in order to show its benefits for scientific challenges (natural language understanding and generation, NLU and NLG), societal challenges (biomedicine and health) and industrial challenges (software and knowledge engineering). These use cases provide the empirical setting in which to evaluate and demonstrate the potential of the methodology and tools developed in WP1 and WP2.
WP4: Project coordination
This WP is dedicated to project management tasks (steering and scheduling of work and collaborations within the consortium), dissemination of the results and tools developed by AT2TA, and large-scale community activities in the form of a shared task at the end of the project.
WP1: Theory and practice of analogy by machine learning
• Galois theory for classifiers compatible with the principle of analogical inference, which establishes a correspondence between pairs of analogy models (space of instances and labels) and the classifiers (REF).
• Unifying framework for numerical analogies based on generalized means, which subsumes different notions of analogy such as arithmetic, geometric and hyperbolic analogy (REF). This initial framework has been extended and submitted to a rank-A conference.
• We have also revisited the notion of case base competence, which we have empirically shown to be correlated with the performance of the CoAT prediction algorithm.
• Design of a Pair Logic, exploiting the fact that analogical proportions describe equivalence classes. Starting from pairs that represent improvements between two items, this logic makes it possible to accumulate the improvements and increase the creative power of analogy.
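As an illustration of analogies between numbers based on generalized means, the sketch below assumes one common formulation: a : b :: c : d holds in power p when the generalized (power) mean of the extremes a and d equals that of the means b and c. This is an assumption made for illustration and not necessarily the exact definition used in the project; the function names are hypothetical and only positive numbers are handled.

```python
import math

def generalized_mean(x, y, p):
    """Power mean of two positive numbers; p = 0 is the geometric-mean limit."""
    if p == 0:
        return math.sqrt(x * y)
    return ((x ** p + y ** p) / 2) ** (1 / p)

def is_analogy(a, b, c, d, p, tol=1e-9):
    """Assumed definition: a : b :: c : d in power p when the mean of the
    extremes (a, d) equals the mean of the means (b, c)."""
    return math.isclose(generalized_mean(a, d, p),
                        generalized_mean(b, c, p), rel_tol=tol)

def solve_analogy(a, b, c, p):
    """Solve a : b :: c : x for x > 0; return None if no positive solution exists."""
    if p == 0:
        # Geometric case: a * x = b * c
        return b * c / a
    # General case: a**p + x**p = b**p + c**p
    value = b ** p + c ** p - a ** p
    if value <= 0:
        return None
    return value ** (1 / p)
```

Under this assumption, p = 1 gives the arithmetic analogy (a + d = b + c), p = 0 the geometric analogy (a · d = b · c), and p = -1 an analogy based on the harmonic mean.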
WP2: Platform
• Improvement of the ANNa platform: implementation of a software architecture using a computing cluster to perform the tasks of detecting and resolving morphological analogies. This architecture will also allow users to choose which model to use (e.g., CNNs, LLMs) to perform this task.
• Creation of the KGPrune platform for pruning knowledge graphs by analogy. KGPrune allows pruning the open source Wikidata graph (which supports Wikipedia) to create thematic subgraphs (e.g. to seed a corporate knowledge graph or to study specific topics such as the works of art looted by the Nazis during World War II).
WP3: Use cases and applications
• Creation of different datasets, e.g., Siganalogies, Wikidata Thematic Subgraph selection, and Copilote Translations
• Proposal of an automatic and frugal analogy-based knowledge graph pruning approach serving as a basis for the KGPrune platform
WP4: Project coordination
• Co-organization of the international workshops ATA 2022-2023 (co-located with ICCBR) and IARML 2022-2023-2024 (co-located with IJCAI)
• Prof. Yves Lepage (Waseda University, Japan) spent the first semester of 2023-24 at Loria as a visiting researcher. This visit led to a major contribution to the project: the definition of a unifying framework for numerical analogies.
• Two months after the project was accepted, OpenAI launched ChatGPT, a large language model (LLM) whose features overlap with those of the AT2TA project, notably the ability to solve some types of analogy. Other LLMs appeared afterwards. We decided to continue developing our approach, which is more frugal (as demonstrated in our work), for downstream tasks such as Semantic Table Interpretation, Semantic Role Labelling and KG pruning. However, we will integrate components based on Llama 7b and 16b into the platform for comparison, experimentation and hybridization.
• Continuation of work on tasks related to KGs (e.g., information extraction, semantic table interpretation, link prediction, graph summarization, alignment)
• Organization of an international conference "Principia Analogiae" bringing together international experts on analogies in Lisbon in June 2025
• Organization of an international shared task on the AT2TA datasets, to be proposed at IJCAI-ECAI 2026
• Setting up of international Europe-Asia projects
Analogical reasoning is a remarkable capability of human reasoning. Analogical proportions are statements of the form “A is to B as C is to D”. They are the basis of analogical inference that has been used in machine learning (ML) tasks such as classification, decision making, and automatic translation with competitive results. Analogical extrapolation can solve hard reasoning tasks, such as IQ tests, and support data augmentation when learning models with few labeled samples. What makes analogical inference special is its unique ability to simultaneously process similarities and dissimilarities. Analogical reasoning links the two main axes of AI (knowledge representation and reasoning, and machine learning), and contributes to the transparency and explainability of AI as it is close to human reasoning and enables explanations based on examples and counter-examples.
This motivates our efforts to develop an analogy-based ML framework and to demonstrate its usefulness in real-world applications. We will explore analogical reasoning for transfer learning and case-based reasoning, where the idea is to take advantage of what has been learned on a source domain in order to improve the learning process in a related target domain. Suitable representations are the key to transferring the analogy-based framework to other settings and to handling different object types. This calls for a thorough study of representation spaces for analogy-based frameworks with different object types, not only textual and tabular, but also complex and structured, e.g., patient data, knowledge graphs and abstract syntax trees. As its final goal, the AT2TA project aims to provide an open-access platform to detect, solve, and reason with analogies, illustrated by noteworthy applications in NLP, medical sciences, knowledge management and software engineering, all of which have a major impact in industry.
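The analogical proportions and analogical inference described above can be illustrated with a minimal sketch for Boolean features. It assumes the standard Boolean definition of analogical proportion from the literature ("a differs from b exactly as c differs from d") and a simple voting scheme over labeled triples; the function names and the voting scheme are hypothetical and do not describe the project's own algorithms.

```python
def holds(a, b, c, d):
    """Boolean analogical proportion a : b :: c : d:
    a differs from b exactly as c differs from d."""
    return (a and not b) == (c and not d) and (not a and b) == (not c and d)

def solve(a, b, c):
    """Return x such that a : b :: c : x, or None when no solution exists."""
    if a == b:
        return c
    if a == c:
        return b
    return None

def predict(triples, query):
    """Analogical inference by voting: for each labeled triple
    ((xa, ya), (xb, yb), (xc, yc)) whose feature vectors form a
    proportion with the query on every feature, solve the label equation."""
    votes = {}
    for (xa, ya), (xb, yb), (xc, yc) in triples:
        if all(holds(a, b, c, q) for a, b, c, q in zip(xa, xb, xc, query)):
            y = solve(ya, yb, yc)
            if y is not None:
                votes[y] = votes.get(y, 0) + 1
    return max(votes, key=votes.get) if votes else None
```

For instance, the triple ((0, 0), 0), ((0, 1), 1), ((1, 0), 0) forms a proportion with the query (1, 1) on both features, so the label equation 0 : 1 :: 0 : x is solved and the predicted label is 1.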
Project coordination
Miguel COUCEIRO (Institut national de la recherche en informatique et automatique)
The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its content.
Partnership
LORIA Institut national de la recherche en informatique et automatique
IRIT Université Toulouse 3 - Paul Sabatier
Orange ORANGE SA
Centre de Recherche Inria de Paris
INSTITUT DES MALADIES GÉNÉTIQUES (IHU)
INFOLOGIC RECHERCHE & DEVELOPPEMENT
UNIVERSITE COTE D'AZUR UNIVERSITE COTE D'AZUR
ANR funding: 669,867 euros
Beginning and duration of the scientific project: January 2023 - 42 months