CONTINT - Contenus et Interactions

Parsing and synthesis with abstract categorial grammars: from lexicon to discourse – Polymnie

Submission summary

This basic research project relies on the formalism of Abstract Categorial Grammars (ACG). A distinctive feature of this framework is that it treats surface forms and more abstract forms with the same mathematical tools, so that both levels share the same status. As a consequence:
* ACG can encode a wide range of grammatical formalisms such as context-free grammars, tree adjoining grammars, etc.
* an ACG defines two languages: an abstract language, the one of abstract forms, and an object language, the one of surface forms.

It is worth noting that abstract and surface forms are defined relative to each other; these notions are independent of any external system. So while one can think of strings as surface forms and of syntactic trees as the associated abstract forms, one can also associate a logical formula, as another surface form, with the same abstract form. This property is crucial for our proposal because it allows a unified account of parsing and synthesis, in particular with respect to the underlying algorithms.
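The relation between one abstract form and several surface forms can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the project's software: abstract terms are plain trees, a lexicon maps each abstract constant to a Python function, and object-level values are ordinary strings rather than lambda terms. The names `Term`, `realize`, `string_lexicon`, `logic_lexicon` and the toy constants `JOHN`, `MARY`, `LOVES` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Term:
    """An abstract term: a constant applied to zero or more abstract sub-terms."""
    head: str
    args: Tuple["Term", ...] = ()


# A lexicon interprets every abstract constant as a function over object values.
Lexicon = Dict[str, Callable[..., str]]


def realize(lexicon: Lexicon, term: Term) -> str:
    """Map an abstract term to an object-level value, constant by constant."""
    return lexicon[term.head](*(realize(lexicon, arg) for arg in term.args))


# Hypothetical toy signature: JOHN, MARY : np and LOVES : np -> np -> s.
string_lexicon: Lexicon = {
    "JOHN": lambda: "John",
    "MARY": lambda: "Mary",
    "LOVES": lambda subj, obj: f"{subj} loves {obj}",
}
logic_lexicon: Lexicon = {
    "JOHN": lambda: "john",
    "MARY": lambda: "mary",
    "LOVES": lambda subj, obj: f"love({subj}, {obj})",
}

# One abstract form, two surface forms.
abstract = Term("LOVES", (Term("JOHN"), Term("MARY")))
surface_string = realize(string_lexicon, abstract)    # "John loves Mary"
surface_formula = realize(logic_lexicon, abstract)    # "love(john, mary)"
```

In this picture, parsing a sentence amounts to recovering `abstract` from `surface_string`, and synthesis to recovering it from `surface_formula`; both are inversions of the same kind of mapping, which is why the two tasks can share their algorithms.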

ACG relies on type theory and lambda calculus. From this perspective, it integrates very smoothly with the formal semantic models initiated by Montague. Theories that also take the dynamic effects of discourse into account, such as Discourse Representation Theory (DRT) or Dynamic Predicate Logic (DPL), were not originally formulated within type theory, but they can be presented in it, which makes them easy to integrate into the ACG framework. Discourse-related phenomena, in particular anaphora resolution and discourse relation inference, can then be expressed using the semantic recipes attached to lexical entries or using specific information provided by the syntactic constructions.
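As an illustration of how a lexical semantic recipe can carry dynamic effects, here is a hedged sketch in which a sentence meaning is a function of a discourse context and a continuation. This is only one possible encoding, chosen for the example; it is not claimed to be the one used in the project. Truth conditions are replaced by a list of asserted facts, the pronoun naively picks the most recent referent, and all names (`proper_name`, `pronoun`, `sequence`) are hypothetical.

```python
from typing import Callable, List

Context = List[str]                          # discourse referents seen so far
Continuation = Callable[[Context], List[str]]
Sentence = Callable[[Context, Continuation], List[str]]


def proper_name(name: str, predicate: str) -> Sentence:
    """Meaning of '<name> <predicate>': assert a fact and introduce a referent."""
    def meaning(ctx: Context, k: Continuation) -> List[str]:
        return [f"{predicate}({name})"] + k(ctx + [name])
    return meaning


def pronoun(predicate: str) -> Sentence:
    """Meaning of 'he/she <predicate>': resolve the pronoun from the context."""
    def meaning(ctx: Context, k: Continuation) -> List[str]:
        antecedent = ctx[-1]                 # naive choice: most recent referent
        return [f"{predicate}({antecedent})"] + k(ctx)
    return meaning


def sequence(first: Sentence, second: Sentence) -> Sentence:
    """Discourse update: the second sentence is interpreted in the context
    produced by the first one."""
    def meaning(ctx: Context, k: Continuation) -> List[str]:
        return first(ctx, lambda updated: second(updated, k))
    return meaning


# "John walks. He smiles."
discourse = sequence(proper_name("john", "walk"), pronoun("smile"))
facts = discourse([], lambda ctx: [])        # ["walk(john)", "smile(john)"]
```

The anaphoric link is established entirely by the meanings of the lexical items and by `sequence`; a pronoun interpreted in an empty context raises an `IndexError` here, which stands in for an unresolved anaphor.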

It has been shown that discourse structure plays an important role in text understanding, not only for people but also for natural language processing tasks such as automatic summarization, which drops the less important parts of a discourse.

Our project focuses on the study and modelling of sentences and texts in a compositional paradigm that takes into account the dynamics and the structure of discourse. We are interested in both parsing and generation, and we rely on the ACG formal framework. The kind of processing we are considering belongs to the field of summarization or text simplification. In our models, we restrict ourselves to the underlying linguistic ability.

Because of the complexity of the phenomena, of their description, and of their interactions, we need to set up a suitable environment to develop and test our linguistic modelling. This will consist in extending and improving software that implements ACG-related features. Together with the linguistic resources developed, this software aims at providing a parser and a generator for sentences and texts that take dynamic effects into account. It will allow us to experiment with and validate the approach.

Project coordination

Sylvain Pogodalla (Centre de Recherche Inria Nancy - Grand Est) – sylvain.pogodalla@inria.fr

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

LABRI LABORATOIRE BORDELAIS DE RECHERCHE EN INFORMATIQUE
INRIA Institut National de Recherche en Informatique et Automatique
UPS-IRIT Université Paul Sabatier Toulouse 3 – Institut de Recherche en Informatique de Toulouse
INRIA NGE Centre de Recherche Inria Nancy - Grand Est

ANR funding: 520,288 euros
Beginning and duration of the scientific project: August 2012 - 36 Months
