1. Towards an analysis of contemporary literary genres, through their specific lexico-syntactic expressions

The chief goal of the PhraseoRom project has been to elaborate a structural and functional typology of lexico-syntactic constructions specific to English, French and German novelistic discourse. Drawing on large trilingual corpora of literary language, we have used corpus-linguistic (or, more precisely, lexicometric) methods to explore how useful and relevant the concept of phraseology, and more particularly the notion of ‘motif’, is to the description of literary language. The texts that make up the corpus were divided into six sub-genres: general, sentimental, crime, historical, fantasy and science fiction. The aim was to give an account of genre in terms of corpus-extracted recurrent lexico-syntactic patterns rather than in the traditional manner of stylistics, which has tended to focus on rhetoric and individual style. This approach has proved valuable in uncovering the linguistic and cognitive stereotypes underlying the genre of the novel.

Our lexicometric methodology has allowed us a) to extract several thousand constructions automatically from our trilingual literary corpora and b) to sort these according to predefined criteria. A comparable corpus of newspaper and academic language was used to determine the significance of recurrent constructions. We have also developed a systematic new approach to annotating the expressions in question both semantically and stylistically. The syntactic-semantic analysis of the data was systematically combined with a discursive and stylistic analysis aimed at cross-genre and cross-linguistic comparison. The motif is thus treated not so much as a fictional element imbued with literary symbolism or contributing to the construction of a fictional universe, but rather as an observable phraseological phenomenon made up of either continuous or discontinuous units that may combine various types of elements: word forms, morpho-syntactic categories and function words. The results of this interdisciplinary project have implications for linguistics and contrastive stylistics, translation studies and creative writing.
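
As an illustration of the definition above, a motif can be modelled as a sequence of slots, each constraining a word form, a lemma or a morpho-syntactic category, with optional gaps between slots to allow for discontinuous units. The following Python sketch is hypothetical and is not the project's actual tooling: the names Token, Slot and find_motif, the part-of-speech labels and the example sentence are all assumptions made for the purpose of illustration.

```python
from typing import NamedTuple, Optional


class Token(NamedTuple):
    form: str   # surface word form
    lemma: str  # dictionary form
    pos: str    # morpho-syntactic category (illustrative tag set)


class Slot(NamedTuple):
    # A constraint left as None is not checked.
    form: Optional[str] = None
    lemma: Optional[str] = None
    pos: Optional[str] = None
    max_gap: int = 0  # tokens that may be skipped before this slot


def slot_matches(slot, token):
    """True if the token satisfies every constraint set on the slot."""
    return ((slot.form is None or slot.form == token.form.lower())
            and (slot.lemma is None or slot.lemma == token.lemma)
            and (slot.pos is None or slot.pos == token.pos))


def match_at(pattern, tokens, start):
    """Return the indices of the matched tokens if the pattern matches
    with its first slot exactly at `start`, otherwise None."""
    def rec(slot_idx, pos):
        if slot_idx == len(pattern):
            return []
        slot = pattern[slot_idx]
        gap = 0 if slot_idx == 0 else slot.max_gap
        for skip in range(gap + 1):
            i = pos + skip
            if i >= len(tokens):
                break
            if slot_matches(slot, tokens[i]):
                rest = rec(slot_idx + 1, i + 1)
                if rest is not None:
                    return [i] + rest
        return None
    return rec(0, start)


def find_motif(pattern, tokens):
    """Collect one match per starting position where the pattern applies."""
    hits = []
    for start in range(len(tokens)):
        matched = match_at(pattern, tokens, start)
        if matched is not None:
            hits.append(matched)
    return hits


# Hypothetical example: lemma "shake" + determiner + (up to one token) + lemma "head"
tokens = [
    Token("He", "he", "PRON"), Token("shook", "shake", "VERB"),
    Token("his", "his", "DET"), Token("grey", "grey", "ADJ"),
    Token("head", "head", "NOUN"), Token(".", ".", "PUNCT"),
]
pattern = [Slot(lemma="shake"), Slot(pos="DET"), Slot(lemma="head", max_gap=1)]
hits = find_motif(pattern, tokens)
```

Here the gap allowance on the last slot lets the pattern skip the adjective, so the discontinuous sequence "shook his ... head" is retrieved as a single unit; with max_gap set to 0 the same pattern would only match the continuous variant.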

The project has shown that the notion of ‘textual motif’ can be fruitfully applied to the description of literary genres and has allowed us to introduce this new concept into Anglo-American discourse on the Digital Humanities. It extends the traditional armamentarium of literary studies through a new corpus-linguistic methodology. It has contributed to current debate and reflection on methodological and theoretical issues surrounding corpus stylistics. Contacts have been established with researchers working on related questions. PhraseoRom also contributes to the elaboration of an “operational genre theory” (Rastier 2011: 72), as distinct from a thematic definition of genre based on universes of reference. The members of the French and German teams produced a total of 90 works (individual and collective), including two international books. An open-source database has been made available to the community of researchers, teachers, students and PhD students in linguistics, stylistics and NLP.

The PhraseoRom project teams wish to continue their cross-linguistic (French-English-German) studies of phraseology in literary language. The methodological contributions of the project will be published at the beginning of 2021 in the online journal Corpus. This publication will serve as a reference for the use of our PhraseoBase database. We also wish to extend our experience to new creative writing workshops, in collaboration with SFR Création (UGA), UMR Litt&Arts (UGA) and the Association Les Mots voyageurs.
We also plan to develop studies based on our parallel multilingual corpora in partnership with researchers from the UMR Litt&Arts (UGA). Moreover, the project's results will be extended to the exploration of textual motifs from a diachronic perspective within the framework of an emerging project (young researchers), which will cover a long period in the history and evolution of French. Finally, we plan to set up a new collaboration and to submit a project with colleagues from the University of Lausanne in the field of historical stylistics, which will aim to apply digital approaches to literary texts of the 19th and 20th centuries.
More generally, the results of the project could be used in cognitive narratology, in translation studies, in creative writing workshops, etc. The tools created in the project will have an impact in the field of deep learning and new technologies for text mining.

The project, which brought together 32 participants from ten universities, has given rise to more than 90 papers, publications and invited lectures. Highlights include a symposium on the Phraseology and Stylistics of Literary Language, held at the University of Erlangen-Nuremberg in March 2019 (45 participants from 15 countries), and the book Phraseology and Style in Subgenres of the Novel (Palgrave Macmillan). The PhraseoBase, a web-based collection of our annotated corpora, methodological manuals and data, was inaugurated in December 2019.

The chief goal of this project is to set up a corpus-driven structural and functional typology of lexico-syntactic constructions in English, French and German 20th-century novels, the novel being the most widely read literary genre. This typology will be used for two types of comparison:
a) a comparison between high literature and popular fiction (French paralittérature, German Trivialliteratur; science fiction; detective fiction; romance fiction)
b) a comparison between the stylistic practices found in different literary traditions (United Kingdom, France, Germany)
The first stage of the project will involve the automatic extraction of statistically significant fiction-specific constructions from the novelistic corpus, using newspaper and scientific texts as reference corpora. The constructions thus obtained will then be analysed in detail to determine how far they are instrumental in the construction of literary texts, and a typology of relevant constructions will be set up. The linguistic analysis, which will cover the semantic, syntactic and discourse levels, will be combined with a comparative literary-stylistic analysis that takes account of several novelistic genres. The aim is to lay the groundwork for a lexico-grammar of fiction-specific constructions, with implications for linguistics, literary and contrastive stylistics, and translation studies.
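
The extraction step described above can be sketched as a keyness computation: each recurrent word sequence in the fiction corpus is compared with its frequency in a reference corpus, and only sequences that are significantly over-represented are kept. The summary does not say which statistic the project uses; the Python sketch below assumes the log-likelihood ratio (G2) commonly used in corpus linguistics, and the function names, the trigram length and the thresholds are illustrative assumptions rather than project settings.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-token sequences, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def log_likelihood(a, b, c, d):
    """G2 for an item seen `a` times among `c` items in the target corpus
    and `b` times among `d` items in the reference corpus."""
    e1 = c * (a + b) / (c + d)  # expected count in the target
    e2 = d * (a + b) / (c + d)  # expected count in the reference
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def key_ngrams(target_tokens, reference_tokens, n=3, min_freq=2, threshold=3.84):
    """N-grams over-represented in the target corpus, best first.
    3.84 is the chi-square critical value for p < 0.05 with 1 degree of
    freedom; stricter cut-offs are an equally plausible choice."""
    tgt = Counter(ngrams(target_tokens, n))
    ref = Counter(ngrams(reference_tokens, n))
    c, d = sum(tgt.values()), sum(ref.values())
    results = []
    for gram, a in tgt.items():
        if a < min_freq:
            continue
        b = ref.get(gram, 0)
        if a * d <= b * c:  # not relatively more frequent in the target
            continue
        score = log_likelihood(a, b, c, d)
        if score >= threshold:
            results.append((gram, a, b, score))
    return sorted(results, key=lambda r: r[3], reverse=True)


# Toy corpora, purely illustrative
fiction = ["she", "shook", "her", "head"] * 5
press = ["the", "market", "shook", "investors"] * 5
specific = key_ngrams(fiction, press)
```

In practice the inputs would be lemmatised and tagged corpora of many millions of tokens, and the retained sequences would then go on to the manual semantic and stylistic analysis described in the text.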
The project is interdisciplinary in nature, bringing together linguists and literary scholars and integrating phraseology, stylistics, genre theory, corpus linguistics and natural language processing. Both in terms of its research focus (phraseology in the novel) and its methodology (corpus-driven linguistics), it falls within the larger domain of the digital humanities.

Iva NOVAKOVA (Université Grenoble Alpes, Laboratoire de Linguistique et Didactique des Langues Etrangères et Maternelles)

