Blanc - SIMI 2 - Science informatique et applications

African Languages in the Field: speech Fundamentals and Automation - ALFFA

Submission summary

The number of languages spoken in Africa ranges from 1,000 to 2,500, depending on estimates and definitions. Monolingual states do not really exist on this continent, since languages cross borders. The number of languages per country varies from 2 or 3, in Burundi and Rwanda, to more than 400 in Nigeria. Multilingualism is indeed ubiquitous in sub-Saharan African societies. To support the development and use of languages, many institutions and organizations have been created, often under the auspices of UNESCO or the African Union. In summary, the major issues addressed by these initiatives are:
- the development and standardization of linguistic resources in many languages, not just the higher-resourced ones,
- the introduction of national languages into the digital space through the creation and dissemination of content in local languages,
- multilingual access to digital resources.
If equipped with linguistic and computer resources, languages that have a written form can be integrated into the products of major players in the digital world, who are attracted by a market with great economic potential. For instance, mobile phone manufacturers offer more and more models with textual and graphical interfaces in African languages. Nevertheless, using written or textual interfaces requires users to be literate. According to Denis Gikunda, director of the development program in African languages at Google, one of the key points of online market development in Africa is to ensure that applications talk to Africans in the true sense of the word. Several UNESCO publications make explicit reference to speech synthesis (and recognition) as a technological facilitator (one can cite, for instance, the following: “The illiteracy rate remains high: the use of voice interfaces is relevant”).
Thus, the present context is very favorable to the development of a market for speech services in African languages. People access ICT mainly through mobile phones (and keyboards), and the need for voice services can be found in all sectors: from higher-priority ones (health, food) to more recreational ones (games, social media).

To achieve this, the language barrier must be overcome, and this is what we propose in this project, which involves two main aspects: fundamentals of speech analysis (language description, phonology, dialectology) and speech technologies (ASR and TTS) for African languages. The ALFFA project is truly interdisciplinary, since it gathers not only technology experts (LIA, LIG, VOXYGEN) but also fieldwork linguists and phoneticians (DDL). Such a partnership is very important, since we want to build on the strong experience of field linguists in data collection, as well as their knowledge of dialectal and regional differences, which are particularly important in Africa. In the project, the ASR and TTS technologies developed would be used to build micro speech services for mobile phones in Africa (for instance, a phone service to consult the “price of commodities” or to provide “voice reporting for information systems”). If accepted, the ALFFA project would help a young start-up (Voxygen) to interact with academics on the fundamental aspects of African languages and to start deploying prototypes and services on a continent where the telecom market has strong potential. In addition, the project would help the academic partners to reach international leadership in the domain of speech processing and analysis for African languages, which would reinforce their (already large) collaboration network on this continent. To this end, subcontracting is planned within the framework of the ALFFA project in order to set up sustainable collaborations with local actors (academics, NGOs) in Africa. The scientific challenges associated with the ALFFA project are detailed in the ANR project proposal form.

Project coordination

Laurent Besacier (Laboratoire d'Informatique de Grenoble)

The author of this summary is the project coordinator, who is responsible for its content. The ANR declines any responsibility for its contents.

Partners

DDL Laboratoire Dynamique Du Langage
VOX Voxygen SA
LIG Laboratoire d'Informatique de Grenoble
LIA Laboratoire d'Informatique d'Avignon

ANR funding: 395,596 euros
Beginning and duration of the scientific project: September 2013 - 48 months
