Langues et civilisations
à tradition orale




EuroSlav 2010 Program


Electronic database of endangered Slavic varieties in non-Slavic speaking European countries (2010-2012)

Co-PIs: Evangelia Adamou & Walter Breu (U. Konstanz)


Awarding Body:

French National Research Agency (ANR) & Deutsche Forschungsgemeinschaft (DFG)


The EuroSlav 2010 project has been set up to create an electronic database of the endangered Slavic varieties found in non-Slavic-speaking European countries. Five Slavic varieties in particular will be targeted: those spoken in Italy (Molise Slavic: Acquaviva Collecroce, Montemitro and San Felice del Molise), in Austria (Burgenland Croatian), in Germany (Colloquial Upper Sorbian) and in Greece (Liti and Hrisa).
Access the EuroSlav2010 corpus here.


This project will be carried out by specialists in these varieties from French and German research teams. Previous work by these specialists has already contributed to knowledge of these languages, especially in the domains of language contact and typology.

The EuroSlav 2010 Project will build on this work and create an oral database from data gathered in the field by its research members. The data will be made accessible online in XML format. The archived materials will comprise both previously recorded data and additional data gathered on location. The corpus will contain standardized morphosyntactic annotation, in conformity with the glossing traditions established in linguistic typology and Slavic studies. In the database, the recordings and transcriptions will be paired, and will be available either sentence by sentence or as a single stream. Each variety will be represented by one hour of recordings. To ensure maximum sustainability and availability, the corpus will also be integrated into the Pangloss collection of the Lacito-CNRS “Oral Archives” program, which already contains 430 documents in 70 languages, as well as into the COCOON (COllections de COrpus Oraux Numériques) platform, which houses 3400 documents in 100 different languages. Both of these archives are freely accessible to the public.
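
The pairing of recordings and transcriptions described above can be pictured with a small sketch. The XML layout below is a hypothetical illustration only: the element and attribute names (TEXT, S, AUDIO, FORM, TRANSL, start, end) and the sample content are assumptions made for this example, not the project's published schema. The sketch shows how sentence-level time codes would allow the same file to be read either sentence by sentence or as a single stream.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Element names, attributes and content are
# illustrative assumptions, not the actual EuroSlav 2010 schema or data.
SAMPLE = """<TEXT id="demo">
  <S id="s1">
    <AUDIO start="0.00" end="2.40"/>
    <FORM>first sentence</FORM>
    <TRANSL lang="en">first translation</TRANSL>
  </S>
  <S id="s2">
    <AUDIO start="2.40" end="5.10"/>
    <FORM>second sentence</FORM>
    <TRANSL lang="en">second translation</TRANSL>
  </S>
</TEXT>"""


def sentences(xml_text):
    """Return one record per sentence, pairing its text with its audio span."""
    root = ET.fromstring(xml_text)
    records = []
    for s in root.findall("S"):
        audio = s.find("AUDIO")
        records.append({
            "id": s.get("id"),
            "start": float(audio.get("start")),  # seconds into the recording
            "end": float(audio.get("end")),
            "form": s.findtext("FORM"),
            "translation": s.findtext("TRANSL"),
        })
    return records


def as_stream(records):
    """Join the sentence transcriptions into one continuous text."""
    return " ".join(r["form"] for r in records)


SENTS = sentences(SAMPLE)
for r in SENTS:
    print(f'{r["id"]} [{r["start"]:.2f}-{r["end"]:.2f}] {r["form"]}')
print(as_stream(SENTS))
```

Keeping the time codes on each sentence, rather than in a separate file, is one simple way to let a player jump to a sentence or play the whole recording while highlighting the text; the actual corpus may organize this differently.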

Acquaviva Collecroce (Molise)
Photo: © W. Breu, 2000

The EuroSlav 2010 Project is unique because:

  • it will create an annotated corpus of previously undocumented disappearing Slavic varieties, and will make the corpus available online for the international scientific community;
  • the corpus will be rich in language contact phenomena and will thus fuel research carried out in this domain;
  • it will grant access to non-standardized Slavic languages for both typologists and general linguists;
  • this rapidly vanishing linguistic heritage will be preserved for future use.


Women of San Felice
Photo: © W. Breu, 2005

(created on 17 December 2009 – updated 09/01/2013)
