Voices of the great war: A richly annotated corpus of Italian texts on the first world war

  • Alessandro Lenci
  • Simonetta Montemagni
  • Federico Boschetti
  • Irene de Felice
  • Stefano dei Rossi
  • Felice Dell'Orletta
  • Michele Di Giorgio
  • Martina Miliani
  • Lucia C. Passaro
  • Angelica Puddu
  • Giulia Venturi
  • Nicola Labanca

Research output: Chapter in book/report/conference proceedings, Conference contribution, peer-reviewed

Abstract

Voci della Grande Guerra (“Voices of the Great War”) is the first large corpus of Italian historical texts dating back to the period of the First World War. This corpus differs from other existing resources in several respects. First, from the linguistic point of view, it accounts for the wide range of varieties in which Italian was articulated in that period, along the diastratic (educated vs. uneducated writers), diaphasic (low/informal vs. high/formal registers) and diatopic (regional varieties, dialects) dimensions. Second, from the historical perspective, through a collection of texts belonging to different genres, it represents different views on the war and the various styles of narrating war events and experiences. The final corpus is balanced along various dimensions, corresponding to the textual genre, the language variety used, the author type and the type of content conveyed. The corpus is annotated with lemmas, parts of speech, terminology, and named entities. Significant corpus samples representative of the different “voices” have also been enriched with meta-linguistic and syntactic information. The layer of syntactic annotation forms the first nucleus of an Italian historical treebank complying with the Universal Dependencies standard. The paper illustrates the final resource, the methodology and tools used to build it, and the web interface for navigating it.

Original language: English
Title of host publication: LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
Editors: Nicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Publisher: European Language Resources Association (ELRA)
Pages: 911-918
Number of pages: 8
ISBN (electronic): 9791095546344
Publication status: Published - 2020
Externally published
Event: 12th International Conference on Language Resources and Evaluation, LREC 2020 - Marseille, France
Duration: 11 May 2020 to 16 May 2020

Publication series

Name: LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings

Conference

Conference: 12th International Conference on Language Resources and Evaluation, LREC 2020
Country/Territory: France
City: Marseille
Period: 11/05/20 to 16/05/20
