Index Thomisticus Treebank

Description

The Index Thomisticus is considered one of the pioneering projects in the research areas known today as Computational Linguistics, Humanities Computing and Digital Humanities. Started in 2006 at Università Cattolica del Sacro Cuore (Milan, Italy), the Index Thomisticus Treebank is a dependency-based, syntactically annotated corpus built on the texts of Thomas Aquinas provided by the Index Thomisticus corpus. The annotation style of the treebank is based on the guidelines developed in Prague for the so-called 'analytical' layer of annotation of the Prague Dependency Treebank for Czech. In addition, specific guidelines for the syntactic annotation of Latin texts were written jointly with the Latin Dependency Treebank, which is developed by the Perseus Digital Library at Tufts University in Boston, MA (USA). Recently, the project has also begun the semantic and pragmatic annotation of the syntactically annotated data already available for the Index Thomisticus. This new level of annotation resembles the so-called 'tectogrammatical' layer of the Prague Dependency Treebank; it features semantic role labelling, ellipsis resolution, coreference analysis and sentential information structure.

A more general aim of the Index Thomisticus Treebank project is to develop and make available different kinds of advanced language resources for Latin. Beyond the Index Thomisticus Treebank itself, the project also includes the following resources: a semantically and pragmatically annotated portion of the Latin Dependency Treebank (in the same annotation style used for the Index Thomisticus Treebank), which features texts by authors of the Classical era; a syntactically based valency lexicon (IT-VaLex), automatically induced from the syntactic layer of annotation of the Index Thomisticus Treebank; and a semantically based valency lexicon (VALLEX), built in close connection with the semantic and pragmatic annotation of both the Latin Dependency Treebank and the Index Thomisticus Treebank.

Primary Subjects

Index Thomisticus, Romance philology and linguistics, Informatics.

Start Date

2006-01-01

Associated Entities

Università Cattolica del Sacro Cuore, Centro Interdisciplinare di Ricerche per la Computerizzazione dei Segni dell’Espressione (CIRCSE).

Contributors

Savina Raynaud, Marco Passarotti, Paolo Ruffolo, Flavio Massimiliano Cecchini, Marinella Testori.

Related Projects

LiLa: Linking Latin, The Ancient Greek and Latin Dependency Treebank.

Location

Milan.

Research Activities

Disseminating, Annotating, Lemmatizing.

Technologies Used

XML

Outputs

Index Thomisticus Treebank Resources.

Homepage

https://itreebank.marginalia.it/

Project Status

Completed

Research Project

This record catalogues digital scholarly activity (academic research projects) as an instance of the PROV-O Activity class. See the Documentation page for more information.

Permalink

http://purl.org/knot/data/index-thomisticus-project