WaCky - The Web-As-Corpus Kool Yinitiative

Description

A community of linguists and information technology specialists who came together to develop a set of tools (and interfaces to existing tools) that allow linguists to crawl a section of the web, process the data, and index and search it. The project has produced corpora in English, French, German, and Italian, as well as tools for building corpora from online texts.

Primary Subjects

Historical and general linguistics, Informatics.

Start Date

2007-12-01

Associated Entities

Università degli studi di Trento, Università di Pisa, Kokuritsu Kokugo Kenkyūsho, Technische Universität Darmstadt, Universitetet i Oslo, University of Leeds, United States Naval Academy, Universität Hildesheim, Universität Stuttgart, Università degli Studi "G. D'Annunzio", Università di Bologna.

Contributors

Silvia Bernardini, Marco Baroni.

Related Projects

LiMiNe (Linguistic Mining of the Net)

Location

Trento, Bologna, Pisa, Pescara.

Research Activities

Annotating, Web Scraping, Gathering.

Technologies Used

BootCaT, Corpus Tools.

Outputs

WaCky Corpora.

Bibliographic References

https://wacky.sslmit.unibo.it/doku.php?id=publications

Homepage

https://wacky.sslmit.unibo.it/

Project Status

Ongoing

Research Project

This record catalogues digital scholarly activity (academic research projects) as an instance of the PROV-O Activity class. See the Documentation page for more information.

Permalink

http://purl.org/knot/data/wacky-project