WaCky - The Web-As-Corpus Kool Yinitiative

Description

A community of linguists and information technology specialists who came together to develop a set of tools (and interfaces to existing tools) that allow linguists to crawl a section of the web, then process, index, and search the resulting data. The project has produced corpora in English, French, German, and Italian, as well as tools for building corpora from online texts.

Primary Subjects

Historical and general linguistics, Informatics.

Start Date

2007-12-01

Associated Entities

Universität Hildesheim, Universität Stuttgart, Università degli Studi "G. D'Annunzio", Università di Bologna, Università degli studi di Trento, Università di Pisa, Kokuritsu Kokugo Kenkyūsho, Technische Universität Darmstadt, Universitetet i Oslo, University of Leeds, United States Naval Academy.

Contributors

Marco Baroni, Silvia Bernardini.

Related Projects

LiMiNe (Linguistic Mining of the Net)

Location

Pescara, Trento, Bologna, Pisa.

Research Activities

Web Scraping, Gathering, Annotating.

Technologies Used

BootCaT, Corpus Tools.

Outputs

WaCky Corpora.

Bibliographic References

https://wacky.sslmit.unibo.it/doku.php?id=publications

Homepage

https://wacky.sslmit.unibo.it/

Project Status

Ongoing

Research Project

This record catalogues digital scholarly activity (academic research projects) as an instance of the PROV-O Activity class. See the Documentation page for more information.

Permalink

http://purl.org/knot/data/wacky-project