Il quaderno di Paolo Bufalini

Overview

The Semantic Scholarly Digital Edition (SSDE) of Paolo Bufalini’s notebook is based on the handwritten notes of Paolo Bufalini, a member of the Italian Communist Party, Latinist and translator.

Between 1981 and 1991, Bufalini kept a private notebook which he titled Appunti 1981–1991 (transl. Notes 1981–1991). The contents of the notebook reflect Bufalini's intellectual and social life: excerpts from literary works, personal notes and comments, narrations of events, unpublished translations from Latin into Italian. Given the private nature of the notebook, the relationships between the various textual components (text-text, text-comment, text-translation, translation-translation) are not explicitly declared. The same applies to the sources of the cited texts. Due to its complexity in terms of conceptualisation, Paolo Bufalini’s notebook has been chosen as a starting point on which to elaborate a non-linear reading.

The project resulted in a Semantic Scholarly Digital Edition (SSDE) web application which aims to provide tools to access, browse, visualise, study and preserve the notebook and to contextualise it in a semantically-rich and metatextual framework.

The project documentation is organised as follows:

Project objectives - This section introduces the specific objectives underlying the SSDE project.
Methodology - This section explains the methodology used to develop the SSDE web application, specifically the development workflow and the reused data models.
RDF dataset components - This section explains the structure of the RDF graph by dividing it into a series of smaller models.
Web application development - This section explains the various steps to develop the SSDE web application.
Filters for semantic queries - This section illustrates the main SSDE web application features in terms of user interaction and data visualisation, specifically focusing on the design of Filters, a user-friendly tool to access, extract, browse and visualise semantic data.

Project objectives

The objectives are to identify, analyse, formalise and semantically represent the implicit intratextual and intertextual relations that characterise the notebook.

Textual encoding and Semantic Web technologies are integrated, respectively aiming at visualising the logical structure of the document and extracting semantic information from the document.

The result is a knowledge graph where nodes represent the entities of the edition (quotations, comments, translations, authors, works, persons, places, etc.) and arcs represent relations between entities.

The edition becomes a network of entities which are uniquely identified by means of persistent URIs.

Methodology

We introduce here the XML/TEI P5 markup model, the conversion process from the XML/TEI P5 document into an HTML document, the process of data extraction, and the transformation into RDF according to the specific ontologies.

Markup model

The text of the notebook is marked up in XML/TEI P5, with a specific focus on a number of entities that are meant to populate the knowledge base.

Person

In <teiHeader> each personal name is included in a <person> element and uniquely identified by an @xml:id. Corresponding VIAF/DBpedia records are included via the attribute @sameAs. Each occurrence of a person’s name in the document is linked to that person via the attribute @ref, which points at the ID assigned in <teiHeader>.


<person xml:id="MTC">
        <persName sameAs="http://viaf.org/viaf/126144814493626773404 http://it.dbpedia.org/resource/Marco_Tullio_Cicerone">Cicerone, Marco Tullio (106 a.C.–43 a.C.)</persName>
</person>


<bibl><author ref="#MTC">Cicerone</author></bibl>
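The @ref lookup described above can be sketched in Python with the standard library. This is only an illustration and not part of the project code: the reduced TEI wrapper and the function names index_persons and resolve_authors are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, heavily reduced TEI document built from the two snippets above.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <person xml:id="MTC">
      <persName sameAs="http://viaf.org/viaf/126144814493626773404 http://it.dbpedia.org/resource/Marco_Tullio_Cicerone">Cicerone, Marco Tullio (106 a.C.–43 a.C.)</persName>
    </person>
  </teiHeader>
  <text><body><bibl><author ref="#MTC">Cicerone</author></bibl></body></text>
</TEI>"""


def index_persons(root):
    """Map each person's xml:id to the full name and the list of sameAs URIs."""
    persons = {}
    for person in root.iter(f"{{{TEI_NS}}}person"):
        name = person.find(f"{{{TEI_NS}}}persName")
        persons[person.get(XML_ID)] = {
            "name": name.text,
            # @sameAs holds several URIs separated by whitespace.
            "same_as": name.get("sameAs", "").split(),
        }
    return persons


def resolve_authors(root, persons):
    """Pair each <author> occurrence with the person record its @ref points at."""
    pairs = []
    for author in root.iter(f"{{{TEI_NS}}}author"):
        key = author.get("ref", "").lstrip("#")
        pairs.append((author.text, persons.get(key)))
    return pairs


root = ET.fromstring(SAMPLE)
persons = index_persons(root)
resolved = resolve_authors(root, persons)
```

Here resolved pairs the abbreviated form found in the text with the full authority record, which is the information later shown in the tooltips of the HTML edition.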

Comments by Bufalini or other authors

<note> encloses the excerpt corresponding to a comment. @type="comment" specifies that the given excerpt corresponds to a comment.

@corresp points to the @xml:id attribute of the referenced excerpt, i.e., a quotation or a translation (see below).


<note type="comment" xml:id="comm-002" corresp="#quot-001-1" resp="#PB">
        <p>Gettare olio sulle acque torbide</p>
        <p>Gettare olio sulle acque burrascose</p>
        <p>Gettare olio sulle acque inasprite.</p>
        <p>E si adagiano anzi in placida lentezza</p>
</note>

Quotations

<quote> includes the text of a quoted excerpt. @source links a quotation to the bibliographical description of the work included in <teiHeader>.


<quote xml:id="quot-150" xml:lang="lat" source="#bibl126">
        <p>Necessitati parēre semper sapientis est habitum</p>
</quote>

Translations by Paolo Bufalini or other authors

The translations by Bufalini are encoded following the schema below.

<quote> encloses the text of the excerpt in its original language.

<note type="translation"> encloses the corresponding translation by Bufalini or other authors.

@corresp links to the unique identifier of the translated excerpt, which is in turn linked (via @source ) to the bibliographical description of the translated work in the <teiHeader>.


<quote xml:id="quot-002" xml:lang="lat" source="#bibl001">
        <l>Visendus ater flumine languido</l>
        <l>Cocytus errans... </l>
</quote>
<bibl>(Hor)</bibl>
<note type="translation" xml:id="tra-001" corresp="#quot-002" resp="#PB">
        <lg>
                <l>Dobbiamo vederlo il nero fiume lento</l>
                <l>Cocito errante... </l>
        </lg>
</note>
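The chain from a translation to the translated excerpt and on to its source can be followed programmatically. The sketch below is illustrative only: the <div> wrapper and the function name link_notes are assumptions, and namespaces are omitted because the fragment above declares none.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The fragment above, wrapped in a hypothetical <div> so that it parses on its own.
SAMPLE = """<div>
<quote xml:id="quot-002" xml:lang="lat" source="#bibl001">
  <l>Visendus ater flumine languido</l>
  <l>Cocytus errans... </l>
</quote>
<bibl>(Hor)</bibl>
<note type="translation" xml:id="tra-001" corresp="#quot-002" resp="#PB">
  <lg>
    <l>Dobbiamo vederlo il nero fiume lento</l>
    <l>Cocito errante... </l>
  </lg>
</note>
</div>"""


def link_notes(root, note_type):
    """Follow @corresp from each note of the given type to its target excerpt."""
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    links = []
    for note in root.iter("note"):
        if note.get("type") != note_type:
            continue
        target = by_id.get(note.get("corresp", "").lstrip("#"))
        links.append({
            "note": note.get(XML_ID),
            "target": target.get(XML_ID) if target is not None else None,
            # @source on the target points at the bibliographical description.
            "source": target.get("source") if target is not None else None,
            "resp": note.get("resp"),
        })
    return links


links = link_notes(ET.fromstring(SAMPLE), "translation")
```

Calling link_notes with "comment" instead of "translation" would follow the same @corresp mechanism for comments.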

Works

Like persons, works described in the <teiHeader> are linked to the corresponding VIAF/DBpedia/WorldCat records by @sameAs. An example is reported below.


<bibl xml:id="ad-familiares-4-9" resp="#CSPC">
        <author ref="#MTC"/>
        <title level="m" type="WorkCollection" subtype="Book" sameAs="http://viaf.org/viaf/187066890 http://worldcat.org/entity/work/id/2261383546">Ad familiares</title>
        <biblScope unit="volume" n="4">4</biblScope>
        <title level="a" n="9" type="Work" subtype="Letter">9</title>
        <date when="-0046">46 a.C.</date>
        <citedRange xml:id="bibl126" unit="Excerpt" n="sample-1">sample 1</citedRange>
</bibl>

HTML conversion

An HTML document containing the transcription of the text is obtained by means of an XSL transformation of the XML/TEI P5 file into an HTML file.

Not all the information encoded in the XML/TEI file is converted, only what is necessary for typesetting the document. For instance, the authority files are not included in the resulting HTML.

The selected information is represented in the HTML as follows.

Persons

A <span class="persName"> identifies a person’s name in the transcription, whether it is reported as a full name or as an abbreviation. A <span class="tooltiptext"> is nested in each identified person to make the person’s full name explicit.

@data-ref specifies the ID assigned to the person, as formalised in the XML/TEI encoding.

The class author is added to <span class="persName"> in the case the person is the author of a specific excerpt. An example is reported below.


<span class="tooltip persName" data-ref="#MTC"><span class="tooltiptext">Cicerone, Marco Tullio (106 a.C.–43 a.C.)</span>Cicerone</span>

Comments

Comments are included in a <section class="note" data-type="comment"> element.

Each comment is uniquely identified by a specific @id. Each assigned ID contains the string "comm-" at its beginning. An example is reported below.


<section class="note" data-type="comment" id="comm-002" data-corresp="#quot-001-1" data-resp="#PB">
        <p>Gettare olio sulle acque torbide</p>
        <p>Gettare olio sulle acque burrascose</p>
        <p>Gettare olio sulle acque inasprite.</p>
        <p>E si adagiano anzi in placida lentezza</p>
</section>

Quotations

Quotations are included in a <blockquote> element.

Each quotation is uniquely identified by an @id. Each assigned ID contains the string "quot-" at its beginning.

@data-source identifies the source work of each quotation. Its value corresponds to the @source value assigned in the XML/TEI document. An example is reported below.


<blockquote id="quot-150" lang="lat" data-source="#bibl126">
        <p>Necessitati parēre semper sapientis est habitum</p>
</blockquote>
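The project performs this conversion with XSL. Purely to make the attribute mapping explicit, the same renaming can be sketched in Python; the function name quote_to_blockquote is an assumption, and the sketch covers only the attributes and child elements shown in the examples of this documentation.

```python
import xml.etree.ElementTree as ET

XML_NS = "{http://www.w3.org/XML/1998/namespace}"


def quote_to_blockquote(tei_quote):
    """Rebuild a TEI <quote> as an HTML <blockquote>, renaming the attributes."""
    html = ET.Element("blockquote", {
        "id": tei_quote.get(XML_NS + "id"),        # @xml:id   -> @id
        "lang": tei_quote.get(XML_NS + "lang"),    # @xml:lang -> @lang
        "data-source": tei_quote.get("source"),    # @source   -> @data-source
    })
    for child in tei_quote:
        # Verse lines (<l>) become <p class="l">; paragraphs stay plain <p>.
        p = ET.SubElement(html, "p")
        if child.tag == "l":
            p.set("class", "l")
        p.text = child.text
    return html


tei = ET.fromstring(
    '<quote xml:id="quot-150" xml:lang="lat" source="#bibl126">'
    "<p>Necessitati parēre semper sapientis est habitum</p>"
    "</quote>"
)
html_text = ET.tostring(quote_to_blockquote(tei), encoding="unicode")
```

The sketch assumes that all three attributes are present on the input element, as they are in the examples.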

Translations

Translations are included in <section class="note" data-type="translation"> and <blockquote class="note" data-type="translation"> elements.

Each translation is uniquely identified by a specific @id. Each assigned ID contains the string "tra-" at its beginning.

@data-corresp identifies the ID of the translated excerpt, while @data-resp identifies the ID of the translator. An example is reported below.


<section class="cit" id="horatio-carm-2-14-17">
        <blockquote id="quot-002" lang="lat" data-source="#bibl001">
                <p class="l">Visendus ater flumine languido</p>
                <p class="l">Cocytus errans... </p>
        </blockquote> (<span class="author persName tooltip" data-ref="#QOF"><span class="tooltiptext">Orazio Flacco, Quinto (65 a.C.–8 a.C.)</span>Hor</span>) <section class="note" data-type="translation" id="tra-001" data-corresp="#quot-002" data-resp="#PB">
                <section class="lg">
                        <p class="l">Dobbiamo vederlo il nero fiume lento</p>
                        <p class="l">Cocito errante... </p>
                </section>
        </section>
</section>

Works

Works are identified by @data-source in <blockquote> and <section> elements. Their values correspond to the @source values, as assigned in the XML/TEI document.


<blockquote id="quot-150" lang="lat" data-source="#bibl126">
        <p>Necessitati parēre semper sapientis est habitum</p>
</blockquote>

The final HTML document contains the information relevant to the text transcription visualisation.

RDF conversion

The Resource Description Framework (RDF) is a data model according to which data is organised in a graph data structure.

The resulting graph is composed of a series of statements about the specific knowledge domain. Each statement is structured in a triple according to the schema subject-predicate-object.
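As a minimal illustration of the subject-predicate-object schema, triples can be held as plain tuples and queried by pattern. The statements are taken from the comment and translation examples in this documentation, with URIs shortened to their last segments; the match helper is hypothetical and is not how the application queries its triplestore.

```python
# Each statement is a (subject, predicate, object) tuple.
triples = [
    ("note/tra-001", "frbr:translationOf", "quote/quot-002"),
    ("note/tra-001", "frbr:realizer", "person/pb"),
    ("note/comm-002", "frbr:realizer", "person/pb"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Everything realised by Paolo Bufalini (person/pb).
by_bufalini = match(triples, p="frbr:realizer", o="person/pb")
```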

RDF graphs may be serialised according to diverse schemas, such as Turtle, N-Triples, N-Quads, etc. The RDF dataset of Paolo Bufalini’s notebook is organised in several graphs, each including information of diverse nature (e.g. edition-specific information, authors and works relations, context information and so on).

The XML/TEI data extraction focused on a specific number of selected TEI elements.

The extracted entities are: textual components (comments, quotations and translations), extracted from the body of the XML/TEI document; persons mentioned by the author; and bibliographical entities cited by the author. Persons and bibliographical entities are extracted from the authority lists in the teiHeader.

In addition, intratextual relations (i.e. the relations between a comment/quotation/translation and its author) and intertextual relations (i.e. the relations between an excerpt and its translations and comments) are extracted from the XML/TEI document body.

A further encoding level defines the provenance of the assertions, describing whether a phenomenon has been identified by Bufalini or by the editors and what information supports the assessment of that phenomenon, according to the nanopublication data model.

Fig 1. Kuhn, T., Nanopublications. Provenance-Aware Linked Data Publishing, 2015.

Finally, links to the authority files and external datasets are included to enrich the RDF knowledge base.

The final dataset consists of named graphs: one graph including all the textual elements, and 64 graphs related to the assertions, their provenance and editorial information, as formalised by the text editors.

RDF graphs are used to serve data in the user interface and to build user-friendly tools for browsing them on the client side. Specifically, they are used to support the browsing of Facsimiles and to develop the Filters sidebar in the Indexes and the Data Visualisation charts.

Reused models

A set of existing ontologies is reused to describe the notebook entities, their relationships, and their semantic contextual information.

FRBR-aligned Bibliographic Ontology (FaBiO)

The FRBR-aligned Bibliographic Ontology (FaBiO) is an ontology to describe published or potentially publishable entities, such as journal articles, conference papers and books, which contain or are referred to by bibliographic references.

In this project, the FaBiO ontology is used to represent the articulation of the levels of the textual object and some peritextual elements such as page titles, notes and quotes.

Open Annotation Model

The Open Annotation Model is an ontology to provide a standard description mechanism for sharing Annotations between systems.

In this project, the Open Annotation ontology is used to represent the mentioned persons’ roles.

Citation Typing Ontology (CiTO)

The Citation Typing Ontology (CiTO) is an ontology to characterise the nature or type of citations.

In this project, CiTO ontology is used to formalise the underlying relations between cited works, specifically the relations between works cited by Bufalini and citations between different authors.

Historical Context Ontology (HiCO)

The Historical Context Ontology (HiCO) is an ontology to describe the context information of cultural heritage objects.

In this project, HiCO ontology is used to formalise statements on the texts cited by Bufalini, influences between authors and assumptions made by authors cited by Bufalini.

RDF dataset components

Persons

Each person mentioned in the notebook is formalised by a URI that uniquely identifies the person and by rdf:resource="http://xmlns.com/foaf/0.1/Person", which types the entity as a person. The corresponding VIAF and DBpedia URIs are associated at this level via owl:sameAs. An example is reported below.


<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/person/mtc">
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<rdfs:label>Cicerone, Marco Tullio (106 a.C.–43 a.C.)</rdfs:label>
<owl:sameAs rdf:resource="http://viaf.org/viaf/126144814493626773404"/>
<owl:sameAs rdf:resource="http://it.dbpedia.org/resource/Marco_Tullio_Cicerone"/>
</rdf:Description>
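To show how such a description can be read back, the sketch below parses the example as plain XML with the Python standard library. It is not a full RDF parser and is not taken from the project code: the <rdf:RDF> wrapper with its namespace declarations is added here so that the fragment parses on its own, and describe_person is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
RDF = "{" + NS["rdf"] + "}"

# The example above inside an assumed <rdf:RDF> root element.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/person/mtc">
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<rdfs:label>Cicerone, Marco Tullio (106 a.C.–43 a.C.)</rdfs:label>
<owl:sameAs rdf:resource="http://viaf.org/viaf/126144814493626773404"/>
<owl:sameAs rdf:resource="http://it.dbpedia.org/resource/Marco_Tullio_Cicerone"/>
</rdf:Description>
</rdf:RDF>"""


def describe_person(description):
    """Collect URI, types, label and owl:sameAs links of one rdf:Description."""
    return {
        "uri": description.get(RDF + "about"),
        "types": [t.get(RDF + "resource")
                  for t in description.findall("rdf:type", NS)],
        "label": description.findtext("rdfs:label", namespaces=NS),
        "same_as": [s.get(RDF + "resource")
                    for s in description.findall("owl:sameAs", NS)],
    }


root = ET.fromstring(SAMPLE)
person = describe_person(root.find("rdf:Description", NS))
```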

Comments

Each comment excerpt in the notebook is formalised by including:

A URI to uniquely identify the specific comment excerpt.
oa:hasTarget to declare the relationship between an annotation and its target.
oa:hasBody to report the resource body of the annotation, i.e. the full-text of the comment itself.
frbr:realizer to specify the author of the comment.
prism:startingPage to indicate the page in which the comment is reported.
oa:motivatedBy to describe the reason for the creation of the annotation, i.e. to formalise a specific excerpt as a comment.
frbr:complementOf to formalise the relationship between the comment and the excerpt to which it refers.

The body of the comment is included in a <section> element containing the following attributes:

@xmlns to specify the XML namespace for the document.
@type to specify the type of excerpt, i.e. comment.
@xml:id to assign a univocal ID to the excerpt.
@corresp to specify the excerpt to which it refers.
@resp to specify the author of the comment.

An example is reported below.


<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/comment/comm-002">
<rdf:type rdf:resource="http://www.w3.org/ns/oa#Annotation"/>
<oa:hasTarget rdf:resource="http://w3id.org/bufalinis-notebook/citation/livio-m-su-tm"/>
<oa:hasBody>
        <rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/note/comm-002">
        <rdf:value rdf:parseType="XMLLiteral">
                        <section xmlns="http://www.tei-c.org/ns/1.0"
                                        type="comment"
                                        xml:id="comm-002"
                                        corresp="#quot-001-1"
                                        resp="#PB">
                        <p>Gettare olio sulle acque torbide</p>
                        <p>Gettare olio sulle acque burrascose</p>
                        <p>Gettare olio sulle acque inasprite.</p>
                        <p>E si adagiano anzi in placida lentezza</p>
                        </section>
                </rdf:value>
                <frbr:realizer rdf:resource="http://w3id.org/bufalinis-notebook/person/pb"/>
                <prism:startingPage rdf:resource="http://w3id.org/bufalinis-notebook/page/1"/>
        </rdf:Description>
</oa:hasBody>
<oa:motivatedBy>
        <rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/comment">
                <rdfs:label>comment</rdfs:label>
        </rdf:Description>
</oa:motivatedBy>
<frbr:complementOf rdf:resource="http://w3id.org/bufalinis-notebook/quote/quot-001-1"/>
</rdf:Description>

Quotations

Each quotation excerpt in the notebook is formalised by including:

A URI to uniquely identify the specific citation.
cito:hasCitingEntity and cito:hasCitedEntity to allow the citation to be reified, so that it can become the subject or object of other RDF statements.
A URI inserted in a <rdf:Description> element nested into the <cito:hasCitedEntity>.
prism:startingPage to indicate the page in which the quotation is reported.
A URI to uniquely identify the specific quotation excerpt body.
oa:hasTarget to declare the relationship between an annotation and its target.
oa:hasBody to report the resource body of the annotation, i.e. the full text of the quotation, which in this case is indicated by the URI assigned in the <cito:hasCitedEntity> element.
oa:motivatedBy to describe the reason for the creation of the annotation, i.e. to formalise a specific excerpt as a quotation.

The body of the quotation is included in a <section> element carrying the attribute @xmlns, which specifies the XML namespace for the document.

An example is reported below. It reports the formalisation of a quotation from Lettere by Cicerone.


<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/citation/cicerone-lettere-46-ac">
<rdf:type rdf:resource="http://purl.org/spar/cito/Citation"/>
<cito:hasCitingEntity rdf:resource="http://w3id.org/bufalinis-notebook/page/112"/>
<cito:hasCitedEntity>
        <rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/excerpt/bibl126">
                <rdf:value rdf:parseType="XMLLiteral">
                        <section xmlns="http://www.tei-c.org/ns/1.0">
                        <p>Necessitati parēre semper sapientis est habitum</p>
                        </section>
                </rdf:value>
                <prism:startingPage rdf:resource="http://w3id.org/bufalinis-notebook/page/112"/>
        </rdf:Description>
</cito:hasCitedEntity>
</rdf:Description>
<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/quote/quot-150">
<rdf:type rdf:resource="http://www.w3.org/ns/oa#Annotation"/>
<oa:hasTarget rdf:resource="http://w3id.org/bufalinis-notebook/citation/cicerone-lettere-46-ac"/>
<oa:hasBody rdf:resource="http://w3id.org/bufalinis-notebook/excerpt/bibl126"/>
<oa:motivatedBy>
        <rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/quotation">
                <rdfs:label>quotation</rdfs:label>
        </rdf:Description>
</oa:motivatedBy>
</rdf:Description>

Translations

Each translation excerpt in the notebook is formalised by including:

A URI to uniquely identify the specific translation excerpt.
oa:hasTarget to declare the relationship between an annotation and its target.
oa:hasBody to report the resource body of the annotation, i.e. the full-text of the translation itself.
A URI to uniquely identify the specific translation excerpt body.
frbr:translationOf to specify which is the source excerpt of the translation.
frbr:realizer to specify the author of the translation.
prism:startingPage to indicate the page bearing the translation.
oa:motivatedBy to describe the reason for the creation of the annotation, i.e. to formalise a specific excerpt as a translation.

The body of the translation is included in a <section> element containing the following attributes @xmlns to specify the XML namespace for the document, @type to specify the type of excerpt, i.e. translation, @xml:id to assign a univocal ID to the excerpt, @corresp to specify the excerpt to which the translation refers, and @resp to specify the author of the translation.

An example of translation from Horace is reported below.


<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/comment/tra-001">
<rdf:type rdf:resource="http://www.w3.org/ns/oa#Annotation"/>
<oa:hasTarget rdf:resource="http://w3id.org/bufalinis-notebook/citation/horatio-carm-2-14-17"/>
<oa:hasBody>
<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/note/tra-001">
        <rdf:value rdf:parseType="XMLLiteral">
                <section xmlns="http://www.tei-c.org/ns/1.0"
                                type="translation"
                                xml:id="tra-001"
                                corresp="#quot-002"
                                resp="#PB">
                <lg>
                        <l>Dobbiamo vederlo il nero fiume lento</l>
                        <l>Cocito errante... </l>
                </lg>
                </section>
        </rdf:value>
        <frbr:translationOf rdf:resource="http://w3id.org/bufalinis-notebook/quote/quot-002"/>
        <frbr:realizer rdf:resource="http://w3id.org/bufalinis-notebook/person/pb"/>
        <prism:startingPage rdf:resource="http://w3id.org/bufalinis-notebook/page/2"/>
</rdf:Description>
</oa:hasBody>
<oa:motivatedBy>
<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/translation">
        <rdfs:label>translation</rdfs:label>
</rdf:Description>
</oa:motivatedBy>
<frbr:translationOf rdf:resource="http://w3id.org/bufalinis-notebook/quote/quot-002"/>
</rdf:Description>

Works

The works cited in the notebook are formalised using FaBiO.

According to the reference model, each literary work cited in the notebook is formalised as follows.

FRBR Work

A URI to uniquely identify the work.
rdf:resource="http://purl.org/spar/fabio/Poem" to represent poems.
rdf:resource="http://purl.org/spar/fabio/Play" to represent plays.
rdf:resource="http://purl.org/spar/fabio/Work" to represent published or potentially publishable works containing or referring to bibliographic references.
rdf:resource="http://purl.org/spar/fabio/WorkCollection" to represent collections of works.
owl:sameAs to associate VIAF/DBpedia/WorldCat identifiers.

FRBR Expression

A URI to uniquely identify the specific work Expression.
rdf:resource="http://purl.org/spar/fabio/Book" to identify the work realisation as a non-serial document in one or more volumes.
frbr:realizationOf to identify the reference Work.

Volume

A URI to uniquely identify a specific work Expression volume.
rdf:resource="http://purl.org/spar/fabio/Expression" which corresponds to a subclass of FRBR expression.
frbr:partOf to indicate that one entity is part of another entity, i.e. a specific volume is part of a specific work Expression.

Further information

A URI to uniquely identify a specific work subtype of Expression.
rdf:resource="http://purl.org/spar/fabio/Work" to represent published or potentially publishable works containing or referring to bibliographic references.
frbr:partOf to indicate that one entity is part of another entity, i.e. a specific subpart of volume is part of a specific work.
dcterms:creator to identify the author of the subtype of Expression.

The specific nature of a formalised subtype of Expression is represented also as follows:

rdf:resource="http://purl.org/spar/fabio/Letter".
rdf:resource="http://purl.org/spar/fabio/Novel".
rdf:resource="http://purl.org/spar/fabio/ShortStory".
etc.

An example is reported below, which shows the formalisation of the work Ad familiares by Cicerone.


<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/work/mtc-ad-familiares">
        <rdf:type rdf:resource="http://purl.org/spar/fabio/WorkCollection"/>
        <rdfs:label>Ad familiares</rdfs:label>
        <owl:sameAs rdf:resource="http://viaf.org/viaf/187066890"/>
        <owl:sameAs rdf:resource="http://worldcat.org/entity/work/id/2261383546"/>
        <dcterms:creator rdf:resource="http://w3id.org/bufalinis-notebook/person/mtc"/>
</rdf:Description>
<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/text/mtc-ad-familiares">
        <rdf:type rdf:resource="http://purl.org/spar/fabio/Book"/>
        <rdfs:label>Ad familiares</rdfs:label>
        <frbr:realizationOf rdf:resource="http://w3id.org/bufalinis-notebook/work/mtc-ad-familiares"/>
</rdf:Description>
<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/text/mtc-ad-familiares/volume-4">
        <rdf:type rdf:resource="http://purl.org/spar/fabio/Expression"/>
        <rdfs:label>Ad familiares, 4</rdfs:label>
        <frbr:partOf rdf:resource="http://w3id.org/bufalinis-notebook/text/mtc-ad-familiares"/>
</rdf:Description>
<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/work/mtc-ad-familiares/9">
        <rdf:type rdf:resource="http://purl.org/spar/fabio/Work"/>
        <rdfs:label>Ad familiares, 9</rdfs:label>
        <frbr:partOf rdf:resource="http://w3id.org/bufalinis-notebook/work/mtc-ad-familiares"/>
        <dcterms:creator rdf:resource="http://w3id.org/bufalinis-notebook/person/mtc"/>
</rdf:Description>
<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/text/mtc-ad-familiares/volume-4/9">
        <rdf:type rdf:resource="http://purl.org/spar/fabio/Letter"/>
        <rdfs:label>Ad familiares, 4, 9</rdfs:label>
        <frbr:partOf rdf:resource="http://w3id.org/bufalinis-notebook/text/mtc-ad-familiares/volume-4"/>
        <frbr:realizationOf rdf:resource="http://w3id.org/bufalinis-notebook/work/mtc-ad-familiares/9"/>
</rdf:Description>
<rdf:Description rdf:about="http://w3id.org/bufalinis-notebook/excerpt/bibl126">
        <rdf:type rdf:resource="http://purl.org/spar/fabio/Excerpt"/>
        <rdfs:label>Ad familiares, 4, 9</rdfs:label>
        <frbr:partOf rdf:resource="http://w3id.org/bufalinis-notebook/text/mtc-ad-familiares/volume-4/9"/>
</rdf:Description>
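The layering in the example above can be traversed from an excerpt up to the work it belongs to. The sketch below copies the frbr:partOf and frbr:realizationOf statements of the example into two dictionaries, with URIs shortened to their last segments; the helper names containers and work_of are assumptions made for illustration.

```python
# frbr:partOf statements from the example (child -> parent).
PART_OF = {
    "excerpt/bibl126": "text/mtc-ad-familiares/volume-4/9",
    "text/mtc-ad-familiares/volume-4/9": "text/mtc-ad-familiares/volume-4",
    "text/mtc-ad-familiares/volume-4": "text/mtc-ad-familiares",
    "work/mtc-ad-familiares/9": "work/mtc-ad-familiares",
}
# frbr:realizationOf statements from the example (expression -> work).
REALIZATION_OF = {
    "text/mtc-ad-familiares": "work/mtc-ad-familiares",
    "text/mtc-ad-familiares/volume-4/9": "work/mtc-ad-familiares/9",
}


def containers(node):
    """Follow frbr:partOf upwards and return every container, nearest first."""
    chain, seen = [], {node}
    while node in PART_OF:
        node = PART_OF[node]
        if node in seen:  # guard against accidental cycles in the data
            break
        seen.add(node)
        chain.append(node)
    return chain


def work_of(excerpt):
    """Return the Work realised by the nearest container that has one."""
    for node in containers(excerpt):
        if node in REALIZATION_OF:
            return REALIZATION_OF[node]
    return None


chain = containers("excerpt/bibl126")
letter_work = work_of("excerpt/bibl126")
```

For the excerpt bibl126, the nearest realised work is letter 9, which is in turn part of the Ad familiares collection.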

Web application development

Web application development is the last stage of the SSDE development and is based on the following resources:

XML/TEI document, named bufalini_quaderno.xml.
HTML document, obtained from the XSL conversion, named quaderno.html.
RDF/XML dataset, named quaderno_rdf.rdf.

The development of the SSDE web application requires:

web.py to develop a server-side web application and manage the RDF data available on the triplestore.
Blazegraph to store the dataset, including the aforementioned RDF document of the edition and the annotations included in bespoke named graphs.

To load the data into Blazegraph, it is necessary to install the rdflib library, run nquads.py and load rdf_dump/all.nq into Blazegraph.
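The content of nquads.py is not reproduced in this documentation. Purely to illustrate what a line of an N-Quads file such as all.nq looks like, the sketch below serialises one statement by hand, without rdflib. The graph URI and the helper names are hypothetical.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# Hypothetical graph name, used only for illustration.
GRAPH = "http://w3id.org/bufalinis-notebook/graph/example"


def escape_literal(text):
    """Escape backslashes, double quotes and line breaks for an N-Quads literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def nquad(subject, predicate, obj, graph, literal=False):
    """Serialise one statement as: <subject> <predicate> object <graph> ."""
    obj_term = f'"{escape_literal(obj)}"' if literal else f"<{obj}>"
    return f"<{subject}> <{predicate}> {obj_term} <{graph}> ."


line = nquad(
    "http://w3id.org/bufalinis-notebook/person/mtc",
    RDFS_LABEL,
    "Cicerone, Marco Tullio (106 a.C.–43 a.C.)",
    GRAPH,
    literal=True,
)
```

The fourth term is what distinguishes N-Quads from N-Triples: it names the graph the statement belongs to, which is how a dataset organised in several named graphs can be kept in a single file.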

On the client-side, the following frameworks and libraries have been reused:

Web pages translation

The texts of the web application are bilingual, thanks to the JS library i18n.

The library allows a class or ID to be assigned to the HTML elements to be translated; the translations are stored in dedicated files, such as Messages_en.properties and Messages_it.properties.

The file notebook.js reads the translations in the method loadBundles() and serves the client-side translation when clicking on the language buttons in the menu.

Filters for semantic queries

Filters are a client-side query tool, developed entirely in jQuery, for analysing the text of the edition from the point of view of its inter- and intratextual relationships, as formalised in the SSDE back-end LOD dataset.

Filters make available a set of checkboxes divided into categories reflecting the actual organisation and structure of the SSDE dataset. The categories are: Persons, which collects all the persons mentioned in the notebook; Role of persons, which makes it possible to select the role of a mentioned person, either Mentioned person or Author; and Works, which lists the works cited in the notebook.

Three functions in notebook.js implement the functionalities of Filters: editionFilters() for managing the filters in the Digital Edition section, peopleIndexFilters() for managing the filters in the Index of Persons section, and worksIndexFilters() for managing the filters in the Index of Works section.

These are based on the same algorithm adapted to the needs of each section. The basic algorithm works as follows.

Input

The algorithm takes as input the category assigned as a @class to each checkbox <input> element and its univocal @value to run the functions related to the Filters widget.

In addition, each filter <input> is assigned specific attributes that express the semantic relationships between that filter and the other entities formalised in the dataset. For instance, each person filter <input> is assigned the attribute @semantics, which reports all the entities related to the entity represented in the filter, directly extracted from the RDF dataset, according to a specific syntax:

person who mentions the input person__fragment of the work in which the citation is present, author#unique code assigned to the fragment in the dataset

The values assigned to these attributes express the relationships between entities as set in the dataset. They are used to represent the same relationships at the user interface level.

In case of homonyms, a -i string, where i=0 at the beginning of a cycle of data extraction, may be added in advance to avoid malfunctions.

An example is reported below, where @value="Thomas Mann" and other attributes are assigned to the filter for the Thomas Mann entity.


<input type="checkbox" class="peopleCheck" value="Thomas Mann" semantics="Bruno Arzeni__Saggi. Schopenhauer, Nietzche, Freud, Bruno Arzeni#quot-060">
<label>Mann Thomas</label>
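The @semantics syntax can be unpacked with a few string splits. The application does this in jQuery; the Python sketch below is only an illustration of the syntax, for a single entry. The function name parse_semantics and the key names are hypothetical, and the sketch assumes that the author name itself contains no ", " sequence, as in the example above.

```python
def parse_semantics(value):
    """Split 'mentioner__work, author#code' into its four components."""
    mentioner, rest = value.split("__", 1)
    work_and_author, code = rest.rsplit("#", 1)
    # The work title may itself contain commas, so split on the last one only.
    work, author = work_and_author.rsplit(", ", 1)
    return {
        "mentioned_by": mentioner,
        "work": work,
        "author": author,
        "code": code,
    }


parsed = parse_semantics(
    "Bruno Arzeni__Saggi. Schopenhauer, Nietzche, Freud, Bruno Arzeni#quot-060"
)
```

Splitting from the right keeps the commas inside the work title intact, while the fragment code after # matches the ID of the quotation in the edition.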

Processing

When a checkbox changes state, the algorithm searches the reference content on which the query is performed (the text in the edition, or the cards in the indexes) for elements whose attribute values match the input @value.

For instance, in the Digital Edition section, queries are performed on the HTML file quaderno.html, which is loaded into index.html via AJAX. There, a person entity is represented as follows.


<span class="tooltip persName" data-role="author" data-ref="#TM">
        <span class="tooltiptext">Mann, Thomas (1875–1955)</span>
        Th. Mann
</span>

When the checkbox corresponding to Thomas Mann is clicked, the algorithm takes the @value assigned to that checkbox as input and searches for it in the .tooltiptext values in quaderno.html. In the Digital Edition section, each DOM element containing a match is assigned specific attributes that carry the same semantic values as the filter <input>, so that they can be shown in the list of results (getRes() function).


<span class="tooltip persName" data-role="author" data-ref="#TM" id="ThomasMann-346" mentionedperson="Thomas Mann" cit="quot-060" work="Saggi. Schopenhauer, Nietzche, Freud, Arzeni, Bruno (1905–1954)" author="Arzeni, Bruno (1905–1954)">
        <span class="tooltiptext">Mann, Thomas (1875–1955)</span>
        Th. Mann
</span>
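The matching step can be sketched as follows. The real implementation lives in notebook.js and is written in jQuery; this Python version only illustrates the idea. How the checkbox @value ("Thomas Mann") is compared with the tooltip text ("Mann, Thomas (1875–1955)") is not documented here, so the normalisation in display_name is an assumption, as are the function names and the <div> wrapper.

```python
import re
import xml.etree.ElementTree as ET

# Two person spans in the shape produced by the HTML conversion.
SAMPLE = """<div>
<span class="tooltip persName" data-role="author" data-ref="#TM"><span class="tooltiptext">Mann, Thomas (1875–1955)</span>Th. Mann</span>
<span class="tooltip persName" data-ref="#MTC"><span class="tooltiptext">Cicerone, Marco Tullio (106 a.C.–43 a.C.)</span>Cicerone</span>
</div>"""


def display_name(tooltip_text):
    """Turn 'Surname, Forename (dates)' into 'Forename Surname' (assumed rule)."""
    without_dates = re.sub(r"\s*\(.*\)\s*$", "", tooltip_text)
    surname, _, forename = without_dates.partition(", ")
    return f"{forename} {surname}".strip()


def filter_persons(root, checked_value):
    """Return the data-ref of every persName span matching the checkbox value."""
    hits = []
    for span in root.iter("span"):
        if "persName" not in span.get("class", "").split():
            continue
        tooltip = span.find("span[@class='tooltiptext']")
        if tooltip is not None and display_name(tooltip.text) == checked_value:
            hits.append(span.get("data-ref"))
    return hits


matches = filter_persons(ET.fromstring(SAMPLE), "Thomas Mann")
```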

Output

Depending on the web application section, a specific output is returned.

As explained, the algorithm is based entirely on matching the attribute values assigned to the inputs against those assigned to the HTML DOM elements of the reference corpus.

Data Visualisation charts

Different tools are used to develop the Data Visualisation charts:

Treemap - Google Charts.
Sets - ZingChart.
Graph - amCharts.
Play with data - amCharts.

All the charts are JavaScript charts based on a JSON data structure. Each required JSON has specific structural features depending on the type of Data Visualisation chart. Since the available data are stored in an RDF/XML data structure, it is fundamental to correctly select the data necessary to the visualisation and convert them from RDF/XML into JSON format.
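As an illustration of the conversion target, the sketch below builds the kind of row structure a Google Charts treemap expects: a header row followed by rows of node, parent and size, with a single root whose parent is empty. The list of mentions is invented for the example, and the function name treemap_rows and the column labels are assumptions; the actual conversion used by the application is not shown in this documentation.

```python
import json
from collections import Counter

# Hypothetical list of person mentions, one entry per mention found in the dataset.
mentions = ["Thomas Mann", "Marco Tullio Cicerone", "Thomas Mann"]


def treemap_rows(mentions, root="Persons"):
    """Build rows of [node, parent, size] under a single root node."""
    counts = Counter(mentions)
    # Header row, then the root node, whose parent is left empty (None).
    rows = [["Name", "Parent", "Mentions"], [root, None, 0]]
    for name, count in sorted(counts.items()):
        rows.append([name, root, count])
    return rows


rows = treemap_rows(mentions)
# None is serialised as JSON null, which marks the root row.
payload = json.dumps(rows, ensure_ascii=False)
```

The other chart libraries expect different shapes, so a separate conversion of this kind would be needed for each chart type.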

To check the required JSON data structure features of each Data Visualisation chart, you are invited to consult the official documentation of each framework reported below:

Treemap - Google Charts: https://developers.google.com/chart/interactive/docs/gallery/treemap.
Sets - ZingChart: https://www.zingchart.com/docs/chart-types/bubble-pack.
Graph - amCharts: https://www.amcharts.com/docs/v4/chart-types/force-directed/.
Play with data - amCharts: https://www.amcharts.com/docs/v4/chart-types/force-directed/.

The Data Visualisation charts can be explored in the following sections of the web application:

Index of Persons

Treemap mode section.
Sets mode section.
Graph mode section.
Play with data mode section.

Index of Works

Treemap mode section.
Sets mode section.
Play with data mode section.


/DH.arc - Digital Humanities Advanced Research Centre