Digital citizen empowerment a sytematic literature review fusionado.pdf

Vista previa de texto
74
Ivan Bedini, Feroz Farazi, David Leoni, Juan Pane, Ivan Tankoyeu, Stefano Leucci
Figure 3: Dataset Selection step of ODR pipeline
Figure 4: An example of a simplified schema matching
Once dataset schema has been determined, during attribute value validation step the user can
adapt the dataset to the schema, exploiting OpenRefine data cleansing capabilities.
Successive attribute value disambiguation step employs Natural Language Processing
techniques for enriching dataset content by linking names to known entities (such as Dante
Alighieri, Florence) and words to concepts (such as male, city). In Figure 5 we see a screenshot of
long text that has been automatically enriched. OpenDataRise will show in red elements that still
require manual intervention from the user.
Within entity alignment step the framework considers rows in the dataset as entities, i.e. real
instances. The goal of this step is to schedule changes to entity storage to be committed in the
next step. Such changes can be either update of existing matching entities or creation of new
entities with values from the source dataset.
CC: Creative Commons License, 2014.
