Jorge Morato visits the Web and Media Group

From Jorge Morato:

I was awarded a short Erasmus stay at the Web & Media group in September 2014. The goal of these stays is to create links and shared research projects between institutions, to acquire knowledge and good practices, and to motivate students and staff to take part in Erasmus mobility. As a result, a new Erasmus agreement has been established. This agreement will strengthen the links between our institutions, allowing Erasmus students and researchers to visit each other's country.

One of the main benefits of these visits is learning good practices for the coordination and organization of research groups. Although every research group shares a commitment to solving a research problem, the way each one pursues this goal always shows interesting differences. In this case, the host group is much larger than mine, and includes other universities and research centers in the Netherlands. People were extremely welcoming, and lab meetings and other joint activities boosted the interest in each other’s research.

The Web & Media group has a long and relevant research record in topics such as crowdsourcing, cultural heritage, semantic indexing, and the semantic web. We have worked on many of these topics in the past, and I am confident that this is a good basis for future collaborations.

During this stay, I learnt about this group's approach to annotating and managing cultural heritage collections and historiography. Some of the topics that were especially interesting included the way to identify specialists for annotating pictures at the Rijksmuseum, the approach to measuring crowd truth, the new developments in SKOS, and the Data2Semantics platform for publishing scientific data in a meaningful way.

Regarding some of these projects, I think that there are interesting aspects to study in the future, for example applying pooling or positioning algorithms to crowdsourcing results. There are always problems due to misbehavior, however, so it is necessary to detect spammers, careless answers, and bots. There is also a need to study the fact that a worker could perform poorly in one particular topic. Another interesting aspect is to apply a lexical analysis to natural language sentences in order to assess how difficult it is to understand and extract information from each sentence. Some clues to this difficulty could be the frequency of negations, pronouns, locutions, quantifiers, verbs in the conditional tense, a legibility index, or specialized domain terms.
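
As a rough illustration of the lexical clues listed above, the following Python sketch computes the relative frequency of a few of them for a single English sentence. It is only a sketch under stated assumptions: the word lists and the function name `difficulty_clues` are hypothetical and chosen for illustration, they are not taken from any project of the group, and a real study would rely on curated lexicons or a part-of-speech tagger.

```python
import re

# Hypothetical, deliberately small English word lists (assumptions for
# illustration only, not a validated lexicon).
NEGATIONS = {"not", "no", "never", "neither", "nor", "none", "without"}
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them",
            "this", "that", "these", "those"}
QUANTIFIERS = {"all", "some", "many", "few", "most", "several", "every", "any"}
CONDITIONALS = {"would", "could", "should", "might"}


def difficulty_clues(sentence):
    """Return the share of tokens that fall into each clue category."""
    # Lowercase word tokens; contractions such as "didn't" stay in one piece.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", sentence.lower())
    names = ("negations", "pronouns", "quantifiers", "conditionals")
    if not tokens:
        return {name: 0.0 for name in names}

    def rate(matches):
        # Fraction of tokens for which the predicate holds.
        return sum(1 for t in tokens if matches(t)) / len(tokens)

    return {
        "negations": rate(lambda t: t in NEGATIONS or t.endswith("n't")),
        "pronouns": rate(lambda t: t in PRONOUNS),
        "quantifiers": rate(lambda t: t in QUANTIFIERS),
        "conditionals": rate(lambda t: t in CONDITIONALS),
    }


clues = difficulty_clues(
    "The patient did not respond, and he would never take any drug."
)
```

Higher rates would suggest, under these assumptions, a sentence that is harder for crowd workers to interpret. How such rates should be weighted or combined is left open here.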

Almahisto is one of our projects that I think shares goals with some projects of the group. Its goal is to build a repository of historiographical documents annotated with linked data. Many of the lessons learned during my stay could be applied in its continuation, which will deal with detecting points of view and trustworthiness in RDF statements extracted from the texts.

