Amsterdam Data Science – Coffee & Data: Controversy in Web Data

On the 9th of June we are organising a Coffee & Data event with the Amsterdam Data Science community. The topic is “How to deal with controversy, bias, quality and opinions on the Web”, organised in the context of the COMMIT ControCurator project. In this project, computer scientists and humanities researchers from VU and UvA jointly investigate the computational modeling of controversial issues on the Web, and explore its application within real use cases in existing organisational pipelines, e.g. at Crowdynews and the Netherlands Institute for Sound and Vision.

The Agenda is as follows:

09:00 – 09:10: Coffee

Introduction & Chair by Lora Aroyo, Full Professor at the Web & Media group (VU, Computer Science)

09:10 – 09:25: Gerben van Eerten – Crowdynews deploying ControCurator

09:25 – 09:40: Kaspar Beelen – Detecting Controversies in Online News Media (UvA, Faculty of Humanities)

09:40 – 09:50: Benjamin Timmermans – Understanding Controversy Using Collective Intelligence (VU, Computer Science)

09:50 – 10:00: Davide Ceolin – (VU, Computer Science)

10:00 – 10:15: Damian Trilling – (UvA, Faculty of Social and Behavioural Sciences)

10:15 – 10:30: Daan Oodijk (Blendle)

10:30 – 10:45: Andy Tanenbaum – “Unskewed polls” in 2012

10:45 – 11:00: Q&A and coffee

The event takes place at the Kerkzaal (HG-16A00) on the top floor of the VU Amsterdam main building.

Posted in CrowdTruth, Projects

VU’s 4th ICT4D symposium: a look back

Yesterday, 18 May 2017, the 4th International ICT4D symposium was held at Vrije Universiteit Amsterdam. The event was organized by the W4RA team and supported by the VU Network Institute, the Netherlands Research School for Information and Knowledge Systems (SIKS), the VU Computer Science Department and the VU International Office. Invited speakers from Ghana, France and the Netherlands highlighted this year’s theme: “Sustainability and ICT4D”.

Keynote speaker Gayo Diallo from the Université de Bordeaux discussed the possibilities of ICT for African Traditional Medicine (ATM). In his talk, he showed how semantic web technologies can play a role in connecting heterogeneous datasets for analytics and end-user services. Such services would need to be based on voice interaction and localized technologies. His slides can be found here.

Chris van Aart from 2Coolmonkeys discussed a number of smartphone applications developed in the context of W4RA activities, including Mr. Jiri, a tree-counting application. He made the case that there is a market for such applications in the African context (Slides).

After the break, Francis Dittoh from UDS Ghana discussed issues around sustainability for a meteo application he is currently developing for Northern Ghana (slides). Wendelien Tuijp from VU’s CIS then presented multiple perspectives on ICT4D (Slides). The symposium was closed by a video presentation from Aske Robenhagen, showcasing the ongoing work in Nepal around mapping knowledge networks and developing a smartphone application supporting information exchange for local accountability extension workers. More information on that project can be found at

The presentations of the day can be found through the links above. The entire symposium was live-streamed and you can watch it all on YouTube or below.

Below is a list of the approximate starting times of the various speakers in the video:

  • 6m19 Dr. Gayo Diallo – Université de Bordeaux (FR): Towards a Digital African Traditional Healthcare using Semantic Web.
  • 56m28 Dr. Chris van Aart – 2CoolMonkeys BV (NL): Developing Smartphone Apps for African farmers.
  • 1h30m00 Break.
  • 1h52m00 Francis Dittoh – University for Development Studies (Ghana): ICT business development in rural Africa.
  • 2h23m00 Wendelien Tuijp – CIS-VU: Sustainable Community Initiatives and African Farmer Innovation.
  • 2h52m00 Aske Robenhagen – Network Institute Academy Assistant (VU): Building resilient applications for sustainable development. A better video of this can be found at


Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

Big Data Europe Platform paper at ICWE 2017

With the launch of the Big Data Europe platform behind us, we are telling the world about our nice platform and the many pilots in the societal challenge domains that we have executed and evaluated. We wrote everything down in one comprehensive paper, which was accepted at the 17th International Conference on Web Engineering (ICWE 2017), to be held in Rome next month.

High-level BDE architecture (copied from the paper by Auer et al.)

The paper “The BigDataEurope Platform – Supporting the Variety Dimension of Big Data” is co-written by a very large team (see below) and presents the BDE platform: an easy-to-deploy, easy-to-use and adaptable (cluster-based and standalone) platform for the execution of big data components and tools such as Hadoop, Spark, Flink, Flume and Cassandra. To facilitate the processing of heterogeneous data, a particular innovation of the platform is the Semantic Layer, which allows RDF data to be processed directly and arbitrary data to be mapped and transformed into RDF. The platform is based upon requirements gathered from seven of the societal challenges put forward by the European Commission in the Horizon 2020 programme and targeted by the BigDataEurope pilots, and it is validated through pilot applications in each of these seven domains. A draft version of the paper can be found here.
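To make the Semantic Layer idea concrete, here is a minimal sketch of the kind of mapping such a layer performs: turning a tabular record into RDF triples (serialised as N-Triples). The URIs and field names below are invented for illustration; the platform's actual mapping components and vocabularies differ.

```python
# Sketch: map a tabular record (a dict of column -> value) to N-Triples.
# All URIs below are invented example namespaces, not BDE vocabularies.

def row_to_ntriples(row, subject_base, vocab_base):
    """Map a dict of column name -> value to a list of N-Triples lines,
    using the 'id' column to mint the subject URI."""
    subject = f"<{subject_base}{row['id']}>"
    lines = []
    for column, value in row.items():
        if column == "id":
            continue  # the id is encoded in the subject URI
        predicate = f"<{vocab_base}{column}>"
        lines.append(f'{subject} {predicate} "{value}" .')
    return lines

record = {"id": "42", "station": "De Bilt", "temperature": "17.3"}
triples = row_to_ntriples(record,
                          "http://example.org/obs/",
                          "http://example.org/vocab/")
for t in triples:
    print(t)
```

Once data from heterogeneous sources is expressed as RDF in this way, it can be queried uniformly, which is what makes the variety dimension tractable.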


The full reference is:

Sören Auer, Simon Scerri, Aad Versteden, Erika Pauwels, Angelos Charalambidis, Stasinos Konstantopoulos, Jens Lehmann, Hajira Jabeen, Ivan Ermilov, Gezim Sejdiu, Andreas Ikonomopoulos, Spyros Andronopoulos, Mandy Vlachogiannis, Charalambos Pappas, Athanasios Davettas, Iraklis A. Klampanos, Efstathios Grigoropoulos, Vangelis Karkaletsis, Victor de Boer, Ronald Siebes, Mohamed Nadjib Mami, Sergio Albani, Michele Lazzarini, Paulo Nunes, Emanuele Angiuli, Nikiforos Pittaras, George Giannakopoulos, Giorgos Argyriou, George Stamoulis, George Papadakis, Manolis Koubarakis, Pythagoras Karampiperis, Axel-Cyrille Ngonga Ngomo, Maria-Esther Vidal. “The BigDataEurope Platform – Supporting the Variety Dimension of Big Data”. In: Proceedings of the International Conference on Web Engineering (ICWE 2017), LNCS, Springer, 2017.



Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

Trip report: Museums and the Web Conference 2017 (MW17)

Between 19 and 22 April 2017, the MW (Museums and the Web) conference took place in Cleveland, Ohio, USA. I was there to give a presentation about DigiBird, a valorization project supported by the Dutch national program COMMIT/. The project ran for six months and its results were summarized in this paper (see proposal). We were given a slot in the panel session titled “How Can We Connect Online Audiences With Online Collections?”. The presentations and discussions in this panel focused on strategies for linking and engaging online users with online collections, but also on ways of connecting online collections to each other. The presentation of our project can be found below.

Next, I will tell you more about the conference itself and my experience there as an attendee.

The MW conference series has been taking place since 1997, and its history can be traced through more than 1,000 papers from the past 20 years, all accessible online. The conference takes place every year in North America and Asia and mostly gathers professionals from the cultural heritage domain. But its attendees also include, as the MW organizers put it: “webmasters, educators, curators, librarians, designers, managers, directors, scholars, consultants, programmers, analysts, publishers and developers from museums, galleries, libraries, science centers, and archives – as well as the companies, foundations and governments that support them”. Thus, the people attending the conference come from very diverse backgrounds and share a common interest in the cultural heritage domain, be it from an artistic, a cultural or a technological point of view.

This diversity of the audience is also reflected in the organization of the conference itself. As shown in this year’s program, the conference hosted panel sessions for presentations of formal papers, professional forums, how-to sessions, Lightning Talks (Pecha Kucha-style), “Birds of a Feather” round-tables and an exhibition. Also during this conference, the GLAMi (formerly Best of the Web) awards were given to the organizations and projects with the most innovative contributions to the cultural heritage domain.

You can read the MW17 notes that I took during some of the presentations that I attended.

The MW17 conference was a great experience for me and I was happy to represent our DigiBird project there. As most of the presentations described US-related projects, ours brought a nice nuance of orange.

In the end, I would like to thank all the people that contributed to making this presentation and project happen, including Chris Dijkshoorn, Maarten Brinkerink, Sander Pieterse and, last but not least, Lora Aroyo.



Posted in Conferences, Papers, Trip Reports

DANS Linked Data in Research and Cultural Heritage Seminar

On the 1st of May 2017, the Linked Data in Research and Cultural Heritage Seminar (#DANSLOD) took place in Den Haag, the Netherlands. The seminar was organized by DANS (Data Archiving and Networked Services), the Netherlands Institute for Permanent Access to Digital Research Resources, and arranged by Herbert van de Sompel. Its main focus was to present new advancements related to Linked Data, with special emphasis on creating, keeping and using Linked Data in the context of a distributed and decentralized Web. According to the schedule, the seminar was organized into two sessions, each followed by a panel discussion.

The first session of presentations considered different aspects of Linked Data in the context of a distributed and decentralized Web. Ruben Verborgh from Ghent University underlined the pressing need for proprietors of Linked Data to take extra care when working with their data collections, so that the isolated silos of Linked Data can be easily linked together in a knowledge graph. Afterwards, Tobias Kuhn from our group introduced nanopublications as a way of publishing data on the Web in a decentralized manner, a solution that would allow research results to be both replicated and re-used. You can find the slides of this talk here:

Next, Sarven Capadisli from Bonn University argued that creating a Linked Data ecosystem would help connect contributors, libraries, institutions, publishers, researchers and others, and would also help in the social paradigm shift around Linked Data, which lags considerably behind the technical aspects of this technology. Lastly, Michel Dumontier of Maastricht University introduced the FAIR (Findable, Accessible, Interoperable, Reusable) principles as a generic metric for evaluating the quality of data repositories and collections.

The second session of presentations showed how Linked Data principles are applied by certain institutions when dealing with their data, and how certain applications can help incorporate these principles into the normal data workflow. Miel Vander Sande from Ghent University showed how problems like reproducibility and sustainability can encumber breaking down the barriers between data silos in the world of Linked Data. As solutions he proposed a combination of querying Linked Data Fragments through the Triple Pattern Fragments interface and the Memento “Time Travel for the Web” protocol. Valentine Charles and Nuno Freire from Europeana presented their new approaches to future data acquisition; among the technologies that their organization has experimented with, and that they want to include in the Europeana data acquisition workflow, are IIIF (International Image Interoperability Framework) and Sitemaps. Next, Enno Meijers from the KB (Koninklijke Bibliotheek), the National Library of the Netherlands, introduced the roadmap, strategy and design for creating a distributed web of cultural heritage information within DEN (Digitaal Erfgoed Nederland – Digital Heritage Netherlands). Lastly, Albert Meroño Peñuela of Vrije Universiteit Amsterdam demonstrated how GitHub can be used as a hub for (repeatable) SPARQL queries by using a tool called grlc.

The discussions that followed each session tried to address the current pressing issues in the world of Linked Data, with the presenters answering questions and raising themes for debate. Overall, this seminar was a good way to spend a day for anyone interested in Linked Data. Both the presentations and the discussions touched upon pressing, current issues in the field and left room for other open and unresolved questions. Linking data is not so simple after all!



Posted in Events, Workshops

INVENiT² project presentation for the “Digital Humanities” course

On Tuesday, the 11th of April 2017, Cristina-Iulia Bucur, one of the previous academy assistants for INVENiT², gave a presentation about the project during the “Digital Humanities” course at the Vrije Universiteit Amsterdam. The presentation was titled “INVENiT II – New ways of opening up cultural religious heritage” and briefly described the project team’s experience throughout the period in which the project ran.

The presentation focused on why it is important to link data to research as a way to better support scholars and researchers in the humanities, with emphasis on the cultural religious heritage at the UBVU, the University Library of Vrije Universiteit Amsterdam. The workflow and the steps taken during the project were briefly described. First, two 18th-century illustrated bibles from the “Special Collections” of the UBVU were digitized; then Linked Data was used as a framework to publish this (meta)data and to link individual prints to bibles. Next, various crowd- and nichesourcing events were organized to further annotate and enrich the data about these prints. Finally, the new information was incorporated into the UBVU system. This way, the biblical prints have been enriched with new information and can better support the research of scholars.

The presentation slides can be found below.



Posted in INVENiT, Projects

Dutch Ships and Sailors SPARQL hands-on exercises

I made these exercises a while ago but keep re-using them for SPARQL tutorials and hands-on sessions. The hands-on page lists a number of SPARQL queries that one can copy-paste into the interactive query field of the Dutch Ships and Sailors live triple store. Note that not all of them still work as originally intended, as that triple store is constantly changing.
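For readers who prefer to script their queries rather than use the interactive field, a small sketch of how one might send a SPARQL query to such an endpoint follows. The endpoint URL and the class and property URIs here are placeholders, not the actual Dutch Ships and Sailors address or vocabulary; consult the hands-on page for the real ones.

```python
import urllib.parse
import urllib.request

# Placeholder endpoint; substitute the real triple store's SPARQL URL.
ENDPOINT = "http://example.org/dss/sparql"

# A simple query in the spirit of the hands-on exercises: list some ships
# and their names. The URIs are invented, not the DSS schema.
QUERY = """
SELECT ?ship ?name WHERE {
  ?ship a <http://example.org/dss/Ship> ;
        <http://example.org/dss/name> ?name .
} LIMIT 10
"""

def run_query(endpoint, query):
    """POST a SPARQL query and return the raw JSON result body."""
    data = urllib.parse.urlencode({"query": query}).encode()
    req = urllib.request.Request(
        endpoint, data=data,
        headers={"Accept": "application/sparql-results+json"})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()

# Not executed here, since it would hit a live server:
# results = run_query(ENDPOINT, QUERY)
```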

Have a go yourself.


Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

[Reading club] Dance in the World of Data and Objects

This is the first post in a new series on our Semantic Web reading club. During this weekly reading club we discuss a research paper related to the Semantic Web, Human Computation or Computer Science in general. Every week, one group member selects and prepares a paper to discuss. This week it was my turn, and I chose a paper from 2013: “Dance in the World of Data and Objects” by Katerina El Raheb and Yannis Ioannidis (full citation and abstract below). The paper presents the need for (OWL) ontologies for dance representation. A quite nice slide deck supporting the paper can be found here.

‘Dance’. CC-By (Teresa Alexander-Arab) 

Computer-interpretable knowledge representation for dance is something I have been thinking about for a while now. I am mostly interested in representations that actually match the conceptual level at which dancers and choreographers communicate, and in how these relate to low-level representations such as Labanotation. I am currently supervising two MSc students on this topic.

The paper by El Raheb and Ioannidis, and our discussion afterwards, outlined the potential uses of such formal representations for:

  1. Archiving dance and retrieval. This is a more ‘traditional’ use of such representations in ICT for Cultural Heritage. An interesting effect of having dance represented in standard semantic web languages is that we can connect deep representations of choreographies to highly heterogeneous knowledge about, for example, dance or musical styles, locations, recordings, emotions etc. An interesting direct connection could be to Albert Merono’s RDF midi representations.
  2. For dance analysis. By having large amounts of data in this representation, we can support Digital Humanities research, both in more distant reading and potentially also in closer analysis of dance. Machine learning techniques could be of use here.
  3. For creative support. A potentially very interesting direction is to investigate to what extent representations of dance can be used to support the creative process of dancers and choreographers. We can think of pattern-based adaptations of choreographies.
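As a toy illustration of what a machine-readable dance representation enables, consider dance moves expressed as subject-predicate-object triples. The move identifiers and the bodyPart/direction/level properties below are invented, only loosely inspired by Labanotation concepts, and much simpler than the paper's DanceOWL ontology; the point is that semantic search over moves becomes a simple query.

```python
# Toy triple representation of dance moves, loosely inspired by
# Labanotation concepts (body part, direction, level). All names are
# invented for illustration; DanceOWL itself is far richer.

moves = [
    ("move1", "bodyPart", "rightArm"),
    ("move1", "direction", "forward"),
    ("move1", "level", "high"),
    ("move2", "bodyPart", "leftLeg"),
    ("move2", "direction", "side"),
    ("move2", "level", "low"),
]

def query(triples, predicate, obj):
    """Return subjects matching a predicate/object pair -- the kind of
    semantic search a dance archive could support."""
    return [s for s, p, o in triples if p == predicate and o == obj]

print(query(moves, "level", "high"))  # moves performed at a high level
```

With real OWL semantics on top (e.g. a class hierarchy of body parts), such queries could also retrieve moves via more abstract concepts, which is what supports the archiving and analysis uses listed above.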

Abstract: In this paper, we discuss the challenges that we have faced and the solutions we have identified so far in our currently on-going effort to design and develop a Dance Information System for archiving traditional dance, one of the most significant realms of intangible cultural heritage. Our approach is based on Description Logics and aims at representing dance moves in a way that is both machine readable and human understandable to support semantic search and movement analysis. For this purpose, we are inspired by similar efforts on other cultural heritage artifacts and propose to use an ontology on dance moves (DanceOWL) that is based on the Labanotation concepts. We are thus able to represent dance movement as a synthesis of structures and sequences at different levels of conceptual abstraction, which serve the needs of different potential users, e.g., dance analysts, cultural anthropologists. We explain the rationale of this methodology, taking into account the state of the art and comparing it with similar efforts that are also in progress, outlining the similarities and differences in our respective objectives and perspectives. Finally, we describe the status of our effort and discuss the steps we intend to take next as we proceed towards the original goal.


Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

Automatic interpretation of spreadsheets

When humans read a spreadsheet table, they look at both the table design and the text within the table. They interpret the table layout and use background knowledge to understand the meaning and context of the data in the spreadsheet. In our research we teach computers to do the same. We describe our method in the paper “Combining information on structure and content to automatically annotate natural science spreadsheets”.
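As a rough illustration of the general idea (a deliberately naive sketch, not the method from the paper), a computer can combine a layout cue (a row whose cells are non-numeric is likely a header) with a content cue (matching header text against background knowledge). The background vocabulary below is invented.

```python
# Sketch: combine a structural cue (header detection) with a content cue
# (background knowledge lookup) to annotate spreadsheet columns.
# The background vocabulary is invented for illustration.

BACKGROUND = {"depth": "soil depth measurement", "ph": "acidity (pH)"}

def is_numeric(cell):
    try:
        float(cell)
        return True
    except ValueError:
        return False

def annotate(table):
    """Return {header cell: meaning} for headers found in BACKGROUND."""
    # Layout cue: treat the first row as a header only if no cell is numeric.
    header = table[0] if not any(is_numeric(c) for c in table[0]) else None
    annotations = {}
    if header:
        for cell in header:
            # Content cue: look the header text up in background knowledge.
            meaning = BACKGROUND.get(cell.strip().lower())
            if meaning:
                annotations[cell] = meaning
    return annotations

table = [["Depth", "pH"], ["10", "6.5"], ["20", "6.8"]]
print(annotate(table))  # {'Depth': 'soil depth measurement', 'pH': 'acidity (pH)'}
```

The actual method in the paper is of course considerably more sophisticated than this two-rule heuristic.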

Read more ›

Posted in Papers, Uncategorized

DIVE+ Submitted to LODLAM

Here’s the submission to the annual LODLAM challenge from the DIVE+ team. In this video, we introduce the ideas behind DIVE+ and take you for an exploratory swim in the linked media knowledge graph!


Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer