Share, repeat and verify Scientific Experiments with Software Containers

[This post is by Rogier Mars about his Master project]

During my years at the VU as an Information Sciences student, I was often asked to form a project group and work on some kind of problem. Most often, I was the one to implement the technical part once the team had agreed on a solution. I always enjoyed this, mainly because of the wide variety in the work. A small selection includes an information visualization of crime rates in the Netherlands; an analysis of the influence of weather on Dutch public transportation; using a cognitive system to enhance the performance of a tourist chat bot; programming AI to compete in games against other student groups; and developing smart home technologies to aid in elderly care.

All of these prototypes and experiments consisted partly of software development and partly of the clever use of existing information technology to make life easier or to come to new insights and ideas. As you can imagine, during such a project you are totally sucked into it. There is a deadline to reach and a presentation to prepare, and the pressure to finish the project successfully is high. This pressure often results in sloppier working methods: for example, I did not include any documentation at all. I must confess: it would take me quite some time to dig into any of these projects again. Even if I could find the code and data, it is highly unlikely that I could get it working without some serious troubleshooting.

As it turns out, I’m not the only researcher having these problems. Even for papers at recent ACM conferences and journals that are backed by code and data, other researchers could not successfully repeat the experiments in fifty percent of the cases. This makes replication studies less efficient, which could result in lower-quality research. This had me thinking: if I could go back in time, could I do it better now? How would I do that and how hard is it? Could other researchers do this as well?

The software container platform Docker emerged in 2013 and is widely used by businesses throughout the world for web application hosting. With Docker you can easily create software containers that are portable to any other operating system that runs Docker, currently Windows, Linux and macOS. The literature describes how Docker can be used efficiently for research:

“By encapsulating the computational environment of scientific experiments into software containers, you bypass many dependency issues and the need for precise documentation.”
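To make the quoted idea a bit more concrete, here is a minimal sketch in Python that generates the text of a Dockerfile for a single experiment. It is not taken from the thesis: the base image `python:3.11-slim`, the file names `requirements.txt` and `run_experiment.py`, and the helper `render_dockerfile` are all assumptions chosen for illustration.

```python
def render_dockerfile(base_image: str, requirements: str, entrypoint: str) -> str:
    """Return Dockerfile text that captures the environment of one experiment.

    All arguments are hypothetical placeholders for illustration.
    """
    lines = [
        f"FROM {base_image}",                                  # fixed base image
        "WORKDIR /experiment",                                 # working directory inside the container
        f"COPY {requirements} .",                              # copy the dependency list first
        f"RUN pip install --no-cache-dir -r {requirements}",   # install the dependencies
        "COPY . .",                                            # copy code and data
        f'CMD ["python", "{entrypoint}"]',                     # command that repeats the experiment
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile("python:3.11-slim", "requirements.txt", "run_experiment.py"))
```

Saved as a file named `Dockerfile` next to the code, this text could then be built with `docker build -t my-experiment .` and run with `docker run --rm my-experiment` (the image name `my-experiment` is again just an example). The point is that the creator of the experiment writes the environment down once, so the replicator does not have to reconstruct it.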

My master project included an implementation of Docker on several scientific experiments as a means of increasing repeatability, and I evaluated this method with students and researchers in and around the Amsterdam area. In a controlled experiment, I created a scenario in which researchers worked with Docker on an example project on their own personal computer. Afterwards I evaluated this method using existing scales and measures in questionnaires. I compared this method with the traditional approach, with participants equally divided over both methods. The focus lay on usability, perceived usefulness and perceived ease of use. How the Docker method worked exactly was harder for participants to grasp than how the existing method worked, but overall they deemed it more useful for repeating and verifying scientific experiments. The method was not perceived as more usable, but it was definitely more reliable. There was still a difference in perceived usefulness between new and existing users: it appears that if you understand how Docker works, you perceive it as more useful, both in general and for research.

If I could do it again, I would use Docker to create a computational environment for my experiments and I encourage other researchers to do the same. The responsibility for successful execution of the code could shift from the replicator to the creator of the experiment. Eventually, this could make replication studies in computational science more fun and less time-consuming.

Posted in Masters Projects

“New life for old media” to be presented at NEM Summit 2017

The extended abstract “Investigations into Speech Synthesis and Deep Learning-based colorization for audiovisual archives” has been accepted for publication at the NEM (New European Media) Summit 2017, to be held in Madrid at the end of November. This paper is based on Rudy Marsman’s thesis “Speech technology and colorization for audiovisual archives” and describes his research on using AI technologies in the context of the Netherlands Institute for Sound and Vision. Specifically, Rudy experimented with developing speech synthesis software based on a library of narrated news videos (using the voice of the late Philip Bloemendal) and with the use of pre-trained deep learning colorization networks to colorize archival videos.

NEM (cc-by circle ©heese https://www.flickr.com/photos/gratisdbth/7805513264)

You can read more in the draft paper [PDF]:

Rudy Marsman, Victor de Boer, Themistoklis Karavellas, Johan Oomen New life for old media: Investigations into Speech Synthesis and Deep Learning-based colorization for audiovisual archives. Extended Abstract proceedings of NEM summit 2017

Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

Lisbon Machine Learning Summer School 2017 – Trip Report

In the second half of July (20th of July – 27th of July) I attended the Lisbon Machine Learning Summer School (LxMLS2017). As every year, the summer school is held in Lisbon, Portugal, at Instituto Superior Técnico (IST). The summer school is organized jointly by IST, the Instituto de Telecomunicações, the Instituto de Engenharia de Sistemas e Computadores, Investigação e Desenvolvimento em Lisboa (INESC-ID), Unbabel, and Priberam Labs.

Around 170 students (mostly PhD students but also master students) attended the summer school. It’s important to mention that around 40% of the applicants are accepted, so make sure you have a strong motivation letter! For eight days we learned about machine learning with focus on natural language processing. The day was divided into 3 parts: lectures in the morning, labs in the afternoon and practical talks in the evening (yes, quite a busy schedule).

Morning Lectures

In general, the morning lectures and the labs mapped really well: first learn the notions and then put them into practice. During the labs we worked with Python and IPython Notebooks. Most of the labs had the base code already implemented and we just had to fill in some functions. However, for some of the lectures/labs this wasn’t that easy. I’m not going to discuss the morning lectures in detail, but I’ll mention the speakers and their topics (also, the slides are available on the website of the summer school):

  • Mario Figueiredo: an introduction to probability theory which proved to be fundamental for understanding the following lectures.
  • Stefan Riezler: an introduction to linear learners using an analogy with the perceptual system of a frog, i.e., given that the goal of a frog is to capture any object of the size of an insect or worm providing it moves like one, can we build a model of this perceptual system and learn to capture the right objects?
  • Noah Smith: gave an introduction to sequence models such as Markov models and Hidden Markov models and presented the Viterbi algorithm, which is used to find the most likely sequence of hidden states.
  • Xavier Carreras: talked about structured predictors (i.e., given training data, learn a predictor that performs well on unseen inputs) using a named entity recognition task as a running example. He also discussed Conditional Random Fields (CRF), an approach that gives good results in such tasks.
  • Yoav Goldberg: talked about syntax and parsing by providing many examples of using them in sentiment analysis, machine translation and many other examples. Compared to the rest of the lectures, this one had much less math and was easy to follow!
  • Bhiksha Raj: gave an introduction to neural networks, more exactly convolutional neural networks (CNN) and recurrent neural networks (RNN). He started with the early models of human cognition, associationism (i.e., humans learn through association) and connectionism (i.e., the information is in the connections and the human brain is a connectionist machine).
  • Chris Dyer: discussed modeling sequential data with recurrent networks (but not only). He showed many examples related to language models, long short-term memories (LSTMs) and conditional language models, among others. However, even if it’s easy to think of tasks that could be solved by conditional language models, most of the time the data does not exist, a problem that seems to appear in many fields.
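As a small illustration of the Viterbi algorithm mentioned in Noah Smith's lecture, here is a minimal sketch in plain Python. It is not the lab code from the summer school; the two-state model (`Healthy`/`Fever`) and all probabilities are a commonly used toy example and are assumptions made purely for illustration. The function expects a non-empty observation sequence.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely hidden state sequence, its probability) for an HMM."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # pick the predecessor that maximises the path probability into s
            prob, prev = max((best[-1][p][0] * trans_p[p][s], p) for p in states)
            column[s] = (prob * emit_p[s][obs], prev)
        best.append(column)

    # backtrack from the most probable final state
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, prob


if __name__ == "__main__":
    # Toy model: all numbers below are illustrative assumptions.
    states = ["Healthy", "Fever"]
    start_p = {"Healthy": 0.6, "Fever": 0.4}
    trans_p = {
        "Healthy": {"Healthy": 0.7, "Fever": 0.3},
        "Fever": {"Healthy": 0.4, "Fever": 0.6},
    }
    emit_p = {
        "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
        "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
    }
    path, prob = viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p)
    print(path, prob)  # ['Healthy', 'Healthy', 'Fever'] with probability about 0.01512
```

For long sequences one would normally work with log-probabilities to avoid numerical underflow; the sketch keeps plain probabilities for readability.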

Practical Talks

In the last part of the day we had practical talks or special talks on concrete applications that are based on the techniques learnt during the morning lectures. During the first day we were invited to attend a panel discussion named “Thinking machines: risks and opportunities” at the conference “Innovation, Society and Technology”, where 6 speakers from the AI field (Fernando Pereira – VP and Engineering Fellow at Google, Luís Sarmento – CTO at Tonic App’s, André Martins – Unbabel Senior researcher, Mário Figueiredo – Instituto de Telecomunicações at IST, José Santos Victor – president of the Institute for Systems and Robotics at IST and Arlindo Oliveira – president of Instituto Superior Técnico) discussed the benefits and risks of artificial intelligence and automatic learning. Here are a couple of thoughts:

  • Fernando Pereira: In order to enable people to make better use of technology, we need to make machines smarter at interacting with us and helping us.
  • André Martins pointed out an interesting problem: people spend time on solving very specific things but these are never generalized. -> but what if this is not possible?
  • Fernando Pereira: we build smart tools but only a limited number of people are able to control them, so we need to build the systems in a smarter way and make the systems responsible to humans.

Another evening hosted the Demo Day, an informal gathering that brings together a number of highly technical companies and research institutions, all with the aim of solving machine learning problems through technology. There were a lot of enthusiastic people to talk to, and many demos and products. I even discovered a new crowdsourcing platform, DefinedCrowd, that might soon start competing with CrowdFlower and Amazon Mechanical Turk.

Here are some other interesting talks that we followed:

  • Fernando Pereira – “Learning and representation in language understanding”: talked about learning language representation using machine learning. However, machine understanding of language is not a solved problem. Learning from labeled data or learning with distant supervision may not yield the desired results, so it’s time to go implicit. He then introduced the work done by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin: Attention Is All You Need. In this paper, the authors claim that you do not need complex CNN or RNN models, but that it’s enough to use attention mechanisms in order to obtain quality machine translation.
  • Graham Neubig – “Simple and Efficient Learning with Dynamic Neural Networks”: dynamic neural network toolkits such as DyNet can be used as alternatives to TensorFlow or Theano. According to Graham, here are some advantages of using such nets: the API is closer to standard Python/C++ and it’s easier to implement nets with varying structure. There are also some disadvantages: it’s harder to optimize graphs (but still possible) and it’s also harder to schedule data transfer.
  • Kyunghyun Cho – “Neural Machine Translation and Beyond”: showed why sentence-level and word-level machine translation is not desired: (1) it’s inefficient at handling the various morphological variants of words, (2) we need good tokenisation for every language (not that easy), (3) it is not able to handle typos or spelling errors. Therefore, character-level translation is what we need, because it’s more robust to errors and handles rare tokens better (which are actually not necessarily rare).

Posted in CrowdTruth, Projects

Trip Report: Dagstuhl Seminar on Citizen Science

A month ago, I had the opportunity to attend the Dagstuhl Seminar Citizen Science: Design and Engagement. Dagstuhl is really a wonderful place. This was my fifth time there. You can get an impression of the atmosphere from the report I wrote about my first trip there. I have primarily been to Dagstuhl for technical topics in the area of data provenance and semantic data management as well as for conversations about open science/research communication.

This seminar was a great chance for me to learn more about citizen science and discuss its intersection with the practice of open science. There was a great group of people there covering the gamut from creators of citizen science platforms to crowd-sourcing researchers.

As usual with Dagstuhl seminars, it’s less about presentations and more about the conversations. There will be a report documenting the outcome and hopefully a paper describing the common thoughts of the participants. Neal Reeves took vast amounts of notes so I’m sure that this will be a good report :-). Here’s a whiteboard we had full of input:

[Image: whiteboard full of input from the seminar]

Thus, instead of trying to relay what we came up with (you’ll have to wait for the report), I’ll just pull out some of my own brief highlights.

Background on Citizen Science

There were a lot of good pointers on where to start understanding current thinking around citizen science. First, two tutorials from the seminar:

What do citizen science projects look like:

Example projects:

How should citizen science be pursued:

And a Book:

Open Science & Citizen Science

Claudia Göbel gave an excellent talk about the overlap of citizen science and open science. First, she gave an important reminder that science, in particular in the 1700s, was done as public demonstrations, walking us through the example painting below.

She then looked at the overlap between citizen science and open science. Summarized below:

[Image: overlap between citizen science and open science]

A follow-on discussion with some of the seminar participants led to input for a whitepaper that is being developed through the ECSA on Citizen & Open Science for Europe. Check out the preliminary draft. I look forward to seeing the outcome.

Questioning Assumptions

One thing that I left the seminar thinking about was the need to question my own (and my field’s) assumptions. This was really inspired by talking to Chris Welty and reflecting on his work with Lora Aroyo on the issues in human annotation and the construction of gold sets. Some assumptions to question:

  • What qualifications you need to have to be considered a scientist.
  • Interoperability is a good thing to pursue.
  • Openness is a worthy pursuit.
  • We can safely assume a lack of dynamics in computational systems.
  • That human performance is good performance.

Indeed, Marissa Ponti pointed to the example below and highlighted some of the potential ramifications that each of these citizen science projects (which at first blush seem positive) could lead to.

That being said, the ability to rapidly engage more people in the science system seems to be a good thing indeed. An assumption I’m happy to hold.

Random

Filed under: trip report Tagged: citizen science, dagstuhl, open science
Source: Think Links

Posted in Paul Groth, Staff Blogs

Identifying emotions in email with human-level accuracy

As part of the Master’s degree Business Analytics at the VU Amsterdam, Erwin Huijzer completed his master thesis at Anchormen:
“Identifying effective affective email responses; Predicting customer affect after email conversation”

When customers contact a company with regards to queries and complaints, often they prefer to use email. Handling these emails is a massive task for the Customer Support department. Automating email handling can help improve , reduce costs and shorten response time. However, awareness of customer emotion during the conversation is an important aspect in effective email handling.

In the thesis, sentiment analysis was used on incoming customer emails to determine the initial emotion of a customer. Furthermore, affect analysis was applied to predict the customer’s emotion after the response email from Customer Support. Both analyses were executed using supervised machine learning which trains computer models based on labelled data. This required manual labelling of a set of emails with sentiment (None, Neg, Pos, Mix) and emotions (Anger, Disgust, Fear, Joy, Sadness).

Manual labelling revealed that humans find it very difficult to determine emotions in email. Still, using majority vote, a reliable label set could be determined. Applying machine learning (a voting ensemble of Random Forest and Neural Net) on the labelled data resulted in human-level accuracy for Anger and Joy. For Disgust, the model even significantly outperforms human annotation. Using the same voting ensemble and including SVM leads to human-level performance on Sentiment too. For both sentiment and emotions, the domain-specific models trained on a small set of emails (742) outperform a commercial model that was trained on millions of news sources.
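The majority-vote step for combining annotator labels can be sketched in a few lines of Python. This is only an illustration and not the code from the thesis: the function name `majority_label` is hypothetical, and the choice to require a strict majority (more than half of the votes) and to return `None` otherwise is an assumption, since the post does not say how ties or disagreements were handled.

```python
from collections import Counter


def majority_label(labels):
    """Return the label chosen by more than half of the annotators, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    # strict majority: the winning label needs more than half of all votes
    return label if count > len(labels) / 2 else None


if __name__ == "__main__":
    # Three hypothetical annotators labelling the sentiment of two emails.
    print(majority_label(["Neg", "Neg", "Pos"]))  # Neg
    print(majority_label(["Neg", "Pos", "Mix"]))  # None (no majority)
```

Emails for which no label reaches a majority could be dropped or sent to an extra annotator; which of these is appropriate depends on the labelling budget.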

Using machine learning to predict customer affect showed low performance. Still, results are significantly better than the benchmarks. A more direct measurement of customer affect may, however, drastically improve performance.

The full thesis is available for download here. The presentation is available here.

Posted in Masters Projects

A Concentric-based Approach to Represent Topics in Tweets and News

[This post is based on the BSc. Thesis of Enya Nieland and the BSc. Thesis of Quinten van Langen (Information Science Track)]

The Web is a rich source of information that presents events, facts and their evolution across time. People mainly follow events through news articles or through social media, such as Twitter. The main goal of the two bachelor projects was to see whether topics in news articles or tweets can be represented in a concentric model where the main concepts describing the topic are placed in a “core”, and the less relevant concepts are placed in a “crust”. In order to answer this question, Enya and Quinten built on the research conducted by José Luis Redondo García et al. in the paper “The Concentric Nature of News Semantic Snapshots”.

Enya focused on the tweets dataset and her results show that the approach presented in the aforementioned paper does not work well for tweets. The model had a precision score of only 0.56. After a data inspection, Enya concluded that the large amount of redundant information found in tweets makes it difficult to summarise them and to identify the most relevant concepts. Thus, after applying stemming and lemmatisation techniques, data cleaning and similarity scores together with various relevance thresholds, she improved the precision to 0.97.

Quinten focused on topics published in news articles. When applying the method described in the reference article, Quinten concluded that relevant entities from news articles can indeed be identified. However, his focus was also to identify the most relevant events that are mentioned when talking about a topic. In addition, he calculated a term frequency-inverse document frequency (TF-IDF) score and an event-relation (temporal relations and event-related concepts) score for each topic. These combined scores determine the new relevance score of the entities mentioned in a news article. These changes improved the ranking of the events, but did not improve the ranking of the other concepts, such as places or actors.
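For readers unfamiliar with TF-IDF, here is a minimal sketch in plain Python. It is not the implementation from the thesis: the exact TF-IDF variant used there is not stated, so the sketch assumes relative term frequency multiplied by the natural logarithm of (number of documents / number of documents containing the term). The function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF scores for tokenised documents.

    documents: list of token lists. Returns one {term: score} dict per document.
    """
    n_docs = len(documents)
    # document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


if __name__ == "__main__":
    # Two tiny, made-up "articles" on the same topic.
    docs = [["election", "vote", "vote"], ["election", "storm"]]
    for doc_scores in tf_idf(docs):
        print(doc_scores)
```

In this toy example "election" occurs in every document and therefore scores 0, while "vote" and "storm" score higher because they are specific to one document. This is the property that makes TF-IDF useful for ranking concepts within a topic.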

Below you can find the final presentations that the students gave to present their work:

A Concentric-based Approach to Represent News Topics in Tweets
Enya Nieland, June 21st 2017

The Relevance of Events in News Articles
Quinten van Langen, June 21st 2017

Posted in CrowdTruth, Projects

Elevator Annotator: Local Crowdsourcing on Audio Annotation

[This post is based on Anggarda Prameswari’s Information Sciences MSc. Thesis]

For her M.Sc. Project, conducted at the Netherlands Institute for Sound and Vision (NISV), Information Sciences student Anggarda Prameswari (pictured right) investigated a local crowdsourcing application to allow NISV to gather crowd annotations for archival audio content. Crowdsourcing and other human computation techniques have proven their use for collecting large numbers of annotations, including in the domain of cultural heritage. Most of the time, crowdsourcing campaigns are done through online tools. Local crowdsourcing is a variant where annotation activities are based on specific locations related to the task.

The two variants of the Elevator Annotator box as deployed during the experiment.

Anggarda, in collaboration with NISV’s Themistoklis Karavellas, developed a platform called “Elevator Annotator”, to be used on-site. The platform is designed as a standalone Raspberry Pi-powered box which can be placed in an on-site elevator, for example. It features speech recognition software and a button-based UI to communicate with participants (see video below).

The effectiveness of the platform was evaluated in two different locations (at NISV and at Vrije Universiteit) and with two different modes of interaction (voice input and button-based input) through a local crowdsourcing experiment. In this experiment, elevator travellers were asked to participate. Agreeing participants were then played a short sound clip from the collection to be annotated and asked to identify a musical instrument.

The results show that this approach is able to achieve annotations with reasonable accuracy, with up to 4 annotations per hour. Given that these results were acquired from one elevator, this new form of crowdsourcing can be a promising method of eliciting annotations from on-site participants.

Furthermore, a significant difference was found between participants from the two locations. This indicates that indeed, it makes sense to think about localized versions of on-site crowdsourcing.

More information:

Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

Events panel at DHBenelux2017

At the Digital Humanities Benelux 2017 conference, the e-humanities Events working group organized a panel with the title “A Pragmatic Approach to Understanding and Utilizing Events in Cultural Heritage”. In this panel, researchers from Vrije Universiteit Amsterdam, CWI, NIOD, Huygens ING, and Nationaal Archief presented different views on Events as objects of study and Events as building blocks for historical narratives.

#DHBenelux #panel: understanding events #fullhouse @ChielvdAkker kicks off with #digital #hermeneutics for #interpretation #support pic.twitter.com/0j9kEAF8SG

— Lora Aroyo (@laroyo) July 5, 2017

The session was packed and the introductory talks were followed by a lively discussion. From this discussion it became clear that consensus on the nature of Events or what typology of Events would be useful is not to be expected soon. At the same time, a simple and generic data model for representing Events allows for multiple viewpoints and levels of aggregations to be modeled. The combined slides of the panel can be found below. For those interested in more discussion about Events: A workshop at SEMANTICS2017 will also be organized and you can join!

Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

DIVE+ receives the Grand Prize at the LODLAM Summit in Venice

We are excited to announce that DIVE+ has been awarded the Grand Prize at the LODLAM Summit, held at the Fondazione Giorgio Cini this week. The summit brought together ~100 experts in the vibrant and global community of Linked Open Data in Libraries, Archives and Museums. It has been organised every two years since 2011. Earlier editions were held in the US, Canada and Australia, making the 2017 edition the first in Europe.

The Grand Prize (USD$2,000) was awarded by the LODLAM community. It is a recognition of how DIVE+ demonstrates the social, cultural and technical impact of linked data. The Open Data Prize (of USD$1,000) was awarded to WarSampo for its groundbreaking approach to publishing open data.

Fondazione Giorgio Cini. Image credit: Johan Oomen CC-BY

Five finalists were invited to present their work, selected from a total of 21 submissions after an open call published earlier this year. Johan Oomen, head of research at the Netherlands Institute for Sound and Vision, presented DIVE+ on day one of the summit. The slides of his pitch have been published, as well as the demo video that was submitted to the open call. Next to DIVE+ (Netherlands) and WarSampo (Finland), the finalists were Oslo public library (Norway), Fishing in the Data Ocean (Taiwan) and Genealogy Project (China). The diversity of the finalists is a clear indication that the use of linked data technology is gaining momentum. Throughout the summit, delegates have been capturing the outcomes of various breakout sessions. Please look at the overview of session notes and follow @lodlam on Twitter to keep track.

Pictured: Johan Oomen (@johanoomen) pitching DIVE+. Photo: Enno Meijers. 

DIVE+ is an event-centric linked data digital collection browser that aims to provide integrated and interactive access to multimedia objects from various heterogeneous online collections. It enriches the structured metadata of online collections with linked open data vocabularies, with a focus on events, people, locations and concepts that are depicted in or associated with particular collection objects. DIVE+ is the result of a truly interdisciplinary collaboration between computer scientists, humanities scholars, cultural heritage professionals and interaction designers. DIVE+ is integrated in the national CLARIAH (Common Lab Research Infrastructure for the Arts and Humanities) research infrastructure.

Pictured: each day experts shape the agenda for that day, following the OpenSpace format. Image credit: Johan Oomen (cc-by)

DIVE+ is a collaborative effort of the VU University Amsterdam (Victor de Boer, Oana Inel, Lora Aroyo, Chiel van den Akker, Susane Legene), Netherlands Institute for Sound and Vision (Jaap Blom, Liliana Melgar, Johan Oomen), Frontwise (Werner Helmich), University of Groningen (Berber Hagendoorn, Sabrina Sauer) and the Netherlands eScience Centre (Carlos Martinez). It is supported by CLARIAH and NWO.

The LODLAM Challenge was generously sponsored by Synaptica. We would also like to thank the organisers, especially Valentine Charles and Antoine Isaac of Europeana and Ingrid Mason of Aarnet for all of their efforts. LODLAM 2017 has been a truly unforgettable experience for the DIVE+ team.

Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

Getting down with LOD tools at the 2nd CLARIAH Linked Data workshop

[cross-post from clariah.nl]

On Tuesday 13 June 2017, the second CLARIAH Linked Data workshop took place. After the first workshop in September, which was very much an introduction to Linked Data for the CLARIAH community, we wanted to organise a more hands-on workshop where researchers, curators and developers could get their hands dirty.

The main goal of the workshop was to introduce relevant tools to novice as well as more advanced users. After a short plenary introduction, we therefore split up the group. For the novice users the focus was on tools that come with a graphical user interface, like OpenRefine and Gephi, whereas we demonstrated API-based tools to the advanced users, such as the CLARIAH-incubated COW, grlc, Cultuurlink and ANANSI. Our setup, namely to have the participants convert their own dataset to Linked Data and then query and visualise it, was somewhat ambitious, as we had not taken into account all data formats or encodings. Overall, participants were able to get started with some data and ask questions specific to their use cases.

It is impossible to fully clean, convert and analyse a dataset in a single day, so the CLARIAH team will keep investigating ways to support researchers with their Linked Data needs. For now, you can check out the CultuurLink slides and tutorial materials from the workshop and keep an eye on this website for future CLARIAH LOD events.

Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer