SEMANTiCS 2017

This year, I was conference chair of the SEMANTiCS conference, which was held 11-14 September in Amsterdam. In my view the conference was a great success, with over 310 visitors across the four days, 24 parallel sessions including academic and industry talks, six keynotes, three awards, many workshops and lots of cups of coffee. I will be posting more retrospectives soon, but below is a Storify item giving an idea of all the cool things that happened in the past week.


Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

Event Extraction From Radio News Bulletins For Linked Data

[This post is based on the BSc. Thesis of Kim van Putten (Computer Science, VU Amsterdam)]

As part of the Bachelor's degree in Computer Science at the VU Amsterdam, Kim van Putten conducted her bachelor thesis in the context of the DIVE+ project.

The DIVE+ demonstrator is an event-centric linked data browser which aims to provide exploratory search within a heterogeneous collection of historical media objects. In order to structure and link the media objects in the dataset, the events need to be identified first. Due to the size of the data collection, manually identifying events is infeasible and a more automatic approach is required. The main goal of the bachelor project was to find a more effective way to extract events from the data and thereby improve linkage within the DIVE+ system.

The thesis focused on event extraction from radio news bulletins, whose text content was extracted using optical character recognition (OCR). Data preprocessing was performed to remove errors from the OCR'ed data. A Named Entity Recognition (NER) tool was used to extract named events, and a pattern-based approach combining NER and part-of-speech tagging tools was adopted to find unnamed events in the data. Errors introduced by the OCR were found to cause poor performance of the NER tools, even after data cleaning.
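The thesis does not prescribe a specific toolchain, but the pattern-based idea (combine entity mentions with an event-denoting verb) can be illustrated with a simplified stand-in. Here, capitalised tokens act as a crude proxy for a real NER tool and a small hand-made verb list stands in for POS tagging; both are illustrative assumptions, not the actual components used:

```python
import re

# Crude stand-ins for real NER / POS tools: capitalised tokens act as
# named-entity candidates, and a small hand-made list acts as event verbs.
EVENT_VERBS = {"opened", "visited", "announced", "signed", "crashed"}

def extract_event(sentence):
    """Return entities and verb if the sentence matches the pattern
    'Entity ... event-verb ... Entity', else None."""
    tokens = sentence.rstrip(".").split()
    # Skip the first token to avoid matching sentence-initial capitals.
    entities = [t for t in tokens[1:] if re.match(r"^[A-Z][a-z]+$", t)]
    verbs = [t for t in tokens if t.lower() in EVENT_VERBS]
    if entities and verbs:
        return {"entities": entities, "verb": verbs[0]}
    return None

print(extract_event("Queen Juliana opened the new bridge in Rotterdam"))
```

A real pipeline would replace both heuristics with proper NER and POS taggers, which is exactly where the thesis found OCR noise to hurt most.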

The results show that the proposed methodology improved upon the old event extraction method. The newly extracted events improved the searchability of the media objects in the DIVE+ system; however, they did not improve the linkage between objects in the linked data structure. Furthermore, the pattern-based method of event extraction was found to be too coarse-grained, allowing only one event to be extracted per object. To achieve a finer granularity of event extraction, future research is needed to identify the relationships between named entities and verbs, and to determine which named entities and verbs describe an event.

The full thesis is available for download here and the presentation here. Below is a poster summarizing the main findings, followed by the presentation of the thesis.

Poster - Event Extraction from Radio News Bulletins

Posted in DIVE+

Discovering the underlying structure of controversial issues with topic modeling

[This post is by Tibor Vermeij about his Master project]

For the Master project of the Information Sciences programme at the Vrije Universiteit, Tibor Vermeij investigated a way to discover the structure of controversial issues on the web. The project was done in collaboration with the ControCurator project.

Detecting controversy computationally has been receiving more and more attention. Because a lot of data is available digitally, controversy detection methods that use machine learning and natural language processing techniques have become more common. However, many studies try to detect the controversy of articles, blog posts or tweets individually; the relations between controversial entities on the web are rarely explored.

To explore the structure of controversial issues, a combination of topic modeling and hierarchical clustering was used. With topic modeling, the content discussed in a set of Guardian articles was discovered. The resulting topics were used as input for a hierarchical agglomerative clustering algorithm to find the relations between articles.
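The two-step pipeline (topic modeling, then agglomerative clustering on the resulting topic distributions) can be sketched with scikit-learn. The thesis does not specify the implementation; the library choice and the tiny stand-in corpus below are illustrative assumptions:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.cluster import AgglomerativeClustering

# Toy stand-in for the Guardian articles used in the thesis.
articles = [
    "vaccine health debate policy",
    "vaccine side effects study",
    "climate energy policy debate",
    "climate emissions energy study",
]

# Step 1: topic modeling -- each article becomes a distribution over topics.
counts = CountVectorizer().fit_transform(articles)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_dist = lda.fit_transform(counts)

# Step 2: hierarchical agglomerative clustering on the topic distributions,
# grouping articles whose topic mixtures are similar.
clusters = AgglomerativeClustering(n_clusters=2).fit_predict(topic_dist)
print(clusters)
```

In the actual project the number of topics and clusters, and the linkage criterion, would be tuned rather than fixed as here.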

The clusters were evaluated with a user study. A questionnaire was sent out that tested the performance of the pipeline in three categories: the similarity of articles within a cluster, the cohesion of the clusters and the hierarchy, and the relation between the controversy of single articles and the controversy of their corresponding clusters.

The questionnaire showed promising results. The approach can be used to get an indication of the general content of the articles. Articles within the same cluster were more similar than articles from different clusters, which means that the chosen clustering method produced coherent topics in the controversial clusters that were retrieved. Opinions on controversy itself varied widely between participants, reinforcing the subjectiveness of human controversy estimation. While the deviation between individual assessments was quite high, averaged rater scores were comparable to the calculated scores, suggesting a correlation between the controversy of articles within the same cluster.

The full thesis can be found here https://drive.google.com/file/d/0B6qAc8tgJOHUWVo4UURqWkZ1UDg/view?usp=sharing.

The presentation can be found here https://drive.google.com/open?id=14ELkY_9UxppL62uxLg5cAMk8yYKqmHKFQtbAznT3HX0.

Posted in Masters Projects

Share, repeat and verify Scientific Experiments with Software Containers

[This post is by Rogier Mars about his Master project]

During my years at the VU as a student of Information Sciences, I was often asked to form a project group and work on some kind of problem. Usually, I was the one implementing the technical part after we had settled on a solution as a team. I always enjoyed this, mainly because of the high variety in the work. A small selection includes an information visualization of crime rates in the Netherlands; an analysis of the influence of weather on Dutch public transportation; using a cognitive system to enhance the performance of a tourist chatbot; programming an AI to compete in games against other student groups; and developing smart home technologies to aid elderly care.

All of these prototypes and experiments consisted partly of software development and partly of the clever use of existing information technology to make life easier or to arrive at new insights and ideas. As you can imagine, during such a project you are totally sucked into it. There is a deadline to reach and a presentation to prepare, and the pressure to finish the project successfully is high. This pressure often results in sloppier working methods: for example, I did not include any documentation at all. I must confess it would take me quite some time to dig into any of these projects again. Even if I could find the code and data, it is highly likely that I could not get them working without some serious troubleshooting.

As it turns out, I'm not the only researcher with these problems. Even for recent ACM conference and journal papers backed by code and data, other researchers could not successfully repeat the experiments about fifty percent of the time. This hurts the efficiency of replication studies, which can result in lower-quality research. It had me thinking: if I could go back in time, could I do better now? How would I do it, and how hard would it be? Could other researchers do it as well?

The software container platform Docker emerged in 2013 and is widely used by businesses throughout the world for web application hosting. With Docker you can easily create software containers that are portable to any other operating system that runs Docker (currently Windows, Linux and macOS). The literature describes how Docker can be used efficiently for research:

“By encapsulating the computational environment of scientific experiments into software containers, you bypass many dependency issues and the need for precise documentation.“
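As an illustration of that encapsulation, a minimal Dockerfile might pin an experiment's environment as follows. File names, versions and dependencies here are hypothetical, a sketch rather than the setup used in the project:

```dockerfile
# Illustrative Dockerfile: pin the base image and dependencies so the
# experiment runs identically on any machine with Docker installed.
FROM python:3.6-slim
WORKDIR /experiment
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "run_experiment.py"]
```

A replicator then only needs `docker build -t myexperiment .` followed by `docker run myexperiment`, with no manual dependency hunting.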

My master project applied Docker to several scientific experiments in order to increase repeatability, and I evaluated this method with students and researchers in and around Amsterdam. By means of a controlled experiment, I created a scenario in which researchers work with Docker on an example project on their own computer. Afterwards I evaluated the method using existing scales and measures in questionnaires, comparing it with the traditional approach; participants were divided equally over both methods. The focus lay on usability, perceived usefulness and perceived ease of use. Participants found it harder to grasp how the Docker method worked than how the existing method worked, but overall they deemed it more useful for repeating and verifying scientific experiments. The method was not perceived as more usable, but it was definitely more reliable. There was also a difference in perceived usefulness between new and existing users: it appears that if you understand how Docker works, you perceive it as more useful, both in general and for research.

If I could do it again, I would use Docker to create a computational environment for my experiments, and I encourage other researchers to do the same. The responsibility for successful execution of the code could shift from the replicator to the creator of the experiment. Eventually, this could make replication studies in computational science more fun and less time consuming.

Posted in Masters Projects

“New life for old media” to be presented at NEM Summit 2017

The extended abstract “Investigations into Speech Synthesis and Deep Learning-based colorization for audiovisual archives” has been accepted for publication at the NEM (New European Media) Summit 2017, to be held in Madrid at the end of November. This paper is based on Rudy Marsman’s thesis “Speech technology and colorization for audiovisual archives” and describes his research on using AI technologies in the context of the Netherlands Institute for Sound and Vision. Specifically, Rudy experimented with developing speech synthesis software based on a library of narrated news videos (using the voice of the late Philip Bloemendal) and with using pre-trained deep learning colorization networks to colorize archival videos.

You can read more in the draft paper [PDF]:

Rudy Marsman, Victor de Boer, Themistoklis Karavellas and Johan Oomen: New life for old media: Investigations into Speech Synthesis and Deep Learning-based colorization for audiovisual archives. In: Extended Abstract Proceedings of the NEM Summit 2017.


Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

Lisbon Machine Learning Summer School 2017 – Trip Report

In the second half of July (20th of July – 27th of July) I attended the Lisbon Machine Learning Summer School (LxMLS2017). As every year, the summer school is held in Lisbon, Portugal, at Instituto Superior Técnico (IST). The summer school is organized jointly by IST, the Instituto de Telecomunicações, the Instituto de Engenharia de Sistemas e Computadores, Investigação e Desenvolvimento em Lisboa (INESC-ID), Unbabel, and Priberam Labs.

Around 170 students (mostly PhD students but also master students) attended the summer school. It's worth mentioning that only around 40% of applicants are accepted, so make sure you have a strong motivation letter! For eight days we learned about machine learning with a focus on natural language processing. Each day was divided into three parts: lectures in the morning, labs in the afternoon and practical talks in the evening (yes, quite a busy schedule).

Morning Lectures

In general, the morning lectures and the labs mapped onto each other really well: first learn the notions, then put them into practice. During the labs we worked with Python and IPython Notebooks. Most of the labs had the base code already implemented and we just had to fill in some functions; however, for some of the lectures/labs this wasn't that easy. I'm not going to discuss the morning lectures in detail, but I'll mention the speakers and their topics (the slides are also available on the website of the summer school):

  • Mario Figueiredo: an introduction to probability theory which proved to be fundamental for understanding the following lectures.
  • Stefan Riezler: an introduction to linear learners using an analogy with the perceptual system of a frog, i.e., given that the goal of a frog is to capture any object of the size of an insect or worm provided it moves like one, can we build a model of this perceptual system and learn to capture the right objects?
  • Noah Smith: gave an introduction to sequence models such as Markov models and hidden Markov models and presented the Viterbi algorithm, which is used to find the most likely sequence of hidden states.
  • Xavier Carreras: talked about structured predictors (i.e., given training data, learn a predictor that performs well on unseen inputs), using a named entity recognition task as a running example. He also discussed Conditional Random Fields (CRFs), an approach that gives good results on such tasks.
  • Yoav Goldberg: talked about syntax and parsing by providing many examples of using them in sentiment analysis, machine translation and many other examples. Compared to the rest of the lectures, this one had much less math and was easy to follow!
  • Bhiksha Raj: gave an introduction to neural networks, more specifically convolutional neural networks (CNNs) and recurrent neural networks (RNNs). He started with the early models of human cognition, associationism (i.e., humans learn through association) and connectionism (i.e., the information is in the connections and the human brain is a connectionist machine).
  • Chris Dyer: discussed modeling sequential data with recurrent networks (but not only). He showed many examples related to language models, long short-term memories (LSTMs) and conditional language models, among others. However, even though it's easy to think of tasks that could be solved by conditional language models, most of the time the data does not exist, a problem that seems to appear in many fields.
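As a refresher on the Viterbi algorithm from Noah Smith's lecture, here is a minimal plain-Python implementation. The weather/activity HMM is a standard textbook toy example, not taken from the summer school materials:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence for an observation sequence."""
    # V[t][s] = (probability, path) of the best path ending in state s at time t
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prob, path = max(
                (V[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]],
                 V[t - 1][prev][1] + [s])
                for prev in states
            )
            V[t][s] = (prob, path)
    return max(V[-1].values())[1]

# Toy HMM: hidden weather states, observed daily activities.
states = ("Rainy", "Sunny")
start = {"Rainy": 0.6, "Sunny": 0.4}
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
         "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
print(viterbi(("walk", "shop", "clean"), states, start, trans, emit))
```

In NLP the hidden states would be part-of-speech tags or entity labels rather than weather, but the dynamic program is identical.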

Practical Talks

In the last part of the day we had practical talks or special talks about concrete applications of the techniques learnt during the morning lectures. On the first day we were invited to attend a panel discussion named “Thinking machines: risks and opportunities” at the conference “Innovation, Society and Technology”, where six speakers from the AI field (Fernando Pereira, VP and Engineering Fellow at Google; Luís Sarmento, CTO at Tonic App; André Martins, senior researcher at Unbabel; Mário Figueiredo, Instituto de Telecomunicações at IST; José Santos-Victor, president of the Institute for Systems and Robotics at IST; and Arlindo Oliveira, president of Instituto Superior Técnico) discussed the benefits and risks of artificial intelligence and automatic learning. Here are a couple of thoughts:

  • Fernando Pereira: In order to enable people to make better use of technology, we need to make machines smarter at interacting with us and helping us.
  • André Martins pointed out an interesting problem: people spend time solving very specific things, but these solutions are never generalized (but what if generalization is not possible?).
  • Fernando Pereira: we build smart tools but only a limited number of people are able to control them, so we need to build the systems in a smarter way and make the systems responsible to humans.

Another evening hosted the Demo Day, an informal gathering that brings together a number of highly technical companies and research institutions, all with the aim of solving machine learning problems through technology. There were a lot of enthusiastic people to talk to, and many demos and products. I even discovered a new crowdsourcing platform, DefinedCrowd, which might soon start competing with CrowdFlower and Amazon Mechanical Turk.

Here are some other interesting talks that we followed:

  • Fernando Pereira – “Learning and representation in language understanding”: talked about learning language representations using machine learning. However, machine understanding of language is not a solved problem: learning from labeled data or with distant supervision may not yield the desired results, so it's time to go implicit. He then introduced the work by Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser and Illia Polosukhin: Attention Is All You Need. In this paper, the authors claim that you do not need complex CNN or RNN models; attention mechanisms are enough to obtain high-quality machine translation.
  • Graham Neubig – “Simple and Efficient Learning with Dynamic Neural Networks”: dynamic neural network toolkits such as DyNet can be used as alternatives to TensorFlow or Theano. According to Graham, some advantages of such toolkits are that the API is closer to standard Python/C++ and that it's easier to implement nets with varying structure; some disadvantages are that it's harder to optimize graphs (but still possible) and harder to schedule data transfers.
  • Kyunghyun Cho – “Neural Machine Translation and Beyond”: showed why sentence-level and word-level machine translation is not desirable: (1) it's inefficient to handle the many morphological variants of words, (2) we need good tokenisation for every language (not that easy), and (3) such models cannot handle typos or spelling errors. Therefore, character-level translation is what we need: it's more robust to errors and handles rare tokens better (tokens which are actually not necessarily rare).
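The scaled dot-product attention at the heart of Attention Is All You Need, mentioned in Fernando Pereira's talk above, can be sketched in a few lines of NumPy. This is a simplified single-head version that ignores masking and the multi-head projections of the full model:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al.)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                              # weighted mix of values

# Three 4-dimensional token representations attending to each other
# (self-attention: queries, keys and values all come from the same tokens).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(X, X, X)
print(out.shape)
```

Each output row is a convex combination of the value rows, weighted by how similar the corresponding query is to each key.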
Posted in CrowdTruth, Projects

Trip Report: Dagstuhl Seminar on Citizen Science

A month ago, I had the opportunity to attend the Dagstuhl Seminar Citizen Science: Design and Engagement. Dagstuhl is really a wonderful place. This was my fifth time there; you can get an impression of the atmosphere from the report I wrote about my first trip. I have primarily been to Dagstuhl for technical topics in the area of data provenance and semantic data management, as well as for conversations about open science and research communication.

This seminar was a great chance for me to learn more about citizen science and discuss its intersection with the practice of open science. There was a great group of people there, covering the gamut from creators of citizen science platforms to crowdsourcing researchers.

As usual with Dagstuhl seminars, it's less about presentations and more about the conversations. There will be a report documenting the outcome and hopefully a paper describing the participants' common thoughts. Neal Reeves took vast amounts of notes, so I'm sure it will be a good report :-). Here's a whiteboard we had full of input:

[Photo: whiteboard of seminar input]

Thus, instead of trying to relay what we came up with (you’ll have to wait for the report), I’ll just pull out some of my own brief highlights.

Background on Citizen Science

There were a lot of good pointers on where to start understanding current thinking around citizen science. First, two tutorials from the seminar:

What do citizen science projects look like:

Example projects:

How should citizen science be pursued:

And a Book:

Open Science & Citizen Science

Claudia Göbel gave an excellent talk about the overlap of citizen science and open science. First, she gave an important reminder that science, in particular in the 1700s, was often done as public demonstration, walking us through an example painting.

She then looked at the overlap between citizen science and open science. Summarized below:

[Diagram: overlap of citizen science and open science]

A follow-on discussion with some of the seminar participants led to input for a whitepaper being developed through the ECSA on Citizen & Open Science for Europe. Check out the preliminary draft; I look forward to seeing the outcome.

Questioning Assumptions

One thing that I left the seminar thinking about was the need to question my own (and my field's) assumptions. This was really inspired by talking to Chris Welty and reflecting on his work with Lora Aroyo on the issues in human annotation and the construction of gold sets. Some assumptions to question:

  • That you need particular qualifications to be considered a scientist.
  • That interoperability is a good thing to pursue.
  • That openness is a worthy pursuit.
  • That we can safely assume a lack of dynamics in computational systems.
  • That human performance is good performance.

Indeed, Marissa Ponti pointed to the example below and highlighted some of the potential ramifications that each of these (at first blush positive) citizen science projects could lead to.

That being said, the ability to rapidly engage more people in the science system seems to be a good thing indeed. That's an assumption I'm happy to hold.


Filed under: trip report Tagged: citizen science, dagstuhl, open science
Source: Think Links

Posted in Paul Groth, Staff Blogs

Identifying emotions in email with human-level accuracy

As part of the Master’s degree Business Analytics at the VU Amsterdam, Erwin Huijzer completed his master thesis at Anchormen:
“Identifying effective affective email responses; Predicting customer affect after email conversation”

When customers contact a company with queries and complaints, they often prefer to use email. Handling these emails is a massive task for the customer support department. Automating email handling can help reduce costs and shorten response times. However, awareness of customer emotion during the conversation is an important aspect of effective email handling.

In the thesis, sentiment analysis was used on incoming customer emails to determine the initial emotion of a customer. Furthermore, affect analysis was applied to predict the customer's emotion after the response email from customer support. Both analyses were executed using supervised machine learning, which trains computer models on labelled data. This required manually labelling a set of emails with sentiment (None, Negative, Positive, Mixed) and emotions (Anger, Disgust, Fear, Joy, Sadness).

Manual labelling revealed that humans find it very difficult to determine emotions in email. Still, using majority vote, a reliable label set could be determined. Applying machine learning (a voting ensemble of Random Forest and a neural net) to the labelled data resulted in human-level accuracy for Anger and Joy; for Disgust, the model even significantly outperforms human annotation. Using the same voting ensemble with an SVM added leads to human-level performance on sentiment too. For both sentiment and emotions, the domain-specific models trained on a small set of 742 emails outperform a commercial model trained on millions of news sources.
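The kind of voting ensemble the thesis describes can be sketched with scikit-learn. The real models were trained on features extracted from the labelled emails; the synthetic data, feature dimensions and hyperparameters below are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic stand-in for vectorized, labelled customer emails.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Soft-voting ensemble in the spirit of the thesis: Random Forest plus
# a neural net (with an SVM added, as for the sentiment task).
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=0)),
        ("nn", MLPClassifier(max_iter=500, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),  # soft voting needs probabilities
    ],
    voting="soft",
)
ensemble.fit(X, y)
print(ensemble.score(X, y))
```

Soft voting averages the class probabilities of the three models, which tends to smooth out the individual models' mistakes.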

Machine learning to predict customer affect showed low performance. Still, the results are significantly better than the benchmarks. A more direct measurement of customer affect may, however, drastically improve performance.

The full thesis is available for download here. The presentation is available here.

Posted in Masters Projects

A Concentric-based Approach to Represent Topics in Tweets and News

[This post is based on the BSc. Thesis of Enya Nieland and the BSc. Thesis of Quinten van Langen (Information Science Track)]

The Web is a rich source of information that presents events, facts and their evolution across time. People mainly follow events through news articles or through social media such as Twitter. The main goal of the two bachelor projects was to see whether topics in news articles or tweets can be represented in a concentric model, where the main concepts describing the topic are placed in a “core” and the less relevant concepts are placed in a “crust”. To answer this question, Enya and Quinten built on the research conducted by José Luis Redondo García et al. in the paper “The Concentric Nature of News Semantic Snapshots”.

Enya focused on the tweets dataset, and her results show that the approach presented in the aforementioned paper does not work well for tweets: the model had a precision of only 0.56. After inspecting the data, Enya concluded that the large amount of redundant information in tweets makes them difficult to summarise and makes it hard to identify the most relevant concepts. After applying stemming and lemmatisation techniques, data cleaning, and similarity scores together with various relevance thresholds, she improved the precision to 0.97.

Quinten focused on topics published in news articles. Applying the method described in the reference article, Quinten concluded that relevant entities can indeed be identified in news articles. However, his focus was also on identifying the most relevant events mentioned when talking about a topic. As an addition, he calculated a term frequency-inverse document frequency (TF-IDF) score and an event-relation score (temporal relations and event-related concepts) for each topic. These combined scores determine the new relevance score of the entities mentioned in a news article. The additions improved the ranking of events, but did not improve the ranking of other concepts, such as places or actors.

Below are the final presentations the students gave on their work:

A Concentric-based Approach to Represent News Topics in Tweets
Enya Nieland, June 21st 2017

The Relevance of Events in News Articles
Quinten van Langen, June 21st 2017

Posted in CrowdTruth, Projects

Elevator Annotator: Local Crowdsourcing on Audio Annotation

[This post is based on Anggarda Prameswari’s Information Sciences MSc. Thesis]

For her M.Sc. project, conducted at the Netherlands Institute for Sound and Vision (NISV), Information Sciences student Anggarda Prameswari investigated a local crowdsourcing application that allows NISV to gather crowd annotations for archival audio content. Crowdsourcing and other human computation techniques have proven their use for collecting large numbers of annotations, including in the domain of cultural heritage. Most of the time, crowdsourcing campaigns are run through online tools; local crowdsourcing is a variant where annotation activities are tied to specific locations related to the task.

The two variants of the Elevator Annotator box as deployed during the experiment.

Anggarda, in collaboration with NISV's Themistoklis Karavellas, developed a platform called “Elevator Annotator”, to be used on-site. The platform is designed as a standalone Raspberry Pi-powered box which can be placed, for example, in an on-site elevator. It features speech recognition software and a button-based UI to communicate with participants (see video below).

The effectiveness of the platform was evaluated in a local crowdsourcing experiment at two different locations (NISV and the Vrije Universiteit) and with two different modes of interaction (voice input and button-based input). In these experiments, elevator travellers were asked to participate; agreeing participants were played a short sound clip from the collection to be annotated and asked to identify a musical instrument.

The results show that this approach achieves annotations with reasonable accuracy, at up to 4 annotations per hour. Given that these results were acquired from a single elevator, this new form of crowdsourcing can be a promising method for eliciting annotations from on-site participants.

Furthermore, a significant difference was found between participants at the two locations. This indicates that it indeed makes sense to think about localized versions of on-site crowdsourcing.

More information:


Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer