ISWC 2015 trip report

I would like to share a few thoughts and experiences from this year’s International Semantic Web Conference (ISWC 2015), which took place last week in Bethlehem, PA (USA). Many papers are still on my reading list, but here is a preliminary and incomplete list of things I found interesting:

First of all, it was a great crowd of people there. I had so many interesting discussions; it would have been worth going just for that.

There were many interesting conference papers as well, of course, but let’s start with the workshops. Unfortunately, I missed the first workshop day, but I was there all day for the second. There was a great keynote by David Karger at the Intelligent Exploration of Semantic Data (IESD) workshop that reminded us that, when developing tools for the Web, we should think about end-users from the start. He concluded with the only seemingly pessimistic statement that “the Semantic Web will never work because when it works, you won’t know it’s the Semantic Web”.

In the afternoon of the second workshop day, I attended the Linked Science (LISC) workshop, with a nice keynote by Krzysztof Janowicz on “Linked Data Scientometrics”. I found it to be a very good workshop with very interesting talks, for example one on how to preserve computational experiments with Docker. I was also very happy to present my own work there on a nanopublication library written in Java, and it was striking to see PROV-O mentioned again and again during this workshop.

At the main conference, the Linked Data Fragments (LDF) technique got a lot of attention, mostly in the form of low-level algorithm and implementation papers such as “Substring Filtering for Low-Cost Linked Data Interfaces”. A very nice and user-friendly system that was presented was TR Discover (also nominated for the best paper award). Unfortunately, the paper missed a whole body of existing literature on Controlled Natural Languages (CNL), so the system is not as novel as it claims to be. Apart from that, it is still very nice and interesting (probably none of the similar existing systems has achieved that level of maturity). During the Poster and Demo session at the end of the first day, I discovered Basil, which seems to be a neat tool to convert your SPARQL endpoints into REST APIs.

At the keynote on the second day, given by Andrew McCallum, the concept of a Universal Schema was introduced. I didn’t fully understand the concept (that’s one of the papers on my reading list), but it seemed to be useful for the integration of data when combined with machine learning. Then there were a number of nice presentations demonstrating that data recency can trump data volume, that we can use Linked Data as content for quizzes for the purpose of education (though some reported afterwards that this was already done four years ago), and that we can use nanopublications and PROV to integrate various datasets in the domain of drug-drug interactions (I was myself involved in this last paper). The work on using Linked Data to combat human trafficking got a lot of praise (probably deserved, but I still have to read the paper). And then an interesting small project was presented during the lightning talks: RDFa Lite for Markdown.

The keynote by Ian Horrocks on the last day nicely summarized what the Semantic Web community has achieved so far, and he argued that even though RDF, SPARQL, and OWL might not be perfect, they provide a huge advantage and opportunity. Unfortunately, I missed Laurens Rietveld’s presentation on the LOD Labs, which won the best research paper award (it was also nominated for the best student paper award, but it got the big one!). That is truly fantastic, and I am sure well-deserved, though that paper is still on my reading list too. The reason I missed that talk was that I was giving my own talk at the parallel Scientific Data session on establishing a decentralized server network for data publishing in the form of nanopublications. Other nice work in that track included the generation of semantic topic networks for research areas. Lastly, another paper that seems very interesting and promising is “General Terminology Induction in OWL”, which was also nominated for two best paper awards. That paper presents an approach to automatically generate hypotheses, and it is based on the idea that each hypothesis has a fitness (i.e. how many facts in the data support it) and a braveness (how many new assertions it makes).
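
To make the fitness and braveness idea a bit more concrete, here is a toy sketch in Python. It is only my own simplified illustration, not the definitions or the algorithm from the paper: I assume a hypothesis of the form “every member of class A is also a member of class B”, and a tiny made-up dataset that maps individuals to their asserted classes. The function names and the data are hypothetical.

```python
from typing import Dict, Set


def fitness(types: Dict[str, Set[str]], sub: str, sup: str) -> int:
    """Toy fitness: individuals already asserted to be in both classes,
    i.e. facts in the data that support the hypothesis 'sub is a subclass of sup'."""
    return sum(1 for classes in types.values() if sub in classes and sup in classes)


def braveness(types: Dict[str, Set[str]], sub: str, sup: str) -> int:
    """Toy braveness: individuals in `sub` not yet asserted to be in `sup`,
    i.e. new assertions the hypothesis would add to the data."""
    return sum(1 for classes in types.values() if sub in classes and sup not in classes)


# Made-up example data (assumption): individual -> asserted classes
data = {
    "alice": {"Student", "Person"},
    "bob": {"Student", "Person"},
    "carol": {"Student"},
    "dave": {"Person"},
}

print(fitness(data, "Student", "Person"))    # 2 (alice and bob support it)
print(braveness(data, "Student", "Person"))  # 1 (carol would newly become a Person)
```

In this toy reading, a hypothesis with high fitness and zero braveness only restates what the data already says, while one with high braveness and low fitness makes many new claims with little support.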

This leaves me with a long list of promising papers to read in detail, which is probably the best that can happen after such a conference. The full list of accepted papers can be found here, the official Springer proceedings here and here, and some of the papers are also uploaded and linked from the conference program.

