Trip Report: Observatory for Knowledge Organisation Systems

Last week, I was in Malta for a small workshop on building, or at least thinking about the need for, observatories for knowledge organization systems (KOSs). Knowledge organization systems are things like taxonomies, classification schemes, ontologies or concept maps. The event was hosted by the EU COST action KNOWeSCAPE, which focuses on understanding the dynamics of knowledge through their analysis and, importantly, visualization.

Observatory for KOS is on! @KNOWeSCAPE pic.twitter.com/1BL7q3fFUD

— Joseph T. Tennis (@josephttennis) February 1, 2017

This was a follow-up to a previous workshop I attended on KOS evolution. Inspired by that workshop, my colleague Mike Lauruhn and I began to think about how the process of constructing KOSs is changing with the incorporation of software agents and non-professional contributors (e.g. crowdsourcing). In particular, we wanted to get a handle on what the manager of a KOS should think about when dealing with its inevitable evolution, especially with the introduction of these new factors. We wrote about this in our article Sources of Change for Modern Knowledge Organization Systems. Knowl. Org. 43(2016)No.8. (preprint).

In my talk (slides below), I presented our article in the context of building large knowledge graphs at Elsevier. The motivating slides were taken from Brad Allen’s keynote at the Dublin Core conference on metadata in the machine age. My aim was to motivate the need for KOS observatories that provide empirical evidence on how to deal with changing KOSs.

Both Joseph Tennis and Richard P. Smiraglia gave excellent overviews of the current state of the art of KOS ontogeny in information systems. In particular, I think the definitional terms introduced by Tennis are useful. He also had the clearest motivation for the need for an observatory: we need a central dataset that is collected over time in order to go beyond case study analysis (e.g. one or two KOSs) to a population-based approach.

I really enjoyed Shenghui Wang’s talk on her and Rob Koopman’s experiments using embeddings to try to detect concept drift within journal articles. Roughly put, they used a different vector space for each time period and were able to see how particular terms changed with respect to other terms in those vector spaces. I’m looking forward to seeing how this work progresses.
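To make the idea a bit more concrete, here is a minimal sketch of one way to compare a term across per-period vector spaces: look at its nearest neighbours in each period and measure how much those neighbour sets overlap. This is my own illustration, not the method from the talk; the function names and the toy vectors are made up, and a real experiment would use embeddings trained on the articles from each period.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_terms(space, term, k=2):
    """Return the k terms closest to `term` within one period's vector space."""
    target = space[term]
    others = [t for t in space if t != term]
    others.sort(key=lambda t: cosine(target, space[t]), reverse=True)
    return set(others[:k])


def neighbour_overlap(space_a, space_b, term, k=2):
    """Jaccard overlap of a term's neighbours in two periods (1.0 = unchanged)."""
    a = nearest_terms(space_a, term, k)
    b = nearest_terms(space_b, term, k)
    return len(a & b) / len(a | b)


# Toy, hand-made vectors (hypothetical data, one space per time period).
space_1990s = {
    "web": [0.9, 0.1, 0.0],
    "spider": [0.8, 0.2, 0.0],
    "silk": [0.7, 0.3, 0.1],
    "internet": [0.0, 0.1, 0.9],
    "browser": [0.1, 0.0, 0.8],
}
space_2010s = {
    "web": [0.1, 0.1, 0.9],
    "spider": [0.9, 0.2, 0.0],
    "silk": [0.8, 0.3, 0.1],
    "internet": [0.0, 0.1, 0.9],
    "browser": [0.1, 0.0, 0.8],
}

# A low overlap suggests the term's context has drifted between the periods.
print(neighbour_overlap(space_1990s, space_2010s, "web"))
```

Comparing neighbour sets within each space, rather than comparing the raw vectors across spaces, sidesteps the fact that separately trained spaces are not aligned with each other.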

The workshop was co-organized with the Wikimedia Community Malta, so there was good representation from various members of the community. I particularly enjoyed meeting John Cummings, who is a Wikimedian in Residence at UNESCO. He told me about one of his projects to help create high-quality Wikipedia pages from UNESCO reports and other open access documents. It’s really cool seeing how deep research-based content can be used to expand Wikipedia and the ramifications that has on its evolution. Another Wikipedian, Rebecca O’Neill, gave a fascinating talk about rethinking the relationship between citizen curators and traditional memory institutions. Lots of stuff at her site, so check it out.

Overall, the event confirmed my belief that there’s lots more that knowledge organization studies can do with respect to large-scale knowledge graphs, and that those building these graphs can also learn from the field.

Source: Think Links

Posted in Paul Groth, Staff Blogs

Speech technology and colorization for audiovisual archives

[This post describes Rudy Marsman’s MSc thesis and is partly based on a Dutch blog post by him]

The Netherlands Institute for Sound and Vision (NISV) archives Dutch broadcast TV and makes it available to researchers, professionals and the general public. One subset consists of the Polygoonjournaals (public news broadcasts), which are published under open licenses as part of the OpenImages platform. NISV is also interested in exploring new ways and technologies to make interaction with the material easier and to increase exposure to their archives. In this context, Rudy explored two options.

Two stills from the film ‘Steegjes’, with the right frame colorized. Source: Polygoon-Profilti (producent) / Nederlands Instituut voor Beeld en Geluid / colorized by Rudy Marsman, CC BY-SA

One part of the research was the automatic colorization of old black-and-white video footage using neural networks. Rudy used a pre-trained neural network (Zhang et al. 2016) that is able to colorize black-and-white images. He developed a program that splits videos into frames, colorizes the individual frames using the network and then ‘stitches’ them back together into colorized videos. The stunning results were very well received by NISV employees. Examples are shown below.
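The split, colorize and stitch idea can be sketched roughly as follows. This is only an illustration and not Rudy's program: frames are modelled as plain nested lists, `placeholder_colorizer` is a hypothetical stand-in for the pre-trained network, and actually reading and writing video files (for instance with a video library) is left out.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

GrayFrame = List[List[int]]                    # rows of 0-255 luminance values
ColorFrame = List[List[Tuple[int, int, int]]]  # rows of (R, G, B) pixels


def placeholder_colorizer(frame: GrayFrame) -> ColorFrame:
    """Hypothetical stand-in for the network: copies luminance into R, G and B."""
    return [[(v, v, v) for v in row] for row in frame]


def colorize_video(
    frames: Iterable[GrayFrame],
    colorize_frame: Callable[[GrayFrame], ColorFrame],
) -> Iterator[ColorFrame]:
    """Colorize frames one by one, keeping their order so they can be stitched."""
    for frame in frames:
        yield colorize_frame(frame)


# Two tiny 2x2 "frames" standing in for a video that was split into images.
video = [
    [[0, 128], [255, 64]],
    [[10, 20], [30, 40]],
]
colorized = list(colorize_video(video, placeholder_colorizer))
print(len(colorized), colorized[0][0][1])
```

Because each frame is handled independently, the colorizer can be swapped without touching the rest of the pipeline; the flip side is that nothing in this sketch enforces colour consistency between consecutive frames.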


Tour de France 1954 (colorized by Rudy Marsman in 2016), Polygoon-Profilti (producent) / Nederlands Instituut voor Beeld en Geluid (beheerder), CC BY-SA

Results from the comparison of the different variants of the method on different corpora

In the other part of his research, Rudy investigated to what extent the existing news broadcast corpus, with voice-overs by the famous Philip Bloemendal, can be used to develop a modern text-to-speech engine with his voice. To do so, he mainly focused on natural language processing and on determining to what extent the language used by Bloemendal in the 1970s is still comparable to contemporary Dutch.

Rudy used precompiled automatic speech recognition (ASR) results to match words to sounds and developed a slot-and-filler text-to-speech system based on this. To increase the limited vocabulary, he implemented a number of strategies, including term expansion through the use of Open Dutch Wordnet and smart decompounding (this mostly works for Dutch, mapping ‘sinterklaasoptocht’ to ‘sinterklaas’ and ‘optocht’). The different strategies were compared to a baseline. Rudy found that a combination of the two resulted in the best performance (see figure).
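As a rough illustration of the decompounding idea, the sketch below splits a compound into words that are already in the vocabulary, trying the longest known prefix first. It is an assumption-laden toy rather than Rudy's implementation: the `decompound` function, the `min_len` threshold and the small vocabulary are all hypothetical, and Dutch linking letters are not handled.

```python
def decompound(word, vocabulary, min_len=3):
    """Split `word` into known vocabulary words, or return None if impossible."""
    if word in vocabulary:
        return [word]
    # Try the longest known prefix first, then recurse on the remainder.
    for cut in range(len(word) - min_len, min_len - 1, -1):
        head, tail = word[:cut], word[cut:]
        if head in vocabulary:
            rest = decompound(tail, vocabulary, min_len)
            if rest is not None:
                return [head] + rest
    return None


# Hypothetical vocabulary: words for which a spoken fragment is available.
vocabulary = {"sinterklaas", "optocht", "klaas", "tocht"}

print(decompound("sinterklaasoptocht", vocabulary))  # ['sinterklaas', 'optocht']
print(decompound("fiets", vocabulary))               # None
```

In a slot-and-filler system, the returned parts could then each be looked up separately, so a compound that was never spoken in the corpus can still be assembled from words that were.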

Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

ArchiMediaL proposal granted by Volkswagen Stiftung

I received a good-news letter from the Volkswagen Stiftung, which decided to award us a research grant for a 3-year Digital Humanities project named “ArchiMediaL” around architectural history. This project will be a collaboration between architecture historians from TU Delft, computer scientists from TU Delft and VU-Web and Media. A number of German scholars will also be involved as domain experts. The project will combine image analysis software with crowdsourcing and semantic linking to create networks of visual resources, which will foster understanding of understudied areas in architectural history.

From the proposal: In the mind of the expert or everyday user, the project detaches the digital image from its existence as a single artifact and includes it in a global network of visual sources, without disconnecting it from its provenance. The project expands the framework of hermeneutic analysis through a quantitative reference system, in which discipline-specific canons and limitations are questioned. For the dialogue between the history of architecture and urban form this means a careful balancing of qualitative and quantitative information and of negotiating new methodological approaches for future investigation.

Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

A Look Back at the 2nd BDE Workshop on Big Data in Health, Demographic Change and Wellbeing

[reblogged from Big-Data-Europe.eu]

On 9 December 2016, the second workshop for the Big Data Europe Health, Demographic Change and Wellbeing societal challenge was held in Brussels. The aim of this workshop was to highlight progress from the BigDataEurope project in building the foundations of a generically applicable big data platform which can be applied across all Horizon 2020 societal challenges. This workshop specifically focused on health, and showcased our first pilot’s application to early bioscience research data.

The workshop in full effect

The workshop had 15 participants, from within the health domain and outside it, including many participants from the European Commission. Together we discussed different perspectives on how we may use appropriate H2020 instruments and work programmes to better integrate the ecosystem of linked data repositories, data management services and virtual collaboration environments to increase the pace of knowledge sharing in health.

The workshop featured presentations from BDE’s Simon Scerri and Aad Versteden on the general goals and progress of the BigDataEurope project and the BDE infrastructure respectively. After lunch, Ronald Siebes (BDE / VU Amsterdam) presented the first pilot in this specific domain. More information on that pilot can be found here. An extensive round-table discussion followed, in which possible options for new applications and connections were considered.

Snapshot of the SC1 pilot interface, as presented by Ronald Siebes

One question raised was whether the generic BDE infrastructure can be used by European SMEs. The fact that the BDE infrastructure is completely Open Source, very easy to install and features intuitive interface components makes re-use relatively simple even for smaller institutions and companies.

A significant part of the discussion focussed on possible new use cases for expanding the scope of the pilot. One suggestion was to look at post-hoc integration of clinical data, which represents a typical problem of data ‘variance’. This would require integrating information from different versions of medical questionnaires, which may be recorded or stored in different ways. Data provenance is also a key concern, as keeping a trail of what has happened to clinical data is crucial to tracking patients’ histories. Once integrated, this data could then be mined to identify biases or data patterns.

Finally, the workshop participants discussed potential connections to other European projects. Here many projects were mentioned including the MIDAS project, the Big-O project on childhood obesity, the PULSE projects and IMI / IMI2 projects including EMIF. We will be seeking collaborations with these projects and will continue to develop new and interesting Big Data use cases in this domain in the coming year.

More images can be found below: BDE Health Workshop SC 1.2

Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

Web of Voices and W4RA video at the Webscience@10 TV Channel

 

For its 10th anniversary, the Web Science Trust organized an event, Webscience@10. For this event, a Webscience@10 TV channel was launched to showcase different research and education initiatives around the world. On behalf of the VU Network Institute and W4RA, we submitted our Web of Voices video as well as a short introduction to the W4RA team.

You can watch the ~10 hours of video content at http://www.webscience.org/webscience10/tv-channel-webscience10/. You can find us (listed under Netwerk Institute Amsterdam) at 2h31mins:
https://www.youtube.com/watch?v=7BTkylI60DM

Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

Niels’ paper awarded first Bob Wielinga award at EKAW

Niels Ockeloen’s paper on Data2Documents was awarded the first Bob Wielinga memorial award for best research paper at the 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW2016). “Data 2 Documents: Modular and distributive content management in RDF” was authored by Niels Ockeloen, Victor de Boer, Tobias Kuhn and Guus Schreiber from the Web and Media group. The paper describes Niels’ PhD work on a method for creating human-readable web documents out of machine-readable Linked Data, focussing on modularity and re-use. You can view the slides for Niels’ presentation here.

Niels wins Best Paper Award

The award is named after Prof. Bob Wielinga, one of the most prominent European scientists in the area of knowledge-based systems, best known for his work on the KADS methodology, who has been one of the key influences on the development of the area in the past three decades. Bob was both my own and Guus Schreiber’s promotor, so this makes it extra special for us. In 2009 he was also appointed at our department, where he continued supervising PhD students until he passed away earlier this year. It is especially nice that the award, which is named after Bob Wielinga, goes to work that is not only authored by people from Amsterdam but that Bob also discussed with Niels in the Basket at some point before his passing.

Source: BiographyNet

Posted in BiographyNet, Projects

Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer