Exploring West African Folk Narrative Texts Using Machine Learning

It is so nice when two often very distinct research lines come together. In my case, Digital Humanities and ICT for Development rarely meet directly. But they sure did come together when Gossa Lô started her Master’s thesis in AI. Gossa, a long-time collaborator in the W4RA team, chose to focus on the opportunities of Machine Learning and Natural Language Processing for West African folk tales. Her research involved constructing a corpus of West African folk tales, performing various classification and text generation experiments, and even included a field trip to Ghana to elicit information about folk tale structures. The work (done as part of an internship at Bolesian.ai) resulted in a beautiful Master’s thesis, which was awarded a very high grade.

As a follow-up, we decided to rewrite the thesis as an article and submit it to a DH or ICT4D journal. This proved harder than expected. Both DH and ICT4D are highly multidisciplinary, and the combination of the two proved a bit too much for many journals: our article was either too technical, not technical enough, or simply out of scope.

But now, the article “Exploring West African Folk Narrative Texts Using Machine Learning” has been published (Open Access) in a special issue of Information on Digital Humanities!

Experiment 1: RNN architecture of the word-level (left) and character-level (right) models
t-SNE visualisation of the second experiment

The paper examines how machine learning (ML) and natural language processing (NLP) can be used to identify, analyze, and generate West African folk tales. Two corpora of West African and Western European folk tales were compiled and used in three experiments on cross-cultural folk tale analysis:

  1. In the text generation experiment, two types of deep learning text generators (word-level and character-level) are built and trained on the West African corpus. We show that although the generated texts vary in semantic and syntactic coherence, each contains distinctly West African features.
  2. The second experiment further examines the distinction between the West African and Western European folk tales by comparing the performance of an LSTM (acc. 0.79) with a BoW classifier (acc. 0.93), indicating that the two corpora can be clearly distinguished in terms of vocabulary (a rough sketch of such a BoW classifier follows this list). An interactive t-SNE visualization of a hybrid classifier (acc. 0.85) highlights the culture-specific words for both corpora.
  3. The third experiment describes an ML analysis of narrative structures. Classifiers trained on parts of folk tales, split according to the three-act structure, are quite capable of distinguishing these parts (acc. 0.78). Common n-grams extracted from these parts not only underline cross-cultural distinctions in narrative structure, but also show the overlap between oral and written West African narratives.
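
To make the flavor of these experiments concrete, here is a minimal bag-of-words sketch in scikit-learn. It is a hypothetical illustration with toy texts and made-up labels, not the paper’s code, loosely mirroring the BoW baseline of experiment 2 and the n-gram inspection of experiment 3:

```python
# Hypothetical sketch (not the authors' code): a bag-of-words classifier
# separating two folk tale corpora, plus inspection of the most
# culture-indicative n-grams.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for the two corpora; the real data is in the GitHub repo.
tales = [
    "anansi the spider tricked the chief of the village",
    "the tortoise carried the calabash to the elders",
    "once upon a time a princess lived in a castle",
    "the knight rode through the kingdom to see the king",
]
labels = [0, 0, 1, 1]  # 0 = West African, 1 = Western European

vec = CountVectorizer(ngram_range=(1, 2))  # unigrams and bigrams
X = vec.fit_transform(tales)
clf = LogisticRegression().fit(X, labels)

# n-grams with the largest absolute weights are the most culture-indicative.
terms = vec.get_feature_names_out()
ranked = sorted(zip(clf.coef_[0], terms), key=lambda t: abs(t[0]), reverse=True)
print(ranked[:5])
```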
Example output of the word-level text generator on translated West African folk tale fragments
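
For a sense of how such a generator is wired up, here is a minimal character-level sketch in Keras. It is a hypothetical skeleton (the corpus filename is assumed), not the thesis code; the actual models are in the repository linked below:

```python
# Hypothetical sketch of a character-level LSTM text generator, in the
# spirit of experiment 1 (not the thesis code).
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

text = open("west_african_tales.txt").read().lower()  # assumed corpus file
chars = sorted(set(text))
idx = {c: i for i, c in enumerate(chars)}

# One-hot encode overlapping 40-character windows; the character following
# each window is the prediction target.
seq_len = 40
X = np.zeros((len(text) - seq_len, seq_len, len(chars)), dtype=bool)
y = np.zeros((len(text) - seq_len, len(chars)), dtype=bool)
for i in range(len(text) - seq_len):
    for t, c in enumerate(text[i:i + seq_len]):
        X[i, t, idx[c]] = 1
    y[i, idx[text[i + seq_len]]] = 1

model = Sequential([
    LSTM(128, input_shape=(seq_len, len(chars))),
    Dense(len(chars), activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.fit(X, y, batch_size=128, epochs=20)

# Generation then repeatedly samples one character from the softmax output
# and slides the 40-character window forward over the growing text.
```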

All resources, including data and code, are available at https://github.com/GossaLo/afr-neural-folktales


Source: Victor de Boer

Posted in Staff Blogs, Victor de Boer

(Virtual) Trip Report: KGC 2020

Last week, I virtually attended the Knowledge Graph Conference 2020. Originally, KGC was planned to be hosted in New York at Columbia University but, as with everything, had to go online because of the pandemic.

Before getting to the content, I wanted to talk about logistics. Kudos to Francois Scharffe and the team for putting this conference online quickly and running it so smoothly. Just thinking of all the small things – for example, as a speaker I was asked to do a dry run with the organizers and got comments back on how the presentation came across on Zoom. The conference Slack workspace was booming with tons of different channels. The organizers had a nice cadence of talk announcements while boosting conversation by pushing the Q&A sessions onto Slack. This meant that the conversations could continue beyond each individual session. At the meta level, they managed to recreate the intensity of a conference online through the effort they put into curating those Slack channels, along with the rapid-fire pace of the talks over the two main track days. Personally, I found this more tiring than face-to-face (F2F), because Zoom presentations somehow require full focus to ingest. Additionally, there’s the temptation to do both the conference and your normal workday when the event is in another time zone… which, err, I might have been guilty of. I also had some hallway conversations on Slack, but not as much as I normally would in an F2F setting.

But what’s the conference about? KGC started last year with the idea of having an application- and business-oriented event focused on knowledge graphs. I would summarize the aim as bringing people together to talk about knowledge graph technology in action, see the newest commercially ready tech, and get a glimpse of future tech. The conference has the same flavor as Connected Data London. As a researcher, I really enjoy seeing the impact these technologies are having in a myriad of domains.

So what was I doing there? I was talking about Knowledge Graph Maintenance (slides) – how do we integrate machine learning techniques and the work of people to not only create but also maintain knowledge graphs? Here’s my talk summarized in one picture:

[Image: one-picture summary of the talk]

My goal is to get organizations that are adopting knowledge graphs to think not only about one-off creation but also about what goes into keeping that knowledge up-to-date. I also wanted to give a sketch of the current research we’ve been doing in this direction.

There was a lot of content at this event (which will be available online) so I’ll just call out three things I took away from it.

Human Understandable Data

One of the themes that kept coming up was the use of knowledge graphs to help the data in an organization match the conceptualizations used within the business. Sure, we can frame this as building an ontology, a logical model, or a semantic dictionary, but the fundamental point, highlighted again and again, is that this data-to-business bridge is the purpose of building many knowledge graphs. It was nicely summed up in the following two slides from Michael Grove:

[Two slides from Michael Grove]

This also came through in Ora Lassila’s talk (he’s now at Amazon Neptune), as well as the tutorial I attended by Juan Sequeda on building Enterprise Knowledge Graphs from Relational Databases. Juan ran through a litany of mapping patterns, all trying to bridge from data stored for specific applications to human-understandable data. I’m looking forward to seeing this tutorial material made available.
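
To illustrate the general pattern (emphatically not Juan’s material, and a toy “direct mapping” rather than a standard like R2RML), here is a sketch in Python with rdflib, where each table row becomes a typed resource and each column a property; the table and names are invented:

```python
# Hypothetical sketch: a toy direct mapping from a relational table to RDF,
# illustrating the application-data-to-knowledge-graph bridge.
import sqlite3
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/")
g = Graph()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace')")

# Each row becomes a typed resource; each column becomes a property.
for row_id, name in conn.execute("SELECT id, name FROM customer"):
    subject = EX[f"customer/{row_id}"]
    g.add((subject, RDF.type, EX.Customer))
    g.add((subject, EX.name, Literal(name)))

print(g.serialize(format="turtle"))
```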

The Knowledge Scientist 

Great to see @BethanySehon back at #kgconf with “friendly neighborhood #ontologist” Brian Donohue of @CapitalOne Talk: “Validating Categories using Knowledge Graphs” pic.twitter.com/41jjqls4h3

— Knowledge Graph Conference (@KGConference) May 6, 2020

Given the need to bridge the gap between application data and business-level goals, new kinds of knowledge engineering, and tools to facilitate it, were also of interest. Why aren’t existing approaches enough? I think the assumption is that there’s a ton of data that people doing this activity need to deal with. Both Juan and I discussed the need to recognize these sorts of people – which we call Knowledge Scientists – and it seemed to resonate, or at least the premise behind the term did.

An excellent example of tools to support this kind of knowledge engineering came from Rafael Gonçalves, who showed how Pinterest used WebProtege to update and manage their taxonomy (paper):

[Slide from Rafael Gonçalves’s talk]

Likewise, Bryon Jacob discussed how the first step toward a knowledge graph is better cataloging of the data within the organization. It reminds me of the lesson we learned from linked data – that before we can have knowledge, we need to index and catalog the underlying data. Also, I can never overlook a talk that gives a shoutout to PROV and the need for lineage and provenance 🙂 .
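
As a tiny illustration of what such lineage statements look like, here is a hedged sketch using the W3C PROV-O vocabulary via rdflib; the dataset and activity names are invented:

```python
# Hypothetical sketch: recording dataset lineage with PROV-O.
from rdflib import Graph, Namespace, RDF

PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/")

g = Graph()
g.bind("prov", PROV)

# A cleaned dataset derived from a raw source by a cataloging activity.
g.add((EX.cleaned_sales, RDF.type, PROV.Entity))
g.add((EX.raw_sales, RDF.type, PROV.Entity))
g.add((EX.catalog_job, RDF.type, PROV.Activity))
g.add((EX.cleaned_sales, PROV.wasDerivedFrom, EX.raw_sales))
g.add((EX.cleaned_sales, PROV.wasGeneratedBy, EX.catalog_job))
g.add((EX.catalog_job, PROV.used, EX.raw_sales))

print(g.serialize(format="turtle"))
```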

Knowledge Graphs as Data Assets

I really enjoyed seeing the various kinds of application areas using knowledge graphs. There were early domain adopters, for example in drug discovery and scholarly data, that have pushed further in using this technology:

An excellent talk by @TPlasterer @AstraZeneca on #FAIR data and knowledge graphs in the biopharmaceutical context at the closing of the #kgc2020 conference. What a great show, thanks so much for organizing this @KGConference! Judging by the burgeoning Slack, it will live on 🙂 pic.twitter.com/yby11ngNzr

— Kees van Bochove (@keesvanbochove) May 7, 2020

Why a Knowledge Graph? Interesting that Paco mentioned the use of @DSDimensions and @CrossrefOrg open data, as well as @orcid @ResearchOrgs – if heard him correctly then I'm happy to see my two worlds merging @force11rescomm and @KGConference #kgconf https://t.co/NMsaI3MTCu pic.twitter.com/0Cnauubx5g

— violeta 🕊 (@azraiekv) May 6, 2020

But there were also new domains, like personal health (e.g. the deck from Jim Hendler).

Again, amazing work from @flatlandagency capturing the workshop on Personal Health Knowledge Graph at #kgconf @KGConference. Thank you to the organizers: Ching-Hua Chen, Amar Das, Ying Ding, Deborah McGuinness, PhD, Oshani Seneviratne, and Mohammed J Zaki. https://t.co/OKLW9oCZH8 pic.twitter.com/fjF6DzIvlo

— violeta 🕊 (@azraiekv) May 5, 2020

The two I liked the most were on law and real estate. David Kamien from Mind Alliance talked about how knowledge graphs, in combination with NLP, can specifically help law firms, for example by automatically suggesting new business development opportunities based on analysis of court dockets.

Ron Bekkerman‘s talk on the real estate knowledge graph they’ve constructed at Cherre was the most eye-opening to me. Technically, it was cool in that they are applying geometric deep learning to perform entity resolution and build a massive graph of real estate. I had been at an academic workshop on this topic only ~2 weeks prior. But from a business perspective, their fundamental asset is the cleaned data in the form of a knowledge graph. It’s not just data but reliable, connected data. Really one to watch.

To wrap up, the intellectual history of knowledge graphs is long (see John Sowa’s slides and knowledgegraph.today), but I think it’s nice to see that we are at a stage where this technology is being deployed at scale in practice, which brings additional research challenges for folks like me.

Part of the Knowledge Graph of the Knowledge Graph Conference:

[Image: excerpt of the conference’s knowledge graph]


Source: Think Links

Posted in Paul Groth, Staff Blogs


An excellent talk by @TPlasterer @AstraZeneca on #FAIR data and knowledge graphs in the biopharmaceutical context at the closing of the #kgc2020 conference. What a great show, thanks so much for organizing this @KGConference! Judging by the burgeoning Slack, it will live on 🙂 pic.twitter.com/yby11ngNzr

— Kees van Bochove (@keesvanbochove) May 7, 2020

Why a Knowledge Graph? Interesting that Paco mentioned the use of @DSDimensions and @CrossrefOrg open data, as well as @orcid @ResearchOrgs – if heard him correctly then I'm happy to see my two worlds merging @force11rescomm and @KGConference #kgconf https://t.co/NMsaI3MTCu pic.twitter.com/0Cnauubx5g

— violeta 🕊 (@azraiekv) May 6, 2020

But also new domains like personal health (e.g. deck from Jim Hendler).

Again, amazing work from @flatlandagency capturing the workshop on Personal Health Knowledge Graph at #kgconf @KGConference. Thank you to the organizers: Ching-Hua Chen, Amar Das, Ying Ding, Deborah McGuinness, PhD, Oshani Seneviratne, and Mohammed J Zaki. https://t.co/OKLW9oCZH8 pic.twitter.com/fjF6DzIvlo

— violeta 🕊 (@azraiekv) May 5, 2020

The two I liked the most were on law and real estate.  David Kamien fromMind Alliance talked about how knowledge graphs in combination with NLP can specifically help law firms for example by automatically suggesting new business development opportunities by analyzing court dockets.

Ron Bekkerman‘s talk on the real estate knowledge graph that they’ve constructed at Cherre was the most eye opening to me. Technically, it was cool in that are applying geometric deep learning to perform entity resolution to build a massive graph of real estate. I had been at another academic workshop on this only a ~2 weeks prior. But from a business sense, their fundamental asset is that the cleaned data in the form of a knowledge graph. It’s not just data but reliable connected data. Really one to watch.

To wrap-up, the intellectual history of knowledge graphs is long (ee John Sowa’s slides and knowledgegraph.today) but I think it’s nice to see that we are at stage where this technology is being deployed at scale in practice, which brings additional research challenges for folks like me.

Part of the Knowledge Graph of the Knowledge Graph Conference:

kgc.png

Random Notes

 

Source: Think Links

Posted in Paul Groth, Staff Blogs

(Virtual) Trip Report: KGC 2020

Last week, I virtually attended the Knowledge Graph Conference 2020. Originally, KGC was planned to be hosted in New York at Columbia University but, as with everything, had to go online because of the pandemic.

Before getting to the content, I wanted to talk about logistics.  Kudos to Francois Scharffe and the team for putting this conference online quickly and running it so smoothly. Just thinking of all the small things – for example, as a speaker I was asked to do a dry run with the organizers and get comments back for how the presentation went on Zoom. The conference Slack workspace was booming with tons of different challenges. The organizers had a nice cadence of talk announcements while boosting conversation by pushing the Q/A session onto Slack. This meant that the conversations could continue beyond each individual session. At the meta level, they managed to get the intensity of a conference online through the amount of effort in curating those Slack channels along with the rapid fire pace of the talks over the two main track days. Personally, I somehow found this more tiring than F2F because somehow Zoom presentations require full focus to ingest. Additionally, there’s this temptation to do both the conference and your normal workday when the event is in another time zone….which… err.. I might have been guilty of. I also did have some hallway conversations on Slack but not as much as I normally would in a F2F setting.

But what’s the conference about? KGC started last year with the idea of having an application and business oriented event focused on knowledge graphs. I would summarize  the aim is to bring people together to talk about knowledge graph technology in action, see the newest commercially ready tech and get a glimpse of future tech. The conference has the same flavor of Connected Data London . As a researcher, I really enjoy seeing the impact these technologies are having in a myriad of domains.

So what was I doing there? I was talking about Knowledge Graph Maintenance (slides) – how do we integrate machine learning techniques and the work of people to not only create but maintain knowledge graphs. Here’s my talk summarized in one picture:

EXp_fpiWAAAs9GC.jpeg

My goal is to get  organizations who are adopting knowledge graphs to think not only about one-of creation but think about what goes in to keeping that knowledge up-to-date. I also wanted to give a sketch of the current research we’ve been doing in this direction.

There was a lot of content at this event (which will be available online) so I’ll just call out three things I took away from it.

Human Understandable Data

One of the themes that kept coming up was the use of knowledge graphs to help the data in an organization match the conceptualizations that are used within businesses. Sure we can do this by saying we need to build an ontology or logical model or a semantic dictionary but the fundamental point that was highlighted again and again is that this data-to-business bridge was the purpose of building many knowledge graphs. It was kind of summed up in the following two slides from Michael Grove:

cim.png  logicalmodel.png

This also came through in Ora Lassila’s talk (now at Amazon Neptune) as well as the the tutorial I attended by Juan Sequeda about building Enterprise Knowledge Graphs from Relational Databases. Juan ran through a litany of mapping patterns all trying to bridge from data stored for specific applications to human understandable data. I’m looking forward to seeing this tutorial material available.

The Knowledge Scientist 

Great to see @BethanySehon back at #kgconf with “friendly neighborhood #ontologist” Brian Donohue of @CapitalOne Talk: “Validating Categories using Knowledge Graphs” pic.twitter.com/41jjqls4h3

— Knowledge Graph Conference (@KGConference) May 6, 2020

Given the need to bridge the gap between application data and business level goals, new kinds of knowledge engineering and tools to facilitate that we’re also of interest. Why aren’t existing approaches enough? I think the assumption is that there’s a ton of data that people doing this activity need to deal with.  Both Juan and I discussed the need to recognize these sorts of people – which we call a Knowledge Scientist– and it seemed to resonate or at least the premise behind the term did.

An excellent example of supporting this sort of tools to support knowledge engineering was by Rafael Gonçalves on how Pinterest used WebProtege to update and manage their taxonomy (paper):

pinintereest.png

Likewise, Bryon Jacob discussed about how the first step to getting to a knowledge graph was through the better cataloging of data within the organization. It reminds me of the lesson we learned from linked data – that before we can have knowledge we need to index and catalog the underlying data.  Also, I can never overlook a talk that gives a shoutout to PROV and the need for lineage and provenance 🙂 .

Knowledge Graphs as Data Assets

I really enjoyed seeing all the various kinds of application areas using knowledge graphs. There were early domain adopters  for example in drug discovery and scholarly data that have pushed further in using this technology:

An excellent talk by @TPlasterer @AstraZeneca on #FAIR data and knowledge graphs in the biopharmaceutical context at the closing of the #kgc2020 conference. What a great show, thanks so much for organizing this @KGConference! Judging by the burgeoning Slack, it will live on 🙂 pic.twitter.com/yby11ngNzr

— Kees van Bochove (@keesvanbochove) May 7, 2020

Why a Knowledge Graph? Interesting that Paco mentioned the use of @DSDimensions and @CrossrefOrg open data, as well as @orcid @ResearchOrgs – if heard him correctly then I'm happy to see my two worlds merging @force11rescomm and @KGConference #kgconf https://t.co/NMsaI3MTCu pic.twitter.com/0Cnauubx5g

— violeta 🕊 (@azraiekv) May 6, 2020

But also new domains like personal health (e.g. deck from Jim Hendler).

Again, amazing work from @flatlandagency capturing the workshop on Personal Health Knowledge Graph at #kgconf @KGConference. Thank you to the organizers: Ching-Hua Chen, Amar Das, Ying Ding, Deborah McGuinness, PhD, Oshani Seneviratne, and Mohammed J Zaki. https://t.co/OKLW9oCZH8 pic.twitter.com/fjF6DzIvlo

— violeta 🕊 (@azraiekv) May 5, 2020

The two I liked the most were on law and real estate.  David Kamien fromMind Alliance talked about how knowledge graphs in combination with NLP can specifically help law firms for example by automatically suggesting new business development opportunities by analyzing court dockets.

Ron Bekkerman‘s talk on the real estate knowledge graph that they’ve constructed at Cherre was the most eye opening to me. Technically, it was cool in that are applying geometric deep learning to perform entity resolution to build a massive graph of real estate. I had been at another academic workshop on this only a ~2 weeks prior. But from a business sense, their fundamental asset is that the cleaned data in the form of a knowledge graph. It’s not just data but reliable connected data. Really one to watch.

To wrap-up, the intellectual history of knowledge graphs is long (ee John Sowa’s slides and knowledgegraph.today) but I think it’s nice to see that we are at stage where this technology is being deployed at scale in practice, which brings additional research challenges for folks like me.

Part of the Knowledge Graph of the Knowledge Graph Conference:

kgc.png

Random Notes

 

Source: Think Links

Posted in Paul Groth, Staff Blogs