I was proud to be part of the Open PHACTS project for three years. The project built a platform for drug discovery that integrates many different kinds of chemical and biological data, currently connecting information about compounds, targets, pathways, diseases and tissues. The platform is still going strong and is now supported by a foundation that is itself supported by its users, including companies such as GSK, Roche, Janssen and Lilly. The foundation is also involved in several projects such as Big Data for Europe.
The project was large and produced many outputs, including numerous publications. I wanted to tell a brief story of Open PHACTS just by categorizing those publications. This will hopefully help people navigate the results of the project. Note that I removed the authors for readability, but do click through to find all the great people who did this work.
Speaks for itself…
- Open PHACTS: semantic interoperability for drug discovery, Drug Discovery Today, Volume 17, Issues 21–22, November 2012, Pages 1188-1198, ISSN 1359-6446, http://dx.doi.org/10.1016/j.drudis.2012.05.016
The information needs of drug discovery scientists: 83 use cases gathered and analyzed, with 20 prioritized use case questions as the result.
- Scientific competency questions as the basis for semantically enriched open pharmacological space development, Drug Discovery Today, Volume 18, Issues 17–18, September 2013, Pages 843-852, ISSN 1359-6446, http://dx.doi.org/10.1016/j.drudis.2013.05.008
Platform design and construction
Semantic technologies are great for integration. How do we make them fast and easy for developers? Leverage APIs.
- Applying Linked Data Approaches to Pharmacology: Architectural Decisions and Implementation, Semantic Web Journal, vol. 5, iss. 2, pp. 101-113, 2014. http://dx.doi.org/10.3233/SW-2012-0088
- API-centric Linked Data integration: The Open PHACTS Discovery Platform case study, Web Semantics: Science, Services and Agents on the World Wide Web, Volume 29, December 2014, Pages 12-18, ISSN 1570-8268, http://dx.doi.org/10.1016/j.websem.2014.03.003
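
To make the "leverage APIs" idea a bit more concrete, here is a minimal sketch of the general pattern: a developer passes simple parameters, and a thin layer turns them into a SPARQL query. This is not the Open PHACTS code. The template, the function name `build_compound_label_query`, and the example URI are all hypothetical and only illustrate the approach.

```python
from string import Template

# Hypothetical query template; the predicate and variable names are
# illustrative assumptions, not taken from the Open PHACTS platform.
COMPOUND_LABEL_QUERY = Template(
    "SELECT ?label WHERE { <$uri> "
    "<http://www.w3.org/2000/01/rdf-schema#label> ?label } LIMIT $limit"
)


def build_compound_label_query(uri: str, limit: int = 10) -> str:
    """Turn simple API-style parameters into a SPARQL query string."""
    # Reject characters that are not allowed inside a SPARQL IRI reference,
    # so a caller cannot break out of the <...> delimiters.
    if any(ch in uri for ch in '<>" {}|\\^`'):
        raise ValueError(f"not a safe IRI: {uri!r}")
    if limit < 1:
        raise ValueError("limit must be positive")
    return COMPOUND_LABEL_QUERY.substitute(uri=uri, limit=limit)


if __name__ == "__main__":
    # The developer never writes SPARQL; they only supply a URI and a limit.
    print(build_compound_label_query("http://example.org/compound/1", limit=5))
```

The point of the design is that the query is written and tuned once, by someone who knows the data, while application developers only see a small set of parameters.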
Applying the platform to do drug discovery
Can the platform do what it says it can do? Yep. 16 of the 20 use case questions could be answered, along with some we hadn't thought of. Plus, there are some cool end-user applications (e.g. the Open PHACTS Explorer and Chembionavigator).
- Drug discovery FAQs: workflows for answering multidomain drug discovery questions, Drug Discovery Today, Volume 20, Issue 4, April 2015, Pages 399-405, ISSN 1359-6446, http://dx.doi.org/10.1016/j.drudis.2014.11.006
- The Application of the Open Pharmacological Concepts Triple Store (Open PHACTS) to Support Drug Discovery Research. PLoS ONE 9(12): e115460. http://dx.doi.org/10.1371/journal.pone.0115460
- Molecular Informatics: Special Issue: Open Innovation in Drug Discovery. Volume 31, Issue 8, p. 517-609, August 2012
Interesting computer science
Along the way we addressed some computer science challenges like: How do we scale up querying over RDF? How do we deal with the multiplicity of mappings? How do we mix commercial, private and public data?
- On the formulation of performant SPARQL queries, Web Semantics: Science, Services and Agents on the World Wide Web, Volume 31, March 2015, Pages 1-26, ISSN 1570-8268, http://dx.doi.org/10.1016/j.websem.2014.11.003
- Scientific Lenses to Support Multiple Views over Linked Chemistry Data, The Semantic Web – ISWC 2014 – 13th International Semantic Web Conference, Riva del Garda, Italy, October 19-23, 2014. Proceedings, Part I, 2014, pp. 98-113. http://dx.doi.org/10.1007/978-3-319-11964-9_7
- NoSQL databases for RDF: an empirical evaluation. The Semantic Web–ISWC 2013. Springer Berlin Heidelberg, 2013. 310-325. http://dx.doi.org/10.1007/978-3-642-41338-4_20
- Incorporating commercial and private data into an open linked data platform for drug discovery. The Semantic Web–ISWC 2013. Springer Berlin Heidelberg, 2013. 65-80. http://dx.doi.org/10.1007/978-3-642-41338-4_5
Supporting Better Data
The project supported data providers in creating and updating RDF versions of their datasets.
- The ChEMBL bioactivity database: an update Nucleic Acids Research 42 (D1): D1083-D1090, 2014. http://dx.doi.org/10.1093/nar/gkt1031
- DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes, Database – The Journal of Biological Databases and Curation, 2015. http://dx.doi.org/10.1093/database/bav028
- Converting neXtProt into Linked Data and nanopublications. Semantic Web 6(2). 2015 http://dx.doi.org/10.3233/SW-140149
- Ontology Work at the Royal Society of Chemistry. figshare. (2014) http://dx.doi.org/10.6084/m9.figshare.964944
Many members of the project worked within a number of communities to develop specifications for dataset description (especially in terms of provenance) and interchange.
- Dataset Descriptions: HCLS Community Profile. W3C Interest Group Note. http://www.w3.org/2001/sw/hcls/notes/hcls-dataset/
- PROV-Overview: An Overview of the PROV Family of Documents. W3C Working Group Note. http://www.w3.org/TR/prov-overview/
- PAV ontology: provenance, authoring and versioning. Journal of Biomedical Semantics, vol 4, iss. 37, November 22, 2013. http://dx.doi.org/10.1186/2041-1480-4-37
- Nanopublications Guidelines. http://nanopub.org/guidelines/working_draft/
Overall, the Open PHACTS project not only delivered a data integration platform for drug discovery but also contributed more interoperable datasets and lessons about how to construct such platforms. I look forward to seeing what happens as the platform continues to be developed and, maybe more importantly, to seeing the impact of the project's results as they diffuse.