University of Leicester

The Human Phenotype Ontology: Semantic Unification of Common and Rare Disease.

journal contribution
Posted on 2017-01-23, 10:22. Authored by T. Groza, S. Köhler, D. Moldenhauer, N. Vasilevsky, G. Baynam, T. Zemojtel, L. M. Schriml, W. A. Kibbe, P. N. Schofield, Tim Beck, D. Vasant, Anthony J. Brookes, A. Zankl, N. L. Washington, C. J. Mungall, S. E. Lewis, M. A. Haendel, H. Parkinson, P. N. Robinson
The Human Phenotype Ontology (HPO) is widely used in the rare disease community for differential diagnostics, phenotype-driven analysis of next-generation sequence-variation data, and translational research, but a comparable resource has not been available for common disease. Here, we have developed a concept-recognition procedure that analyzes the frequencies of HPO disease annotations as identified in over five million PubMed abstracts by employing an iterative procedure to optimize precision and recall of the identified terms. We derived disease models for 3,145 common human diseases comprising a total of 132,006 HPO annotations. The HPO now comprises over 250,000 phenotypic annotations for over 10,000 rare and common diseases and can be used for examining the phenotypic overlap among common diseases that share risk alleles, as well as between Mendelian diseases and common diseases linked by genomic location. The annotations, as well as the HPO itself, are freely available.


This work was supported by the Bundesministerium für Bildung und Forschung (project 0313911), the European Commission Seventh Framework Programme (FP7; grant 602300; SYBIL project), the Raine Clinician Research Fellowship (20140101), and the National Health and Medical Research Council of Australia (grant APP1055319, which is partnered with FP7 grant 305444). Oregon Health and Science University acknowledges the support of grant 1R24OD011883-01 from the NIH Office of the Director. T.G. was supported by an Australian Research Council Discovery Early Career Researcher Award (DE120100508). D.V. was supported in part by the BioMedBridges project funded by Research Infrastructures of the FP7 (grant 284209). H.P. was supported by European Molecular Biology Laboratory Core Funds. This work was supported by the director, Basic Energy Sciences, Office of Science, US Department of Energy under contract DE-AC02-05CH11231 and NIH contract 1R24OD011883-01. This document was prepared as an account of work sponsored by the US Government.



American Journal of Human Genetics, 2015, 97 (1), pp. 111-124

Author affiliation

College of Medicine, Biological Sciences and Psychology / MBSP Non-Medical Departments / Department of Genetics


  • VoR (Version of Record)

Published in

American Journal of Human Genetics


Elsevier (Cell Press)






Supplemental Data include 7 figures and 41 tables and can be found with this article online at


