Knowledge Graph Datasets

程序员文章站 2022-06-12 19:51:01

DBpedia

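URL: https://wiki.dbpedia.org/
Overview:
DBpedia is a distinctive example of a Semantic Web application. It extracts structured data from Wikipedia articles to strengthen Wikipedia's search capabilities and to link other datasets to Wikipedia. These semantic techniques have opened up many novel and interesting uses of Wikipedia's vast body of information, such as mobile versions, map integration, faceted search, relationship queries, and document classification and annotation. DBpedia is also one of the world's largest multi-domain knowledge ontologies and is part of Linked Data. The US technology outlet ReadWriteWeb named DBpedia among the best Semantic Web applications of 2009.

The DBpedia 2014 release contains more than 4.58 million entities, including 1,445,000 people, 735,000 places, 123,000 music albums, 87,000 films, 19,000 video games, 241,000 organizations, 251,000 species, and 6,000 diseases. Its data is used by the BBC, Reuters, and The New York Times, and is also indexed by search engines such as Google and Yahoo.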

References:

Yago

URL: https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/
YAGO (Yet Another Great Ontology) is an open-source knowledge base developed at the Max Planck Institute for Informatics in Saarbrücken. It is automatically extracted from Wikipedia and other sources.

As of 2012, YAGO3 has knowledge of more than 10 million entities and contains more than 120 million facts about these entities. The information in YAGO is extracted from Wikipedia (e.g., categories, redirects, infoboxes), WordNet (e.g., synsets, hyponymy), and GeoNames. The accuracy of YAGO was manually evaluated to be above 95% on a sample of facts. To integrate it into the linked data cloud, YAGO has been linked to the DBpedia ontology and to the SUMO ontology.

YAGO3 is provided in Turtle and TSV formats. Dumps of the whole database are available, as well as thematic and specialized dumps. It can also be queried through various online browsers and through a SPARQL endpoint hosted by OpenLink Software. The source code of YAGO3 is available on GitHub.
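As a rough illustration of working with a tab-separated dump, the sketch below reads fact rows into (subject, predicate, object) tuples. The three-column layout, the sample rows, and the `Fact` and `parse_facts` names are assumptions made for illustration only. The actual YAGO3 files may carry additional columns (for example fact identifiers), so the column positions should be checked against a real dump before use.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple


class Fact(NamedTuple):
    """One triple from a TSV dump (hypothetical three-column layout)."""
    subject: str
    predicate: str
    obj: str


def parse_facts(lines: Iterable[str]) -> Iterator[Fact]:
    """Yield one Fact per tab-separated row, skipping blank or short rows."""
    # QUOTE_NONE keeps quotation marks inside literals untouched.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue
        yield Fact(row[0], row[1], row[2])


# Invented sample rows, not taken from an actual YAGO dump.
SAMPLE = (
    "<Albert_Einstein>\t<wasBornIn>\t<Ulm>\n"
    "\n"
    "<Ulm>\t<isLocatedIn>\t<Germany>\n"
)

facts = list(parse_facts(io.StringIO(SAMPLE)))
for fact in facts:
    print(fact.subject, fact.predicate, fact.obj)
```

Because `parse_facts` is a generator, it can be pointed at an open file handle and will stream a large dump row by row instead of loading it into memory.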

YAGO has been used in the Watson artificial intelligence system.

PDD

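URL: http://pdd.wangmengsd.com/

Summary (translated from Chinese):
PDD, short for Patient-Disease-Drug, is a medical dataset that captures the links between patients, diseases, and drugs.

English description:
What is the PDD Graph (Patient-Disease-Drug Graph)?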

Electronic medical records (EMRs) contain multi-format medical data that hold an abundance of medical knowledge. Faced with patients' symptoms, experienced caregivers make the right medical decisions based on professional knowledge that accurately captures the relationships among symptoms, diagnoses, and treatments. We aim to capture these relationships by constructing a large, high-quality heterogeneous graph linking patients, diseases, and drugs (PDD) in EMRs.

Specifically, we extract important medical entities from MIMIC-III (Medical Information Mart for Intensive Care III) and automatically link them with existing biomedical knowledge graphs, including the ICD-9 ontology and DrugBank. The resulting PDD graph is accessible on the Web via a SPARQL endpoint and provides a pathway for medical discovery and applications, such as effective treatment recommendations.
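A SPARQL endpoint is typically queried over HTTP by passing the query text in a `query` parameter and asking for the standard SPARQL JSON results format. The sketch below shows that general pattern. The endpoint URL, the query, the helper names, and the sample response are all placeholders invented for illustration; they are not the actual PDD endpoint address or schema, which would need to be taken from the project site.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; replace with the address published by the dataset.
ENDPOINT = "http://example.org/sparql"

# Generic query that does not assume any particular vocabulary.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a GET request in the style of the SPARQL protocol."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


def rows_from_results(payload: dict) -> list:
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    variables = payload["head"]["vars"]
    return [
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in payload["results"]["bindings"]
    ]


def run_query(endpoint: str, query: str) -> list:
    """Send the query over the network and return the flattened rows."""
    with urllib.request.urlopen(build_request(endpoint, query), timeout=30) as resp:
        return rows_from_results(json.load(resp))


# Invented response in the SPARQL JSON results layout, used offline.
SAMPLE_RESPONSE = {
    "head": {"vars": ["s", "p", "o"]},
    "results": {
        "bindings": [
            {
                "s": {"type": "uri", "value": "http://example.org/patient/1"},
                "p": {"type": "uri", "value": "http://example.org/diagnosedWith"},
                "o": {"type": "uri", "value": "http://example.org/disease/42"},
            }
        ]
    },
}

rows = rows_from_results(SAMPLE_RESPONSE)
print(rows)
```

`run_query` is defined but not called here, so the example runs without network access. Once a real endpoint address is known, calling `run_query(ENDPOINT, QUERY)` would be the natural next step.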

Reference:

@inproceedings{DBLP:conf/semweb/WangZLHWLL17,
  author    = {Meng Wang and
               Jiaheng Zhang and
               Jun Liu and
               Wei Hu and
               Sen Wang and
               Xue Li and
               Wenqiang Liu},
  editor    = {Claudia d'Amato and
               Miriam Fern{\'{a}}ndez and
               Valentina A. M. Tamma and
               Freddy L{\'{e}}cu{\'{e}} and
               Philippe Cudr{\'{e}}{-}Mauroux and
               Juan F. Sequeda and
               Christoph Lange and
               Jeff Heflin},
  title     = {{PDD} Graph: Bridging Electronic Medical Records and Biomedical Knowledge
               Graphs via Entity Linking},
  booktitle = {The Semantic Web - {ISWC} 2017 - 16th International Semantic Web Conference,
               Vienna, Austria, October 21-25, 2017, Proceedings, Part {II}},
  series    = {Lecture Notes in Computer Science},
  volume    = {10588},
  pages     = {219--227},
  publisher = {Springer},
  year      = {2017},
  url       = {https://doi.org/10.1007/978-3-319-68204-4\_23},
  doi       = {10.1007/978-3-319-68204-4\_23},
  timestamp = {Tue, 14 May 2019 10:00:53 +0200},
  biburl    = {https://dblp.org/rec/conf/semweb/WangZLHWLL17.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}