Semantic Indexing with the Biomedical Citation Network
Advisor Information
Dario Ghersi
Location
MBSC 201
Presentation Type
Poster
Start Date
6-3-2020 2:00 PM
End Date
6-3-2020 3:15 PM
Abstract
PubMed contains over 30 million biomedical literature citations and is an invaluable resource for researchers, medical professionals, students, and curious individuals. Search and retrieval are greatly enhanced by PubMed’s Medical Subject Headings (MeSH) indexing process, which still requires a substantial manual component. Traditional machine learning methods are difficult to apply effectively to large-scale semantic indexing problems, and this difficulty has impeded complete automation of MeSH indexing. PubMed citations are particularly challenging to index: documents are often indexed with a dozen or more terms, and most terms occur extremely infrequently in the document set. This project examines the biomedical literature citation network and the MeSH vocabulary for viable signal that might benefit the indexing process.