Automated indexing for full text information retrieval

Proc AMIA Symp. 2000:71-5.

Author

Affiliation

¹ Veterans Affairs Palo Alto Health Care System, Palo Alto, CA, USA.

PMID: 11079847
PMCID: PMC2243910

Abstract

We report our experience with a statistical method for generating sentence-level indexing from identified UMLS concepts, using query and vector-space models. We evaluated the system against a gold standard defined by the consensus markup of two domain experts. UMLS concepts identified in both HTML headings and paragraph text were valuable in proposing markup. Using both sources of concepts, the model proposed the correct set of concepts, in the form of a query prototype, 71% of the time. The correct query prototype was ranked first or second in 79% of cases.
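
The abstract does not give the details of the ranking step, so the following is only an illustrative sketch of how candidate query prototypes might be ranked in a vector-space model. It assumes that concepts from a sentence and its enclosing heading are pooled into a bag-of-concepts vector and compared to each prototype by cosine similarity. The function names, the prototype names, and the concept labels are all hypothetical and are not taken from the paper or from the UMLS.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_prototypes(sentence_concepts, heading_concepts, prototypes):
    """Rank hypothetical query prototypes against the pooled concepts.

    Assumption: heading and paragraph concepts are simply added together;
    the paper's actual weighting scheme is not described in the abstract.
    """
    observed = Counter(sentence_concepts) + Counter(heading_concepts)
    scored = [
        (name, cosine(observed, Counter(concepts)))
        for name, concepts in prototypes.items()
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical prototypes, each defined by the concept types it expects.
prototypes = {
    "treatment-of-disease": ["disease", "treatment"],
    "diagnosis-of-disease": ["disease", "diagnostic procedure"],
}

ranked = rank_prototypes(
    sentence_concepts=["treatment"],  # concepts found in paragraph text
    heading_concepts=["disease"],     # concepts found in the HTML heading
    prototypes=prototypes,
)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

In this toy setup, the heading contributes the "disease" concept that the sentence alone lacks, which is one way to read the finding that both sources of concepts were valuable.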

Publication types

Evaluation Study
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Abstracting and Indexing / methods*
Electronic Data Processing
Information Storage and Retrieval
Statistics as Topic
Unified Medical Language System*