Book Chapter Details
Mandatory Fields
Joorabchi, A.; Mahdi, A.E.
2012 October
Knowledge Engineering and Knowledge Management
Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms
Springer Berlin Heidelberg
Berlin Heidelberg
Published
1
Optional Fields
text mining, scientific digital libraries, subject metadata, keyphrase annotation, keyphrase indexing, Wikipedia, genetic algorithms
Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, scientific documents that are manually annotated with keyphrases are in the minority. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which utilizes Wikipedia as a thesaurus for candidate selection from documents’ content and deploys genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. Reported experimental results show that the performance of our method, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised methods.
978-3-642-33875-5
http://link.springer.com/chapter/10.1007/978-3-642-33876-2_6
32
41
10.1007/978-3-642-33876-2_6
Grant Details