ISSN: 0973-7510

E-ISSN: 2581-690X

P. Venkateshkumar and A. Subramani
KSR College of Engineering, Tiruchengode, India.
J Pure Appl Microbiol. 2015;9(Spl. Edn. Aug.):87-96
© The Author(s). 2015
Received: 18/02/2015 | Accepted: 03/05/2015 | Published: 31/08/2015
Abstract

The growing use of digital medical documents, and their sharing through Web services, has greatly increased the size of document collections and the burden on users searching for relevant documents. Many tools, such as query-based retrieval and browsing, are available for finding a document of interest. Document clustering is widely used for efficient Information Retrieval (IR) and data mining applications. Traditional methods use a ‘bag of words’ approach to find the documents relevant to a query. However, the high dimensionality of document features and the ambiguity of natural language call for concept-based search rather than bag of words. Ranking the features and expanding the concepts of the ranked features helps make data retrieval and mining more efficient. This work proposes a Honey Bee Mating optimization with k-Means clustering (HBM-KM) algorithm for optimal clustering of documents. The proposed technique performs better than the Hierarchical Agglomerative Clustering (HAC) and k-means algorithms.
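
For orientation, the bag-of-words k-means baseline mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the HBM-KM method itself: the function names (`bag_of_words`, `cosine`, `centroid`, `kmeans`), the use of cosine similarity, the fixed `seeds` argument, and the four toy documents are all assumptions made for the example and do not come from the paper.

```python
from collections import Counter
import math

def bag_of_words(text):
    """Map a document to a term-frequency Counter (lowercased, whitespace tokens)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Mean term-frequency vector of a non-empty list of vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})

def kmeans(docs, seeds, iterations=10):
    """Plain k-means over bag-of-words vectors.

    `seeds` (hypothetical parameter) lists the indices of the documents
    used as initial centroids, which keeps the example deterministic.
    """
    vectors = [bag_of_words(d) for d in docs]
    centroids = [vectors[i] for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each document joins its most similar centroid.
        labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        # Update step: recompute each centroid from its current members.
        for k in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == k]
            if members:
                centroids[k] = centroid(members)
    return labels

# Toy documents invented for illustration only.
docs = [
    "insulin therapy for diabetes patients",
    "diabetes patients and insulin dosage",
    "cardiac surgery recovery outcomes",
    "outcomes after cardiac surgery",
]
labels = kmeans(docs, seeds=[0, 2])
print(labels)
```

In this sketch the result depends on the chosen initial centroids, and documents are matched only on shared surface words, which illustrates the limitation that motivates concept-based search.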

Keywords

Document Clustering, Concept Expansion, Hierarchical Agglomerative Clustering (HAC), k-Means clustering, Honey Bee Mating algorithm (HBM)

© The Author(s) 2015. Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License which permits unrestricted use, sharing, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.