ENHANCING TEXT MINING USING ONTOLOGY BASED SIMILARITY DISTANCE MEASURE

Atiya Kazi,
Priyanka Bandagale

Abstract

Text mining applications generally disregard the side-information contained within a text document, even though it can enhance the overall clustering process. To overcome this deficiency, the proposed algorithm works in two phases. In the first phase, it clusters the data together with the side-information by combining classical partitioning algorithms with probabilistic models, which boosts the efficacy of clustering. The clusters thus generated can also be used as a training model to help solve the classification problem. In the second phase, a similarity-based distance calculation algorithm, which makes use of two shared word spaces from the DISCO ontology, is employed to improve the clustering approach. This pre-clustering technique calculates the similarity between terms using the cosine measure and generates clusters based on a threshold. Including the ontology in the pre-clustering phase, alongside the side-information, yields more coherent clusters.
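The threshold-based pre-clustering idea can be illustrated with a minimal sketch. It is not the authors' implementation: the term vectors below are made-up toy values standing in for vectors that would come from a word space such as DISCO, and the function names, the greedy assignment rule, and the threshold of 0.8 are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention used here: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def threshold_clusters(vectors, threshold):
    """Greedy grouping (an assumed rule, not taken from the paper):
    a term joins the first cluster whose seed term it matches with
    similarity >= threshold, otherwise it starts a new cluster."""
    clusters = []
    for term, vec in vectors.items():
        for cluster in clusters:
            seed = cluster[0]
            if cosine_similarity(vec, vectors[seed]) >= threshold:
                cluster.append(term)
                break
        else:
            clusters.append([term])
    return clusters


# Hypothetical toy vectors; real values would come from the word space.
toy_vectors = {
    "car": [0.9, 0.1, 0.0],
    "truck": [0.8, 0.2, 0.1],
    "apple": [0.0, 0.9, 0.3],
    "pear": [0.1, 0.8, 0.4],
}

clusters = threshold_clusters(toy_vectors, threshold=0.8)
print(clusters)
```

With these toy values the vehicle terms and the fruit terms fall into separate clusters. Raising the threshold produces more, tighter clusters, while lowering it merges loosely related terms.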

How to Cite
Atiya Kazi, & Priyanka Bandagale. (2021). ENHANCING TEXT MINING USING ONTOLOGY BASED SIMILARITY DISTANCE MEASURE. International Journal of Innovations in Engineering Research and Technology, 1-4. https://repo.ijiert.org/index.php/ijiert/article/view/1005