FUZZY DOCUMENT REPRESENTATION FOR SEARCH DIVERSIFICATION

SIJIN P
DR.CHAMPA H.N.

Abstract

Fuzzy document representation transforms unstructured text into numerical vectors, a form that is more useful for text classification and document clustering. The proposed Fuzzy Conceptualization Model (FCM) performs conceptualization and provides a better data representation model based on the semantic relatedness and similarity between terms in a word corpus. Word embeddings are used to group semantically related words into concept clusters. The concept clusters are inferred and vectorized for the given corpus so that the data is held in a multidimensional space. FCM determines the fuzzy membership value of a base term by calculating the affinity score between its word embedding and the other word embeddings. A weighting scheme distinguishes exact matches from approximate ones by considering the normalized term frequency of a term in the specified concept cluster along with its actual presence. The greatest bound for the distribution of the base set over the documents gives the best-matched documents for a search query. The resulting matrix gives a lower-dimensional and discriminative representation of the data.
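
The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea, not the authors' FCM. It assumes cosine similarity as the affinity score, clips negative similarities to zero to obtain a membership value in [0, 1], and weights each term by its normalized term frequency. The embeddings, the function names (`cosine`, `membership`, `represent`), and the base terms are hypothetical and chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical 3-dimensional word embeddings, invented for illustration only.
EMBEDDINGS = {
    "car": [1.0, 0.0, 0.0],
    "auto": [0.8, 0.6, 0.0],
    "bank": [0.0, 0.0, 1.0],
    "loan": [0.0, 0.6, 0.8],
}


def cosine(u, v):
    """Cosine similarity, used here as an assumed affinity score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def membership(term, base_term):
    """Fuzzy membership of `term` in the concept cluster of `base_term`.

    An exact match (term == base_term) gets membership 1; an approximate
    match gets a value in [0, 1) that shrinks as the embeddings diverge.
    Terms without an embedding get membership 0.
    """
    if term not in EMBEDDINGS or base_term not in EMBEDDINGS:
        return 0.0
    return max(0.0, cosine(EMBEDDINGS[term], EMBEDDINGS[base_term]))


def represent(tokens, base_terms):
    """Map a tokenized document to one weight per concept (base term).

    Each weight sums membership * normalized term frequency over the
    document's terms, so the vector has len(base_terms) dimensions
    instead of one dimension per vocabulary word.
    """
    counts = Counter(tokens)
    total = len(tokens)
    vector = []
    for base in base_terms:
        score = 0.0
        for term, count in counts.items():
            score += membership(term, base) * count / total
        vector.append(score)
    return vector


doc = ["auto", "loan", "auto", "car"]
doc_vector = represent(doc, ["car", "bank"])
```

Here the four-word vocabulary collapses to two concept dimensions. The document scores higher on the "car" concept because it contains both the exact term and a frequent approximate match ("auto"), while "loan" contributes only a partial weight to the "bank" concept.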

Article Details

How to Cite
SIJIN P, & DR. CHAMPA H.N. (2021). FUZZY DOCUMENT REPRESENTATION FOR SEARCH DIVERSIFICATION. International Journal of Innovations in Engineering Research and Technology, 5(12), 1-9. https://repo.ijiert.org/index.php/ijiert/article/view/1564
Section
Articles