FUZZY DOCUMENT REPRESENTATION FOR SEARCH DIVERSIFICATION
Keywords:
BoW, SVD, FCM
Abstract
Fuzzy document representation involves transforming unstructured data into numerical vectors. Such a representation is well suited to text classification and document clustering. The proposed Fuzzy Conceptualization Model (FCM) performs conceptualization and provides a better data representation model based on the semantic relatedness and similarity between terms in a word corpus. Word embeddings are used to group semantically related words into concept clusters. The concept clusters are inferred and vectorized for the given corpus so that the data is held in a multidimensional space. FCM determines the fuzzy membership value of a base term by calculating the affinity score between its word embedding and the other word embeddings. A weighting scheme is used to distinguish between exact and approximate matches: it considers the normalized term frequency of a term in the specified concept cluster along with its actual presence. The greatest bound for the distribution of the base set over the documents gives the best-matched documents for a search query. The resultant matrix gives a lower-dimensional and discriminative representation of the data.
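The membership and weighting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the affinity function, the cluster threshold, or the exact weighting formula, so cosine similarity, a threshold of 0.8, and an additive bonus for exact matches are all assumptions made here for illustration. The toy embeddings and the function names (`membership`, `concept_cluster`, `document_weight`) are hypothetical.

```python
import math

# Toy 3-dimensional embeddings; the values are invented for illustration only.
EMBEDDINGS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.8, 0.2, 0.1],
    "truck": [0.7, 0.3, 0.2],
    "banana": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity, used here as the (assumed) affinity score."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def membership(base, word):
    """Fuzzy membership of `word` in the concept of `base`, clipped to [0, 1]."""
    affinity = cosine(EMBEDDINGS[base], EMBEDDINGS[word])
    return min(1.0, max(0.0, affinity))


def concept_cluster(base, threshold=0.8):
    """Words whose membership in the base term's concept meets the threshold."""
    cluster = {}
    for word in EMBEDDINGS:
        mu = membership(base, word)
        if mu >= threshold:
            cluster[word] = mu
    return cluster


def document_weight(base, tokens, threshold=0.8):
    """Weight of a tokenized document for a base term.

    Assumed scheme: membership-weighted normalized term frequency of the
    cluster terms, plus a bonus of 1.0 when the base term itself is present,
    so that exact matches rank above approximate ones.
    """
    if not tokens:
        return 0.0
    cluster = concept_cluster(base, threshold)
    approx = sum(mu * tokens.count(w) / len(tokens) for w, mu in cluster.items())
    exact_bonus = 1.0 if base in tokens else 0.0
    return exact_bonus + approx


if __name__ == "__main__":
    docs = {
        "exact": ["car", "for", "sale"],
        "approximate": ["automobile", "for", "sale"],
        "unrelated": ["banana", "for", "sale"],
    }
    for name, tokens in docs.items():
        print(name, round(document_weight("car", tokens), 3))
```

Under these assumptions, a document containing the base term scores higher than one containing only a related cluster term, which in turn scores higher than a document with no cluster terms at all.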
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
