EVALUATING CASSANDRA, MONGO DB LIKE NOSQL DATASETS USING HADOOP STREAMING

Authors

  • Dahatonde Varsha Sukhdev, Department of Computer Engineering, G.H. Raisoni College of Engineering, Chas, Ahmednagar, India
  • Ashish Kumar, Department of Computer Engineering, G.H. Raisoni College of Engineering, Chas, Ahmednagar, India

Keywords:

Unstructured data, schema-less database, secondary indexes, denormalization

Abstract

Unstructured data poses challenges for storage. Experts estimate that 80 to 90 percent of the data in any organization is unstructured, and the amount of unstructured data in enterprises is growing significantly, often many times faster than structured databases. Structured data exists in table format, i.e., with a proper schema, whereas unstructured data is schema-less, which directly signifies the importance of the NoSQL storage model and the MapReduce platform. In the existing system, unstructured data is processed from a Cassandra dataset. In the present system, MongoDB is to be implemented alongside the Cassandra dataset, as MongoDB provides a flexible data model and a large number of options for querying unstructured data. Cassandra users, in contrast, model their data so as to minimize the total number of queries through more careful planning and denormalization. Cassandra offers basic secondary indexes, but for the best performance it is recommended to model the data so that they are used infrequently. So to process

Published

2021-03-27

Section

Articles