DIAGNOSING VIRTUALIZED HADOOP PERFORMANCE FROM BENCHMARK RESULTS: AN EXPLORATORY STUDY
Abstract
This article explores the role of virtualization technologies in Hadoop deployments. Hadoop has become a common platform that businesses use to improve performance through the analysis of large data sets. Hadoop gains from virtualization in several ways, including increased resource availability and cluster stability. Customers also request virtual resources such as CPU, RAM, and disks from service providers (e.g. Amazon) and pay for them on a pay-as-you-go basis [1]. These advantages mean little to users, however, if moving from a physical to a virtual platform causes unreasonable performance loss. According to existing research on virtualized Hadoop performance, inappropriate network and storage settings in open-source virtualization implementations result in significant performance degradation. Because of the complexity of the hardware and applications involved, including virtualization configurations and deployment scales, performance tuning remains extremely difficult in practice [1]. To bridge this gap, this paper proposes a performance diagnostic approach that combines statistical analysis from several levels, together with a heuristic performance diagnostic tool that tests the reliability and accuracy of virtualized Hadoop by analyzing job traces from common big data benchmarks. Using the insights this tool provides, users can quickly identify the bottleneck, validate the assessment with performance tools provided by the guest OS and hypervisor, and keep optimizing virtualized Hadoop performance by running the tool repeatedly.
In general, administrators use virtualization systems to maximize resource use while lowering operational costs. Virtualization systems fall into two classes. The first is heavyweight virtualization, which is based on virtual machines (VMs) [2]. Each VM emulates hardware and runs its own operating system (OS), entirely independent of the host OS. The second is lightweight virtualization, which is based on containers. Containers share the host OS kernel while maintaining isolation [2]. This paper examines the efficiency of Hadoop software running on virtualization technologies.
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Rights of Authors
Authors retain the following rights:
1. copyright and other proprietary rights relating to the article, such as patent rights;
2. the right to use the substance of the article in future works, including lectures and books;
3. the right to reproduce the article for their own purposes, provided the copies are not offered for sale;
4. the right to self-archive the article.