    Hadoop 2.x Administration Cookbook

    By Gurmukh Singh

    About

    Key Features

    • Become an expert Hadoop administrator and perform the tasks needed to optimize your Hadoop cluster.
    • Import data into and export data from Hive, and use Oozie to manage workflows.
    • Practical recipes to help you plan and secure your Hadoop cluster, and make it highly available.

    Book Description

    Hadoop allows distributed storage and processing of large data sets across clusters of computers. Learning to administer Hadoop is crucial for exploiting its unique features. With this book, you will be able to overcome common problems encountered in Hadoop administration.

    This book begins by laying the foundation, showing the steps to set up a Hadoop cluster and its various nodes. You will learn how to maintain a Hadoop cluster, especially at the HDFS layer, and how to use YARN and MapReduce. You will then explore the durability and high availability of a Hadoop cluster, and learn how to configure and use the Hadoop schedulers for your tasks. You will also get hands-on experience with backup and recovery options and with performance tuning. Finally, you will cover troubleshooting, diagnostics, and best practices in Hadoop administration.

    By the end of this book, you will have a solid understanding of working with Hadoop clusters and will be able to secure and encrypt them and configure auditing for them.

    What you will learn

    • Set up the Hadoop architecture to run a Hadoop cluster smoothly.
    • Maintain a Hadoop cluster on HDFS, YARN and MapReduce.
    • Understand High Availability with ZooKeeper and JournalNode.
    • Configure Flume for data ingestion and Oozie to run various workflows.
    • Tune the Hadoop cluster for optimal performance.
    • Schedule jobs on a Hadoop cluster using the Fair and Capacity schedulers.
    • Secure your cluster and troubleshoot common pain points.

    About the Author

    Gurmukh Singh has been an infrastructure engineer for over 10 years and has worked on big data platforms for the past 5 years. He started his career as a field engineer, setting up leased lines and radio links. He has vast experience in enterprise servers and network design, and in scaling infrastructures and tuning them for performance. He is the founder of a small start-up called Netxillon Technologies, which provides big data training and consultancy. He speaks at various technical meetings and is an active participant in the open source community's activities. He writes at http://linuxaddict.org and maintains his GitHub account at https://github.com/gdhillon.
