Key Features
- An advanced guide combining instructions and practical examples to help you extend the most up-to-date Spark functionality.
- Extend your data processing capabilities to process huge chunks of data in minimal time using advanced Spark concepts.
- Master the art of real-time processing with the help of Apache Spark 2.0
Book Description
Apache Spark is an in-memory, cluster-based parallel processing system that provides a wide range of functionality, including graph processing, machine learning, stream processing, and SQL. This book aims to take your knowledge of Spark to the next level by teaching you how to expand its functionality.
The book commences with an overview of the Spark ecosystem. You will learn how Hive can be configured and used on Spark to provide real-time SQL processing. The book will introduce you to Project Tungsten, and you will understand how memory management and binary processing, cache-aware computation, and code generation are used to speed things up dramatically. The book goes on to show how to incorporate H2O and Deeplearning4j for machine learning, Titan for graph-based storage, and Databricks and Jupyter Notebooks for cloud-based Spark. Over the course of the book, you will learn about the latest enhancements in Apache Spark 2.0, such as interactive querying of live data and the unification of DataFrames and Datasets.
You will also learn about updates to the Accumulator API and the DataFrame-based ML API. You will learn to use Spark as a compiler, understand how to implement Structured Streaming, and thus explore how easy it is to use Spark in day-to-day tasks.
What you will learn
- Examine clustering and classification using MLlib
- Create a schema in Spark SQL, and learn how a Spark schema can be populated with data (see the sketch after this list)
- Study Spark based graph processing using Spark GraphX
- Combine Spark with H2O and Deeplearning4j, and understand why this combination is useful
- Evaluate how graph storage works with Apache Spark, Titan, HBase and Cassandra
- Use Apache Spark in the cloud with Databricks, Jupyter Notebooks, Docker and OpenStack
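To give a flavour of the Spark SQL schema topic listed above, here is a minimal sketch (not taken from the book) of defining an explicit schema and populating it with data; the column names and sample rows are illustrative assumptions.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object SchemaSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("schema-sketch")
      .master("local[*]")          // local mode, just for this example
      .getOrCreate()

    // Explicit schema: two columns, name (string) and age (int)
    val schema = StructType(Seq(
      StructField("name", StringType, nullable = false),
      StructField("age", IntegerType, nullable = true)
    ))

    // Populate the schema with a handful of illustrative rows
    val rows = spark.sparkContext.parallelize(Seq(
      Row("Alice", 34),
      Row("Bob", 29)
    ))
    val df = spark.createDataFrame(rows, schema)

    // Register the DataFrame and query it with Spark SQL
    df.createOrReplaceTempView("people")
    spark.sql("SELECT name FROM people WHERE age > 30").show()

    spark.stop()
  }
}
```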
About the Author
Romeo Kienzler is the Chief Data Scientist of the IBM Watson IoT Division and works as an Advisory Architect, helping clients worldwide solve their data analysis problems.
https://www.linkedin.com/in/romeo-kienzler-089b4557
https://www.packtpub.com/big-data-and-business-intelligence/learning-data-mining-r-video
He holds an M.Sc. in Information Systems, Bioinformatics, and Applied Statistics from the Swiss Federal Institute of Technology. He works as an Associate Professor for data mining at a Swiss university, and his current research focus is on cloud-scale data mining using open source technologies, including R, Apache Spark, SystemML, Apache Flink, and Deeplearning4j. He also contributes to various open source projects. Additionally, he is currently writing a chapter on Hyperledger for a book on blockchain technologies.
http://dataconomy.com/where-life-science-and-data-science-meet-interview-with-romeo-kienzler-of-ibm/
Romeo has spoken at O'Reilly's Velocity conference.
http://conferences.oreilly.com/velocity/devops-web-performance-eu-2015/public/schedule/speaker/219260
http://www.meetup.com/Big-Data-Developers-in-Berlin/events/227744512/