Key Features
- Apply R to simplify predictive modeling with short and simple code
- Use machine learning to solve problems ranging from small to big data
- Build training and testing datasets from the churn dataset and apply different classification methods
Book Description
The R language is a powerful open source functional programming language. At its core, R is a statistical programming language that provides impressive tools to analyze data and create high-level graphics.
This book covers the basics of R by setting up a user-friendly programming environment and performing data ETL in R. Data exploration examples demonstrate how powerful data visualization and machine learning are in discovering hidden relationships. You will then dive into important machine learning topics, including data classification, regression, clustering, association rule mining, and dimension reduction.
What you will learn
- Create and inspect the transaction dataset, performing association analysis with the Apriori algorithm
- Visualize patterns and associations using a range of graphs and find frequent itemsets using the Eclat algorithm
- Compare the regression methods to discover how each one solves problems
- Predict which users are likely to churn with the classification approach
- Implement the clustering method to segment customer data
- Compress images with the dimension reduction method
- Incorporate R and Hadoop to solve machine learning problems on Big Data
About the Author
Yu-Wei, Chiu (David Chiu) is the founder of LargitData. He previously worked for Trend Micro as a software engineer, where he was responsible for building big data platforms for business intelligence and customer relationship management systems. In addition to being a start-up entrepreneur and data scientist, he specializes in using Spark and Hadoop to process big data and in applying data mining techniques for data analysis.
Table of Contents
- Practical Machine Learning with R
- Data Exploration with RMS Titanic
- R and Statistics
- Understanding Regression Analysis
- Classification (I) – Tree, Lazy, and Probabilistic
- Classification (II) – Neural Network and SVM
- Model Evaluation
- Ensemble Learning
- Clustering
- Association Analysis and Sequence Mining
- Dimension Reduction
- Big Data Analysis (R and Hadoop)