About This Book
- Benchmark and profile R programs to solve performance bottlenecks
- Combine the ease of use and flexibility of R with the power of big data tools
- Filled with practical techniques and useful code examples to process large data sets more efficiently
Who This Book Is For
This book is for programmers and developers who want to improve the performance of their R programs, whether by making them run faster on large data sets or by solving a pesky performance problem.
What You Will Learn
- Benchmark and profile R programs to solve performance bottlenecks (see the sketch after this list)
- Understand how CPU, memory, and disk input/output constraints can limit the performance of R programs
- Optimize R code to run faster and use less memory
- Use compiled code in R and other languages such as C to speed up computations
- Harness the power of GPUs for computational speed
- Process data sets that are larger than memory using disk-based memory and chunking
- Tap into the capacity of multiple CPUs using parallel computing
- Leverage the power of advanced database systems and big data tools from within R
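As a taste of the first point, here is a minimal sketch of the benchmarking and profiling workflow: a hypothetical slow_sum() loop (not code from the book) is timed with system.time() and sampled with Rprof() from base R, and its elapsed time is compared with the built-in vectorized sum().

```r
# Hypothetical example: an element-by-element loop, deliberately slow
slow_sum <- function(x) {
  total <- 0
  for (value in x) {
    total <- total + value
  }
  total
}

x <- runif(1e6)

# Benchmark: compare the elapsed time of the loop and the vectorized sum()
system.time(slow_sum(x))
system.time(sum(x))

# Profile: sample the call stack while the code runs to see where time is spent
Rprof("profile.out")
for (i in 1:50) slow_sum(x)
Rprof(NULL)
summaryRprof("profile.out")
```

The same pattern, measure first, then inspect the profile, applies to any R function you suspect of being a bottleneck.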
In Detail
As businesses and scientific fields work with ever-larger amounts of data, R provides an easy and powerful way to analyze and process it. It is one of the most popular tools today for data exploration, statistical analysis, and modeling, and it can generate useful insights and discoveries from large amounts of data.
Through this practical and varied guide, you will become equipped to solve a range of performance problems in R programming. You will learn how to profile and benchmark R programs, identify bottlenecks, assess performance limitations imposed by the CPU, memory, and disk input/output, and optimize the computational speed of your programs with techniques such as vectorizing computations. You will then move on to more advanced techniques, such as compiling code, tapping into the computing power of GPUs, optimizing memory consumption, and handling larger-than-memory data sets using disk-based memory and chunking.
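The vectorization trick mentioned above can be illustrated with a small, hypothetical sketch: an explicit element-by-element loop is compared with the equivalent vectorized expression, whose loop runs in compiled code inside R itself.

```r
x <- runif(1e6)

# Hypothetical example: square each element with an explicit loop
squares_loop <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) {
    out[i] <- x[i]^2   # one element per iteration, interpreted by R
  }
  out
}

system.time(squares_loop(x))   # interpreted loop: relatively slow
system.time(x^2)               # vectorized: the loop runs in compiled C code
```

The book builds on this kind of measurement-driven comparison before introducing the more advanced techniques listed above.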