Advanced Analytics with Spark

Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills
Apache Spark is emerging as one of the most popular technologies for performing analytics on huge datasets, and this practical guide shows you how to harness Spark’s power for approaching a variety of analytics problems. You’ll learn how to apply common techniques, such as classification, clustering, collaborative filtering, anomaly detection, dimensionality reduction, and Monte Carlo simulation, to fields such as genomics, security, and finance.

Advanced Analytics with Spark supplies complete implementations that analyze large public datasets, and acts as an introduction to using these techniques and other best practices in Spark programming.

- Become familiar with the Spark programming model and ecosystem
- Learn general approaches in data science
- Discover which machine learning tools make sense for particular problems
- Acquire code from GitHub that can be adapted to many uses

This book will interest both data science professionals and aspiring data scientists, students studying learning techniques for analyzing large datasets, and scientists interested in using Spark as a research tool.


