Every year, large amounts of data are generated that need to be stored and analyzed. Apache Spark allows you to process such big data. The real power and value proposition of Apache Spark lie in its speed and in its platform for executing data science tasks. What sets Spark apart is that it combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualization, allowing data scientists to tackle the complexities that come with raw, unstructured data sets. Spark also aims to ease the transition from working on a single machine to working on a cluster, which makes data science tasks a lot more agile. So, if you're interested in learning big data processing and executing data science tasks efficiently, this Learning Path is for you.
Packt's Video Learning Path is a series of individual video products put together in a logical, stepwise manner such that each video builds on the skills learned in the video before it.
The highlights of this Learning Path are:
- Explore the Apache Spark architecture and delve into its API and key features
- Implement efficient big data processing
- Write code that is maintainable and easy to test
- Explore various facets of data science with Spark
- Get up and running with Apache Spark and clean, analyze, and visualize data with ease
Let's take a quick look at your learning journey. This Learning Path starts off by explaining the basics of the Spark API and its architecture in detail. You will then learn about data mining and data cleaning. You will also learn to analyze data by writing actual jobs. Next, you will learn the steps needed to build machine learning applications. You will also explore machine learning algorithms and different machine learning techniques. Further, you will learn to collect, clean, and visualize data coming from Twitter with Spark Streaming. Finally, you will understand how to perform analysis, including graph processing.
By the end of this Learning Path, you will be able to carry out your data science tasks in a highly visual way that is comprehensive and appealing to business and other stakeholders.
Meet Your Experts: We have the best works of the following esteemed authors to ensure that your learning journey is smooth:
Tomasz Lelek is a software engineer who programs mostly in Java and Scala. He is a fan of microservices architecture and functional programming. He has recently delved into big data technologies such as Apache Spark and Hadoop.
Eric Charles has 10 years of experience in the field of data science and is the founder of Datalayer, a social network for data scientists. He is passionate about using software and mathematics to help companies get insights from data. His typical day includes building efficient processing with advanced machine learning algorithms, easy SQL, streaming, and graph analytics. He also focuses a lot on visualization and result sharing. He is an enthusiast of open-source technologies and an active Apache Member.