Apache Spark: What are some of the benefits of lazy evaluation of operations in Apache Spark?

Expert Solution
Step 1: Apache Spark
  • Spark was started by Matei Zaharia in 2009 at the University of California, Berkeley.
  • It was open-sourced a year later, in 2010. The project was then donated to the Apache Software Foundation in 2013.
  • In 2014 it became a top-level Apache project. In the same year Databricks, the company co-founded by Matei Zaharia, set a new world record in large-scale sorting using Spark.
  • By 2015 it had attracted more than a thousand contributors and had become one of the most active open-source projects in big data. The project has continued to grow since.
  • The basic data structure of Spark is the Resilient Distributed Dataset (RDD). An RDD is an immutable, distributed collection of objects, which can be Scala, Python, or Java objects. Each RDD is divided into logical partitions, which can be computed on different nodes of the cluster.
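
The RDD properties listed above (immutable, split into partitions) are what make lazy evaluation workable: a transformation only needs to return a new description of the dataset, and nothing has to run until a result is actually requested. The following is a minimal sketch of that idea in plain Python. It is not the Spark API; `ToyRDD`, `parallelize`, `map`, `filter`, and `collect` here are hypothetical stand-ins that only borrow Spark's names, and the partitions are processed in a simple loop rather than on separate nodes.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Tuple


@dataclass(frozen=True)  # frozen: instances cannot be modified after creation
class ToyRDD:
    """Hypothetical toy model of an RDD. NOT the Spark API."""
    partitions: Tuple[Tuple[int, ...], ...]               # logical partitions
    ops: Tuple[Callable[[Iterator], Iterator], ...] = ()  # recorded, not yet run

    @staticmethod
    def parallelize(data, num_partitions: int) -> "ToyRDD":
        items = list(data)
        # Assumption: a simple round-robin split into partitions.
        parts = tuple(tuple(items[i::num_partitions]) for i in range(num_partitions))
        return ToyRDD(parts)

    def map(self, f) -> "ToyRDD":
        # Transformation: returns a NEW ToyRDD with one more recorded step.
        return ToyRDD(self.partitions, self.ops + (lambda it: (f(x) for x in it),))

    def filter(self, pred) -> "ToyRDD":
        # Transformation: also lazy, also returns a new object.
        return ToyRDD(self.partitions, self.ops + (lambda it: (x for x in it if pred(x)),))

    def collect(self) -> List[int]:
        # Action: only now are the recorded steps applied, partition by partition.
        out: List[int] = []
        for part in self.partitions:  # in a real cluster, each could run on its own node
            it: Iterator = iter(part)
            for op in self.ops:
                it = op(it)
            out.extend(it)
        return out


calls: List[int] = []


def square(x: int) -> int:
    calls.append(x)  # record every real invocation so laziness is observable
    return x * x


rdd = ToyRDD.parallelize(range(6), 2)
evens = rdd.map(square).filter(lambda x: x % 2 == 0)
calls_before_action = len(calls)  # nothing has been computed yet
result = evens.collect()          # the action triggers the whole pipeline
```

In this sketch `square` is not called until `collect()` runs, and the original `rdd` still has no recorded steps afterwards. Because the whole chain is known before anything executes, the map and filter steps run in a single pass over each partition with no intermediate collection stored between them, which is one commonly cited benefit of lazy evaluation.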