Beginning Apache Spark using Azure databricks : unleashing large cluster analytics in the cloud / Robert Ilijason.

Ilijason, Robert.

doi:10.1007/978-1-4842-5

2020

QA76.585

Available Online

Linked e-resources

Linked Resource

Online Access

Concurrent users

Unlimited

Authorized users

Document Delivery Supplied

Can lend chapters, not whole ebooks

Details

Title

Beginning Apache Spark using Azure databricks : unleashing large cluster analytics in the cloud / Robert Ilijason.

Author

Ilijason, Robert.

ISBN

9781484257814 (electronic book)
1484257812 (electronic book)
1484257804
9781484257807

DOI

https://doi.org/10.1007/978-1-4842-5

Publication Details

[United States] : Apress, 2020.

Language

English

Description

1 online resource

Item Number

10.1007/978-1-4842-5

Call Number

QA76.585

Dewey Decimal Classification

004.67/82

Summary

Analyze vast amounts of data in record time using Apache Spark with Databricks in the cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incrementally faster.

This book explains how the confluence of these pivotal technologies gives you enormous power, and cheaply, when it comes to huge datasets. You will begin by learning how cloud infrastructure makes it possible to scale your code to large numbers of processing units, without having to pay for the machinery in advance. From there you will learn how Apache Spark, an open source framework, can enable all those CPUs for data analytics use. Finally, you will see how services such as Databricks provide the power of Apache Spark, without you having to know anything about configuring hardware or software. By removing the need for expensive experts and hardware, your resources can instead be allocated to actually finding business value in the data.

This book guides you through some advanced topics such as analytics in the cloud, data lakes, data ingestion, architecture, machine learning, and tools, including Apache Spark, Apache Hadoop, Apache Hive, Python, and SQL. Valuable exercises help reinforce what you have learned.

What You Will Learn

Discover the value of big data analytics that leverage the power of the cloud
Get started with Databricks using SQL and Python in either Microsoft Azure or AWS
Understand the underlying technology, and how the cloud and Spark fit into the bigger picture
See how these tools are used in the real world
Run basic analytics, including machine learning, on billions of rows at a fraction of the cost, or for free

This book is for data engineers, data scientists, and cloud architects who want or need to run advanced analytics in the cloud. It is assumed that the reader has data experience, but perhaps minimal exposure to Apache Spark and Azure Databricks. The book is also recommended for people who want to get started in the analytics field, as it provides a strong foundation.

Robert Ilijason is a 20-year veteran in the business intelligence (BI) segment. He has worked as a contractor for some of Europe's biggest companies and has conducted large-scale analytics projects within the areas of retail, telecom, banking, government, and more. Robert has seen his share of analytic trends come and go over the years, but unlike most of them, he strongly believes that Apache Spark in the cloud, especially with Azure Databricks, is a game changer.

Note

Includes index.

Access Note

Access limited to authorized users.

Available in Other Form

Print version: 9781484257807

Linked Resources

Online Access

Record Appears in

Online Resources > Ebooks
All Resources

Table of Contents

Chapter 1: Introduction to Large-Scale Data Analytics
Chapter 2: Spark and Databricks
Chapter 3: Getting Started with Databricks
Chapter 4: Workspaces, Clusters, and Notebooks
Chapter 5: Getting Data into Databricks
Chapter 6: Querying Data Using SQL
Chapter 7: The Power of Python
Chapter 8: ETL and Advanced Data Wrangling
Chapter 9: Connecting to and from Afar
Chapter 10: Running in Production
Chapter 11: Bits and Pieces.
