PySpark SQL Recipes : with HiveQL, Dataframe and Graphframes / Raju Kumar Mishra and Sundar Rajan Raman.

Mishra, Raju Kumar; Raman, Sundar Rajan.

2019

QA76.73.P98

Linked e-resources

Concurrent users

Unlimited

Authorized users

Document Delivery Supplied

Can lend chapters, not whole ebooks

Details

Title

PySpark SQL Recipes : with HiveQL, Dataframe and Graphframes / Raju Kumar Mishra and Sundar Rajan Raman.

Author

Mishra, Raju Kumar, author.

ISBN

9781484243350 (electronic book)
1484243358 (electronic book)
9781484243343

Published

[Place of publication not identified] : Springer Nature : Apress, 2019.

Language

English

Description

1 online resource.

Call Number

QA76.73.P98

Dewey Decimal Classification

005.13/3

Access Note

Access limited to authorized users.

Source of Description

Online resource; title from PDF title page (viewed March 26, 2019).

Added Author

Raman, Sundar Rajan.

Intro; Table of Contents; About the Authors; About the Technical Reviewer; Acknowledgments; Introduction; Chapter 1: Introduction to PySpark SQL; Introduction to Big Data; Volume; Velocity; Variety; Veracity; Introduction to Hadoop; Introduction to HDFS; Introduction to MapReduce; Introduction to Apache Hive; Introduction to Apache Pig; Introduction to Apache Kafka; Producer; Broker; Consumer; Introduction to Apache Spark; PySpark SQL: An Introduction; Introduction to DataFrames; SparkSession; Structured Streaming; Catalyst Optimizer; Introduction to Cluster Managers

Standalone Cluster Manager; Apache Mesos Cluster Manager; YARN Cluster Manager; Introduction to PostgreSQL; Introduction to MongoDB; Introduction to Cassandra; Chapter 2: Installation; Recipe 2-1. Install Hadoop on a Single Machine; Problem; Solution; How It Works; Step 2-1-1. Creating a New CentOS User; Step 2-1-2. Adding a CentOS user to sudo; Step 2-1-3. Installing Java; Step 2-1-4. Creating Password-Less Logging from pysparksqlbook; Step 2-1-5. Downloading Hadoop; Step 2-1-6. Moving Hadoop Binaries to the Installation Directory; Step 2-1-7. Modifying the Hadoop Environment File

Step 2-1-8. Modifying the Hadoop Properties Files; Step 2-1-9. Updating the .bashrc File; Step 2-1-10. Running the Namenode Format; Step 2-1-11. Starting Hadoop; Step 2-1-12. Checking the Installation of Hadoop; Step 2-1-13. Stopping the Hadoop Processes; Recipe 2-2. Install Spark on a Single Machine; Problem; Solution; How It Works; Step 2-2-1. Downloading Apache Spark; Step 2-2-2. Extracting the .tgz File of Spark; Step 2-2-3. Moving the Extracted Spark Directory to /allBigData; Step 2-2-4. Changing the Spark Environment File; Step 2-2-5. Amending the .bashrc File

Step 2-2-6. Starting the PySpark Shell; Recipe 2-3. Use the PySpark Shell; Problem; Solution; How It Works; Recipe 2-4. Install Hive on a Single Machine; Problem; Solution; How It Works; Step 2-4-1. Downloading Hive; Step 2-4-2. Extracting Hive; Step 2-4-3. Moving the Extracted Hive Directory; Step 2-4-4. Updating hive-site.xml; Step 2-4-5. Updating the .bashrc File; Step 2-4-6. Creating Datawarehouse Directories of Hive; Step 2-4-7. Initiating the Metastore Database; Step 2-4-8. Checking the Hive Installation; Recipe 2-5. Install PostgreSQL; Problem; Solution; How It Works

Step 2-5-1. Installing PostgreSQL; Step 2-5-2. Initializing the Database; Step 2-5-3. Enabling and Starting the Database; Recipe 2-6. Configure the Hive Metastore on PostgreSQL; Problem; Solution; How It Works; Step 2-6-1. Downloading the PostgreSQL JDBC Connector; Step 2-6-2. Copying the JDBC Connector to the Hive lib Directory; Step 2-6-3. Connecting to PostgreSQL; Step 2-6-4. Creating the Required User and Database; Step 2-6-5. Populating Data in the pymetastore Database; Step 2-6-6. Granting Permissions; Step 2-6-7. Changing the pg_hba.conf File; Step 2-6-8. Testing Our User
