Table of Contents
About the Authors
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: Extracting the Data
Introduction
Client Data
Free Sources
Web Scraping
Recipe 1-1. Collecting Data
Problem
Solution
How It Works
Step 1-1. Log in to the Twitter developer portal
Step 1-2. Execute query in Python
Recipe 1-2. Collecting Data from PDFs
Problem
Solution
How It Works
Step 2-1. Install and import all the necessary libraries
Step 2-2. Extract text from a PDF file
Recipe 1-3. Collecting Data from Word Files
Problem
Solution
How It Works
Step 3-1. Install and import all the necessary libraries
Step 3-2. Extract text from a Word file
Recipe 1-4. Collecting Data from JSON
Problem
Solution
How It Works
Step 4-1. Install and import all the necessary libraries
Step 4-2. Extract text from a JSON file
Recipe 1-5. Collecting Data from HTML
Problem
Solution
How It Works
Step 5-1. Install and import all the necessary libraries
Step 5-2. Fetch the HTML file
Step 5-3. Parse the HTML file
Step 5-4. Extract a tag value
Step 5-5. Extract all instances of a particular tag
Step 5-6. Extract all text from a particular tag
Recipe 1-6. Parsing Text Using Regular Expressions
Problem
Solution
How It Works
Tokenizing
Extracting Email IDs
Replacing Email IDs
Extracting Data from an eBook and Performing regex
Recipe 1-7. Handling Strings
Problem
Solution
How It Works
Replacing Content
Concatenating Two Strings
Searching for a Substring in a String
Recipe 1-8. Scraping Text from the Web
Problem
Solution
How It Works
Step 8-1. Install all the necessary libraries
Step 8-2. Import the libraries
Step 8-3. Identify the URL to extract the data
Step 8-4. Request the URL and download the content using Beautiful Soup
Step 8-5. Understand the website's structure to extract the required information
Step 8-6. Use Beautiful Soup to extract and parse the data from HTML tags
Step 8-7. Convert lists to a data frame and perform an analysis that meets business requirements
Step 8-8. Download the data frame
Chapter 2: Exploring and Processing Text Data
Recipe 2-1. Converting Text Data to Lowercase
Problem
Solution
How It Works
Step 1-1. Read/create the text data
Step 1-2. Execute the lower() function on the text data
Recipe 2-2. Removing Punctuation
Problem
Solution
How It Works
Step 2-1. Read/create the text data
Step 2-2. Execute the replace() function on the text data
Recipe 2-3. Removing Stop Words
Problem
Solution
How It Works
Step 3-1. Read/create the text data
Step 3-2. Remove punctuation from the text data
Recipe 2-4. Standardizing Text
Problem
Solution
How It Works
Step 4-1. Create a custom lookup dictionary
