Intro
Preface
Algorithm Competition Era
Why Write
Features of the Book
Target Readers
Contact Us
Acknowledgments
Contents
Part I: Half the Work, Twice the Effect
Chapter 1: Guide to the Competitions
1.1 Competition Platforms
1.1.1 Kaggle
1.1.2 Tianchi
1.1.2.1 Registration
1.1.2.2 Competition System
1.1.2.3 Points
1.1.3 DF
1.1.4 DC
1.1.5 Kesci
1.1.6 JDATA
1.1.7 Corporate Websites
1.2 Competition Procedures
1.2.1 Problem Modeling
1.2.2 Data Exploration
1.2.3 Feature Engineering
1.2.4 Model Training
1.2.5 Model Integration
1.3 Competition Types
1.3.1 Data Types
1.3.2 Task Types
1.3.3 Application Scenarios
1.4 Thinking Exercises
Chapter 2: Problem Modeling
2.1 Understanding the Competition Question
2.1.1 Business Background
2.1.1.1 Go Deep into the Business
2.1.1.2 Be Clear About the Goals
2.1.2 Understanding Data
2.1.3 Evaluation Indicators
2.1.3.1 Classification Indicators
Error Rate and Accuracy
Precision and Recall
F1-score
ROC Curve
AUC
Logarithmic Loss
2.1.3.2 Indicators of Regression
Mean Absolute Error
Mean Squared Error
Root Mean Squared Error
Mean Absolute Percentage Error
2.2 Sample Selection
2.2.1 Main Reasons
2.2.1.1 Too Large Data Set
2.2.1.2 Data Noise
2.2.1.3 Data Redundancy
2.2.1.4 Uneven Distribution of Positive and Negative Samples
2.2.2 Accurate Methods
2.2.3 Application Scenarios
2.3 Offline Evaluation Strategy
2.3.1 Strong Time Sequence Problems
2.3.2 Weak Time Sequence Problems
2.4 Cases in Practice
2.4.1 Understanding the Competition Question
2.4.2 Offline Verification
2.5 Thinking Exercises
Chapter 3: Data Exploration
3.1 Preliminary Data Exploration
3.1.1 Analytical Thinking
3.1.2 Analysis Methods
3.1.3 Purpose Clarification
3.2 Variable Analysis
3.2.1 Univariate Analysis
3.2.1.1 Labels
3.2.1.2 Continuous Type
3.2.1.3 Category Type
3.2.2 Multivariate Analysis
3.3 Model Analysis
3.3.1 Learning Curve
3.3.1.1 Underfitting Learning Curve
3.3.1.2 Overfitting Learning Curve
3.3.2 Feature Importance Analysis
3.3.3 Error Analysis
3.4 Thinking Exercises
Chapter 4: Feature Engineering
4.1 Data Preprocessing
4.1.1 Processing Missing Values
4.1.1.1 Distinguishing Missing Values
4.1.1.2 Processing Method
4.1.2 Dealing with Outliers
4.1.2.1 Looking for Outliers
4.1.2.2 Coping with Outliers
4.1.3 Optimizing Memory
4.2 Feature Transformation
4.2.1 Non-dimensionalization Processing of Continuous Variables
4.2.2 Data Transformation of Continuous Variables
4.2.2.1 log Transformation
4.2.2.2 Discretization of Continuous Variables
4.2.3 Category Feature Transformation
4.2.4 Irregular Feature Transformation
4.3 Feature Extraction
4.3.1 Statistics Features Related to Categories
4.3.1.1 Target Coding
