Table of Contents
Foreword; Preface; Contents; Part I Fundamentals of Big Data Processing; Big Data Storage and Data Models; 1 Storage Models; 1.1 Block-Based Storage; 1.2 File-Based Storage; 1.3 Object-Based Storage; 1.4 Comparison of Storage Models; 2 Data Models; 2.1 NoSQL (Not only SQL); 2.2 Relational-Based; 2.3 Summary of Data Models; References; Big Data Programming Models; 1 MapReduce; 1.1 Features; 1.2 Examples; 2 Functional Programming; 2.1 Features; 2.2 Example Frameworks; 3 SQL-Like; 3.1 Features; 3.2 Examples; 4 Actor Model; 4.1 Features; 4.2 Examples; 5 Statistical and Analytical; 5.1 Features
5.2 Examples; 6 Dataflow-Based; 6.1 Features; 6.2 Examples; 7 Bulk Synchronous Parallel; 7.1 Features; 7.2 Examples; 8 High Level DSL; 8.1 Pig Latin; 8.2 Crunch/FlumeJava; 8.3 Cascading; 8.4 Dryad LINQ; 8.5 Trident; 8.6 Green Marl; 8.7 Asterix Query Language (AQL); 8.8 IBM Jaql; 9 Discussion and Conclusion; References; Programming Platforms for Big Data Analysis; 1 Introduction; 2 Requirements of Big Data Programming Support; 3 Classification of Programming Platforms; 3.1 Data Source; 3.2 Processing Technique; 4 Major Existing Programming Platforms; 4.1 Data Parallel Programming Platforms
4.2 Graph Parallel Programming Platforms; 4.3 Task Parallel Platforms; 4.4 Stream Processing Programming Platforms; 5 A Unifying Framework; 5.1 Comparison of Existing Programming Platforms; 5.2 Need for Unifying Framework; 5.3 MatrixMap Framework; 6 Conclusion and Future Directions; References; Big Data Analysis on Clouds; 1 Introduction; 2 Introducing Cloud Computing; 2.1 Basic Concepts; 2.2 Cloud Service Distribution and Deployment Models; 3 Cloud Solutions for Big Data; 3.1 Microsoft Azure; 3.2 Amazon Web Services; 3.3 OpenNebula; 3.4 OpenStack; 4 Systems for Big Data Analytics in the Cloud
4.1 MapReduce; 4.2 Spark; 4.3 Mahout; 4.4 Hunk; 4.5 Sector/Sphere; 4.6 BigML; 4.7 Kognitio Analytical Platform; 4.8 Data Analysis Workflows; 4.9 NoSQL Models for Data Analytics; 4.10 Visual Analytics; 4.11 Big Data Funding Projects; 4.12 Historical Review; 4.13 Summary; 5 Research Trends; 6 Conclusions; References; Data Organization and Curation in Big Data; 1 Big Data Indexing Techniques; 1.1 Overview; 1.2 Record-Level Non-adaptive Indexing; 1.3 Record-Level Adaptive Indexing; 1.4 Split-Level Indexing; 1.5 Hadoop-RDBMS Hybrid Indexing; 2 Data Organization and Layout Techniques; 2.1 Overview
2.2 Result Materialization and Caching Techniques; 2.3 Pre-processing and Colocation Techniques; 2.4 None Row-Oriented Storage Layouts; 3 Non-traditional Workloads in Big Data; 3.1 Overview; 3.2 Techniques for Recurring Workloads; 3.3 Techniques for Fast Online Analytics; 4 Curation and Metadata Management in Big Data; 4.1 Overview; 4.2 Execution-Centric Metadata Approach; 4.3 Provenance-Centric Metadata Approach; 4.4 Data-Centric Metadata Approach; 5 Conclusion; References; Big Data Query Engines; 1 Introduction; 1.1 MPP Query Engines; 1.2 Hadoop Query Engines; 1.3 Chapter Organization