Details

At a Glance; Contents; About the Author; About the Technical Reviewer; Acknowledgments; Chapter 1: MapReduce and Its Abstractions; Small Data Processing; Relational Database Management Systems; Data Warehouse Systems; Parallel Computing; GFS and MapReduce; Apache Hadoop; HDFS; MapReduce; Writing a Map Class; Writing a Reduce Class; Writing a main Class; Running a MapReduce Program; YARN; Benefits; Use Cases; Problems with MapReduce; Cascading; Benefits; Use Cases; Apache Hive; Benefits; Use Cases; Apache Pig; Pig vs. Other Tools; MapReduce; Cascading

Apache Hive; Use Cases; Pig Philosophy; Pigs Eat Anything; Pigs Live Anywhere; Pigs Are Domestic Animals; Pigs Fly; Summary; Chapter 2: Data Types; Simple Data Types; int; long; float; double; chararray; boolean; bytearray; datetime; biginteger; bigdecimal; Summary of Simple Data Types; Complex Data Types; map; tuple; bag; Summary of Complex Data Types; Schema; Casting; Casting Error; Comparison Operators; Identifiers; Boolean Operators; Summary; Chapter 3: Grunt; Invoking the Grunt Shell; Commands; The fs Command; The sh Command; Utility Commands; help

history; quit; kill; set; clear; exec; run; Summary of Commands; Auto-completion; Summary; Chapter 4: Pig Latin Fundamentals; Running Pig Latin Code; Grunt Shell; Pig -e; Pig -f; Embed Pig Code in a Java Program; Hue; Pig Operators and Commands; Load; RegEx in the File Path; store; dump; version; Foreach Generate; Projection; Flatten; Using Functions; New Schema; Nested Block; filter; null; Boolean Operators; Comparison Operators; Limit; Assert; SPLIT; SAMPLE; FLATTEN; Tuple Example; Bag Example; import; define; distinct

Choosing the Number of Reduce Tasks; Using the MapReduce Partitioner; RANK; Union; ORDER BY; Choosing Number of Reduce Tasks; GROUP; Using the Partitioner; Choosing Number of Reducers; Avoiding a Reduce Task; Stream; Using Unix Commands; Using a Shell Program; MAPREDUCE; CUBE; CUBE; ROLLUP; Parameter Substitution; -param; -paramfile; Summary; Chapter 5: Joins and Functions; Join Operators; Equi Joins; Inner Joins; Outer Joins; Left Outer Join; Right Outer Join; Full Outer Join; cogroup; CROSS; Functions; String Functions; UPPER; LOWER; TRIM; REPLACE

STRSPLIT; UniqueID; SUBSTRING; Mathematical Functions; FLOOR; CEIL; ROUND; RANDOM; ABS; Date Functions; CurrentTime; GetDay; DAYSBETWEEN; TODATE; TOUNIXTIME; EVAL Functions; AVG; MIN; COUNT; BagToString; Complex Data Type Functions; TOTUPLE; TOBAG; TOMAP; TOP; Load/Store Functions; JsonLoader/JsonStorage; PigStorage; TextLoader; HbaseStorage; Storing Data into HBase; OrcStorage; Loading Data; Summary; Chapter 6: Creating and Scheduling Workflows Using Apache Oozie; Types of Oozie Jobs; Workflow; job.properties; workflow.xml
