Working with Apache Hive

Working with Apache Hive Course Details:

Hive is the de facto standard for data warehousing on Hadoop. This course starts with Hive setup and operations and continues into advanced Hive usage. It also covers performance and execution engines, and ends with a practical workshop.

No classes are currently scheduled for this course.

Call (919) 283-1674 to get a class scheduled online or in your area!

Hive Basics

Defining Hive Tables
SQL Queries over Structured Data
Filtering / Search
Aggregations / Ordering
Partitions
Joins
Text Analytics (Semi-Structured Data)

Hive Advanced

Transformation, Aggregation
Working with Dates, Timestamps, and Arrays
Converting Strings to Date, Time, and Numbers
Create new Attributes, Mathematical Calculations, Windowing Functions
Use Character and String Functions
Binning and Smoothing
Processing JSON Data
Execution Engines (Tez, MR, and Spark)

Impala (for Cloudera track)

Architecture
Impala joins and other SQL specifics

Bonus Project

Students will work in teams to complete this end-to-end workshop
Set up a data warehouse with Hive
Query and analyze data with Hive and Spark

*Please Note: Course Outline is subject to change without notice. Exact course outline will be provided at time of registration.

Join an engaging hands-on learning environment, where you’ll learn:

Hive basics and features
How to process, transform, and manage data
Processing and performance management
How to set up a data warehouse with Hive
Data query and analysis

This course is 50% hands-on labs and 50% lecture, with engaging instruction, demos, group discussions, labs, and project work.

Before attending this course, you should:

Be familiar with SQL
Be able to navigate the Linux command line
Have basic knowledge of command-line Linux editors (vi/nano)

This course is designed for data scientists, software engineers, developers, and administrators.

