Apache Iceberg Fundamentals Training Course

Apache Iceberg is an open-source table format designed for large-scale datasets, bringing the reliability and simplicity of SQL tables to big data environments. It was created to address the challenges of managing big data in data lakes, which often involve handling complex schemas, large files, and diverse data sources.

This instructor-led, live training (available online or onsite) is aimed at beginner-level data professionals who want to use Apache Iceberg to manage large-scale datasets, ensure data integrity, and optimize data processing workflows.

By the end of this training, participants will be able to:

Explain Apache Iceberg's architecture, features, and benefits.
Describe table formats, partitioning, schema evolution, and time travel capabilities.
Install and configure Apache Iceberg in different environments.
Create, manage, and manipulate Iceberg tables.
Describe the process of migrating data from other table formats to Iceberg.

Course Structure

Interactive lectures and discussions.
Numerous exercises and practical sessions.
Hands-on implementation in a live-lab environment.

Customization Options

To request a customized version of this course, please contact us.

This course is available as onsite live training in Kenya or online live training.

Course Outline

Introduction to Apache Iceberg

Overview of Apache Iceberg
Importance and use cases in modern data architecture
Key features and benefits

Core Concepts

Iceberg table format and architecture
Comparison with other table formats
Partitioning and schema evolution
Time travel and data versioning

Setting Up Apache Iceberg

Installation and configuration
Integrating Iceberg with various data processing engines
Setting up an Iceberg environment on a local machine

Basic Operations

Creating and managing Iceberg tables
Writing to and reading from Iceberg tables
Basic CRUD operations

Data Migration and Integration

Migrating data from Hive and other systems to Iceberg
Integration with BI tools
Migrating a sample dataset to Iceberg

Optimizing Performance

Performance tuning techniques
Optimizing queries and data scans
Performance optimization in Iceberg

Overview of Advanced Features

Partition evolution and hidden partitioning
Table evolution and schema changes
Time travel and rollback features
Implementing advanced features in Iceberg

Summary and Next Steps

Requirements

Familiarity with concepts such as tables, schemas, partitions, and data ingestion
Basic knowledge of SQL

Target Audience

Data engineers
Data architects
Data analysts
Software developers

Duration: 14 hours

Need help picking the right course?
southafrica@nobleprog.co.za or +27 (0)10 005 5793

Testimonials (2)

A journey through the Spark world: a very intense course. DSL, spark sql, partitioning vs bucketing for me.

Georgiana Elisabeta

Course - Apache Spark Fundamentals

Hands on exercises. Class should have been 5 days, but the 3 days helped to clear up a lot of questions that I had from working with NiFi already

James - BHG Financial

Course - Apache NiFi for Administrators

Related Courses

Advanced Apache Iceberg

Big Data Analytics with Google Colab and Apache Spark

Big Data Business Intelligence for Govt. Agencies

A Practical Introduction to Data Analysis and Big Data - 3 Days

Big Data and Advanced Analytics

Big Data Business Intelligence for Criminal Intelligence Analysis

Apache NiFi for Administrators

PySpark and Machine Learning

Apache Spark Fundamentals

Administration of Apache Spark

Apache Spark in the Cloud

Python and Spark for Big Data (PySpark)

Python, Spark, and Hadoop for Big Data

Stratio: Rocket and Intelligence Modules with PySpark

Related Categories

Big Data
