Advanced Partitioning

Discover ways to partition file-based and column-based datasets to compute incrementally based on time-based or discrete partitions. Includes the Partition Redispatch feature.

About this course

In Advanced Partitioning, advanced users will find out what makes partitioning unique and valuable in Dataiku DSS. You will get hands-on experience partitioning file-based and column-based datasets in a Flow so that they can be computed incrementally over time-based partitions (e.g., by purchase date) or discrete partitions (e.g., by merchant subsector such as gas, internet, or insurance). You will also learn more about the Dataiku DSS Partition Redispatch feature.

Learning Objectives

At the end of this course, you will be better acquainted with:

1 - When to use partitioning in a Flow

2 - What a partition dependency is

3 - Partition Redispatch scenarios such as "non-partitioned to partitioned" and "partitioned to non-partitioned"

4 - Operationalization: Partitioning in a scenario

Course Properties

Course Title

Advanced Partitioning

Target Audience

Advanced users who want to know more about partitioning in a Flow, including in scenarios

Access Level

Free / included with registration

Estimated Time for Completion

79 Minutes

Completion Criteria

Pass the course checkpoint with 80% 

Supplemental Materials (Y/N)

NONE

Knowledge Prerequisite(s)

NONE

Technical Prerequisite(s)

NONE

Begin by watching the Course Overview: Advanced Partitioning video!

Curriculum: 79 min

  • Course Overview: Advanced Partitioning 1 min
  • What You'll Need 1 min
  • Advanced Partitioning Concepts
  • Tips: Partitioning 1 min
  • Concept: Partitioning 3 min
  • Concept: How Partitioning Adds Value 2 min
  • Concept: Partitioned Datasets 5 min
  • Concept: Running Jobs with Partitioned Datasets 8 min
  • Concept: Redispatching and Collecting Partitions 3 min
  • Hands-On: Advanced Partitioning: File-Based Using Partition Redispatch 10 min
  • Hands-On: Advanced Partitioning: Column-Based (SQL-Based) 10 min
  • Quiz: Advanced Partitioning Concepts 5 min
  • Partitioning in a Scenario
  • Concept: Partitioning in a Scenario 4 min
  • Hands-On: Partitioning in a Scenario 10 min
  • Quiz: Partitioning in a Scenario 2 min
  • Wrap Up: Advanced Partitioning
  • Course Checkpoint: Advanced Partitioning 9 min
  • Tell Us What You Think: Advanced Partitioning 1 min
  • Course Complete: Advanced Partitioning 1 min
