Working with Dates

One processor in the Prepare recipe that deserves special discussion is the Parse date processor.

Working with dates poses a number of challenges. There are many date formats and different timezones, and components like “day of the week” can be difficult to extract. Luckily, Dataiku DSS provides a number of helpful utilities for working with dates.

This brief tutorial demonstrates how to parse and manipulate dates in DSS.

Prerequisites

It is not strictly required, but we recommend all newcomers to Dataiku DSS begin with the Foundational learning materials.

Parsing strings into dates

The first challenge when working with dates is to convert the date column into a machine-readable date storage type or format.

In a Prepare recipe (or Visual Analysis), DSS automatically detects string columns whose values look like dates and suggests parsing them.

Here the column order_date is stored as a string. DSS recognizes it is likely a date, and so suggests parsing it.

"Parse date suggested action"

After initiating the Parse date processor, the Smart date window appears and suggests the most likely date formats. You can see in green and red the relative number of valid and invalid examples, as calculated on the sample. You can also enter a custom format if necessary.

"Select date format for date column being parsed"
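To make the valid/invalid counts more concrete, here is a minimal sketch in plain Python of how candidate formats could be scored against a sample. It is not how DSS implements Smart date; the sample values, the candidate list, and the helper count_valid are all hypothetical, and the formats use Python strptime syntax rather than the pattern syntax DSS displays.

```python
from datetime import datetime

# Hypothetical sample of an order_date column stored as strings.
sample = ["2017-03-01", "2017-03-15", "15/03/2017", "2017-04-02", "not a date"]

# Hypothetical candidate formats, written as Python strptime patterns.
candidates = ["%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y"]


def count_valid(values, fmt):
    """Return (valid, invalid) counts when parsing values with fmt."""
    valid = 0
    for value in values:
        try:
            datetime.strptime(value, fmt)
            valid += 1
        except ValueError:
            # The value does not match this format.
            pass
    return valid, len(values) - valid


# Score every candidate on the sample, then keep the one with most valid values.
scores = {fmt: count_valid(sample, fmt) for fmt in candidates}
best_format = max(scores, key=lambda fmt: scores[fmt][0])

for fmt, (valid, invalid) in scores.items():
    print(f"{fmt}: {valid} valid, {invalid} invalid")
print("Most likely format:", best_format)
```

As in the Smart date window, the counts reflect only the sample, so a format that looks perfect here may still fail on rows outside it.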

After choosing a format, a new step is added in the Script, and a new column appears in the dataset.

Note below that while the column order_date is stored as a string, the new column order_date_parsed is stored as a date.

"Parsed date column"
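The effect of the step can be pictured outside DSS with a short Python sketch: the original string column is kept, and a new column holds real date values. The rows, the helper parse_or_none, and the choice to store None for unparseable values are assumptions made for illustration, not a description of DSS internals.

```python
from datetime import datetime

# Hypothetical rows; order_date is stored as a string.
rows = [
    {"order_date": "2017-03-01"},
    {"order_date": "2017-03-15"},
    {"order_date": "not a date"},
]


def parse_or_none(value, fmt="%Y-%m-%d"):
    """Parse value with fmt; return None when it does not match (an assumption)."""
    try:
        return datetime.strptime(value, fmt).date()
    except ValueError:
        return None


# Add a new column next to the original instead of overwriting it.
for row in rows:
    row["order_date_parsed"] = parse_or_none(row["order_date"])

for row in rows:
    print(row["order_date"], "->", repr(row["order_date_parsed"]))
```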

Manipulating dates

With a properly parsed date column, DSS can suggest new operations from the processor library, specific to dates, such as:

  • Extracting date elements: year, month, day, day of week, week of year, etc.
  • Computing time since a fixed date, another column, or today.
  • Flagging holidays.

"Context menu options for parsed date"
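For readers who think in code, the three kinds of operations listed above can be sketched with Python's standard library. This only illustrates the ideas and does not use the DSS processors themselves; the example date, the reference date, and the HOLIDAYS set are hypothetical.

```python
from datetime import date

# Hypothetical parsed value from an order_date_parsed column.
order_date = date(2017, 7, 4)

# Extracting date elements.
year, month, day = order_date.year, order_date.month, order_date.day
day_of_week = order_date.isoweekday()       # ISO numbering: 1 = Monday ... 7 = Sunday
week_of_year = order_date.isocalendar()[1]  # ISO week number

# Computing time since a fixed date and since today.
reference = date(2017, 1, 1)
days_since_reference = (order_date - reference).days
days_since_order = (date.today() - order_date).days

# Flagging holidays against a hypothetical, hand-written calendar.
HOLIDAYS = {date(2017, 1, 1), date(2017, 7, 4), date(2017, 12, 25)}
is_holiday = order_date in HOLIDAYS

print(year, month, day, day_of_week, week_of_year)
print(days_since_reference, is_holiday)
```

A real holiday calendar depends on the country, which is why the sketch hard-codes a tiny set instead of guessing at one.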

What’s Next?

For more information on managing dates in DSS, please see the reference documentation.