Working with Dates
One processor in the Prepare recipe that deserves special discussion is the Parse date processor.
Working with dates poses a number of challenges. There are many date formats, different timezones, and components like “day of the week” which can be difficult to extract. Luckily, Dataiku DSS provides a number of helpful utilities for working with dates.
This brief tutorial demonstrates how to parse and manipulate dates in DSS.
It is not strictly required, but we recommend all newcomers to Dataiku DSS begin with the Foundational learning materials.
Parsing strings into dates
The first challenge when working with dates is to convert the date column into a machine-readable date storage type or format.
In a Prepare recipe (or a Visual Analysis), DSS automatically suggests parsing string columns that look like dates.
Here the column order_date is stored as a string. DSS recognizes it is likely a date, and so suggests parsing it.
After initiating the Parse date processor, the Smart date window appears and suggests the most likely date formats. You can see in green and red the relative number of valid and invalid examples, as calculated on the sample. You can also enter a custom format if necessary.
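To make the idea of valid and invalid examples concrete, here is a minimal Python sketch that counts how many sample values parse under each candidate format. It is only an illustration of the principle, not how DSS implements Smart date. The sample values, the candidate formats, and the helper `count_valid` are all hypothetical, and the `%Y-%m-%d` style directives are Python's own notation, which may differ from the format patterns shown in the Smart date window.

```python
from datetime import datetime

# Hypothetical sample values and candidate formats, for illustration only.
sample = ["2017-03-01", "2017-03-15", "2017-04-02", "03/05/2017"]
candidates = ["%Y-%m-%d", "%m/%d/%Y", "%d/%m/%Y"]


def count_valid(values, fmt):
    """Return how many values parse successfully with the given format."""
    valid = 0
    for value in values:
        try:
            datetime.strptime(value, fmt)
            valid += 1
        except ValueError:
            # The value does not match this format: count it as invalid.
            pass
    return valid


# Number of valid examples per candidate format, computed on the sample.
scores = {fmt: count_valid(sample, fmt) for fmt in candidates}

# The most likely format is the one that parses the most sample values.
best_format = max(scores, key=scores.get)
```

Because the counts come from a sample, a format that looks fully valid there can still fail on rows outside the sample.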
After choosing a format, a new step is added in the Script, and a new column appears in the dataset.
Note below that while the column order_date is stored as a string, the new column order_date_parsed is stored as a date.
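The difference between the two storage types can be sketched in plain Python. This is a hedged illustration rather than DSS code: the rows and the `%Y-%m-%d` format are assumptions, and only the column names come from the example above.

```python
from datetime import datetime

# Hypothetical rows; order_date holds plain strings.
rows = [{"order_date": "2017-03-01"}, {"order_date": "2017-03-15"}]

for row in rows:
    # Keep the original string and add a parsed copy next to it.
    row["order_date_parsed"] = datetime.strptime(row["order_date"], "%Y-%m-%d")

# The original column is still text; the new column holds real date values.
original_type = type(rows[0]["order_date"]).__name__
parsed_type = type(rows[0]["order_date_parsed"]).__name__
```

Only the parsed column supports date arithmetic and element extraction; the string column can only be compared as text.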
With a properly parsed date column, DSS can suggest new operations from the processor library, specific to dates, such as:
- Extracting date elements: year, month, day, day of week, week of year, etc.
- Computing the time elapsed since a fixed date, another column, or today.
- Flagging holidays.
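The three kinds of operations listed above can be sketched with Python's standard `datetime` module. This is an illustrative sketch, not the DSS processors themselves: the example date, the reference date, and the hand-written `holidays` set are hypothetical, and the week number follows the ISO 8601 convention, which may not match the convention DSS uses.

```python
from datetime import date

order_date = date(2017, 3, 15)  # hypothetical parsed value

# Extracting date elements (ISO 8601 week numbering; Monday is weekday 1).
iso_year, iso_week, iso_weekday = order_date.isocalendar()
elements = {
    "year": order_date.year,
    "month": order_date.month,
    "day": order_date.day,
    "day_of_week": iso_weekday,
    "week_of_year": iso_week,
}

# Computing the time elapsed since a fixed date, in days.
reference = date(2017, 1, 1)  # hypothetical reference date
days_since_reference = (order_date - reference).days

# Computing the time elapsed since the order date, relative to today.
days_since_order = (date.today() - order_date).days

# Flagging holidays against a hypothetical, hand-written calendar.
holidays = {date(2017, 1, 1), date(2017, 12, 25)}
is_holiday = order_date in holidays
```

Subtracting two dates yields a `timedelta`, so the same pattern works for the difference between two date columns.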