You’ve just completed the Visual Recipes Overview course, where you gained hands-on experience working with Dataiku DSS recipes. Here are a few of the main takeaways from this course:
- Data preparation happens in the Prepare Recipe. Strings representing dates need to be parsed so that DSS can recognize them as true, unambiguous dates. Similar to what you might find in a spreadsheet tool like Excel, DSS has its own Formula language.
- You can use the Distinct Recipe to identify all rows that have exactly the same values in all columns and keep only one of them.
- On big data projects, you can add a random sample at the beginning of a flow, continue to work on the project, and then remove the sample step when you’re done experimenting and ready to run the entire dataset through the pipeline.
- In the Join Recipe, the Left join is a common join type used in data enrichment. It lets us keep all the records in our main dataset regardless of whether there is a match in the enrichment dataset.
- The Group By Recipe lets you perform aggregations, such as summing the value of transactions by individual customer, product category, or unit of time.
- The Window Recipe is similar to the Group By Recipe in that it performs aggregations, but it lets you keep the original structure of the dataset.
- With the Pivot Recipe, you can transform rows of items into columns by pivoting on a column such as Items. You then select a different column, such as Quantity, on which to perform aggregations.
- There are four ways you can split a dataset when using the Split Recipe.
- If you had two tables containing similar information but with different column names in a different order, you could use the Stack Recipe to map and combine them.
- The Top N and Sort Recipes are similar. The difference is that with the Top N Recipe, you can configure DSS to retrieve only the rows you are interested in, such as the top 10 rows within each group of the dataset.
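If it helps to think in code, the recipes above can be loosely mirrored in pandas. This is only an analogy under stated assumptions, not how DSS implements its visual recipes: the datasets, column names, and values below are hypothetical and made up for illustration.

```python
import pandas as pd

# Hypothetical toy datasets (not from the course).
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3],
    "order_date": ["2021-01-05", "2021-01-05", "2021-02-10",
                   "2021-02-11", "2021-03-01"],
    "item": ["pen", "pen", "book", "pen", "book"],
    "quantity": [2, 2, 1, 5, 3],
})
customers = pd.DataFrame({"customer_id": [1, 2], "country": ["FR", "US"]})

# Prepare: parse date strings into real datetime values.
orders["order_date"] = pd.to_datetime(orders["order_date"], format="%Y-%m-%d")

# Sample: experiment on a random subset, then drop this step later.
sample = orders.sample(n=3, random_state=42)

# Distinct: keep one copy of rows that match on every column.
distinct = orders.drop_duplicates().reset_index(drop=True)

# Join (left): keep every order, even without a matching customer.
enriched = distinct.merge(customers, on="customer_id", how="left")

# Group By: one output row per customer with the summed quantity.
per_customer = enriched.groupby("customer_id", as_index=False)["quantity"].sum()

# Window: same aggregation, but every original row is kept.
enriched["customer_total"] = (
    enriched.groupby("customer_id")["quantity"].transform("sum")
)

# Pivot: item values become columns, quantity is aggregated.
pivoted = enriched.pivot_table(
    index="customer_id", columns="item", values="quantity",
    aggfunc="sum", fill_value=0,
)

# Stack: map differing column names, then combine the tables.
store_a = pd.DataFrame({"item": ["pen"], "quantity": [4]})
store_b = pd.DataFrame({"qty": [7], "product": ["book"]})
stacked = pd.concat(
    [store_a, store_b.rename(columns={"product": "item", "qty": "quantity"})],
    ignore_index=True,
)

# Top N: the single largest order within each item group.
top_per_item = (
    enriched.sort_values("quantity", ascending=False).groupby("item").head(1)
)
```

Comparing `per_customer` with the `customer_total` column shows the difference between Group By and Window: the first collapses the data to one row per customer, while the second attaches the aggregate to every original row.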
Be sure to check out other Academy courses! You can continue your learning journey with more advanced courses. For example, you can move on to DSS & SQL, which is an optional course.