Removing Duplicates

Excel's Remove Duplicates feature deletes duplicate rows, using either all columns or a chosen subset to decide whether a row is unique. The duplicates are simply removed, with no way to recover them later.

[Image: excel-remove-duplicates.png]

Dataiku’s Distinct recipe identifies and removes duplicate rows within a dataset. Additionally, it can track which rows had duplicates, and how many, in the original dataset. See the Distinct recipe video below for an introduction to handling duplicates in Dataiku.
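As a rough illustration of the idea (not Dataiku's implementation), the sketch below removes duplicate rows in plain Python while recording how many times each distinct row appeared. The function name `distinct_with_counts`, the `count` output field, and the sample data are hypothetical, chosen only for this example.

```python
from collections import Counter


def distinct_with_counts(rows, key_columns=None):
    """Return one row per distinct key, with a 'count' of its occurrences.

    rows: list of dicts sharing the same keys (a stand-in for a dataset).
    key_columns: columns that define uniqueness; defaults to all columns.
    Only the key columns are kept in the output.
    """
    rows = list(rows)
    if not rows:
        return []
    keys = list(key_columns) if key_columns is not None else list(rows[0].keys())
    # Counter preserves first-seen order, so output order follows the input.
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return [dict(zip(keys, values), count=n) for values, n in counts.items()]


# Hypothetical sample data: "ann/Paris" appears three times, "bob/Lyon" twice.
orders = [
    {"customer": "ann", "city": "Paris"},
    {"customer": "bob", "city": "Lyon"},
    {"customer": "ann", "city": "Paris"},
    {"customer": "cat", "city": "Nice"},
    {"customer": "bob", "city": "Lyon"},
    {"customer": "ann", "city": "Paris"},
]

distinct_rows = distinct_with_counts(orders)
by_city = distinct_with_counts(orders, key_columns=["city"])
```

Unlike Excel's approach, the count column keeps a record of which rows had duplicates and how many, so that information is not lost when the extra rows are dropped.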