Sampling Rows

In Excel, you can create a column of random values with the RAND() function and then filter on that column to select a random subsample of rows.

[Figure: excel-sampling.png]

In Dataiku, you can use the Sample step in a Sample/Filter recipe to draw a subsample of your data. This can be a random sample, or a weighted sample based on the different classes of a particular variable.
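For readers who think in code, the two ideas above can be sketched in plain Python. This is only an illustration and not how Dataiku implements its Sample step: the function names, the `label` column, and the toy data are all hypothetical, and "class-based" sampling is interpreted here as drawing the same number of rows from each class, which is one possible reading.

```python
import random
from collections import defaultdict


def random_filter_sample(rows, fraction, seed=0):
    """Keep each row with probability `fraction` (the RAND()-and-filter idea)."""
    rng = random.Random(seed)
    # Draw one uniform value per row, like a RAND() column, then filter on it.
    return [row for row in rows if rng.random() < fraction]


def per_class_sample(rows, class_key, n_per_class, seed=0):
    """Draw up to `n_per_class` rows from each distinct value of `class_key`."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[class_key]].append(row)
    sample = []
    for members in groups.values():
        # Never ask for more rows than the class actually contains.
        k = min(n_per_class, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical data: 90 rows of class "a" and 10 rows of class "b".
rows = [{"id": i, "label": "a" if i < 90 else "b"} for i in range(100)]

subsample = random_filter_sample(rows, fraction=0.2)
balanced = per_class_sample(rows, class_key="label", n_per_class=5)
```

The first function returns roughly, not exactly, the requested fraction of rows, just as filtering on a RAND() column does. The second returns a subsample in which the rare class "b" is as well represented as the common class "a".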


To learn more about sampling in Dataiku, start with the reference documentation.