Creating Charts in DSS

It is easy to create a variety of useful charts in DSS, as you have already seen in Tutorial: Basics.

This brief tutorial reviews the basics of creating visualizations in Dataiku DSS.

Business Case

The fictional Haiku T-Shirts company wants to understand more about their typical order size. They know from experience that most customers order a single shirt, but they do occasionally get larger orders. What they don’t know is whether these larger orders constitute a significant portion of their business, and whether certain categories of shirts are more likely to be ordered in larger quantities.

How to create charts

To create a chart, open the haiku_shirt_sales data in a visual analysis and go to the Charts tab:

  • Drag Count of records to the Y axis and nb_tshirts to the X axis; drag category to the color droplet.
../../_images/analyze-charts-tshirt-01.gif

The resulting chart shows us that using 10 equal-width bins loses a lot of information, because all orders of 1-5 shirts are clumped together in a single bin.

../../_images/analyze-charts-tshirt-01.png
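
The clumping effect of equal-width bins can be reproduced outside DSS. The following is a minimal pandas sketch, not DSS code: the order sizes are made up for illustration (they are not the real haiku_shirt_sales data), and pandas.cut is used as a stand-in for the chart's binning, which may differ in its details.

```python
import pandas as pd

# Hypothetical order sizes: mostly single shirts, plus a few larger orders.
nb_tshirts = pd.Series([1] * 50 + [2] * 10 + [3] * 5 + [4] * 3 + [5] * 2 + [12, 41])

# Cut the observed range (1 to 41) into 10 bins of equal width (4 shirts each).
binned = pd.cut(nb_tshirts, bins=10)

# Count of records per bin, kept in bin order (empty bins show up as 0).
binned_counts = binned.value_counts(sort=False)
print(binned_counts)

# Share of all orders that land in the first bin, which covers 1 to 5 shirts.
first_bin_share = binned_counts.iloc[0] / binned_counts.sum()
print(f"Orders in the first bin: {first_bin_share:.0%}")
```

With this toy data, every order of 1 to 5 shirts falls into the same bar, so the difference between single-shirt orders and five-shirt orders is invisible.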

So let’s break the display of nb_tshirts down into raw values:

  • Click on the nb_tshirts label and select None, use raw values.
  • Create a filter to remove the value “Hoodies” as a category from the chart.
../../_images/analyze-charts-tshirt-02.gif

The vast majority of orders were for 1 shirt. From the perspective of number of orders, larger orders are not a significant portion of Haiku T-Shirts’ business.

From the scale of the X axis, we can see that at least one person made an order of close to 40 T-shirts, but the number of such orders is too small to see on the chart, relative to the number of orders for 1 shirt.

../../_images/analyze-charts-tshirt-02.png
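
For readers who think in terms of tables, the chart above corresponds to a count of records per raw order size and category, after the filter is applied. The sketch below shows that aggregation in pandas under stated assumptions: the rows and the category labels other than "Hoodies" are hypothetical, and only the column names nb_tshirts and category come from this tutorial.

```python
import pandas as pd

# Hypothetical rows standing in for haiku_shirt_sales.
orders = pd.DataFrame({
    "nb_tshirts": [1, 1, 1, 2, 2, 5, 1, 38],
    "category": [
        "Black T-Shirt M", "White T-Shirt F", "Hoodies", "Black T-Shirt F",
        "Hoodies", "White T-Shirt M", "Black T-Shirt M", "Black T-Shirt F",
    ],
})

# Filter step: remove the "Hoodies" category.
shirts_only = orders[orders["category"] != "Hoodies"]

# X axis = raw nb_tshirts values, color = category, Y axis = count of records.
counts = pd.crosstab(shirts_only["nb_tshirts"], shirts_only["category"])
print(counts)
```

Each row of the result is one bar on the X axis, and each column is one color segment within that bar.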

In order to get a better view of the categories by order size:

  • Click on the chart type selector and choose Stacked 100%.
  • Drag tshirt_price and total to the Tooltip area. On the total dropdown, select Sum. This adds summary statistics to your tooltips.
../../_images/analyze-charts-tshirt-03.gif

Now we can easily see that the proportion of sales by category appears to differ by order size. By hovering over bars in the chart, we can see, for example, that while women’s black T-shirts account for a greater and greater proportion of sales as the order size increases from 1 to 5 shirts, the total value of the orders decreases.

../../_images/analyze-charts-tshirt-03.png
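
The Stacked 100% view rescales each bar so that its segments sum to 100%, while the tooltip carries an aggregate such as the sum of total. A rough pandas equivalent is sketched below; the data and category labels are invented for illustration, and this is not how DSS computes the chart internally.

```python
import pandas as pd

# Hypothetical orders; only the column names come from the tutorial.
orders = pd.DataFrame({
    "nb_tshirts": [1, 1, 1, 1, 2, 2, 5, 5],
    "category": ["Black F", "White M", "White M", "White F",
                 "Black F", "White M", "Black F", "Black F"],
    "total": [20.0, 18.0, 18.0, 18.0, 40.0, 36.0, 95.0, 95.0],
})

# Count of records per order size (rows) and category (columns).
counts = pd.crosstab(orders["nb_tshirts"], orders["category"])

# Stacked 100%: divide each row by its own sum so every bar adds up to 1.
shares = counts.div(counts.sum(axis=1), axis=0)
print(shares)

# Tooltip measure: sum of total for each (order size, category) segment.
total_sum = orders.groupby(["nb_tshirts", "category"])["total"].sum()
print(total_sum)
```

Reading shares and total_sum side by side mirrors what hovering over a segment does in the chart: the segment height gives the proportion of orders, and the tooltip gives the value behind it.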

Whether these visual differences are statistically significant, and whether the Haiku T-Shirts company can exploit them, is a question we’ll leave for further analysis, because there is always a next step in data science!

Sharing charts

DSS charts are portable. You can easily download them as an image (PNG) or an Excel document at any point.

../../_images/analyze-charts-tshirt-04.gif

DSS charts can also be saved as insights and published to dashboards.

Which data is used by charts, and where computations take place

There are two places where you can create charts in DSS:

  • in a Visual Analysis (using the Lab)
  • on a Dataset

Both visual analyses and datasets give you control over which data your chart is created with – sampled or complete.

../../_images/analyze-charts-tshirt-05.gif

We strongly recommend that, unless you have a relatively small dataset, you use a sample for building interactive charts in visual analyses. This is because a visual analysis is intended for exploration and quick visual feedback, and thus always uses the in-memory DSS engine.

When building charts on a dataset, however, you can also use an in-database or in-cluster engine, depending on where the original data is stored. See the reference documentation for additional information on sampling and engines for charts.
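
To build intuition for why a sample is usually enough for exploratory charts, the sketch below compares a proportion computed on a full table with the same proportion computed on a random sample. It is a generic pandas illustration, not a description of DSS sampling: the synthetic data, the sample size of 10,000 rows, and the use of random sampling are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical full dataset: 100,000 synthetic order sizes, mostly single shirts.
full = pd.DataFrame({
    "nb_tshirts": rng.choice([1, 2, 3, 5], size=100_000, p=[0.8, 0.1, 0.07, 0.03]),
})

# Random sample of 10,000 rows (sample size chosen arbitrarily for this sketch).
sample = full.sample(n=10_000, random_state=0)

# Share of single-shirt orders in the full data versus the sample.
full_share = (full["nb_tshirts"] == 1).mean()
sample_share = (sample["nb_tshirts"] == 1).mean()
print(f"full: {full_share:.3f}, sample: {sample_share:.3f}")
```

The two shares are close, which is why a sampled chart usually gives the same overall picture much faster. Rare values, such as a handful of very large orders, are the main thing a sample can miss.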

What’s next

Once you have mastered the basics of charts in DSS, move on to the next tutorial for guidance on making paneled and animated charts in DSS.

For more information about charts in DSS, please see the reference documentation.