Using the IPython/Jupyter Notebook in DSS

The IPython Notebook is a favorite tool of many data scientists. It gives users an environment for interactively analyzing datasets directly from their web browser, combining code, graphical output, and rich content in a single place.

For these reasons, the IPython Notebook is embedded in Dataiku DSS and tightly integrated with its other components.

Working with IPython Notebooks

There are two main ways to create an IPython Notebook in Dataiku DSS:

  • If you want to create a completely “blank” Notebook, navigate to the Notebook section from the top nav bar and click the “New notebook” button. You can choose between a regular Notebook using a Python kernel and a Notebook based on an R kernel.
"Creating a new notebook"
  • Very often, while building your data science workflow, you need to explore new datasets. To use a Notebook in this case, go to the Flow screen, click on the dataset you want to analyze, then in the right panel click on the Python or R icon and select “Notebook”.
"Creating a new notebook from within the Flow"

(Note that this functionality is also available from the Actions menu of a dataset.)

This automatically opens a new Notebook pre-filled with some minimal code that uses Dataiku’s API to read your datasets into Python or R structures (such as data frames). For instance:

"New notebook with minimal pre-filled code"

You can uncomment the relevant portions of code depending on whether you need to load your dataset into a Pandas dataframe or iterate through the rows of your dataset.
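As a rough sketch of these two access patterns in Python, the snippet below reads a dataset either as a whole Pandas dataframe or row by row. It assumes a dataset called `my_dataset` (a hypothetical name) and relies on `dataiku.Dataset`, `get_dataframe()` and `iter_rows()` from Dataiku’s Python API. The code that DSS actually pre-fills may differ. Because the `dataiku` package is only importable inside DSS, the sketch falls back to a couple of made-up sample rows when it is run elsewhere.

```python
try:
    import dataiku  # importable inside a DSS notebook or recipe
except ImportError:
    dataiku = None  # running outside DSS: use the sample rows below

DATASET_NAME = "my_dataset"  # hypothetical dataset name

# Made-up rows, used only when the dataiku package is not available
SAMPLE_ROWS = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 32.5},
]


def load_dataframe(name):
    """Load the whole dataset into memory as a Pandas dataframe."""
    if dataiku is None:
        import pandas as pd
        return pd.DataFrame(SAMPLE_ROWS)
    return dataiku.Dataset(name).get_dataframe()


def iterate_rows(name):
    """Yield rows one at a time, without loading the full dataset."""
    if dataiku is None:
        yield from SAMPLE_ROWS
        return
    for row in dataiku.Dataset(name).iter_rows():
        yield row


# Streaming example: sum one column without building a dataframe
total = sum(row["amount"] for row in iterate_rows(DATASET_NAME))
print(total)
```

Loading a dataframe is convenient for exploration when the dataset fits in memory, while iterating over rows suits larger datasets or single-pass computations.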

Whether you work with R or Python in your Notebook, you can easily load your datasets using Dataiku’s APIs, whatever their original source or storage system.

Creating an Insight from an IPython Notebook

In Dataiku DSS, Insights are used by data scientists to share their work, notably via the Dashboard. An IPython Notebook can be published as an Insight in DSS, which lets you share your document with other users in HTML format.

Once your Notebook is created, close it and go back to the Notebooks list. Click on the Notebook you want to use, and under “Actions” in the right panel, click on “Publish”:

"Publishing a notebook to a dashboard"

Once the Insight is generated, you are taken to the Dashboard, where the Notebook is shown as a new Insight. Double-click on the top bar of the window to open it:

"Opening a notebook as an insight"

Generating a Notebook from a Model

Finally, another very interesting feature is the ability to create an IPython Notebook directly from a trained machine learning model. Once you have trained a model, from the caret next to the Deploy menu, you can choose to export it to an IPython Notebook:

"Model action menu: export to IPython notebook"

This creates a new Notebook filled with all the settings and steps required to reproduce, directly in Python, the machine learning model you built from the UI. Isn’t it great?

"Python notebook generated from model"


IPython Notebooks are first-class citizens in DSS. They are in the toolbox of most data scientists, and they make a great environment for interactively analyzing your datasets using R or Python. If you want to see what’s next for IPython Notebooks, check out the Jupyter project.