Bokeh Web Apps

In this tutorial, we’ll create a simple Bokeh web app in Dataiku DSS. The app displays a scatterplot of Haiku T-shirt sales data, which is related to the data used in the Basic Tutorials.

Prerequisites

  • Some familiarity with Python.

Technical Requirements

Create Your Project

From the Dataiku homepage, click +New Project > DSS Tutorials > General Topics > Haiku Starter. Alternatively, you can download the Orders_enriched_prepared dataset and import it into a new project.

Creating a New Bokeh Webapp

Create a new empty Bokeh web app:

  1. In the top navigation bar, select Lab - Notebooks > Web apps.
  2. Click + New Web App.
  3. Select Bokeh.
  4. Choose An empty Bokeh app and type a name for the web app.

../../_images/new_webapp.png

You will be redirected to the web app editor.

The Web App Editor

The web app editor is divided into two panes.

../../_images/bokeh-editor.png

The left pane allows you to see and edit the Python code underlying the web app.

The right pane gives you several views on the web app.

  • The Preview tab allows you to write and test your code in the left pane while having immediate visual feedback in the right pane. At any time you can save or reload your current code by clicking on the Save button or the Reload Preview button.
  • The Python tab allows you to look at different portions of the code side-by-side in the left and right panes.
  • The Log is useful for troubleshooting problems.
  • Settings allows you to set the code environment for this web app, if you want it to be different from the project default.

Coding the Web App

Let’s build the code behind the Python Bokeh web app.

Importing the Packages

Insert the following code into the Python tab so that we’ll have the necessary tools to create the web app.

from bokeh.io import curdoc
from bokeh.layouts import row, widgetbox
from bokeh.models import ColumnDataSource
from bokeh.models.widgets import Slider, TextInput, Select
from bokeh.plotting import figure
import dataiku
import pandas as pd

Setting up the Data

Add the following code to the Python tab. This code parameterizes the inputs to the web app. By specifying this information up front, it will be easier for us to generalize the web app later.

# Parameterize web app inputs
input_dataset = "Orders_enriched_prepared"
x_column = "age"
y_column = "total"
time_column = "order_date_year"
cat_column = "tshirt_category"

Add the following code to the Python tab to access a Dataiku dataset as a pandas dataframe.

# Set up data
mydataset = dataiku.Dataset(input_dataset)
df = mydataset.get_dataframe()
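
Because the column names are parameters, a typo or a different dataset would only surface later as a KeyError. The following is a minimal sketch of an early check, assuming a hypothetical helper named missing_columns (it is not part of Dataiku or Bokeh) and a made-up column list standing in for df.columns.

```python
def missing_columns(available, required):
    """Return the required column names that are absent, in their original order."""
    present = set(available)
    return [name for name in required if name not in present]

# Hypothetical column list standing in for df.columns
available = ["age", "total", "order_date_year"]
required = ["age", "total", "order_date_year", "tshirt_category"]

missing = missing_columns(available, required)
print(missing)
```

In the web app, the same check could be called as missing_columns(df.columns, [x_column, y_column, time_column, cat_column]), raising an error with a readable message if the result is not empty.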

Add the following code to the Python tab. This code extracts the customer age and total amount spent from the pandas dataframe to define the source data for the visualization as a ColumnDataSource.

x = df[x_column]
y = df[y_column]
source = ColumnDataSource(data=dict(x=x, y=y))

Nothing is displayed yet because we haven’t created the visualization, but the log should show no errors.

Defining the Visualization

Add the following code to the Python tab to define the output visualization. First we create a plot object with the desired properties. The title of the plot is built from the x- and y-axis column names, and the axis limits are set to the minimum and maximum values of customer age and order total. Next we define the visualization as a scatterplot that plots data from the source defined above.

# Set up plot
plot = figure(plot_height=400, plot_width=400, title=y_column+" by "+x_column,
              tools="crosshair,pan,reset,save,wheel_zoom",
              x_range=[min(x), max(x)], y_range=[min(y),max(y)])

plot.scatter('x', 'y', source=source)
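
With the axis limits set to exactly the minimum and maximum of the data, the most extreme points sit on the border of the plot. The following is a minimal sketch of one way to add a margin, assuming a hypothetical helper named padded_range (it is not part of Bokeh or of this tutorial's code).

```python
def padded_range(values, pad_fraction=0.05):
    """Return (low, high) limits widened by a fraction of the data span."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # Fall back to a fixed pad when all values are identical (span of zero)
    pad = span * pad_fraction if span else 1.0
    return (lo - pad, hi + pad)

limits = padded_range([0, 10], pad_fraction=0.1)
print(limits)
```

If this suits your data, the result could be passed to figure() in place of the literal limits, for example x_range=padded_range(x).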

The following code defines the layout of the web app and adds it to the current “document”. For now, we’ll include an empty widgetbox that we’ll populate in a moment when we add the interactivity.

# Set up layouts and add to document
inputs = widgetbox()

curdoc().add_root(row(inputs, plot, width=800))

Save your work, and the preview should show the current (non-interactive) scatterplot.

../../_images/bokeh-noninteractive.png

Adding Interactivity

The current scatterplot includes all orders from 2013-2017, across all types of t-shirts sold. Now let’s add the ability to select a subset of years, and a specific category of t-shirt. To do this, we need to make changes to the Python code.

The code in this section should be added after the code to set up the plot, but before the code to define the layout of the web app.

In the Python tab, add the following.

# Set up widgets
text = TextInput(title="Title", value=y_column+" by "+x_column)
time = df[time_column]
min_year = Slider(title="Time start", value=min(time), start=min(time), end=max(time), step=1)
max_year = Slider(title="Time max", value=max(time), start=min(time), end=max(time), step=1)
cat_categories = df[cat_column].unique().tolist()
cat_categories.insert(0,'All')
category = Select(title="Category", value="All", options=cat_categories)

This defines four widgets.

  • text accepts text input to be used as the title of the visualization.
  • min_year and max_year are sliders that take values from 2013 to 2017 in integer steps. Their default values are 2013 and 2017, respectively.
  • category is a selection that has an option for each t-shirt category, plus “All”. Its default value is All.
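
The option list for the select widget is built from the unique values of the category column. If that column can contain missing values, they would end up in the list as well. The following is a minimal sketch of a more defensive version, assuming a hypothetical helper named category_options and made-up category names.

```python
def category_options(values, all_label="All"):
    """Return the "All" label followed by the sorted, de-duplicated values."""
    # v == v is False for NaN, so this skips both None and NaN entries
    cleaned = {str(v) for v in values if v is not None and v == v}
    return [all_label] + sorted(cleaned)

# Hypothetical category values standing in for df[cat_column]
options = category_options(["Hoodie", "Tennis Shirt", "Hoodie", None, float("nan")])
print(options)
```

In the web app, this could replace the two lines that build cat_categories, for example cat_categories = category_options(df[cat_column]).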

Add the following code, which contains the instructions on how to update the web app when a user interacts with it.

def update_title(attrname, old, new):
    plot.title.text = text.value

def update_data(attrname, old, new):
    category_value = category.value
    selected = df[(time>=min_year.value) & (time<=max_year.value)]
    if (category_value != "All"):
        selected = selected[selected[cat_column].str.contains(category_value)==True]
    # Generate the new plot
    x = selected[x_column]
    y = selected[y_column]
    source.data = dict(x=x, y=y)

  • When the title text is changed, update_title() updates plot.title.text to the new value.
  • When the sliders or the select widget are changed, update_data() takes the input dataframe df and uses the widget selections to keep only the records with a matching order year and t-shirt category. It then replaces the data in source with the age and order total from the filtered dataframe, which updates the scatterplot.
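
The filtering step can be tried outside the web app. The following is a minimal sketch, assuming a hypothetical function named filter_orders and a few made-up rows standing in for the Orders_enriched_prepared dataset. It compares the category with exact equality, whereas str.contains treats the selection as a pattern by default and would also match any category whose name merely contains it.

```python
import pandas as pd

def filter_orders(df, time_column, cat_column, year_min, year_max, category="All"):
    """Keep rows whose year lies in [year_min, year_max] and whose
    category equals the selection, unless the selection is "All"."""
    mask = df[time_column].between(year_min, year_max)  # inclusive on both ends
    if category != "All":
        mask &= df[cat_column] == category
    return df[mask]

# Hypothetical sample rows; the real dataset has many more rows and columns
sample = pd.DataFrame({
    "age": [25, 40, 33, 51],
    "total": [30.0, 75.5, 20.0, 64.0],
    "order_date_year": [2013, 2015, 2016, 2017],
    "tshirt_category": ["Tennis Shirt", "Hoodie", "Tennis Shirt", "Hoodie"],
})

subset = filter_orders(sample, "order_date_year", "tshirt_category", 2014, 2017, "Hoodie")
print(subset[["age", "total"]])
```

Keeping the filter in a plain function like this makes it possible to check the logic on a small dataframe before wiring it to the widgets.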

Add the following code, which listens for changes to the widget values using the on_change() method and calls the functions above to update the web app.

# Set up callbacks
text.on_change('value', update_title)

for w in [min_year, max_year, category]:
    w.on_change('value', update_data)
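
Each callback receives three arguments: the name of the attribute that changed, its old value, and its new value. The following toy model illustrates that calling convention. ToyWidget is a hypothetical stand-in written for this sketch. It is not a Bokeh class and it simplifies how Bokeh actually dispatches events.

```python
class ToyWidget:
    """Hypothetical stand-in for a widget; not part of Bokeh."""

    def __init__(self, value):
        self._value = value
        self._callbacks = {}

    def on_change(self, attr, *callbacks):
        # Register one or more callbacks for the given attribute name
        self._callbacks.setdefault(attr, []).extend(callbacks)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        old, self._value = self._value, new
        if old != new:  # simplification: notify only on an actual change
            for callback in self._callbacks.get("value", []):
                callback("value", old, new)

calls = []

def record(attrname, old, new):
    calls.append((attrname, old, new))

slider = ToyWidget(2013)
slider.on_change("value", record)
slider.value = 2015
slider.value = 2015  # same value again, so no callback in this toy model
print(calls)
```

This is why update_title and update_data both accept attrname, old, and new, even though they read the current values directly from the widgets.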

Finally, change the definition of inputs as follows, to include the four widgets so that they are displayed in the web app.

inputs = widgetbox(text, min_year, max_year, category)

Save your work and refresh the preview. It should now show the interactive scatterplot.

../../_images/bokeh-interactive.png

Publish to a Dashboard

When you are done editing, you can publish your web app to a dashboard from the Actions dropdown at the top-right corner of the screen.

../../_images/publish_webapp.png

What’s Next

Using Dataiku DSS, you have created an interactive Bokeh web app and published it to a dashboard.