create_dir_save_file) to automatically download and save the required data (data/2020-03-15_interactive_data_visualization_with_bokeh) and image (Images/2020-03-15_interactive_data_visualization_with_bokeh) files.
Bokeh is an interactive data visualization library for Python, and other languages, that targets modern web browsers for presentation. It can create versatile, data-driven graphics and connect the full power of the entire Python data science stack to create rich, interactive visualizations.
import pandas as pd
from pprint import pprint as pp
from itertools import combinations
from pathlib import Path
import requests
import numpy as np
from bokeh.io import output_notebook, curdoc # output_file
from bokeh.plotting import figure, show
from bokeh.sampledata.iris import flowers
from bokeh.sampledata.iris import flowers as iris_df
from bokeh.models import ColumnDataSource, HoverTool, CategoricalColorMapper, Slider, Column, Select
from bokeh.models import CheckboxGroup, RadioGroup, Toggle, Button
from bokeh.models.widgets import Tabs, Panel
from bokeh.layouts import row, column, gridplot, widgetbox
from bokeh.palettes import Spectral6
from bokeh.themes import Theme
import yaml
output_notebook()
pd.set_option('display.max_columns', 200)
pd.set_option('display.max_rows', 300)
pd.set_option('display.expand_frame_repr', True)
def create_dir_save_file(dir_path: Path, url: str):
    """
    Check if the path exists and create it if it does not.
    Check if the file exists and download it if it does not.
    """
    if not dir_path.parents[0].exists():
        dir_path.parents[0].mkdir(parents=True)
        print(f'Directory Created: {dir_path.parents[0]}')
    else:
        print('Directory Exists')
    if not dir_path.exists():
        r = requests.get(url, allow_redirects=True)
        open(dir_path, 'wb').write(r.content)
        print(f'File Created: {dir_path.name}')
    else:
        print('File Exists')
data_dir = Path('data/2020-03-15_interactive_data_visualization_with_bokeh')
images_dir = Path('Images/2020-03-15_interactive_data_visualization_with_bokeh')
# AAPL Stock
aapl_url = 'https://assets.datacamp.com/production/repositories/401/datasets/313eb985cce85923756a128e49d7260a24ce6469/aapl.csv'
# Automobile miles per gallon
auto_url = 'https://assets.datacamp.com/production/repositories/401/datasets/2a776ae9ef4afc3f3f3d396560288229e160b830/auto-mpg.csv'
# Gapminder
gap_url = 'https://assets.datacamp.com/production/repositories/401/datasets/09378cc53faec573bcb802dce03b01318108a880/gapminder_tidy.csv'
# Blood glucose levels
glucose_url = 'https://assets.datacamp.com/production/repositories/401/datasets/edcedae3825e0483a15987248f63f05a674244a6/glucose.csv'
# Female literacy and birth rate
female_url = 'https://assets.datacamp.com/production/repositories/401/datasets/5aae6591ddd4819dec17e562f206b7840a272151/literacy_birth_rate.csv'
# Olympic medals (100m sprint)
sprint_url = 'https://assets.datacamp.com/production/repositories/401/datasets/68b7a450b34d1a331d4ebfba22069ce87bb5625d/sprint.csv'
# State coordinates
state_url = 'https://github.com/trenton3983/DataCamp/blob/master/data/2020-03-15_interactive_data_visualization_with_bokeh/state_coordinates.xlsx?raw=true'
datasets = [aapl_url, auto_url, gap_url, glucose_url, female_url, sprint_url, state_url]
data_paths = list()
for data in datasets:
    file_name = data.split('/')[-1].replace('?raw=true', '')
    data_path = data_dir / file_name
    create_dir_save_file(data_path, data)
    data_paths.append(data_path)
aapl = pd.read_csv(data_paths[0])
aapl.drop('Unnamed: 0', axis=1, inplace=True)
aapl['date'] = pd.to_datetime(aapl['date'])
auto = pd.read_csv(data_paths[1])
gap = pd.read_csv(data_paths[2])
gluc = pd.read_csv(data_paths[3])
gluc['datetime'] = pd.to_datetime(gluc['datetime'])
gluc.set_index('datetime', inplace=True, drop=True)
lit = pd.read_csv(data_paths[4])
lit = lit.iloc[0:162, :]
lit[['female literacy', 'fertility']] = lit[['female literacy', 'fertility']].astype('float')
lit.columns = [x.strip() for x in lit.columns] # Country has a whitespace at the end
run = pd.read_csv(data_paths[5])
state_coor_dict = {state: pd.read_excel(data_paths[6], sheet_name=state) for state in ['az', 'co', 'nm', 'ut']}
This chapter provides an introduction to basic plotting with Bokeh. You will create your first plots, learn about different data formats Bokeh understands, and make visual customizations for selections and mouse hovering.
What is Bokeh?
What you will learn
bokeh.plotting
bokeh.charts
What are Glyphs
Typical usage
plot = figure(plot_height=300, plot_width=400, tools='pan,box_zoom,reset')
plot.circle([1, 2, 3, 4, 5], [8, 6, 5, 2, 3])
# output_file('circle.html')
show(plot)
Glyph properties
plot = figure(plot_height=400, plot_width=400, )
plot.circle(x=10, y=[2, 5, 8, 12], size=[10, 20, 30, 40])
show(plot)
Markers
In Bokeh, the visual shapes drawn on a plot are called glyphs. The visual properties of these glyphs, such as position or color, can be assigned single values, for example x=10 or fill_color='red'.
What other kinds of values can glyph properties be set to in normal usage?
Answer the question
In this example, you're going to make a scatter plot of female literacy vs fertility using data from the European Environmental Agency. This dataset highlights that countries with low female literacy have high birthrates. The x-axis data has been loaded for you as fertility and the y-axis data has been loaded as female_literacy.
Your job is to create a figure, assign x-axis and y-axis labels, and plot female_literacy vs fertility using the circle glyph.
After you have created the figure, in this exercise and the ones to follow, play around with it! Explore the different options available to you on the tab to the right, such as "Pan", "Box Zoom", and "Wheel Zoom". You can click on the question mark sign for more details on any of these tools.
Note: You may have to scroll down to view the lower portion of the figure.
Instructions
Import the figure function from bokeh.plotting, and the output_file and show functions from bokeh.io.
Create the figure p with figure(). It has two parameters: x_axis_label and y_axis_label.
Add a circle glyph to the figure p using the function p.circle() where the inputs are, in order, the x-axis data and y-axis data.
Call the output_file() function to specify the name 'fert_lit.html' for the output file.
Display the plot by calling show() and passing in the figure p.
fertility = lit['fertility']
female_literacy = lit['female literacy']
# Create the figure: p
p = figure(plot_height=300, x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)')
# Add a circle glyph to the figure p
p.circle(fertility, female_literacy)
# Call the output_file() function and specify the name of the file
# output_file('fert_lit.html')
# Display the plot
show(p)
By calling multiple glyph functions on the same figure object, we can overlay multiple data sets in the same figure.
In this exercise, you will plot female literacy vs fertility for two different regions, Africa and Latin America. Each set of x and y data has been loaded separately for you as fertility_africa, female_literacy_africa, fertility_latinamerica, and female_literacy_latinamerica.
Your job is to plot the Latin America data with the circle() glyph, and the Africa data with the x() glyph.
figure has already been imported for you from bokeh.plotting.
Instructions
Create the figure p with the figure() function. It has two parameters: x_axis_label and y_axis_label.
Add a circle glyph to the figure p using the function p.circle() where the inputs are the x and y data from Latin America: fertility_latinamerica and female_literacy_latinamerica.
Add an x glyph to the figure p using the function p.x() where the inputs are the x and y data from Africa: fertility_africa and female_literacy_africa.
lit.head()
fertility_africa = lit[lit.Continent == 'AF']['fertility']
fertility_latinamerica = lit[lit.Continent == 'LAT']['fertility']
female_literacy_africa = lit[lit.Continent == 'AF']['female literacy']
female_literacy_latinamerica = lit[lit.Continent == 'LAT']['female literacy']
# Create the figure: p
p = figure(plot_height=300, x_axis_label='fertility (children per woman)', y_axis_label='female literacy (% population)')
# Add a circle glyph to the figure p
p.circle(fertility_latinamerica, female_literacy_latinamerica)
# Add an x glyph to the figure p
p.x(fertility_africa, female_literacy_africa, color='red')
# Specify the name of the file
# output_file('fert_lit_separate.html')
# Display the plot
show(p)
The three most important arguments to customize scatter glyphs are color, size, and alpha. Bokeh accepts colors as hexadecimal strings, tuples of RGB values between 0 and 255, and any of the 147 CSS color names. Size values are supplied in screen units (pixels), so a glyph keeps the same on-screen size when you zoom.
The alpha parameter controls transparency. It takes in floating point numbers between 0.0, meaning completely transparent, and 1.0, meaning completely opaque.
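The three color formats can be compared side by side. This is a minimal sketch, not part of the exercise: the data values are made up, and it uses p.scatter(), which draws circle markers by default, in place of p.circle().

```python
from bokeh.plotting import figure

# Hypothetical data, used only to illustrate the color formats
x = [1, 2, 3]
y = [4, 5, 6]

p = figure(x_axis_label='x', y_axis_label='y')

# Hexadecimal string, mostly opaque
p.scatter(x, y, color='#1f77b4', size=10, alpha=0.8)
# Tuple of RGB values between 0 and 255, half transparent
p.scatter(x, [v + 1 for v in y], color=(255, 0, 0), size=10, alpha=0.5)
# CSS color name, mostly transparent
p.scatter(x, [v + 2 for v in y], color='firebrick', size=10, alpha=0.2)
```

Passing the figure to show(p) would display three rows of markers, one per color format, with decreasing opacity.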
In this exercise, you'll plot female literacy vs fertility for Africa and Latin America as red and blue circle glyphs, respectively.
Instructions
Using the Latin America data (fertility_latinamerica and female_literacy_latinamerica), add a blue circle glyph of size=10 and alpha=0.8 to the figure p. To do this, you will need to specify the color, size and alpha keyword arguments inside p.circle().
Using the Africa data (fertility_africa and female_literacy_africa), add a red circle glyph of size=10 and alpha=0.8 to the figure p.
# Create the figure: p
p = figure(plot_height=300, x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)')
# Add a blue circle glyph to the figure p
p.circle(fertility_latinamerica, female_literacy_latinamerica, color='blue', size=10, alpha=0.8)
# Add a red circle glyph to the figure p
p.circle(fertility_africa, female_literacy_africa, color='red', size=10, alpha=0.8)
# Specify the name of the file
# output_file('fert_lit_separate_colors.html')
# Display the plot
show(p)
Lines
x = [1,2,3,4,5]
y = [8,6,5,2,3]
plot = figure(plot_height=300)
plot.line(x, y, line_width=3)
# output_file('line.html')
show(plot)
Lines and Markers Together
x = [1,2,3,4,5]
y = [8,6,5,2,3]
plot = figure(plot_height=300)
plot.line(x, y, color='purple', line_width=2)
plot.circle(x, y, color='purple', fill_color='white', size=10)
# output_file('line.html')
show(plot)
Patches
xs = [[1,1,2,2], [2,2,4], [2,2,3,3]]
ys = [[2,5,5,2], [3,5,5], [2,3,4,2]]
plot = figure(plot_height=300)
plot.patches(xs, ys, fill_color=['red', 'blue','green'], line_color='white')
# output_file('patches.html')
show(plot)
Other glyphs
We can draw lines on Bokeh plots with the line() glyph function.
In this exercise, you'll plot the daily adjusted closing price of Apple Inc.'s stock (AAPL) from 2000 to 2013.
The data points are provided for you as lists. date is a list of datetime objects to plot on the x-axis and price is a list of prices to plot on the y-axis.
Since we are plotting dates on the x-axis, you must add x_axis_type='datetime' when creating the figure object.
Instructions
Import the figure function from bokeh.plotting.
Create a figure p using the figure() function with x_axis_type set to 'datetime'. The other two parameters are x_axis_label and y_axis_label.
Plot date and price along the x- and y-axes using p.line().
# Create a figure with x_axis_type="datetime": p
p = figure(plot_height=300, x_axis_type="datetime", x_axis_label='Date', y_axis_label='US Dollars')
# Plot date along the x axis and price along the y axis
p.line(aapl['date'], aapl['adj_close'], line_color='green')
# Specify the name of the output file and show the result
# output_file('line.html')
show(p)
Lines and markers can be combined by plotting them separately using the same data points.
In this exercise, you'll plot a line and circle glyph for the AAPL stock prices. Further, you'll adjust the fill_color keyword argument of the circle() glyph function while leaving the line_color at the default value.
The date and price lists are provided. The Bokeh figure object p that you created in the previous exercise has also been provided.
Instructions
Plot date along the x-axis and price along the y-axis with p.line().
With date on the x-axis and price on the y-axis, use p.circle() to add a 'white' circle glyph of size 4. To do this, you will need to specify the fill_color and size arguments.
aapl_mar_jul_2000 = aapl[(aapl['date'] >= '2000-01') & (aapl['date'] < '2000-08')]
aapl_mar_jul_2000.head()
# Create a figure with x_axis_type="datetime": p
p = figure(plot_height=300, x_axis_type="datetime", x_axis_label='Date', y_axis_label='US Dollars')
# Plot date along the x axis and price along the y axis
p.line(aapl_mar_jul_2000['date'], aapl_mar_jul_2000['adj_close'])
# With date on the x-axis and price on the y-axis, add a white circle glyph of size 4
p.circle(aapl_mar_jul_2000['date'], aapl_mar_jul_2000['adj_close'], fill_color='white', size=4)
# Specify the name of the output file and show the result
# output_file('line.html')
show(p)
In Bokeh, extended geometrical shapes can be plotted by using the patches() glyph function. The patches glyph takes as input a list-of-lists collection of numeric values specifying the vertices in x and y directions of each distinct patch to plot.
In this exercise, you will plot the state borders of Arizona, Colorado, New Mexico and Utah. The latitude and longitude vertices for each state have been prepared as lists.
Your job is to plot longitude on the x-axis and latitude on the y-axis. The figure object has been created for you as p.
Instructions
Create a list of the longitude positions for each state as x. This has already been done for you.
Create a list of the latitude positions for each state as y. The variable names for the latitude positions are az_lats, co_lats, nm_lats, and ut_lats.
Use the .patches() method to add the patches glyph to the figure p. Supply the x and y lists as arguments along with a line_color of 'white'.
az_lons = state_coor_dict['az'].az_lons
az_lats = state_coor_dict['az'].az_lats
co_lons = state_coor_dict['co'].co_lons
co_lats = state_coor_dict['co'].co_lats
nm_lons = state_coor_dict['nm'].nm_lons
nm_lats = state_coor_dict['nm'].nm_lats
ut_lons = state_coor_dict['ut'].ut_lons
ut_lats = state_coor_dict['ut'].ut_lats
# Create a list of az_lons, co_lons, nm_lons and ut_lons: x
x = [az_lons, co_lons, nm_lons, ut_lons]
# Create a list of az_lats, co_lats, nm_lats and ut_lats: y
y = [az_lats, co_lats, nm_lats, ut_lats]
# Add patches to figure p with line_color=white for x and y
p = figure(plot_height=400, plot_width=400)
p.patches(x, y, line_color='white')
# Specify the name of the output file and show the result
# output_file('four_corners.html')
show(p)
Python Basic Types
x = [1,2,3,4,5]
y = [8,6,5,2,3]
plot = figure(plot_height=300)
plot.line(x, y, line_width=3)
plot.circle(x, y, fill_color='white', size=10)
# output_file('basic.html')
show(plot)
NumPy Arrays
x = np.linspace(0, 10, 1000)
y = np.sin(x) + np.random.random(1000) * 0.2
plot = figure(plot_height=300)
plot.line(x, y)
# output_file('numpy.html')
show(plot)
Pandas
# Flowers is a Pandas DataFrame
plot = figure(plot_height=300)
plot.circle(flowers['petal_length'], flowers['sepal_length'], size=10)
# output_file('pandas.html')
show(plot)
Column Data Source
# from bokeh.models import ColumnDataSource # imported at the top of the notebook
source = ColumnDataSource(data={'x': [1,2,3,4,5],
                                'y': [8,6,5,2,3]})
source.data
iris_df.head()
source = ColumnDataSource(iris_df)
In the previous exercises, you made plots using data stored in lists. You learned that Bokeh can plot both numbers and datetime objects.
In this exercise, you'll generate NumPy arrays using np.linspace() and np.cos() and plot them using the circle glyph.
np.linspace() is a function that returns an array of evenly spaced numbers over a specified interval. For example, np.linspace(0, 10, 5) returns an array of 5 evenly spaced samples calculated over the interval [0, 10]. np.cos(x) calculates the element-wise cosine of some array x.
For more information on NumPy functions, you can refer to the NumPy User Guide and NumPy Reference.
The figure p has been provided for you.
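The np.linspace(0, 10, 5) example mentioned above can be checked directly. This is a small sketch that uses only NumPy and is separate from the exercise solution.

```python
import numpy as np

# Five evenly spaced samples over the closed interval [0, 10]
x = np.linspace(0, 10, 5)
print(x.tolist())  # [0.0, 2.5, 5.0, 7.5, 10.0]

# Element-wise cosine: one output value per input value
y = np.cos(x)
print(y.shape)  # (5,)
```

Both endpoints are included, so the spacing is (10 - 0) / (5 - 1) = 2.5.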
Instructions
Import numpy as np.
Create an array x using np.linspace() with 0, 5, and 100 as inputs.
Create an array y using np.cos() with x as input.
Add circles at x and y using p.circle().
# Create array using np.linspace: x
x = np.linspace(0, 5, 100)
# Create array using np.cos: y
y = np.cos(x)
# Add circles at x and y
p = figure(plot_height=300)
p.circle(x, y)
# Specify the name of the output file and show the result
# output_file('numpy.html')
show(p)
You can create Bokeh plots from Pandas DataFrames by passing column selections to the glyph functions.
Bokeh can plot floating point numbers, integers, and datetime data types. In this example, you will read a CSV file containing information on 392 automobiles manufactured in the US, Europe and Asia from 1970 to 1982.
The CSV file is provided for you as 'auto.csv'.
Your job is to plot miles-per-gallon (mpg) vs horsepower (hp) by passing Pandas column selections into the p.circle() function. Additionally, each glyph will be colored according to values in the color column.
Instructions
Import pandas as pd.
Use the read_csv() function of pandas to read in 'auto.csv' and store it in the DataFrame df.
Import figure from bokeh.plotting.
Use the figure() function to create a figure p with the x-axis labeled 'HP' and the y-axis labeled 'MPG'.
Plot mpg (on the y-axis) vs hp (on the x-axis) by color using p.circle(). Note that the x-axis should be specified before the y-axis inside p.circle(). You will need to use Pandas DataFrame indexing to pass in the columns. For example, to access the color column, you can use df['color'], and then pass it in as an argument to the color parameter of p.circle(). Also specify a size of 10.
# Create the figure: p
p = figure(plot_height=400, x_axis_label='HP', y_axis_label='MPG')
# Plot mpg vs hp by color
p.circle(auto.hp, auto.mpg, size=10, color=auto.color)
# Specify the name of the output file and show the result
# output_file('auto-df.html')
show(p)
The ColumnDataSource is a table-like data object that maps string column names to sequences (columns) of data. It is the central and most common data structure in Bokeh.
Which of the following statements about ColumnDataSource objects is true?
Answer the question
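To make the column-name mapping concrete, here is a minimal sketch with made-up data (the same toy lists used earlier in this notebook). The figure and the choice of p.line() and p.scatter() are illustrative assumptions, not part of the quiz.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Each key is a string column name; each value is a sequence of data.
# All columns should have the same length.
source = ColumnDataSource(data={'x': [1, 2, 3, 4, 5],
                                'y': [8, 6, 5, 2, 3]})

print(sorted(source.column_names))  # ['x', 'y']

# Glyph methods refer to columns by name when a source is passed,
# and one source can back several glyphs on the same figure.
p = figure()
p.line('x', 'y', source=source)
p.scatter('x', 'y', source=source, size=10)
```

Because both glyphs read from the same source object, they stay in step with any later change to its data.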
You can create a ColumnDataSource object directly from a Pandas DataFrame by passing the DataFrame to the class initializer.
In this exercise, we have imported pandas as pd and read in a data set containing all Olympic medals awarded in the 100 meter sprint from 1896 to 2012. A color column has been added indicating the CSS color name we wish to use in the plot for every data point.
Your job is to import the ColumnDataSource class, create a new ColumnDataSource object from the DataFrame df, and plot circle glyphs with 'Year' on the x-axis and 'Time' on the y-axis. Color each glyph by the color column.
The figure object p has already been created for you.
Instructions
Import the ColumnDataSource class from bokeh.plotting.
Use the ColumnDataSource() function to make a new ColumnDataSource object called source from the DataFrame df.
Use p.circle() to plot circle glyphs on the figure p with 'Year' on the x-axis and 'Time' on the y-axis.
Set the size of the glyphs to 8, and use color='color' to ensure each glyph is colored by the color column.
Specify source=source so that the ColumnDataSource object is used.
run.head()
# Create a ColumnDataSource from df: source
source = ColumnDataSource(run)
# Add circle glyphs to the figure p
p = figure(plot_height=400, x_axis_label='Race Year', y_axis_label='Race Run Time (seconds)', title='100 Meter Sprint Run Times: 1896 - 2012')
p.circle('Year', 'Time', source=source, size=8, color='color')
# Specify the name of the output file and show the result
# output_file('sprint.html')
show(p)
Selection appearance
plot = figure(plot_height=400, tools='box_select, lasso_select, reset')
plot.circle(iris_df.petal_length, iris_df.sepal_length, selection_color='red', nonselection_fill_alpha=0.2, nonselection_fill_color='grey')
show(plot)
Hover appearance
np.random.seed(365)
a = np.random.random_sample(1000)
b = np.random.random_sample(1000)
hover = HoverTool(tooltips=None, mode='hline')
plot = figure(plot_height=400, tools=[hover, 'crosshair'])
# a and b are arrays of random points
plot.circle(a, b, size=8, hover_color='magenta')
show(plot)
Color mapping
iris_df.head()
source = ColumnDataSource(iris_df)
mapper = CategoricalColorMapper( factors=['setosa', 'virginica', 'versicolor'], palette=['red', 'green', 'blue'])
plot = figure(plot_height=400, x_axis_label='petal_length', y_axis_label='sepal_length')
plot.circle('petal_length', 'sepal_length', size=10, source=source, color={'field': 'species', 'transform': mapper})
show(plot)
In this exercise, you're going to add the box_select tool to a figure and change the selected and non-selected circle glyph properties so that selected glyphs are red and non-selected glyphs are transparent blue.
You'll use the ColumnDataSource object of the Olympic Sprint dataset you made in the last exercise. It is provided to you with the name source.
After you have created the figure, be sure to experiment with the Box Select tool you added! As in previous exercises, you may have to scroll down to view the lower portion of the figure.
Instructions
Create a figure p with an x-axis label of 'Year', y-axis label of 'Time', and the 'box_select' tool. To add the 'box_select' tool, you have to specify the keyword argument tools='box_select' inside the figure() function.
Having added 'box_select' to p, add in circle glyphs with p.circle() such that the selected glyphs are red and non-selected glyphs are transparent blue. This can be done by specifying 'red' as the argument to selection_color and 0.1 to nonselection_alpha. Remember to also pass in the arguments for the x ('Year'), y ('Time'), and source parameters of p.circle().
# Create a ColumnDataSource from df: source
source = ColumnDataSource(run)
# Create a figure with the "box_select" tool: p
p = figure(plot_height=400, x_axis_label='Year', y_axis_label='Time', tools='box_select, reset')
# Add circle glyphs to the figure p with the selected and non-selected properties
p.circle('Year', 'Time', source=source, selection_color='red', nonselection_alpha=0.1)
# Specify the name of the output file and show the result
# output_file('selection_glyph.html')
show(p)
Now let's practice using and customizing the hover tool.
In this exercise, you're going to plot the blood glucose levels for an unknown patient. The blood glucose levels were recorded every 5 minutes on October 7th starting at 3 minutes past midnight.
The date and time of each measurement are provided to you as x and the blood glucose levels in mg/dL are provided as y.
A bokeh figure is also provided in the workspace as p.
Your job is to add a circle glyph that will appear red when the mouse is hovered near the data points. You will also add a customized hover tool object to the plot.
When you're done, play around with the hover tool you just created! Notice how the points where your mouse hovers over turn red.
Instructions
Import HoverTool from bokeh.models.
Add a circle glyph to the figure p for x and y with a size of 10, fill_color of 'grey', alpha of 0.1, line_color of None, hover_fill_color of 'firebrick', hover_alpha of 0.5, and hover_line_color of 'white'.
Use the HoverTool() function to create a HoverTool called hover with tooltips=None and mode='vline'.
Add the hover to the figure p using the p.add_tools() function.
gluc.head()
# Add circle glyphs to figure p
p = figure(plot_height=400, plot_width=800, x_axis_type="datetime", x_axis_label='Date', y_axis_label='Glucose Level', tools=[])
p.circle(gluc.index, gluc.glucose, size=10,
         fill_color='grey', alpha=0.1, line_color=None,
         hover_fill_color='purple', hover_alpha=0.5,
         hover_line_color='white')
# Create a HoverTool: hover
hover = HoverTool(tooltips=None, mode='vline')
# Add the hover tool to the figure p
p.add_tools(hover)
# Specify the name of the output file and show the result
# output_file('hover_glyph.html')
show(p)
The final glyph customization we'll practice is using the CategoricalColorMapper to color each glyph by a categorical property.
Here, you're going to use the automobile dataset to plot miles-per-gallon vs weight and color each circle glyph by the region where the automobile was manufactured.
The origin column will be used in the ColorMapper to color automobiles manufactured in the US as blue, Europe as red and Asia as green.
The automobile data set is provided to you as a Pandas DataFrame called df. The figure is provided for you as p.
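One way to picture the mapping is as a lookup table in which factors and palette are paired by position. The sketch below is a rough mental model in plain Python, not Bokeh's implementation, and the sample origin values are made up.

```python
# Factors and palette are paired by position: first factor -> first color
factors = ['Europe', 'Asia', 'US']
palette = ['red', 'green', 'blue']
lookup = dict(zip(factors, palette))

# Hypothetical values from the 'origin' column
origins = ['US', 'Asia', 'US', 'Europe']
colors = [lookup[o] for o in origins]
print(colors)  # ['blue', 'green', 'blue', 'red']
```

This is why the order of the two lists matters: swapping entries in one list without the other changes which region gets which color.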
Instructions
Import CategoricalColorMapper from bokeh.models.
Convert the DataFrame df to a ColumnDataSource called source. This has already been done for you.
Make a CategoricalColorMapper object called color_mapper with the CategoricalColorMapper() function. It has two parameters here: factors and palette.
Add a circle glyph to the figure p to plot 'mpg' (on the y-axis) vs 'weight' (on the x-axis). Remember to pass in source and 'origin' as arguments to source and legend. For the color parameter, use dict(field='origin', transform=color_mapper).
auto.head()
# Convert df to a ColumnDataSource: source
source = ColumnDataSource(auto)
# Make a CategoricalColorMapper object: color_mapper
color_mapper = CategoricalColorMapper(factors=['Europe', 'Asia', 'US'],
                                      palette=['red', 'green', 'blue'])
# Add a circle glyph to the figure p
p = figure(plot_height=400, x_axis_label='Vehicle Weight', y_axis_label='MPG')
p.circle('weight', 'mpg', source=source, color=dict(field='origin', transform=color_mapper), legend_field='origin')
# Specify the name of the output file and show the result
# output_file('colormap.html')
show(p)
Learn how to combine multiple Bokeh plots into different kinds of layouts on a page, how to easily link different plots together, and how to add annotations such as legends and hover tooltips.
Arranging multiple plots
Rows of plots
source = ColumnDataSource(iris_df)
# Flowers is a Pandas DataFrame
p1 = figure(plot_width=300, plot_height=300, toolbar_location=None, title='petal length vs. sepal length')
p1.circle('petal_length', 'sepal_length', source=source, color='blue')
p2 = figure(plot_width=300, plot_height=300, toolbar_location=None, title='petal length vs. sepal width')
p2.circle('petal_length', 'sepal_width', source=source, color='green')
p3 = figure(plot_width=300, plot_height=300, toolbar_location=None, title='petal length vs. petal width')
p3.circle('petal_length', 'petal_width', source=source, color='red')
layout = row(p1, p2, p3)
# output_file('row.html')
show(layout)
Columns of plots
layout = column(p1, p2, p3)
# output_file('column.html')
show(layout)
Nested Layouts
layout = row(column(p1, p2), p3)
# output_file('nested.html')
show(layout)
Layouts are collections of Bokeh figure objects.
In this exercise, you're going to create two plots from the Literacy and Birth Rate data set to plot fertility vs female literacy and population vs female literacy.
By using the row() method, you'll create a single layout of the two figures.
Remember, as in the previous chapter, once you have created your figures, you can interact with them in various ways.
In this exercise, you may have to scroll sideways to view both figures in the row layout. Alternatively, you can view the figures in a new window by clicking on the expand icon to the right of the "Bokeh plot" tab.
Instructions
Import row from the bokeh.layouts module.
Create a new figure p1 using the figure() function and specifying the two parameters x_axis_label and y_axis_label.
Add a circle glyph to p1. The x-axis data is 'fertility' and y-axis data is 'female_literacy'. Be sure to also specify source=source.
Create a new figure p2 using the figure() function and specifying the two parameters x_axis_label and y_axis_label.
Add a circle() glyph to p2, specifying the x and y parameters.
Put p1 and p2 into a horizontal layout using row().
lit.head()
source = ColumnDataSource(lit)
# Create the first figure: p1
p1 = figure(plot_height=300, x_axis_label='fertility (children per woman)', y_axis_label='female literacy (% population)')
# Add a circle glyph to p1
p1.circle('fertility', 'female literacy', source=source)
# Create the second figure: p2
p2 = figure(plot_height=300, x_axis_label='population', y_axis_label='female literacy (% population)')
# Add a circle glyph to p2
p2.circle('population', 'female literacy', source=source)
# Put p1 and p2 into a horizontal row: layout
layout = row(p1, p2)
# Specify the name of the output_file and show the result
# output_file('fert_row.html')
show(layout)
In this exercise, you're going to use the column() function to create a single column layout of the two plots you created in the previous exercise.
Figure p1 has been created for you.
In this exercise and the ones to follow, you may have to scroll down to view the lower portion of the figure.
Instructions
Import column from the bokeh.layouts module.
The figure p1 has been created for you. Create a new figure p2 with an x-axis label of 'population' and y-axis label of 'female_literacy (% population)'.
Add a circle glyph to the figure p2.
Put p1 and p2 into a vertical layout using column().
# Create a blank figure: p1
p1 = figure(plot_height=300, x_axis_label='fertility (children per woman)', y_axis_label='female literacy (% population)')
# Add circle scatter to the figure p1
p1.circle('fertility', 'female literacy', source=source)
# Create a new blank figure: p2
p2 = figure(plot_height=300, x_axis_label='population', y_axis_label='female literacy (% population)')
# Add circle scatter to the figure p2
p2.circle('population', 'female literacy', source=source)
# Put plots p1 and p2 in a column: layout
layout = column(p1, p2)
# Specify the name of the output_file and show the result
# output_file('fert_column.html')
show(layout)
You can create nested layouts of plots by combining row and column layouts. In this exercise, you'll make a 3-plot layout in two rows using the auto-mpg data set. Three plots have been created for you of average mpg vs year (avg_mpg), mpg vs hp (mpg_hp), and mpg vs weight (mpg_weight).
Your job is to use the row() and column() functions to make a two-row layout where the first row will have only the average mpg vs year plot and the second row will have mpg vs hp and mpg vs weight plots as columns.
By using the sizing_mode argument, you can scale the widths to fill the whole figure.
Instructions
Import row and column from bokeh.layouts.
Create a row layout called row2 with the figures mpg_hp and mpg_weight in a list and set sizing_mode='scale_width'.
Create a column layout called layout with the figure avg_mpg and the row layout row2 in a list and set sizing_mode='scale_width'.
auto.head()
avg_mpg_df = pd.DataFrame(auto.groupby('yr')['mpg'].mean()).reset_index()
avg_mpg_source = ColumnDataSource(avg_mpg_df)
auto_source = ColumnDataSource(auto)
avg_mpg = figure(plot_height=150, x_axis_label='year', y_axis_label='mean mpg')
avg_mpg.line('yr', 'mpg', source=avg_mpg_source)
mpg_hp = figure(plot_height=300, x_axis_label='hp', y_axis_label='mpg')
mpg_hp.circle('hp', 'mpg', source=auto_source)
mpg_weight = figure(plot_height=300, x_axis_label='weight', y_axis_label='mpg')
mpg_weight.circle('weight', 'mpg', source=auto_source)
# Make a row layout that will be used as the second row: row2
row2 = row([mpg_hp, mpg_weight], sizing_mode='scale_width')
# Make a column layout that includes the above row layout: layout
layout = column([avg_mpg, row2], sizing_mode='scale_width')
# Specify the name of the output_file and show the result
# output_file('layout_custom.html')
show(layout)
Gridplots
gridplot() accepts toolbar_location, which can be set to 'above', 'below', 'left', or 'right'.
source = ColumnDataSource(iris_df)
# Flowers is a Pandas DataFrame
p1 = figure(plot_width=300, plot_height=300, toolbar_location=None, title='petal length vs. sepal length')
p1.circle(flowers['petal_length'], flowers['sepal_length'], color='blue')
p2 = figure(plot_width=300, plot_height=300, toolbar_location=None, title='petal length vs. sepal width')
p2.circle(flowers['petal_length'], flowers['sepal_width'], color='green')
p3 = figure(plot_width=300, plot_height=300, toolbar_location=None, title='petal length vs. petal width')
p3.circle(flowers['petal_length'], flowers['petal_width'], color='red')
layout = gridplot([[None, p1], [p2, p3]], toolbar_location=None)
# output_file('nested.html')
show(layout)
Tabbed Layouts
# from bokeh.models.widgets import Tabs, Panel # done at the top of the notebook
first = Panel(child=row(p1, p2), title='first')
second = Panel(child=row(p3), title='second')
# Put the Panels in a Tabs object
tabs = Tabs(tabs=[first, second])
# output_file('tabbed.html')
show(tabs)
Bokeh layouts allow for positioning items visually in the page presented to the user. What kinds of objects can be put into Bokeh layouts?
Regular grids of Bokeh plots can be generated with gridplot.
In this example, you're going to display four plots of fertility vs female literacy for four regions: Latin America, Africa, Asia and Europe.
Your job is to create a list-of-lists for the four Bokeh plots that have been provided to you as p1, p2, p3 and p4. The list-of-lists defines the row and column placement of each plot.
Instructions
Import gridplot from the bokeh.layouts module.
Create a list called row1 containing plots p1 and p2.
Create a list called row2 containing plots p3 and p4.
Create a gridplot using row1 and row2. You will have to pass in row1 and row2 in the form of a list.
lit_la = lit[lit.Continent == 'LAT']
lit_af = lit[lit.Continent == 'AF']
lit_as = lit[lit.Continent == 'ASI']
lit_eu = lit[lit.Continent == 'EUR']
# from bokeh.layouts import gridplot ## done at the top of the notebook
p1 = figure(plot_height=300, plot_width=300, x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)', title='Latin America')
p1.circle(lit_la['fertility'], lit_la['female literacy'])
p2 = figure(plot_height=300, plot_width=300, x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)', title='Africa')
p2.circle(lit_af['fertility'], lit_af['female literacy'])
p3 = figure(plot_height=300, plot_width=300, x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)', title='Asia')
p3.circle(lit_as['fertility'], lit_as['female literacy'])
p4 = figure(plot_height=300, plot_width=300, x_axis_label='fertility (children per woman)', y_axis_label='female_literacy (% population)', title='Europe')
p4.circle(lit_eu['fertility'], lit_eu['female literacy'])
# Create a list containing plots p1 and p2: row1
row1 = [p1, p2]
# Create a list containing plots p3 and p4: row2
row2 = [p3, p4]
# Create a gridplot using row1 and row2: layout
layout = gridplot([row1, row2])
# Specify the name of the output_file and show the result
# output_file('grid.html')
show(layout)
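The list-of-lists passed to gridplot above can also be built programmatically when there are more than a handful of plots. Below is a minimal sketch, assuming a hypothetical helper named to_grid (not part of Bokeh) that chunks a flat list into rows and pads the last row with None, which gridplot treats as an empty cell. Strings stand in for the figures so the snippet runs without Bokeh.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar('T')


def to_grid(items: Sequence[T], ncols: int) -> List[List[Optional[T]]]:
    """Chunk a flat sequence into rows of length ncols, padding with None.

    Hypothetical helper for illustration; it is not a Bokeh function.
    """
    if ncols < 1:
        raise ValueError('ncols must be at least 1')
    rows = [list(items[i:i + ncols]) for i in range(0, len(items), ncols)]
    # Pad a short final row so every row has the same number of cells
    if rows and len(rows[-1]) < ncols:
        rows[-1] += [None] * (ncols - len(rows[-1]))
    return rows


# Placeholder strings stand in for the four figures p1..p4
grid = to_grid(['p1', 'p2', 'p3', 'p4'], ncols=2)
print(grid)  # [['p1', 'p2'], ['p3', 'p4']]

# Three items in two columns leave one empty cell in the last row
ragged = to_grid(['p1', 'p2', 'p3'], ncols=2)
```

With real figures, the result could be passed straight to gridplot, for example gridplot(to_grid([p1, p2, p3, p4], ncols=2)). gridplot also accepts a flat list together with an ncols argument, which may be simpler when no custom padding is needed.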
Tabbed layouts can be created in Bokeh by placing plots or layouts in Panels.
In this exercise, you'll take the four fertility vs female literacy plots from the last exercise and make a Panel() for each.
No figure will be generated in this exercise. Instead, you will use these panels in the next exercise to build and display a tabbed layout.
Instructions
Import Panel from bokeh.models.widgets.
Create a new panel tab1 with child p1 and a title of 'Latin America'.
Create a new panel tab2 with child p2 and a title of 'Africa'.
Create a new panel tab3 with child p3 and a title of 'Asia'.
Create a new panel tab4 with child p4 and a title of 'Europe'.
# Import Panel from bokeh.models.widgets
# from bokeh.models.widgets import Panel # done at the top of the notebook
# Create tab1 from plot p1: tab1
tab1 = Panel(child=p1, title='Latin America')
# Create tab2 from plot p2: tab2
tab2 = Panel(child=p2, title='Africa')
# Create tab3 from plot p3: tab3
tab3 = Panel(child=p3, title='Asia')
# Create tab4 from plot p4: tab4
tab4 = Panel(child=p4, title='Europe')
Tabbed layouts are collections of Panel objects. Using the figures and Panels from the previous two exercises, you'll create a tabbed layout to change the region in the fertility vs female literacy plots.
Your job is to create the layout using Tabs() and assign the tabs keyword argument to your list of Panels. The Panels have been created for you as tab1, tab2, tab3 and tab4.
After you've displayed the figure, explore the tabs you just added! The "Pan", "Box Zoom" and "Wheel Zoom" tools are also all available as before.
Instructions
Import Tabs from bokeh.models.widgets.
Create a Tabs layout called layout with tab1, tab2, tab3, and tab4.
# Import Tabs from bokeh.models.widgets
# from bokeh.models.widgets import Tabs # done at the top of the notebook
# Create a Tabs layout: layout
layout = Tabs(tabs=[tab1, tab2, tab3, tab4])
# Specify the name of the output_file and show the result
# output_file('tabs.html')
show(layout)
Linking axes
source = ColumnDataSource(iris_df)
# Flowers is a Pandas DataFrame
plot1 = figure(plot_width=300, plot_height=300, toolbar_location=None, title='petal length vs. sepal length')
plot1.circle(flowers['petal_length'], flowers['sepal_length'], color='blue')
plot2 = figure(plot_width=300, plot_height=300, toolbar_location=None, title='petal length vs. sepal width')
plot2.circle(flowers['petal_length'], flowers['sepal_width'], color='green')
plot3 = figure(plot_width=300, plot_height=300, toolbar_location=None, title='petal length vs. petal width')
plot3.circle(flowers['petal_length'], flowers['petal_width'], color='red')
plot3.x_range = plot2.x_range = plot1.x_range
plot3.y_range = plot2.y_range = plot1.y_range
layout = row(plot1, plot2, plot3)
show(layout)
Linking selections
source = ColumnDataSource(iris_df)
tool_list = ['lasso_select', 'tap', 'reset', 'save']
plot1 = figure(title='petal length vs. sepal length', plot_height=300, plot_width=300, tools=tool_list)
plot1.circle('petal_length', 'sepal_length', color='blue', source=source)
plot2 = figure(title='petal length vs. sepal width', plot_height=300, plot_width=300, tools=tool_list)
plot2.circle('petal_length', 'sepal_width', color='green', source=source)
plot3 = figure(title='petal length vs. petal width', plot_height=300, plot_width=300, tools=tool_list)
plot3.circle('petal_length', 'petal_width', line_color='red', fill_color=None, source=source)
plot3.x_range = plot2.x_range = plot1.x_range
plot3.y_range = plot2.y_range = plot1.y_range
layout = row(plot1, plot2, plot3)
show(layout)
Linking axes between plots is achieved by sharing range objects.
In this exercise, you'll link four plots of female literacy vs fertility so that when one plot is zoomed or dragged, one or more of the other plots will respond.
The four plots p1, p2, p3 and p4 along with the layout that you created in the last section have been provided for you.
Your job is to link p1 with the three other plots by assignment of the .x_range and .y_range attributes.
After you have linked the axes, explore the plots by clicking and dragging along the x or y axes of any of the plots, and notice how the linked plots change together.
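Assignment is enough to link axes because, after p2.x_range = p1.x_range, both plots hold a reference to the same range object, so any pan or zoom that updates that object is seen by both. The following is a toy model only, using hypothetical ToyRange and ToyPlot classes rather than Bokeh's own, to illustrate the shared-reference idea.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRange:
    """Hypothetical stand-in for a plot range (not a Bokeh class)."""
    start: float = 0.0
    end: float = 1.0


@dataclass
class ToyPlot:
    """Hypothetical stand-in for a figure; each plot gets its own ranges by default."""
    x_range: ToyRange = field(default_factory=ToyRange)
    y_range: ToyRange = field(default_factory=ToyRange)


a, b, c = ToyPlot(), ToyPlot(), ToyPlot()

# Link b to a on the x axis only; c stays independent
b.x_range = a.x_range

# Simulate a zoom on plot a by mutating its x range in place
a.x_range.start, a.x_range.end = 2.0, 5.0

print(b.x_range is a.x_range)  # True: one shared object
print(b.x_range.end)           # 5.0: b follows a
print(c.x_range.end)           # 1.0: c was never linked
```

This is also why linking only x_range, as the exercise does for p3, leaves the y axis of that plot free to move independently.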
Instructions
Link the x_range of p2 to p1.
Link the y_range of p2 to p1.
Link the x_range of p3 to p1.
Link the y_range of p4 to p1.
This uses the plots (p1, p2, p3, and p4) from 2.2.2 Creating Gridded Layouts.
# Link the x_range of p2 to p1: p2.x_range
p2.x_range = p1.x_range
# Link the y_range of p2 to p1: p2.y_range
p2.y_range = p1.y_range
# Link the x_range of p3 to p1: p3.x_range
p3.x_range = p1.x_range
# Link the y_range of p4 to p1: p4.y_range
p4.y_range =