A small overview of HoloViz capabilities for data exploration#

This notebook gives a small overview of HoloViz and its capabilities for data exploration with interactive plots, and shows the difference between static Matplotlib plots and interactive Bokeh plots. Many parts are based on or copied from the official HoloViz Tutorial, which is highly recommended for a more extensive overview of what HoloViz can do.

Note: In June 2019 the project name changed from PyViz to HoloViz. The reason for this is explained in this blog post.

HoloViz Packages used for this notebook#


Exploring Pandas Dataframes#

If your data is in a pandas DataFrame, it’s natural to explore it using the .plot() method, which is based on Matplotlib. Let’s have a look at some automatic weather station data from Langenferner:

import pandas as pd
url = 'https://cluster.klima.uni-bremen.de/~oggm/tutorials/aws_data_Langenferner_UTC+2.csv'
df = pd.read_csv(url, index_col=0, parse_dates=True)
df.head()
                         TEMP         RH  SWIN  SWOUT        LWIN       LWOUT  WINDSPEED     WINDDIR    PRESSURE
2013-07-13 00:00:00  1.634333  67.595753   0.0    0.0  212.744817  303.656833   4.436833  211.533333  692.622250
2013-07-13 01:00:00  1.388667  68.150512   0.0    0.0  209.781683  302.588717   5.544000  206.166667  692.395683
2013-07-13 02:00:00  1.064500  66.853977   0.0    0.0  207.234933  300.872133   5.573167  210.750000  692.200800
2013-07-13 03:00:00  0.985167  55.827547   0.0    0.0  207.913533  295.684267   3.970167  203.250000  692.163967
2013-07-13 04:00:00  1.155333  43.371014   0.0    0.0  211.513517  292.688400   3.267000  203.366667  692.001667

Just calling .plot() won’t give anything meaningful, because the variables have very different magnitudes:

df.plot();
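The magnitude problem can be illustrated without the plot itself. The following is a minimal sketch, not part of the original notebook: it uses a small synthetic DataFrame named `demo` (made-up values, only loosely resembling the station data) and min-max scaling, which is just one possible way to bring the columns onto a comparable scale.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the station data (all values are made up).
index = pd.date_range("2013-07-13", periods=48, freq="h")
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {
        "TEMP": rng.normal(1.0, 2.0, len(index)),        # single digits
        "RH": rng.uniform(40.0, 100.0, len(index)),      # tens
        "PRESSURE": rng.normal(692.0, 0.5, len(index)),  # hundreds
    },
    index=index,
)

# The column means show how far apart the magnitudes are.
print(demo.mean().round(1))

# Min-max scaling maps every column to the range [0, 1].
scaled = (demo - demo.min()) / (demo.max() - demo.min())
print(scaled.describe().loc[["min", "max"]])
```

Calling `scaled.plot()` would then draw all curves on a comparable axis, at the cost of losing the physical units. A pandas alternative that keeps the units is `demo.plot(subplots=True)`, which gives each column its own axis.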

Of course we can have a look at one variable only:

df.TEMP.plot();

This creates a static plot using Matplotlib. With this approach we can also explore the data further, for example by calculating the monthly mean temperature:

# 'ME' (month end) replaces the deprecated 'm' alias; it requires pandas >= 2.2
dfm = df.resample('ME').mean()
dfm.TEMP.plot();

We can see how the variable evolves over time, but we cannot read off the exact temperature in January, and we cannot zoom in either.
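With a static plot, the exact value has to be looked up in the data instead. The sketch below shows one way to do that, assuming a synthetic hourly series `temp` (made-up values) in place of `df.TEMP`. It uses the `'MS'` (month start) alias, which labels each monthly bin by its first day, so the January value can be selected by a timestamp.

```python
import numpy as np
import pandas as pd

# Synthetic hourly temperatures (made up), standing in for df.TEMP.
index = pd.date_range("2013-07-01", "2014-06-30 23:00", freq="h")
day_of_year = index.dayofyear.to_numpy()
temp = pd.Series(
    5.0 * np.cos(2 * np.pi * (day_of_year - 200) / 365.0),
    index=index,
    name="TEMP",
)

# "MS" labels each monthly bin by its first day (month start).
monthly = temp.resample("MS").mean()

# Select the January 2014 mean by its label.
january = monthly.loc[pd.Timestamp("2014-01-01")]
print(f"Mean temperature in January 2014: {january:.2f}")
```

This works, but it takes an extra lookup for every value of interest, which is the step that hover tools in an interactive plot make unnecessary.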

Exploring Data with hvPlot and Bokeh#

If we use hvPlot instead, we can create interactive plots with the same plotting API:

You might need to install hvPlot first, e.g. via conda install -c pyviz hvplot

import hvplot.pandas

df.TEMP.hvplot()

Now you have an interactive Bokeh plot that supports zooming and shows additional information on hover (the exact values and timestamps). This is also possible for all variables at once, but again the result is not very meaningful:

plot = df.hvplot()
plot