Small overview of HoloViz capability of data exploration#
This notebook is intended to present a small overview of HoloViz and the capability for data exploration, with interactive plots (show difference between matplotlib and bokeh). Many parts are based on or copied from the official HoloViz Tutorial (highly recommended for a more extensive overview of the possibilities of HoloViz).
Note: In June 2019 the project name changed from PyViz to HoloViz. The reason for this is explained in this blog post.
HoloViz Packages used for this notebook#
Exploring Pandas Dataframes#
If your data is in a Pandas dataframe, it’s natural to explore it using the .plot()
method (based on Matplotlib). Let’s have a look at some automatic weather station data from Langenferner:
import pandas as pd
url = 'https://cluster.klima.uni-bremen.de/~oggm/tutorials/aws_data_Langenferner_UTC+2.csv'
df = pd.read_csv(url, index_col=0, parse_dates=True)
df.head()
TEMP | RH | SWIN | SWOUT | LWIN | LWOUT | WINDSPEED | WINDDIR | PRESSURE | |
---|---|---|---|---|---|---|---|---|---|
2013-07-13 00:00:00 | 1.634333 | 67.595753 | 0.0 | 0.0 | 212.744817 | 303.656833 | 4.436833 | 211.533333 | 692.622250 |
2013-07-13 01:00:00 | 1.388667 | 68.150512 | 0.0 | 0.0 | 209.781683 | 302.588717 | 5.544000 | 206.166667 | 692.395683 |
2013-07-13 02:00:00 | 1.064500 | 66.853977 | 0.0 | 0.0 | 207.234933 | 300.872133 | 5.573167 | 210.750000 | 692.200800 |
2013-07-13 03:00:00 | 0.985167 | 55.827547 | 0.0 | 0.0 | 207.913533 | 295.684267 | 3.970167 | 203.250000 | 692.163967 |
2013-07-13 04:00:00 | 1.155333 | 43.371014 | 0.0 | 0.0 | 211.513517 | 292.688400 | 3.267000 | 203.366667 | 692.001667 |
Just calling .plot()
won’t give anything meaningful, because of the different magnitudes of the parameters:
df.plot();
Of course we can have a look at one variable only:
df.TEMP.plot();
This creates a static plot using matplotlib. With this approach we also can make some further explorations, like calculating the monthly mean temperature:
dfm = df.resample('m').mean()
dfm.TEMP.plot();
/tmp/ipykernel_15035/2820268743.py:1: FutureWarning: 'm' is deprecated and will be removed in a future version, please use 'ME' instead.
dfm = df.resample('m').mean()
We can see the course of the parameter but we can not tell what was the exact temperature at January and we also cannot zoom in.
Exploring Data with hvPlot and Bokeh#
If we are using hvplot
instead we can create interactive plots with the same plotting API:
you might need to install first hvplot via e.g. conda install -c pyviz hvplot
import hvplot.pandas
df.TEMP.hvplot()
Now you have an interactive plot using bokeh with zooming option and hover with additional information (get the exact values and timestamps), also possible for all variables but again not very meaningful:
plot = df.hvplot()
plot