ML 101: 8 Heatmaps in Python (Full Code)

Heatmaps are great for quickly visualizing data that normally isn’t easy to take in at a glance.

However, it sometimes feels impossible to find a coding resource that shows you how to code up these heatmaps in Python, and what they’ll look like when you’re done.

After this post, you’ll have both.

Heatmaps in this post:

  • bqplot
  • ggplot
  • Lightning Viz
  • Cufflinks
  • MissingNo
  • Matplotlib
  • Seaborn
  • Plotly


Example Data For Each Heatmap

For this tutorial, we’ll use the same data frame, built below, for every heatmap.

That way, you can compare both the visuals and the code side by side for your own project!

import pandas as pd
import numpy as np

# example data
#https://www.kaggle.com/datasets/uciml/red-wine-quality-cortez-et-al-2009
df = pd.read_csv('winequality-red.csv')

# drop the first 5 columns and keep the rest
df = df[df.columns[5:]]

# we will heatmap the correlations
corr = df.corr()

corr

dataset correlation
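
If the Kaggle file isn’t on hand, a random stand-in frame is enough to try the snippets in this post. This is only a sketch: the `make_demo_frame` helper and the `feat_*` column names are made up for illustration and have nothing to do with the wine dataset.

```python
import numpy as np
import pandas as pd


def make_demo_frame(n_rows=200, seed=0):
    """Build a small numeric frame with some correlated columns (hypothetical data)."""
    rng = np.random.default_rng(seed)
    base = rng.normal(size=n_rows)
    return pd.DataFrame({
        "feat_a": base,
        "feat_b": base * 0.8 + rng.normal(scale=0.5, size=n_rows),   # tracks feat_a
        "feat_c": -base * 0.6 + rng.normal(scale=0.7, size=n_rows),  # moves against feat_a
        "feat_d": rng.normal(size=n_rows),                           # independent noise
    })


demo_df = make_demo_frame()
demo_corr = demo_df.corr()  # square correlation matrix, like `corr` above
```

Anywhere the post passes `corr` to a plotting call, `demo_corr` should work as a drop-in replacement.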


How To Code A Heatmap In Seaborn

A standard in data science, Seaborn has one of the easiest-to-implement heatmaps.

This package is built on top of matplotlib and is one of my favorite packages for plotting distributions.

While I think this package struggles with customization, getting a heatmap out in one line of code quickly is extremely attractive.

Read more about Seaborn here or the initial paper here.

import seaborn as sns

sns.heatmap(corr)

seaborn heatmap
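
The one-liner can be dressed up a little without much effort. The sketch below is one possible styling, not the only one: it annotates each cell, fixes the color range to [-1, 1], and hides the mirrored upper triangle. The `demo_corr` matrix is made-up stand-in data so the snippet runs on its own.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# stand-in correlation matrix (hypothetical columns a-d)
rng = np.random.default_rng(0)
demo_corr = pd.DataFrame(rng.normal(size=(100, 4)), columns=list("abcd")).corr()

# hide the upper triangle, which only mirrors the lower one
# (k=1 keeps the diagonal visible)
mask = np.triu(np.ones(demo_corr.shape, dtype=bool), k=1)

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(
    demo_corr,
    mask=mask,
    annot=True,       # print the value in each visible cell
    fmt=".2f",
    cmap="coolwarm",  # a diverging palette suits values in [-1, 1]
    vmin=-1,
    vmax=1,
    square=True,
    ax=ax,
)
ax.set_title("Correlation heatmap")
```

Swap `demo_corr` for `corr` to apply the same styling to the wine data.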


How To Code A Heatmap In Plotly

Plotly is an interactive graphing library that has been on the rise for some time.

With mountains of examples and a community hangout where users exchange questions and code – you can’t really go wrong with Plotly.

Personally, I think their simple implementation of a heatmap is one of the best looking and is one I use when I don’t need special customization.

Read more about Plotly here.

import plotly.express as px

fig = px.imshow(corr)
fig.show()

plotly heatmap


How To Code A Heatmap In Matplotlib

The king of visualization in Python.

Matplotlib is one of the oldest and most stable visualization libraries in Python.

My gripe with Matplotlib is that most of its out-of-the-box graphs are a little ugly, though they can be made to look better with some customization.

Several of this list’s visualization libraries (Seaborn, plotnine, and MissingNo) are built on top of Matplotlib, leveraging its strong code base.

Read more about matplotlib here.

import matplotlib.pyplot as plt

plt.imshow(corr, cmap="viridis", interpolation="nearest")
plt.colorbar()
plt.show()

matplotlib heatmap

import matplotlib.pyplot as plt

plt.imshow(corr, cmap="hot", interpolation="nearest")
plt.colorbar()
plt.show()

matplotlib heatmap2
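
As an example of the customization mentioned above: `imshow` only sees a bare array, so the column names never reach the axes. The sketch below adds tick labels and per-cell values by hand. It is one way to do it, and `demo_corr` is made-up stand-in data.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# stand-in correlation matrix (hypothetical columns a-d)
rng = np.random.default_rng(0)
demo_corr = pd.DataFrame(rng.normal(size=(100, 4)), columns=list("abcd")).corr()

fig, ax = plt.subplots(figsize=(5, 4))
im = ax.imshow(demo_corr.to_numpy(), cmap="viridis", vmin=-1, vmax=1)
fig.colorbar(im, ax=ax)

# put the column names on the ticks
positions = np.arange(len(demo_corr.columns))
ax.set_xticks(positions)
ax.set_xticklabels(demo_corr.columns, rotation=45, ha="right")
ax.set_yticks(positions)
ax.set_yticklabels(demo_corr.index)

# write each value in the middle of its cell (x is the column, y is the row)
for i in positions:
    for j in positions:
        ax.text(j, i, f"{demo_corr.iat[i, j]:.2f}",
                ha="center", va="center", color="white")

fig.tight_layout()
```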


How To Code A Heatmap In ggplot

ggplot, a very well-known plotting package in R, is now popping up in Python.

The code below uses plotnine, a Python implementation of ggplot’s grammar of graphics.

While ggplot dominates in R, it simply hasn’t reached the same level of adoption in Python.

One of my favorite uses of ggplot is plotting text, which can be done in one line of code.

Read more about ggplot here.

from plotnine import ggplot, aes, geom_tile, geom_text
import numpy as np

# melt the correlation matrix into long format:
# one row per cell, with columns 'variable' and 'value'
melted_corr = corr.melt()

# melt() stacks the columns one after another, so the row labels
# repeat in the same order within each block
a = np.array([col for col in corr])
melted_corr = melted_corr\
                .assign(variable2=a[np.arange(len(melted_corr)) % len(a)])

# round once so the text labels stay short
melted_corr['label'] = melted_corr['value'].round(3)

# plot, we can add whatever layers we want
# I added tiles and text labels, for example
(
    ggplot(melted_corr, aes(x='variable', y='variable2', fill='value'))
    + geom_tile()
    + geom_text(aes(label='label'))
)

ggplot heatmap python
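
The reshaping step is the least obvious part of the ggplot version, so here it is on its own. This sketch reaches the same long format in a single pandas chain by naming the index before melting, rather than repeating the labels by position. It is an alternative to the code above, and `demo_corr` is made-up stand-in data.

```python
import numpy as np
import pandas as pd

# stand-in correlation matrix (hypothetical columns a-c)
rng = np.random.default_rng(0)
demo_corr = pd.DataFrame(rng.normal(size=(100, 3)), columns=["a", "b", "c"]).corr()

long_corr = (
    demo_corr
    .rename_axis(index="variable2")  # name the index so it becomes a column
    .reset_index()
    .melt(id_vars="variable2", var_name="variable", value_name="value")
)

# one row per (variable2, variable) pair, i.e. one row per heatmap tile
print(long_corr.head())
```

Each row of `long_corr` maps directly onto one tile: `variable` on the x-axis, `variable2` on the y-axis, and `value` as the fill.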


How To Code A Heatmap In bqplot

bqplot is an interactive visualization tool for the Jupyter environment that lets you deliver customer-ready visualizations with minimal code.

While the visual aspects of bqplot can mostly be found in Plotly, the ability to have interactive charts right in your Jupyter environment is a huge plus.

Since I do a lot of coding in Jupyter notebooks, I’ll leverage bqplot if I know I will have a stakeholder viewing my visualizations.

That way, they’ll be able to interact with the charts.

You can read more about bqplot here.

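# use a separate alias so this does not shadow matplotlib's plt
from bqplot import pyplot as bqplt
from ipywidgets import Layout

# square figure so the cells are not stretched
fig = bqplt.figure(
    title="HeatMap",
    layout=Layout(width="425px", height="425px"),
    min_aspect_ratio=1,
    max_aspect_ratio=1,
    padding_y=0,
)

# put the color bar vertically on the right-hand side
axes_options = {'color': {'orientation': "vertical", "side": "right"}}

heatmap = bqplt.heatmap(color=corr.values, axes_options=axes_options)

bqplt.show()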

heatmap in bqplot


How To Code A Heatmap In Missingno

While all the packages above focus on visualizing your data, we need one to visualize what isn’t there.

MissingNo is a simple toolset for quickly visualizing missing data. Its heatmap shows nullity correlation: how strongly the presence or absence of values in one column tracks that of another.

This is a no-brainer in situations where you’re lacking data and wondering about the impact it will have on your analysis.

Read more about MissingNo here.

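import missingno as msno
import numpy as np
import random
%matplotlib inline

# work on a float copy: integer columns cannot hold NaN,
# and this leaves the original df untouched
df_missing = df.astype(float)

# every (row, column) position in the frame
ix = [(row, col) for row in range(df_missing.shape[0])
                 for col in range(df_missing.shape[1])]

# create nulls for 20% of the data
for row, col in random.sample(ix, int(round(.2 * len(ix)))):
    df_missing.iat[row, col] = np.nan

# show our heatmap
msno.heatmap(df_missing)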

heatmap in missingno
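
To get a feel for what the MissingNo heatmap measures, the sketch below computes a nullity correlation by hand with plain pandas. As far as I understand it, this is roughly the quantity `msno.heatmap` plots, but treat it as an illustration rather than the library’s exact implementation. The `demo` frame is made up.

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan, 5.0, 6.0],
    "b": [1.0, np.nan, 3.0, np.nan, 5.0, 6.0],         # missing exactly where "a" is
    "c": [np.nan, 2.0, np.nan, 4.0, np.nan, np.nan],   # present only where "a" is missing
})

# 1 where a value is missing, 0 where it is present, then correlate the columns
nullity_corr = demo.isnull().astype(int).corr()
print(nullity_corr.round(2))
```

A value near 1 means two columns tend to be missing together, and a value near -1 means one is present when the other is missing. Because the nulls in the example above are scattered at random, its heatmap should show values close to 0.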


How To Code A Heatmap in Cufflinks

Cufflinks is great for the average data scientist.

This package is built on top of plotly and pandas, which seems to work perfectly for data scientists, as most of the data is in dataframes.

It’s one of my personal favorite packages, and it really starts to show its strength when it comes to “stacking” charts together.

While this package packs a lot of power, it doesn’t seem to be actively maintained, as no commits seem to have been merged within the last three years.

It’s still a powerful tool you should know; read more about it here.

import cufflinks as cf

cf.go_offline()  # render charts locally instead of through the online service
# keep the config consistent with offline mode
cf.set_config_file(offline=True, world_readable=True)

corr.iplot(kind='heatmap', colorscale='rdpu')

# some other colors that can be used for cufflinks heatmap
# dark2, dflt, ggplot, gnbu
# greens, greys, oranges
# original, orrd, paired
# pastel1, pastel2, piyg
# plotly, polar, prgn,
# pubu, pubugn, puor,
# purd, purples, rdbu
# rdgy, rdpu, rdylbu,
# rdylgn, reds, set1
# set2, set3, spectral
# ylgn, ylgnbu,
# ylorbr, ylorrd

heatmap in cufflinks


How To Code A Heatmap in Lightning Viz

Finally, a Python-based graphing solution for your web apps.

Lightning provides API-based access to all your apps and is supported in Python, JavaScript, Scala, and R.

I worry about whether this project is still being maintained, as their SSL certificate has gone missing, and they’ve shut down their test server.

Anyway, if you’re willing to start up your own server, the code still exists and can be utilized.

Read more about Lightning here.

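# Lightning Viz client (distributed as the lightning-python package)
from lightning import Lightning

# connect to your Lightning server with notebook output enabled
lgn = Lightning(ipython=True)
lgn.matrix(corr, colormap='BuPu',
           row_labels=list(corr.index.values),
           column_labels=list(corr.columns.values),
           width=500,
           description="HeatMap")

lgn.open()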


Recapping Heatmaps In Python:

We’ve learned how to create heatmaps with the following eight Python packages:

  • Lightning Viz
  • Cufflinks
  • MissingNo
  • bqplot
  • ggplot
  • Matplotlib
  • Seaborn
  • Plotly

We’ve also learned when to use each one, some positives and negatives of each package, and the full Python code to implement them on your own.

Now that you can visualize your data, get out there and make a heatmap!


Stewart Kaplan