Data Visualisation : 31.01.2017 -Python / Seaborne

In today's post, I shall take you through a very important and indeed visualising package of the python programming language called the "Seaborne". Let's jump right to it.

Let's start by importing the package.

import seaborn as sns

Then, just so that we get our images in between the code-lines, execute the below by ivoking the matplotlib library

%matplotlib inline

HeatMap

# Load the example flights dataset and conver to long-form
flights_long = sns.load_dataset("flights")
flights = flights_long.pivot("month", "year", "passengers")

# Draw a heatmap with the numeric values in each cell
sns.heatmap(flights, annot=True, fmt="d", linewidths=.5)

Kdeplot

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style="dark")
rs = np.random.RandomState(50)

# Set up the matplotlib figure
f, axes = plt.subplots(3, 3, figsize=(9, 9), sharex=True, sharey=True)

# Rotate the starting point around the cubehelix hue circle
for ax, s in zip(axes.flat, np.linspace(0, 3, 10)):

# Create a cubehelix colormap to use with kdeplot
cmap = sns.cubehelix_palette(start=s, light=1, as_cmap=True)

# Generate and plot a random bivariate dataset
x, y = rs.randn(2, 50)
sns.kdeplot(x, y, cmap=cmap, shade=True, cut=5, ax=ax)
ax.set(xlim=(-3, 3), ylim=(-3, 3))

f.tight_layout()

tsplot

sns.set(style="darkgrid", palette="Set2")

# Create a noisy periodic dataset
sines = []
rs = np.random.RandomState(8)
for _ in range(15):
x = np.linspace(0, 30 / 2, 30)
y = np.sin(x) + rs.normal(0, 1.5) + rs.normal(0, .3, 30)
sines.append(y)

# Plot the average over replicates with bootstrap resamples
sns.tsplot(sines, err_style="boot_traces", n_boot=500)

Swarmplot

import pandas as pd
import seaborn as sns
sns.set(style="whitegrid", palette="muted")

# Load the example iris dataset
iris = sns.load_dataset("iris")

# "Melt" the dataset to "long-form" or "tidy" representation
iris = pd.melt(iris, "species", var_name="measurement")

# Draw a categorical scatterplot to show each observation
sns.swarmplot(x="measurement", y="value", hue="species", data=iris)

Pairgrid

import seaborn as sns
sns.set(style="whitegrid")

# Load the dataset
crashes = sns.load_dataset("car_crashes")

# Make the PairGrid
g = sns.PairGrid(crashes.sort_values("total", ascending=False),
x_vars=crashes.columns[:-3], y_vars=["abbrev"],
size=10, aspect=.25)

# Draw a dot plot using the stripplot function
g.map(sns.stripplot, size=10, orient="h",
palette="Reds_r", edgecolor="gray")

# Use the same x axis limits on all columns and add better labels
g.set(xlim=(0, 25), xlabel="Crashes", ylabel="")

# Use semantically meaningful titles for the columns
titles = ["Total crashes", "Speeding crashes", "Alcohol crashes",
"Not distracted crashes", "No previous crashes"]

for ax, title in zip(g.axes.flat, titles):

# Set a different title for each axes
ax.set(title=title)

# Make the grid horizontal instead of vertical
ax.xaxis.grid(False)
ax.yaxis.grid(True)

sns.despine(left=True, bottom=True)

Based on this, a relatively small sample size, which is it that you like better, [R]'s ggplot2 or Seaborne? Help me find your answers as comments below.

Data Visualisation

Thursday, 2 February 2017

31.01.2017 -Python / Seaborne