\documentclass{beamer}
\usepackage{framed}
\usepackage{graphicx}
\begin{document}
\section{Plotting pairwise relationships with PairGrid and pairplot()}
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
Plotting pairwise relationships with PairGrid and pairplot()
\begin{itemize}
\item PairGrid also allows you to quickly draw a grid of small subplots using the same plot type to visualize data in each.
\item In a PairGrid, each row and column is assigned to a different variable, so the resulting plot shows each pairwise relationship in the dataset.
\item This style of plot is sometimes called a ``scatterplot matrix'', as this is the most common way to show each relationship, but PairGrid is not limited to scatterplots.
\end{itemize}
\end{frame}
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
It’s important to understand the differences between a FacetGrid and a PairGrid. In the former, each facet shows the same relationship conditioned on different levels of other variables. In the latter, each plot shows a different relationship (although the upper and lower triangles will have mirrored plots). Using PairGrid can give you a very quick, very high-level summary of interesting relationships in your dataset.
\end{frame}
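%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\footnotesize
A minimal sketch of this difference, assuming a small synthetic DataFrame (the columns \texttt{a}, \texttt{b} and \texttt{group} are hypothetical stand-ins, not workshop data). The x-axis labels show what each subplot holds.
\begin{verbatim}
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical data: two measurements plus a grouping column.
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(size=60),
                   "b": rng.normal(size=60),
                   "group": np.repeat(["x", "y", "z"], 20)})

# FacetGrid: the same a-vs-b relationship, one facet per group.
fg = sns.FacetGrid(df, col="group")
fg.map(plt.scatter, "a", "b")
facet_labels = [ax.get_xlabel() for ax in fg.axes.flat]

# PairGrid: a different pair of variables in each subplot.
pg = sns.PairGrid(df, vars=["a", "b"])
pg.map(plt.scatter)
pair_labels = [ax.get_xlabel() for ax in pg.axes[-1]]

print(facet_labels, pair_labels)
\end{verbatim}
\end{frame}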
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
The basic usage of the class is very similar to FacetGrid. First you initialize the grid, then you pass a plotting function to a map method and it will be called on each subplot. There is also a companion function, pairplot(), that trades off some flexibility for faster plotting.
\end{frame}
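%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\footnotesize
The function passed to a map method does not have to come from a library. As a sketch, the hypothetical helper \texttt{scatter\_with\_r} below draws a scatterplot and annotates it with the correlation coefficient. The data and column names are made up for illustration.
\begin{verbatim}
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["u", "v", "w"])

calls = []  # one entry per subplot the function is called on

def scatter_with_r(x, y, **kwargs):
    """Hypothetical helper: scatterplot annotated with Pearson's r."""
    r = np.corrcoef(x, y)[0, 1]
    ax = plt.gca()  # assumes the grid made the target subplot current
    ax.scatter(x, y, **kwargs)
    ax.annotate(f"r = {r:.2f}", xy=(0.05, 0.9),
                xycoords="axes fraction")
    calls.append(r)

g = sns.PairGrid(df)
g.map(scatter_with_r)
print(len(calls))  # one call per subplot of the 3 x 3 grid
\end{verbatim}
\end{frame}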
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
\begin{framed}
\begin{verbatim}
import matplotlib.pyplot as plt
import seaborn as sns

# Load the example iris data (fetched from seaborn's online datasets)
iris = sns.load_dataset("iris")
g = sns.PairGrid(iris)
g.map(plt.scatter);
\end{verbatim}
\end{framed}
\end{frame}
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
It’s possible to plot a different function on the diagonal to show the univariate distribution of the variable in each column. Note that the axis ticks won’t correspond to the count or density axis of this plot, though.
\begin{verbatim}
g = sns.PairGrid(iris)
g.map_diag(plt.hist)
g.map_offdiag(plt.scatter);
\end{verbatim}
\end{frame}
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
A very common way to use this plot is to color the observations by a separate categorical variable. For example, the iris dataset has four measurements for each of three different species of iris flowers, so you can see how they differ.
\begin{verbatim}
g = sns.PairGrid(iris, hue="species")
g.map_diag(plt.hist)
g.map_offdiag(plt.scatter)
g.add_legend();
\end{verbatim}
By default every numeric column in the dataset is used, but you can focus on particular relationships if you want.
\end{frame}
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
\begin{verbatim}
g = sns.PairGrid(iris, vars=["sepal_length", "sepal_width"], hue="species")
g.map(plt.scatter);
\end{verbatim}
\end{frame}
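%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\footnotesize
To see how the choice of variables changes the grid, the sketch below compares the default selection with an explicit \texttt{vars} list. It assumes a synthetic DataFrame with three numeric columns and one non-numeric column. All column names are hypothetical.
\begin{verbatim}
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(2)
df = pd.DataFrame(rng.normal(size=(40, 3)), columns=["u", "v", "w"])
df["kind"] = np.tile(["p", "q"], 20)  # non-numeric, so not plotted

g_all = sns.PairGrid(df)                   # every numeric column
g_two = sns.PairGrid(df, vars=["u", "v"])  # only the named columns

print(g_all.axes.shape, g_two.axes.shape)  # (3, 3) (2, 2)
\end{verbatim}
The \texttt{axes} attribute holds one subplot per pair of variables, so its shape follows the number of variables used.
\end{frame}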
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
It’s also possible to use a different function in the upper and lower triangles to emphasize different aspects of the relationship.
\begin{verbatim}
g = sns.PairGrid(iris)
g.map_upper(plt.scatter)
g.map_lower(sns.kdeplot, cmap="Blues_d")
g.map_diag(sns.kdeplot, lw=3, legend=False);
\end{verbatim}
\end{frame}
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
The square grid with identity relationships on the diagonal is actually just a special case, and you can plot with different variables in the rows and columns.
\begin{verbatim}
tips = sns.load_dataset("tips")
# height= replaced the older size= argument in seaborn 0.9
g = sns.PairGrid(tips, y_vars=["tip"],
                 x_vars=["total_bill", "size"], height=4)
g.map(sns.regplot, color=".3")
g.set(ylim=(-1, 11), yticks=[0, 5, 10]);
\end{verbatim}
\end{frame}
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
Of course, the aesthetic attributes are configurable. For instance, you can use a different palette (say, to show an ordering of the hue variable) and pass keyword arguments into the plotting functions.
\begin{verbatim}
g = sns.PairGrid(tips, hue="size", palette="GnBu_d")
g.map(plt.scatter, s=50, edgecolor="white")
g.add_legend();
\end{verbatim}
\end{frame}
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
PairGrid is flexible, but to take a quick look at a dataset, it can be easier to use pairplot(). This function uses scatterplots and histograms by default, although a few other kinds are supported (currently, regression plots on the off-diagonals and KDEs on the diagonal).
\begin{verbatim}
sns.pairplot(iris, hue="species", height=2.5);
\end{verbatim}
\end{frame}
%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\large
\begin{itemize}
\item You can also control the aesthetics of the plot with keyword arguments, and it returns the PairGrid instance for further tweaking.
\end{itemize}
\begin{verbatim}
g = sns.pairplot(iris, hue="species", palette="Set2",
                 diag_kind="kde", height=2.5)
\end{verbatim}
\end{frame}
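%====================================%
\begin{frame}[fragile]
\frametitle{Seaborn Workshop}
\footnotesize
As a sketch of that further tweaking, the example below checks the type of the returned object and then adjusts it with the grid's \texttt{set} method. It assumes synthetic data with hypothetical column names, and a seaborn version that accepts \texttt{height}.
\begin{verbatim}
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(3)
df = pd.DataFrame(rng.normal(size=(40, 2)), columns=["u", "v"])
df["kind"] = np.tile(["p", "q"], 20)

g = sns.pairplot(df, hue="kind", palette="Set2",
                 diag_kind="hist", height=2)
print(isinstance(g, sns.PairGrid))  # True

# Tweak the returned grid: same tick positions on every x axis.
g.set(xticks=[-2, 0, 2])
\end{verbatim}
\end{frame}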
%====================================%
\end{document}