毫升 | Seaborn 中的矩阵图

Seaborn 是Python提供的一个很棒的可视化库。它有几种绘图，通过这些绘图提供了惊人的可视化功能。其中一些包括计数图、散点图、配对图、回归图、矩阵图等等。本文涉及 seaborn 中的矩阵图。
示例 1：热图
热图是一种显示某种矩阵图的方法。要使用热图，数据应采用矩阵形式。通过矩阵，我们的意思是索引名称和列名称必须以某种方式匹配，以便我们在单元格中填充的数据是相关的。让我们看一个例子来更好地理解这一点。
代码： Python程序

Python3

# import the necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt % matplotlib inline
 
# load the tips dataset
dataset = sns.load_dataset('tips')
 
# first five entries of the tips dataset
dataset.head()
 
# correlation between the different parameters
tc = dataset.corr()
 
# plot a heatmap of the correlated data
sns.heatmap(tc)

Python3

# import the necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt % matplotlib inline
 
# load the tips dataset
dataset = sns.load_dataset('tips')
 
# first five entries of the tips dataset
dataset.head()
 
# correlation between the different parameters
tc = dataset.corr()
sns.heatmap(tc, annot = True, cmap ='plasma',
            linecolor ='black', linewidths = 1)

Python3

# import the necessary libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt % matplotlib inline
 
# load the flights dataset
fd = sns.load_dataset('flights')
 
# make a dataframe of the data
df = pd.pivot_table(values ='passengers', index ='month',
                    columns ='year', data = fd)
 
# first five entries of the dataset
df.head()
 
# make a clustermap from the dataset
sns.clustermap(df, cmap ='plasma')

Python3

# import the necessary libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt % matplotlib inline
 
# load the flights dataset
fd = sns.load_dataset('flights')
 
# make a dataframe of the data
df = pd.pivot_table(values ='passengers',
                    index ='month', columns ='year', data = fd)
 
# first five entries of the dataset
df.head()
 
# make a clustermap from the dataset
sns.clustermap(df, cmap ='plasma', standard_scale = 1)

数据集的前五个条目

相关矩阵

相关矩阵的热图
为了更好地使用热图进行可视化，我们可以添加注释、线宽和线颜色等参数。

蟒蛇3

# import the necessary libraries
import seaborn as sns
import matplotlib.pyplot as plt % matplotlib inline
 
# load the tips dataset
dataset = sns.load_dataset('tips')
 
# first five entries of the tips dataset
dataset.head()
 
# correlation between the different parameters
tc = dataset.corr()
sns.heatmap(tc, annot = True, cmap ='plasma',
            linecolor ='black', linewidths = 1)

解释

annot 用于注释属于这些单元格的实际值
cmap 用于您想要的颜色映射，如coolwarm、等离子、岩浆等。
linewidth 用于设置分隔单元格的线的宽度。
linecolor 用于设置分隔单元格的线条的颜色。

这是一个显示这些属性的图。

所以我们可以说热图所做的只是根据梯度为单元格着色，并使用一些参数来增加数据可视化。
示例 2：集群映射
聚类地图使用层次聚类。它根据行和列的相似性执行聚类。

蟒蛇3

# import the necessary libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt % matplotlib inline
 
# load the flights dataset
fd = sns.load_dataset('flights')
 
# make a dataframe of the data
df = pd.pivot_table(values ='passengers', index ='month',
                    columns ='year', data = fd)
 
# first five entries of the dataset
df.head()
 
# make a clustermap from the dataset
sns.clustermap(df, cmap ='plasma')

数据集的前五个条目

使用数据透视表创建的矩阵（前五个条目）

来自给定数据的 Clustermap
我们还可以使用 standard_scale 参数更改颜色条的比例。

蟒蛇3

# import the necessary libraries
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt % matplotlib inline
 
# load the flights dataset
fd = sns.load_dataset('flights')
 
# make a dataframe of the data
df = pd.pivot_table(values ='passengers',
                    index ='month', columns ='year', data = fd)
 
# first five entries of the dataset
df.head()
 
# make a clustermap from the dataset
sns.clustermap(df, cmap ='plasma', standard_scale = 1)

使用标准缩放后的 Clustermap
standard_scale = 1将数据从 0 到 1 范围标准化。我们可以看到月份和年份不再是有序的，因为它们是根据聚类图的相似性进行聚类的。
因此我们可以得出结论，热图将按照我们给出的顺序显示事物，而集群图则根据相似性对数据进行聚类。