Sankey Diagram

Authors

[Editor] Zheng Hu (郑虎);

[Reviewer] .

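Note

Hiplot website

This page is the source-code tutorial for the Hiplot Sankey plugin. You can also draw the same plot on the Hiplot website without writing code. For more information, see the following link: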

https://hiplot.cn/basic/sankey?lang=zh_cn

A Sankey diagram is a flow diagram in which the width of each arrow is proportional to the quantity of flow it represents.

Setup

  • System: Cross-platform (Linux/macOS/Windows)

  • Programming language: R

  • Dependencies: ggalluvial; ggplot2

# Install packages (only if they are not already installed)
if (!requireNamespace("ggalluvial", quietly = TRUE)) {
  install.packages("ggalluvial")
}
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Load packages
library(ggalluvial)
library(ggplot2)

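Data preparation

The input data contains four categorical variables (Class, Sex, Age, Survived) and the frequency (Freq) of each combination of their values.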
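The structure described above (four categorical variables plus a frequency column) matches R's built-in `Titanic` table. If the tutorial's data file is not at hand, a stand-in could be built as in the sketch below. This assumes the file holds the same columns. The name `titanic_df` is only illustrative and does not appear in the original code.

```r
# Stand-in for the tutorial's data file, built from R's built-in Titanic table.
# Assumption: the file has the same columns (Class, Sex, Age, Survived, Freq).
titanic_df <- as.data.frame(Titanic)

# read.delim() returns character columns by default in R >= 4.0,
# so convert the factor columns to character to mimic a file import.
is_fct <- vapply(titanic_df, is.factor, logical(1))
titanic_df[is_fct] <- lapply(titanic_df[is_fct], as.character)

# Inspect the column types and the first values
str(titanic_df)
```

Converting the factors to character matters for the plot colors: character values are ordered alphabetically when mapped to a fill scale, whereas the factor levels in `Titanic` keep their own order.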

# Load the data
data <- read.delim("files/Hiplot/158-sankey-data.txt", header = TRUE)

# Reshape the data
value <- "Freq"               # column holding the flow size
axis <- c("Class", "Sex")     # columns used as the axes of the diagram
usr_axis <- c()
for (i in seq_along(axis)) {
  usr_axis <- c(usr_axis, axis[i])
  assign(paste0("axis", i), axis[i])
}
index_axis <- match(usr_axis, colnames(data))
index_value <- match(value, colnames(data))
data1 <- data[, c(index_value, index_axis)]
## Define the band (stratum) colors
## Number of distinct levels on each axis
nlevels <- as.numeric(apply(data1[, -1], 2, function(x) {
  return(length(unique(x)))
}))
## One color per stratum: 4 Class levels + 2 Sex levels
band_color <- c("#8DD3C7", "#FFFFB3", "#BEBADA", "#FB8072", "#8DD3C7", "#FFFFB3")
## Rename the columns to value, axis1, axis2, ...
data_rename <- data1
colnames(data_rename) <- c(
  "value",
  paste("axis", seq_along(usr_axis), sep = "")
)

# Preview the data
head(data)
  Class    Sex   Age Survived Freq
1   1st   Male Child       No    0
2   2nd   Male Child       No    0
3   3rd   Male Child       No   35
4  Crew   Male Child       No    0
5   1st Female Child       No    0
6   2nd Female Child       No    0
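The preparation code counts the levels on each axis but then hard-codes six band colors, which only fits the Class and Sex axes. As a rough sketch, the color vector could instead be derived from the number of strata. `make_band_colors` is a hypothetical helper, not part of the Hiplot plugin, and the built-in `Titanic` table stands in for the data file.

```r
# Hypothetical helper: one fill color per stratum, recycled from a base palette.
make_band_colors <- function(df, axes, palette) {
  # Total number of strata = sum of distinct levels over all chosen axes
  n_strata <- sum(vapply(df[axes], function(col) length(unique(col)), integer(1)))
  rep_len(palette, n_strata)
}

# Assumption: the built-in Titanic table has the same columns as the data file
demo <- as.data.frame(Titanic)
base_palette <- c("#8DD3C7", "#FFFFB3", "#BEBADA", "#FB8072")
band_color <- make_band_colors(demo, c("Class", "Sex"), base_palette)
band_color
```

For the Class and Sex axes this gives the same six colors as the hard-coded vector, and it adapts if other axes such as Age or Survived are chosen.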

Visualization

# Sankey diagram
p <- ggplot(data_rename, aes(y = value, axis1 = axis1, axis2 = axis2)) +
  # Flows between the axes, colored by the second axis (Sex)
  geom_alluvium(aes(fill = axis2), alpha = 1, width = 0, reverse = FALSE) +
  scale_x_discrete(limits = usr_axis, expand = c(0.02, 0.1)) +
  ylab("") +
  coord_flip() +
  # Strata (bands) on each axis, one color per stratum
  geom_stratum(alpha = 1, width = 1 / 8, reverse = FALSE, fill = band_color,
               color = "white") +
  # Label each stratum; replaces the deprecated infer.label = TRUE
  geom_text(stat = "stratum", aes(label = after_stat(stratum)),
            reverse = FALSE) +
  ggtitle("Sankey plot") +
  # A single fill scale sets both the legend title and the flow colors
  scale_fill_manual(name = "Sex", values = c("#00468BFF", "#ED0000FF")) +
  theme_bw() +
  theme(text = element_text(family = "Arial"),
        plot.title = element_text(size = 12, hjust = 0.5),
        axis.title = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.text.x = element_text(angle = 0, hjust = 0.5, vjust = 1),
        legend.position = "right",
        legend.direction = "vertical",
        legend.title = element_text(size = 10),
        legend.text = element_text(size = 10))

p
Figure 1: Sankey diagram

Flows for Female are drawn in blue and flows for Male in red. The widths of the blue flows add up to the total width of the Female band.
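The statement above can be checked numerically: the frequencies of the Class-to-Sex flows for one sex should add up to that sex's total frequency. The sketch below uses base R only and assumes the built-in `Titanic` table matches the tutorial's data file. The names `flows`, `sex_totals`, and `female_flow_sum` are illustrative.

```r
# Assumption: the built-in Titanic table has the same columns as the data file
demo <- as.data.frame(Titanic)

# Width of each Class -> Sex flow: total frequency per combination
flows <- aggregate(Freq ~ Class + Sex, data = demo, FUN = sum)

# Width of each Sex band: total frequency per sex
sex_totals <- tapply(demo$Freq, demo$Sex, sum)

# Flows into Female, summed across the four classes
female_flow_sum <- sum(flows$Freq[flows$Sex == "Female"])

# TRUE if the flow widths account for the whole Female band
female_flow_sum == sex_totals[["Female"]]
```

The same comparison with "Male" checks the red flows.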