# Install packages
if (!requireNamespace("ggpubr", quietly = TRUE)) {
  install.packages("ggpubr")
}
if (!requireNamespace("ggthemes", quietly = TRUE)) {
  install.packages("ggthemes")
}
# Load packages
library(ggpubr)
library(ggthemes)
Violin
Hiplot website
This page is the tutorial for the source code version of the Hiplot Violin plugin. You can also use the Hiplot website to plot without writing any code. For more information, please see the following link:
The violin plot, named for its resemblance to a violin, is a statistical chart that combines a box plot with a kernel density plot to show both the distribution of the data and its probability density.
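The two ingredients named above can be computed directly in base R. The sketch below uses simulated values (the vector `expr` is hypothetical and is not the tutorial data) to obtain the quartiles that a box plot draws and the kernel density estimate that gives the violin its outline.

```r
# Simulated expression values; hypothetical, not the tutorial data set
set.seed(1)
expr <- rnorm(200, mean = 12, sd = 0.5)

# Box plot component: lower quartile, median, upper quartile
q <- quantile(expr, probs = c(0.25, 0.5, 0.75))

# Violin component: kernel density estimate (Gaussian kernel by default)
d <- density(expr)

# A violin outline mirrors the density d$y on both sides of the axis
outline <- data.frame(value = d$x, left = -d$y, right = d$y)
head(outline, 3)
q
```

Plotting `left` and `right` against `value` would trace a violin shape by hand; ggviolin() performs the equivalent steps internally for each group.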
Setup
System Requirements: Cross-platform (Linux/MacOS/Windows)
Programming language: R
Dependent packages: ggpubr; ggthemes
Data Preparation
The loaded data set contains gene expression levels in different tumors (one expression column and one tumor type column).
# Load data
data <- read.delim("files/Hiplot/181-violin-data.txt", header = TRUE)

# Convert data structure: build every pairwise combination of tumor groups
groups <- unique(data[, 2])
ngroups <- length(groups)
comb <- combn(1:ngroups, 2)
my_comparisons <- list()
for (i in seq_len(ncol(comb))) {
  my_comparisons[[i]] <- groups[comb[, i]]
}
# View data
head(data)
Expresssion Tumor
1 12.10228 AML
2 12.61382 AML
3 12.52741 AML
4 12.67990 AML
5 12.64837 AML
6 12.12146 AML
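The list of pairwise comparisons built in the data preparation step can also be produced in a single call, because combn() accepts simplify = FALSE and then returns a list with one element per pair. This is a minimal sketch with hypothetical group labels: only AML appears in the preview above, and the other two labels are placeholders.

```r
# Hypothetical group labels standing in for unique(data[, 2])
groups <- c("AML", "GroupB", "GroupC")

# One list element per unordered pair of groups
my_comparisons <- combn(groups, 2, simplify = FALSE)
str(my_comparisons)
```

If the grouping column has been read as a factor, wrapping it in as.character() before calling combn() keeps the pairs as plain character vectors.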
Visualization
# Violin
p <- ggviolin(data, x = "Tumor", y = "Expresssion", fill = "Tumor", add = "boxplot",
              xlab = "Tumor", ylab = "Expresssion",
              add.params = list(fill = "white"),
              palette = c("#e04d39","#5bbad6","#1e9f86"),
              title = "Violin Plot", alpha = 1) +
  stat_compare_means(comparisons = my_comparisons, label = "p.signif") +
  theme_stata() +
  theme(text = element_text(family = "Arial"),
        plot.title = element_text(size = 12, hjust = 0.5),
        axis.title = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.text.x = element_text(angle = 0, hjust = 0.5, vjust = 1),
        legend.position = "right",
        legend.direction = "vertical",
        legend.title = element_text(size = 10),
        legend.text = element_text(size = 10))

p

Like a box plot, the violin plot shows the distribution of the data. The black horizontal line in each box marks the median gene expression level in that tumor, and the upper and lower edges of the white box mark the upper and lower quartiles. The violin plot also shows the data density: the more concentrated the data are at a given value, the wider the violin is at that value. The gene expression distribution in the BLGG group is the most concentrated, followed by the BIC group and the AML group.
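The quantities read off the plot can also be checked numerically. The following sketch uses simulated data (the group names G1 to G3 and their spreads are hypothetical; only the column names mirror the tutorial data) to compute the median, the quartiles, and the interquartile range per group. A smaller interquartile range corresponds to a more concentrated distribution.

```r
# Simulated data; group names and spreads are hypothetical
set.seed(42)
toy <- data.frame(
  Expresssion = c(rnorm(50, 12, 0.3), rnorm(50, 10, 0.6), rnorm(50, 11, 1.0)),
  Tumor = rep(c("G1", "G2", "G3"), each = 50)
)

# Quartiles per group: the box edges (25%, 75%) and the median line (50%)
stats <- sapply(split(toy$Expresssion, toy$Tumor), quantile,
                probs = c(0.25, 0.5, 0.75))

# Interquartile range per group: smaller means more concentrated
iqr <- stats["75%", ] - stats["25%", ]
sort(iqr)
```

Applying the same two steps to the tutorial's `data` object would give the values behind the boxes drawn inside each violin.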