Violin

Authors

[Editor] Hu Zheng;

[Contributors]

Note

Hiplot website

This page is the tutorial for the source-code version of the Hiplot Violin plugin. You can also use the Hiplot website to create the plot without writing any code. For more information, see the following link:

https://hiplot.cn/basic/violin?lang=en

The violin plot, named for its resemblance to a violin, is a statistical chart that combines a box plot with a kernel density plot to show both the distribution of the data and its probability density.
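
To make the two ingredients concrete, here is a minimal base-R sketch on synthetic data (the vector `x` is made up for illustration and is not the tutorial data set). It computes the kernel density estimate that determines the width of a violin and the quartiles that the embedded box plot marks.

```r
# Synthetic values for illustration only (not the tutorial data)
set.seed(1)
x <- rnorm(200, mean = 12, sd = 0.5)

# Kernel density estimate: a violin mirrors this curve around a vertical axis
dens <- density(x)

# Quartiles: the embedded box plot marks the 25%, 50% and 75% points
quart <- quantile(x, probs = c(0.25, 0.5, 0.75))

# Value at which the estimated density peaks, i.e. where the violin is widest
peak <- dens$x[which.max(dens$y)]

print(quart)
print(peak)
```

The plotting code later in this tutorial handles both steps internally; the sketch only shows which quantities end up in the figure.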

Setup

  • System Requirements: Cross-platform (Linux/macOS/Windows)

  • Programming language: R

  • Dependent packages: ggpubr; ggthemes

# Install packages
if (!requireNamespace("ggpubr", quietly = TRUE)) {
  install.packages("ggpubr")
}
if (!requireNamespace("ggthemes", quietly = TRUE)) {
  install.packages("ggthemes")
}

# Load packages
library(ggpubr)
library(ggthemes)

Data Preparation

The loaded data set contains gene expression values (column `Expresssion`, spelled as in the data file) and the tumor type of each sample (column `Tumor`).

# Load data
data <- read.delim("files/Hiplot/181-violin-data.txt", header = TRUE)

# Build the list of all pairwise group comparisons for the significance tests
groups <- unique(data[, 2])
ngroups <- length(groups)
comb <- combn(1:ngroups, 2)
my_comparisons <- list()
for (i in seq_len(ncol(comb))) {
  my_comparisons[[i]] <- groups[comb[, i]]
}

# View data
head(data)
  Expresssion Tumor
1    12.10228   AML
2    12.61382   AML
3    12.52741   AML
4    12.67990   AML
5    12.64837   AML
6    12.12146   AML
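
The loop in the data preparation code builds every unordered pair of tumor groups, which is later passed to `stat_compare_means()`. As a sketch, the same kind of list can be produced in one call with `combn(..., simplify = FALSE)`. The group labels and the names `toy_groups` and `group_pairs` below are illustrative stand-ins, not values read from the data file.

```r
# Illustrative group labels (assumed; the real ones come from unique(data[, 2]))
toy_groups <- c("A", "B", "C")

# All unordered pairs, returned as a list of length-2 character vectors
group_pairs <- combn(toy_groups, 2, simplify = FALSE)

# Inspect the structure of the resulting list
str(group_pairs)
```

Three groups give three pairs, so the figure will carry three comparison brackets.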

Visualization

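# Violin plot with embedded box plots and pairwise significance labels
# Note: the column name "Expresssion" is spelled as in the data file
p <- ggviolin(data, x = "Tumor", y = "Expresssion", fill = "Tumor", add = "boxplot",
              xlab = "Tumor", ylab = "Expression",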
              add.params = list(fill = "white"),
              palette = c("#e04d39","#5bbad6","#1e9f86"),
              title = "Violin Plot", alpha = 1) + 
  stat_compare_means(comparisons = my_comparisons, label = "p.signif") +
  theme_stata() +
  theme(text = element_text(family = "Arial"),
        plot.title = element_text(size = 12, hjust = 0.5),
        axis.title = element_text(size = 12),
        axis.text = element_text(size = 10),
        axis.text.x = element_text(angle = 0, hjust = 0.5, vjust = 1),
        legend.position = "right",
        legend.direction = "vertical",
        legend.title = element_text(size = 10),
        legend.text = element_text(size = 10))

p
Figure 1: Violin

The violin plot shows the distribution of the data, much like a box plot. The black horizontal line inside each white box marks the median gene expression level in that tumor group, and the upper and lower edges of the box mark the upper and lower quartiles. The violin plot also shows the data density: the more concentrated the data are at a given expression level, the wider the violin is at that level. Here, the gene expression distribution is most concentrated in the BLGG group, followed by the BIC group and the AML group.
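
This reading of the figure can be checked numerically. The sketch below uses synthetic data with made-up group names and spreads (not the tutorial data) to compute the median, the quartiles and the interquartile range (IQR) per group; a smaller IQR indicates a more concentrated group. With the tutorial data, the same `tapply()` calls could be applied to `data$Expresssion` and `data$Tumor`.

```r
# Synthetic data for illustration only: three groups with different spreads
set.seed(42)
toy <- data.frame(
  value = c(rnorm(100, 12, 0.2), rnorm(100, 12, 0.5), rnorm(100, 12, 1.0)),
  group = rep(c("G1", "G2", "G3"), each = 100)
)

# Median per group (the horizontal line inside each box)
med <- tapply(toy$value, toy$group, median)

# Lower and upper quartiles per group (the edges of each box)
quartiles <- tapply(toy$value, toy$group, quantile, probs = c(0.25, 0.75))

# Interquartile range per group: smaller means more concentrated
spread <- tapply(toy$value, toy$group, IQR)

# Groups ordered from most to least concentrated
ranking <- names(spread)[order(spread)]

print(round(med, 2))
print(quartiles)
print(ranking)
```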