Visualizing functional score results

Riley Xin

This tutorial introduces different ways of visualizing the functional scores generated by Rosace. To see how to run Rosace, please refer to Introduction to Rosace. Visualization is a powerful tool for interpreting your results, and we offer three views: heatmap, violin plot, and density plot.

library("rosace")

Prepare Data

A precomputed result on the full OCT1 dataset is provided for demonstration purposes. You can load it using:

data("oct1_rosace_scored")

Extract the functional score data for plotting with the OutputScore function. This will prepare the data in a format suitable for our visualization functions:

scores.data <- OutputScore(oct1_rosace_scored, name = "1SM73_ROSACE")
head(scores.data)
##    variants position wildtype mutation       type       mean        sd
## 1 p.(A107A)      107        A        A synonymous -0.2755549 0.2365942
## 2 p.(A107C)      107        A        C   missense  0.1134388 0.2255906
## 3 p.(A107D)      107        A        D   missense  0.3873864 0.2108217
## 4 p.(A107E)      107        A        E   missense -0.6415181 0.2534535
## 5 p.(A107F)      107        A        F   missense -0.1401362 0.2692807
## 6 p.(A107G)      107        A        G   missense -1.0571610 0.2417313
##           lfsr     lfsr.neg   lfsr.pos test.neg test.pos   label
## 1 1.220757e-01 1.220757e-01 0.87792430    FALSE    FALSE Neutral
## 2 3.075340e-01 6.924660e-01 0.30753402    FALSE    FALSE Neutral
## 3 3.306753e-02 9.669325e-01 0.03306753    FALSE    FALSE Neutral
## 4 5.685139e-03 5.685139e-03 0.99431486     TRUE    FALSE     Neg
## 5 3.013891e-01 3.013891e-01 0.69861090    FALSE    FALSE Neutral
## 6 6.119409e-06 6.119409e-06 0.99999388     TRUE    FALSE     Neg
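Before plotting, it can help to sanity-check the extracted table by cross-tabulating mutation type against the assigned label. The sketch below uses only base R. So that it runs without the package data, it rebuilds the six rows printed above as a small data frame called toy_scores, a hypothetical stand-in for scores.data; with the real object you would apply the same calls to scores.data directly.

```r
# Stand-in for scores.data: the six rows printed by head() above
# (only the columns needed for this check are reproduced).
toy_scores <- data.frame(
  variants = c("p.(A107A)", "p.(A107C)", "p.(A107D)",
               "p.(A107E)", "p.(A107F)", "p.(A107G)"),
  type     = c("synonymous", rep("missense", 5)),
  mean     = c(-0.2755549, 0.1134388, 0.3873864,
               -0.6415181, -0.1401362, -1.0571610),
  label    = c("Neutral", "Neutral", "Neutral", "Neg", "Neutral", "Neg"),
  stringsAsFactors = FALSE
)

# Count variants for each combination of mutation type and label.
label_counts <- table(toy_scores$type, toy_scores$label)
print(label_counts)

# Variants labelled "Neg", ordered from most to least negative score.
neg_hits <- toy_scores[toy_scores$label == "Neg", ]
neg_hits <- neg_hits[order(neg_hits$mean), ]
print(neg_hits$variants)
```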

Note: When using your own scores.data, ensure that it contains columns for position, wild-type amino acid, mutated amino acid, mutation type, and score. If your column names differ from the defaults, pass the correct names through the corresponding arguments: pos.col, wt.col, mut.col, type.col, and score.col.
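One way to find out which of these arguments you need to set is to compare your column names against the ones you expect. This is a minimal base R sketch, not part of rosace. The expected names are copied from the OutputScore output shown above, and whether they match every function's defaults is an assumption; my_scores is a hypothetical user table.

```r
# Column names as they appear in the OutputScore() output above.
# Assumption: these are the names the plotting functions look for by default.
expected_cols <- c("position", "wildtype", "mutation", "type", "mean")

# Hypothetical user-supplied table whose score column is named "score".
my_scores <- data.frame(
  position = c(107L, 107L),
  wildtype = c("A", "A"),
  mutation = c("A", "C"),
  type     = c("synonymous", "missense"),
  score    = c(-0.28, 0.11),
  stringsAsFactors = FALSE
)

# Expected names that are absent from the user table.
missing_cols <- setdiff(expected_cols, colnames(my_scores))
if (length(missing_cols) > 0) {
  message("Not found: ", paste(missing_cols, collapse = ", "),
          "; pass the matching *.col argument instead.")
}
```

Here only the score column is reported as missing, so the table could be used by passing score.col = "score".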

Heatmap

The heatmap provides a grid view of scores, allowing you to quickly identify regions of interest.

scoreHeatmap(data = scores.data,
             ctrl.name = 'synonymous', # the control mutation name
             score.col = "mean",
             savedir = "../tests/results/stan/assayset_full/plot/", 
             name = "Heatmap_1SM73",
             savepdf = TRUE,
             show = TRUE)
## Showing the first 100 positions. Full figure can be found in the saved directory.
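The savedir path in the call above is specific to the package's own test layout, so you will normally point it at a directory of your own. Whether the plotting functions create a missing directory is not stated here, so a cautious sketch is to create it beforehand with base R. The location plot_dir below is hypothetical.

```r
# Hypothetical output location; replace with your own path.
plot_dir <- file.path(tempdir(), "rosace_plots")

# Create the directory (and any missing parents) only if it is absent.
if (!dir.exists(plot_dir)) {
  dir.create(plot_dir, recursive = TRUE)
}

# plot_dir can then be passed as the savedir argument of the plotting functions.
```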

Violin Plot

The violin plot shows the distribution of scores across different mutation types.

scoreVlnplot(data = scores.data, 
             savedir = "../tests/results/stan/assayset_full/plot/",
             name = "ViolinPlot_1SM73", 
             savepdf = TRUE, 
             show = TRUE)
## Warning: Groups with fewer than two data points have been dropped.
## Groups with fewer than two data points have been dropped.
## Groups with fewer than two data points have been dropped.
## Groups with fewer than two data points have been dropped.
## Showing the first 50 positions. Full figure can be found in the saved directory.
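The warning above is the message ggplot2 emits when a group is too small for a density estimate. To see which groups are affected, you can count the variants per group. The sketch below is a base R illustration on a hypothetical toy table. Grouping by position and mutation type is an assumption, since the grouping that scoreVlnplot uses internally is not shown here.

```r
# Hypothetical stand-in for scores.data with uneven group sizes.
toy_scores <- data.frame(
  position = c(107L, 107L, 107L, 108L, 108L, 108L),
  type     = c("synonymous", "missense", "missense",
               "synonymous", "missense", "deletion"),
  stringsAsFactors = FALSE
)

# Number of variants for each position/type combination.
group_sizes <- as.data.frame(
  table(position = toy_scores$position, type = toy_scores$type),
  responseName = "n"
)

# Groups with exactly one variant cannot support a density estimate.
singletons <- group_sizes[group_sizes$n == 1, ]
print(singletons)
```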

Density Plot

The density plot offers a smoothed representation of the distribution of scores across different mutation types.

scoreDensity(scores.data, 
             hist = FALSE,
             savedir = "../tests/results/stan/assayset_full/plot/", 
             name = "DensityPlot_1SM73")
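To make concrete what a smoothed distribution per mutation type means, here is a base R sketch that computes a kernel density estimate for each type. It runs on simulated scores (sim_scores is hypothetical, not OCT1 data), and it is not a description of how scoreDensity is implemented.

```r
set.seed(1)

# Simulated scores for two mutation types (illustration only).
sim_scores <- data.frame(
  type = rep(c("synonymous", "missense"), each = 200),
  mean = c(rnorm(200, mean = 0, sd = 0.3),
           rnorm(200, mean = -0.5, sd = 0.8))
)

# One kernel density estimate per mutation type.
dens_by_type <- lapply(split(sim_scores$mean, sim_scores$type), density)

# Score value at which each estimated density peaks.
peak_scores <- sapply(dens_by_type, function(d) d$x[which.max(d$y)])
print(peak_scores)
```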

Alternatively, you can plot a histogram by setting hist = TRUE.

scoreDensity(scores.data,
             hist = TRUE,
             nbins = 50,
             scale.free = TRUE,
             savedir = "../tests/results/stan/assayset_full/plot/",
             name = "Histogram_1SM73")