miRPathDB

Illustrated user guide

Download as PDF
guide.pdf

Data sources

Biological categories

Database	Version	Retrieval date
Gene Ontology	-	June 2019
KEGG	-	June 2019
miRBase	22	June 2019
miRCarta	1.1	June 2019
Reactome	-	June 2019
WikiPathways	-	June 2019

miRNA targets

Database	Version	Retrieval date
miRanda	3.3a	June 2019
miRTarBase	7	June 2019
TargetScan	7.1	June 2019

Statistical analysis

All compute-intensive tasks were performed using the GeneTrail2 C++ library [1] and GNU Parallel [2]. The results of the enrichment analysis were evaluated in R, version 3.5, a freely available statistical programming environment.

Parameter overview

Statistical test	Over-representation analysis
P-value adjustment	Benjamini-Hochberg
$\alpha$-level	0.05
Minimal category size	2
Maximal category size	1000

Over-representation analysis

To judge whether a biological category is significantly enriched for a given miRNA, we use over-representation analysis (ORA). This approach has been employed by many authors, e.g. [3], [4], [5], [6], [7]. Here we use the version of ORA presented by Backes et al. [3]. It is based on the hypergeometric distribution and tests whether a set of selected biological entities is significantly more or less present in a biological category than expected by chance.

We use ORA to judge whether a biological pathway contains more targets of a given miRNA than expected by chance. To calculate this expectation, ORA relies on a reference set R (background). In our case, R is the list of all miRNA targets for the corresponding confidence.

Assume a biological category C has k entries in the test set $T = (t_{1},t_{2},\ldots,t_{n})$ and l entries in the reference set $R=(r_{1},r_{2},\ldots,r_{m})$. Based on this information, we expect to find $k'=\frac{n \cdot l}{m}$ elements of the test set T in category C on average.

If T is a subset of R, the hypergeometric test is applied to compute a p-value for C, i.e. the probability of observing at least k entries of C in T by chance:

$$P_C(k)=\sum\limits_{i=k}^{n} \frac{\binom{l}{i}\binom{m-l}{n-i}}{\binom{m}{n}}$$
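The expected count and the p-value above can be evaluated directly from the four counts k, n, l and m. The following is a minimal Python sketch for illustration, not the GeneTrail2 implementation; the function names `expected_hits` and `ora_p_value` are hypothetical, and exact integer binomial coefficients are used for clarity rather than speed.

```python
from math import comb


def expected_hits(n: int, l: int, m: int) -> float:
    """Expected number k' of test-set entries in the category: n * l / m."""
    return n * l / m


def ora_p_value(k: int, n: int, l: int, m: int) -> float:
    """Upper-tail hypergeometric p-value P_C(k).

    k: entries of category C found in the test set T
    n: size of the test set T
    l: entries of category C found in the reference set R
    m: size of the reference set R
    """
    if not (0 <= n <= m and 0 <= l <= m and 0 <= k <= min(n, l)):
        raise ValueError("counts are inconsistent with T being a subset of R")
    # Terms with i > l vanish, so the sum can stop at min(n, l).
    tail = sum(comb(l, i) * comb(m - l, n - i) for i in range(k, min(n, l) + 1))
    return tail / comb(m, n)


# Toy example (made-up counts): reference set of 10 targets, 4 of them in C,
# test set of 3 targets, 2 of them in C.
print(expected_hits(n=3, l=4, m=10))     # 1.2 expected hits
print(ora_p_value(k=2, n=3, l=4, m=10))  # (36 + 4) / 120, i.e. one third
```

In this toy example the observed count k = 2 exceeds the expected 1.2, but a p-value of about 0.33 would not be called significant at the 0.05 level.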

Benjamini-Hochberg adjustment

The Benjamini-Hochberg method [8], [9] is a step-up approach to control the false discovery rate. It assumes all p-values to be independent. Given $n$ p-values sorted in increasing order, $p_1 \le \ldots \le p_n$, the adjusted p-values are computed using the following formula:

$$\tilde p_{i}\ =\ \begin{cases} p_{i} & \text{for } i=n\\ \min \left( \tilde p_{i+1}, \frac{n}{i}p_{i} \right) & \text{for }i=n-1,\ldots,1 \end{cases}$$
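The step-up procedure starts at the largest p-value and carries a running minimum down towards the smallest one. The following minimal Python sketch illustrates this; the function name `benjamini_hochberg` is hypothetical, and this is not the code used to build the database. It accepts p-values in any order and returns the adjusted values in the input order.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(n), key=lambda idx: p_values[idx])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value (rank n) down to the smallest (rank 1).
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, n / rank * p_values[idx])
        adjusted[idx] = running_min
    return adjusted


# Toy example with made-up p-values.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
# Assumption: a category counts as significant if its adjusted p-value
# does not exceed the alpha-level of 0.05.
significant = [q <= 0.05 for q in adj]
print(adj, significant)
```

Tracking the running minimum keeps the adjusted p-values monotone in the raw p-values, so a smaller raw p-value never receives a larger adjusted value.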

Bibliography

[1] Stöckel, Daniel; Kehl, Tim; Trampert, Patrick; Schneider, Lara; Backes, Christina; Ludwig, Nicole; Gerasch, Andreas; Kaufmann, Michael; Gessler, Manfred; Graf, Norbert; Meese, Eckart; Keller, Andreas; Lenhof, Hans-Peter. Multi-omics Enrichment Analysis using the GeneTrail2 Web Service. Bioinformatics, Oxford University Press, 2016.
[2] Tange, O. GNU Parallel - The Command-Line Power Tool. ;login: The USENIX Magazine, 2011.
[3] Backes, Christina; Keller, Andreas; Kuentzer, Jan; Kneissl, Benny; Comtesse, Nicole; Elnakady, Yasser A; Müller, Rolf; Meese, Eckart; Lenhof, Hans-Peter. GeneTrail: advanced gene set enrichment analysis. Nucleic Acids Research, Oxford University Press, 2007.
[4] Draghici, Sorin; Khatri, Purvesh; Martins, Rui P.; Ostermeier, G. Charles; Krawetz, Stephen A. Global functional profiling of gene expression. Genomics, Elsevier, 2003.
[5] Hosack, Douglas A; Dennis Jr, Glynn; Sherman, Brad T; Lane, H Clifford; Lempicki, Richard A; et al. Identifying biological themes within lists of genes with EASE. Genome Biology, 2003.
[6] Khatri, Purvesh; Draghici, Sorin. Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics, Oxford University Press, 2005.
[7] Zhang, Bing; Schmoyer, Denise; Kirov, Stefan; Snoddy, Jay. GOTree Machine (GOTM): a web-based platform for interpreting sets of interesting genes using Gene Ontology hierarchies. BMC Bioinformatics, BioMed Central, 2004.
[8] Benjamini, Yoav; Hochberg, Yosef. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), 1995.
[9] Hochberg, Yosef; Benjamini, Yoav. More powerful procedures for multiple significance testing. Statistics in Medicine, Wiley, 1990.
