RPGeNet v2.0

Data


Table of Contents

  1. Driver Genes
  2. Sources
  3. Network
  4. Comparison with previous version
  5. Expression
  6. Update Notes

Driver Genes


Sources


RPGeNet is a curated network built from the interaction databases BioGRID and STRING, together with PPaxe, a text-mining tool for protein interactions.

BioGRID is a genetic and protein interaction database that contains experimentally verified interactions for many species. STRING is a protein interaction database only, and contains both verified interactions and non-verified predictions. Both databases were filtered by species so that only human interactions were included in the network. Because STRING contains many predictions that have not been experimentally verified, all STRING interactions with no evidence were filtered out.
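The two filtering steps described above can be sketched as follows. This is only a minimal illustration, not the pipeline actually used to build RPGeNet: the record fields (`taxid_a`, `taxid_b`, `evidence`) and the toy records are hypothetical, and 9606 is the NCBI Taxonomy identifier for human.

```python
HUMAN_TAXID = 9606  # NCBI Taxonomy ID for Homo sapiens


def keep_human(interactions):
    """Keep interactions in which both partners are human."""
    return [i for i in interactions
            if i["taxid_a"] == HUMAN_TAXID and i["taxid_b"] == HUMAN_TAXID]


def keep_with_evidence(interactions):
    """Drop interactions whose (hypothetical) evidence list is empty."""
    return [i for i in interactions if i["evidence"]]


# Invented toy records, for illustration only.
records = [
    {"a": "G1", "b": "G2", "taxid_a": 9606, "taxid_b": 9606, "evidence": ["pubmed"]},
    {"a": "G1", "b": "G3", "taxid_a": 9606, "taxid_b": 9606, "evidence": []},
    {"a": "M1", "b": "M2", "taxid_a": 10090, "taxid_b": 10090, "evidence": ["pubmed"]},
]

filtered = keep_with_evidence(keep_human(records))
```

With these toy records only the G1/G2 interaction survives: the second record has no evidence and the third is not human.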

STRING Score Distribution

The graph shows the distribution of the scores (between 0 and 1000) that STRING assigns to the reliability of the interactions in its database. Most of the lower scores correspond to predicted interactions.

STRING Score Distribution by Factors

This figure is a more detailed version of the figure above. It shows the distribution of the STRING score, split by combination of evidence types. The combination code, shown at the top of every subset, is a seven-digit binary string in which 0 means no and 1 means yes. From left to right, the digits indicate:

  1. whether the interaction is a self-interaction
  2. whether it is bidirectional
  3. whether there is PubMed evidence
  4. whether there is INTC evidence
  5. whether there is Reactome evidence
  6. whether there is a Gene Ontology DB annotation
  7. whether there is other supporting evidence

Seven binary digits allow 128 possible combinations, but only 27 of them occur among the interactions. The first subset clearly shows that there is no evidence for a large number of interactions in STRING. Although most of the interactions in this first subset score low, some interactions with no evidence still retain a high score.
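As an illustration of how such a combination code can be read, the sketch below decodes a seven-digit string into named flags. The flag names are invented labels for the categories listed in the text, and treating only the last five digits as evidence is an assumption made for this example.

```python
# Flag names follow the digit order described in the text (names are illustrative).
FLAGS = (
    "self_interaction",
    "bidirectional",
    "pubmed",
    "intc",
    "reactome",
    "gene_ontology",
    "other",
)

# Assumption: the first two digits describe the interaction itself,
# and only the remaining five describe supporting evidence.
EVIDENCE_FLAGS = FLAGS[2:]


def decode(code):
    """Turn a seven-digit binary string into a dict of booleans."""
    if len(code) != len(FLAGS) or set(code) - {"0", "1"}:
        raise ValueError(f"not a seven-digit binary code: {code!r}")
    return {name: digit == "1" for name, digit in zip(FLAGS, code)}


def has_evidence(code):
    """True if at least one evidence digit is set."""
    flags = decode(code)
    return any(flags[name] for name in EVIDENCE_FLAGS)


# Number of possible codes with seven binary digits.
possible_codes = 2 ** len(FLAGS)
```

For example, `decode("0110000")` describes a bidirectional interaction with PubMed evidence, while `has_evidence("0000000")` is false.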

PPaxe mines PubMed articles for protein interactions. We used PPaxe to search for protein interactions in academic articles on retinitis pigmentosa and in articles that mention any of the 276 genes associated with retinitis pigmentosa (driver genes). To reduce the number of false positive interactions, the PPaxe output was filtered to exclude any prospective protein symbol not found in the HGNC database. We also filtered out any interaction with a score below 65 to obtain a more reliable interaction network.
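The filtering of the PPaxe output can be sketched as below. This is a minimal sketch under assumptions: interactions are represented as hypothetical `(symbol_a, symbol_b, score)` tuples rather than PPaxe's real output format, the set of approved symbols stands in for the HGNC database, and the cutoff is passed as a parameter on whatever scale the scores use.

```python
def filter_ppaxe(interactions, approved_symbols, min_score):
    """Keep interactions whose two symbols are approved and whose score
    is at least min_score."""
    return [
        (a, b, score)
        for a, b, score in interactions
        if a in approved_symbols and b in approved_symbols and score >= min_score
    ]


# Invented toy data: "NOTAGENE" stands for a mined token that is not a gene symbol.
approved = {"RHO", "RPGR", "PRPF31"}
mined = [
    ("RHO", "RPGR", 0.90),
    ("RHO", "NOTAGENE", 0.95),
    ("RPGR", "PRPF31", 0.40),
]

kept = filter_ppaxe(mined, approved, min_score=0.65)
```

Here only the RHO/RPGR pair is kept: the second pair contains an unapproved symbol and the third falls below the cutoff.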

PPaxe Score Distribution

The graph shows the distribution of the PPaxe scores given to the detected interactions, together with the number of PPaxe interactions and their average score as a function of the chosen cutoff. To retain as many interactions as possible while reducing false positives, at the cost of some additional false negatives, the cutoff score chosen was 0.65.
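The trade-off shown in the graph can be reproduced in a few lines: for each candidate cutoff, count the interactions that would be kept and compute their average score. The function name and the toy scores are invented for illustration.

```python
def sweep_cutoffs(scores, cutoffs):
    """Return (cutoff, interactions kept, mean score of kept) for each cutoff.
    The mean is None when no interaction passes the cutoff."""
    rows = []
    for cutoff in cutoffs:
        kept = [s for s in scores if s >= cutoff]
        mean = sum(kept) / len(kept) if kept else None
        rows.append((cutoff, len(kept), mean))
    return rows


# Invented toy scores on a 0-1 scale.
toy_scores = [0.2, 0.5, 0.7, 0.9]
table = sweep_cutoffs(toy_scores, cutoffs=[0.0, 0.65, 0.95])
```

Raising the cutoff lowers the number of interactions kept and raises their average score, which is the relationship the graph summarises.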

Network


To build the complete network, we first created the skeleton: a graph that connects all the driver genes through the shortest paths between them. Levels above the skeleton extend this graph. The first level adds the parent and child genes that connect to the skeleton but are not themselves part of it. The next level adds the parents and children of the first-level genes that are not already in the skeleton or in level one. The same pattern continues until saturation, when no further level can be added. The whole graph is the exception: it also includes genes that do not connect to the network.
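The construction can be sketched on a toy directed graph. This is not the code used to build RPGeNet: it assumes one shortest path per ordered pair of driver genes (found by breadth-first search), keeps every driver in the skeleton even when no path reaches it, and uses invented gene names.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Breadth-first search; returns one shortest directed path or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for child in graph.get(node, ()):
            if child not in previous:
                previous[child] = node
                queue.append(child)
    return None


def build_skeleton(graph, drivers):
    """Drivers plus every gene on a shortest path between two drivers."""
    skeleton = set(drivers)
    for a in drivers:
        for b in drivers:
            if a != b:
                path = shortest_path(graph, a, b)
                if path:
                    skeleton.update(path)
    return skeleton


def build_levels(graph, skeleton):
    """Add parents and children level by level until nothing new appears."""
    parents = {}
    for node, children in graph.items():
        for child in children:
            parents.setdefault(child, set()).add(node)
    levels, seen, frontier = [], set(skeleton), set(skeleton)
    while True:
        neighbours = set()
        for node in frontier:
            neighbours |= set(graph.get(node, ()))
            neighbours |= parents.get(node, set())
        new = neighbours - seen
        if not new:  # saturation: no further level can be added
            return levels
        levels.append(new)
        seen |= new
        frontier = new


# Toy graph: node -> set of children. X and Y never connect to the drivers.
toy = {"D1": {"A"}, "A": {"D2"}, "B": {"A"}, "D2": {"C"}, "C": {"E"}, "X": {"Y"}}
skeleton = build_skeleton(toy, ["D1", "D2"])
levels = build_levels(toy, skeleton)
```

In this toy graph the skeleton is D1, A and D2; level one adds B and C, level two adds E, and X and Y appear only in the whole graph.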

Circos Graph

Comparison with previous version


With 175 new driver genes now known, RPGeNet needed an update. The increase in the number of driver genes partly explains why the levels built from the skeleton saturate earlier. The new network, however, has fewer interactions and genes than the previous one, mainly because of the extensive filtering of STRING: the previous network included predictions, whereas the updated network only includes interactions with evidence.

UpSet plot comparing the nodes in the previous version of RPGeNet with the updated version
UpSet plot comparing the interactions in the previous version of RPGeNet with the updated version
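The set sizes that such plots display can be computed with plain set operations. The sketch below uses invented gene names and represents interactions as hypothetical `(source, target)` pairs; the same function works for nodes and for interactions.

```python
def compare_versions(previous, updated):
    """Split two collections into items found only in the previous version,
    items shared by both, and items found only in the updated version."""
    previous, updated = set(previous), set(updated)
    return {
        "only_previous": previous - updated,
        "shared": previous & updated,
        "only_updated": updated - previous,
    }


# Invented toy data for illustration.
nodes_v1 = {"G1", "G2", "G3"}
nodes_v2 = {"G2", "G3", "G4"}
edges_v1 = {("G1", "G2"), ("G2", "G3")}
edges_v2 = {("G2", "G3"), ("G3", "G4")}

node_diff = compare_versions(nodes_v1, nodes_v2)
edge_diff = compare_versions(edges_v1, edges_v2)
```

The sizes of the three sets in each result correspond to the bars of a two-set comparison plot.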

Expression analysis


Unfortunately, few multi-tissue microarray expression experiments include retina among their tissues. We had to rely on a relatively old microarray experiment (GSE7905) that covers thirty-two different tissues, including retina, liver, brain and skeletal muscle. Although the experiment is not recent, the expression values are still useful for finding potentially important pathways associated with retinitis pigmentosa, by checking whether all the genes in a pathway are expressed in the retina. We hope to be able to use an updated multi-tissue expression experiment soon.

Volcano chart

Distribution Matrix

Heatmap

Distribution Matrix

Top twenty relatively overexpressed genes in retina in comparison to all tissues:

Gene Symbol  Average Expression  t  logFC  P. Value  Adjusted P. Value
TMEM98 1.54276e+01 1.19506e+02 3.76379e+00 1.71574e-78 5.64101e-74
UNC119 1.52132e+01 1.09719e+02 4.60785e+00 4.54282e-76 7.46795e-72
GPX3 1.57962e+01 9.99257e+01 3.96335e+00 2.02176e-73 2.21571e-69
EFEMP1 1.57655e+01 8.30937e+01 4.06639e+00 3.31868e-68 2.72779e-64
APOD 1.54900e+01 8.08101e+01 4.20815e+00 2.02969e-67 1.33464e-63
AOC3 1.46594e+01 7.49936e+01 3.50441e+00 2.59104e-65 1.41980e-61
INPP5K 1.47864e+01 7.08373e+01 2.84811e+00 1.04379e-63 4.90255e-60
SERPINF1 1.79001e+01 7.02973e+01 3.10665e+00 1.71338e-63 7.04158e-60
SLC22A17 1.58003e+01 6.94541e+01 2.91467e+00 3.74310e-63 1.36740e-59
SEPT4 1.48751e+01 6.85776e+01 2.78186e+00 8.51682e-63 2.72644e-59
GJA1 1.59067e+01 6.85049e+01 3.09793e+00 9.12187e-63 2.72644e-59
FOXC1 1.50100e+01 6.81441e+01 3.80321e+00 1.28380e-62 3.51741e-59
MGP 1.71069e+01 6.79695e+01 2.87181e+00 1.51573e-62 3.64205e-59
PTP4A3 1.45879e+01 6.79454e+01 2.45700e+00 1.55084e-62 3.64205e-59
RNASE1 1.63907e+01 6.58746e+01 2.92969e+00 1.14808e-61 2.51644e-58
CHCHD6 1.51307e+01 6.55999e+01 2.70855e+00 1.50417e-61 2.90906e-58
C1QTNF5 1.29306e+01 6.26564e+01 4.17481e+00 2.91860e-60 5.33098e-57
KANK2 1.67269e+01 6.20026e+01 2.52482e+00 5.74309e-60 9.93797e-57
GPNMB 1.56794e+01 6.17375e+01 3.23300e+00 7.57265e-60 1.24487e-56
ADGRA2 1.51704e+01 6.12170e+01 2.41251e+00 1.30741e-59 2.04690e-56

Top twenty relatively underexpressed genes in retina in comparison to all tissues:

Gene Symbol  Average Expression  t  logFC  P. Value  Adjusted P. Value
SCN8A 9.21722e+00 -4.86829e-03 -1.27044e-03 9.96130e-01 9.97512e-01
KRTAP10_4 1.08677e+01 4.86202e-03 8.24666e-04 9.96135e-01 9.97512e-01
GLRX2 1.25596e+01 4.80925e-03 7.16169e-04 9.96177e-01 9.97512e-01
INSL4 8.68381e+00 -4.72755e-03 -1.61184e-03 9.96242e-01 9.97521e-01
EHMT1 1.11133e+01 4.72200e-03 1.00922e-03 9.96247e-01 9.97521e-01
YPEL1 1.16304e+01 4.29957e-03 7.30017e-04 9.96582e-01 9.97766e-01
RFX8 1.03852e+01 4.21322e-03 1.35426e-03 9.96651e-01 9.97774e-01
DPF3 8.57543e+00 -3.91878e-03 -1.14387e-03 9.96885e-01 9.97948e-01
BANF1 1.13208e+01 -3.85001e-03 -7.99110e-04 9.96940e-01 9.97972e-01
PCDHGA6 9.22860e+00 -3.78145e-03 -1.30935e-03 9.96994e-01 9.97996e-01
CSNK1G2_AS1 9.30784e+00 -3.46594e-03 -1.26328e-03 9.97245e-01 9.98214e-01
WDR62 8.99319e+00 3.43087e-03 9.84490e-04 9.97273e-01 9.98214e-01
FLJ34790 1.07538e+01 -3.28461e-03 -9.17140e-04 9.97389e-01 9.98252e-01
PAM16 1.30554e+01 2.66893e-03 1.43794e-04 9.97879e-01 9.98577e-01
NCBP2 1.30254e+01 -2.22164e-03 -1.40418e-04 9.98234e-01 9.98826e-01
C1ORF127 1.00062e+01 -2.12662e-03 -8.15441e-04 9.98310e-01 9.98826e-01
PYY2 9.13073e+00 1.54440e-03 5.05289e-04 9.98772e-01 9.99168e-01
TEX11 9.41583e+00 7.98628e-04 2.32093e-04 9.99365e-01 9.99578e-01
C17ORF49 1.52699e+01 3.14705e-04 1.48092e-05 9.99750e-01 9.99841e-01
RERG 1.41530e+01 -8.51990e-05 -6.10756e-06 9.99932e-01 9.99934e-01

Update Notes


October 20th, 2018 - Version 2.0