An updated version of the RGASP analyses has been published in Nature Methods, along with an assessment of RNA-seq mapping tools; the reference to the full text is given here:
Data are not to be used in publications without written permission (see further details at the ENCODE web site).
EVALUATION PROTOCOL
Overview of Sensitivity and Specificity Scores
The accuracy measures used throughout this section are described in Burset and Guigó (Genomics, 34/3:353-357, 1996), Reese et al. (Genome Research, 10/4:483-501, 2000) and Guigó et al. (Genome Research, 10/10:1631-1642, 2000). The figure above was adapted from the first reference (Burset and Guigó, 1996). Evaluation at gene and transcript level for the results displayed on this site followed the EGASP “IMIM” criteria, which are described in the supplementary web materials for Guigó et al. (Genome Biology, 7/Suppl1:S2, 2006).
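As a rough illustration of the nucleotide-level measures, the sketch below scores a predicted set of positions against an annotated one. It assumes the definitions usually given in the gene-prediction literature cited above, where SN = TP/(TP+FN) and SP = TP/(TP+FP). The function names, the interval representation and the toy coordinates are illustrative assumptions and are not taken from the RGASP evaluation software.

```python
def positions(intervals):
    """Expand inclusive (start, end) intervals into a set of nucleotide positions."""
    covered = set()
    for start, end in intervals:
        covered.update(range(start, end + 1))
    return covered


def sn_sp(annotated, predicted):
    """Return (SN, SP) at nucleotide level.

    SN = TP / (TP + FN): fraction of annotated nucleotides that were predicted.
    SP = TP / (TP + FP): fraction of predicted nucleotides that are annotated.
    Either value is None when its denominator is zero (no data available).
    """
    ann = positions(annotated)
    pred = positions(predicted)
    tp = len(ann & pred)   # annotated and predicted
    fn = len(ann - pred)   # annotated but missed
    fp = len(pred - ann)   # predicted but not annotated
    sn = tp / (tp + fn) if tp + fn else None
    sp = tp / (tp + fp) if tp + fp else None
    return sn, sp


# Toy example with hypothetical coordinates.
annotated = [(100, 199), (300, 399)]   # 200 annotated nucleotides
predicted = [(150, 199), (300, 499)]   # 250 predicted nucleotides
sn, sp = sn_sp(annotated, predicted)
print(f"SN={sn:.2f} SP={sp:.2f} (SN+SP)/2={(sn + sp) / 2:.3f}")
```

Returning `None` for an empty denominator mirrors the "no data available" case mentioned for some categories, for instance CDS variables on exon-only predictions.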
Computational Protocol and Software
Go to the analysis protocol and software page
RGASP: ROUND 1
Codes for Participants
CODE | PARTICIPANT | CODE | PARTICIPANT |
ale | Alessandro Guffanti | lio | Lior Pachter |
art | Arthur Moisdon | mar | Mario Stanke |
ben | Ben Brown | gun | Gunnar Raetsch |
car | Carrie A Davis | sea | Sean Grimmond |
chr | Christian Iseli | sim | Simon White |
eli | Elizabeth Purdom | tho | Thomas Wu |
gau | Gautier Koscielny | tyl | Tyler Alioto |
ger | Gerstein lab | vic | Victor Solovyev |
hug | Hugues Richard | xia | Xiaowo Wang |
jel | Jeltje van Baren | yar | Yarden Katz |
jie | Jie Wu | zef | Zefeng Zhang |
wol | Wold/Mortazavi |
Evaluation Scores Data
Hub of Graphical Results
PLOT TYPE | HUMAN | WORM | FLY | DESCRIPTION |
SNSP SCATTERPLOTs | hub | hub | hub | SN versus SP for all the submissions on all the sequences. |
BOXPLOTs by SUBMISSION | hub | hub | hub | SNSP at sequence level against submissions for each evaluation variable. |
BOXPLOTs by SEQUENCE | hub | hub | hub | SNSP at submission level against sequences for each evaluation variable. |
BOXPLOTs by VARIABLE | hub | hub | hub | SNSP at sequence level against evaluation variables for each submission. |
NOTES:
- The submissions were split into four categories: partial, CDS-only, exon-only and full predictions. For every evaluation variable considered, the pages linked from the table show a plot with all submissions together (except for the boxplots in the last row), followed by four plots, one for each of these categories. When data are not available for a given variable, an asterisk marks the corresponding submission, variable or sequence boxplot. When no data were available for any element of a category on a given variable, an empty plot is shown as a placeholder. For example, exon-only predictions do not contain CDS features, so the plots for CDS-related variables are empty in that category.
- All scatterplots show data for every submission and every sequence. Each dot corresponds to SN versus SP on a given sequence for a given submission; the dots carry no labels. To check those details, look at the boxplots.
- All boxplots from the above table show (SN+SP)/2 on the y-axis. The upper and lower edges of each box correspond to the 3rd and 1st quartiles, and the line in the middle is the median. The ticks above and below mark a range of two times the interquartile range, so points beyond those marks can be considered outliers. Outlier points show their label only when they refer to sequences, as the names of the submission files or evaluation variables are too long to display on those plots.
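The boxplot summary described in these notes can be sketched as follows. This is a minimal illustration, not the plotting code behind the hubs. It assumes the ticks sit two interquartile ranges beyond the box edges, which is one possible reading of the note. It computes the quartiles with Python's `statistics.quantiles` using the inclusive method, and it runs on made-up SN/SP pairs keyed by hypothetical sequence names.

```python
from statistics import quantiles


def boxplot_summary(sn_sp_pairs, whisker_factor=2.0):
    """Summarize (SN+SP)/2 scores the way the notes describe the boxplots.

    sn_sp_pairs maps a label (e.g. a sequence name) to an (SN, SP) tuple.
    whisker_factor=2.0 is an assumption: ticks at 2 * IQR beyond the box edges.
    """
    scores = {label: (sn + sp) / 2 for label, (sn, sp) in sn_sp_pairs.items()}
    # Box edges (1st and 3rd quartiles) and the median line.
    q1, median, q3 = quantiles(scores.values(), n=4, method="inclusive")
    iqr = q3 - q1
    low_tick = q1 - whisker_factor * iqr
    high_tick = q3 + whisker_factor * iqr
    # Points beyond the ticks are reported as outliers, by label.
    outliers = sorted(
        label for label, s in scores.items() if s < low_tick or s > high_tick
    )
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "low_tick": low_tick,
        "high_tick": high_tick,
        "outliers": outliers,
    }


# Made-up SN/SP values for five hypothetical sequences.
example = {
    "seqA": (0.80, 0.60),
    "seqB": (0.90, 0.70),
    "seqC": (0.70, 0.70),
    "seqD": (0.85, 0.75),
    "seqE": (0.10, 0.10),
}
summary = boxplot_summary(example)
print(summary["outliers"])
```

With these toy values only `seqE` falls below the lower tick, so it is the single point that would carry a label on a by-sequence plot.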
RGASP: ROUND 2
Codes for Participants
CODE | PARTICIPANT | CODE | PARTICIPANT |
chr | Christian Iseli | mar | Mario Stanke |
ger | Gerstein lab | sea | Sean Grimmond |
gun | Gunnar Raetsch | sim | Simon White |
hug | Hugues Richard | tho | Thomas Wu |
jie | Jie Wu | tyl | Tyler Alioto |
lio | Lior Pachter | vic | Victor Solovyev |
wol | Wold/Mortazavi |
Annotation Datasets Feature Summaries
FLY
SET | #CDS | #EXON | TOTAL FEATS | #TRANSCRIPTS |
FILT | 103107 | 116534 | 219641 | 22584 |
LOW | 9307 | 10419 | 19726 | 2283 |
MED | 44101 | 49139 | 93240 | 9589 |
HIGH | 49696 | 56973 | 106669 | 10712 |
WORM
SET | #CDS | #EXON | TOTAL FEATS | #TRANSCRIPTS |
FILT | 148869 | 148869 | 446607 | 22576 |
LOW | 34036 | 34036 | 102108 | 6090 |
MED | 64545 | 64545 | 193635 | 8140 |
HIGH | 45492 | 45492 | 136476 | 7331 |
HUMAN
SET | #CDS | #EXON | TOTAL FEATS | #TRANSCRIPTS |
FILT | 1048919 | 795799 | 1781809 | 94576 |
LOW | 369822 | 291578 | 654698 | 35024 |
MED | 523206 | 393843 | 876955 | 45013 |
HIGH | 155891 | 110378 | 250156 | 14539 |
Feature totals can include annotation features other than CDS and exons, such as introns.
Evaluation Scores Data
EVALUATION | HUMAN | WORM | FLY | Cols Description |
Validation Summary | tbl NAY | tbl NAY | tbl NAY | tbl |
Nucleotide Level Evaluation | ALL FIL LOW MED HIG | ALL FIL LOW MED HIG | ALL FIL LOW MED HIG | tbl |
Exon Level Evaluation | ALL FIL LOW MED HIG | ALL FIL LOW MED HIG | ALL FIL LOW MED HIG | tbl |
Gene/Transcript Level Evaluation (xSEQs) | ALL FIL LOW MED HIG | ALL FIL LOW MED HIG | ALL FIL LOW MED HIG | tbl |
Gene/Transcript Level Evaluation (xGENEs) | ALL FIL LOW MED HIG | ALL FIL LOW MED HIG | ALL FIL LOW MED HIG | tbl |
NAY means Not Available Yet.
Hub of Graphical Results
PLOT TYPE | HUMAN | WORM | FLY | DESCRIPTION |
SNSP SCATTERPLOTs | hub | hub | hub | SN versus SP for all the submissions on all the sequences. |
BOXPLOTs by SUBMISSION | hub | hub | hub | SNSP at sequence level against submissions for each evaluation variable. |
BOXPLOTs by SEQUENCE | hub | hub | hub | SNSP at submission level against sequences for each evaluation variable. |
BOXPLOTs by VARIABLE | hub | hub | hub | SNSP at sequence level against evaluation variables for each submission. |
BOXPLOTs by ANNOTATION SET | hub | hub | hub | SN comparison among the Filtered, Low, Med and High sets, for every variable across all submissions. |
NOTES: The points in the NOTES of the RGASP Round 1 section also apply to the results linked from this table.