Research

Genes & Protein Networks

PlanNET: Navigating through planarian interologs

PlanNET is a web application that stores predicted Schmidtea mediterranea protein-protein interactions (PPIs) projected over a human PPI network, in order to provide a transcript-centric exploration of the planarian interactome. A protocol was developed to predict protein-protein interactions using sequence homology data and a reference human interactome; it was then applied to 10 different transcriptomes. The layered design of the database handles the multiplicity of transcriptome-specific networks, facilitates comparison among them, and allows the integration of newly sequenced or assembled transcriptome versions. Currently, PlanNET holds 273,203 human PPIs and 729,043 Schmidtea mediterranea interactions.
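The idea of projecting interactions through homology (interologs) can be illustrated with a minimal Python sketch. It is a deliberately simplified assumption, not the PlanNET protocol itself, which is described in the reference below: the function name `predict_interologs`, the one-to-one `homology` mapping, and the identifiers used are all hypothetical.

```python
from collections import defaultdict


def predict_interologs(human_ppi, homology):
    """Project human interactions onto planarian transcripts.

    human_ppi: iterable of (human_protein_a, human_protein_b) pairs.
    homology:  dict mapping a planarian transcript id to the id of its
               assumed human homolog (hypothetical, simplified input).
    Returns a set of unordered transcript pairs, each stored as a sorted tuple.
    """
    # Invert the homology map: human protein -> transcripts homologous to it.
    by_human = defaultdict(set)
    for transcript, human in homology.items():
        by_human[human].add(transcript)

    predicted = set()
    for a, b in human_ppi:
        # Every pair of transcripts whose homologs interact becomes a candidate.
        for ta in by_human.get(a, ()):
            for tb in by_human.get(b, ()):
                if ta != tb:
                    predicted.add(tuple(sorted((ta, tb))))
    return predicted


# Toy data: two transcripts share the same human homolog, P2.
human_ppi = [("P1", "P2"), ("P2", "P3")]
homology = {"t1": "P1", "t2": "P2", "t3": "P2"}
print(sorted(predict_interologs(human_ppi, homology)))
# prints [('t1', 't2'), ('t1', 't3')]
```

Because several transcripts can map to the same human protein, a single human interaction may yield several predicted planarian interactions, which is one way a projected network can end up larger than the reference one.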

Further details are available at the following reference:
PlanNET: homology-based predicted interactome for multiple planarian transcriptomes
Castillo-Lara and Abril, Bioinformatics, 2018, 34(6):1016–1023. OPEN ACCESS
You can access PlanNET from this link: https://compgen.bio.ub.edu/PlanNET

RPGeNet: Retinitis Pigmentosa Gene Network

Image by Josep F Abril

The current challenge in human molecular genetics is to bridge the gap between genotype and phenotype, particularly in highly heterogeneous monogenic disorders, such as retinitis pigmentosa (RP), as well as in polygenic and complex diseases. It is worth noting that, after many years of research by many groups, more than 70 non-syndromic RP genes have been identified so far, but around 30% of the cases remain genetically unassigned. Network analysis of a wealth of heterogeneous data sources is particularly useful to unveil functional clues and pinpoint unsuspected relevant genes. We hereby present the methodology used to gather information from several publicly available datasets, generate a computational analysis framework on a core network data structure, and provide a user-friendly web interface (RPGeNet) to explore that network. This tool will help researchers in human genetics and related fields to understand the molecular role of the reported RP genes (and those of the closely related disease, Leber congenital amaurosis), and will provide a rationale to identify novel candidates. The identification of key molecular nodes in this type of interaction network will be instrumental in optimizing diagnosis and devising efficient future therapeutic approaches.

Further details are available at the following reference:
Distilling a Visual Network of Retinitis Pigmentosa Gene-Protein Interactions to Uncover New Disease Candidates
Boloc et al, PLoS ONE, 2015, 10(8): e0135307. OPEN ACCESS
You can access RPGeNet from this link: https://compgen.bio.ub.edu/RPGeNet

Gene Structure

Transcriptional Complexity of CERKL Gene

Image by Josep F Abril

In order to analyze the transcriptional repertoire of CERKL in human and mouse, both computational and wet-lab experiments were conducted. The genomic sequences were explored to determine the promoters and the corresponding transcription start sites (TSSs), the alternative splicing variants, and the different putative translation initiation sites (TISs). Together, these features are compatible with a wide display of functional domains and contribute to the final mRNA and protein complexity of this gene.

Further details are available at the following reference:
High transcriptional complexity of the retinitis pigmentosa CERKL gene in human and mouse
Garanto, et al, IOVS, 2011, 52(8):5202.

Phylogenetic Analysis on Planarian Matrix Metalloproteinase Genes (MMPs)

Image by Josep F Abril

Starting from human MMP sequences, four S. mediterranea homologs were found in planarian transcriptomes. They were cloned from those sequences to validate them experimentally and to further describe their phenotypes. PFAM domains related to the Matrix Metalloproteinase proteins were mapped onto the set of homologous sequences for distinct families of Matrix Metalloproteinase genes. A phylogenetic reconstruction over the protein sequence alignment was computed with RAxML. Both datasets were merged to produce the image on the left using iTOL.

This research has been published in:
Planarians as a Model to Assess In Vivo the Role of Matrix Metalloproteinase Genes during Homeostasis and Regeneration
Isolani, et al, PLOS ONE, 2013, 8(2):e55649. OPEN ACCESS

Planarian OMICS

Digital Gene Expression (DGE) Analysis on Schmidtea mediterranea

Image by Josep F Abril

Taking advantage of digital gene expression (DGE) sequencing technology, we compare all the available transcriptomes for S. mediterranea and improve their annotation. These results are accessible via the web for the community of researchers. Using the quantitative nature of DGE, we describe the transcriptional profile of neoblasts and present 42 new neoblast genes, including several cancer-related genes and transcription factors. Furthermore, we describe in detail the Smed-meis-like gene and the three Nuclear Factor Y subunits Smed-nf-YA, Smed-nf-YB-2 and Smed-nf-YC. In conclusion, we found that DGE is a valuable tool for gene discovery, quantification and annotation. The application of DGE in S. mediterranea confirms the planarian stem cells, or neoblasts, as a complex population of pluripotent and multipotent cells regulated by a mixture of transcription factors and cancer-related genes.
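To illustrate what the quantitative nature of DGE allows, the following Python sketch normalizes raw tag counts by library size and compares two libraries through a log2 fold change. It is only an illustration under stated assumptions: the function names, the pseudocount, and the two-library comparison are hypothetical choices and do not reproduce the in-house scripts mentioned below.

```python
import math


def tags_per_million(counts):
    """Scale raw tag counts so that every library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def log2_fold_change(library_a, library_b, pseudocount=1.0):
    """Return log2(B / A) per gene after library-size normalization.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes detected in only one of the two libraries.
    """
    norm_a = tags_per_million(library_a)
    norm_b = tags_per_million(library_b)
    genes = set(norm_a) | set(norm_b)
    return {
        gene: math.log2(
            (norm_b.get(gene, 0.0) + pseudocount)
            / (norm_a.get(gene, 0.0) + pseudocount)
        )
        for gene in genes
    }


# Hypothetical tag counts for two libraries of different sequencing depth.
library_a = {"gene1": 500, "gene2": 500}
library_b = {"gene1": 2500, "gene2": 7500}
for gene, value in sorted(log2_fold_change(library_a, library_b).items()):
    print(gene, round(value, 2))
```

Normalizing before comparing matters because the two libraries differ in depth; without it, every gene in the deeper library would appear to be more highly expressed.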

Processed DGE data can be accessed from this page.
Raw libraries and source sequencing files can be downloaded at the NCBI GEO repository.
In-house scripts and source code developed for the DGE analysis are available as a repository.
Further details are available at the following reference:
"Digital Gene Expression approach over multiple RNA-Seq data sets to detect neoblast transcriptional changes in Schmidtea mediterranea"
Rodríguez-Esteban et al., BMC Genomics, 16:361, 2015 (AOP May 7th, 2015). OPEN ACCESS

Transcriptome Sequencing by 454

Image by Marta Rodríguez

A detailed description of the planarian transcriptome is essential for future investigation into regenerative processes using planarians as a model system. In order to obtain the most representative set of planarian genes expressed under different physiological conditions, total RNA was isolated from a mixture of non-irradiated and irradiated, intact and regenerating planarians of the species Schmidtea mediterranea. We have performed sequence analyses on the assemblies of reads obtained by 454 sequencing from that pool of transcripts. Among those analyses, functional annotation was useful to identify putative homologues of several gene families that may play a key role during regeneration, such as neurotransmitter and hormone receptors, homeobox-containing genes, and genes related to eye function.

Further details are available at the following reference:
Smed454 dataset: unravelling the transcriptome of Schmidtea mediterranea
Abril, Cebrià, et al, BMC Genomics, 2010, 11:731. OPEN ACCESS
Data is available through the Smed454 web site interface, which includes:
- a contig assembly browser (click on VIEWER button),
- BLAST searches (see BLAST button),
- as well as the sequence files ready for download as compressed tarballs (follow the DATA button).

Proteomic Analyses on Planarian Neoblasts

Image by Josep F Abril

The genome sequencing of S. mediterranea and some EST projects generated interesting data to delineate the features of neoblast cells. However, some molecular aspects are not reflected at either the genomic or the transcriptomic level, and little information exists at the level of protein dynamics. This work attempted to open a new, unexplored area in the planarian research field. We developed a proteomics strategy in order to identify and characterize neoblast-specific proteins. In this paper we describe the method and discuss the results in comparison with genomic analyses carried out in planaria, as well as with proteomic studies using other stem cell model systems.

Further details are available at the following reference:
A proteomics approach to decipher the molecular nature of planarian stem cells
Fernández-Taboada, et al, BMC Genomics, 2011, 12:133. OPEN ACCESS
The Planarian Proteomics page contains both raw and processed data from the proteomics experiments and the subsequent computational analyses. This page was provided as additional material for the paper above.

Genome Annotation

Evaluation of Computational Gene-Finders

RGASP'09/10

We have participated in the evaluation of the submissions by gene prediction groups to the RNA-seq GASP (RGASP). An analysis workshop was held at the Wellcome Trust Conference Center in Hinxton, UK, on November 10-11, 2009. We provide below the related links, including the web page summarizing the whole set of evaluations:

RGASP Home Page.
RGASP Summary of Evaluation Results.
An updated version of the analyses has been published online in advance; the reference to the full text is:
Assessment of transcript reconstruction methods for RNA-seq
Steijger, et al., Nature Methods, 10(12):1177–1184, 2013.

EGASP'05

Josep F Abril was involved, during his PhD thesis at Roderic Guigó's lab, in the human ENCODE Genome Annotation Assessment Project (EGASP). An analysis workshop was held at the Wellcome Trust Conference Center in Hinxton, UK, on May 6-7, 2005. You can get more information about the event and the results of the evaluations from the following links: