Gene loci information

Transcript annotation

  • This transcript has been annotated as U2 snRNP-associated SURP motif-containing protein.

Parent gene

Gene structure

  • The exon-intron structure of all isoforms is shown below. CDS regions are colored green. Transcription start sites (TSS) and transcription termination sites (TTS) predicted from CTR-Seq data are shown as solid circles and squares, respectively. Detailed coordinates are listed in the table below.

Chromosome Gene Transcript Category ID Start End
chr_2 g8114 g8114.t1 TTS g8114.t1 28643872 28643872
chr_2 g8114 g8114.t1 isoform g8114.t1 28643978 28647001
chr_2 g8114 g8114.t1 exon g8114.t1.exon1 28643978 28644080
chr_2 g8114 g8114.t1 cds g8114.t1.CDS1 28643978 28644080
chr_2 g8114 g8114.t1 exon g8114.t1.exon2 28644146 28644737
chr_2 g8114 g8114.t1 cds g8114.t1.CDS2 28644146 28644737
chr_2 g8114 g8114.t1 exon g8114.t1.exon3 28644804 28644878
chr_2 g8114 g8114.t1 cds g8114.t1.CDS3 28644804 28644878
chr_2 g8114 g8114.t1 exon g8114.t1.exon4 28644950 28645233
chr_2 g8114 g8114.t1 cds g8114.t1.CDS4 28644950 28645233
chr_2 g8114 g8114.t1 exon g8114.t1.exon5 28645291 28646350
chr_2 g8114 g8114.t1 cds g8114.t1.CDS5 28645291 28646350
chr_2 g8114 g8114.t1 exon g8114.t1.exon6 28646417 28646927
chr_2 g8114 g8114.t1 cds g8114.t1.CDS6 28646417 28646927
chr_2 g8114 g8114.t1 exon g8114.t1.exon7 28646996 28647001
chr_2 g8114 g8114.t1 cds g8114.t1.CDS7 28646996 28647001
chr_2 g8114 g8114.t1 TSS g8114.t1 28647098 28647098
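
The coordinates in the table can be cross-checked against the transcript length reported in the Sequences section. The sketch below assumes the Start/End values are 1-based and inclusive (the page does not say so explicitly) and sums the exon lengths. It also infers the strand from the relative positions of the TSS and TTS, which is an interpretation rather than something the page states. The names `exons` and `spliced_length` are illustrative only.

```python
# Exon coordinates copied from the table above.
# Assumption: 1-based, inclusive Start/End values.
exons = [
    (28643978, 28644080),  # exon1
    (28644146, 28644737),  # exon2
    (28644804, 28644878),  # exon3
    (28644950, 28645233),  # exon4
    (28645291, 28646350),  # exon5
    (28646417, 28646927),  # exon6
    (28646996, 28647001),  # exon7
]


def spliced_length(intervals):
    """Sum the lengths of inclusive (start, end) intervals."""
    return sum(end - start + 1 for start, end in intervals)


total = spliced_length(exons)
print(total)  # 2631, matching Length=2631 in the nucleotide FASTA header

# Assumption: a TSS coordinate larger than the TTS coordinate
# suggests the transcript lies on the minus strand.
tss, tts = 28647098, 28643872
strand = "-" if tss > tts else "+"
print(strand)
```

Under these assumptions the seven exons add up to 2631 nt, which agrees with the sequence length given for g8114.t1.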

Sequences

>g8114.t1 Gene=g8114 Length=2631
ATGAGTAAATTATCAAAAAGAGAATTAGAGGAAAAGAGGAGAAGAGAAATTGAAGAAGAA
AATGCAAAAGCTTTTGAGGATTTTTATGAAACATTTCAAAACACATCAAACTCAGCTGCT
TCAAAGGTTTGGATAAAAGCAGGAACATACGATGCTGGTTCAAGAAAAGAAGATACAAAG
GAGAAAGGAAAACTTTATAAGCCGACTCCAAAAGCTGGAACTTCTCATGACGATAACTCT
GCTTCTTCTGCTCAAGAATATGCAAGAATGCTAGCAACGTCAGATTCTCGTAAAGAAATA
ACACCTTTGGGCAAAAAGAAAGCTCAAGAGAAGAAAAAATCTAATTTAGAAATGTTTAAA
GAAGAACTCAAACAAATGCAAGAGGAACGTGAAGAGAGACATAAATATAAACATGCGGCG
AAATCAATGATGCAAATGCAAATTCACAGTGTTTTAGAACATCCTGAACCGATTTATCGT
GAAGTTCAAGAATCACAAGGAAGTTTTGATACAGGGGATCCAACAACTACCAATTTATAT
CTTGGTAATTTAAATCCTAAAATAACTGAGCAACAGTTAATGGAAATATTTGGAAAATAC
GGACCATTAGCAAGTGTTAAAATTATGTGGCCGAGATCTGATGAGGAAAAAGCTCGCGGT
AGAAATTGTGGATTTGTAGCATTTATGGCAAGAAAAGATGCTGAGCGTGCATTAAGGCGG
TTAGCAGGTAAAGATGTTATGGGTTATGAAATGAAACTTGGTTGGGGCAAAAGTGTCCCA
ATTATGACTCATCCGATTTATATTCCACCAAAATTACAAGAATATGCAATGCCACCACAA
AAATCAGGATTACCTTTTAATGCTCAACCACTTGGTGATGTTACAAATATCAATGTAGAT
GAACTTGATTTCAAATCATATTTAGTTGATGAAGAGAAAAAAAAGCAAATAGATGAGATT
TTATCAAAGACTATTGTAAAAGTAGTAATACCCACAGAGCGACCATTGCTCATGTTAATA
CATAGAATGATTGAATTTGTAATTCGAGAAGGACCAATTTTTGAATCTATGATAATGGCA
CGAGAACAAAATAATCCAAATTTTCGTTTTTTATTCGAATTCGACAGTGCAGCGCATATT
TATTATAGATGGAAATTATTTTCGTTGCTTCAAGGTGACACACCTTTAGAATGGAGTGAA
AAAGAATTTAGAATGTTTAAAAACTCATCAATTTGGCAACCACCAAAAGTATGTCAATTT
AGTCAAGGTATGCCAGAAGAATTAATTTCCGATGATGAATTGCTTGAACCTTGCAAAGGT
CAACTCTCTGTTGCACAAAGAAATCGACTTGAGGACTTGATCAGACATTTGACTCCAGAT
CGAAGTAAGATCGGAGATGCTATGGTTTTTTGTATTGAACATGCAGATGCTGCTGATGAA
ATTTGTGACTGTATTGCTGAATCATTGACAAATCCTCAAACAGCAATTCACAAAAAAATT
GCTAGAATTTATCTCGTTTCTGATATTCTTCATAATTGCACAGTAAAGGTTCAAAATGCG
AGCTTTTTTAGGAAATCAATGGAAAAAAATCTTATTGAAATGTTTAAAGGACTTCATGAT
AGCTACAAACTTTTAGAAAGTCGTTTGAAAGCAGAAGGTTTTAAAGTGAGAATTATGAAA
ATCTTTAGAACTTGGGAAGAATGGGCAGTTTATAGTCGTGATTTTTTGATAAAATTACAA
AATACATTTTTGGGTGTAGCAATTACTGAAAACACAGGCAACAGTGGTCATCAGTTATCA
GATAAAGAAGAAGACGAAGATCTTGATGGAATGCCTCTTGATGGTGCTGCACTTTTAAAA
GGAGCCCTGATGAGAGGAATTCGAACACCTGAACATTCAGAAAATGAAAATGAAGATGAC
ATCGATGGAGTACCATTGGTAGATGAAAATATTGATGGAGTTCCACTTTTGGCACCATCA
ACGACAGCAGCAGCAACAGATTCATCATCTGCAGGTTTTGTGAAGTCAAAATGGGAAGAA
TTGGATCCCGAGCAAGTAGCTCATCAAGCAATTACAACTTCTAAATGGGAATTTGATCCC
ATTGCTCCTGAACCACCAAAAATTTCTTCAATTTGTGATTATGGTAATAGTGAAAGCGAA
AGTAGCGAAAGTGAAACAGAAGAAAAACGCAGACGTTTAAGAGAGATTGAACTTAAAATT
TGTAAATATCAAGATGAACTAGAATCTGGTGAACGACAAATGAAAAGAGGATATACTGTG
CAAGAACAAGTTGAAAGTTATAGAAGAAAATTACTACGAAAATCTGAACGACATGATTCA
GATTCTCAATCAACTTCAGATCGCTATCAATCATCATCATCAAAACGAGAGCGAAGCAGG
AGAAGTAGAAGCTCATCAAATGAAAGACGAGTAAAAAAGTCTAGAAAATCATCGTCCTCT
GAGAGAGAGAAATATTATAGCAGTAGTTCAAAAATATCAAAATCACCATCATCCTCATCA
AAATCAAAACATCCCAAACGAAGTGGCAGAAGTAGAAGCAAAGACAGCCTCTCGAATTCA
CCATCGTATTCGTCATCAAGAAAGTCACACAAGTCCAAATATAAATATTAA

>g8114.t1 Gene=g8114 Length=876
MSKLSKRELEEKRRREIEEENAKAFEDFYETFQNTSNSAASKVWIKAGTYDAGSRKEDTK
EKGKLYKPTPKAGTSHDDNSASSAQEYARMLATSDSRKEITPLGKKKAQEKKKSNLEMFK
EELKQMQEEREERHKYKHAAKSMMQMQIHSVLEHPEPIYREVQESQGSFDTGDPTTTNLY
LGNLNPKITEQQLMEIFGKYGPLASVKIMWPRSDEEKARGRNCGFVAFMARKDAERALRR
LAGKDVMGYEMKLGWGKSVPIMTHPIYIPPKLQEYAMPPQKSGLPFNAQPLGDVTNINVD
ELDFKSYLVDEEKKKQIDEILSKTIVKVVIPTERPLLMLIHRMIEFVIREGPIFESMIMA
REQNNPNFRFLFEFDSAAHIYYRWKLFSLLQGDTPLEWSEKEFRMFKNSSIWQPPKVCQF
SQGMPEELISDDELLEPCKGQLSVAQRNRLEDLIRHLTPDRSKIGDAMVFCIEHADAADE
ICDCIAESLTNPQTAIHKKIARIYLVSDILHNCTVKVQNASFFRKSMEKNLIEMFKGLHD
SYKLLESRLKAEGFKVRIMKIFRTWEEWAVYSRDFLIKLQNTFLGVAITENTGNSGHQLS
DKEEDEDLDGMPLDGAALLKGALMRGIRTPEHSENENEDDIDGVPLVDENIDGVPLLAPS
TTAAATDSSSAGFVKSKWEELDPEQVAHQAITTSKWEFDPIAPEPPKISSICDYGNSESE
SSESETEEKRRRLREIELKICKYQDELESGERQMKRGYTVQEQVESYRRKLLRKSERHDS
DSQSTSDRYQSSSSKRERSRRSRSSSNERRVKKSRKSSSSEREKYYSSSSKISKSPSSSS
KSKHPKRSGRSRSKDSLSNSPSYSSSRKSHKSKYKY
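
The two records above are consistent with each other: 2631 nt corresponds to 876 codons plus one stop codon. As a minimal sketch, assuming the standard genetic code applies (the page does not name a translation table), the first and last lines of the nucleotide record can be translated and compared with the protein record. The function `translate` and the table construction are illustrative, not the pipeline used to build this page.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
# Assumption: the standard code (translation table 1) is appropriate here.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a CDS in frame 0, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for unknown codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First and last lines of the nucleotide record above. Both start in
# frame because every full line holds 60 nt (20 codons).
first_line = "ATGAGTAAATTATCAAAAAGAGAATTAGAGGAAAAGAGGAGAAGAGAAATTGAAGAAGAA"
last_line = "CCATCGTATTCGTCATCAAGAAAGTCACACAAGTCCAAATATAAATATTAA"

print(translate(first_line))  # MSKLSKRELEEKRRREIEEE
print(translate(last_line))   # PSYSSSRKSHKSKYKY
```

Both outputs match the start and end of the protein record, and the final TAA codon accounts for the difference between 2631 / 3 = 877 codons and the 876 residues listed.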

Protein features from InterProScan

Row Transcript Database ID Name Start End E.value
15 g8114.t1 CDD cd12223 RRM_SR140 176 259 1.53837E-53
14 g8114.t1 Coils Coil Coil 7 27 -
13 g8114.t1 Coils Coil Coil 106 129 -
12 g8114.t1 Coils Coil Coil 719 739 -
11 g8114.t1 Gene3D G3DSA:1.10.10.790 - 332 393 1.9E-18
10 g8114.t1 Gene3D G3DSA:1.25.40.90 - 442 585 1.4E-34
25 g8114.t1 MobiDBLite mobidb-lite consensus disorder prediction 49 84 -
22 g8114.t1 MobiDBLite mobidb-lite consensus disorder prediction 51 67 -
20 g8114.t1 MobiDBLite mobidb-lite consensus disorder prediction 697 731 -
24 g8114.t1 MobiDBLite mobidb-lite consensus disorder prediction 744 876 -
21 g8114.t1 MobiDBLite mobidb-lite consensus disorder prediction 762 784 -
23 g8114.t1 MobiDBLite mobidb-lite consensus disorder prediction 823 839 -
5 g8114.t1 PANTHER PTHR23140 RNA PROCESSING PROTEIN LD23810P 5 794 3.1E-230
6 g8114.t1 PANTHER PTHR23140:SF0 U2 SNRNP-ASSOCIATED SURP MOTIF-CONTAINING PROTEIN 5 794 3.1E-230
4 g8114.t1 Pfam PF00076 RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain) 179 251 5.1E-14
2 g8114.t1 Pfam PF01805 Surp module 338 388 3.5E-18
3 g8114.t1 Pfam PF04818 CID domain 450 578 6.2E-12
1 g8114.t1 Pfam PF08312 cwf21 domain 728 774 3.7E-7
26 g8114.t1 ProSiteProfiles PS50102 Eukaryotic RNA Recognition Motif (RRM) profile. 177 258 16.654
28 g8114.t1 ProSiteProfiles PS50128 SURP motif repeat profile. 339 382 13.964
27 g8114.t1 ProSiteProfiles PS51391 CID domain profile. 442 587 38.294
19 g8114.t1 SMART SM00360 rrm1_1 178 254 7.4E-17
16 g8114.t1 SMART SM00648 surpneu2 337 391 2.4E-16
17 g8114.t1 SMART SM00582 558neu5 445 584 3.4E-31
18 g8114.t1 SMART SM01115 cwf21_2 725 776 6.7E-11
7 g8114.t1 SUPERFAMILY SSF54928 RNA-binding domain, RBD 165 260 4.97E-21
8 g8114.t1 SUPERFAMILY SSF109905 Surp module (SWAP domain) 321 395 1.7E-20
9 g8114.t1 SUPERFAMILY SSF48464 ENTH/VHS domain 448 582 3.14E-9

Transmembrane regions from TMHMM

Disordered region

An IUPred3 score above 0.5 predicts a disordered region.
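
The 0.5 cutoff can be applied to a per-residue score profile to list contiguous disordered stretches. The sketch below is a minimal illustration: the function `disordered_regions` is hypothetical, and the scores in `toy_scores` are invented for the example and are not values for g8114.t1.

```python
def disordered_regions(scores, threshold=0.5):
    """Return 1-based inclusive (start, end) runs where score > threshold."""
    regions = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = pos  # open a new run
        elif start is not None:
            regions.append((start, pos - 1))  # close the current run
            start = None
    if start is not None:
        regions.append((start, len(scores)))  # run extends to the last residue
    return regions


# Hypothetical per-residue scores, for illustration only.
toy_scores = [0.2, 0.6, 0.7, 0.4, 0.55, 0.9, 0.8]
print(disordered_regions(toy_scores))  # [(2, 3), (5, 7)]
```

A score of exactly 0.5 is treated as ordered here because the statement above says "over 0.5"; a different convention would only change the comparison operator.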

GO terms from InterProScan

GOID TERM ONTOLOGY
GO:0006396 RNA processing BP
GO:0003723 RNA binding MF
GO:0003676 nucleic acid binding MF

KEGG

Orthology

Pathway

  • This transcript belongs to the following pathways

Expression

Transcript expression in Pv11 cells

TPM values are shown as mean ± standard deviation (STDEV).
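
As a small sketch of how such a summary could be computed from replicate TPM values: the page does not state whether the sample or the population standard deviation was used, so the sample form (`statistics.stdev`) is an assumption here, and the replicate values are invented for illustration.

```python
import statistics


def summarize_tpm(replicates):
    """Return (mean, sample standard deviation) for replicate TPM values."""
    mean = statistics.mean(replicates)
    # Assumption: sample standard deviation (n - 1 denominator).
    sd = statistics.stdev(replicates) if len(replicates) > 1 else 0.0
    return mean, sd


# Hypothetical replicate TPM values, not data from this page.
mean, sd = summarize_tpm([10.0, 12.0, 14.0])
print(f"{mean:.2f} +/- {sd:.2f}")  # 12.00 +/- 2.00
```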

Differential expression

Differentially expressed genes were identified with DESeq2 using the ‘run_DE_analysis.pl’ script from Trinity. A transcript was called differentially expressed when it met both criteria: (1) FDR < 0.05 and (2) fold change > 2, with TPM calculated by RSEM. DE information and fold changes between conditions are shown in the plot below.
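
The two criteria can be expressed as a simple filter. The sketch below assumes the fold-change cutoff applies in both directions (up- and down-regulation) and that fold change is supplied on the log2 scale, as in DESeq2's `log2FoldChange` column. Neither detail is stated on this page, and the function name is illustrative.

```python
import math


def is_differentially_expressed(fdr, log2_fold_change,
                                fdr_cutoff=0.05, fold_cutoff=2.0):
    """Apply both criteria: FDR < cutoff and fold change > cutoff.

    Assumption: the fold-change criterion is symmetric, so a change of
    more than 2-fold in either direction qualifies.
    """
    passes_fdr = fdr < fdr_cutoff
    passes_fold = abs(log2_fold_change) > math.log2(fold_cutoff)
    return passes_fdr and passes_fold


# Hypothetical values, for illustration only.
print(is_differentially_expressed(0.01, 1.5))   # True: significant, > 2-fold up
print(is_differentially_expressed(0.01, -1.5))  # True: significant, > 2-fold down
print(is_differentially_expressed(0.20, 3.0))   # False: FDR too high
print(is_differentially_expressed(0.01, 0.5))   # False: change too small
```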

Raw TPM values