Gene loci information

Transcript annotation

  • This transcript has been annotated as G/T mismatch-specific thymine DNA glycosylase.

Parent gene

Gene structure

  • The exon-intron structure of all isoforms is shown below. CDS regions are colored green. TSSs and TTSs predicted from CTR-Seq data are marked with solid circles and squares, respectively. Exact coordinates are listed in the table below.

Chromosome Gene Transcript Category ID Start End
chr_3 g3383 g3383.t1 TTS g3383.t1 24998695 24998695
chr_3 g3383 g3383.t1 isoform g3383.t1 24999208 25003947
chr_3 g3383 g3383.t1 exon g3383.t1.exon1 24999208 24999734
chr_3 g3383 g3383.t1 cds g3383.t1.CDS1 24999208 24999734
chr_3 g3383 g3383.t1 exon g3383.t1.exon2 24999802 25001164
chr_3 g3383 g3383.t1 cds g3383.t1.CDS2 24999802 25001164
chr_3 g3383 g3383.t1 exon g3383.t1.exon3 25001254 25001616
chr_3 g3383 g3383.t1 cds g3383.t1.CDS3 25001254 25001616
chr_3 g3383 g3383.t1 exon g3383.t1.exon4 25001713 25001842
chr_3 g3383 g3383.t1 cds g3383.t1.CDS4 25001713 25001842
chr_3 g3383 g3383.t1 exon g3383.t1.exon5 25001907 25002809
chr_3 g3383 g3383.t1 cds g3383.t1.CDS5 25001907 25002809
chr_3 g3383 g3383.t1 exon g3383.t1.exon6 25003367 25003947
chr_3 g3383 g3383.t1 cds g3383.t1.CDS6 25003367 25003947
chr_3 g3383 g3383.t1 TSS g3383.t1 25004970 25004970
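As a quick consistency check on the table above, the exon lengths can be summed and compared with the transcript length given in the nucleotide FASTA header. The following is a minimal Python sketch, not part of the original annotation pipeline. The coordinates are copied from the table; treating them as 1-based and inclusive is an assumption, and the function names are hypothetical.

```python
# Exon coordinates copied from the table above as (start, end).
# Assumption: coordinates are 1-based and inclusive, so length = end - start + 1.
EXONS = [
    (24999208, 24999734),
    (24999802, 25001164),
    (25001254, 25001616),
    (25001713, 25001842),
    (25001907, 25002809),
    (25003367, 25003947),
]


def exon_lengths(exons):
    """Return the length of each exon, assuming inclusive coordinates."""
    return [end - start + 1 for start, end in exons]


def intron_lengths(exons):
    """Return the gap sizes between consecutive exons, sorted by start."""
    ordered = sorted(exons)
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(ordered, ordered[1:])]


lengths = exon_lengths(EXONS)
total = sum(lengths)
print(lengths)
print(total)
print(intron_lengths(EXONS))
```

Under the inclusive-coordinate assumption the exon lengths sum to 3867, which agrees with the Length=3867 reported for the nucleotide sequence of g3383.t1.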

Sequences

>g3383.t1 Gene=g3383 Length=3867
ATGGAGTCTTCCACTCCACCAATATCTTCTTCCGATAATAATAATCATCCTATGAATACA
ATTGCTCAAGATCCGCTCGCATTACCAACTGAGATTTCATCGTCTAGCCTATACAATAAT
AATAATAACAGTGGTAATAGCAGCAGCGGTAATAATGAAACTCGAACACATGATAGTGGA
AATTTAACGTCGTCTTTATCGACGGGGACAACAACCCAAAGCAATAGTAGCAATAATATT
AGTAACATAAGTGATAAAGAAAAAGAAAATCTCATTGATGGCACTATAAATAATAATAAT
AATAATGGCGCTATATATTCAAATGATATAACAATAAAACATGAAGAATCAAATTCATTG
ACAAATGACGATCGACGAATGCAATATGATTCCAACCAAATAGCAAATAATTATTCTAAT
AGTGACCCAAGAAAACTTAATCATCATCATCAACATCATATGAACATGCATGTACCACCA
CAACAACAGCATGAGCCGTCACACCATAAAACATTAAGTCATTCCATAGAAAATCTAAGT
AAAACGACAGAGAAATGTACATCTAATGAAAATATGAAAATGTATCCGAATCTAATAGAT
GGTCATGAATACGATGATGGCTATGAAACAGGTGTGAGTGACTTAATGTCACATAAATAT
AAACTTCCATCGCTACCACCGCCTTCATCCTCACCCTATTCATCTCTTTCATCGTCATCG
ATGTCAACCATTTCGACTGCGACGATGACACCAACAAATTTGGTAAATCCCCAACAACAA
CAATCGCCGTCATACATGAGTGGTAGTACAGGCAACGGAAATGCCGGAAATAATAATATG
AATCATTATCAATCGATGCAACAACAACAGCAGCAATATCAAGTCATACAGCAGCAGCAG
CAACAATATCAAGCAATGCAACATTCTAACCATATGAATTCTTTTAATTATCATCATCCA
ATGTTTAGAATGCAGCAGCAACAACAACAATCAGCTCATCAACTCGCGAATGTTGTAAAA
ACTGAACCGCCAGCTAATAATGATCCATCATTAATTACTGATGATCCTTATCGTTTTGTG
GATGAAGATTTGTCAAATGGAATGAGTACAGGTCAGGCACTTCATAATTCAATACATCAA
AATCAGCAACAGCAGCACATGTCATCTCCTGTAATGTCAAATAGCAATTATTCAGCATCC
TCTTCTCCAATGCCATCACAACAACAACAGTCACAAAGTGGAGGCGGAAATGCATCGTTG
TTGCAATTCAATGAACATCACATGTCACCCCATCATATGAACCCTGTTAATTATATGAAC
CATCAGCAACATATACAATCACAACACAATTTAATGAATGGAAATGTTCATGAAAATTCT
GGTCACAGTCATGGGGGAATAAACACAAATAACATGAATATGATGATGATGAATGATACA
CCAAAAAAGCGAGGTCGCAAAAAGAAGTTACGAGATGAAAATGGTAATCCAATTGAAGAT
AGTAGACCAACAAAAGAACGAAAAAAGCATGATCGATTTAATGGTATGCCAGAGGAAGAA
GTCACAAAACGTGTTATACCAGATCATCTTACTAATAATCTTGATGTTGTTATCGTCGGA
ATTAATCCAGGTCTTTTTGCTGCATATAAGGGCCATCATTATGCGGGTCCTGGAAATCAT
TTCTGGAAGTGTCTCTATCTATCTGGACTAACTTCTGTGCAAATGACAGCTGAGGAGGAT
TATAAATTACTACATTGTGGCATTGGTTTTACGAATATGGTGCAAAGACCTACAAAAGGT
AGTGCAGATTTAACTCGTAAAGAAATTAAAGAAGGTTCTCGAATTTTACTCGAAAAATTG
CAAAAATATCAACCAAAAATTGCTGTCTTTAATGGAAAACTCATCTTTGAAGTGTTTAGT
GGCAAAAAAGACTTTTGTTTTGGCAAACAACCCGAGACAGTTGAAGGAACAAATACATAT
ATGTGGGTAATGCCATCATCAAGTGCAAGATGTGCCCAATTGCCAAGAGCAGCTGATAAG
GTCCCATTTTATGCGGCGCTTAAAAAATTTAGAGATTATTTAAATGGTCTTGTACCAGAA
TATGATGAAAGCGAATATGTATTTAATGATCCAAAATATAATCAAATTAGTACTGAAGCG
GAACTAAAAGATGATGCACTTGCTTCATTTACAACACTAGAAAATGGTGAAATTGTAGCA
ATAGACTCAAATAAAAAGAAACGTGGACGACCTAAGAAAGTGCGCGGTGAAAATGGCGAA
GAAATTGTAGCAACGCCAAGACGAAAACAAAATACTGAACTCAATCAGGATGGCTTAGAA
GATGATCCAAGTAAAAAGAAACGTGGCCGCCCAAAGAAAAATAAAGATGGAACAACAACT
TCTTCAACAAATAACCGAAAAAAGAGCAATAAAGAAAAGCAACAACCTTCACCATTAAAT
AGTGGCAATGAATTGAGTTCGATGTCCCTTCATCATCATCATTCATCATCGCCTATCTTA
CAAAATCCCCAAATGTCTAGTCTATCGAATAATATTGGAAGCAATAGTAATACAGGATCA
TCGTCAACACCTTCGTCTTCATTCTTGCCACCTATTGATTCTCCATCGGGCATTAGTTAT
CAAAATGGTGCTATAAATCAACAGCAAACGAACCTTCAAACACAATCTCCTGCAGGTCAC
TGTTTTTCTTCATCGAATTCAATCAATGAGCATGTGTCACAGCCAGAAATTTCACCACAT
CAGACATATCAGAGACAGGCAAATGCTCAACAAATGTATAATACAACTACACCCGAAATA
ACTTCTGATCAAATAAATTCGCCTGTAAATGCTGTGCCCTCACCTAGTCTAGCACCATCA
GATTTTGAACCGCCACAAAATACAAATGACCTTGATCCATACTCACAGCAGCAGTCTGAA
CCGCATTCGCAATCGCCATTGTCTCAATCGCTTTATCCACATCATCAACAGCAAGCAATG
TCGCAAATGATGAATAATGTCAGCAATAGTCCCGTATATTCACCATATTCAAGGCAACAG
TCACAGCAGCAACAGCAACAAACTCCTCAGCAAAACTACATGCAATCACCACATCATACG
CAACAGCAATCTCCAGTAGTGACTCCTCAAAATAACTCTGTAAATAGTAACAAATCAATT
GGACAACCGCAACATATTAATGGCACATCACAAACTAGCGGTGATATGTTTAAAGATGTC
GCTACAAAAAGTCTTTCGGGCCTCGAGTCCTTAGTCGATCAAATACCAAATATTAATGAA
CAAGACACAGGTCTAAGTGCGCTTTCTGGAGCACAAAATGTTCAACATCAACTTAACGAC
TATTCTAACATGCTTTCTGGTTATTCCTCCTCATCAGGCACTACAGCATCATCAAATTTG
ACTCCAGTTTCGACAGCACTTGGCTCTTCATCGCTTCTACCGCCATCAATAGGCGCTCCT
CATCCACATTATCCTTATACATCACATGCTACTGCAACTCAATCGCCCTATTCAGCAAAT
CCATTCTCTGTAAGTAGTTTGACATCATCGAATTATCCTTCGGCAGCTGCAGCAGCAATG
AATAGTTATCATCAAAATCTAATGGGAACGAGTCATTTAACGTCATCTTTTATGGAACCA
CATATGCCAGTTCCTGTTACGCCTCTTTATCATTCTTATCAACAACAAGGCTATCCTGGT
TATCCTGCACCTCCTCATCATCATCCTTCAGCTGCAGCAGCCGCTTTACATATGTCAAAC
TATCCCTATTACACAAACACAGGCTACACACAAGCACCTGGCAGCCATTCAACGTACCAT
TCAATGTTTGACAGAATAAACTTTTGA

>g3383.t1 Gene=g3383 Length=1288
MESSTPPISSSDNNNHPMNTIAQDPLALPTEISSSSLYNNNNNSGNSSSGNNETRTHDSG
NLTSSLSTGTTTQSNSSNNISNISDKEKENLIDGTINNNNNNGAIYSNDITIKHEESNSL
TNDDRRMQYDSNQIANNYSNSDPRKLNHHHQHHMNMHVPPQQQHEPSHHKTLSHSIENLS
KTTEKCTSNENMKMYPNLIDGHEYDDGYETGVSDLMSHKYKLPSLPPPSSSPYSSLSSSS
MSTISTATMTPTNLVNPQQQQSPSYMSGSTGNGNAGNNNMNHYQSMQQQQQQYQVIQQQQ
QQYQAMQHSNHMNSFNYHHPMFRMQQQQQQSAHQLANVVKTEPPANNDPSLITDDPYRFV
DEDLSNGMSTGQALHNSIHQNQQQQHMSSPVMSNSNYSASSSPMPSQQQQSQSGGGNASL
LQFNEHHMSPHHMNPVNYMNHQQHIQSQHNLMNGNVHENSGHSHGGINTNNMNMMMMNDT
PKKRGRKKKLRDENGNPIEDSRPTKERKKHDRFNGMPEEEVTKRVIPDHLTNNLDVVIVG
INPGLFAAYKGHHYAGPGNHFWKCLYLSGLTSVQMTAEEDYKLLHCGIGFTNMVQRPTKG
SADLTRKEIKEGSRILLEKLQKYQPKIAVFNGKLIFEVFSGKKDFCFGKQPETVEGTNTY
MWVMPSSSARCAQLPRAADKVPFYAALKKFRDYLNGLVPEYDESEYVFNDPKYNQISTEA
ELKDDALASFTTLENGEIVAIDSNKKKRGRPKKVRGENGEEIVATPRRKQNTELNQDGLE
DDPSKKKRGRPKKNKDGTTTSSTNNRKKSNKEKQQPSPLNSGNELSSMSLHHHHSSSPIL
QNPQMSSLSNNIGSNSNTGSSSTPSSSFLPPIDSPSGISYQNGAINQQQTNLQTQSPAGH
CFSSSNSINEHVSQPEISPHQTYQRQANAQQMYNTTTPEITSDQINSPVNAVPSPSLAPS
DFEPPQNTNDLDPYSQQQSEPHSQSPLSQSLYPHHQQQAMSQMMNNVSNSPVYSPYSRQQ
SQQQQQQTPQQNYMQSPHHTQQQSPVVTPQNNSVNSNKSIGQPQHINGTSQTSGDMFKDV
ATKSLSGLESLVDQIPNINEQDTGLSALSGAQNVQHQLNDYSNMLSGYSSSSGTTASSNL
TPVSTALGSSSLLPPSIGAPHPHYPYTSHATATQSPYSANPFSVSSLTSSNYPSAAAAAM
NSYHQNLMGTSHLTSSFMEPHMPVPVTPLYHSYQQQGYPGYPAPPHHHPSAAAAALHMSN
YPYYTNTGYTQAPGSHSTYHSMFDRINF

Protein features from InterProScan

Row Transcript Database ID Name Start End E-value
6 g3383.t1 CDD cd10028 UDG-F2_TDG_MUG 526 688 2.5237E-74
5 g3383.t1 Coils Coil Coil 286 306 -
4 g3383.t1 Gene3D G3DSA:3.40.470.10 - 508 703 5.8E-76
14 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 43 85 -
10 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 223 244 -
13 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 377 416 -
19 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 481 513 -
17 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 490 513 -
12 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 742 840 -
11 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 753 787 -
18 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 814 840 -
15 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 909 928 -
16 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 972 999 -
9 g3383.t1 MobiDBLite mobidb-lite consensus disorder prediction 1011 1052 -
2 g3383.t1 PANTHER PTHR12159 G/T AND G/U MISMATCH-SPECIFIC DNA GLYCOSYLASE 297 719 8.1E-112
1 g3383.t1 Pfam PF03167 Uracil DNA glycosylase superfamily 529 680 4.1E-19
7 g3383.t1 SMART SM00384 AT_hook_2 745 757 1.7
8 g3383.t1 SMART SM00384 AT_hook_2 785 797 0.96
3 g3383.t1 SUPERFAMILY SSF52141 Uracil-DNA glycosylase-like 526 691 1.48E-39

Transmembrane regions from TMHMM

Disordered region

An IUPred3 score above 0.5 predicts a disordered region.
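The 0.5 threshold can be applied to a per-residue score list to extract contiguous disordered regions. The sketch below is a Python illustration only: the function name is hypothetical, the input is assumed to be one score per residue in sequence order, and the example scores are made up rather than taken from the prediction for g3383.t1.

```python
def disordered_regions(scores, threshold=0.5):
    """Return 1-based inclusive (start, end) runs where score > threshold."""
    regions = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = pos  # open a new region
        elif start is not None:
            regions.append((start, pos - 1))  # close the current region
            start = None
    if start is not None:
        regions.append((start, len(scores)))  # region runs to the last residue
    return regions


# Made-up scores for illustration; not real output for g3383.t1.
example_scores = [0.2, 0.6, 0.7, 0.5, 0.9, 0.8]
print(disordered_regions(example_scores))
```

A score exactly equal to 0.5 is treated as ordered here, following the "above 0.5" wording; whether the site uses a strict or inclusive cutoff is not stated.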

GO terms from InterProScan

GOID TERM ONTOLOGY
GO:0006285 base-excision repair, AP site formation BP
GO:0003677 DNA binding MF
GO:0000700 mismatch base pair DNA N-glycosylase activity MF

KEGG

Orthology

Pathway

  • This transcript belongs to the following pathways

Expression

Transcript expression in Pv11 cells

TPM values are shown as mean ± standard deviation.
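For reference, a summary of this kind can be computed from replicate TPM values as in the Python sketch below. The replicate values are invented for illustration, the function name is hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the page does not say which estimator was used.

```python
import statistics


def summarize_tpm(replicates):
    """Format replicate TPM values as 'mean +/- sample standard deviation'."""
    mean = statistics.mean(replicates)
    # Assumption: sample standard deviation (n - 1 denominator).
    stdev = statistics.stdev(replicates)
    return f"{mean:.2f} +/- {stdev:.2f}"


# Invented replicate values; not real expression data for g3383.t1.
summary = summarize_tpm([10.0, 12.0, 14.0])
print(summary)
```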

Differential expression

Differentially expressed genes were identified with DESeq2 using the ‘run_DE_analysis.pl’ script from Trinity. A transcript was called differentially expressed when (1) FDR < 0.05 and (2) fold change > 2 (TPM calculated by RSEM). DE status and fold change between conditions are shown in the plot below.
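The two criteria can be expressed as a simple filter. The Python sketch below is an illustration under stated assumptions, not the script used for this page: it assumes the fold change is supplied on a log2 scale (as in the `log2FoldChange` column of DESeq2 results), that the FDR is the adjusted p-value, and that the fold-change cutoff applies in both directions (up or down). The function name and the example values are hypothetical.

```python
import math


def is_differentially_expressed(fdr, log2_fold_change,
                                fdr_cutoff=0.05, fold_cutoff=2.0):
    """Apply both criteria: FDR below cutoff and fold change above cutoff.

    Assumption: the fold-change cutoff is symmetric, so a transcript that
    is more than 2-fold down also passes.
    """
    passes_fdr = fdr < fdr_cutoff
    passes_fold = abs(log2_fold_change) > math.log2(fold_cutoff)
    return passes_fdr and passes_fold


# Hypothetical (FDR, log2 fold change) pairs for illustration.
examples = [(0.01, 1.5), (0.2, 3.0), (0.01, -2.0), (0.01, 1.0)]
calls = [is_differentially_expressed(fdr, lfc) for fdr, lfc in examples]
print(calls)
```

Both comparisons are strict, so a transcript with a fold change of exactly 2 does not pass, matching the "> 2" wording.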

Raw TPM values