Gene loci information

Transcript annotation

  • This transcript has been annotated as Host cell factor.

Parent gene

Gene structure

  • The exon-intron structure of all isoforms is shown below. CDS regions are colored green. TSSs and TTSs predicted from CTR-Seq data are indicated by solid circles and squares, respectively. Exact coordinates are listed in the table below.

Chromosome Gene Transcript Category ID Start End
chr_4 g16840 g16840.t2 TSS g16840.t2 11359129 11359129
chr_4 g16840 g16840.t2 isoform g16840.t2 11359595 11364344
chr_4 g16840 g16840.t2 exon g16840.t2.exon1 11359595 11359641
chr_4 g16840 g16840.t2 cds g16840.t2.CDS1 11359595 11359641
chr_4 g16840 g16840.t2 exon g16840.t2.exon2 11359704 11359804
chr_4 g16840 g16840.t2 cds g16840.t2.CDS2 11359704 11359804
chr_4 g16840 g16840.t2 exon g16840.t2.exon3 11360153 11360165
chr_4 g16840 g16840.t2 cds g16840.t2.CDS3 11360153 11360165
chr_4 g16840 g16840.t2 exon g16840.t2.exon4 11360666 11360801
chr_4 g16840 g16840.t2 cds g16840.t2.CDS4 11360666 11360801
chr_4 g16840 g16840.t2 exon g16840.t2.exon5 11360871 11360909
chr_4 g16840 g16840.t2 cds g16840.t2.CDS5 11360871 11360909
chr_4 g16840 g16840.t2 exon g16840.t2.exon6 11360966 11361378
chr_4 g16840 g16840.t2 cds g16840.t2.CDS6 11360966 11361378
chr_4 g16840 g16840.t2 exon g16840.t2.exon7 11361436 11361694
chr_4 g16840 g16840.t2 cds g16840.t2.CDS7 11361436 11361694
chr_4 g16840 g16840.t2 exon g16840.t2.exon8 11361904 11363027
chr_4 g16840 g16840.t2 cds g16840.t2.CDS8 11361904 11363027
chr_4 g16840 g16840.t2 exon g16840.t2.exon9 11363082 11363517
chr_4 g16840 g16840.t2 cds g16840.t2.CDS9 11363082 11363517
chr_4 g16840 g16840.t2 exon g16840.t2.exon10 11363576 11363930
chr_4 g16840 g16840.t2 cds g16840.t2.CDS10 11363576 11363930
chr_4 g16840 g16840.t2 exon g16840.t2.exon11 11364241 11364344
chr_4 g16840 g16840.t2 cds g16840.t2.CDS11 11364241 11364344
chr_4 g16840 g16840.t2 TTS g16840.t2 11364824 11364824
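The coordinates in the table above can be cross-checked against the sequence length reported in the FASTA header. The sketch below assumes that Start and End are 1-based and inclusive (the table does not state this explicitly); the names `EXONS`, `spliced_length`, and `intron_lengths` are illustrative only.

```python
# Exon coordinates copied from the table above.
# Assumption: Start and End are 1-based and both inclusive.
EXONS = [
    (11359595, 11359641),
    (11359704, 11359804),
    (11360153, 11360165),
    (11360666, 11360801),
    (11360871, 11360909),
    (11360966, 11361378),
    (11361436, 11361694),
    (11361904, 11363027),
    (11363082, 11363517),
    (11363576, 11363930),
    (11364241, 11364344),
]


def spliced_length(exons):
    """Sum exon lengths, counting both the Start and End positions."""
    return sum(end - start + 1 for start, end in exons)


def intron_lengths(exons):
    """Length of each gap between consecutive exons."""
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


total = spliced_length(EXONS)
print(total)                       # 3027
print(total // 3)                  # 1009 codons
print(len(intron_lengths(EXONS)))  # 10 introns
```

Under this assumption the exons sum to 3027 nt, which agrees with Length=3027 in the nucleotide FASTA header, and 1009 codons correspond to the 1008 residues of the protein plus one stop codon.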

Sequences

>g16840.t2 Gene=g16840 Length=3027
ATGTTAAAGTGGAAGAAAGTTACAAACACTTCAGGCCCACAGCCGAGACCACGGCATGGA
CATCGCGCCGTTGCTATACGTGAACTTATGGTGGTCTTTGGTGGCGGCAATGAAGGAATT
GTGGATGAACTTCATGTCTACAACACATACACAAATCAATGGTATGTGCCAGCAACAAAA
GGAGAAGTGCCACCAGGATGTGCAGCTTATGGATTTGTTGTTGATGGAACAAGAATTTTA
ATTTTTGGTGGCATGGTTGAATATGGAAAATATTCAAATGACCTTTTTGAACTTCAAGCT
ACAAAATGGGAATGGAAAAAAATTCGACCACAAGCTCCATTTTCTGGAGATGCTCCATGT
CCACGTTTAGGCCATTCATTTACACTTGTTGGTGACAAAGTTTTTCTATTTGCTGGTCTA
GCAAATGAGTCAGATGATCCAAAACACAATATCCCAAAATATCTCAATGACCTTTACATC
CTAAATACAAAAAATGGCAATTACAGTTGGGAAATACCAATAACATATGGAGAATGTCCA
CCACCTCGAGAATCACATACAGCAATAGCATGGAATGACAAATCAACTGGTCGCAATTAT
TTAGTCATTTATGGTGGCATGAGTGGTTGTCGTTTGGGTGATTTGTGGATTCTCGATATT
GAAACGCTCTCATGGAGCAAACCAAAGTGCTCTGGAACACCACCATTACCTCGATCATTG
CATTCAGCAACATTAATTAATGACAAAATGTATGTTTTTGGTGGATGGATTCCATTAATA
GTTGACAATTTTACATGCGAAAAAGAATGGCAATGTACAAATACACTTGGAGTTCTTGAT
TTACGTTCAATGACATGGGAAACTGTTGATATTAATGTTGAACCATCGCCAGGTGTTGAA
GTTGATGACATACCAAGAGCAAGAGCTGGTCATTGTGCAGTTGGCATTCATGGAAGGCTT
TATATTTGGAGTGGACGTGATGGATATCGAAAGGCATGGAATAATCAGGTTTGTTGTAAA
GATCTTTGGTATCTTGAAGTTGGTCCACCATCACAAGCTCAAAGAGTTCAATTGATAAGA
GCTAGTACATCAGCACTTGAACTTTCATGGACATCCATCCCAAATGCTCAATATTATATT
CTTGAAGTTCAAAAATTACCACCAGCACCACTCAATGAAAAACCAACAACACCATTGAAG
GAAGAAAATGAAGCAAAACCATCGCCAATAACTTCACCAACATCTATTGCCTCACCTATA
ACGTCTCAACCTAAAATCATTTATCAACCATCACCAACAGGTGATAAGAAACCCATGATC
ATCCTACAACCGAAGAAACCCCTTCAAATACCAACAGTTCGTGCTCCAACAGTTCTCAAA
GTTATGCCATCAGGATCACAAACCACCACAACGACACCAATCAAAGTTGTACAAAAACCA
GCAACCATCACACCAGCAAATGTCATTCGACCATCATTTACAGGCAATGTTATTAAACTT
ATGCCAGGAACAATTTTAAGTGGCAATAAAATTATTATGAAACCAGCAACAAGTGGAACA
GTCATTACAAAACCTGCAACACCACAACAAATTATTGTACAAAGGCCAGGAATGACAGTC
AATACAGCTGGTGCGATAAAATTAAATACAGTTCAAGCACCAAGAATTGTATCAACTACA
ACACAAAGGCCAATGAATATGACAATTGGTGGACGAACAGTCACATTACAACTTGCTGGA
CAGAAAAAAGTTACACTTGTTAATGCAGCACAAGCAACTGGAAGTACACCTAAAATTATT
ATGATGCCAGCAACATCAAGTGCACAAACAACAGTTCAACAACAACAACAACAGCAGAAA
AATACACAAATTGAACAACTCGATGGCGCTGATGTTGATTTTCAAATTGATCAACTTGAT
GGAGCAGTTGATAATGAAGATGATATTGAAAGTCAAATAAGTACAGTAAAAATTGAAAAA
CATGGCGATGGTGAAGATGTTAAAGTGACAACAAGTGGTGCTGGCGATAAAATGGAAGAA
AATGAAGCTGCTGCCATTTTGAGTACAATCAGTGAAGTCAGTCAACTTTCATCGCATTCC
GTTAATACGCACATTGCCATGCATGATGAAAAAGCACTACGGAATTCACTAAATGAAAAT
CTTCTCGCCTCACCAATTATCTCAAATTCCTCAAATGATTTCAATGGTTCAATTGAAAGT
TATGGTCGTCAACATTCACTAGACGCATTGGCAGCTGCAGCAATGCAAGCATCAAATAGC
AAAGCTGTTGCAAATCTCTCACAAATTACAGAAATCAAAACAAAAGAAAATAGCAGCAAT
TCAGAAAGTGACAATGGTGAAAAATGGATGGTTGTTGGAATTTTCAAAACACTTACACAA
AATGTTACACATTATGTTGATTATCAATGGCGTGATAATCTTGATAAATTGACTAGTGAA
AATATCCCTGATTTATCATTGCTTGAAAAAATTCCGATTGAACAGGGACGAACATATCGA
TTTAGAATTGCTGGAATTAATGCTTGTGGTGTTGGAAAGTTTAGTGAGCCAATTCATTTC
AAAACTTGTCTTCCTGGCTTCCCTGGTGCACCATCTGGAATTAAAATTACAAAATCACAA
GACGGTGCTCATTTAAGCTGGGAACCACCAAGTCAATCAAACTCACAAGGTGAAATCACT
GAATATTCAGTCTATTTGGCAGTTAAAGATCCAAATCCGTCTAAAGCAACACAACTTGCA
TTTACACGAGTTTATGTTGGTCGTGAAAATCAATGTGATGTTAGCAATGATTTGATCAAA
ACAGCTCATCTTGATTCAACTAATAAACCAGCAATCATCTTCAGAATTGCTTGTAGAAAT
GAAAAAGGTTATGGACCAGCTTGTCAAATTAAATGGCTTCAAGATCCATCAACAAAAACA
ACAACAACACCAGCAGCTGGTACAGCCATAAAGCGAACAACAATGATAGCAACTCAACAG
CAACAAAAGCGCTTCCGAACACAGTAA

>g16840.t2 Gene=g16840 Length=1008
MLKWKKVTNTSGPQPRPRHGHRAVAIRELMVVFGGGNEGIVDELHVYNTYTNQWYVPATK
GEVPPGCAAYGFVVDGTRILIFGGMVEYGKYSNDLFELQATKWEWKKIRPQAPFSGDAPC
PRLGHSFTLVGDKVFLFAGLANESDDPKHNIPKYLNDLYILNTKNGNYSWEIPITYGECP
PPRESHTAIAWNDKSTGRNYLVIYGGMSGCRLGDLWILDIETLSWSKPKCSGTPPLPRSL
HSATLINDKMYVFGGWIPLIVDNFTCEKEWQCTNTLGVLDLRSMTWETVDINVEPSPGVE
VDDIPRARAGHCAVGIHGRLYIWSGRDGYRKAWNNQVCCKDLWYLEVGPPSQAQRVQLIR
ASTSALELSWTSIPNAQYYILEVQKLPPAPLNEKPTTPLKEENEAKPSPITSPTSIASPI
TSQPKIIYQPSPTGDKKPMIILQPKKPLQIPTVRAPTVLKVMPSGSQTTTTTPIKVVQKP
ATITPANVIRPSFTGNVIKLMPGTILSGNKIIMKPATSGTVITKPATPQQIIVQRPGMTV
NTAGAIKLNTVQAPRIVSTTTQRPMNMTIGGRTVTLQLAGQKKVTLVNAAQATGSTPKII
MMPATSSAQTTVQQQQQQQKNTQIEQLDGADVDFQIDQLDGAVDNEDDIESQISTVKIEK
HGDGEDVKVTTSGAGDKMEENEAAAILSTISEVSQLSSHSVNTHIAMHDEKALRNSLNEN
LLASPIISNSSNDFNGSIESYGRQHSLDALAAAAMQASNSKAVANLSQITEIKTKENSSN
SESDNGEKWMVVGIFKTLTQNVTHYVDYQWRDNLDKLTSENIPDLSLLEKIPIEQGRTYR
FRIAGINACGVGKFSEPIHFKTCLPGFPGAPSGIKITKSQDGAHLSWEPPSQSNSQGEIT
EYSVYLAVKDPNPSKATQLAFTRVYVGRENQCDVSNDLIKTAHLDSTNKPAIIFRIACRN
EKGYGPACQIKWLQDPSTKTTTTPAAGTAIKRTTMIATQQQQKRFRTQ
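
The relationship between the two records above can be illustrated by translating parts of the nucleotide sequence. This is a minimal sketch that assumes the standard genetic code and reading frame 0; `translate` is an illustrative helper, not the tool used to build this record. Only the first and last lines of the nucleotide record are used to keep the example short.

```python
# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First and last lines of the nucleotide record above.
first_line = "ATGTTAAAGTGGAAGAAAGTTACAAACACTTCAGGCCCACAGCCGAGACCACGGCATGGA"
last_line = "CAACAAAAGCGCTTCCGAACACAGTAA"

print(translate(first_line))  # MLKWKKVTNTSGPQPRPRHG
print(translate(last_line))   # QQKRFRTQ
```

Both outputs match the start and end of the protein record, which is consistent with the nucleotide record being the full CDS including the terminal stop codon.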

Protein features from InterProScan

Row Transcript Database ID Name Start End E-value
15 g16840.t2 CDD cd00063 FN3 868 966 3.10652E-4
14 g16840.t2 Gene3D G3DSA:2.120.10.80 - 2 170 4.5E-17
13 g16840.t2 Gene3D G3DSA:2.120.10.80 - 171 342 1.7E-15
11 g16840.t2 Gene3D G3DSA:2.60.40.10 Immunoglobulins 782 865 7.3E-17
12 g16840.t2 Gene3D G3DSA:2.60.40.10 Immunoglobulins 866 992 2.4E-46
10 g16840.t2 MobiDBLite mobidb-lite consensus disorder prediction 1 20 -
5 g16840.t2 PANTHER PTHR46003 HOST CELL FACTOR 2 1003 0.0
1 g16840.t2 Pfam PF01344 Kelch motif 17 56 2.1E-5
3 g16840.t2 Pfam PF13854 Kelch motif 63 99 1.5E-4
4 g16840.t2 Pfam PF13418 Galactose oxidase, central domain 121 170 1.5E-4
2 g16840.t2 Pfam PF01344 Kelch motif 237 289 3.0E-5
9 g16840.t2 SMART SM00060 FN3_2 350 852 140.0
8 g16840.t2 SMART SM00060 FN3_2 868 965 0.0039
7 g16840.t2 SUPERFAMILY SSF117281 Kelch motif 29 346 2.09E-51
6 g16840.t2 SUPERFAMILY SSF49265 Fibronectin type III 348 973 6.27E-17

Transmembrane regions from TMHMM

Disordered region

An IUPred3 score above 0.5 predicts a disordered region.
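
As a rough illustration of how the 0.5 threshold turns per-residue scores into regions, the sketch below groups consecutive residues whose score exceeds the cutoff. The function name `disordered_regions` and the example scores are hypothetical; they are not IUPred3 output for this transcript.

```python
def disordered_regions(scores, threshold=0.5):
    """Return 1-based inclusive (start, end) runs where score > threshold."""
    regions = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = pos  # open a new run
        elif start is not None:
            regions.append((start, pos - 1))  # close the current run
            start = None
    if start is not None:
        regions.append((start, len(scores)))  # run extends to the last residue
    return regions


# Hypothetical per-residue scores, for illustration only.
example_scores = [0.8, 0.7, 0.6, 0.4, 0.3, 0.55, 0.9, 0.2]
print(disordered_regions(example_scores))  # [(1, 3), (6, 7)]
```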

GO terms from InterProScan

GOID TERM ONTOLOGY
GO:0005515 protein binding MF

KEGG

Orthology

Pathway

  • This transcript belongs to the following pathways

Expression

Transcript expression in Pv11 cells

TPM values are shown as mean ± standard deviation.
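
A minimal sketch of the summary statistic, assuming the sample standard deviation (n - 1 denominator) is meant; the page does not say which form was used. The replicate values and the helper `summarize_tpm` are hypothetical.

```python
from statistics import mean, stdev


def summarize_tpm(replicates):
    """Return (mean, sample standard deviation) for replicate TPM values."""
    average = mean(replicates)
    # stdev() needs at least two values; report 0.0 for a single replicate.
    spread = stdev(replicates) if len(replicates) > 1 else 0.0
    return average, spread


# Hypothetical replicate TPM values, for illustration only.
average, spread = summarize_tpm([10.0, 12.0, 14.0])
print(f"{average:.2f} +/- {spread:.2f}")  # 12.00 +/- 2.00
```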

Differential expression

Differentially expressed genes were identified with DESeq2 using the ‘run_DE_analysis.pl’ script from Trinity. Transcripts were called differentially expressed when (1) FDR < 0.05 and (2) fold change > 2, with TPM calculated by RSEM. DE information and fold changes between conditions are shown in the plot below.
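
The two criteria can be written as a simple filter. This sketch assumes the fold-change criterion is applied symmetrically, so that a more than 2-fold decrease counts as well as a more than 2-fold increase; the text does not state this. The function `is_differentially_expressed` is illustrative and is not part of DESeq2 or Trinity.

```python
import math


def is_differentially_expressed(fdr, fold_change, fdr_cutoff=0.05, fc_cutoff=2.0):
    """Apply the two thresholds: FDR < 0.05 and fold change > 2.

    fold_change is a positive ratio between two conditions.
    Assumption: up- and down-regulation are treated symmetrically,
    so the test is |log2(fold_change)| > log2(fc_cutoff).
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    return fdr < fdr_cutoff and abs(math.log2(fold_change)) > math.log2(fc_cutoff)


print(is_differentially_expressed(0.01, 4.0))   # True: significant, 4-fold up
print(is_differentially_expressed(0.01, 0.25))  # True: significant, 4-fold down
print(is_differentially_expressed(0.01, 1.5))   # False: fold change too small
print(is_differentially_expressed(0.20, 8.0))   # False: FDR too high
```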

Raw TPM values