EMBL format
General information
The European Molecular Biology Laboratory (EMBL) maintains DNA and protein sequence databases. The format of its database entries is shown in the listing below. As in GenBank entries, a large amount of information is given for each sequence. Each line of an EMBL entry begins with a two-letter code that identifies the data field it belongs to (for example, ID for identification, AC for accession number, DE for description); XX lines are empty spacers between fields. Every EMBL entry ends with the sequence block, which starts with the SQ (sequence header) line and is terminated by a // line that marks the end of the entry.
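The line-code layout described above can be read with very little code. The following is a minimal sketch, not a full EMBL parser: the function name `parse_embl_entry` and the toy entry (with the invented identifiers TOYSEQ and ZZ99999) are assumptions made for illustration. It groups lines by their two-letter code, skips XX spacers, and collects everything between the SQ line and // as the sequence.

```python
from collections import defaultdict


def parse_embl_entry(text):
    """Split one EMBL entry into its line-code fields and its sequence.

    Hypothetical helper for illustration; it ignores multi-entry files and
    does not interpret the feature table.
    """
    fields = defaultdict(list)  # two-letter code -> list of line contents
    seq_parts = []
    in_sequence = False
    for line in text.splitlines():
        line = line.rstrip()
        if line.startswith("//"):  # terminator: end of the entry
            break
        if in_sequence:
            # Keep letters only; this drops the spacing between blocks
            # and the position counter at the end of each sequence line.
            seq_parts.append("".join(ch for ch in line if ch.isalpha()))
            continue
        code, content = line[:2], line[2:].strip()
        if code == "XX" or not code.strip():  # spacer line or blank line
            continue
        fields[code].append(content)
        if code == "SQ":  # the SQ header opens the sequence block
            in_sequence = True
    return dict(fields), "".join(seq_parts)


# Toy entry with invented identifiers, used only to exercise the function.
sample = """ID   TOYSEQ     standard; DNA; UNC; 10 BP.
XX
AC   ZZ99999;
XX
DE   Toy entry
XX
SQ   Sequence 10 BP; 3 A; 2 C; 3 G; 2 T; 0 other;
     acgtacgtag        10
//
"""

fields, sequence = parse_embl_entry(sample)
print(fields["AC"])   # contents of the AC line(s)
print(sequence)       # concatenated sequence without spaces or counters
```

Fields that span several lines, such as OC or RT, end up as lists with one item per line, so the caller can decide how to join them.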
SwissProt sequence format
The SwissProt sequence format is very similar to the EMBL sequence format, but SwissProt entries contain more information about the physical and biochemical properties of the protein.
Example
ID MMFOSB standard; RNA; MUS; 4145 BP.
XX
AC X14897;
XX
SV X14897.1
XX
DT 23-NOV-1989 (Rel. 21, Created)
DT 12-SEP-1993 (Rel. 36, Last updated, Version 2)
XX
DE Mouse fosB mRNA
XX
KW fos cellular oncogene; fosB oncogene; oncogene.
XX
OS Mus musculus (house mouse)
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
XX
RN [1]
RP 1-4145
RX MEDLINE; 89251612.
RA Zerial M., Toschi L., Ryseck R.P., Schuermann M., Mueller R., Bravo R.;
RT "The product of a novel growth factor activated gene, fos B, interacts with
RT JUN proteins enhancing their DNA binding activity";
RL EMBO J. 8:805-813(1989).
XX
DR MGD; MGI:95575; Fosb.
DR SWISS-PROT; P13346; FOSB_MOUSE.
DR TRANSFAC; T00291; T00291.
XX
CC clone=AC113-1; cell line=NIH3T3;
XX
FH Key Location/Qualifiers
FH
FT source 1..4145
FT /db_xref="taxon:10090"
FT /organism="Mus musculus"
FT CDS 1202..2218
FT /db_xref="SWISS-PROT:P13346"
FT /note="fosB protein (AA 1-338)"
FT /protein_id="CAA33026.1"
FT /translation="MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECA
FT GLGEMPGSFVPTVTAITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSY
FT STPGLSAYSTGGASGSGGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRE
FT RNKLAAAKCRNRRRELTDRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGC
FT KIPYEEGPGPGPLAEVRDLPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLF
FT THSEVQVLGDPFPVVSPSYTSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL"
XX
SQ Sequence 4145 BP; 960 A; 1186 C; 1007 G; 991 T; 1 other;
ataaattctt attttgacac tcaccaaaat agtcacctgg aaaacccgct ttttgtgaca 60
aagtacagaa ggcttggtca catttaaatc actgagaact agagagaaat actatcgcaa 120
actgtaatag acattacatc cataaaagtt tccccagtcc ttattgtaat attgcacagt 180
gcaattgcta catggcaaac tagtgtagca tagaagtcaa agcaaaaaca aaccaaagaa 240
aggagccaca agagtaaaac tgttcaacag ttaatagttc aaactaagcc attgaatcta 300
tcattgggat cgttaaaatg aatcttccta caccttgcag tgtatgattt aacttttaca 360
gaacacaagc caagtttaaa atcagcagta gagatattaa aatgaaaagg tttgctaata 420
gagtaacatt aaataccctg aaggaaaaaa aacctaaata tcaaaataac tgattaaaat 480
tcacttgcaa attagcacac gaatatgcaa cttggaaatc atgcagtgtt ttatttaaga 540
aaacataaaa caaaactatt aaaatagttt tagagggggt aaaatccagg tcctctgcca 600
ggatgctaaa attagacttc aggggaattt tgaagtcttc aattttgaaa cctattaaaa 660
agcccatgat tacagttaat taagagcagt gcacgcaaca gtgacacgcc tttagagagc 720
attactgtgt atgaacatgt tggctgctac cagccacagt caatttaaca aggctgctca 780
gtcatgaact taatacagag agagcacgcc taggcagcaa gcacagcttg ctgggccact 840
ttcctccctg tcgtgacaca atcaatccgt gtacttggtg tatctgaagc gcacgctgca 900
ccgcggcact gcccggcggg tttctgggcg gggagcgatc cccgcgtcgc cccccgtgaa 960
accgacagag cctggacttt caggaggtac agcggcggtc tgaaggggat ctgggatctt 1020
gcagagggaa cttgcatcga aacttgggca gttctccgaa ccggagacta agcttccccg 1080
agcagcgcac tttggagacg tgtccggtct actccggact cgcatctcat tccactcggc 1140
catagccttg gcttcccggc gacctcagcg tggtcacagg ggcccccctg tgcccaggga 1200
aatgtttcaa gcttttcccg gagactacga ctccggctcc cggtgtagct catcaccctc 1260
cgccgagtct cagtacctgt cttcggtgga ctccttcggc agtccaccca ccgccgccgc 1320
ctcccaggag tgcgccggtc tcggggaaat gcccggctcc ttcgtgccaa cggtcaccgc 1380
aatcacaacc agccaggatc ttcagtggct cgtgcaaccc accctcatct cttccatggc 1440
ccagtcccag gggcagccac tggcctccca gcctccagct gttgaccctt atgacatgcc 1500
aggaaccagc tactcaaccc caggcctgag tgcctacagc actggcgggg caagcggaag 1560
tggtgggcct tcaaccagca caaccaccag tggacctgtg tctgcccgtc cagccagagc 1620
caggcctaga agaccccgag aagagacact taccccagaa gaagaagaaa agcgaagggt 1680
tcgcagagag cggaacaagc tggctgcagc taagtgcagg aaccgtcgga gggagctgac 1740
agatcgactt caggcggaaa ctgatcagct tgaagaggaa aaggcagagc tggagtcgga 1800
gatcgccgag ctgcaaaaag agaaggaacg cctggagttt gtcctggtgg cccacaaacc 1860
gggctgcaag atcccctacg aagaggggcc ggggccaggc ccgctggccg aggtgagaga 1920
tttgccaggg tcaacatccg ctaaggaaga cggcttcggc tggctgctgc cgccccctcc 1980
accacccccc ctgcccttcc agagcagccg agacgcaccc cccaacctga cggcttctct 2040
ctttacacac agtgaagttc aagtcctcgg cgaccccttc cccgttgtta gcccttcgta 2100
cacttcctcg tttgtcctca cctgcccgga ggtctccgcg ttcgccggcg cccaacgcac 2160
cagcggcagc gagcagccgt ccgacccgct gaactcgccc tcccttcttg ctctgtaaac 2220
tctttagaca aacaaaacaa acaaacccgc aaggaacaag gaggaggaag atgaggagga 2280
gaggggagga agcagtccgg gggtgtgtgt gtggaccctt tgactcttct gtctgaccac 2340
ctgccgcctc tgccatcgga catgacggaa ggacctcctt tgtgttttgt gctccgtctc 2400
tggttttctg tgccccggcg agaccggaga gctggtgact ttggggacag ggggtggggc 2460
ggggatggac acccctcctg catatctttg tcctgttact tcaacccaac ttctggggat 2520
agatggctgg ctgggtgggt agggtggggt gcaacgccca cctttggcgt cttgcgtgag 2580
gctggagggg aaagggtgct gagtgtgggg tgcagggtgg gttgaggtcg agctggcatg 2640
cacctccaga gagacccaac gaggaaatga cagcaccgtc ctgtccttct tttcccccac 2700
ccacccatcc accctcaagg gtgcagggtg accaagatag ctctgttttg ctccctcggg 2760
ccttagctga ttaacttaac atttccaaga ggttacaacc tcctcctgga cgaattgagc 2820
ccccgactga gggaagtcga tgcccccttt gggagtctgc taaccccact tcccgctgat 2880
tccaaaatgt gaacccctat ctgactgctc agtctttccc tcctgggaaa actggctcag 2940
gttggatttt tttcctcgtc tgctacagag ccccctccca actcaggccc gctcccaccc 3000
ctgtgcagta ttatgctatg tccctctcac cctcaccccc accccaggcg cccttggccg 3060
tcctcgttgg gccttactgg ttttgggcag cagggggcgc tgcgacgccc atcttgctgg 3120
agcgctttat actgtgaatg agtggtcgga ttgctgggtg cgccggatgg gattgacccc 3180
cagccctcca aaactttccc tgggcctccc cttcttccac ttgcttcctc cctccccttg 3240
acagggagtt agactcgaaa ggatgaccac gacgcatccc ggtggccttc ttgctcaggc 3300
cccagacttt ttctctttaa gtccttcgcc ttccccagcc taggacgcca acttctcccc 3360
accctgggag ccccgcatcc tctcacagag gtcgaggcaa ttttcagaga agttttcagg 3420
gctgaggctt tggctcccct atcctcgata tttgaatccc caaatatttt tggactagca 3480
tacttaagag ggggctgagt tcccactatc ccactccatc caattccttc agtcccaaag 3540
acgagttctg tcccttccct ccagctttca cctcgtgaga atcccacgag tcagatttct 3600
attttttaat attggggaga tgggccctac cgcccgtccc ccgtgctgca tggaacattc 3660
cataccctgt cctgggccct aggttccaaa cctaatccca aaccccaccc ccagctattt 3720
atccctttcc tggttcccaa aaagcactta tatctattat gtataaataa atatattata 3780
tatgagtgtg cgtgtgtgtg cgtgtgcgtg cgtgcgtgcg tgcgtgcgag cttccttgtt 3840
ttcaagtgtg ctgtggagtt caaaatcgct tctggggatt tgagtcagac tttctggctg 3900
tccctttttg tcaccttttt gttgttgtct cggctcctct ggctgttgga gacagtcccg 3960
gcctctccct ttatcctttc tcaagtctgt ctcgctcaga ccacttccaa catgtctcca 4020
ctctcaatga ctctgatctc cggtntgtct gttaattctg gatttgtcgg ggacatgcaa 4080
ttttacttct gtaagtaagt gtgactgggt ggtagatttt ttacaatcta tatcgttgag 4140
aattc 4145
//
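The SQ line of the entry above declares the sequence length and base composition (4145 BP; 960 A, 1186 C, 1007 G, 991 T and 1 other, which add up to 4145). A reader of the format can use these numbers as a consistency check on the sequence block. The sketch below shows one way to do that; the function names are hypothetical, the regular expressions assume an SQ line laid out like the one in the example, and the short test sequence is a toy value rather than real data.

```python
import re
from collections import Counter


def parse_sq_header(sq_line):
    """Return (length, counts) as declared on an SQ line."""
    length = int(re.search(r"(\d+)\s+BP", sq_line).group(1))
    counts = {
        m.group(2).lower(): int(m.group(1))
        for m in re.finditer(r"(\d+)\s+(A|C|G|T|other)\s*;", sq_line)
    }
    return length, counts


def observed_counts(sequence):
    """Count a, c, g, t in the sequence; everything else is 'other'."""
    tally = Counter(sequence.lower())
    counts = {base: tally.get(base, 0) for base in "acgt"}
    counts["other"] = len(sequence) - sum(counts.values())
    return counts


def sq_header_matches(sq_line, sequence):
    """True if the declared length and composition match the sequence."""
    length, declared = parse_sq_header(sq_line)
    return length == len(sequence) and declared == observed_counts(sequence)


# SQ line copied from the example entry above.
real_header = "SQ Sequence 4145 BP; 960 A; 1186 C; 1007 G; 991 T; 1 other;"
declared_length, declared_counts = parse_sq_header(real_header)
print(declared_length, sum(declared_counts.values()))

# Toy header and sequence, used only to exercise the check.
toy_header = "SQ   Sequence 10 BP; 3 A; 2 C; 3 G; 2 T; 0 other;"
print(sq_header_matches(toy_header, "acgtacgtag"))
print(sq_header_matches(toy_header, "acgtacgtan"))
```

Ambiguity codes such as the single n in the example sequence are counted under "other", which is why that category is computed as the remainder rather than by matching a fixed set of letters.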
Please direct questions and comments to Martin Haubrock.