EMBL format
General information
The European Molecular Biology Laboratory (EMBL) maintains DNA and protein sequence databases. The format of their database entries is shown in the listing below. As with GenBank entries, a large amount of information is given for each sequence. The EMBL format uses two-letter line codes to identify the individual data fields within an entry. Every EMBL entry ends with the sequence block, which starts with the line code SQ and is closed by the // symbol that marks the end of the entry.
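The line-code layout described above lends itself to a very small reader. The following Python sketch is illustrative only: the function name parse_embl and the "sequence" key are assumptions made for this example, not part of any EMBL tool. It assumes the line code occupies the first two columns and the content starts at column 6, treats XX lines as spacers, and closes an entry at each // line. An entry that is not terminated by // is silently dropped.

```python
from collections import defaultdict


def parse_embl(text):
    """Split EMBL flat-file text into entries keyed by two-letter line code."""
    entries = []
    fields = defaultdict(list)
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):
            # Terminator line: close the current entry and start a new one.
            entries.append(dict(fields))
            fields = defaultdict(list)
            in_sequence = False
        elif in_sequence:
            # Sequence lines have no line code. Keeping letters only drops
            # the blank separators and the trailing position number.
            fields["sequence"].append("".join(ch for ch in line if ch.isalpha()))
        else:
            code, content = line[:2], line[5:].rstrip()
            if code == "XX" or not code.strip():
                continue  # spacer or blank line
            fields[code].append(content)
            if code == "SQ":
                in_sequence = True  # everything up to // is sequence data
    return entries
```

For an entry such as the one in the example section, entries[0]["AC"] would hold the accession line content and "".join(entries[0]["sequence"]) the nucleotide string. Multi-line fields (OC, RT, FT) are kept as lists of lines rather than merged, which keeps the sketch short but leaves the feature table unparsed.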
SwissProt sequence format
The SwissProt sequence format is very similar to the EMBL sequence format, but SwissProt entries contain more information about the physical and biochemical properties of the protein.
Example
ID   MMFOSB     standard; RNA; MUS; 4145 BP.
XX
AC   X14897;
XX
SV   X14897.1
XX
DT   23-NOV-1989 (Rel. 21, Created)
DT   12-SEP-1993 (Rel. 36, Last updated, Version 2)
XX
DE   Mouse fosB mRNA
XX
KW   fos cellular oncogene; fosB oncogene; oncogene.
XX
OS   Mus musculus (house mouse)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
XX
RN   [1]
RP   1-4145
RX   MEDLINE; 89251612.
RA   Zerial M., Toschi L., Ryseck R.P., Schuermann M., Mueller R., Bravo R.;
RT   "The product of a novel growth factor activated gene, fos B, interacts with
RT   JUN proteins enhancing their DNA binding activity";
RL   EMBO J. 8:805-813(1989).
XX
DR   MGD; MGI:95575; Fosb.
DR   SWISS-PROT; P13346; FOSB_MOUSE.
DR   TRANSFAC; T00291; T00291.
XX
CC   clone=AC113-1; cell line=NIH3T3;
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..4145
FT                   /db_xref="taxon:10090"
FT                   /organism="Mus musculus"
FT   CDS             1202..2218
FT                   /db_xref="SWISS-PROT:P13346"
FT                   /note="fosB protein (AA 1-338)"
FT                   /protein_id="CAA33026.1"
FT                   /translation="MFQAFPGDYDSGSRCSSSPSAESQYLSSVDSFGSPPTAAASQECA
FT                   GLGEMPGSFVPTVTAITTSQDLQWLVQPTLISSMAQSQGQPLASQPPAVDPYDMPGTSY
FT                   STPGLSAYSTGGASGSGGPSTSTTTSGPVSARPARARPRRPREETLTPEEEEKRRVRRE
FT                   RNKLAAAKCRNRRRELTDRLQAETDQLEEEKAELESEIAELQKEKERLEFVLVAHKPGC
FT                   KIPYEEGPGPGPLAEVRDLPGSTSAKEDGFGWLLPPPPPPPLPFQSSRDAPPNLTASLF
FT                   THSEVQVLGDPFPVVSPSYTSSFVLTCPEVSAFAGAQRTSGSEQPSDPLNSPSLLAL"
XX
SQ   Sequence 4145 BP; 960 A; 1186 C; 1007 G; 991 T; 1 other;
     ataaattctt attttgacac tcaccaaaat agtcacctgg aaaacccgct ttttgtgaca        60
     aagtacagaa ggcttggtca catttaaatc actgagaact agagagaaat actatcgcaa       120
     actgtaatag acattacatc cataaaagtt tccccagtcc ttattgtaat attgcacagt       180
     gcaattgcta catggcaaac tagtgtagca tagaagtcaa agcaaaaaca aaccaaagaa       240
     aggagccaca agagtaaaac tgttcaacag ttaatagttc aaactaagcc attgaatcta       300
     tcattgggat cgttaaaatg aatcttccta caccttgcag tgtatgattt aacttttaca       360
     gaacacaagc caagtttaaa atcagcagta gagatattaa aatgaaaagg tttgctaata       420
     gagtaacatt aaataccctg aaggaaaaaa aacctaaata tcaaaataac tgattaaaat       480
     tcacttgcaa attagcacac gaatatgcaa cttggaaatc atgcagtgtt ttatttaaga       540
     aaacataaaa caaaactatt aaaatagttt tagagggggt aaaatccagg tcctctgcca       600
     ggatgctaaa attagacttc aggggaattt tgaagtcttc aattttgaaa cctattaaaa       660
     agcccatgat tacagttaat taagagcagt gcacgcaaca gtgacacgcc tttagagagc       720
     attactgtgt atgaacatgt tggctgctac cagccacagt caatttaaca aggctgctca       780
     gtcatgaact taatacagag agagcacgcc taggcagcaa gcacagcttg ctgggccact       840
     ttcctccctg tcgtgacaca atcaatccgt gtacttggtg tatctgaagc gcacgctgca       900
     ccgcggcact gcccggcggg tttctgggcg gggagcgatc cccgcgtcgc cccccgtgaa       960
     accgacagag cctggacttt caggaggtac agcggcggtc tgaaggggat ctgggatctt      1020
     gcagagggaa cttgcatcga aacttgggca gttctccgaa ccggagacta agcttccccg      1080
     agcagcgcac tttggagacg tgtccggtct actccggact cgcatctcat tccactcggc      1140
     catagccttg gcttcccggc gacctcagcg tggtcacagg ggcccccctg tgcccaggga      1200
     aatgtttcaa gcttttcccg gagactacga ctccggctcc cggtgtagct catcaccctc      1260
     cgccgagtct cagtacctgt cttcggtgga ctccttcggc agtccaccca ccgccgccgc      1320
     ctcccaggag tgcgccggtc tcggggaaat gcccggctcc ttcgtgccaa cggtcaccgc      1380
     aatcacaacc agccaggatc ttcagtggct cgtgcaaccc accctcatct cttccatggc      1440
     ccagtcccag gggcagccac tggcctccca gcctccagct gttgaccctt atgacatgcc      1500
     aggaaccagc tactcaaccc caggcctgag tgcctacagc actggcgggg caagcggaag      1560
     tggtgggcct tcaaccagca caaccaccag tggacctgtg tctgcccgtc cagccagagc      1620
     caggcctaga agaccccgag aagagacact taccccagaa gaagaagaaa agcgaagggt      1680
     tcgcagagag cggaacaagc tggctgcagc taagtgcagg aaccgtcgga gggagctgac      1740
     agatcgactt caggcggaaa ctgatcagct tgaagaggaa aaggcagagc tggagtcgga      1800
     gatcgccgag ctgcaaaaag agaaggaacg cctggagttt gtcctggtgg cccacaaacc      1860
     gggctgcaag atcccctacg aagaggggcc ggggccaggc ccgctggccg aggtgagaga      1920
     tttgccaggg tcaacatccg ctaaggaaga cggcttcggc tggctgctgc cgccccctcc      1980
     accacccccc ctgcccttcc agagcagccg agacgcaccc cccaacctga cggcttctct      2040
     ctttacacac agtgaagttc aagtcctcgg cgaccccttc cccgttgtta gcccttcgta      2100
     cacttcctcg tttgtcctca cctgcccgga ggtctccgcg ttcgccggcg cccaacgcac      2160
     cagcggcagc gagcagccgt ccgacccgct gaactcgccc tcccttcttg ctctgtaaac      2220
     tctttagaca aacaaaacaa acaaacccgc aaggaacaag gaggaggaag atgaggagga      2280
     gaggggagga agcagtccgg gggtgtgtgt gtggaccctt tgactcttct gtctgaccac      2340
     ctgccgcctc tgccatcgga catgacggaa ggacctcctt tgtgttttgt gctccgtctc      2400
     tggttttctg tgccccggcg agaccggaga gctggtgact ttggggacag ggggtggggc      2460
     ggggatggac acccctcctg catatctttg tcctgttact tcaacccaac ttctggggat      2520
     agatggctgg ctgggtgggt agggtggggt gcaacgccca cctttggcgt cttgcgtgag      2580
     gctggagggg aaagggtgct gagtgtgggg tgcagggtgg gttgaggtcg agctggcatg      2640
     cacctccaga gagacccaac gaggaaatga cagcaccgtc ctgtccttct tttcccccac      2700
     ccacccatcc accctcaagg gtgcagggtg accaagatag ctctgttttg ctccctcggg      2760
     ccttagctga ttaacttaac atttccaaga ggttacaacc tcctcctgga cgaattgagc      2820
     ccccgactga gggaagtcga tgcccccttt gggagtctgc taaccccact tcccgctgat      2880
     tccaaaatgt gaacccctat ctgactgctc agtctttccc tcctgggaaa actggctcag      2940
     gttggatttt tttcctcgtc tgctacagag ccccctccca actcaggccc gctcccaccc      3000
     ctgtgcagta ttatgctatg tccctctcac cctcaccccc accccaggcg cccttggccg      3060
     tcctcgttgg gccttactgg ttttgggcag cagggggcgc tgcgacgccc atcttgctgg      3120
     agcgctttat actgtgaatg agtggtcgga ttgctgggtg cgccggatgg gattgacccc      3180
     cagccctcca aaactttccc tgggcctccc cttcttccac ttgcttcctc cctccccttg      3240
     acagggagtt agactcgaaa ggatgaccac gacgcatccc ggtggccttc ttgctcaggc      3300
     cccagacttt ttctctttaa gtccttcgcc ttccccagcc taggacgcca acttctcccc      3360
     accctgggag ccccgcatcc tctcacagag gtcgaggcaa ttttcagaga agttttcagg      3420
     gctgaggctt tggctcccct atcctcgata tttgaatccc caaatatttt tggactagca      3480
     tacttaagag ggggctgagt tcccactatc ccactccatc caattccttc agtcccaaag      3540
     acgagttctg tcccttccct ccagctttca cctcgtgaga atcccacgag tcagatttct      3600
     attttttaat attggggaga tgggccctac cgcccgtccc ccgtgctgca tggaacattc      3660
     cataccctgt cctgggccct aggttccaaa cctaatccca aaccccaccc ccagctattt      3720
     atccctttcc tggttcccaa aaagcactta tatctattat gtataaataa atatattata      3780
     tatgagtgtg cgtgtgtgtg cgtgtgcgtg cgtgcgtgcg tgcgtgcgag cttccttgtt      3840
     ttcaagtgtg ctgtggagtt caaaatcgct tctggggatt tgagtcagac tttctggctg      3900
     tccctttttg tcaccttttt gttgttgtct cggctcctct ggctgttgga gacagtcccg      3960
     gcctctccct ttatcctttc tcaagtctgt ctcgctcaga ccacttccaa catgtctcca      4020
     ctctcaatga ctctgatctc cggtntgtct gttaattctg gatttgtcgg ggacatgcaa      4080
     ttttacttct gtaagtaagt gtgactgggt ggtagatttt ttacaatcta tatcgttgag      4140
     aattc                                                                  4145
//
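The SQ line in the example announces the sequence length and the base composition (A, C, G, T and "other", the latter covering ambiguity symbols such as the single n near the end of the sequence). A reader of such entries may want to check these figures against the sequence block itself. The Python sketch below shows one possible way to do that. The function name check_sq_block and its return shape are assumptions made for this illustration, and the regular expression only covers the SQ wording seen in the example above.

```python
import re
from collections import Counter


def check_sq_block(sq_header, sequence_lines):
    """Compare the counts announced in an SQ line with the bases present."""
    # Announced values, e.g. "Sequence 4145 BP; 960 A; 1186 C; ..."
    announced = {
        name: int(num)
        for num, name in re.findall(r"(\d+)\s+(BP|A|C|G|T|other)\b", sq_header)
    }
    # Keep letters only: this drops the blank separators between the
    # ten-base blocks and the running position number at the line end.
    residues = "".join(
        ch for line in sequence_lines for ch in line if ch.isalpha()
    ).upper()
    counts = Counter(residues)
    observed = {"BP": len(residues)}
    for base in "ACGT":
        observed[base] = counts[base]
    # Everything that is not A, C, G or T is reported as "other".
    observed["other"] = len(residues) - sum(counts[b] for b in "ACGT")
    return announced == observed, announced, observed
```

The function takes the SQ line content and the list of sequence lines that follow it, and returns a flag plus both sets of counts so that a mismatch can be inspected rather than just reported.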
Please direct questions and comments to Martin Haubrock.