Protein Data Bank Changes Guide New Changes in Version 3.20 September 15, 2008
Version 3.20 of the PDB file format introduces a small number of changes and extensions supporting
the annotation practices adopted by the wwPDB. These annotation practices are described in detail
in the documentation section of the wwPDB website. The complete details of the PDB file format can
be found at http://www.wwpdb.org/docs.html.
New Records in 3.2
1. SPLIT
When a structure has too many atoms to fit in a single PDB-formatted file, it is split
into smaller entries. Each of these entries lists all related entries in a new record
named SPLIT, which appears after TITLE.
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
SPLIT      1VOQ 1VOR 1VOS 1VOU 1VOV 1VOW 1VOX 1VOY
SPLIT    2 1VP0 1VOZ
Example:
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
HEADER    RIBOSOME                                06-OCT-04   1VOV
TITLE     CRYSTAL STRUCTURE OF FIVE 70S RIBOSOMES FROM ESCHERICHIA
TITLE    2 COLI IN COMPLEX WITH PROTEIN Y. THIS FILE CONTAINS THE 30S
TITLE    3 SUBUNIT OF ONE 70S RIBOSOME.
SPLIT      1VOQ 1VOR 1VOS 1VOU 1VOV 1VOW 1VOX 1VOY
SPLIT    2 1VP0 1VOZ
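A reader of a split entry usually needs the complete list of related ID codes. The following Python sketch is not part of the format definition. It gathers the codes from SPLIT records such as those shown above, assuming whitespace-separated fields and four-character ID codes; the function name split_ids is illustrative only.

```python
def split_ids(lines):
    """Collect the entry ID codes named on SPLIT records, in file order."""
    ids = []
    for line in lines:
        if not line.startswith("SPLIT"):
            continue
        fields = line.split()[1:]  # drop the record name
        # A continuation line carries a short number before the ID codes.
        # ID codes are four characters long, so filter on length.
        ids.extend(field for field in fields if len(field) == 4)
    return ids


example = [
    "SPLIT      1VOQ 1VOR 1VOS 1VOU 1VOV 1VOW 1VOX 1VOY",
    "SPLIT    2 1VP0 1VOZ",
]
related = split_ids(example)
print(len(related), related)
```

A stricter reader would slice the fixed columns instead of splitting on whitespace; splitting is used here only to keep the sketch short.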
2. COMPND
One column has been added to the continuation field, so continuation lines can be
numbered up to 999.
Example:
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
COMPND  97 MOL_ID: 25;
COMPND  98 MOLECULE: 50S RIBOSOMAL PROTEIN L32E;
COMPND  99 CHAIN: Y;
COMPND 100 SYNONYM: HL5;
COMPND 101 MOL_ID: 26;
COMPND 102 MOLECULE: 50S RIBOSOMAL PROTEIN L37AE;
COMPND 103 CHAIN: Z;
COMPND 104 MOL_ID: 27;
COMPND 105 MOLECULE: 50S RIBOSOMAL PROTEIN L37E;
COMPND 106 CHAIN: 1;
COMPND 107 SYNONYM: L35E;
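Software that assumed a two-digit continuation number will misread lines 100 and above. The Python sketch below shows one way to read the widened field. It assumes the continuation number occupies columns 8-10 and the specification text columns 11-80, and that tokens are separated by semicolons as in the example above; parse_compnd is a hypothetical helper, and escaped semicolons inside values are not handled.

```python
def parse_compnd(lines):
    """Return (continuation numbers, {mol_id: {token: value}})."""
    numbers, text = [], []
    for line in lines:
        if line[:6] != "COMPND":
            continue
        cont = line[7:10].strip()          # columns 8-10: continuation number
        numbers.append(int(cont) if cont else 1)
        text.append(line[10:80].strip())   # columns 11-80: specification text

    molecules, current = {}, None
    for spec in " ".join(text).split(";"):
        token, sep, value = spec.partition(":")
        if not sep:
            continue                       # empty tail after the last ';'
        token, value = token.strip(), value.strip()
        if token == "MOL_ID":
            current = molecules.setdefault(int(value), {})
        elif current is not None:
            current[token] = value
    return numbers, molecules


records = [
    "COMPND  97 MOL_ID: 25;",
    "COMPND  98 MOLECULE: 50S RIBOSOMAL PROTEIN L32E;",
    "COMPND  99 CHAIN: Y;",
    "COMPND 100 SYNONYM: HL5;",
    "COMPND 101 MOL_ID: 26;",
    "COMPND 102 MOLECULE: 50S RIBOSOMAL PROTEIN L37AE;",
    "COMPND 103 CHAIN: Z;",
]
numbers, molecules = parse_compnd(records)
print(numbers)
print(molecules)
```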
3. SOURCE
The NCBI Taxonomy IDs for the organism (ORGANISM_TAXID) and expression system
(EXPRESSION_SYSTEM_TAXID) are now included in the SOURCE record if they are available.
Example:
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
SOURCE    MOL_ID: 1;
SOURCE   2 ORGANISM_SCIENTIFIC: LACTOBACILLUS CASEI;
SOURCE   4 ORGANISM_TAXID: 1582;
SOURCE   5 GENE: THYA;
SOURCE   6 EXPRESSION_SYSTEM: ESCHERICHIA COLI;
SOURCE   8 EXPRESSION_SYSTEM_TAXID: 562;
SOURCE   9 EXPRESSION_SYSTEM_STRAIN: CHI2913RECA;
SOURCE  10 EXPRESSION_SYSTEM_VECTOR_TYPE: PLASMID;
SOURCE  11 EXPRESSION_SYSTEM_PLASMID: PSCTS9
Example:
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
SOURCE  99 ORGANISM_SCIENTIFIC: ESCHERICHIA COLI;
SOURCE 100 MOL_ID: 29;
SOURCE 101 ORGANISM_SCIENTIFIC: ESCHERICHIA COLI;
SOURCE 102 MOL_ID: 30;
SOURCE 103 ORGANISM_SCIENTIFIC: ESCHERICHIA COLI;
SOURCE 104 MOL_ID: 31;
SOURCE 105 ORGANISM_SCIENTIFIC: ESCHERICHIA COLI;
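The two taxonomy tokens can be picked out of SOURCE records with a short pattern match. The Python sketch below is illustrative only: the function name taxonomy_ids is an assumption, and the pattern tolerates a missing trailing semicolon because the tokens are optional and not every file terminates the last token.

```python
import re

# Matches either of the two new tokens followed by a numeric NCBI Taxonomy ID.
TAXID = re.compile(r"\b(ORGANISM_TAXID|EXPRESSION_SYSTEM_TAXID):\s*(\d+)")


def taxonomy_ids(lines):
    """Return {token: [taxonomy IDs]} found on SOURCE records."""
    found = {}
    for line in lines:
        if not line.startswith("SOURCE"):
            continue
        for token, value in TAXID.findall(line):
            found.setdefault(token, []).append(int(value))
    return found


records = [
    "SOURCE    MOL_ID: 1;",
    "SOURCE   2 ORGANISM_SCIENTIFIC: LACTOBACILLUS CASEI;",
    "SOURCE   4 ORGANISM_TAXID: 1582;",
    "SOURCE   6 EXPRESSION_SYSTEM: ESCHERICHIA COLI;",
    "SOURCE   8 EXPRESSION_SYSTEM_TAXID: 562",
]
ids = taxonomy_ids(records)
print(ids)
```

Entries without taxonomy information simply yield an empty dictionary, which matches the "if they are available" wording above.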
4. EXPDTA
Experimental methods are now standardized and taken from an enumerated list. Entries from
joint refinement or hybrid methods list each method used. NMR entries are identified as
solution or solid-state NMR, and EM entries as electron crystallography or electron
microscopy. The allowed values are:
X-RAY DIFFRACTION
NEUTRON DIFFRACTION
FIBER DIFFRACTION
ELECTRON CRYSTALLOGRAPHY
ELECTRON MICROSCOPY
SOLUTION NMR
SOLID-STATE NMR
SOLUTION SCATTERING
Example 1:
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
EXPDTA    SOLID-STATE NMR
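Because the values now come from a fixed list, an EXPDTA record can be checked mechanically. The Python sketch below makes two assumptions that this guide does not state: the technique text sits in columns 11-79, and multiple methods are separated by semicolons. The name expdta_methods is hypothetical, and only a single-line record is handled.

```python
ALLOWED_METHODS = frozenset({
    "X-RAY DIFFRACTION",
    "NEUTRON DIFFRACTION",
    "FIBER DIFFRACTION",
    "ELECTRON CRYSTALLOGRAPHY",
    "ELECTRON MICROSCOPY",
    "SOLUTION NMR",
    "SOLID-STATE NMR",
    "SOLUTION SCATTERING",
})


def expdta_methods(line, separator=";"):
    """Return the methods on one EXPDTA line, rejecting unlisted values."""
    if not line.startswith("EXPDTA"):
        raise ValueError("not an EXPDTA record")
    # Assumed layout: technique text in columns 11-79.
    methods = [m.strip() for m in line[10:79].split(separator) if m.strip()]
    unknown = [m for m in methods if m not in ALLOWED_METHODS]
    if unknown:
        raise ValueError(f"unrecognized method(s): {unknown}")
    return methods


single = expdta_methods("EXPDTA".ljust(10) + "SOLID-STATE NMR")
joint = expdta_methods("EXPDTA".ljust(10) + "NEUTRON DIFFRACTION; X-RAY DIFFRACTION")
print(single, joint)
```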
5. NUMMDL
The new record NUMMDL specifies the number of models. Appears after EXPDTA.
6. MDLTYP
The new record MDLTYP flags entries whose coordinates are a minimized average structure or
whose chains contain only C-alpha (CA) or P atoms. It appears after EXPDTA and may use
continuation lines.
7. JRNL
Country code, ASTM and ISBN have been removed.
The PubMed ID and corresponding Digital Object Identifier (DOI) for the primary citation
have been added. Each DOI consists of a publisher prefix, a slash ("/"), and a suffix of
numbers and letters of any length. The PubMed ID and DOI start at column 20.
Example:
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
JRNL        REF    PROC.NATL.ACAD.SCI.USA        V. 105  4621 2008
JRNL        REFN                   ISSN 0027-8424
JRNL        PMID   18344321
JRNL        DOI    10.1073/pnas.0712393105
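The new identifiers can be read by slicing from column 20, and a DOI can be separated into its two parts at the first slash. The Python sketch below is a minimal illustration, assuming the sub-record name (PMID or DOI) sits in columns 13-16; the helper names journal_ids and split_doi are not part of the format, and continuation lines for long DOIs are not handled.

```python
def journal_ids(lines):
    """Return the PMID and DOI values found on JRNL records."""
    ids = {}
    for line in lines:
        if line[:4] != "JRNL":
            continue
        kind = line[12:16].strip()           # assumed: columns 13-16
        if kind in ("PMID", "DOI"):
            ids[kind] = line[19:79].strip()  # value starts at column 20
    return ids


def split_doi(doi):
    """Split a DOI into (publisher prefix, suffix) at the first slash."""
    prefix, slash, suffix = doi.partition("/")
    if not (prefix and slash and suffix):
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


# ljust pads each field so that the value begins at column 20.
records = [
    "JRNL".ljust(12) + "PMID".ljust(7) + "18344321",
    "JRNL".ljust(12) + "DOI".ljust(7) + "10.1073/pnas.0712393105",
]
ids = journal_ids(records)
print(ids["PMID"], split_doi(ids["DOI"]))
```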
8. REMARK 2
The format has been changed so that low-resolution values can be displayed. The resolution
field is now f7.2, which leaves room for a leading space and values up to 100.00 ANGSTROMS.
Example:
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK   2 RESOLUTION.    1.74 ANGSTROMS.
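An f7.2 field is seven characters wide with two decimal places, which corresponds to the Python format specifier 7.2f. The sketch below writes and reads a REMARK 2 line under the assumption that the resolution occupies columns 24-30; that column range and both function names are assumptions of this example, not statements from this guide.

```python
def format_remark2(resolution):
    """Write a REMARK 2 line with the resolution as an f7.2 field."""
    field = f"{resolution:7.2f}"
    if len(field) != 7:
        raise ValueError("value does not fit in an f7.2 field")
    # Assumed layout: record name, remark number, label, then columns 24-30.
    return f"REMARK{2:>4} RESOLUTION. {field} ANGSTROMS."


def parse_remark2(line):
    """Read the resolution back from columns 24-30."""
    return float(line[23:30])


high = format_remark2(1.74)
low = format_remark2(100.0)
print(high)
print(low)
```

The second call shows the case the wider field was introduced for: 100.00 needs six characters, which fits in seven with one leading space.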
9. REMARK 4
The format version is now written as x.xx, followed by a date in the uniform format
dd-mmm-yy.
Example:
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK   4 2G86 COMPLIES WITH FORMAT V. 3.20, 01-AUG-07
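With a fixed x.xx version and a dd-mmm-yy date, the line can be matched with a single regular expression. The Python sketch below is one possible reading, not a reference implementation. Month names are looked up in a table rather than through locale-dependent parsing, and the two-digit year is expanded with a pivot of 70 (values of 70 and above are read as 19yy), which is an assumption of this example.

```python
import re
from datetime import date

MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")

REMARK4 = re.compile(
    r"^REMARK\s+4\s+(\w{4})\s+COMPLIES WITH FORMAT V\.\s*(\d\.\d{2}),\s*"
    r"(\d{2})-([A-Z]{3})-(\d{2})\s*$"
)


def parse_remark4(line, pivot=70):
    """Return (ID code, version string, date), or None if the line differs."""
    match = REMARK4.match(line)
    if match is None:
        return None
    id_code, version, day, month, year = match.groups()
    year = int(year)
    year += 1900 if year >= pivot else 2000   # assumed century pivot
    return id_code, version, date(year, MONTHS.index(month) + 1, int(day))


result = parse_remark4("REMARK   4 2G86 COMPLIES WITH FORMAT V. 3.20, 01-AUG-07")
print(result)
```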
Examples:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK 100 THIS ENTRY HAS BEEN PROCESSED BY RCSB ON 10-MAR-06.
REMARK 100 THE RCSB ID CODE IS RCSB036809.
Example:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK 200 DATE OF DATA COLLECTION : 01-JAN-96
Example:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK 240
REMARK 240 EXPERIMENTAL DETAILS
REMARK 240 RECONSTRUCTION METHOD : CRYSTALLOGRAPHY
REMARK 240 SAMPLE TYPE : 2D CRYSTAL
REMARK 240 SPECIMEN TYPE : VITREOUS ICE (CRYO EM)
REMARK 240 DATA ACQUISITION
REMARK 240 DATE OF DATA COLLECTION : 01-DEC-03
REMARK 240 TEMPERATURE (KELVIN) : 300.0
REMARK 240 PH : 6.00
REMARK 240 NUMBER OF CRYSTALS USED : 286
REMARK 240 MICROSCOPE MODEL : JEM3000SFF
REMARK 240 DETECTOR TYPE : CCD
REMARK 240 ACCELERATION VOLTAGE (KV) : 300
REMARK 240 NUMBER OF UNIQUE REFLECTIONS : 22293
REMARK 240 RESOLUTION RANGE HIGH (A) : 1.9
REMARK 240 RESOLUTION RANGE LOW (A) : 20.000
REMARK 240 DATA SCALING SOFTWARE : SOFTWARE
REMARK 240 COMPLETENESS FOR RANGE (%) : 80.0
REMARK 240 DATA REDUNDANCY : 5.700
REMARK 240 IN THE HIGHEST RESOLUTION SHELL.
REMARK 240 HIGHEST RESOLUTION SHELL, RANGE HIGH (A) : 1.90
REMARK 240 HIGHEST RESOLUTION SHELL, RANGE LOW (A) : 2.0
REMARK 240 COMPLETENESS FOR SHELL (%) : 82.0
REMARK 240 DATA REDUNDANCY IN SHELL : 5.70
REMARK 240 R MERGE FOR SHELL (I) : 0.166
REMARK 240 METHOD USED TO DETERMINE THE STRUCTURE: MOLECULAR
REMARK 240 REPLACEMENT
REMARK 240 SOFTWARE USED : CNS
REMARK 240 STARTING MODEL : PDB ENTRY 1SOR
Example:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK 245
REMARK 245 EXPERIMENTAL DETAILS
REMARK 245 RECONSTRUCTION METHOD : SINGLE PARTICLE
REMARK 245 SPECIMEN TYPE : VITREOUS ICE (CRYO EM)
REMARK 245
REMARK 245 ELECTRON MICROSCOPE SAMPLE
REMARK 245 SAMPLE TYPE : PARTICLE
REMARK 245 PARTICLE TYPE : MIXED SYMMETRY
REMARK 245 NAME OF SAMPLE : BACTERIOPHAGE T4
REMARK 245 SAMPLE CONCENTRATION (MG ML-1) : 20.00
REMARK 245 SAMPLE SUPPORT DETAILS : NULL
REMARK 245 SAMPLE VITRIFICATION DETAILS : NULL
REMARK 245 SAMPLE BUFFER : H2O
REMARK 245 PH : 7.50
REMARK 245 SAMPLE DETAILS : PHAGE
REMARK 245
REMARK 245 DATA ACQUISITION
REMARK 245 DATE OF EXPERIMENT : 06-JAN-02
REMARK 245 NUMBER OF MICROGRAPHS-IMAGES : NULL
REMARK 245 TEMPERATURE (KELVIN) : 100.00
REMARK 245 MICROSCOPE MODEL : FEI/PHILIPS CM300FEG/T
REMARK 245 DETECTOR TYPE : NULL
REMARK 245 MINIMUM DEFOCUS (NM) : 500.00
REMARK 245 MAXIMUM DEFOCUS (NM) : 3400.00
REMARK 245 MINIMUM TILT ANGLE (DEGREES) : 0.00
REMARK 245 MAXIMUM TILT ANGLE (DEGREES) : 0.00
REMARK 245 NOMINAL CS : 1.40
REMARK 245 IMAGING MODE : BRIGHT FIELD
REMARK 245 ELECTRON DOSE (ELECTRONS NM**-2) : 20.00
REMARK 245 ILLUMINATION MODE : SPOT SCAN
REMARK 245 NOMINAL MAGNIFICATION : 45000
REMARK 245 CALIBRATED MAGNIFICATION : 47000
REMARK 245 SOURCE : FIELD EMISSION GUN
REMARK 245 ACCELERATION VOLTAGE (KV) : 300
REMARK 245 IMAGING DETAILS : NULL
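The REMARK 240 and REMARK 245 templates are lists of "NAME : VALUE" lines, with NULL standing in for unreported items. The Python sketch below turns such a block into a dictionary. The function name remark_fields is hypothetical, values are kept as strings, and the code assumes a name does not itself contain a colon.

```python
def remark_fields(lines, number):
    """Return {name: value} for 'NAME : VALUE' lines of one REMARK number."""
    prefix = f"REMARK {number:>3}"
    fields = {}
    for line in lines:
        if not line.startswith(prefix):
            continue
        name, sep, value = line[len(prefix):].partition(":")
        if not sep:
            continue                 # blank line or section heading
        value = value.strip()
        fields[name.strip()] = None if value == "NULL" else value
    return fields


block = [
    "REMARK 245",
    "REMARK 245 DATA ACQUISITION",
    "REMARK 245   MICROSCOPE MODEL                : FEI/PHILIPS CM300FEG/T",
    "REMARK 245   DETECTOR TYPE                   : NULL",
    "REMARK 245   ACCELERATION VOLTAGE (KV)       : 300",
]
fields = remark_fields(block, 245)
print(fields)
```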
Example:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK 300 THE ASSEMBLY REPRESENTED IN THIS ENTRY HAS REGULAR
REMARK 300 CYCLIC POINT SYMMETRY (SCHOENFLIES SYMBOL = C38).
In non-SPLIT entries, REMARK 350 reports the full quaternary structure and software-
calculated outputs, if available.
For NMR ensembles, missing residues are listed together with the range of models they
apply to.
Example:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK 465 MISSING RESIDUES
REMARK 465 THE FOLLOWING RESIDUES WERE NOT LOCATED IN THE
REMARK 465 EXPERIMENT. (RES=RESIDUE NAME; C=CHAIN IDENTIFIER;
REMARK 465 SSSEQ=SEQUENCE NUMBER; I=INSERTION CODE.)
REMARK 465 MODELS 1-20
REMARK 465 RES C SSSEQI
REMARK 465 MET A 1
REMARK 465 GLY A 2
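The model range and the residue table can be read together. The Python sketch below parses a REMARK 465 block laid out like the example above. It assumes whitespace-separated fields, a non-blank chain identifier, and an insertion code attached directly to the sequence number; missing_residues is an illustrative name.

```python
import re


def missing_residues(lines):
    """Return ((first model, last model), [(name, chain, seq, icode), ...])."""
    models, residues, in_table = None, [], False
    for line in lines:
        fields = line.split()
        if fields[:2] != ["REMARK", "465"]:
            continue
        body = fields[2:]
        if body[:1] == ["MODELS"] and len(body) == 2:
            first, _, last = body[1].partition("-")
            models = (int(first), int(last or first))
        elif body == ["RES", "C", "SSSEQI"]:
            in_table = True          # residue rows follow this header
        elif in_table and len(body) == 3:
            name, chain, seq = body
            match = re.fullmatch(r"(-?\d+)([A-Z]?)", seq)
            if match:
                residues.append((name, chain, int(match.group(1)), match.group(2)))
    return models, residues


block = [
    "REMARK 465 MISSING RESIDUES",
    "REMARK 465 THE FOLLOWING RESIDUES WERE NOT LOCATED IN THE",
    "REMARK 465   MODELS 1-20",
    "REMARK 465     RES C SSSEQI",
    "REMARK 465     MET A     1",
    "REMARK 465     GLY A     2",
]
models, residues = missing_residues(block)
print(models, residues)
```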
Example:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK 470 MISSING ATOM
REMARK 470 THE FOLLOWING RESIDUES HAVE MISSING ATOMS(RES=RESIDUE NAME;
REMARK 470 C=CHAIN IDENTIFIER; SSEQ=SEQUENCE NUMBER; I=INSERTION CODE):
REMARK 470 MODELS 1-25
REMARK 470 RES CSSEQI ATOMS
REMARK 470 ILE A 20 CD1
REMARK 470 THR A 59 CG2
Example:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK 800
REMARK 800 SITE
REMARK 800 SITE_IDENTIFIER: AC1
REMARK 800 EVIDENCE_CODE: SOFTWARE
REMARK 800 SITE_DESCRIPTION: BINDING SITE FOR RESIDUE BIL A 19
REMARK 800
REMARK 800 SITE_IDENTIFIER: CAT
REMARK 800 EVIDENCE_CODE: AUTHOR
REMARK 800 SITE_DESCRIPTION: DESIGNATED RECOGNITION REGION IN PRIMARY
REMARK 800 REFERENCE. PROPOSED TO AFFECT SUBSTRATE SPECIFICITY.
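Each site in REMARK 800 is a group of three labelled items, and a long description can continue on the next line. The Python sketch below groups the items per site and joins continuation lines. It is a reading of the example above rather than a definition: the name parse_sites is hypothetical, and a new site is assumed to start at each SITE_IDENTIFIER line.

```python
SITE_KEYS = ("SITE_IDENTIFIER", "EVIDENCE_CODE", "SITE_DESCRIPTION")


def parse_sites(lines):
    """Return one dictionary per site described in REMARK 800."""
    sites, key = [], None
    for line in lines:
        if not line.startswith("REMARK 800"):
            continue
        text = line[10:].strip()
        head, sep, tail = text.partition(":")
        if sep and head in SITE_KEYS:
            if head == "SITE_IDENTIFIER" or not sites:
                sites.append({})     # a new site starts here
            key = head
            sites[-1][key] = tail.strip()
        elif text and text != "SITE" and key is not None:
            sites[-1][key] += " " + text   # continuation of the last item
        else:
            key = None               # blank line or the SITE heading
    return sites


block = [
    "REMARK 800",
    "REMARK 800 SITE",
    "REMARK 800 SITE_IDENTIFIER: AC1",
    "REMARK 800 EVIDENCE_CODE: SOFTWARE",
    "REMARK 800 SITE_DESCRIPTION: BINDING SITE FOR RESIDUE BIL A 19",
    "REMARK 800",
    "REMARK 800 SITE_IDENTIFIER: CAT",
    "REMARK 800 EVIDENCE_CODE: AUTHOR",
    "REMARK 800 SITE_DESCRIPTION: DESIGNATED RECOGNITION REGION IN PRIMARY",
    "REMARK 800 REFERENCE. PROPOSED TO AFFECT SUBSTRATE SPECIFICITY.",
]
sites = parse_sites(block)
print(sites)
```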
Example, DBREF:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
DBREF 2J83 A 61 322 UNP Q8TL28 Q8TL28_METAC 61 322
In the new format, 20 characters are reserved for accession code and 10 characters are
reserved for DB numbering.
Template:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK 630 MOLECULE TYPE:
REMARK 630 MOLECULE NAME:
REMARK 630 (M=MODEL NUMBER; RES=RESIDUE NAME; C=CHAIN IDENTIFIER;
REMARK 630 SSSEQ=SEQUENCE NUMBER; I=INSERTION CODE.)
REMARK 630
REMARK 630 M RES C SSSEQI
REMARK 630 SOURCE:
REMARK 630 SUBCOMP:
REMARK 630 STRUCTURE DETAILS:
REMARK 630 OTHER DETAILS:
21. REMARK 3 for Phenix template/example and joint refinement for X-ray/Neutron
Example, PHENIX:
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
REMARK 3
REMARK 3 REFINEMENT.
REMARK 3 PROGRAM : PHENIX (PHENIX.REFINE)
REMARK 3 AUTHORS : PAUL ADAMS,PAVEL AFONINE,VICENT CHEN,IAN
REMARK 3 : DAVIS,KRESHNA GOPAL,RALF GROSSE-
REMARK 3 : KUNSTLEVE,LI-WEI HUNG,ROBERT IMMORMINO,
REMARK 3 : TOM IOERGER,AIRLIE MCCOY,ERIK MCKEE,NIGEL
REMARK 3 : MORIARTY,REETAL PAI,RANDY READ,JANE
REMARK 3 : RICHARDSON,DAVID RICHARDSON,TOD ROMO,JIM
REMARK 3 : SACCHETTINI,NICHOLAS SAUTER,JACOB SMITH,
REMARK 3 : LAURENT STORONI,TOM TERWILLIGER,PETER
REMARK 3 : ZWART
REMARK 3
REMARK 3 REFINEMENT TARGET : ML
REMARK 3
REMARK 3 DATA USED IN REFINEMENT.
REMARK 3 RESOLUTION RANGE HIGH (ANGSTROMS) : 2.99
REMARK 3 RESOLUTION RANGE LOW (ANGSTROMS) : 40.07
REMARK 3 MIN(FOBS/SIGMA_FOBS) : 0.000
REMARK 3 COMPLETENESS FOR RANGE (%) : 96.7
REMARK 3 NUMBER OF REFLECTIONS : 242645
REMARK 3
REMARK 3 FIT TO DATA USED IN REFINEMENT.
REMARK 3 R VALUE (WORKING + TEST SET) : 0.293
REMARK 3 R VALUE (WORKING SET) : 0.291
REMARK 3 FREE R VALUE : 0.335
REMARK 3 FREE R VALUE TEST SET SIZE (%) : 4.980
REMARK 3 FREE R VALUE TEST SET COUNT : 12081
REMARK 3
REMARK 3 FIT TO DATA USED IN REFINEMENT (IN BINS).
REMARK 3 BIN RESOLUTION RANGE COMPL. NWORK NFREE RWORK RFREE
REMARK 3 1 40.0700 - 9.2600 0.98 8197 419 0.1970 0.2050
REMARK 3 2 9.2600 - 7.3700 0.98 7994 409 0.1560 0.1990
REMARK 3 3 7.3700 - 6.4400 0.99 7965 413 0.2060 0.2470
REMARK 3 4 6.4400 - 5.8500 0.99 7924 426 0.2330 0.2740
REMARK 3 5 5.8500 - 5.4300 0.98 7833 444 0.2550 0.3160
REMARK 3 6 5.4300 - 5.1200 0.98 7811 408 0.2530 0.3110
REMARK 3 7 5.1200 - 4.8600 0.97 7819 387 0.2550 0.3210
REMARK 3 8 4.8600 - 4.6500 0.97 7693 423 0.2690 0.3260
REMARK 3 9 4.6500 - 4.4700 0.97 7737 394 0.2790 0.2920
REMARK 3 10 4.4700 - 4.3200 0.97 7691 403 0.2690 0.3280
REMARK 3 11 4.3200 - 4.1800 0.97 7731 402 0.2560 0.3040
REMARK 3 12 4.1800 - 4.0600 0.98 7760 407 0.2610 0.3170
REMARK 3 13 4.0600 - 3.9500 0.97 7685 398 0.2710 0.3070
REMARK 3 14 3.9500 - 3.8600 0.98 7758 403 0.2970 0.3650
REMARK 3 15 3.8600 - 3.7700 0.98 7713 431 0.2890 0.3260
REMARK 3 16 3.7700 - 3.6900 0.98 7737 386 0.2870 0.3520
REMARK 3 17 3.6900 - 3.6200 0.98 7719 410 0.2910 0.3230
REMARK 3 18 3.6200 - 3.5500 0.98 7683 426 0.2770 0.3200
REMARK 3 19 3.5500 - 3.4800 0.98 7756 375 0.2950 0.3480
REMARK 3 20 3.4800 - 3.4300 0.98 7720 414 0.3110 0.3780
REMARK 3 21 3.4300 - 3.3700 0.98 7742 372 0.3200 0.3760
REMARK 3 22 3.3700 - 3.3200 0.98 7667 411 0.3440 0.4360
REMARK 3 23 3.3200 - 3.2700 0.98 7700 414 0.3410 0.3840
REMARK 3 24 3.2700 - 3.2200 0.97 7667 411 0.3350 0.3870
REMARK 3 25 3.2200 - 3.1800 0.97 7541 419 0.3400 0.3790
REMARK 3 26 3.1800 - 3.1400 0.96 7637 402 0.3460 0.4220
REMARK 3 27 3.1400 - 3.1000 0.96 7613 381 0.3580 0.3940
REMARK 3 28 3.1000 - 3.0600 0.96 7538 427 0.3790 0.4290
REMARK 3 29 3.0600 - 3.0300 0.95 7440 376 0.3760 0.4350
REMARK 3 30 3.0300 - 2.9900 0.77 6093 290 0.3950 0.4490
REMARK 3
REMARK 3 BULK SOLVENT MODELLING.