Macromolecular Structure Refinement against Neutron Data with SHELXL-2013

Locating hydrogen positions is one of the main purposes of neutron diffraction. Macromolecular structures usually lack the data-to-parameter ratio required for unrestrained refinement. With X-rays, the weak diffraction power of hydrogen atoms allows the use of constraints. However, we recently demonstrated [1] that hydrogen position constraints deteriorate the quality of a structure refined against neutron data even at medium resolution, whereas restraints show the expected beneficial effect.

Restraints for hydrogen atoms in Amino Acids

The same work also quantifies the restraints for hydrogen atoms in all 20 standard amino acids. The restraints in SHELX format are available for download. Please check this site for possible updates.

In order to generate restraints for ligands we recommend the Grade Server from Global Phasing. It generates restraints including hydrogen restraints in SHELX format.

References

  1. T. Gruene, H. W. Hahn, A. V. Luebben, F. Meilleur, G. M. Sheldrick, "Refinement of Macromolecular Structures against Neutron Data with SHELXL-2013", J. Appl. Cryst. 2014, 47, 462-466, DOI: 10.1107/S1600576713027659. BibTeX file.

Validation with Rcomplete

The aforementioned article makes use of the program crossflaghkl. It partitions a merged hkl-file into k subsets and writes one hkl-file per subset, in which the reflections of that subset are flagged with -1. Please cite Luebben and Gruene, PNAS (2015) when using this method.

The Linux version of the program can be downloaded here. Usage is as simple as

#> crossflaghkl -t30 yourfile.hkl

The hkl-file must be merged, because the program does not read an ins-file to merge the data.

The flag -t30 sets the size of each test set and the program creates an accordingly large number of hkl-files. If you set -t1 you will end up with one hkl-file for each reflection!
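
To get a feeling for how many hkl-files a given test-set size will produce, the number of reflections can be counted beforehand. The following is a minimal sketch and not part of crossflaghkl: the function names `count_reflections` and `estimate_sets` are hypothetical, and the estimate assumes that every reflection ends up in exactly one test set, so the real number of files may differ slightly.

```shell
#!/bin/bash
# Hypothetical helper, not part of crossflaghkl: estimate how many hkl-files
# a given test-set size would produce for a merged HKLF 4 file.

count_reflections() {
    # HKLF 4 uses fixed columns for h, k, l (3I4); the list ends at h = k = l = 0.
    awk '{
        h = substr($0, 1, 4) + 0; k = substr($0, 5, 4) + 0; l = substr($0, 9, 4) + 0
        if (h == 0 && k == 0 && l == 0) exit
        n++
    } END { print n + 0 }' "$1"
}

estimate_sets() {
    # Assumption: each reflection belongs to exactly one test set,
    # so the number of sets is ceil(nrefl / testsize).
    local nrefl=$1 testsize=$2
    echo $(( (nrefl + testsize - 1) / testsize ))
}

if [ "$#" -eq 2 ]; then
    nrefl=$(count_reflections "$1")
    echo "$nrefl reflections, about $(estimate_sets "$nrefl" "$2") hkl-files for -t$2"
fi
```

Saved under a name of your choice, the script would be called with the hkl-file and the intended test-set size as its two arguments, e.g. `yourfile.hkl 30`.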

You should use WIGL and a sufficiently large number of refinement cycles. For example, run one test refinement with shelxl to find the number of cycles at which wR2 converges, and use at least that many for all sets.
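
Each of the hkl-files then needs its own refinement run. The loop below is a sketch of how this could be scripted, under two assumptions: the hkl-files are named kcross*.hkl (inferred from the kcross*.lst names used in the scripts on this page), and a template ins-file already contains WIGL and the chosen number of cycles. The function name `run_all_sets` and the log-file names are hypothetical.

```shell
#!/bin/bash
# Sketch: refine against every hkl-file written by crossflaghkl.
# Assumptions: files are named kcross*.hkl, and the template .ins already
# contains WIGL and a sufficiently large L.S. value.

run_all_sets() {
    local template=$1 shelxl_cmd=${2:-shelxl} hkl base
    for hkl in kcross*.hkl; do
        [ -e "$hkl" ] || continue        # no match: the glob stays literal
        base=${hkl%.hkl}
        cp "$template" "$base.ins"       # SHELXL reads base.ins and base.hkl
        "$shelxl_cmd" "$base" > "$base.log" 2>&1 \
            || echo "refinement of $base failed" >&2
    done
}

if [ "$#" -ge 1 ]; then
    run_all_sets "$@"
fi
```

The optional second argument replaces the shelxl command, which allows a dry run (for example with `true`) before starting a long series of refinements.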

Rcomplete can be extracted from the resulting set of .lst-files with the following bash script:

#!/bin/bash

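# collect the line with the test-set statistics from every .lst-file
grep "Nfree(all)" kcross*.lst > Nfree_all.data
# count the matched lines only (plain wc would also print words, bytes and file name)
numentries=$(wc -l < Nfree_all.data)

# pool numerator and denominator over all test sets before dividing
awk '{ sumDF += $5; sumFo += $7; } END {print "Rcomplete= ", sumDF/sumFo; }' Nfree_all.data
echo "   from $numentries entries"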
rm -f temp.lst my_rfree.data my_r1.data
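
Written as a formula, the awk command above evaluates a pooled ratio. The notation is an assumption based on the variable names in the script: column 5 is taken to hold the sum of the differences between observed and calculated structure factor amplitudes for the test set T_j of run j, and column 7 the corresponding sum of observed amplitudes.

```latex
\[
  R_\mathrm{complete}
  = \frac{\sum_{j=1}^{k} \sum_{h \in T_j} \bigl|\, |F_o(h)| - |F_c(h)| \,\bigr|}
         {\sum_{j=1}^{k} \sum_{h \in T_j} |F_o(h)|}
\]
```

Because the sums are accumulated over all k runs before the single division, a very small or empty individual test set does not lead to a division by zero.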

For comparison, a mean R1 and a mean Rfree can be calculated likewise, using octave. In contrast to Rcomplete, the calculation of Rfree is not stable because of a possible division by zero, so the script may return an error message.

grep -A5 Free kcross*lst > temp.lst
grep Free temp.lst  | cut --output-delimiter=" " -c11-14,71-76 > my_rfree.data
grep -- "- R1 =" temp.lst | cut --output-delimiter="  " -c11-14,66-71 > my_r1.data

rm temp.lst

octave -q --no-window-system << eof
r1 = load ('my_r1.data');
rf = load ('my_rfree.data');
AveR1 = statistics(r1);
AveRfree = statistics(rf);
printf ("# mean R1 (sigma)  mean Rfree (sigma)\n");
printf ("%7.5f (%7.5f)  %7.5f (%7.5f)\n", AveR1(6,2), AveR1(7,2), AveRfree(6,2), AveRfree(7,2));
eof

The Rfree value of each individual set can be collected in rfree.data with this script:

#!/bin/bash
rm -f rfree.data
for i in kcross*.lst; do
    num=${i%.lst}
    num=${num#kcross_set}

    grep -A6 Free "$i" > temp.lst
    rfree=$(grep Free temp.lst | cut --output-delimiter="  " -c14-18,52-57)
    echo "$num  $rfree" >> rfree.data
done

rm temp.lst
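
If octave is not available, or if some runs produced no usable value, the mean and standard deviation can also be computed with awk alone. This is a sketch rather than a replacement for the octave script above: it assumes a two-column file such as my_rfree.data (set number, R value), skips lines whose second column is not a plain number, and uses the sample standard deviation (division by n - 1). The function name `mean_sigma` is hypothetical.

```shell
#!/bin/bash
# Sketch: mean and sample standard deviation of the second column of a
# two-column file such as my_rfree.data (set number, R value).

mean_sigma() {
    awk '
        # use only lines whose second field is a plain non-negative number
        $2 ~ /^[0-9]*\.?[0-9]+$/ { n++; s += $2; ss += $2 * $2 }
        END {
            if (n == 0) { print "no data"; exit }
            mean = s / n
            var = (n > 1) ? (ss - n * mean * mean) / (n - 1) : 0
            if (var < 0) var = 0          # guard against rounding errors
            printf "%.5f (%.5f)\n", mean, sqrt(var)
        }' "$1"
}

if [ "$#" -eq 1 ]; then
    mean_sigma "$1"
fi
```

The output has the same "mean (sigma)" layout as the octave script, so the numbers can be compared directly.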

Last modified: Mar 24, 2020 14:36