3D structure

From NMReDATA


In versions before 2.0, the format does not specify whether the structure in the .sdf file is “flat” or a true “3D structure”. Version 2.0 clarifies this.

The proposal is that .sdf files can contain two structures (the SDF format allows any number of structures, so no rule is violated). When there is only one structure, it should be “flat” (z coordinates set to zero) with all known stereo information properly encoded in it. When there are two structures, both should use the same atom numbering, but one is for “flat” display and the other is the 3D structure used for measuring distances, angles, dihedral angles, etc.

Molblock (2D/3D) structures

The SDF file format allows multiple structures/models/frames to be included in a single SDF file. They are separated by a line containing "$$$$".
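As a minimal sketch of this separation, assuming the whole file has already been read into a string, the records could be split as follows. The function name `split_sdf_records` is hypothetical and not part of any NMReDATA tool.

```python
def split_sdf_records(sdf_text):
    """Return the records of an SDF file, one string per structure.

    Each record holds the molblock and any data items that precede
    its "$$$$" terminator line.
    """
    records = []
    current = []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Tolerate a last record that lacks its "$$$$" terminator.
    if any(line.strip() for line in current):
        records.append("\n".join(current))
    return records
```

Splitting line by line, rather than on the raw substring, avoids cutting a record where "$$$$" happens to appear inside a longer line.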

In the NMReDATA format, the first structure is always the "flat" 2D structure. "Flat" does not mean that chirality is unspecified, only that all z coordinates are set to zero.

Version 2.0 will introduce the possibility of including a 3D structure (in addition to the first one, not replacing it!).

The second structure (3D, with non-zero z coordinates) may be added by simply appending a molblock to the SDF file and terminating it, as usual, with a "$$$$" line.

It should fulfil the following conditions: the order of atoms and bonds should be the same as in the main (first) structure. The "only" difference should be the x, y, z coordinates, which correspond to the determined 3D structure instead of having z set to zero as in the "flat" structure.
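The conditions above could be checked along the following lines. This is only a sketch: it assumes V2000 molblocks with the usual fixed-column layout, and the names `parse_molblock` and `check_pair` are hypothetical. It compares element symbols, bonded atom pairs and bond types, and leaves the bond stereo field out of the comparison, which is a choice made for this sketch and not something the format prescribes.

```python
def parse_molblock(molblock):
    """Parse a V2000 molblock into (atoms, bonds) using fixed columns."""
    lines = molblock.splitlines()
    n_atoms = int(lines[3][0:3])  # counts line, columns 1-3
    n_bonds = int(lines[3][3:6])  # counts line, columns 4-6
    atoms = []
    for line in lines[4:4 + n_atoms]:
        xyz = (float(line[0:10]), float(line[10:20]), float(line[20:30]))
        atoms.append((line[31:34].strip(), xyz))
    bonds = []
    first_bond = 4 + n_atoms
    for line in lines[first_bond:first_bond + n_bonds]:
        # first atom, second atom, bond type
        bonds.append((int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


def check_pair(flat_block, block_3d, tol=1e-6):
    """Return a list of problems found; an empty list means none."""
    problems = []
    flat_atoms, flat_bonds = parse_molblock(flat_block)
    atoms_3d, bonds_3d = parse_molblock(block_3d)
    if [sym for sym, _ in flat_atoms] != [sym for sym, _ in atoms_3d]:
        problems.append("atom order or element symbols differ")
    if flat_bonds != bonds_3d:
        problems.append("bond order or bond types differ")
    if any(abs(xyz[2]) > tol for _, xyz in flat_atoms):
        problems.append("first structure has non-zero z coordinates")
    if all(abs(xyz[2]) <= tol for _, xyz in atoms_3d):
        problems.append("second structure has all z coordinates at zero")
    return problems
```

The last check should be read as a warning more than an error, since a planar molecule could in principle have a genuine 3D structure lying in the z = 0 plane.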

To comply with the official specification of the MOLfile format and, hence, ensure compatibility of the files with other software, the second line in the header of each molblock should include either "2D" or "3D" (the 'dimensional codes') in columns 21 and 22 (the dd below):

Line 2 has the format:
IIPPPPPPPPMMDDYYHHmmddSSssssssssssEEEEEEEEEEEERRRRRR
A2<--A8--><---A10-->A2I2<--F10.5-><---F12.5--><-I6->
User's first and last initials (I), program name (P), date/time (M/D/Y,H:m),
dimensional codes (d), scaling factors (S, s), energy (E) if modelling program input,
internal registry number (R) if input through MDL form.
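To illustrate the column positions, here is a small sketch that reads and writes the dimensional code. Columns 21 and 22 of the second header line correspond to the Python slice `[20:22]`. The function names are hypothetical, and the sketch assumes the molblock has at least two lines.

```python
def dimensional_code(molblock):
    """Return "2D" or "3D" from header line 2, or None if absent."""
    lines = molblock.splitlines()
    header = lines[1] if len(lines) > 1 else ""
    code = header[20:22]
    return code if code in ("2D", "3D") else None


def with_dimensional_code(molblock, code):
    """Return a copy of the molblock with the given code in columns 21-22."""
    if code not in ("2D", "3D"):
        raise ValueError("code must be '2D' or '3D'")
    lines = molblock.split("\n")
    # Pad a short or empty header so that columns 21-22 exist.
    header = lines[1].ljust(22)
    lines[1] = header[:20] + code + header[22:]
    return "\n".join(lines)
```

A header line that is empty or too short simply yields `None` when read, so a caller can decide whether to fall back on inspecting the z coordinates.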

Note that future developments may require additional structures to be included (for example for multiple conformations, DFT/GIAO data...). We will need to make sure that software can unambiguously find the correct 3D structure. We may therefore have to add an additional flag to indicate which 3D structure corresponds to the main structure of the NMReDATA. For now, we can consider that the second structure in the file is the 3D structure and ignore any additional ones (third, fourth, etc.).

We strongly recommend associating all the NMReDATA tags with the first structure, i.e. including them before the first "$$$$" line. The current reader may stop reading the SDF file at the first occurrence of "$$$$" and would then miss any tags listed after the 3D structure.
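The behaviour described here, a reader that stops at the first "$$$$", can be sketched as follows. This is not the code of any existing reader: the function name `first_record_tags` is hypothetical, and the sketch assumes the common SDF layout where a data item starts with a `> <TAG>` header line and ends at the next blank line.

```python
import re

# Header line of an SDF data item, e.g. "> <NMREDATA_ASSIGNMENT>"
TAG_HEADER = re.compile(r"^>.*<([^>]+)>")


def first_record_tags(sdf_text):
    """Collect the data items found before the first "$$$$" line."""
    tags = {}
    name = None
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            break  # anything after the first record is never seen
        match = TAG_HEADER.match(line)
        if match:
            name = match.group(1)
            tags[name] = []
        elif name is not None:
            if line.strip() == "":
                name = None  # a blank line ends the data item
            else:
                tags[name].append(line)
    return {key: "\n".join(value) for key, value in tags.items()}
```

With such a reader, a tag placed after the second molblock never reaches the result, which is why keeping the tags in the first record is the safer layout.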

2D to 3D conversion

When a 3D visualizer does not find a 3D structure, it could generate one and add it to the output, but it should first ask the user for permission, warn them of the consequences, and/or guide them through the process:

-Transforming 2D into 3D is not harmless. If two diastereotopic hydrogen atoms are drawn with regular bonds (plain straight lines) and assigned to two different signals in the spectrum, it may be for the good reason that it is not known which hydrogen gives which signal. Introducing a 3D structure will erase this "unknown" and introduce a risk of error. When this risk exists, one should use the "ambiguous" statement in the "NMREDATA_ASSIGNMENT" tag.

-Other problems of this type probably exist...

In principle, transforming 2D into 3D is important and useful, but it has to be done carefully to avoid introducing errors or removing information!