Difference between revisions of "Parser"

Revision as of 08:47, 29 January 2019

Possible structure of the object including NMReDATA reflecting the format of NMReDATA tags of SDF files.

Note: This is not designed to include ambiguities in the assignment. This is therefore only for use with Level=0.

Reading the SDF file

Libraries for reading SDF files exist in various languages. This paragraph is only relevant if you write your own reader/writer.

When reading, keep in mind that an SDF file will have to be written at some point later. We recommend writing the TAGS in the same order as they were read, but do not expect every writer to do so: be ready for the TAGS to appear in any order.

If it is not too large, the content of the input should be kept in memory so that it can be written later. Since SDF files may contain various TAGS (not only the NMREDATA tags), they should all be written to the output SDF file.

We recommend the following steps:

Open the SDF file
Read/store the molblock as a character string
Read/store the TAGS as character strings
Close the file
Analyse/check the molblock (see possible object structure of NMReDATA). Be
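The steps above can be sketched in Python as follows. This is only a minimal illustration, not a reference implementation: the function names `read_sdf_text` and `split_sdf_record` are hypothetical, and the sketch assumes a file holding a single record in which the molblock ends with an `M  END` line, each tag starts with a `> <NAME>` header and ends with an empty line, and the record is closed by `$$$$`.

```python
import re


def read_sdf_text(path):
    """Open the file, read its whole content as one string, and close it."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


def split_sdf_record(text):
    """Return (molblock, tags); tags is a list of (name, raw value) in file order."""
    lines = text.splitlines()
    # Assumption: the molblock ends at the first line starting with "M  END".
    end = next(i for i, line in enumerate(lines) if line.startswith("M  END"))
    molblock = "\n".join(lines[: end + 1])

    tags = []
    name, value = None, []
    for line in lines[end + 1:]:
        header = re.match(r">.*<([^>]+)>", line)
        if header and name is None:
            # Start of a new tag: remember its name, start collecting its value.
            name, value = header.group(1), []
        elif line.strip() == "$$$$":
            break  # end of the record
        elif name is not None:
            if line == "":
                # An empty line closes the current tag.
                tags.append((name, "\n".join(value)))
                name = None
            else:
                value.append(line)
    if name is not None:
        # The record ended without a closing empty line.
        tags.append((name, "\n".join(value)))
    return molblock, tags
```

Keeping the tags in a list of pairs rather than in a dictionary preserves their original order and any tags that are not NMREDATA tags, so that everything can be written back later.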

Determine how to read the NMREDATA tags

Scan the tags and list the indices of those whose name includes NMREDATA_.
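As a small illustration, assuming the tags were stored as a list of (name, raw value) pairs (a hypothetical layout, not one prescribed here), the scan could look like this:

```python
def nmredata_tag_indices(tags):
    """Return the positions of the tags whose name contains "NMREDATA_"."""
    return [i for i, (name, _value) in enumerate(tags) if "NMREDATA_" in name]
```

Working with indices rather than copies leaves the full tag list untouched, which helps when all tags have to be written out again.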

Read the NMReDATA tags. Keep in mind the end-of-line problem.

First read and analyse NMREDATA_VERSION to

determine which character should be ignored (ASCII 10, except for version 1)
determine the line separator ("\", except for version 1, in which case ASCII 10 is the line separator)
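The two rules above can be sketched as follows. The helper names are hypothetical, and the way version 1 is recognised (a version text equal to "1" or "1.0" once the trailing separator is removed) is an assumption made for this example only.

```python
def normalize_version(raw_version):
    """Strip line feeds, surrounding blanks and a trailing "\\" from the version text."""
    return raw_version.replace("\n", "").strip().rstrip("\\").strip()


def split_tag_lines(raw_value, raw_version):
    """Split the raw content of a tag into logical lines, depending on the version."""
    if normalize_version(raw_version) in ("1", "1.0"):  # assumed spelling of version 1
        # Version 1: ASCII 10 is the line separator.
        lines = raw_value.split("\n")
    else:
        # Other versions: ASCII 10 is ignored and "\" is the line separator.
        lines = raw_value.replace("\n", "").split("\\")
    # Drop the empty pieces left by a trailing separator.
    return [line for line in lines if line.strip()]
```

With this approach the rest of the parser receives the same list of logical lines whichever version wrote the file.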

Read the NMREDATA tags

Many simple tags (NMREDATA_SOLVENT, NMREDATA_VERSION, etc.) have no particular format.

But most "complex" tags (NMREDATA_ASSIGNMENT, NMREDATA_J, NMREDATA_1H, etc.) share a common general structure:

Two types of lines should be distinguished:

property lines
list items

A "property line" consists of a series of characters (letters) followed by the "=" sign, followed by the value of the variable.

Property lines should be identified as such. They should come before the list, but some may follow the list (not recommended, but possible).

Note that a property may appear more than once. Example:

Author=John
Author=Paul

In this case it should be stored as an array.

We recommend first storing the list as an array of arrays of characters and only analysing it later. See possible object structure of NMReDATA for more details.
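A minimal sketch of this first pass is given below, assuming the tag content has already been split into logical lines. The name `parse_complex_tag` is hypothetical, and the pattern for property names also accepts underscores, which is an assumption going slightly beyond "letters". Every property is stored as a list so that repeated properties such as Author are handled uniformly, and list items are kept as raw strings for later analysis.

```python
import re

# A property line: a name (letters, underscores assumed), "=", then the value.
PROPERTY_LINE = re.compile(r"^([A-Za-z_]+)=(.*)$")


def parse_complex_tag(lines):
    """Separate property lines from list items without interpreting the items."""
    properties = {}  # name -> list of values, in order of appearance
    items = []       # raw list items, analysed later
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = PROPERTY_LINE.match(line)
        if match:
            # Property lines are accepted before or after the list.
            properties.setdefault(match.group(1), []).append(match.group(2))
        else:
            items.append(line)
    return properties, items
```

For example, the lines `Author=John`, `Author=Paul` and a made-up item line such as `H1, 7.2, 3` would give the properties `{"Author": ["John", "Paul"]}` and a single raw item.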

@@ Line 3: / Line 3: @@
 '''Note:''' This is not designed to include ambiguities in the assignment. This is therefore only for use with Level=0.
-= Reading the SDF file =
+=== Reading the SDF file ===
 Libraries for divers languages exist to read SDF files. This paragraph is only relevant if you write your own reader/writer
-We recommend having in mind, when reading, that an SDF file will have to be written at some point later. We recommend to write them in the same oder, but I don't think that this suggerstion is followed by all writer, so be ready for the TAGS to appear in any order.
+We recommend having in mind, when reading, that an SDF file will have to be written at some point later. We recommend to write TAGS them in the same oder, but don't expect all reader to do so, so be ready for the TAGS to appear in any order.
 If not too large, the content of the input should be kept in memory so that it can be written later. Since SDF files may contain divers TAGS (not only the NMREDATA tags) they should all be written in the output SDF file.
@@ Line 18: / Line 18: @@
 * Analyses/check the molblock (see [[nmredata object structure|possible object structure of NMReDATA]]). Be
-= Determine how to read the NMREDATA tags =
+=== Determine how to read the NMREDATA tags ===
 Scan the tags and list the index of the ones including NMREDATA_ in their name
@@ Line 28: / Line 28: @@
 * determine the line separator ("\", except for version 1, in which cas the ASCII 10 is the line separator)
-= read the NMREDATA tags =
+=== read the NMREDATA tags ===
-Tags have two parts:
+Many simple tags have no particular format. (NMREDATA_SOLVENT, NMREDATA_VERSION, etc.)
-* in the first all line start with a variable name followed by the "=" sign followed by the content
-* in the first all line start with a variable name followed by the "=" sign followed by the content
+But most "complex" tags (NMREDATAT_ASSIGNMENT, NMREDATA_J, NMREDATA_1H, etc.) all have a common general structure:
+Two type of lines should be distinguished:
+* property lines
+* item in a list
+The "property lines" contain a serie of characters (letters) followed the "=" sign followed by the value of the variable.
+The "property lines" contain should be identified as such. Property lines should be before the list, but some may follow the lost (not recommended, but possible).
+Note that a property may appear more than once.
+Excample:
+ Author=John
+ Author=Paul
+In this case it should be stored as an array.
+We recommend the first store the list as an array of array of characters, only later analyse it.
+See [[nmredata object structure|possible object structure of NMReDATA]] for more details.
