Chem4Word Version3 Technical Manual

This technical manual describes how chemistry data from a Chemistry Add-in for Microsoft Word is stored and extracted. The chemistry is stored as hidden CustomXmlParts in Word that contain CML (Chemical Markup Language), with each part given a unique ID. These IDs are stored in the tags of corresponding visible Content Controls. Sample code is provided to extract the CustomXmlParts as CML using these IDs.

Uploaded by

Wilder Daza

Chemistry Add-in for Microsoft Word

Technical Manual

Version 3.0

Contents
Introduction
Storage Model
Appendix A – Sample code

Introduction
This document is intended to help users integrate documents produced by the Chemistry Add-in for
Microsoft Word into other systems such as SharePoint.

Storage Model
The machine-readable chemistry is stored in hidden Word objects called CustomXmlParts. As their
name suggests, they can be used to store XML (or in our case CML, which is a dialect of XML). When
a hidden chemistry object is first created, it is given a Globally Unique Identifier (Guid). This Guid is
then stored in the Tag of each visible Content Control that contains chemistry, which allows us to
find the chemistry data when an operation on a chemistry zone is carried out.

[Figure: hidden CustomXmlParts, each holding CML, are linked to visible Content Controls (CC) by the relation CML.Id == CC.Tag.]

Sample code to extract the CML is given in Appendix A.
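
The Content Control side of this link can be inspected in a similar way. The following is a minimal sketch rather than part of the add-in: it assumes the main document part (word/document.xml inside the .docx package) has already been read into a string, and the names ContentControlTags and FromDocumentXml are hypothetical. It returns the raw Tag value of each Content Control; how the Guid is encoded inside that value is not specified in this manual, so the strings should be compared with the CML identifiers with care.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the add-in.
public static class ContentControlTags
{
    // WordprocessingML main namespace, used by w:sdt, w:sdtPr and w:tag.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the w:val of every content control tag found in the given
    // document.xml markup, in document order. Content controls without a
    // tag are skipped.
    public static List<string> FromDocumentXml(string documentXml)
    {
        XDocument doc = XDocument.Parse(documentXml);
        return doc.Descendants(W + "sdt")
            .Select(sdt => sdt.Element(W + "sdtPr")?.Element(W + "tag"))
            .Where(tag => tag != null)
            .Select(tag => (string)tag.Attribute(W + "val"))
            .Where(val => !string.IsNullOrEmpty(val))
            .ToList();
    }
}
```

Working on the raw XML keeps the sketch free of package dependencies; the same values should also be reachable through the Open XML SDK used in Appendix A.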


One element of the extracted CML that could be used to index all chemistry is shown below.
<cml:name dictRef="chemspider:Inchikey">SMWDFEZZVXVKRB-UHFFFAOYSA-N</cml:name>

The InChIKey [1] is a unique “fingerprint” of the chemistry, which could be used to index the data.

[1] https://en.wikipedia.org/wiki/International_Chemical_Identifier#InChIKey
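
As an illustration of how such an index key might be pulled out of an extracted CML string, here is a minimal sketch. It assumes the InChIKey always appears in a name element whose dictRef attribute is "chemspider:Inchikey", as in the example element above; the names InchiKeyExtractor and FromCml are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the add-in.
public static class InchiKeyExtractor
{
    // Returns the text of every "name" element whose dictRef attribute is
    // "chemspider:Inchikey". Elements are matched on local name, so the
    // namespace prefix used in the CML does not matter.
    public static List<string> FromCml(string cml)
    {
        XDocument doc = XDocument.Parse(cml);
        return doc.Descendants()
            .Where(e => e.Name.LocalName == "name"
                     && (string)e.Attribute("dictRef") == "chemspider:Inchikey")
            .Select(e => e.Value.Trim())
            .ToList();
    }
}
```

Each string returned by the reader in Appendix A could be passed to FromCml, and the resulting keys stored alongside the document in whatever index is being built.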
Appendix A – Sample code
The following code could be incorporated into a SharePoint list item event receiver to extract and
index the chemistry whenever a document is created or updated in a SharePoint library.

File “Program.cs”, the main entry point:


using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

namespace ReadCustomXmlParts
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            // Look for the sample document in the current user's Documents folder.
            string documents = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
            string file = Path.Combine(documents, "Two Chemistry Zones.docx");

            // Collect the CML of every chemistry zone in the document.
            ChemistryReader reader = new ChemistryReader();
            List<string> zones = reader.GetChemistryZones(file);

            // Debug.WriteLine only produces output in debug builds.
            Debug.WriteLine($"Found {zones.Count} chemistry zones");
        }
    }
}

File “ChemistryReader.cs”, which collects the chemistry zones as CML:


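using DocumentFormat.OpenXml.Packaging;
using System.Collections.Generic;
using System.IO;
using System.Xml;

namespace ReadCustomXmlParts
{
    public class ChemistryReader
    {
        // Returns the XML of every CustomXmlPart in the document that contains CML.
        public List<string> GetChemistryZones(string filename)
        {
            List<string> zones = new List<string>();

            // Open the document read-only (isEditable = false).
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(filename, false))
            {
                var mainPart = wordDoc.MainDocumentPart;
                if (mainPart == null)
                {
                    return zones; // no main document part, so nothing to read
                }

                foreach (var cxml in mainPart.CustomXmlParts)
                {
                    using (XmlTextReader reader =
                        new XmlTextReader(cxml.GetStream(FileMode.Open, FileAccess.Read)))
                    {
                        // Skip the XML declaration and read the root element as a string.
                        reader.MoveToContent();
                        string str = reader.ReadOuterXml();

                        // Keep only parts that hold CML; this relies on the "cml" prefix.
                        if (str.Contains("cml:cml"))
                        {
                            zones.Add(str);
                        }
                    }
                }
            }
            return zones;
        }
    }
}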
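
The substring test for "cml:cml" in ChemistryReader depends on the CML using the prefix "cml". If that cannot be guaranteed, one possible alternative is to parse the part and inspect the root element's local name. The sketch below rests on the assumption that a chemistry part always has a root element named cml; the names CmlDetector and IsCml are hypothetical and do not come from the add-in.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of the add-in.
public static class CmlDetector
{
    // Returns true when the markup parses as XML and its root element has
    // the local name "cml", whatever namespace prefix is used.
    public static bool IsCml(string xml)
    {
        try
        {
            XElement root = XDocument.Parse(xml).Root;
            return root != null && root.Name.LocalName == "cml";
        }
        catch (XmlException)
        {
            return false; // not well-formed XML, so not CML
        }
    }
}
```

In GetChemistryZones, a call such as CmlDetector.IsCml(str) could then stand in for the str.Contains check, at the cost of parsing each part a second time.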
