Chem4Word Version3 Technical Manual
Microsoft Word
Technical Manual
Version 3.0
Contents
Introduction
Storage Model
Appendix A – Sample code
Introduction
This document is intended to help users integrate documents produced by the Chemistry Add-in for
Microsoft Word into other systems such as SharePoint.
Storage Model
The machine-readable chemistry is stored in hidden Word objects called CustomXmlParts. As their
name suggests, they can be used to store XML (or, in our case, CML, which is a dialect of XML). When
a hidden chemistry object is first created, it is given a Globally Unique Identifier (Guid). This Guid is
then stored in the Tag of each visible Content Control that contains chemistry, which allows us to
find the chemistry data when an operation is carried out on a chemistry zone.
[Figure: each hidden CML part is linked to one or more visible Content Controls (CC) by the rule CML.Id == CC.Tag.]
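The Tag lookup described above can be illustrated with a minimal sketch; it is not the add-in's own code. It assumes the XML of the main document part has already been read into a string, and that the ids of the CML parts have been collected into a dictionary by some other means (how the id is exposed inside the package is not covered here). TagLookup, GetContentControlTags and FindChemistry are hypothetical names, and the ids in the sample are placeholders rather than real Guids.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace TagLookupExample
{
    public static class TagLookup
    {
        // WordprocessingML main namespace, used by w:sdt, w:sdtPr and w:tag.
        private static readonly XNamespace W =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

        // Returns the Tag (w:tag/@w:val) of every Content Control (w:sdt)
        // found in the supplied main document XML. Untagged controls are skipped.
        public static List<string> GetContentControlTags(string documentXml)
        {
            XDocument doc = XDocument.Parse(documentXml);
            return doc.Descendants(W + "sdt")
                .Select(sdt => sdt.Element(W + "sdtPr")?
                                  .Element(W + "tag")?
                                  .Attribute(W + "val")?.Value)
                .Where(tag => !string.IsNullOrEmpty(tag))
                .ToList();
        }

        // Applies the rule CML.Id == CC.Tag: returns the CML whose id equals
        // the tag, or null when no chemistry is registered under that id.
        // ASSUMPTION: cmlById has been filled elsewhere (id -> CML text).
        public static string FindChemistry(string tag, IDictionary<string, string> cmlById)
        {
            return cmlById.TryGetValue(tag, out string cml) ? cml : null;
        }
    }

    internal static class Demo
    {
        private static void Main()
        {
            // Hand-written stand-in for a main document part: two Content
            // Controls share the tag "id-1", a third carries "id-2".
            const string documentXml =
                "<w:document xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'><w:body>" +
                "<w:sdt><w:sdtPr><w:tag w:val='id-1'/></w:sdtPr><w:sdtContent/></w:sdt>" +
                "<w:sdt><w:sdtPr><w:tag w:val='id-1'/></w:sdtPr><w:sdtContent/></w:sdt>" +
                "<w:sdt><w:sdtPr><w:tag w:val='id-2'/></w:sdtPr><w:sdtContent/></w:sdt>" +
                "</w:body></w:document>";

            // Placeholder chemistry; real entries would hold the stored CML.
            var cmlById = new Dictionary<string, string>
            {
                { "id-1", "<cml:cml>first</cml:cml>" },
                { "id-2", "<cml:cml>second</cml:cml>" }
            };

            foreach (string tag in TagLookup.GetContentControlTags(documentXml))
            {
                string cml = TagLookup.FindChemistry(tag, cmlById);
                string status = cml == null ? "no chemistry found" : "chemistry found";
                Console.WriteLine(tag + ": " + status);
            }
        }
    }
}
```

Because several Content Controls may carry the same Tag, the lookup is keyed on the id rather than on the control, so one stored CML part can serve every zone that displays it.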
The InChIKey [1] is a hashed, effectively unique “fingerprint” of the chemistry, which could be used to index the data.
[1] https://en.wikipedia.org/wiki/International_Chemical_Identifier#InChIKey
Appendix A – Sample code
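The following code could be incorporated into a SharePoint list item event receiver (for example, a
class derived from SPItemEventReceiver) to extract and index the chemistry when a document is
created or updated in a SharePoint library. It uses the Open XML SDK (DocumentFormat.OpenXml).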
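using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Xml;
using DocumentFormat.OpenXml.Packaging; // Open XML SDK

namespace ReadCustomXmlParts
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            // The sample document is expected in the current user's Documents folder.
            string documents = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
            string file = Path.Combine(documents, "Two Chemistry Zones.docx");

            ChemistryReader reader = new ChemistryReader();
            List<string> zones = reader.GetChemistryZones(file);

            // Debug.WriteLine only produces output in debug builds.
            Debug.WriteLine($"Found {zones.Count} chemistry zones");
        }
    }

    public class ChemistryReader
    {
        // Returns the XML of every CustomXmlPart in the document that contains CML.
        public List<string> GetChemistryZones(string filename)
        {
            List<string> zones = new List<string>();

            // Open the document read-only (isEditable = false).
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(filename, false))
            {
                var mainPart = wordDoc.MainDocumentPart;
                foreach (var cxml in mainPart.CustomXmlParts)
                {
                    using (Stream stream = cxml.GetStream(FileMode.Open, FileAccess.Read))
                    using (XmlReader reader = XmlReader.Create(stream))
                    {
                        // Skip the XML declaration and read the whole root element.
                        reader.MoveToContent();
                        string str = reader.ReadOuterXml();

                        // Keep only parts that hold CML; this simple check relies on
                        // the root element using the "cml" namespace prefix.
                        if (str.Contains("cml:cml"))
                        {
                            zones.Add(str);
                        }
                    }
                }
            }

            return zones;
        }
    }
}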