Ab Initio EME Technical Repository
2. EME Project
A Project is a collection of graphs and their associated elements (dml, xfr, etc.)
stored in the EME Datastore.
2.1 Project structure
Primarily, a Project is a group of graphs and related objects stored under a single
directory tree. /Projects is the default root under which all Projects are
maintained inside the Datastore. Each Project directory has a basic structure as
given below.
Every instance of an Ab Initio environment has a special public Project associated
with it, known as the Environment Project or stdenv. It is no different from a
regular Project in terms of structure. It contains machine- and application-specific
settings, such as the data directory mount points and max-core settings, and
application-wide parameters, such as the current date, which are used across all
Projects. When any Project is created, stdenv is included in it by default. A single
stdenv is required for an entire set of applications that run on a single machine and
share a single EME Datastore.
3. Sandbox
Projects held in the EME Datastore cannot be manipulated directly. To work on a
Project, it must be checked out to a working area on the file system where we
can develop and modify code. This working area on the file system is known as a
Sandbox. It has exactly the same directory structure as a Project in the
Datastore.
Each object that needs to be worked on is checked out to a sandbox where
modifications or enhancements are carried out. After the changes are complete the
code is checked in from the sandbox area to the EME Datastore. This action creates
a new version of the code in the EME Datastore.
3.1
Sandboxes are work areas used to develop, test or run code associated with a given
project. Only one version of the code can be held within the sandbox at any time.
The EME Datastore contains all versions of the code that have been checked into it.
A particular sandbox is associated with only one Project, whereas a Project can be
checked out to any number of sandboxes, which is a common scenario.
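The relationship between a sandbox and the Datastore can be pictured with a small Python model. This is only an illustrative sketch, not Ab Initio code: the class names `Datastore` and `Sandbox`, their methods, and the path `mp/load.mp` are all hypothetical. It shows the two properties described above: the Datastore keeps every checked-in version, while a sandbox holds a single copy of each object.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Datastore:
    """Toy stand-in for the EME Datastore: keeps every checked-in version."""
    versions: dict = field(default_factory=dict)  # path -> list of contents

    def check_in(self, path: str, content: str) -> int:
        """Store content as a new version and return its 1-based number."""
        history = self.versions.setdefault(path, [])
        history.append(content)
        return len(history)

    def check_out(self, path: str, version: Optional[int] = None) -> str:
        """Return the requested version, or the latest one by default."""
        history = self.versions[path]
        return history[-1] if version is None else history[version - 1]


@dataclass
class Sandbox:
    """Toy stand-in for a sandbox: holds exactly one copy of each object."""
    files: dict = field(default_factory=dict)  # path -> current content

    def check_out(self, store: Datastore, path: str) -> None:
        self.files[path] = store.check_out(path)

    def check_in(self, store: Datastore, path: str) -> int:
        return store.check_in(path, self.files[path])


store = Datastore()
store.check_in("mp/load.mp", "v1 of graph")      # hypothetical object path

sandbox = Sandbox()
sandbox.check_out(store, "mp/load.mp")
sandbox.files["mp/load.mp"] = "v2 of graph"      # edit in the working area
new_version = sandbox.check_in(store, "mp/load.mp")
```

After the check-in, the toy Datastore holds both versions of the object, while the sandbox still holds only the copy that was last edited.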
4. Parameters
A parameter is a name-value pair with some additional attributes that determine
when and how to interpret or resolve its value. Parameters give logical names to
physical locations and should always be used instead of hard-coded paths in
graphs. This makes the graphs more generic. There are two types of parameters:
graph parameters and Project parameters.
4.1 Graph parameters
Graph parameters, as the name suggests, are specific to individual graphs and
are private to them. They affect execution of the graph for which they have been
defined. Graph parameters can be defined by navigating to Edit>Parameters in the
GDE, which opens the graph parameters editor.
4.2 Project parameters
Project parameters are inherited by all the graphs in the Project and are accessed
from the GDE by the sandbox parameter editor in Project>Edit Sandbox>Parameters.
This opens a dialog box that prompts for the sandbox path. Choose the correct
host and sandbox path and press OK to open the sandbox parameter editor,
which works exactly like the graph parameter editor.
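The difference in visibility between the two kinds of parameters can be sketched in Python with `collections.ChainMap`. This is a conceptual model only, not how the GDE stores parameters, and every parameter name and value here (`DATA_DIR`, `RUN_DATE`, `INPUT_FILE`) is hypothetical.

```python
from collections import ChainMap

# Project parameters: one shared set, visible to every graph (hypothetical names).
project_params = {"DATA_DIR": "/data/sales", "RUN_DATE": "20240101"}

# Graph parameters: private to each graph (hypothetical names).
graph_a_params = {"INPUT_FILE": "customers.dat"}
graph_b_params = {"INPUT_FILE": "orders.dat"}

# Each graph sees its own parameters first, then the inherited Project ones.
graph_a_view = ChainMap(graph_a_params, project_params)
graph_b_view = ChainMap(graph_b_params, project_params)

shared = (graph_a_view["DATA_DIR"], graph_b_view["DATA_DIR"])
private = (graph_a_view["INPUT_FILE"], graph_b_view["INPUT_FILE"])
```

Both views resolve `DATA_DIR` to the same Project value, while `INPUT_FILE` differs per graph and does not appear among the Project parameters at all.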
4.3 Editing parameters
To add a new Project parameter or to modify the value of an existing one, we must
first lock the parameters in the sandbox parameter editor by clicking the lock button
on the menu. If nobody else has locked them in their own sandbox, the lock symbol
turns green, indicating a successful lock, and we can now add or modify
parameters. If the parameters are already locked, the parameter editor shows a
warning on opening saying so, and the lock symbol is red. Once we hold the lock,
other users cannot edit the parameters.
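The locking behaviour described above amounts to a simple exclusive lock. The following Python sketch models it under that assumption; the class `ParameterLock`, its methods, and the user names are hypothetical, and the strings "green" and "red" merely mirror the colours of the lock symbol.

```python
from typing import Optional


class ParameterLock:
    """Toy model of the exclusive lock on a Project's parameters."""

    def __init__(self) -> None:
        self.owner: Optional[str] = None

    def acquire(self, user: str) -> str:
        """Return 'green' if the user now holds the lock, 'red' otherwise."""
        if self.owner is None or self.owner == user:
            self.owner = user
            return "green"
        return "red"

    def can_edit(self, user: str) -> bool:
        """Only the lock holder may add or modify parameters."""
        return self.owner == user


lock = ParameterLock()
first = lock.acquire("user1")    # nobody holds the lock yet
second = lock.acquire("user2")   # user1 already holds it
```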
4.4 Parameter Attributes
Scope: The scope of a parameter can be formal or local. A local parameter is internal
to the sandbox, and most parameters have local scope. Its value is taken
from the value column in the parameter editor. A formal parameter is one whose
value can be set from outside, i.e. from the environment where the graph is run. Its
value is supplied from the command line. Formal parameters can be identified by a
green diamond with an arrow mark.
Kind: If scope is local, kind is left unspecified, but if it is formal, the kind is
automatically set to keyword.
Type: This determines the nature of the parameter. Project parameters have four
types: string, common Project, switch and dependent. Graph parameters have a
different set of types.
Value: This column specifies the value of the parameter.
Interpretation: This determines how the parameter is going to be evaluated.
Constant: Value is taken literally.
$ Substitution: Variables with $ prefixes are replaced with their values.
${} Substitution: Only variables enclosed in {} and prefixed with $ are replaced by
their values; other occurrences of $ are ignored.
Shell: Korn shell syntax is used to evaluate the value of the parameter.
PDL: Parameter Definition Language allows the parameter interpretation to be
defined using inline DML.
Required: This attribute can take two values, required (the default) or optional. If it
is required, the value column cannot be left blank; if it is optional, it can be left
blank.
Export: When this check box is checked, the corresponding parameter value is
exported as an environment variable; otherwise it is generated as a local shell
variable.
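The first three interpretation settings can be illustrated with a short Python sketch. It is a simplified emulation, not the Ab Initio resolver: the function `interpret`, the mode names, and the sample parameter names are hypothetical, shell and PDL interpretation are not emulated, and treating `${NAME}` as replaceable under $ substitution is an assumption of this sketch.

```python
import re

# Assumption: under "$ substitution" both $NAME and ${NAME} are replaced.
DOLLAR = re.compile(r"\$(?:\{(?P<braced>\w+)\}|(?P<bare>\w+))")
# Under "${} substitution" only the braced form is replaced.
DOLLAR_BRACE = re.compile(r"\$\{(?P<braced>\w+)\}")


def interpret(value: str, mode: str, env: dict) -> str:
    """Resolve a parameter value under one of three interpretation modes."""
    if mode == "constant":
        return value                      # taken literally
    if mode == "dollar":
        pattern = DOLLAR
    elif mode == "dollar_brace":
        pattern = DOLLAR_BRACE
    else:
        raise ValueError(f"mode not emulated in this sketch: {mode}")

    def replace(match: re.Match) -> str:
        groups = match.groupdict()
        name = groups.get("braced") or groups.get("bare")
        return env[name]                  # KeyError if the name is undefined

    return pattern.sub(replace, value)


env = {"DATA_DIR": "/data", "NAME": "x"}   # hypothetical parameters
raw = "${DATA_DIR}/in/$NAME.dat"

as_constant = interpret(raw, "constant", env)
as_dollar = interpret(raw, "dollar", env)
as_dollar_brace = interpret(raw, "dollar_brace", env)
```

With these inputs the constant mode leaves the string untouched, the $ mode yields `/data/in/x.dat`, and the ${} mode yields `/data/in/$NAME.dat`, because the bare `$NAME` is ignored.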
A conflict occurs when the sandbox version of an object and the latest datastore
version differ. In such a case the check-out wizard asks
how to resolve the conflict between the sandbox and the datastore.
Example of a conflict situation:
User 1 checks out a file to a sandbox, locks it and updates it. Meanwhile, user 2
also checks out the same file to his sandbox. When user 2 tries to edit the file in the
GDE, he sees that user 1 has already locked it. He bypasses the GDE and
updates the file outside it by some other means. Now user 1 checks in the file. User
2 also proceeds to check in his changes to the file, but the check-in fails due
to the conflict. All user 2 needs to do is check out the current version of the file from
the EME to his sandbox; during this process he is asked whether to keep his
sandbox version or overwrite it with the current version available in the EME
datastore.
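The conflict scenario can be replayed with a small Python model. It is a sketch under a stated assumption: a check-in is rejected when the version the sandbox copy was based on is no longer the latest one in the datastore. The class `Datastore`, the exception `ConflictError`, and the file contents are hypothetical.

```python
class ConflictError(Exception):
    """Raised when a check-in is based on an outdated datastore version."""


class Datastore:
    """Toy model of one file's version history in the datastore."""

    def __init__(self, initial: str) -> None:
        self.history = [initial]           # history[0] is version 1

    def latest(self) -> int:
        return len(self.history)

    def check_out(self) -> tuple:
        """Return (version number, content) of the current version."""
        return self.latest(), self.history[-1]

    def check_in(self, content: str, base_version: int) -> int:
        """Add a new version unless the sandbox copy is out of date."""
        if base_version != self.latest():
            raise ConflictError("sandbox copy is based on an older version")
        self.history.append(content)
        return self.latest()


store = Datastore("original")
base_1, copy_1 = store.check_out()         # user 1 checks out
base_2, copy_2 = store.check_out()         # user 2 checks out the same file

store.check_in(copy_1 + " + user 1 edit", base_1)   # user 1 checks in first

edited_2 = copy_2 + " + user 2 edit"       # edited outside the GDE
try:
    store.check_in(edited_2, base_2)
    conflict_seen = False
except ConflictError:
    conflict_seen = True

# Resolution: check out the current version again, choose to keep the
# sandbox copy, then check in on top of the current version.
base_2, _datastore_copy = store.check_out()
final_version = store.check_in(edited_2, base_2)
```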
5.4 Check in of object
Once the project files have been edited and updated, they need to be checked in to
create a new version in the EME datastore, which then becomes available to other
users. The check-in wizard is invoked by navigating to Project>Check in. Before the
wizard starts, the GDE checks for any unsaved files in the sandbox and prompts
whether to save them.
5.5 Dependency Analysis
Dependency analysis examines the Project for dependencies within and between
graphs. The EME examines the Project and develops a survey tracing how data is
transformed and transferred, field by field, from component to component.
Dependency analysis has two basic steps: translation and analysis.
In the translation step, the sandbox paths of the objects and the paths to the objects
in the data area are translated to the corresponding Project-relative paths inside the
EME. The actual analysis is performed in the analysis step.
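The idea behind the translation step can be sketched in Python for sandbox paths. This is only an illustration of turning an absolute sandbox path into a Project-relative one, not the EME's actual algorithm; the function names, the sandbox location, and the Project name `sales` are hypothetical, and data-area paths are not covered.

```python
from pathlib import PurePosixPath


def to_project_relative(sandbox_root: str, object_path: str) -> str:
    """Strip the sandbox root, leaving a path relative to the Project root.

    Raises ValueError if the object does not live under the sandbox root.
    """
    return str(PurePosixPath(object_path).relative_to(sandbox_root))


def to_datastore_path(project: str, relative_path: str) -> str:
    """Place a Project-relative path under the default /Projects root."""
    return str(PurePosixPath("/Projects") / project / relative_path)


relative = to_project_relative(
    "/home/dev/sandboxes/sales",               # hypothetical sandbox root
    "/home/dev/sandboxes/sales/mp/load.mp",    # hypothetical graph file
)
in_datastore = to_datastore_path("sales", relative)
```

Because the sandbox mirrors the Project's directory structure, the same relative path identifies the object on both sides.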
6. Working with previous versions of graphs/objects
It is often necessary to check out and update a previous version of a graph rather
than work with the latest or current version available in the EME datastore. Using
the check-out wizard in the GDE, we can check out a tagged version of a graph that
is not the latest version available, but the GDE does not allow such versions to be
locked. In such a case, the procedure below may be followed:
Check out the required previous tagged version of the graph to your sandbox.
Check it back in with Force Overwrite selected in the advanced options of the
check-in wizard. This makes it the current version in the datastore.
Lock the graph now to make the changes.
Check the graph back in to the EME datastore. This updated version becomes
the latest version in the EME datastore.
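The procedure above can be traced with a toy Python model of one graph's history. It is a sketch built on two assumptions that the text does not spell out: that a Force Overwrite check-in adds the old content as a new current version, and that a lock is granted only when the sandbox copy matches the current version. The class `GraphHistory`, the tag `REL_1`, and the user name are hypothetical.

```python
from typing import Optional


class GraphHistory:
    """Toy model of one graph's versions, tags and lock in the datastore."""

    def __init__(self) -> None:
        self.versions: list = []            # versions[0] is version 1
        self.tags: dict = {}                # tag name -> version number
        self.lock_owner: Optional[str] = None

    def check_in(self, content: str, tag: Optional[str] = None) -> int:
        self.versions.append(content)
        number = len(self.versions)
        if tag is not None:
            self.tags[tag] = number
        return number

    def check_out(self, tag: Optional[str] = None) -> tuple:
        """Return (version number, content) for a tag, or for the latest."""
        number = len(self.versions) if tag is None else self.tags[tag]
        return number, self.versions[number - 1]

    def lock(self, user: str, sandbox_version: int) -> bool:
        """Grant the lock only for the current version (assumption)."""
        if sandbox_version != len(self.versions):
            return False
        self.lock_owner = user
        return True


history = GraphHistory()
history.check_in("graph as released", tag="REL_1")
history.check_in("later work")

old_number, old_content = history.check_out(tag="REL_1")  # check out old version
locked_before = history.lock("dev", old_number)           # refused: not current
current = history.check_in(old_content)                   # force overwrite
locked_after = history.lock("dev", current)               # now succeeds
latest = history.check_in(old_content + " + fix")         # final check-in
```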