Predictive Targeting Suite V2 Manual
USER MANUAL
R. Jason Hearst
June 2008
Upgrade history:
Table of Contents
1.0 The Neural Network Concept
2.0 Installation
2.1 Installation Notes
3.0 Usage
3.1 Loading the Menu
3.2 Interface
4.0 Inputs
4.1 Training Grids
4.2 Target Grids
4.3 Simulation Grids
4.4 Weights
4.5 Neural Network Algorithms
5.0 Data Smoothing
5.1 Target Smoothing
6.0 The Fast Classification Neural Network
6.3 FCNN Tuning
7.0 The Levenberg-Marquardt Neural Network
7.3 LMNN Tuning
8.0 Output
8.1 The Output Database
8.2 Gridding the Output
8.3 Clipping the Grid
9.0 Transfer Functions
10.0 References
1.0 The Neural Network Concept
The basic premise behind a neural network is that it is software that can learn from data
and apply what it has learned to new data.
In the case of this application, the neural network learns by being shown several
grids modelling different traits of a geological area; these grids are called the
training grids. It is then shown a grid composed entirely of 1s and 0s that
identifies specific regions where desired anomalies or known deposits exist; this
grid is called the target grid. This constitutes the training of the neural network.
The neural network is then provided grids of a different area modeling the same
traits in the same order as the first training grids; these new grids are the
simulation grids. The neural network now applies what it has learned from the
first set of grids to the second set of grids.
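As a conceptual sketch of this train-then-simulate workflow, the code below uses scikit-learn's MLPRegressor purely as a stand-in for PTS2's own FCNN/LMNN code, with tiny synthetic arrays in place of Geosoft grids; it illustrates the concept only, not the software's implementation.

    # Conceptual sketch only: PTS2 uses its own FCNN/LMNN code, not scikit-learn;
    # the grids below are tiny synthetic arrays standing in for Geosoft grids.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    training_grids = [rng.random((20, 20)) for _ in range(3)]    # e.g. K, Th, EM of the training area
    target_grid = (training_grids[0] > 0.8).astype(float)        # 1s and 0s marking "deposits"
    simulation_grids = [rng.random((25, 25)) for _ in range(3)]  # same data types, same order, new area

    def stack_grids(grids):
        # one row per grid cell, one column per input grid
        return np.column_stack([g.ravel() for g in grids])

    model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=500, random_state=0)
    model.fit(stack_grids(training_grids), target_grid.ravel())      # "training"
    scores = model.predict(stack_grids(simulation_grids))            # "simulation"
    result_grid = scores.reshape(simulation_grids[0].shape)          # result for the new area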
2.0 Installation
To use PTNN, you must install the GX nnprep.gx, or you must have some other
means of producing target grids (see 4.2 Target Grid). Once installed,
nnprep.gx is accessible from the menu.
PTNN was developed for Oasis Montaj 7.0 or later with the Microsoft .NET
Framework 2.0; although backward compatibility may be possible, it has not been
tested and is not supported.
3.0 Usage
3.1 Loading the Menu
3.2 Interface
To access the PTS2 interface, click on the PGW Predictive Targeting menu and
select Neural Network Simulation. The interface will appear as shown in Figure
1.
The interface consists of six regions, each of which is discussed briefly below.
The Neural Network Inputs region is where the training and simulation grids are
entered. Combo boxes are populated with any grids in the working directory, and
browse buttons are provided for grids outside the working directory. At least one
of each type of grid must be entered, and the number of training grids must
match the number of simulation grids. For more information on the input grids
see sections 4.1 Training Grids, 4.2 Target Grid, and 4.3 Simulation Grids.
Input weights are also featured in this region of the interface. They allow a
bias to be placed on any of the input grids. For more information on the input
weights see section 4.4 Weights.
The target grid, which defines the areas for NN training, must be selected here.
The Neural Network Output region is where the type of neural network algorithm
is chosen and where the output database name is entered. Three different
neural network (NN) algorithm options are provided. For more information on the
algorithms see section 4.5 Neural Network Algorithms.
Some basic smoothing is available to the user; smoothing here means giving a
cell the average value of all cells that surround it in a 1x1, 3x3 or 5x5 square,
depending on the user selection. For more information on data smoothing see
section 5.0 Data Smoothing.
The Fast Classification Neural Network (FCNN) can be tuned by adjusting the
output transfer function. For more information on this see section 6.3 FCNN
Tuning.
The Levenberg-Marquardt Neural Network (LMNN) has its own set of tuning
controls. For more information on these controls, see section 7.3 LMNN Tuning.
Note that PTS2 only takes into account the necessary inputs, so if the selected
NN algorithm is LMNN, then the FCNN Tuning Control settings will be
disregarded; the opposite is also true. If the Combined FC & LM algorithm is
selected, then both sets of tuning controls will be taken into account.
3.2.7 Simulating
Once all of the neural network settings are set as desired, clicking the “OK”
button triggers the simulation. Clicking “Cancel” will close the window, and any
information that has not finished computing will be lost.
Both the FCNN and LMNN Tuning Controls have default settings. The interface
automatically resets the tuning controls to their defaults every time it is
restarted. In contrast, entries in the NN Inputs, NN Outputs, and Data Smoothing
regions are saved and are automatically filled with the same values that were used
the last time a simulation was run. If you are unsure whether a value is set to its
default, you can enter the word “default” in the control box and the default setting
or value will be used by the neural network.
4.0 Inputs
4.1 Training Grids
The training grids are the base grids containing the trend that you want to highlight
and then find in other data (the simulation grids). The number of training grids must
be between a minimum of 1 and a maximum of 10. The training grids can be
models of different types of data (e.g. K, Th, EM, etc.), but must all have the
same grid properties (listed below); in other words, all grids must cover the same
geological area. All grids must be in Geosoft grid format (.GRD).
Please note that you do not need to worry about modifying the data ahead of
time as the neural network allows for some basic data smoothing (see 5.0 Data
Smoothing), and adjusts the data so that it is on comparable scales.
It is also possible to emphasize the importance of one training grid over another
by using the weighting tool (see 4.4 Weights).
4.2 Target Grid
The target grid is the grid that highlights the area of interest. The area of
interest is the area that shows the trend in the training grids that you wish to find
in the simulation grids. Only one target grid per simulation may be provided to
the neural network. The target grid must be composed entirely of 1s and 0s.
NNPREP.GX can be used to generate such a grid if you do not have another
means.
The target grid must have the same properties (listed below) as the training grids,
and must be in Geosoft grid format (.GRD).
Properties that must be common for the target grid and the training grids are:
Number of elements in X direction
Number of elements in Y direction
Separation in X direction
Separation in Y direction
Initial X position
Initial Y position
Orientation (KX)
4.3 Simulation Grids
The simulation grids are the grids that represent the area in which you wish to
find the trends exhibited in your training set and targeted by the target grid. A
maximum of 10 simulation grids is allowed, and their number must match the
number of training grids. The simulation grids must be the same types of data
(e.g. K, Th, EM, etc.) and in the same order as the training grids. In addition, all
simulation grids must have the same properties (listed below); in other words, all
simulation grids must cover the same geological area (this area can differ from
the area of the training grids). All simulation grids must be in Geosoft grid
format (.GRD).
It is important to note that the simulation grids must be entered in the same order
as the training grids for meaningful results. While the order in which the training
grids are entered does not matter, the simulation grid order must match the
training grid order because the neural network compares Training Grid #1 to
Simulation Grid #1, Training Grid #2 to Simulation Grid #2, and so on. Thus, if
Training Grid #1 and Simulation Grid #1 do not represent the same type of
geophysical data, their comparison will be meaningless.
Note that these properties do not need to match those of the training and target
grid data.
4.4 Weights
The weighting tool is provided in case you wish to give more emphasis to one data
type than another. For instance, if your training set consists of only a K grid and a
Th grid, but you want to put more emphasis on the K trends, you can set the
weight of K to 2 and leave Th set to 1.
You can assign any weight to as many grids as you like. For instance, you
may enter 10 grids and set all of their weights to 3, but the results would
be the same as setting them all to 1, while the processing time would be
significantly higher.
4.5 Neural Network Algorithms
There are three different algorithms for performing the neural network training
and simulation. The first uses an instantaneously trained neural network
called a Fast Classification Neural Network, or FCNN (see 6.0 The Fast
Classification Neural Network). The second option uses Levenberg-Marquardt
optimization to train a classic feed-forward neural network; we call it a
Levenberg-Marquardt Neural Network, or LMNN (see 7.0 The Levenberg-Marquardt
Neural Network). The third option is provided in case the user wants an output
that is somewhere between the two aforementioned options. The Combined FC & LM
option simply simulates the neural network with both algorithms and then averages
the results.
By default the selected algorithm is FC, but after the first time PTS2 is used in a
project, the interface will store the last algorithm used in the input box.
5.0 Data Smoothing
The neural network allows for some basic smoothing of the grid data for three
different parameters (the training grids, the target grid, and the results), with the
options described below.
The effects of smoothing can usually be seen both numerically (in the database)
and graphically (on the grid). Applying different amounts of smoothing to different
parameters will change the results by varying degrees depending on the input
data. In the PTS2 context, smoothing refers to preconditioning the data by
averaging adjacent cells together to make up the value of a single cell; a simple
sketch of this averaging is shown after the list of options below.
Note that smoothing the target grid (1s and 0s) has a slightly different effect;
see 5.1 Target Smoothing.
1x1: No averaging is applied at all; the data is fed into the neural
network as the user has provided it.
3x3: The value in each cell that the neural network receives is actually
the average of a 9 cell, or 3x3, grid surrounding each cell. This is
adjusted at the edges and corners (e.g. each corner is
really only an average of a 2x2 grid).
5x5: The value in each cell that the neural network receives is actually
the average of a 25 cell, or 5x5, grid surrounding each cell. This is
adjusted at the edges and corners (e.g. each corner is
really only an average of a 3x3 grid).
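A rough sketch of this kind of moving-average smoothing is given below, assuming the grid is available as a 2-D NumPy array; PTS2's internal handling of dummy values may differ.

    import numpy as np

    def smooth(grid, window):
        # Average each cell with its neighbours in a window x window square.
        # window = 1 returns the grid unchanged; edges and corners use only the
        # cells that actually exist (so a corner of a 3x3 window averages 4 cells).
        if window == 1:
            return grid.copy()
        half = window // 2
        rows, cols = grid.shape
        out = np.empty_like(grid, dtype=float)
        for i in range(rows):
            for j in range(cols):
                block = grid[max(i - half, 0):i + half + 1, max(j - half, 0):j + half + 1]
                out[i, j] = block.mean()
        return out

    smoothed = smooth(np.arange(25, dtype=float).reshape(5, 5), 3)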
5.1 Target Smoothing
Selecting 3x3 smoothing applies the 3x3 smoothing discussed in 5.0 Data
Smoothing. This effectively makes the target vector a gradient representing the
target areas. The target values near the edges of the target are averaged with 0
values, which makes them lower than the target values found near the centre of
the target. Thus the neural network will look for everything in the target area, but
assign higher value to matches near the centre of the target area than to those
near the edges.
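Reusing the smooth() function sketched in 5.0 Data Smoothing, the gradient effect on a binary target can be illustrated with a small synthetic target; the values here are made up purely for demonstration.

    import numpy as np

    target = np.zeros((7, 7))
    target[2:5, 2:5] = 1.0        # a 3x3 block of 1s marking the target area
    graded = smooth(target, 3)    # smooth() as sketched in 5.0 Data Smoothing
    # centre cells of the target keep a value of 1.0, while cells near the
    # target's edge fall below 1.0 because they are averaged with 0s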
6.0 The Fast Classification Neural Network
The results of the FCNN tend to be more granular than those of the LMNN. This
is because the FCNN does no manipulation of the target data at all, whereas the
LMNN adjusts weights. An FCNN tends to give more points a partial membership
rating, meaning that it will find more points that partially match the target than an
LMNN, whose results are more clear-cut but may miss anomalies.
The user is allowed no control over the training of the FCNN because it is an
intricate process with very few variables, and those it does have, have
little effect on the results of the simulation when applied to geophysical data.
Figure 2 Fast Classification Neural Network Diagram. Image sourced from [1].
The FCNN training creates one node for each target cell and gives it weights
taken directly from the training data. The radius of generalisation, a measure of
how mathematically close the nodes are to each other, is then computed, and the
output weights are set equal to the target values. Figure 2 shows the model of an
FCNN.
This process is referred to as instantaneous because all the weight values are
simply set without computation. The only computed values are the radii of
generalisation, which can be computed relatively quickly by a computer.
The FCNN is simulated by sending the simulation data through the system of
weights and functions generated by the training. The purpose of the output
transfer function (the only user-tuneable quantity in this NN) is to put the output
on the range [0,1] so that it can be read as a decimal percentage. The next section
discusses this in greater detail.
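The following is a highly simplified conceptual sketch of this kind of instantaneous training and simulation with made-up two-feature samples; it is not PTS2's actual FCNN implementation, and the half-distance radius and the averaging of activated nodes are only illustrative assumptions.

    import numpy as np

    def train_instantaneous(samples, targets):
        # one node per training cell: its weights are copied straight from the data
        # and its output weight is simply the corresponding target value
        nodes = np.asarray(samples, dtype=float)
        dists = np.linalg.norm(nodes[:, None, :] - nodes[None, :, :], axis=2)
        np.fill_diagonal(dists, np.inf)
        radii = dists.min(axis=1) / 2.0          # assumed "radius of generalisation"
        return nodes, radii, np.asarray(targets, dtype=float)

    def simulate(x, nodes, radii, out_weights):
        d = np.linalg.norm(nodes - np.asarray(x, dtype=float), axis=1)
        hit = d <= radii                         # nodes whose region contains x
        if not hit.any():
            return out_weights[d.argmin()]       # fall back to the closest node
        return out_weights[hit].mean()

    nodes, radii, weights = train_instantaneous([[0.1, 0.2], [0.8, 0.9]], [0.0, 1.0])
    print(simulate([0.75, 0.85], nodes, radii, weights))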
6.3 FCNN Tuning
Only one tuning control is provided to the user for the Fast Classification Neural
Network (FCNN): the output transfer function.
The purpose of the output transfer function is to smooth and filter the data onto
the range [0,1] to represent it as a decimal percentage. Changing the transfer
function does not change the trends shown in the output, but does change the
range of the simulation results.
For greater discussion on the transfer functions, their definitions and effects on
the data see section 9.0 Transfer Functions.
The default transfer function is “Pure Linear.” The word “default” may also be
typed into this control and the neural network will use the Pure Linear function.
7.0 The Levenberg-Marquardt Neural Network
The LMNN is an older neural network design and provides more rounded results
than the Fast Classification NN (FCNN). The original version of the Predictive
Targeting Suite used this NN algorithm.
The LMNN is again modelled on a nodal structure, but this network consists of
only two layers of nodes and no other intricacies. The first layer is the input
nodes, which filter and manipulate the data, and the second layer is the output
nodes, which perform further filtering and manipulation of the data.
Each node consists of input weights (one for each input to the node) and an
output bias (one for each node). The goal of the weights is to adjust the value so
that it is near the target value, and the goal of the bias is to correct any remaining
error.
The weights are created by first filling all weights with random values and then
adjusting the random values according to the Levenberg-Marquardt optimization
algorithm, equation (1), until the training ending criteria are met. Some of these
criteria are fixed within the neural network and some are user accessible. The
maximum number of epochs and the error goal are both user modifiable and are
discussed in 7.3.3 Maximum Number of Training Epochs and 7.3.4 Acceptable
Error Range respectively.
x_{k+1} = x_k - [J^T J + μI]^{-1} J^T e    (1)
where J is the Jacobian of the errors with respect to the weights, e is the error
vector, μ is the damping factor, and I is the identity matrix.
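A minimal sketch of one Levenberg-Marquardt step, implementing equation (1) directly, is shown below; the function names and the way the Jacobian and errors are supplied are assumptions for illustration only.

    import numpy as np

    def lm_step(x, jacobian, error, mu):
        # x: current weight vector; jacobian: d(error)/d(weights);
        # error: residual vector; mu: damping factor
        J = np.asarray(jacobian, dtype=float)
        e = np.asarray(error, dtype=float)
        H = J.T @ J + mu * np.eye(J.shape[1])     # damped approximate Hessian
        return x - np.linalg.solve(H, J.T @ e)    # equation (1)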
An important thing to note about this method of training is that the first step in the
process is assigning the weights random values. This means that every time the
simulation is run it will produce a slightly different result, even on exactly the
same data. Variations have been observed to be on the order of 0.05. Note that
although the numerical values vary slightly, the trends remain exactly the same.
Once training has set all the weights and biases, the simulation is run by simply
passing the simulation data through the system of weights and biases. The general
equation that the data sees at each node is modelled by (2).
o_i = f(x_i * w_i + b_i)    (2)
In (2), o_i represents the nodal output, x_i represents the nodal input, w_i
represents the nodal weight for that particular input, b_i represents the nodal
bias, and f is the transfer function.
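As a minimal illustration of equation (2), the sketch below passes a single value through an input-layer node and then an output-layer node; the weights, biases, and the choice of tanh as the transfer function are made-up demonstration values, not values PTS2 would compute.

    import numpy as np

    def node_output(x, w, b, f=np.tanh):
        # equation (2): transfer function applied to the weighted, biased input
        return f(x * w + b)

    # one value passing through an input-layer node, then an output-layer node
    hidden = node_output(0.4, w=1.3, b=0.1)
    output = node_output(hidden, w=0.7, b=-0.2)
    print(hidden, output)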
7.3 LMNN Tuning
Four tuning options are made available to the user for the LMNN. These options
are:
Input Transfer Function
Output Transfer Function
Maximum Number of Training Epochs
Acceptable Error Range
7.3.1 Input Transfer Function
The purpose of the input transfer function is to smooth and filter the input data
onto the range [0,1]; this way, data from different types of measurements can be
compared. Changing the transfer function does not change the trends shown in
the output, but does change the data range and values.
The available input transfer functions are:
Pure Linear
Sigmoid
Hyperbolic Tangent
Elliot’s Function
For greater discussion on the transfer functions, their definitions and effects on
the data, see 9.0 Transfer Functions.
The default transfer function is “Pure Linear.” The word “default” may also be
typed into this control and the neural network will use the Pure Linear function.
7.3.2 Output Transfer Function
The purpose of the output transfer function is to smooth and filter the output data
onto the range [0,1]. Changing the transfer function does not change the trends
shown in the output, but does change the data range and values.
The default transfer function is “Hyperbolic Tangent.” The word “default” may
also be typed into this control and the neural network will use the Hyperbolic
Tangent function.
7.3.3 Maximum Number of Training Epochs
The user is allowed to adjust the maximum number of training epochs. A training
epoch is one training cycle. If the training is not converging, or not converging at
a reasonable rate, the best training result obtained by the time the maximum
number of epochs is reached is taken as the final training result. The reason for
limiting the number of epochs is that sometimes the training does not converge, or
convergence takes too many iterations for the processing to be efficient.
The default maximum number of epochs is 500; typing “default” in this selection
will send 500 to the NN.
7.3.4 Acceptable Error Range
The acceptable error range is the error goal of the training. The LMNN is trained
by adjusting the weighting parameters that multiply the training data in an attempt
to make it match the target. The error range is the range around the target value
that is accepted as “matching” the target value.
Increasing this value will decrease processing time and accuracy; decreasing it will
do the opposite.
The default error range is 0.001; typing “default” in this selection will send 0.001
to the NN.
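The sketch below shows how the two user-set stopping criteria (maximum epochs and error goal) interact in a generic iterative trainer; the step callable is a placeholder for one training epoch, not the actual LMNN update.

    def train(step, max_epochs=500, error_goal=0.001):
        # Run epochs until the error goal is met or max_epochs is reached,
        # keeping the best result seen so far.
        best_error, best_weights = float("inf"), None
        for _ in range(max_epochs):
            weights, error = step()          # one training epoch (placeholder)
            if error < best_error:
                best_error, best_weights = error, weights
            if error <= error_goal:          # acceptable error range reached
                break
        return best_weights, best_error      # best result even if the goal was never met

    # toy usage: the error falls below the goal on the third epoch
    errors = iter([0.3, 0.05, 0.0008])
    weights, err = train(lambda: ("weights", next(errors)))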
The overall results of the two different neural networks are generally very similar.
The highest matching areas in simulations of the same data are the same; the
predominant difference is seen in the mid-valued locations. We find that the FCNN
gives a lot more semi-membership weighting to points; this means that the FCNN
finds many more points that match the training area well, but few that match it
almost exactly. In contrast, the LMNN has a very straightforward approach and the
resulting values almost always form a near-perfect Gaussian distribution.
Figure 3 shows the histograms of the tutorial data simulated with default settings
using both FCNN and LMNN. Figure 4 shows the same histograms auto-scaled
so we can observe their specifics better.
Figure 3 Histograms of the tutorial data gridded at default settings. From left to right:
FCNN, LMNN
Figure 4 Histograms of the tutorial data gridded at default settings and auto-scaled. From
left to right: FCNN, LMNN
From observing the curves we see that the FCNN tends to give more points a
higher ranking of matching the training data than the LMNN, and that these
rankings are spread over a fairly small range, whereas the LMNN gives a very
smooth, very broad Gaussian distribution.
8.0 Output
8.1 The Output Database
The output of the neural network is a Geosoft database file (.GDB). This file is
created in the working directory of the current Oasis Montaj workspace, but will
not appear in the current workspace until it is added.
If you wish to overwrite an existing database, make sure it is closed before you
start the simulation. Note that entering the name of an existing database will
overwrite that database, so all information held in it prior to running the
simulation will be lost.
Once added to the workspace, the output database can be opened and should
appear in the form of Figure 5.
Figure 5 Geosoft database file representing the FCNN simulation of the tutorial data with
training grid smoothing set to 3x3, target grid smoothing set to 1x1, result
smoothing set to 3x3, and FCNN output transfer function set to “Pure Linear.”
In the output database, the X channel represents the x-coordinate, the Y channel
represents the y-coordinate, and the NNsimCH channel represents the
simulation value for the corresponding X and Y channel values. Note that all
three of the X, Y, and NNsimCH values depend on the neural network inputs, so
they will vary from simulation to simulation.
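The sketch below illustrates how the three channels relate to the simulated grid by flattening a result array into X, Y, and NNsimCH records; the origin and cell-separation values are made up, and producing an actual .GDB file requires the Geosoft API, which is not shown here.

    import numpy as np

    result = np.random.default_rng(1).random((4, 5))    # stand-in for a simulated grid
    x0, y0, dx, dy = 500000.0, 6000000.0, 50.0, 50.0    # made-up origin and cell separation

    # one record per grid cell: X, Y, NNsimCH
    records = [(x0 + j * dx, y0 + i * dy, result[i, j])
               for i in range(result.shape[0])
               for j in range(result.shape[1])]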
If no output database filename is specified, the neural network will assign a
default filename to the output. Note that if this output filename already has a
simulation database stored in it, the neural network will overwrite this previous
simulation result.
The default filenames of simulations vary with the selected neural network
algorithm. The default filenames are as follows:
Fast Classification (FC) “FCNN_out.GDB”
Levenberg-Marquardt (LM) “LMNN_out.GDB”
Combined FC & LM “DUONN_out.GDB”
8.2 Gridding the Output
The simulation results stored in the database can be gridded to provide a visual
representation of the simulation.
To grid the database results, make sure that you are currently in the database,
then go to the Grid and Image menu, select the Gridding sub-menu, and then the
Minimum Curvature option. The minimum curvature gridding prompt will appear.
Select the NNsimCH channel, and make sure that for “Grid cell size” you enter
the separation size of your simulation grids (this should be the same for all of
them). Name the grid and press “OK”.
Figure 6 shows the gridded output of the simulation of the tutorial data for all
three neural network algorithm choices.
Figure 6 Gridded simulation outputs from all three neural network algorithms. From left to
right: Fast Classification (FC), Levenberg-Marquardt (LM), Combined FC & LM.
The default settings are used to generate all three databases and grids.
You can see that the main high points on all three grids are the same, but that
different neural networks offer different interpretations of the data, just as curves
can be fitted with different norms (e.g. L1-norm, L2-norm).
Determining which neural network provides the most accurate results comes
from research and testing. This package provides the tools that allow this to
be done.
8.3 Clipping the Grid
Grid clipping can be used to further emphasise the main regions of interest, i.e.
the regions where the neural network simulation has yielded the highest values.
This can be done by gridding only the top quartile, decile, etc. of the data. The
following description is for finding the top decile (top 10%) of the data; a similar
process can be used to find the top quartile or any other segment.
To find the top decile, view the histogram of the gridded database. To do this,
right click on the grid name in the side bar, and select “Properties.” Then click
the “Stats” button, and subsequently the “Histogram” button.
Figure 7 shows the histograms of the three grids in Figure 6 adjusted to their top
decile.
Figure 7 Histograms highlighting the top decile of the data represented by the grids in
Figure 6. From top left to right, and then bottom, the histograms represent:
FCNN, LMNN, DUONN.
You can adjust to the top decile by moving the cursor (the red line) to different
points on the histogram until the value in the “Cursor” box next to “%” shows ~90.
Then record the value next to the “X” in the same box. Thus the values that
constitute our minimum Z values for FCNN, LMNN, and DUONN respectively are:
0.801, 0.6, and 0.7026.
Now, in the Grid and Image menu, and the Utilities sub-menu, select Window a
Grid. A prompt will appear. Make sure that the correct grid is selected and enter
an output name. The only other parameter that must be entered is the minimum
value of the top decile, which was found from the histogram. This is entered in
the “Z MIN” selection. Press the “OK” button and the clipped grids will be
created.
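If you prefer to compute the threshold numerically rather than reading it off the histogram, a short sketch is given below; the random array stands in for a real gridded result, and this is only a rough equivalent of the Window a Grid step, not the Oasis Montaj tool itself.

    import numpy as np

    grid = np.random.default_rng(2).random((100, 100))  # stand-in for a gridded simulation result
    z_min = np.nanpercentile(grid, 90)                  # 90% of cells fall below this value
    clipped = np.where(grid >= z_min, grid, np.nan)     # rough equivalent of "Window a Grid" with Z MIN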
Figure 8 shows the top decile clipped grids for the grids presented in Figure 6.
Figure 8 Grids from Figure 6 clipped to the top decile of their simulation data.
9.0 Transfer Functions
The transfer functions are the single greatest tuning tool offered to the user for
modifying the output of the neural network. They allow the user to adjust the
shape and range of the simulation output values. It is very important to note that
even though the transfer functions can squeeze or flatten the simulation results,
they do not change the trends of the results; thus, regardless of which transfer
function is used, the gridded images will almost always look exactly the same. To
change the appearance of a gridded image, changing the data smoothing options
will have a greater effect.
There are four transfer functions utilised by PTS2. They all normalise the data
onto the range [0,1]. Some of the functions themselves normalise the data onto
the range [-1, 1] and are then adjusted to [0, 1]. The definitions of the transfer
functions are listed below.
Pure Linear:
f(x) = x    (3)
Sigmoid:
f(x) = 1 / (1 + e^(-x))    (4)
Hyperbolic Tangent:
f(x) = tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))    (5)
Elliot’s Function:
f(x) = x / (1 + |x|)    (6)
Note that the Pure Linear function is always passed through a normalisation
algorithm that brings the values onto [0, 1] after they go through the transfer
function.
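For reference, the four transfer functions of equations (3) to (6) can be written out as below; the final min-max rescaling onto [0, 1] is only a plausible sketch of the normalisation step and may differ in detail from what PTS2 applies internally.

    import numpy as np

    def pure_linear(x):
        return x

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def hyperbolic_tangent(x):
        return np.tanh(x)

    def elliot(x):
        return x / (1.0 + np.abs(x))

    def to_unit_range(v):
        # bring the transferred values onto the range [0, 1]
        v = np.asarray(v, dtype=float)
        return (v - v.min()) / (v.max() - v.min())

    x = np.linspace(-3.0, 3.0, 7)
    print(to_unit_range(pure_linear(x)))   # Pure Linear values after normalisation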
Changing the transfer function will change the results of the simulation. Note that,
as previously mentioned, these changes affect only the numerical values; the
overall trends observed remain generally or exactly the same regardless of the
transfer function. For some applications it is better to have the data on a small,
tight range, and for others we want the simulation results to span the widest
possible range. The following examples are intended to show what effects the
different transfer functions have on the data.
Figures 9 through 12 show the histograms of the default simulation settings for the
FCNN but with varying transfer functions (TFs). All histograms are plotted on the
range [0.1, 0.9].
Figure 9 FCNN default simulation with TF: Pure Linear (default TF)
Figure 10 FCNN default simulation with TF: Sigmoid
Figure 11 FCNN default simulation with TF: Hyperbolic Tangent
Figure 12 FCNN default simulation with TF: Elliot’s Function
Observing the above four figures we can see how the different transfer functions
shift and squeeze the simulation results. We will take Figure 9 as our reference
because it uses the default settings and the default transfer function, Pure Linear.
If we observe Figure 10, we see that the data has been significantly compressed
to a much smaller range, and has been shifted to lower values. This is the
effect of the Sigmoid function.
Comparing Figure 11 to Figure 9, we notice that the overall shape is the same but
the plateau regions in the areas of higher concentration are more elongated.
Observing the maximum and minimum values on the histogram, the data has also
been shifted up a very slight amount, with both the minimum and the maximum
value increasing relative to the default settings in Figure 9. These are the effects
of the Hyperbolic Tangent function: it shifts the data range upward and creates
higher concentrations of already highly concentrated points.
Observing the last histogram in Figure 12 we see that the range has actually
been compressed and shifted upward by a slight amount. This is the effect of
Elliot’s function.
In summary, the effects of the transfer functions on the FCNN output are:
Sigmoid: Compresses the data to a much smaller range and shifts it to lower
values.
Hyperbolic Tan.: Keeps the data at the same general frequency but shifts it
upward slightly.
Elliot’s function: Compresses the data and shifts it to slightly higher values.
There are many possible transfer function combinations for the LMNN, but we will
consider only three; this sample will show the effects of all transfer functions (TFs)
on the LMNN. It is important to note that there are four TFs available as the input
TF, but only three as the output TF. Pure Linear is not allowed as an output TF
because internal data manipulation would allow the results of the LMNN to exceed
the bounds of [0, 1].
Figures 13 to 15 show the results of the same LMNN simulation when different
transfer functions are applied.
Figure 13 LMNN default simulation with Input TF: Pure Linear, and Output TF: Hyperbolic
Tangent (default TFs)
Figure 14 LMNN default simulation with Input TF: Pure Linear, and Output TF: Sigmoid
Figure 15 LMNN default simulation with Input TF: Pure Linear, and Output TF: Elliot’s
Function
It must be noted that the default settings are an input transfer function of Pure
Linear and an output transfer function of Hyperbolic Tangent. Out of all the
possibilities, this combination provides the widest simulation data range, and is
set as the default for this reason. This is shown in Figure 13.
Observing Figure 14, we see that the Sigmoid function has again compressed the
data into a much smaller segment, and values are repeated significantly more
often. All of the data has also been compressed into the upper half of the range of
the default setting.
Figure 15 shows the effect of Elliot’s function, which again compresses the data
into a smaller area, but not as small as the effect of the Sigmoid. In contrast to
what happens in the FCNN, this time Elliot’s function pushes the values into the
lower half of the default range instead of the upper half.
In summary, the effects of the output transfer functions on the LMNN are:
Sigmoid: Compresses the data into a much smaller segment and shifts it into the
upper half of the default range.
Elliot’s Function: Compresses the data into a smaller area and shifts it to
lower values.
10.0 References
[1] K.W. Tang and S. Kak, "Fast classification networks for signal processing,"
Circuits, Systems Signal Processing 21 (2002) 207-224.
[2] K.W. Tang, "Instantaneous Learning Neural Networks." Ph.D. Dissertation,
Louisiana State University, 1999.
[3] S. Kak, "A Class of Instantaneously Trained Neural Networks," Baton
Rouge, LA; May 7, 2002.
[5] F. Au, T-C. Chen, D-J. Han, and L. Tham, “Acceleration of Levenberg-
Marquardt Training of Neural Networks with Variable Decay Rate,” IEEE,
0-7803-7898-9/03, 2003, 1873-1878.
[7] “Predictive Targeting with Neural Networks for Oasis montaj v6 – Tutorial
and User Guide,” Software Tutorial; Paterson, Grant and Watson Limited,
2004.