Table 1: The 12 questions presented by Lichtman and Keilis-Borok. The answers in parentheses favor the incumbent party.
             Q1    Q2    Q3    Q4    Q5    Q6    Q7    Q8    Q9    Q10   Q11   Q12    Pairs Dominated
              0     1     1     1     1     1     1     1     0     0     1     1     2359
              0     1     1     1     1     0     1     1     1     0     1     1     2371
              0     1     1     1     0     1     1     1     0     1     1     1     2347
              0     1     1     1     0     1     1     1     1     0     1     1     2359
              1     1     1     1     0     1     1     1     0     0     1     1     2329
              1     1     1     1     0     0     1     1     1     0     1     1     2341
              1     0     1     1     0     1     1     1     1     0     1     1     2359
              0     0     1     1     0     1     1     1     1     1     1     1     2377 *
              1     0     1     1     0     0     1     1     1     1     1     1     2359
Appearances   4     6     9     9     2     6     9     9     6     3     9     9
Popularity    0.44  0.67  1     1     0.22  0.67  1     1     0.67  0.33  1     1
Weight        237   237   255   357   267   255   245   245   267   255   231   267
Figure 2: The feature sets for α = β = 2. The feature sets are laid out in rows, with a '1' indicating that the feature represented by that column is present and a '0' that it is not. 'Appearances' indicates the number of times a feature appears in total across all feature sets, 'Popularity' gives this as a ratio of the total number of feature sets, and 'Weight' indicates how many pairs the feature can dominate if it is in the feature set. The final column gives the number of pairs each feature set dominates, with the largest (2377, marked with an asterisk) highlighted.
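The quantities in Figure 2 can be recomputed directly from the data set given below. The sketch that follows is our own illustration (the function names and the list-of-rows representation of the data are ours), assuming the usual reading of the pair vertices: 'alpha' pairs join examples of different classes and 'beta' pairs join examples of the same class. It computes a feature's 'Weight' (the number of pair vertices adjacent to it), a feature set's 'Dominated' count including overlaps (the sum of its features' weights), and checks whether a candidate set is a valid (α, β) feature set.

    from itertools import combinations

    def build_pairs(y):
        # 'alpha' pair vertices join examples with different class labels;
        # 'beta' pair vertices join examples with the same class label.
        alpha_pairs = [(i, j) for i, j in combinations(range(len(y)), 2) if y[i] != y[j]]
        beta_pairs  = [(i, j) for i, j in combinations(range(len(y)), 2) if y[i] == y[j]]
        return alpha_pairs, beta_pairs

    def weight(X, f, alpha_pairs, beta_pairs):
        # 'Weight' of feature f: alpha pairs it distinguishes plus beta pairs
        # on which the two examples agree.
        return (sum(1 for i, j in alpha_pairs if X[i][f] != X[j][f]) +
                sum(1 for i, j in beta_pairs  if X[i][f] == X[j][f]))

    def dominated(X, S, alpha_pairs, beta_pairs):
        # 'Dominated' count of feature set S, including overlaps.
        return sum(weight(X, f, alpha_pairs, beta_pairs) for f in S)

    def is_feature_set(X, S, alpha_pairs, beta_pairs, a, b):
        # S is an (a, b) feature set if every alpha pair is distinguished by at
        # least a features of S, and every beta pair agrees on at least b of them.
        return (all(sum(X[i][f] != X[j][f] for f in S) >= a for i, j in alpha_pairs) and
                all(sum(X[i][f] == X[j][f] for f in S) >= b for i, j in beta_pairs))

On the 31 elections listed below, build_pairs yields the 234 'alpha' and 231 'beta' pair vertices mentioned in Section 4.1.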
4.1 Our Results

Several experiments were conducted with the data from (Lichtman & Keilis-Borok 1981). Initial inspection of the graph indicated that the maximum possible α and β were 2 and 2. The graph, including the beta vertices, consisted of 12 feature vertices corresponding to the 12 questions, and 465 pair vertices: 234 'alpha' vertices and 231 'beta' vertices. When β = 0 was considered, the graph contained only the 234 'alpha' vertices, as the 'beta' vertices can be immediately discarded.

When α = 2 and β = 2, the reduction rules added 5 features to the feature set (Q3, Q7, Q8, Q11 and Q12), discarded no features, and reduced the number of pair vertices from 465 to only 13. The minimal feature set size for α = β = 2 was nine features, with nine different feature sets possible9 (see Figure 2). Notably, if α = β = 2, then Q3, Q7, Q8, Q11 and Q12 must be in the feature set, as they are all attached to pair vertices of degree 2 (and thus those pair vertices require these features in order to be dominated the requisite number of times). Interestingly, Q4 was also required if we want to achieve the minimally sized feature set, but no reduction indicated this. Out of these nine feature sets, the one consisting of the six common features plus features Q6, Q9 and Q10 dominated the greatest number of pair vertices (including overlaps), 2377 (see Figure 2). The next nearest feature set in these terms dominated 2371 pairs, and consisted of the six common features plus features Q2, Q5 and Q9.

9 These feature sets were confirmed as the only 9 by complete enumeration.

The decision tree for the first feature set (that which dominates 2377 pair vertices) can be seen in Figure 1 (developed with the J48 and ID3 heuristics, both giving the same answer). It is noted, however, that at this point we do not have any algorithms or heuristics able to build decision trees or rule sets that take advantage of the redundancy available with α > 1 and β > 0, thus the tree created for this feature set does not use all the features available to it. The sets of rules created by using the PART and PRISM heuristics are presented in Tables 3 and 4.

The feature sets for α = β = 1 were also generated. In this case the reduction rules did not indicate that any features had to be in the kernel (unsurprisingly, as there were no degree 1 pair vertices), and did not indicate that any of the features were irrelevant. They did, however, reduce the number of pair vertices from 465 to 43: 30 'alpha' and 13 'beta'. From this kernel we determined that there are two minimal feature sets for α = β = 1 for this data, (Q4, Q5, Q7, Q9, Q12) and (Q2, Q3, Q4, Q7, Q8). Of the two, the first dominated the most pair vertices, 1403 compared to 1339. The decision trees for these two feature sets are shown in Figures 4 and 5. Note that the decision tree for the feature set that dominates more pair vertices is more compact, suggesting perhaps that this extra domination indicates greater discriminatory power. The classification rules generated by the PRISM heuristic are shown in Table 5, and the rules from the PART heuristic in Table 6.

Feature sets were also generated for α = 1, β = 0. Of these there were 23, all of size 5, which between them used all of the features. The results of this can be seen in Figure 3.
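The degree-based reduction referred to above can be sketched as follows; this is our illustration only (the paper's complete set of reduction rules is not reproduced in this excerpt), with each pair vertex represented simply as the set of features adjacent to it, and with the rule applied separately to the 'alpha' and 'beta' pair vertices with their respective requirements.

    def force_by_degree(pair_adjacency, requirement):
        # A pair vertex adjacent to exactly 'requirement' features can only be
        # dominated often enough if every one of those features is selected.
        forced = set()
        for features in pair_adjacency:            # one set of features per pair vertex
            if len(features) == requirement:
                forced |= features
        return forced

    def drop_satisfied(pair_adjacency, forced, requirement):
        # Pair vertices already dominated at least 'requirement' times by the
        # forced features no longer constrain the search and can be removed.
        return [features for features in pair_adjacency
                if len(features & forced) < requirement]

This is the mechanism by which Q3, Q7, Q8, Q11 and Q12 are forced into the kernel for α = β = 2 (they are the features attached to the degree-2 pair vertices), and why nothing is forced for α = β = 1, where no pair vertex has degree 1.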
Year   Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12  Target
1864    0  0  0  0  1  0  0  1  1   0   0   0    1
1868    1  1  0  0  0  0  1  1  1   0   1   0    1
1872    1  1  0  0  1  0  1  0  0   0   1   0    1
1880    1  0  0  1  0  0  1  1  0   0   0   0    1
1888    0  0  0  0  1  0  0  0  0   0   0   0    1
1900    0  1  0  0  1  0  1  0  0   0   0   1    1
1904    1  1  0  0  1  0  1  0  0   0   1   0    1
1908    1  1  0  0  0  1  0  1  0   0   0   0    1
1916    0  0  0  0  1  0  0  1  0   0   0   0    1
1924    0  1  1  0  1  0  1  1  0   1   0   0    1
1928    1  1  0  0  0  0  1  0  0   0   0   0    1
1936    0  1  0  0  1  1  1  1  0   0   1   0    1
1940    1  1  0  0  1  1  1  1  0   0   1   0    1
1944    1  1  0  0  1  0  1  1  0   0   1   0    1
1948    1  1  1  0  1  0  0  1  0   0   0   0    1
1956    0  1  0  0  1  0  1  0  0   0   1   0    1
1964    0  0  0  0  1  0  1  0  0   0   0   0    1
1972    0  0  0  0  1  0  0  1  1   0   0   0    1
1860    1  0  1  1  0  0  1  0  1   0   0   0    0
1876    1  1  0  1  0  1  0  0  0   1   0   0    0
1884    1  0  0  1  0  0  1  0  1   0   1   0    0
1892    0  0  1  0  1  0  0  1  1   0   0   1    0
1896    0  0  0  1  0  1  0  1  1   0   1   0    0
1912    1  1  1  1  1  0  1  0  0   0   0   0    0
1920    1  0  0  1  0  1  0  1  1   0   0   0    0
1932    1  1  0  0  1  1  0  0  1   0   0   1    0
1952    1  0  0  1  0  0  0  0  0   1   0   1    0
1960    1  1  0  0  0  1  0  0  0   0   0   1    0
1968    1  1  1  1  0  0  1  1  1   0   0   0    0
1976    1  1  0  1  1  0  0  0  0   1   0   0    0
1980    0  0  1  1  1  1  1  0  0   1   0   1    0

Rule                              Outcome
(Q4 = 1) & (Q8 = 0)               Challenger Victory
(Q4 = 1) & (Q9 = 1)               Challenger Victory
(Q12 = 1) & (Q7 = 0)              Challenger Victory
(Q4 = 0) & (Q12 = 0)              Incumbent Victory
(Q7 = 1) & (Q4 = 0)               Incumbent Victory
(Q8 = 1) & (Q9 = 0)               Incumbent Victory

Table 3: Classification rules generated by the PRISM heuristic for α = β = 2. Here, as well as in the decision tree, the potential robustness given by high α and β values is not exploited, thus not all features from the feature set are used.

Rule                              Outcome
(Q4 = 1) & (Q8 = 0)               Challenger Victory (7.0)
(Q9 = 0) & (Q12 = 0)              Incumbent Victory (14.0)
(Q3 = 0) & (Q6 = 0)               Incumbent Victory (4.0)
Otherwise                         Challenger Victory (6.0)

Table 4: Classification rules generated by the PART heuristic for α = β = 2. These rules are used in a cascade fashion beginning with rule 1. The accompanying numbers indicate how many examples each rule classifies out of those left unclassified by the previous rules.

Figure 3: All 23 feature sets for α = 1, β = 0. The feature sets are laid out in rows, with a '1' indicating that the feature represented by that column is present and a '0' that it is not. 'Appearances' indicates the number of times a feature appears in total throughout the 23 feature sets, 'Popularity' gives this as a ratio of the total number of feature sets, and 'Weight' indicates how many pairs the feature can dominate if it is in the feature set. The final column gives the number of pairs each feature set dominates, with the largest highlighted.
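The cascade semantics described in the caption of Table 4 amount to an ordered rule list in which the first matching rule classifies the example. A minimal sketch of ours (the dictionary representation of an example, with one 0/1 entry per question, and the names used are our own):

    # PART rules of Table 4 (alpha = beta = 2), evaluated as a cascade.
    PART_RULES = [
        (lambda e: e["Q4"] == 1 and e["Q8"] == 0,  "Challenger Victory"),  # 7 examples
        (lambda e: e["Q9"] == 0 and e["Q12"] == 0, "Incumbent Victory"),   # 14 examples
        (lambda e: e["Q3"] == 0 and e["Q6"] == 0,  "Incumbent Victory"),   # 4 examples
    ]

    def classify(example, rules=PART_RULES, default="Challenger Victory"):
        # The first rule whose condition holds classifies the example;
        # 'default' plays the role of the 'Otherwise' rule (6 examples).
        for condition, outcome in rules:
            if condition(example):
                return outcome
        return default

The PART rules for α = β = 1 in Table 6 follow the same convention.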
Several features are repeatedly used in most feature sets, indicating that they are probably the most important (and at least have high discriminatory power). The most obvious of these is Q4 (Was there a serious contest for the nomination of the incumbent party candidate?), which occurs in all but two of the feature sets, and in roughly half of the rules generated by the PART and PRISM heuristics. Similarly, it appears that Q12 (Is the challenging party candidate charismatic or a national hero?) is a highly important feature, and thus a factor in the choice of vote winners. Q7 (the short version being 'Was the economy strong under the incumbent administration?') is also one of the more important features, and appears regularly in the feature sets, decision trees and rules.

Comparing feature sets for fixed values of α and β (Figures 3 & 2), there seem to be no obviously significant trends other than those mentioned above. Some pairs of features seem to be interchangeable once the 'important' features (Q4 and Q12) are present, such as Q8 & Q9 for α = 1, β = 0, where if one is not present the other almost certainly is, though obviously this is not a rule. Other features seem to have little import at all, especially Q10 (Was the incumbent administration tainted by a major scandal?), which is almost never used13. Interestingly, for α = 1, β = 0, Q11 (Is the incumbent party candidate charismatic or a national hero?), the complementary feature of Q12, is almost never used. However, when we ask for α = β = 2, it is vital. It seems reasonable to suggest that α = β = 2 captures the more subtle interactions present in the data, which only appear when more complex feature sets are considered.

5.3 Our Prediction

Beginning with the decision trees, Figures 1, 4, 5, and 6, we choose the answer to Q4 to be 'no', which we think is a reasonable and obvious answer. If the answer to Q4 were to be 'yes', however, the challenger would be at a significant advantage, with 10 out of 11 possible outcomes favouring him in three out of the four trees. There would also be significantly more rules that would be potentially applicable. Based on this it seems that instability in the incumbent party is one of the most significant factors.

From there, we consider the answer to Q12 to be 'yes'14, although we are uncertain of this, being somewhat outside the system. It is interesting to consider that currently (at the time of writing), the Republican election effort seems to be directed towards reversing this opinion (i.e. making the answer to Q12 'no'). If they are successful at this, then three out of four of our decision trees (the fourth doesn't consider Q12) indicate an incumbent victory. Notably, if Q12 were 'no', the path in the tree leading to this decision is much shorter, and classifies 16 out of the 31 examples, suggesting that U.S. elections rely largely on a stable incumbent administration and the discrediting of the challenger. This suggests that the current Republican tactic is a wise and time-honoured one, and that perhaps they are aware of this.

We believe the answer to Q7 to also be 'yes'. From our basic research, U.S. economic growth seems to be strong, though it appears to have weakened at least in the short term; the growth rate is still above that specified in Q7. If the answer to Q7 is in fact 'no', then three of the decision trees indicate that the challenging party will win, making the Republicans' attack on John Kerry's persona even more relevant, as a change in Q12 would then change the outcome of the vote.

From these three answers we have the result 'Incumbent Victory' for three out of four decision trees. For the last we must also answer Q5 and Q9. Addressing Q9 first, we consider that there has been no major social unrest in the U.S., though again we are far from experts, and find the question to be ambiguous [...]

13 The implications of this we leave to the reader.
14 At the time of writing the current challenging candidate was John Kerry, who was decorated 5 times, including 3 Purple Hearts, in the Vietnam conflict, and also seems to be reasonably charismatic.
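The answers fixed so far (Q4 = 'no', Q12 = 'yes', Q7 = 'yes') can also be checked mechanically against the rule sets. Below is a small sketch of ours, assuming as in the tables that 1 encodes 'yes', applied to the PRISM rules of Table 3; the conclusion in the text itself is drawn from the decision trees, which are not reproduced in this excerpt.

    # PRISM rules of Table 3 as (condition, outcome) pairs over answered questions.
    RULES = [
        ({"Q4": 1, "Q8": 0},  "Challenger Victory"),
        ({"Q4": 1, "Q9": 1},  "Challenger Victory"),
        ({"Q12": 1, "Q7": 0}, "Challenger Victory"),
        ({"Q4": 0, "Q12": 0}, "Incumbent Victory"),
        ({"Q7": 1, "Q4": 0},  "Incumbent Victory"),
        ({"Q8": 1, "Q9": 0},  "Incumbent Victory"),
    ]

    answers = {"Q4": 0, "Q12": 1, "Q7": 1}   # the three answers chosen in the text

    # A rule fires only once every question it mentions has been answered accordingly;
    # with these answers the single firing rule is (Q7 = 1) & (Q4 = 0).
    for condition, outcome in RULES:
        if all(answers.get(q) == v for q, v in condition.items()):
            print(outcome)                   # prints 'Incumbent Victory'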
Rule                                      Outcome
(Q4 = 1) & (Q8 = 0)                       Challenger Victory
(Q4 = 1) & (Q7 = 0)                       Challenger Victory
(Q3 = 1) & (Q2 = 0)                       Challenger Victory
(Q4 = 1) & (Q2 = 1)                       Challenger Victory
(Q7 = 0) & (Q8 = 0) & (Q2 = 1)            Challenger Victory
(Q4 = 0) & (Q7 = 1)                       Incumbent Victory
(Q4 = 0) & (Q8 = 1) & (Q3 = 0)            Incumbent Victory
(Q4 = 0) & (Q2 = 0) & (Q3 = 0)            Incumbent Victory
(Q8 = 1) & (Q2 = 1) & (Q4 = 0)            Incumbent Victory
(Q8 = 1) & (Q7 = 1) & (Q2 = 0)            Incumbent Victory

Table 5: Classification rules generated by the PRISM heuristic for α = β = 1 from the feature set (Q2,Q3,Q4,Q7,Q8).

Rule                                      Outcome
(Q4 = 1) & (Q8 = 0)                       Challenger Victory (7.0)
(Q4 = 0) & (Q7 = 1)                       Incumbent Victory (11.0)
(Q4 = 1) & (Q7 = 0)                       Challenger Victory (2.0)
(Q2 = 0) & (Q3 = 0)                       Incumbent Victory (5.0)
(Q8 = 0)                                  Challenger Victory (2.0)
(Q2 = 1) & (Q4 = 0)                       Incumbent Victory (2.0)
Otherwise                                 Challenger Victory (2.0)

Table 6: Classification rules generated by the PART heuristic for α = β = 1 from the feature set (Q2,Q3,Q4,Q7,Q8), applied in cascade fashion as in Table 4.

Figure 5: Alternate decision tree for α = β = 1, based on a different optimal feature set. 'IV' indicates an incumbent victory, 'CV' a challenger victory. The numbers beneath each leaf indicate how many of the examples in the data set they classify.

Figure 6: Decision tree for the feature set (Q4,Q5,Q8,Q9,Q12), with α = 1 and β = 0. The numbers below each leaf indicate how many examples the leaf classifies. 'IV' indicates an incumbent victory, 'CV' a challenger victory.
[...] of several subsequent related articles and developments, have also predicted that this coming election will see an incumbent victory16. Dr. Fair at Yale has produced his own method of predicting the share of the votes the incumbent administration will receive (Fair 1978), and concurs with our and Dr. Lichtman & Dr. Keilis-Borok's prediction of a Republican victory. His model is based, however, on entirely economic factors, which seems to be a popular and traditional prediction method. Forbes magazine suggests, however, that the state of the economy is not as strong a factor as traditionally believed (Ackman & Hazlin 2004), with only 64% of elections being predictable by economic factors alone, and even then Forbes uses a significantly more complex economic model than is usually proposed, with seven interdependent variables.

16 http://www.counterpunch.org/lichtman07292004.html
   http://www.signonsandiego.com/uniontrib/20040509/news 1n9predict.html
   http://hnn.us/articles/6599.html

6 Conclusion

We presented in this paper a deterministic and optimality preserving method of reductions to allow the solutions to a series of problems related to the k-Feature Set problem to be found. We also presented a small test data set and the application of this system and other techniques for use in analysis, classification and prediction with regard to the system the data represents. We believe this method is both flexible and powerful, and has clear applications in all fields where data mining techniques are used.

Further research may include the development or modification of current methods for generating decision trees and classification rule sets to allow the exploitation of the extra power and information offered by the (α, β)k-Feature Set variant of the problem. The use of these ideas on unclassified data is also an interesting area of potential research, using the maximisation of the α and β values to indicate which classes examples should belong to. At least for the extra information provided by increased α values, we can envision a form of decision tree that allows multiple features at each decision point. For example, if we (manually) construct the decision tree for α = 2, we arrive at a tree with (Q4∨Q12) at the root, with the 'no' branch corresponding to Q4 and Q12 both being 'no', and classifying 16 of the 'Incumbent Victory' examples (essentially a contraction of the left-hand branch of the exhibited trees). From the 'yes' branch we can then insert a decision vertex labelled (¬Q3∨¬Q7∨Q9), which on its 'no' branch classifies the final two 'Incumbent Victories', leaving all the 'Challenger Victories' down the 'yes' branch of this second decision. This kind of tree is obviously more useful for solutions to (α, β)k-Feature Set with α > 1. It remains to generalise and automate this process. It also remains to incorporate the information given by increased β values.
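A sketch of ours of the manually constructed tree just described, again assuming 0/1 answers to the twelve questions with 1 meaning 'yes':

    def classify_alpha2(e):
        # Root decision (Q4 or Q12): its 'no' branch (both answers 0) collects
        # 16 of the 'Incumbent Victory' examples.
        if not (e["Q4"] or e["Q12"]):
            return "Incumbent Victory"
        # Second decision (not Q3) or (not Q7) or Q9: its 'no' branch classifies
        # the remaining two 'Incumbent Victories'; all 'Challenger Victories'
        # fall down its 'yes' branch.
        if not ((not e["Q3"]) or (not e["Q7"]) or e["Q9"]):
            return "Incumbent Victory"
        return "Challenger Victory"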
With regard to the actual data used, we predict that it is most probable that George W. Bush will serve a second term as President of the United States of America, if there are no dramatic changes in the candidates or the knowledge of the public.

6.1 Acknowledgements

P. Moscato would like to thank Dr. Keilis-Borok for a discussion they had in December 1991 in Trieste while both were at the International Centre for Theoretical Physics.

References

D. Ackman & M. Hazlin, It's not the economy, stupid, Forbes, http://moneycentral.msn.com/content/invest/forbes/P92882.asp?GT1=4529, (2004)
C. Cotta & P. Moscato, The k-Feature Set Problem is W[2]-Complete, Journal of Computer and System Sciences, 67(4), (2003), pages 686-690
C. Cotta, C. Sloper & P. Moscato, Evolutionary Search of Thresholds for Robust Feature Set Selection: Application to the Analysis of Microarray Data, Proceedings of EvoBio2004 - 2nd European Workshop on Evolutionary Computation and Bioinformatics, Coimbra, Portugal, April, (2004)

R. Downey & M. Fellows, Parameterized Complexity, Springer, (1997)

S. Davies & S. Russell, NP-Completeness of Searches for Smallest Possible Feature Sets, Proceedings of the AAAI Symposium on Relevance, (1994), pages 41-43

R. Fair, The Effect of Economic Events on Votes for President, The Review of Economics and Statistics, May, (1978), pages 159-173, http://fairmodel.econ.yale.edu/vote2004/index2.htm