(Reverse Engineering)
Here's a logical description of Euclid's algorithm for finding the greatest common divisor (gcd) of
two numbers:
- The gcd of u and 0 is u.
- The gcd of u and v, if v is not 0, is the same as the gcd of v and the remainder of dividing v into u.
and here's one PROLOG version of it:
gcd(U,0,U).
gcd(U,V,W) :- not(V=0), R is U mod V, gcd(V,R,W).
Of course, there is a lot of syntactic sugar in this, but the basic idea is the same. Alternatively, the last rule
can be thought of as:
gcd(U,V,W) :- notzero(V), modval(U,V,R), gcd(V,R,W).
if you find this more intuitively appealing (But take care to note that notzero and modval need to be appropriately defined by you).
The task of this problem is two fold :
- (i) Use an ILP system like PROGOL to mine the above two rules!! This means that you will
have to think of how the data is to be specified to PROGOL, positive examples of gcd, negative examples of gcd,
background
knowledge (this is the tricky part) and the
header information. This should not be too difficult because you know what exactly you are mining for! (This
exercise will demonstrate that you can use induction to "synthesize" logic programs)
This is similar to the Eleusis card game
of induction. (Someone else plays the game and you determine the rules! :-))
- (ii) Now note down the support and the confidence (%) for each one of the two rules (Recall that support and confidence
are based solely on the data that you supply to PROGOL). Now start decreasing the number of positive and negative
examples steadily (say, in units of one).
What parameters of PROGOL (look in Chapter 5 of the manual) do you now have to tune to obtain the same rules with the same confidence
levels? (This exercise will give you an understanding of how the various parameters affect the process of rule
induction). A simple graph will help you to see the trend.