Generalization of Clauses under Implication

In the area of inductive learning, generalization is a main operation, and the usual definition of induction is based on logical implication. Recently there has been a rising interest in clausal representation of knowledge in machine learning. Almost all inductive learning systems that perform generalization of clauses use the relation theta-subsumption instead of implication. The main reason is that there is a well-known and simple technique to compute least general generalizations under theta-subsumption, but not under implication. However generalization under theta-subsumption is inappropriate for learning recursive clauses, which is a crucial problem since recursion is the basic program structure of logic programs. We note that implication between clauses is undecidable, and we therefore introduce a stronger form of implication, called T-implication, which is decidable between clauses. We show that for every finite set of clauses there exists a least general generalization under T-implication. We describe a technique to reduce generalizations under implication of a clause to generalizations under theta-subsumption of what we call an expansion of the original clause. Moreover we show that for every non-tautological clause there exists a T-complete expansion, which means that every generalization under T-implication of the clause is reduced to a generalization under theta-subsumption of the expansion.


Introduction
The topic of this paper is generalization of clauses, which is a central problem in the area of Inductive Logic Programming (ILP) (Muggleton, 1991, 1993). ILP can be seen as the intersection of inductive machine learning and computational logic. In inductive machine learning the goal is to develop techniques for inducing hypotheses from examples (observations). By using the rich representation formalism of computational logic (clauses) for hypotheses and examples, ILP can overcome the limitations of classical machine learning representations, such as decision trees (Quinlan, 1986).
By using a clausal representation we have the ability to learn all types of hypotheses describable in first-order logic, in particular the important class of recursive hypotheses. Another advantage of using a clausal representation is that clausal theories are easy to manipulate for machine learning algorithms. This is because changes to a clausal theory by adding or deleting clauses or literals have clear and simple effects on the generality of the theory. The reader is referred to two introductions to ILP, one presented by Muggleton and De Raedt (1994), and one by Lavrač and Džeroski (1994). Lavrač and De Raedt (1995) present a recent survey of ILP research.
We use the following definition of induction. A theory (background knowledge) $T$, a set of positive examples $\{E^+_1, \ldots, E^+_n\}$ and a set of negative examples $\{E^-_1, \ldots, E^-_m\}$ of a target concept are given. Then a hypothesis $H$ for the target concept is an inductive conclusion if and only if:
© 1995 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.
In other words, the positive examples should not be a logical consequence of the theory alone, but a logical consequence of the theory together with the hypothesis, and no negative example should be a logical consequence of the theory and the hypothesis. Using clausal representation, $T$, $H$, $\{E^+_1, \ldots, E^+_n\}$ and $\{E^-_1, \ldots, E^-_m\}$ are sets of clauses.
In this paper we concentrate on the subproblem in inductive learning of finding a clause that is a generalization of a set of positive examples. In other words, finding a clause $C$ such that $C \models E^+_1 \wedge \ldots \wedge E^+_n$. We are particularly interested in least general generalizations, since every generalization of a set of clauses is also a generalization of the least general generalization of this set of clauses. Therefore a least general generalization in some sense represents all generalizations.
A least general generalization is also consistent with the negative examples whenever there exists a consistent generalization.
The most natural and straightforward basis for generalization is implication, since induction is defined in terms of logical consequence. Plotkin has described (1970, 1971a) a technique for the computation of least general generalizations of clauses under a relation called θ-subsumption. This relation has been accorded much interest, and it is often used instead of implication, since it is easier to compute. However, there is a difference between θ-subsumption and implication, which sometimes causes the generalizations obtained by Plotkin's technique to be over-generalizations with respect to implication.
Consider the following clauses, in which $s$ denotes the successor function: $C_1 = (\mathit{number}(s(0)) \leftarrow \mathit{number}(0))$, $C_2 = (\mathit{number}(s^3(0)) \leftarrow \mathit{number}(s(0)))$, $D_1 = (\mathit{number}(s(x)) \leftarrow \mathit{number}(y))$, and $D_2 = (\mathit{number}(s(x)) \leftarrow \mathit{number}(x))$. The clause $D_1$ is a least general generalization under θ-subsumption (LGGθ) of $C_1$ and $C_2$, and the clause $D_2$ is a least general generalization under implication (LGGI) of $C_1$ and $C_2$. It is clear that $D_1$ is strictly more general than $D_2$, both under θ-subsumption and under implication. It is also clear that $D_2$ is more appropriate in a definition of natural number.
As illustrated above, generalization under θ-subsumption is not adequate for learning recursive clauses. The ability to learn recursive clauses is crucial, since recursion is the basic program structure of logic programs.
In section 2, we describe the most important results concerning generalization under θ-subsumption, and present a theoretical study of generalization under implication. In section 3, we present a technique to reduce implication to θ-subsumption based on or-introduction of literals. Finally, our results, computational complexity and future work are discussed in section 4.
We assume the reader to be familiar with the basic notions and notations in Logic Programming (Lloyd, 1987) and/or Automatic Theorem Proving (Chang & Lee, 1973; Gallier, 1986).

Generalization of Clauses
In the area of Inductive Logic Programming (ILP), the framework for generalization of clauses developed by Plotkin (1970, 1971a, 1971b) has been accorded much interest. In this section we will describe this framework, which is based on a relation known as θ-subsumption, and the most important results connected with it.
Since generalization under θ-subsumption is not sufficient for generalization of recursive clauses, as shown in the introduction, we will study the theory of generalization under implication. We note that implication between clauses is undecidable, and we will therefore introduce a restricted form of implication, called T-implication.
Example Consider the following clauses: $C = (p(x) \leftarrow q(x,y), q(y,z), q(z,w), q(w,x))$, $D = (p(x) \leftarrow q(x,y), q(y,x), q(x,x))$, and $E = (p(x) \leftarrow q(x,x))$. We have $C \preceq D$ since $C\{z/x, w/y\} \subseteq D$, $D \preceq E$ since $D\{y/x\} \subseteq E$, and thus $C \preceq E$. We also have $E \preceq D$. Hence, $D \sim E$ and still $D \not\simeq E$.

Generalization under θ-subsumption
Theorem 1 states that θ-subsumption between clauses is decidable. This was first shown by Robinson (1965, page 39).
Theorem 1 (Decidability of θ-subsumption between clauses) Let $C$ and $D$ be clauses. Then there exists a procedure to decide if $C \preceq D$.
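Decidability can be made concrete with a small backtracking search. The following Python sketch is illustrative only and is not claimed to be Robinson's procedure. It assumes a representation chosen here for illustration: a clause is a list of (sign, atom) pairs, a variable is a plain string, and compound terms and constants are tuples. The names `match` and `theta_subsumes` are hypothetical. The search tries every way of mapping the literals of $C$ into $D$ under a single consistent substitution.

```python
def match(pattern, target, theta):
    """Extend theta so that pattern*theta equals target; return None if impossible.

    A term is a variable (a plain str) or a tuple (functor, arg1, ..., argk);
    a constant is a 1-tuple such as ("0",). Variables in target are never bound.
    """
    if isinstance(pattern, str):                      # pattern is a variable
        if pattern in theta:
            return theta if theta[pattern] == target else None
        return {**theta, pattern: target}
    if isinstance(target, str):                       # compound term vs. variable
        return None
    if pattern[0] != target[0] or len(pattern) != len(target):
        return None
    for p_arg, t_arg in zip(pattern[1:], target[1:]):
        theta = match(p_arg, t_arg, theta)
        if theta is None:
            return None
    return theta


def theta_subsumes(c, d):
    """Decide whether clause c theta-subsumes clause d by backtracking search.

    A clause is a list of literals (sign, atom); sign is True for a positive
    literal (the head) and False for a negative literal (a body atom).
    """
    c_literals = list(c)

    def search(i, theta):
        if i == len(c_literals):
            return True                               # every literal of c is mapped into d
        sign, atom = c_literals[i]
        for d_sign, d_atom in d:
            if sign == d_sign:
                extended = match(atom, d_atom, theta)
                if extended is not None and search(i + 1, extended):
                    return True
        return False

    return search(0, {})


def succ(t):                                          # successor term s(t)
    return ("s", t)


zero = ("0",)
# The four clauses from the introduction.
C1 = [(True, ("number", succ(zero))), (False, ("number", zero))]
C2 = [(True, ("number", succ(succ(succ(zero))))), (False, ("number", succ(zero)))]
D1 = [(True, ("number", succ("x"))), (False, ("number", "y"))]
D2 = [(True, ("number", succ("x"))), (False, ("number", "x"))]

print(theta_subsumes(D1, C1), theta_subsumes(D1, C2))  # True True
print(theta_subsumes(D2, C1), theta_subsumes(D2, C2))  # True False
```

The last line reflects the point made in the introduction: $D_2$ implies $C_2$ but does not θ-subsume it. The search is exponential in the worst case, which is acceptable for a sketch of decidability.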
As mentioned in the introduction, we are particularly interested in least general generalizations.The main reason is that a least general generalization includes the information of all consistent generalizations.
Definition A clause $C$ is a generalization under θ-subsumption of a set of clauses $S = \{D_1, \ldots, D_n\}$ if and only if, for every $1 \le i \le n$, $C \preceq D_i$. A generalization under θ-subsumption $C$ of $S$ is a least general generalization under θ-subsumption (LGGθ) of $S$ if and only if, for every generalization under θ-subsumption $C'$ of $S$, $C' \preceq C$.
In general, an LGGθ is not unique, as shown by the example above. However, it is unique up to θ-subsumption equivalence. Plotkin has shown (1971a, page 82) that there exists an LGGθ of every finite set of clauses.
Theorem 2 (Existence of LGGθs) Let $S$ be a finite set of clauses. Then there exists an LGGθ of $S$.
An LGGθ of a set of clauses is computable, and Plotkin has described (1971a) an algorithm for computing it. This algorithm is quite simple and easy to implement, but computationally expensive.
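The core of such a computation is anti-unification of terms with a shared table of introduced variables. The sketch below is a simplified illustration for two clauses, not a transcription of Plotkin's published algorithm. It assumes the inputs share no variables unintentionally, performs no reduction of the result, and uses hypothetical names (`lgg_term`, `lgg_clause`, fresh variables `v0`, `v1`, ...). The clause `C3` is an extra example introduced here and does not appear in the paper.

```python
def lgg_term(s, t, table):
    """Least general generalization (anti-unification) of two terms.

    Terms are variables (str) or tuples (functor, arg1, ..., argk). The table
    maps each pair of differing subterms to one shared variable.
    """
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        return (s[0],) + tuple(lgg_term(a, b, table) for a, b in zip(s[1:], t[1:]))
    if (s, t) not in table:
        table[(s, t)] = "v%d" % len(table)            # fresh variable (hypothetical naming)
    return table[(s, t)]


def lgg_clause(c, d):
    """Generalize every compatible pair of literals, sharing one variable table."""
    table, result = {}, []
    for sign_c, atom_c in c:
        for sign_d, atom_d in d:
            same_predicate = atom_c[0] == atom_d[0] and len(atom_c) == len(atom_d)
            if sign_c == sign_d and same_predicate:
                literal = (sign_c, lgg_term(atom_c, atom_d, table))
                if literal not in result:
                    result.append(literal)
    return result


def succ(t):                                          # successor term s(t)
    return ("s", t)


zero = ("0",)
C1 = [(True, ("number", succ(zero))), (False, ("number", zero))]
C2 = [(True, ("number", succ(succ(succ(zero))))), (False, ("number", succ(zero)))]
C3 = [(True, ("number", succ(succ(zero)))), (False, ("number", succ(zero)))]  # not from the paper

print(lgg_clause(C1, C2))  # number(s(v0)) <- number(v1), the shape of D1
print(lgg_clause(C1, C3))  # number(s(v0)) <- number(v0), a recursive clause
```

For $C_1$ and $C_2$ the head and body differ in different pairs of subterms, so two distinct variables are introduced and the recursive link is lost. For the adjacent pair `C1`, `C3` the same pair of subterms occurs in head and body, so one variable is shared.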

Generalization under Implication
Implication is the most natural and straightforward basis for generalization in inductive learning, since the concept of induction can be defined as the inverse of logical entailment. It is well known that implication is reflexive and transitive. Two clauses may be equivalent under implication without being equivalent under θ-subsumption.
Example Consider the following clauses: $C = (p(x,y,z) \leftarrow p(y,z,x))$, and $D = (p(x,y,z) \leftarrow p(z,x,y))$. Then we have $C \Leftrightarrow D$, since $D$ is a resolvent of $C$ resolved with itself, and $C$ is a resolvent of $D$ resolved with itself. We also have $C \not\simeq D$, and even $C \not\preceq D$.
It has been claimed that implication and θ-subsumption are equivalent for function-free clauses (Helft, 1987). This is wrong, as shown by the example above. The above example also shows that if a clause $C$ implies a clause $D$ then $C$ does not necessarily θ-subsume $D$. It is well known that implication is a strictly weaker relation between clauses than θ-subsumption.
Proposition 3 Let $C$ and $D$ be two clauses. If $C \preceq D$ then $C \Rightarrow D$.
Theorem 4 (Undecidability of implication between clauses) Let $C$ and $D$ be clauses. Then there exists no procedure to decide if $C \Rightarrow D$.
Niblett (1988) has claimed that implication between Horn clauses is decidable. This result has later been proved to be false (Marcinkowski & Pacholski, 1992).
The definition of a least general generalization under implication (LGGI) follows the definition of an LGGθ. The clause $E$ is an LGGθ of $\{C, D\}$, and $F$ is an LGGI of $\{C, D\}$. The LGGθ (clause $E$) is strictly more general than the LGGI (clause $F$), both under implication and under θ-subsumption, since $E \Rightarrow F$ but $F \not\Rightarrow E$, and $E \preceq F$ but $F \not\preceq E$.
Whether there exists an LGGI of every finite set of clauses is still an open problem. However, since implication between clauses is undecidable, it is clear that in general an LGGI is not computable.

T-implication
Because implication between clauses is undecidable, we here introduce a stronger form of implication called T-implication, which is decidable between clauses. It is called T-implication since it is defined w.r.t. a finite set of ground terms $T$. In our presentation we use the notions of instance set of clauses, Skolem substitution, and term set of sets of clauses.
Definition Let $C$ be a clause, $\{x_1, \ldots, x_n\}$ the set of variables in $C$, and $T$ a set of terms. Then the instance set $I(C, T)$ of $C$ w.r.t. $T$ is $\{C\theta \mid \theta = \{x_1/t_1, \ldots, x_n/t_n\} \text{ where } \{t_1, \ldots, t_n\} \subseteq T\}$.
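The instance set can be enumerated directly from its definition. The sketch below is a minimal illustration, assuming the term representation chosen here (variables as plain strings, other terms as tuples) and clauses as lists of (sign, atom) pairs. The function names are hypothetical. Each variable of the clause is mapped to some term of $T$, with repetition allowed.

```python
from itertools import product


def variables_of(term):
    """Collect the variables (plain strings) occurring in a term or atom."""
    if isinstance(term, str):
        return {term}
    found = set()
    for arg in term[1:]:
        found |= variables_of(arg)
    return found


def substitute(term, theta):
    """Apply the substitution theta (a dict from variable to term) to a term."""
    if isinstance(term, str):
        return theta.get(term, term)
    return (term[0],) + tuple(substitute(arg, theta) for arg in term[1:])


def instance_set(clause, terms):
    """Return I(clause, terms): all instances with variables drawn from terms."""
    names = sorted(set().union(*(variables_of(atom) for _, atom in clause)))
    instances = set()
    for values in product(terms, repeat=len(names)):
        theta = dict(zip(names, values))
        instances.add(frozenset((sign, substitute(atom, theta)) for sign, atom in clause))
    return instances


a = ("a",)
T = [a, ("f", a), ("f", ("f", a))]                    # {a, f(a), f^2(a)}
C = [(True, ("p", ("f", "x"))), (False, ("p", "x"))]  # p(f(x)) <- p(x)
E = [(True, ("p", ("f", "x"))), (False, ("p", "y"))]  # p(f(x)) <- p(y)

print(len(instance_set(C, T)))  # 3
print(len(instance_set(E, T)))  # 9
```

The size of the instance set is $|T|^n$ for a clause with $n$ variables, which is finite whenever $T$ is finite.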
Definition Let $\sigma$ be a substitution, $C$ a clause, $\{x_1, \ldots, x_n\}$ the set of variables occurring in $C$, $S$ a set of clauses, and $F$ the set of function symbols occurring in $S \cup \{C\}$. Then $\sigma$ is a Skolem substitution for $C$ w.r.t. $S$ if and only if $\sigma = \{x_1/a_1, \ldots, x_n/a_n\}$ where $a_1, \ldots, a_n$ are distinct constants, and $F \cap \{a_1, \ldots, a_n\} = \emptyset$.
Definition Let $\{D_1, \ldots, D_n\}$ be a set of clauses such that $D_1, \ldots, D_n$ have no variables in common, $S$ be a set of clauses, $\sigma$ a substitution, and $T$ a set of terms. Then $T$ is a term set of $\{D_1, \ldots, D_n\}$ by $\sigma$ w.r.t. $S$ if and only if: a) $\sigma$ is a Skolem substitution for $\{D_1, \ldots, D_n\}$ w.r.t. $S$, and b) $T$ is finite and includes all terms and subterms occurring in $\{D_1, \ldots, D_n\}\sigma$. If $T$ is equal to the set of terms and subterms occurring in $\{D_1, \ldots, D_n\}\sigma$ then $T$ is a minimal term set of $\{D_1, \ldots, D_n\}$ by $\sigma$ w.r.t. $S$.
Like implication, T-implication is reflexive, but unlike implication, T-implication is not transitive (Idestam-Almquist, 1993a). The relationship between implication and T-implication, described in Corollary 6 below, follows from Herbrand's theorem. For a proof of Herbrand's theorem the reader is referred to a book by Chang and Lee (1973, page 61). In our proof of Corollary 6 we use the notion of the complement of a clause.
The clause $E$ is an LGGθ of $\{C, D\}$, and $F$ is both an LGGI and an LGGT of $\{C, D\}$. The LGGT is strictly more specific than the LGGθ, since $E \Rightarrow F$ and $F \not\Rightarrow E$.

Definition Let C = (
Below we prove that there exists an LGGT of every finite set of clauses. In fact we prove something stronger, namely that there exists what we call a complete LGGT of every finite set of non-tautological clauses. Note that a complete LGGT is θ-subsumed by any other generalization under T-implication.
Lemma 10 Let $S$ be a finite set of non-tautological clauses, $T = \{t_1, \ldots, t_m\}$ a term set of $S$, $V = \{x_1, \ldots, x_m\}$ a set of variables, and $G = \{C_1, C_2, \ldots\}$ the (possibly infinite) set of all generalizations under T-implication of $S$ w.r.t. $T$. Then the set $G' = I(C_1, V) \cup I(C_2, V) \cup \ldots$ is a finite set of clauses.

Definition
Proof: Let $d$ be the maximal depth of a clause in $S$, and $F_S$ and $F_G$ the sets of predicate and function symbols occurring in the clauses in $S$ and $G$ respectively. Then $F_G \cup V$ is the set of variables, predicate and function symbols occurring in the clauses in $G'$. By Corollary 6, $G$ is a set of generalizations under implication of $S$. Then, by Proposition 9 and the definition of θ-subsumption, $F_G \subseteq F_S$ and the maximal depth of a clause in $G$ is $d$. Hence $F_G \cup V$ is finite and the maximal depth of a clause in $G'$ is $d$, and consequently $G'$ is a finite set of clauses. □
Lemma 11 Let $C$ be a clause, $S$ a set of clauses, $V = \{x_1, \ldots, x_m\}$ a set of variables, and $T = \{t_1, \ldots, t_m\}$ a term set of $S$ by $\sigma$ w.r.t. $\{C\}$, such that $C$ is a generalization under T-implication of $S$ w.r.t. $T$. Then there exists an LGGθ $E$ of $I(C, V)$ such that $E$ is a generalization under T-implication of $S$ w.r.t. $T$.
Proof: Let $I(C, V) = \{C\theta_1, \ldots, C\theta_k\}$. Then $\theta_1, \ldots, \theta_k$ are variable-pure substitutions, and for every LGGθ $F$ of $I(C, V)$ and every $1 \le i \le k$, we have $C \preceq F$ and $F \preceq C\theta_i$. Then there exists an LGGθ $E$ of $I(C, V)$ and variable-pure substitutions $\gamma_1, \ldots, \gamma_k$ such that, for every $1 \le i \le k$, $E\gamma_i \subseteq C\theta_i$. Let $\mu = \{x_1/t_1, \ldots, x_m/t_m\}$, and then $I(C, T) = \{C\theta_1\mu, \ldots, C\theta_k\mu\}$. Since $C$ is a generalization under T-implication of $S$ w.r.t. $T$, we have $\{C\theta_1\mu, \ldots, C\theta_k\mu\} \models S\sigma$. For every $1 \le i \le k$, $E\gamma_i\mu \subseteq C\theta_i\mu$. Then, for every $1 \le i \le k$, by Proposition 3, $E\gamma_i\mu \Rightarrow C\theta_i\mu$, and thus $\{E\gamma_1\mu, \ldots, E\gamma_k\mu\} \models \{C\theta_1\mu, \ldots, C\theta_k\mu\}$. Since $\gamma_1, \ldots, \gamma_k$ are variable-pure substitutions, we have $\{E\gamma_1\mu, \ldots, E\gamma_k\mu\} \subseteq I(E, T)$. Thus, $I(E, T) \models \{C\theta_1\mu, \ldots, C\theta_k\mu\}$, and $I(E, T) \models S\sigma$. Consequently $E$ is a generalization under T-implication of $S$ w.r.t. $T$.
Theorem 13 (Existence of complete LGGTs) Let $S$ be a finite set of non-tautological clauses, and $T$ a term set of $S$. Then there exists a complete LGGT of $S$ w.r.t. $T$.
Proof: Let $T = \{t_1, \ldots, t_m\}$, $V = \{x_1, \ldots, x_m\}$ be a set of variables, and $G = \{C_1, C_2, \ldots\}$ the (possibly infinite) set of all generalizations under T-implication of $S$ w.r.t. $T$. By Lemma 10, the set $G' = I(C_1, V) \cup I(C_2, V) \cup \ldots$ is a finite set of clauses. Since $G'$ is finite, the set $\{I(C_1, V), I(C_2, V), \ldots\}$ is also finite. For every $i \ge 1$, by Lemma 11, there exists an LGGθ $E_i$ of $I(C_i, V)$ such that $E_i$ is a generalization under T-implication of $S$ w.r.t. $T$. Then rename the variables in $E_1, E_2, \ldots$ such that, for every $k \ge 1$ and $p \ge 1$, $E_k = E_p$ whenever $I(C_k, V) = I(C_p, V)$, and otherwise $E_k$ and $E_p$ have no variables in common. Then the set $\{E_1, E_2, \ldots\}$ is finite, since $\{I(C_1, V), I(C_2, V), \ldots\}$ is finite. Let $F = E_1 \cup E_2 \cup \ldots$, which consequently is a clause. For every $i \ge 1$, by the definition of an LGGθ, $C_i \preceq E_i$, and thus $C_i \preceq F$. Then, for every $i \ge 1$, by Proposition 7, $C_i \Rightarrow_T F$. As shown above, for every $i \ge 1$, $E_i$ is a

Reduction of Implication to θ-subsumption
There are generalizations under implication that are not generalizations under θ-subsumption. Our main idea for finding all generalizations under implication is to reduce implication to θ-subsumption, which can be achieved by inverting self-resolution. In this section we will describe a technique for inverting resolution based on or-introduction of literals. We will also introduce the notion of expansion of clauses, which summarizes our idea of reduction of implication to θ-subsumption.

Difference between θ-subsumption and Implication
In section 2.2, we showed that $C \Rightarrow D$ follows from $C \preceq D$, but not the converse.
Hence, there are generalizations under implication that are not generalizations under θ-subsumption. It follows from a result by Gottlob (1987) that the difference between θ-subsumption and implication only concerns ambivalent clauses, as defined below.
Proposition 15 has been proved by Gottlob (1987, page 110). It follows from this proposition that an LGGθ and an LGGI of a set of clauses, including at least one non-ambivalent clause, are equivalent.
Muggleton (1992) has investigated the relationship between resolution and implication between clauses. He describes the subsumption theorem (Lee, 1967) in terms of input resolution, and gives a corollary about the relationship between θ-subsumption and implication between clauses. Unfortunately, this formulation of the subsumption theorem, which was later also used by Idestam-Almquist (1993c, 1993a), has been shown to be wrong. Nienhuys-Cheng and de Wolf (1995) have given a counter-example which shows that the subsumption theorem for input resolution does not even hold in the special case where the considered set of clauses contains only one clause. Below we give the correct formulation of the subsumption theorem, which is based on the nth resolution (Robinson, 1965).

Definition
Definition A substitution $\theta$ is a unifier for a finite set of literals $S$ if and only if $S\theta$ is a singleton. A unifier $\theta$ for $S$ is a most general unifier (mgu) for $S$ if and only if for each unifier $\sigma$ of $S$ there exists a substitution $\gamma$ such that $\sigma = \theta\gamma$.
Definition Let $C$ be a clause, $\Gamma \subseteq C$ and $\theta$ an mgu of $\Gamma$. Then $C\theta$ is a factor of $C$.
Definition Let $T$ be a set of clauses. Then, the nth resolution of $T$, denoted $R^n(T)$, is defined as: a) $R^0(T) = T$, and b) $R^n(T) = R^{n-1}(T) \cup \{R \mid C, D \in R^{n-1}(T) \text{ and } R \text{ is a resolvent of } C \text{ and } D\}$ if $n > 0$.
Theorem 16 (Subsumption theorem) Let $T$ be a set of clauses and $C$ a non-tautological clause. Then $T \models C$ if and only if there exists a clause $D \in R^n(T)$ such that $D \preceq C$ for some $n \ge 0$.
Two different recent proofs of Theorem 16 have been presented, one by Nienhuys-Cheng and de Wolf (1995), and one by Bain and Muggleton (1992). There also exist at least two different earlier proofs of this theorem in the literature, one by Slagle, Chang and Lee (1969), and one by Kowalski (1970). We are interested in the number of resolutions involved in the computation of a clause, and therefore we introduce the notion of nth resolution layer. A clause in the nth resolution layer has been obtained from the original set of clauses by $n - 1$ resolutions.
Definition Let $T$ be a set of clauses. Then, the nth resolution layer of $T$, denoted $L^n(T)$, is defined as: a) $L^1(T) = T$, and b) $L^n(T) = \{R \mid R \text{ is a resolvent of } C \in L^m(T) \text{ and } D \in L^p(T) \text{ where } m + p = n,\ m \ge 1 \text{ and } p \ge 1\}$ if $n > 1$.
Corollary 17 (Implication between clauses using resolution) Let $C$ be a clause and $D$ a non-tautological clause. Then $C \Rightarrow D$ if and only if there exists a clause $E \in L^n(\{C\})$ such that $E \preceq D$ for some $n \ge 1$.
Corollary 17 follows from Theorem 16, and the observation that, for every $n \ge 1$, if a clause $C \in L^n(T)$ then also $C \in R^n(T)$. This corollary tells us that implication between clauses is equivalent to a combination of self-resolution and θ-subsumption. Muggleton (1992) has introduced the notion of powers and roots of clauses for specializations and generalizations of clauses where the clauses are resolved with themselves. Below we present definitions of these and related concepts modified w.r.t. the correct definition of the subsumption theorem.
Definition A clause $D$ is an nth power of a clause $C$ if and only if $D$ is a variant of a clause in $L^n(\{C\})$ ($n \ge 1$). We also say that $C$ is an nth root of $D$. A clause $D$ is an indirect nth power of a clause $C$ if and only if there exists a clause $E$ such that $E \preceq D$ and $E$ is an nth power of $C$. We also say that $C$ is an indirect nth root of $D$. Let $C$ be a clause and $D$ an indirect nth power of $C$. Then $D$ is a proper indirect nth power of $C$ if and only if $C \not\preceq D$. We also say that $C$ is a proper indirect nth root of $D$.
To say that a clause implies another non-tautological clause is equivalent to saying that the clause is an indirect root of the other clause. However, to say that a clause is an indirect nth root for some specified $n$ is more informative.
Implication between clauses can be described as a combination of self-resolution and θ-subsumption. Plotkin's algorithm to compute LGGθs gives us a suitable tool for finding generalizations under θ-subsumption. Hence, to be able to find generalizations under implication we also need a technique to invert resolution.

Inverting One Resolution by Or-introduction
Other work on inverting resolution has primarily considered the problem of constructing one parent clause given the resolvent and the other parent clause (Muggleton & Buntine, 1988; Rouveirol & Puget, 1989; Wirth, 1989; Muggleton, 1990; Hume & Sammut, 1991; Idestam-Almquist, 1992; Rouveirol, 1992). Below we will describe how or-introduction can be used to construct two parent clauses from only the resolvent.
Let $C$ and $D$ be clauses, and the following clause $R$ a resolvent of $C$ and $D$: $R = ((C\gamma - \{A\}) \cup (D\mu - \{\overline{B}\}))\theta$, where $C\gamma$ is a factor of $C$, $D\mu$ is a factor of $D$, $A \in C\gamma$, $\overline{B} \in D\mu$ and $\theta$ is an mgu for $\{A, B\}$. We seek parent clauses of $R$ that are minimally general. Then we should let $\gamma$, $\mu$ and $\theta$ be empty substitutions, which corresponds to an assumption that no instantiation of variables has been done in the resolution of $C$ and $D$, and thus we have $A = B$. We should also let $C - \{A\} = D - \{\overline{B}\}$, which corresponds to an assumption that each literal in $R$ is
The set of clauses $\{D_1, D_2\}$ is or-introduced from $C$ by $[p(f^2(a))]$, and the set of clauses $\{E_1, E_2\}$ is or-introduced from $D_1$ by $[p(f(a))]$. Consequently, the set of clauses $\{D_2, E_1, E_2\}$ is or-introduced from $C$ by $[p(f^2(a)), p(f(a))]$.
In the example above, clause $D_1$ is a resolvent of $E_1$ and $E_2$, and $C$ is a resolvent of $D_1$ and $D_2$. Consequently, $C$ is derivable from $\{D_2, E_1, E_2\}$ by resolution. That a set of clauses or-introduced from a clause is logically equivalent to the clause is shown by the following theorem.
Theorem 20 (Equivalence preservation of or-introduction) Let $S$ be a set of clauses or-introduced from a clause $C$ by a sequence of literals $[L_1, \ldots, L_n]$. Then $S \Leftrightarrow \{C\}$.
Proof: The proof is by mathematical induction on $n$. Note that $S$ in the statement of the theorem is indexed by $n$ in the proof.
Base step ($n = 0$): $S_0$ is or-introduced from $C$ by $[]$. Hence $S_0 \Leftrightarrow \{C\}$. Induction hypothesis ($n = k$): $S_k \Leftrightarrow \{C\}$, where $S_k$ is or-introduced from $C$ by $[L_1, \ldots, L_k]$.
Induction step ($n = k+1$): Let $D \in S_k$. Then $S_{k+1} = (S_k - \{D\}) \cup \{D \cup \{L_{k+1}\}, D \cup \{\overline{L_{k+1}}\}\}$ is or-introduced from $C$ by $[L_1, \ldots, L_k, L_{k+1}]$. By Proposition 18, we have $\{D \cup \{L_{k+1}\}, D \cup \{\overline{L_{k+1}}\}\} \Leftrightarrow \{D\}$, and consequently $S_{k+1} \Leftrightarrow S_k$. By the induction hypothesis $S_k \Leftrightarrow \{C\}$, and thus $S_{k+1} \Leftrightarrow \{C\}$. □
In section 3.2 we showed that it is possible to invert one resolution by or-introduction of one literal. Below we show that it is possible to invert a sequence of resolutions by or-introduction of a sequence of literals.
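The or-introduction step used in the proof of Theorem 20 can be checked mechanically in the ground case. The sketch below is an illustration under simplifying assumptions: clauses are ground, atoms are plain strings, and equivalence is tested by enumerating truth assignments. The clause `C` and all function names are hypothetical and are not taken from the paper's own example.

```python
from itertools import product


def negate(literal):
    sign, atom = literal
    return (not sign, atom)


def or_introduce(clauses, chosen, literal):
    """Replace `chosen` by the two clauses chosen + {L} and chosen + {not L}."""
    if chosen not in clauses:
        raise ValueError("chosen clause must belong to the set")
    return (clauses - {chosen}) | {chosen | {literal}, chosen | {negate(literal)}}


def satisfied(interpretation, clause):
    """A clause holds if at least one of its literals has the assigned truth value."""
    return any(interpretation[atom] == sign for sign, atom in clause)


def equivalent(left, right):
    """Truth-table check that two sets of ground clauses have the same models."""
    atoms = sorted({atom for clause in left | right for _, atom in clause})
    for values in product([False, True], repeat=len(atoms)):
        interpretation = dict(zip(atoms, values))
        holds_left = all(satisfied(interpretation, c) for c in left)
        holds_right = all(satisfied(interpretation, c) for c in right)
        if holds_left != holds_right:
            return False
    return True


# Hypothetical ground clause C = (p(f3(a)) <- p(a)); atoms are plain strings.
C = frozenset({(True, "p(f3(a))"), (False, "p(a)")})
S0 = {C}
S1 = or_introduce(S0, C, (True, "p(f2(a))"))
D1 = C | {(False, "p(f2(a))")}           # the member of S1 containing the negated literal
S2 = or_introduce(S1, D1, (True, "p(f(a))"))

print(len(S1), len(S2))                         # 2 3
print(equivalent(S1, S0), equivalent(S2, S0))   # True True
```

Each step splits one clause on a literal and its complement, so resolving the two new clauses on that literal returns the clause that was split.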
Lemma 21 Let $D$ and $E$ be clauses, $\{C_1, \ldots, C_n\}$ a set of clauses, and $\{D_1, \ldots, D_n\}$ a set of clauses or-introduced from $D$, such that $D \preceq E$ and, for every $1 \le i \le n$, $C_i \preceq D_i$. Then there exists a set of clauses $\{E_1, \ldots, E_n\}$ or-introduced from $E$, such that for every $1 \le i \le n$, $C_i \preceq E_i$.
Proof: Let $D_i$ be an arbitrary clause in $\{D_1, \ldots, D_n\}$. Then we have $D_i = D \cup \Gamma_i$ for some set of literals $\Gamma_i$, since $\{D_1, \ldots, D_n\}$ is or-introduced from $D$. Since $C_i \preceq D_i$, there exists a substitution $\theta_i$ such that $C_i\theta_i \subseteq D_i$. Since $D \preceq E$, there exists a substitution $\sigma$ such that $D\sigma \subseteq E$. Thus we have $(D \cup \Gamma_i)\sigma \subseteq (E \cup \Gamma_i\sigma)$, and consequently $C_i\theta_i\sigma \subseteq E \cup \Gamma_i\sigma$. Let $E_i = E \cup \Gamma_i\sigma$ and we have $C_i \preceq E_i$. □
Theorem 22 (Inverting resolution using or-introduction) Let $T$ be a set of clauses, $D$ a clause in $L^n(T)$. Then there exists a set of clauses $S$ or-introduced from $D$ such that for each $E \in S$ there exists a clause $C \in T$ such that $C \preceq E$.
Proof: The proof is by complete mathematical induction on $n$. Note that $D$ and $S$ in the statement of the theorem are indexed by $n$ in the proof.
Base step ($n = 1$): By the definition of nth resolution layer $L^1(T) = T$, and thus $D_1 \in T$. We have that $S_1 = \{D_1\}$ is or-introduced from $D_1$ by the empty sequence of literals $\Sigma_1 = []$. Hence, for $D_1 \in S_1$ there exists a clause $D_1 \in T$ such that $D_1 \preceq D_1$.
Induction hypothesis ($n = k$): For every $1 \le i \le k$, there exists a set of clauses $S_i$ or-introduced from $D_i$ by some sequence of literals $\Sigma_i = [L_1, \ldots, L_{i-1}]$ such that for each $E \in S_i$ there exists a clause $C \in T$ such that $C \preceq E$.
Then it follows from the definition of or-introduction that $S_{k+1} = S'_m \cup S'_p$ is a set of clauses or-introduced from $D_{k+1}$ by $\Sigma_{k+1} = [L, A'_1, \ldots, A'_m, B'_1, \ldots, B'_p]$. Consequently, there exists a set of clauses $S_{k+1}$ or-introduced from $D_{k+1}$ such that for each $E \in S_{k+1}$ there exists a clause $C \in T$ such that $C \preceq E$. □

Expansion of Clauses
In section 3.3 it was described how a reduction of generalization can be achieved by replacing a clause by a set of clauses. Here we show how this set of clauses can equivalently be described by a single clause, which we call an expansion of the original clause. By definition, if a clause $C$ θ-subsumes every clause in a set of clauses $S$, then $C$ will also θ-subsume an LGGθ of $S$. This leads us to our definition of expansion of clauses. The idea of expansion of clauses was first presented by Idestam-Almquist (1993c).
The set of clauses $\{D_1, D_2, D_3\}$ is or-introduced from the clause $D$ by $[p(f^2(a)), p(f(a))]$, and $E$ is an LGGθ of $\{D_1, D_2, D_3\}$. Consequently, $E$ is an expansion of $D$ by $[p(f^2(a)), p(f(a))]$.

Definition
Note that implication has been reduced to θ-subsumption in the example above. We
Proof: By the definition of expansion, we know that there exists a set of clauses $S$ or-introduced from $D$ by $\Sigma$ such that $E$ is an LGGθ of $S$. By Theorem 20, we have $\{D\} \Leftrightarrow S$. By the definition of an LGGθ, $E \preceq F$ for each $F \in S$. Then by Proposition 3, $E \Rightarrow F$ for each $F \in S$. Thus $\{E\} \models S$, and consequently $E \Rightarrow D$. We have $D \preceq F$ for each $F \in S$. Then by the definition of an LGGθ, we have $D \preceq E$, and by Proposition 3 $D \Rightarrow E$. Consequently, $E \Leftrightarrow D$. □
Below we prove that for every generalization under implication of a clause there exists an expansion of the clause such that the generalization under implication is reduced to a generalization under θ-subsumption.
Theorem 24 (Reduction of implication to θ-subsumption using expansion) Let $C$ be a clause and $D$ a non-tautological clause such that $C \Rightarrow D$. Then there exists an expansion $E$ of $D$ such that $C \preceq E$.
Proof: By Corollary 17, there exists a clause $D' \in L^n(\{C\})$ such that $D' \preceq D$ for some $n \ge 1$. By Theorem 22, there exists a set of clauses $S'$ or-introduced from $D'$ such that for each $F' \in S'$ we have $C \preceq F'$. Then it follows from Lemma 21 that there exists a set of clauses $S$ or-introduced from $D$ such that for each $F \in S$ we have $C \preceq F$. Then let $E$ be an LGGθ of $S$, and thus an expansion of $D$, and we have $C \preceq E$ by the definition of an LGGθ. □

Complete Expansion
Generalizations under implication of a clause can be reduced to generalizations under θ-subsumption of an expansion of the clause. We are particularly interested in expansions of clauses such that every generalization under implication is reduced to a generalization under θ-subsumption of that particular expansion.
Definition Let $D$ be a clause, and $E$ an expansion of $D$. Then $E$ is a complete expansion of $D$ if and only if, for every clause $C$, $C \preceq E$ whenever $C \Rightarrow D$.
Recall that an expansion of a clause is a clause, and thus finite. Muggleton and Page (1994, page 166) have shown that complete expansions, which they call finite self-saturations, do not exist for all clauses.
Theorem 25 (Non-existence of complete expansions) There exist non-tautological clauses for which there exist no complete expansions.
Complete expansions fail to exist because for some clauses there are infinitely many distinct generalizations under implication. Because of this we turn to the problem of reducing every generalization under T-implication to a generalization under θ-subsumption of a single expansion.
If we can compute T-complete expansions of a set of clauses then we can use Plotkin's algorithm for computing an LGGθ to compute an LGGT. From the proof of Theorem 26 it follows that the candidate set of literals to be used to compute a T-complete expansion is finite. Since expansion is equivalence preserving we could simply test all different ways to expand a clause by sequences of literals from this candidate set, and in this way obtain a T-complete expansion. This is of course an extremely complex process, but at least theoretically, T-complete expansions and LGGTs are computable.

De nition
As noted in section 3.5, T-complete expansions and LGGTs are computable, but such a computation is extremely costly.This is not surprising since our framework for generalization under implication is based on and extends Plotkin's framework for generalization under -subsumption, which already su ers from complexity problems.In general an LGG of a set of clauses may grow exponentially in the number of clauses in the set (Muggleton & Feng, 1990).Even an LGG reduced under -subsumption, which means that all literals that are redundant under -subsumption are removed, may grow exponentially in the number of clauses (Kietz, 1993).Since an expansion of a clause is an LGG of a set of or-introduced clauses, the computational cost of an expansion grows exponentially in the number of literals used in the or-introduction.In the computation of a T-complete expansion a large number of literals may be considered, and consequently such a computation would be extremely costly.
However, although Plotkin's framework for generalization under -subsumption is computationally expensive, it has been widely used as a theoretical framework.Then to make it practical, a number of di erent restrictions on the clausal language has been considered, for example ij-determinacy (Muggleton & Feng, 1990).In a similar way we hope to nd restrictions under which our here presented framework for generalization under implication can be practically useful.Idestam-Almquist (1993b, 1993a) has described a technique to e ciently compute a restricted form of generalizations under implication.Recently, Muggleton has presented another approach based on generating a number of clauses, so called sub-saturants, which are candidates for being indirect roots, and then testing whether they are so or not (Muggleton, 1995).This approach might be a way to more e ciently compute some generalizations under implication.Some approaches to learn recursive de nitions (recursive logic programs) by generalization under implication have been presented (Lapointe & Matwin, 1992;Aha, Lapointe, Ling, & Matwin, 1994;Idestam-Almquist, 1995).These approaches are based on structural analysis of the given examples, but can theoretically be described in our framework.
A study by Cohen (1995aCohen ( , 1995b) ) of the learnability of recursive logic programs has previously been presented in this journal.In this study it was shown that a recursive logic program consisting of one constant-depth determinate closed k-ary recursive clause and one constant-depth determinate non-recursive clause is PAC-learnable given an additional \base-case oracle", which determines if a positive example is covered by the non-recursive base clause of the target program alone.It was also shown that generalizing this class of learning problem in any natural way leads to a computationally di cult problem.This result tells us that to e ciently learn more complex recursive hypotheses some extra information, such as rule models (Kietz & Wrobel, 1992) or program recursion schemes (Hamfelt & Nilsson, 1994), is needed.
The contributions of this paper are threefold. First, we have systematically reviewed and discussed the concepts relevant to generalization in a first-order setting. Second, we have introduced T-implication, a stronger form of implication which is decidable between clauses. Third, we have further developed previous work of the author (Idestam-Almquist, 1993c) on extending Plotkin's framework for generalization under $\theta$-subsumption to generalization under implication.
Definition A clause $C$ $\theta$-subsumes a clause $D$, denoted $C \preceq D$, if and only if there exists a substitution $\theta$ such that $C\theta \subseteq D$. Two clauses $C$ and $D$ are equivalent under $\theta$-subsumption, denoted $C \sim D$, if and only if $C \preceq D$ and $D \preceq C$.
$\theta$-subsumption is reflexive and transitive. Two clauses may be equivalent under $\theta$-subsumption without being variants. Two clauses $C$ and $D$ are variants, denoted $C \simeq D$, if they are equal up to variable renaming.
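The definition of $\theta$-subsumption can be illustrated by a small backtracking search for a substitution $\theta$ with $C\theta \subseteq D$. The following is only a minimal sketch: the encoding of terms and literals, and the names `match_term` and `subsumes`, are assumptions made for illustration and do not come from the paper.

```python
from typing import Dict, FrozenSet, Optional, Tuple, Union

# Hypothetical encoding (not from the paper): a variable is a str, a constant
# or compound term is a tuple (functor, arg1, ..., argn), and a literal is
# (sign, predicate, args) with sign True for a positive literal.
Term = Union[str, tuple]
Literal = Tuple[bool, str, Tuple[Term, ...]]
Clause = FrozenSet[Literal]
Subst = Dict[str, Term]


def match_term(pattern: Term, target: Term, theta: Subst) -> Optional[Subst]:
    """Extend theta so that pattern*theta == target, or return None."""
    if isinstance(pattern, str):  # pattern is a variable
        if pattern in theta:
            return theta if theta[pattern] == target else None
        return {**theta, pattern: target}
    if (isinstance(target, str) or pattern[0] != target[0]
            or len(pattern) != len(target)):
        return None
    for p_arg, t_arg in zip(pattern[1:], target[1:]):
        theta = match_term(p_arg, t_arg, theta)
        if theta is None:
            return None
    return theta


def subsumes(c: Clause, d: Clause) -> bool:
    """True if some substitution theta maps every literal of c into d."""
    def search(remaining, theta: Subst) -> bool:
        if not remaining:
            return True
        (sign, pred, args), rest = remaining[0], remaining[1:]
        for d_sign, d_pred, d_args in d:
            if (sign, pred, len(args)) != (d_sign, d_pred, len(d_args)):
                continue
            extended: Optional[Subst] = theta
            for p_arg, t_arg in zip(args, d_args):
                extended = match_term(p_arg, t_arg, extended)
                if extended is None:
                    break
            if extended is not None and search(rest, extended):
                return True
        return False
    return search(list(c), {})


def f(t: Term) -> Term:
    return ("f", t)


a, b = ("a",), ("b",)
E = frozenset({(True, "p", (f("x"),)), (False, "p", ("y",))})   # p(f(x)) <- p(y)
C = frozenset({(True, "p", (f(a),)), (False, "p", (a,))})       # p(f(a)) <- p(a)
D = frozenset({(True, "p", (f(f(b)),)), (False, "p", (b,))})    # p(f(f(b))) <- p(b)
print(subsumes(E, C), subsumes(E, D), subsumes(C, D))
```

The search is exponential in the worst case, which is consistent with the remark that the framework is computationally expensive; the sketch makes no attempt at optimization.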
Definition A clause $C$ implies a clause $D$, denoted $C \Rightarrow D$, if and only if every model for $C$ is a model for $D$ ($\{C\} \models D$). Two clauses $C$ and $D$ are equivalent under implication, denoted $C \Leftrightarrow D$, if and only if $C \Rightarrow D$ and $D \Rightarrow C$.
Definition A clause $C$ is a generalization under implication of a set of clauses $S = \{D_1, \ldots, D_n\}$ if and only if, for every $1 \le i \le n$, $C \Rightarrow D_i$. A generalization under implication $C$ of $S$ is a least general generalization under implication (LGGI) of $S$ if and only if, for every generalization under implication $C'$ of $S$, $C' \Rightarrow C$.
Example Consider the following clauses: $C = (p(f(a)) \leftarrow p(a))$; $D = (p(f^2(b)) \leftarrow p(b))$; $E = (p(f(x)) \leftarrow p(y))$; and $F = (p(f(z)) \leftarrow p(z))$.
Let $C$ and $D$ be clauses, and $T$ a term set of $\{D\}$ by $\sigma$ w.r.t. $\{C\}$. Then $C$ T-implies $D$ w.r.t. $T$, denoted $C \Rightarrow_T D$, if and only if $I(C, T) \models D\sigma$. Two clauses $C$ and $D$ are equivalent under T-implication w.r.t. $T'$, denoted $C \Leftrightarrow_{T'} D$, if and only if $C \Rightarrow_{T'} D$ and $D \Rightarrow_{T'} C$, where $T'$ is a term set of $\{C, D\}$.
Note that the definition of T-implication is independent of the choice of the Skolem substitution $\sigma$. In the following, if we say that a clause $C$ T-implies a clause $D$ without explicitly stating $T$, we mean that $C$ T-implies $D$ w.r.t. a minimal term set of $\{D\}$. Note that if $C$ T-implies $D$ w.r.t. a minimal term set of $\{D\}$ then $C$ T-implies $D$ w.r.t. any term set of $\{D\}$.
Example Consider the following clauses $C$ and $D$, substitution $\sigma$, set of terms $T$ and set of clauses $I(C, T)$: $C = (p(f(x)) \leftarrow p(x))$; $D = (p(f^2(y)) \leftarrow p(y))$; $\sigma = \{y/a\}$; $T = \{a, f(a), f^2(a)\}$; and $I(C, T) = \{(p(f(a)) \leftarrow p(a)), (p(f^2(a)) \leftarrow p(f(a))), (p(f^3(a)) \leftarrow p(f^2(a)))\}$. Then $T$ is a minimal term set of $\{D\}$ by $\sigma$ w.r.t. $\{C\}$, and $I(C, T)$ is the instance set of $C$ w.r.t. $T$. We have that $I(C, T) \models D\sigma$ and thus $C \Rightarrow_T D$. Note that $C \Rightarrow D$, and that $C \not\preceq D$.
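The instance set in the example above can be reproduced mechanically. The sketch below assumes, as the example suggests, that $I(C, T)$ consists of all instances of $C$ obtained by replacing every variable of $C$ with a term from $T$; the term encoding and the names `term_vars`, `apply_subst` and `instance_set` are hypothetical.

```python
from itertools import product

# Hypothetical encoding (not from the paper): a variable is a str, a constant
# or compound term is a tuple (functor, arg1, ..., argn), and a literal is
# (sign, predicate, args) with sign True for a positive literal.


def term_vars(term):
    """Return the set of variable names occurring in a term."""
    if isinstance(term, str):
        return {term}
    return set().union(*(term_vars(arg) for arg in term[1:]))


def apply_subst(term, theta):
    """Replace variables in a term according to the substitution theta."""
    if isinstance(term, str):
        return theta.get(term, term)
    return (term[0],) + tuple(apply_subst(arg, theta) for arg in term[1:])


def instance_set(clause, terms):
    """All instances of clause whose variables are replaced by members of terms."""
    names = sorted(set().union(
        *(term_vars(arg) for _, _, args in clause for arg in args)))
    instances = set()
    for combo in product(terms, repeat=len(names)):
        theta = dict(zip(names, combo))
        instances.add(frozenset(
            (sign, pred, tuple(apply_subst(arg, theta) for arg in args))
            for sign, pred, args in clause))
    return instances


def f(t, n=1):
    """Build f^n(t)."""
    for _ in range(n):
        t = ("f", t)
    return t


a = ("a",)
C = frozenset({(True, "p", (f("x"),)), (False, "p", ("x",))})  # p(f(x)) <- p(x)
T = [a, f(a), f(a, 2)]
I_C_T = instance_set(C, T)
```

With $k$ variables in $C$ the set has up to $|T|^k$ members, so it stays finite for any finite term set, which is what makes the later decidability argument possible.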
Let $C = (A_1, \ldots, A_m \leftarrow B_1, \ldots, B_n)$ be a clause, $T$ a set of clauses, and $\sigma = \{x_1/a_1, \ldots, x_k/a_k\}$ a Skolem substitution for $C$ w.r.t. $T$. Then the set of ground unit clauses $\{(\leftarrow A_1)\sigma, \ldots, (\leftarrow A_m)\sigma, (B_1 \leftarrow)\sigma, \ldots, (B_n \leftarrow)\sigma\}$ is the complement $\overline{C}$ of $C$ by $\sigma$ w.r.t. $T$.
Theorem 5 (Herbrand's theorem) A set of clauses $S$ is unsatisfiable if and only if there exists a finite unsatisfiable set $S'$ of ground instances of clauses in $S$.
Corollary 6 (Relationship between implication and T-implication) Let $C$ and $D$ be clauses. Then: a) if $C \Rightarrow_T D$ for some term set $T$ of $\{D\}$ then $C \Rightarrow D$, and b) if $C \Rightarrow D$ then there exists a term set $T$ of $\{D\}$ such that $C \Rightarrow_T D$.
Proof: a) If $C \Rightarrow_T D$ then $I(C, T) \models D\sigma$, where $T$ is a term set of $\{D\}$ by $\sigma$ w.r.t. $\{C\}$. Hence, $I(C, T) \cup \overline{D} \models \bot$, where $\overline{D}$ is the complement of $D$ by $\sigma$. By Theorem 5, $\{C\} \cup \overline{D} \models \bot$, and thus $C \Rightarrow D$. b) If $C \Rightarrow D$ then $\{C\} \cup \overline{D} \models \bot$, where $\overline{D}$ is the complement of $D$ by $\sigma$ w.r.t. $\{C\}$. Then by Theorem 5, there exists a term set $T$ of $\{D\}$ such that $I(C, T) \cup \overline{D} \models \bot$, and thus $I(C, T) \models D\sigma$. Then by definition $C \Rightarrow_T D$. $\Box$
It follows from Corollary 6 that T-implication can become an arbitrarily good approximation of implication by extending the considered term set. T-implication is a strictly stronger relation between clauses than implication. The following example illustrates that if a clause $C$ implies a clause $D$ then $C$ does not necessarily T-imply $D$.
Example Consider the following clauses $C$, $D$ and $E$, and set of terms $T$: $C = (p(f(x), y) \leftarrow p(z, x))$; $D = (p(f(x), y) \leftarrow p(z, w))$; $E = (p(f(a), a) \leftarrow p(a, f(a)))$; and $T = \{a, f(a)\}$. Then $C \Rightarrow E$ since $D$ is a resolvent of $C$ resolved with itself and $E$ is an instance of $D$. The set of terms $T$ is a minimal term set of $E$. We do not show here the whole set $I(C, T)$, but just point out that $I(C, T) \not\models E$ and thus $C \not\Rightarrow_T E$. However, if we extend $T$ to $T' = \{a, f(a), f^2(a)\}$ then $I(C, T') \models E$, and thus $C \Rightarrow_{T'} E$.
Below we show that if a clause $C$ $\theta$-subsumes a clause $D$ then $C$ also T-implies $D$.
Thus, T-implication is a strictly weaker relation between clauses than $\theta$-subsumption. We also show decidability of T-implication between clauses.
Proposition 7 Let $C$ and $D$ be clauses and $T$ a term set of $\{D\}$. If $C \preceq D$ then $C \Rightarrow_T D$.
Proof: If $C \preceq D$ then there exists a substitution $\theta$ such that $C\theta \subseteq D$. Let $T$ be a term set of $\{D\}$ by $\sigma$ w.r.t. $\{C\}$. Then we have $C\theta\sigma \in I(C, T)$. We also have $C\theta\sigma \subseteq D\sigma$, and thus $C\theta\sigma \preceq D\sigma$. Then by Proposition 3, $C\theta\sigma \Rightarrow D\sigma$ ($\{C\theta\sigma\} \models D\sigma$). Consequently $I(C, T) \models D\sigma$, and then by definition $C \Rightarrow_T D$. $\Box$
Theorem 8 (Decidability of T-implication between clauses) Let $C$ and $D$ be clauses and $T$ a term set of $\{D\}$. Then there exists a procedure to decide if $C \Rightarrow_T D$.
Proof: By the definition of T-implication, $C \Rightarrow_T D$ if and only if $I(C, T) \models D\sigma$, where $T$ is a term set of $\{D\}$ by $\sigma$ w.r.t. $\{C\}$. We have that $I(C, T)$ is a set of ground clauses and $D\sigma$ is a ground clause. Thus, it follows from the decidability of logical consequence in propositional logic that T-implication is decidable. $\Box$
2.4 Generalization under T-implication
A least general generalization under T-implication (LGGT) is defined similarly to an LGG and an LGGI.
Definition Let $C$ be a clause, $S = \{D_1, \ldots, D_n\}$ a set of clauses, and $T$ a term set of $S$ w.r.t. $\{C\}$. Then $C$ is a generalization under T-implication of $S$ w.r.t. $T$ if and only if, for every $1 \le i \le n$, $C \Rightarrow_T D_i$. A generalization under T-implication $C$ of $S$ w.r.t. $T$ is a least general generalization under T-implication (LGGT) of $S$ w.r.t. $T$ if and only if, for every generalization under T-implication $C'$ of $S$ w.r.t. $T$, $C' \Rightarrow_{T'} C$, where $T'$ is a minimal term set of $C$.
Example Consider the following clauses: $C = (p(f(a)) \leftarrow p(a))$; $D = (p(f^2(b)) \leftarrow p(b))$; $E = (p(f(x)) \leftarrow p(y))$; and $F = (p(f(z)) \leftarrow p(z))$.
Let $C$ be an LGGT of a set of clauses $S$ w.r.t. a term set $T$. Then $C$ is a complete LGGT of $S$ w.r.t. $T$ if and only if, for every generalization under T-implication $C'$ of $S$ w.r.t. $T$, $C' \preceq C$.
If $C$ is a clause then we let $C^+$ denote the set of positive literals in $C$, and $C^-$ the set of negative literals in $C$.
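The decidability argument of Theorem 8 rests on the fact that consequence between ground clauses is a propositional problem. A minimal sketch of such a check, by brute-force enumeration of truth assignments, is given below; the string encoding of ground atoms and the names `entails` and `clause` are assumptions for illustration only, and the instance set is written out by hand from the earlier example with $C = (p(f(x)) \leftarrow p(x))$ and $D\sigma = (p(f^2(a)) \leftarrow p(a))$.

```python
from itertools import product

# Hypothetical encoding (not from the paper): a ground atom is a plain string
# and a ground clause is a frozenset of (sign, atom) pairs.


def entails(premises, goal):
    """Decide whether every interpretation satisfying premises satisfies goal."""
    premises = list(premises)
    atoms = sorted({atom for cl in premises + [goal] for _, atom in cl})
    for values in product([False, True], repeat=len(atoms)):
        interp = dict(zip(atoms, values))

        def holds(cl):
            # A clause holds if at least one of its literals is true.
            return any(interp[atom] == sign for sign, atom in cl)

        if all(holds(cl) for cl in premises) and not holds(goal):
            return False  # found a countermodel
    return True


def clause(head, *body):
    """Build the ground clause head <- body_1, ..., body_n."""
    return frozenset({(True, head)} | {(False, atom) for atom in body})


# I(C, T) for T = {a, f(a), f^2(a)}, copied from the example in the text.
instances = [
    clause("p(f(a))", "p(a)"),
    clause("p(f2(a))", "p(f(a))"),
    clause("p(f3(a))", "p(f2(a))"),
]
d_sigma = clause("p(f2(a))", "p(a)")
c_t_implies_d = entails(instances, d_sigma)
```

Enumeration is exponential in the number of ground atoms; any propositional satisfiability procedure could be substituted without changing the result.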
The following proposition has been proved by Gottlob (1987, page 110).
Proposition 9 Let $C = C^+ \cup C^-$ be a clause and $D = D^+ \cup D^-$ a non-tautological clause. If $C \Rightarrow D$ then $C^+ \preceq D^+$ and $C^- \preceq D^-$.
t. T. $\Box$
Lemma 12 Let $C$, $D$ and $E$ be clauses such that $C$ and $D$ have no variables in common, and let $T$ be a term set of $\{E\}$ by $\sigma$ w.r.t. $\{C, D\}$. If $C \Rightarrow_T E$ and $D \Rightarrow_T E$ then $C \cup D \Rightarrow_T E$.
Proof: If $C \Rightarrow_T E$ and $D \Rightarrow_T E$ then by definition $I(C, T) \models E\sigma$ and $I(D, T) \models E\sigma$. Let $I(C, T) = \{C_1, \ldots, C_n\}$ and $I(D, T) = \{D_1, \ldots, D_m\}$. Then $I(C \cup D, T) = \{C_i \cup D_j \mid 1 \le i \le n \text{ and } 1 \le j \le m\}$. Let $I$ be a model for $I(C \cup D, T)$. Then, for every $1 \le i \le n$ and $1 \le j \le m$, $I$ is a model for $C_i \cup D_j$. Hence if $I$ is not a model for $C_i$ for some $1 \le i \le n$ then $I$ must be a model for $D_j$ for every $1 \le j \le m$. Then it follows that either $I$ is a model for $I(C, T)$ or $I$ is a model for $I(D, T)$, and thus $I$ is a model for $E\sigma$. Consequently, $I(C \cup D, T) \models E\sigma$, and then by definition $C \cup D \Rightarrow_T E$. $\Box$
generalization under T-implication of $S$ w.r.t. $T$. Then, by Lemma 12, $F$ is a generalization under T-implication of $S$ w.r.t. $T$. Consequently, $F$ is a complete LGGT of $S$ w.r.t. $T$. $\Box$
Theorem 14 (Existence of LGGTs) Let $S$ be a finite set of clauses, and $T$ a term set of $S$. Then there exists an LGGT of $S$ w.r.t. $T$.
Proof: Let $D$ be a tautology and $\sigma$ a Skolem substitution for $D$. Then $\top \models D\sigma$, and thus for every clause $C$, $C \Rightarrow_T D$. If every clause in $S$ is a tautology, then every clause is a generalization under T-implication of $S$ w.r.t. $T$, and every tautology is an LGGT of $S$ w.r.t. $T$. Let $S'$ be the set of clauses obtained from $S$ by removing all tautologies. It is clear that every generalization under T-implication of $S$ w.r.t. $T$ also is a generalization under T-implication of $S'$ w.r.t. $T$. By Theorem 13, there exists an LGGT of $S'$ w.r.t. $T$. Consequently there exists an LGGT of $S$.
$\Box$
A clause $C$ is ambivalent if and only if there exist a positive literal $A \in C$ and a negative literal $B \in C$ such that $A$ and $B$ have the same predicate symbol.
Example The clause $C = (p(f^2(a)) \leftarrow q(b), p(a))$ is ambivalent since $p(f^2(a))$ and $\neg p(a)$ have the same predicate symbol. However, $C$ is not recursive since neither $p(a)$ nor $q(b)$ is unifiable with a variant of $p(f^2(a))$.
Proposition 15 Let $C$ be a clause and $D$ a non-ambivalent clause. Then $C \Rightarrow D$ if and only if $C \preceq D$.
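Checking whether a clause is ambivalent only requires comparing predicate symbols, which makes the test for the special case of Proposition 15 cheap. A minimal sketch follows; the literal encoding and the name `is_ambivalent` are assumptions for illustration.

```python
# Hypothetical encoding (not from the paper): a literal is
# (sign, predicate, args) with sign True for a positive literal; terms are
# tuples (functor, arg1, ..., argn).


def is_ambivalent(clause):
    """True if some predicate symbol occurs both positively and negatively."""
    positive = {pred for sign, pred, _ in clause if sign}
    negative = {pred for sign, pred, _ in clause if not sign}
    return bool(positive & negative)


a, b = ("a",), ("b",)
# p(f(f(a))) <- q(b), p(a), the ambivalent clause from the example above.
C = frozenset({
    (True, "p", (("f", ("f", a)),)),
    (False, "q", (b,)),
    (False, "p", (a,)),
})
# p(a) <- q(b), which is not ambivalent.
D = frozenset({(True, "p", (a,)), (False, "q", (b,))})
print(is_ambivalent(C), is_ambivalent(D))
```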
Definition A clause $R$ is a resolvent of two clauses $C$ and $D$ if and only if there are $C'$, $D'$, $A$, $B$ and $\theta$ such that: a) $C'$ is a factor of $C$ and $D'$ is a factor of $D$, b) $C'$ and $D'$ have no variables in common, c) $A$ is a literal in $C'$ and $B$ is a literal in $D'$, d) $\theta$ is an mgu of $\{A, \overline{B}\}$, and e) $R$ is the clause $((C' - \{A\}) \cup (D' - \{B\}))\theta$. The clauses $C$ and $D$ are called parent clauses of $R$.
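The resolvent construction can be sketched with a standard unification routine. The code below is a simplified illustration under stated assumptions: it skips factoring (it takes $C' = C$ and $D' = D$), it expects the caller to have renamed the clauses apart, and the encoding and the names `unify`, `resolve_term` and `resolvent` are hypothetical rather than taken from the paper.

```python
# Hypothetical encoding (not from the paper): a variable is a str, a constant
# or compound term is a tuple (functor, arg1, ..., argn), and a literal is
# (sign, predicate, args) with sign True for a positive literal.


def walk(term, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while isinstance(term, str) and term in subst:
        term = subst[term]
    return term


def occurs(var, term, subst):
    term = walk(term, subst)
    if isinstance(term, str):
        return term == var
    return any(occurs(var, arg, subst) for arg in term[1:])


def unify(t1, t2, subst):
    """Return an extension of subst unifying t1 and t2, or None."""
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if t1 == t2:
        return subst
    if isinstance(t1, str):
        return None if occurs(t1, t2, subst) else {**subst, t1: t2}
    if isinstance(t2, str):
        return None if occurs(t2, t1, subst) else {**subst, t2: t1}
    if t1[0] != t2[0] or len(t1) != len(t2):
        return None
    for a1, a2 in zip(t1[1:], t2[1:]):
        subst = unify(a1, a2, subst)
        if subst is None:
            return None
    return subst


def resolve_term(term, subst):
    """Apply subst exhaustively to a term."""
    term = walk(term, subst)
    if isinstance(term, str):
        return term
    return (term[0],) + tuple(resolve_term(arg, subst) for arg in term[1:])


def resolvent(c, d, lit_c, lit_d):
    """Resolve c and d on lit_c and lit_d (no factoring); None if impossible."""
    (sign_c, pred_c, args_c), (sign_d, pred_d, args_d) = lit_c, lit_d
    if sign_c == sign_d or pred_c != pred_d or len(args_c) != len(args_d):
        return None
    subst = {}
    for a1, a2 in zip(args_c, args_d):
        subst = unify(a1, a2, subst)
        if subst is None:
            return None
    rest = (c - {lit_c}) | (d - {lit_d})
    return frozenset(
        (sign, pred, tuple(resolve_term(arg, subst) for arg in args))
        for sign, pred, args in rest)


# C = p(f(x)) <- p(x), resolved with a renamed copy of itself.
C = frozenset({(True, "p", (("f", "x"),)), (False, "p", ("x",))})
C_copy = frozenset({(True, "p", (("f", "x1"),)), (False, "p", ("x1",))})
D = resolvent(C, C_copy, (True, "p", (("f", "x"),)), (False, "p", ("x1",)))
```

Here `D` encodes $p(f^2(x)) \leftarrow p(x)$, obtained by resolving $C$ with itself once.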

Example
Consider the following clauses: $C = (p(f(x)) \leftarrow p(x))$; $D = (p(f^2(x)) \leftarrow p(x))$; $E = (p(f^3(x)) \leftarrow p(x))$; $F = (p(f^2(a)) \leftarrow p(a), p(b))$; and $G = (p(x) \leftarrow p(b))$. The clause $C$ is a second root of $D$, and a third root of $E$. The clause $C$ is also an indirect second root of $F$, since $C$ is a second root of $D$ and $D$ $\theta$-subsumes $F$. In fact $C$ is a proper indirect second root of $F$, since $C \not\preceq F$. For every $n \ge 1$, the clause $G$ is an indirect $n$th root of itself, but none of these indirect roots is a proper indirect root.
inherited both from $C$ and $D$. Then we have $C = R \cup \{A\}$ and $D = R \cup \{\overline{A}\}$, where $A$ could be any literal, and we say that $C$ and $D$ are obtained from $R$ by or-introduction of the literal $A$.
Definition Let $C$ be a clause and $\Sigma$ a sequence of literals. Then a set of clauses $S$ is or-introduced from $C$ by $\Sigma$ if and only if either: a) $S = \{C\}$ and $\Sigma = [\,]$, or b) $S = (S' - \{D\}) \cup \{D \cup \{L\}, D \cup \{\overline{L}\}\}$ and $\Sigma = [L_1, \ldots, L_n, L]$, where $S'$ is a set of clauses or-introduced from $C$ by $[L_1, \ldots, L_n]$ and $D \in S'$.
Example Consider the following clauses: $C = (p(f^3(a)) \leftarrow p(a))$; $D_1 = (p(f^3(a)), p(f^2(a)) \leftarrow p(a))$; $D_2 = (p(f^3(a)) \leftarrow p(f^2(a)), p(a))$; $E_1 = (p(f^3(a)), p(f^2(a)), p(f(a)) \leftarrow p(a))$; and $E_2 = (p(f^3(a)), p(f^2(a)) \leftarrow p(f(a)), p(a))$.
Induction step ($n = k+1$): By the definition of $n$th resolution layer, $D_{k+1}$ is a resolvent of some clauses $D_m \in \mathcal{L}^m(T)$ and $D_p \in \mathcal{L}^p(T)$ such that $m + p = k$, $1 \le m \le k$ and $1 \le p \le k$. Then by Proposition 19, there exists a literal $L$ such that $D_m \preceq D_{k+1} \cup \{L\}$ and $D_p \preceq D_{k+1} \cup \{\overline{L}\}$. By the induction hypothesis, there exists a set of clauses $S_m$ or-introduced from $D_m$ by some sequence of literals $\Sigma_m = [A_1, \ldots, A_{m-1}]$ such that for each $E \in S_m$ there exists a clause $C \in T$ such that $C \preceq E$. Then by Lemma 21, there exists a set of clauses $S'_m$ or-introduced from $D_{k+1} \cup \{L\}$ by some sequence of literals $\Sigma'_m = [A'_1, \ldots, A'_{m-1}]$ such that for each $E' \in S'_m$ there exists a clause $C \in T$ such that $C \preceq E'$. By the induction hypothesis, there also exists a set of clauses $S_p$ or-introduced from $D_p$ by some sequence of literals $\Sigma_p = [B_1, \ldots, B_{p-1}]$ such that for each $E \in S_p$ there exists a clause $C \in T$ such that $C \preceq E$. Then by Lemma 21, there exists a set of clauses $S'_p$ or-introduced from $D_{k+1} \cup \{\overline{L}\}$ by some sequence of literals $\Sigma'_p = [B'_1, \ldots, B'_{p-1}]$ such that for each $E' \in S'_p$ there exists a clause $C \in T$ such that $C \preceq E'$.
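The recursive definition of or-introduction given above can be read as a sequence of single steps, each of which replaces one clause $D$ of the current set by the pair $D \cup \{L\}$ and $D \cup \{\overline{L}\}$. The following minimal sketch performs one such step; the encoding and the names `complement`, `or_introduce`, `p` and `f` are hypothetical. It replays the clauses $C$, $D_1$, $D_2$, $E_1$ and $E_2$ from the example.

```python
# Hypothetical encoding (not from the paper): a literal is
# (sign, predicate, args) with sign True for a positive literal; a clause is
# a frozenset of literals and a set of clauses is a frozenset of clauses.


def complement(literal):
    """Flip the sign of a literal."""
    sign, pred, args = literal
    return (not sign, pred, args)


def or_introduce(clauses, target, literal):
    """One or-introduction step: split target on literal and its complement."""
    if target not in clauses:
        raise ValueError("target must belong to the current set of clauses")
    return (clauses - {target}) | {
        target | {literal},
        target | {complement(literal)},
    }


def f(t, n=1):
    """Build f^n(t)."""
    for _ in range(n):
        t = ("f", t)
    return t


def p(t, positive=True):
    return (positive, "p", (t,))


a = ("a",)
C = frozenset({p(f(a, 3)), p(a, False)})          # p(f^3(a)) <- p(a)
D1 = C | {p(f(a, 2))}
D2 = C | {p(f(a, 2), False)}
E1 = D1 | {p(f(a))}
E2 = D1 | {p(f(a), False)}

step1 = or_introduce(frozenset({C}), C, p(f(a, 2)))   # by [p(f^2(a))]
step2 = or_introduce(step1, D1, p(f(a)))              # by [p(f^2(a)), p(f(a))]
```

Which clause of the current set is split at each step is a free choice in the definition; the sketch leaves that choice to the caller through the `target` argument.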
Let $D$ be a clause and $\Sigma$ a sequence of literals. Then a clause $E$ is an expansion of $D$ by $\Sigma$ if and only if $E$ is an LGG of a set of clauses or-introduced from $D$ by $\Sigma$.
Example Consider the following clauses: $C = (p(f(x)) \leftarrow p(x))$; $D = (p(f^3(a)) \leftarrow p(a))$; $D_1 = (p(f^3(a)) \leftarrow p(f^2(a)), p(a))$; $D_2 = (p(f^3(a)), p(f^2(a)), p(f(a)) \leftarrow p(a))$; $D_3 = (p(f^3(a)), p(f^2(a)) \leftarrow p(f(a)), p(a))$; and $E = (p(f(x)), p(f^3(a)) \leftarrow p(a), p(x))$.
have $C \Rightarrow D$ and $C \not\preceq D$, but for the expansion $E$ of $D$ we have $C \preceq E$.
Expansion can be regarded as a transformation technique, since the expansion of a clause is logically equivalent to the clause itself.
Theorem 23 (Equivalence preservation of expansion) Let $D$ be a clause, and $E$ an expansion of $D$. Then $E \Leftrightarrow D$.
Let $D$ be a clause, $E$ an expansion of $D$ and $T$ a term set of $\{D\}$. Then $E$ is a T-complete expansion of $D$ w.r.t. $T$ if and only if, for every clause $C$, $C \preceq E$ whenever $C \Rightarrow_T D$.
Example Consider the following clauses: $C_1 = (p(f(x)) \leftarrow p(x))$; $C_2 = (p(f^2(y)) \leftarrow p(y))$; $D = (p(f^4(a)) \leftarrow p(a))$; $E_1 = (p(f^2(y)), p(f^4(a)) \leftarrow p(a), p(y))$; and $E_2 = (p(f(x)), p(f^2(y)), p(f^4(a)) \leftarrow p(a), p(y), p(x))$. The clauses $C_1$ and $C_2$ are proper indirect roots of $D$, such that $C_1 \Rightarrow_T D$ and $C_2 \Rightarrow_T D$. The clause $E_1$ is an expansion of $D$ by $[p(f^2(a))]$, and $E_2$ is an expansion of $D$ by $[p(f^2(a)), p(f^3(a)), p(f(a))]$. The expansion $E_2$ is a T-complete expansion but $E_1$ is not.
In the example above the T-complete expansion $E_2$ of $D$ is also a complete expansion of $D$. However, in contrast to complete expansions, T-complete expansions exist for all non-tautological clauses.
Theorem 26 (Existence of T-complete expansions) Let $D$ be a non-tautological clause and $T$ a term set of $\{D\}$. Then there exists a T-complete expansion $E$ of $D$ w.r.t. $T$.
Proof: By Theorem 13, there exists a complete LGGT $F$ of $\{D\}$ w.r.t. $T$. Hence, for every clause $C$, if $C \Rightarrow_T D$ then $C \preceq F$. By the definition of a complete LGGT, we have $F \Rightarrow_T D$, and then by Corollary 6, $F \Rightarrow D$. By Theorem 24, there exists an expansion $E$ of $D$ such that $F \preceq E$. Thus, for every clause $C$, if $C \Rightarrow_T D$ then $C \preceq E$, and consequently $E$ is a T-complete expansion of $D$ w.r.t. $T$. $\Box$