First-Order Inductive Learner (FOIL) Algorithm

Prerequisites: Predicates and Quantifiers, Learn-One-Rule, Sequential Covering Algorithm

Before getting into the FOIL algorithm, let us look at what first-order rules mean and at the terminology involved.

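First-Order Logic:

All expressions in first-order logic are built from the following elements:

Constants: e.g. tyler, 23, a
Variables: e.g. A, B, C
Predicate symbols: e.g. male, father (take only True or False values)
Function symbols: e.g. age (can take any constant as its value)
Connectives: e.g. ∧, ∨, ¬, →, ←
Quantifiers: e.g. ∀, ∃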

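Term: any constant, any variable, or any function applied to terms. e.g. age(bob)

Literal: any predicate, or negated predicate, applied to terms. e.g. female(sue), father(X, Y)

Three kinds of literal are distinguished:

Ground literal: a literal that contains no variables. e.g. female(sue)
Positive literal: a literal that does not contain a negated predicate. e.g. female(sue)
Negative literal: a literal that contains a negated predicate. e.g. ¬father(X, Y)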
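To make these definitions concrete, here is a minimal Python sketch of a literal. The class name `Literal`, its fields, and the convention that variables start with an uppercase letter while constants do not are assumptions made for illustration only; nested function terms such as age(bob) are not modelled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Literal:
    """Hypothetical representation: a predicate applied to simple terms."""
    predicate: str
    args: Tuple[str, ...]
    negated: bool = False

    def is_ground(self) -> bool:
        # Assumed convention: a term starting with an uppercase letter
        # is a variable (X, Y); anything else (sue, 23) is a constant.
        return not any(a[:1].isupper() for a in self.args)

    def is_positive(self) -> bool:
        return not self.negated

    def __str__(self) -> str:
        sign = "¬" if self.negated else ""
        return f"{sign}{self.predicate}({', '.join(self.args)})"


female_sue = Literal("female", ("sue",))                      # ground and positive
not_father_xy = Literal("father", ("X", "Y"), negated=True)   # negative, not ground

print(female_sue, female_sue.is_ground(), female_sue.is_positive())
print(not_father_xy, not_father_xy.is_ground(), not_father_xy.is_positive())
```

Note that a literal can be ground and positive at the same time, as female(sue) is: the three kinds are not mutually exclusive.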

Clause: any disjunction of literals whose variables are universally quantified.

$M_{1} \vee \ldots \vee M_{n}$

where, 

M1, M2, ...,Mn --> literals (with variables universally quantified)
∨ --> Disjunction (logical OR)

Horn clause: a clause containing at most one positive literal. With exactly one positive literal H, it has the form

$H \vee \neg L_{1} \vee \ldots \vee \neg L_{n}$

Since

$\left(\neg L_{1} \vee \ldots \vee \neg L_{n}\right) \equiv \neg\left(L_{1} \wedge \ldots \wedge L_{n}\right)$

and

$(A \vee \neg B) \equiv(A \leftarrow B)$

the Horn clause can be written as

$H \leftarrow\left(L_{1} \wedge \ldots \wedge L_{n}\right)$

where, 
H --> Head of the Horn clause (its single positive literal)
L1,L2,...,Ln --> Literals (the body, i.e. the preconditions)

(A ← B) --> can be read as 'if B then A' [Inductive Logic]

and

∧ --> Conjunction (logical AND)
∨ --> Disjunction (logical OR)
¬ --> Logical NOT
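The equivalence between the clause form and the rule form can be checked by brute force over truth values. The sketch below does this for a body of three literals; the function names are hypothetical, and `implied_by` encodes the reading 'if B then A' as "false only when B is true and A is false".

```python
from itertools import product


def implied_by(a: bool, b: bool) -> bool:
    """A ← B: false only when B holds and A does not."""
    return not (b and not a)


def clause_form(h: bool, body) -> bool:
    """H ∨ ¬L1 ∨ ... ∨ ¬Ln"""
    return h or any(not l for l in body)


def rule_form(h: bool, body) -> bool:
    """H ← (L1 ∧ ... ∧ Ln)"""
    return implied_by(h, all(body))


# Enumerate every truth assignment of H, L1, L2, L3 (16 in total).
equivalent = all(
    clause_form(h, body) == rule_form(h, body)
    for h, *body in product([False, True], repeat=4)
)
print(equivalent)
```

This is only a check for one body length, not a proof, but it shows that both forms agree on every assignment tried.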

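First-Order Inductive Learner (FOIL)

In machine learning, the First-Order Inductive Learner (FOIL) is a rule-based learning algorithm. It is a natural extension of the SEQUENTIAL-COVERING and LEARN-ONE-RULE algorithms, and it follows a greedy approach.

Inductive Learning:

Inductive learning generalizes from observed evidence (specific examples) to a general rule, and then uses that rule to determine outcomes. It is based on inductive logic.

Fig 1: Inductive Logic

Algorithm Involved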

FOIL(Target_predicate, Predicates, Examples)

• Pos ← positive Examples
• Neg ← negative Examples
• Learned_rules ← {}
• while Pos is not empty, do
    // Learn a NewRule
    – NewRule ← the rule that predicts Target_predicate with no preconditions
    – NewRuleNeg ← Neg
    – while NewRuleNeg is not empty, do
        // Add a new literal to specialize NewRule
        1. Candidate_literals ← generate candidate new literals for NewRule, based on Predicates
        2. Best_literal ← argmax over L ∈ Candidate_literals of Foil_Gain(L, NewRule)
        3. add Best_literal to the preconditions of NewRule
        4. NewRuleNeg ← subset of NewRuleNeg that satisfies the preconditions of NewRule
    – Learned_rules ← Learned_rules + NewRule
    – Pos ← Pos − {members of Pos covered by NewRule}

• Return Learned_rules
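The two nested loops can be sketched in Python as follows. This is a deliberately simplified, propositional stand-in rather than a full first-order implementation: (attribute, value) tests play the role of literals, so every example counts as exactly one binding and t equals p1. The toy weather data, all function names, and the two early-exit guards (which the pseudocode does not have) are assumptions added for illustration.

```python
import math


def covers(rule, example):
    """A rule body is a list of (attribute, value) tests; all must hold."""
    return all(example.get(attr) == val for attr, val in rule)


def foil_gain(literal, rule, pos, neg):
    """Foil_Gain with one binding per example (propositional simplification)."""
    p0 = sum(covers(rule, e) for e in pos)
    n0 = sum(covers(rule, e) for e in neg)
    p1 = sum(covers(rule + [literal], e) for e in pos)
    n1 = sum(covers(rule + [literal], e) for e in neg)
    if p1 == 0:
        # Assumption: a literal that covers no positives is never useful.
        return float("-inf")
    return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))


def foil(pos, neg, candidates):
    learned_rules, pos = [], list(pos)
    while pos:                                    # outer loop: cover all positives
        new_rule, new_rule_neg = [], list(neg)    # most general rule, no preconditions
        while new_rule_neg:                       # inner loop: specialize the rule
            options = [c for c in candidates if c not in new_rule]
            best = max(options, default=None,
                       key=lambda c: foil_gain(c, new_rule, pos, neg))
            if best is None or foil_gain(best, new_rule, pos, neg) == float("-inf"):
                break                             # guard (not in the pseudocode)
            new_rule.append(best)
            new_rule_neg = [e for e in new_rule_neg if covers(new_rule, e)]
        if new_rule_neg:
            break                                 # guard: negatives cannot be excluded
        learned_rules.append(new_rule)
        pos = [e for e in pos if not covers(new_rule, e)]
    return learned_rules


# Hypothetical toy data, not taken from the article.
pos = [{"sky": "sunny", "wind": "weak"},
       {"sky": "sunny", "wind": "strong"},
       {"sky": "cloudy", "wind": "weak"}]
neg = [{"sky": "rainy", "wind": "strong"},
       {"sky": "cloudy", "wind": "strong"},
       {"sky": "rainy", "wind": "weak"}]
candidates = [("sky", "sunny"), ("sky", "cloudy"), ("sky", "rainy"),
              ("wind", "weak"), ("wind", "strong")]

rules = foil(pos, neg, candidates)
print(rules)
```

A real FOIL implementation would replace `covers` with a search over variable bindings against background facts, and would regenerate the candidate literals as new variables are introduced.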

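Working of the Algorithm:

In the algorithm, the inner loop generates the new best rule. Let us consider an example and follow the algorithm step by step.

Fig 2: FOIL Example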

Say we are trying to predict the target predicate GrandDaughter(x,y).
We perform the following steps: [Refer Fig 2]

Step 1 - NewRule = GrandDaughter(x,y)
Step 2 - 
      2.a - Generate the candidate_literals. 
        (Female(x), Female(y), Father(x,y), Father(y,x), 
         Father(x,z), Father(z,x), Father(y,z), Father(z,y))

      2.b - Generate the respective candidate literal negations.
        (¬Female(x), ¬Female(y), ¬Father(x,y), ¬Father(y,x), 
         ¬Father(x,z), ¬Father(z,x), ¬Father(y,z), ¬Father(z,y))

Step 3 - FOIL might greedily select Father(y,z) as the most promising literal, giving
      NewRule = GrandDaughter(x,y) ← Father(y,z) [Greedy approach]

Step 4 - FOIL now considers all the literals from the previous step as well as:
     (Female(z), Father(z,w), Father(w,z), etc.) and their negations.

Step 5 - FOIL might select Father(z,x), and on the next step Female(y), leading to
         NewRule = GrandDaughter(x,y) ← Father(y,z) ∧ Father(z,x) ∧ Female(y)

Step 6 - If this rule covers only positive examples, the search for further 
     specializations of it terminates.

FOIL now removes all positive examples covered by this new rule. 
If any positive examples remain, the outer while loop continues.
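The candidate list in Step 2.a can be reproduced with a short enumeration. The sketch below assumes, as that list suggests, that each candidate must reuse at least one variable already in the rule and that a literal's arguments are distinct. The function name and the (name, args) tuple representation are hypothetical.

```python
from itertools import product


def candidate_literals(predicates, rule_vars, new_vars):
    """predicates maps a predicate name to its arity, e.g. {"Father": 2}."""
    all_vars = list(rule_vars) + list(new_vars)
    positives = []
    for name, arity in predicates.items():
        for args in product(all_vars, repeat=arity):
            uses_existing = any(a in rule_vars for a in args)
            distinct = len(set(args)) == arity   # assumption: no Father(x,x)
            if uses_existing and distinct:
                positives.append((name, args))
    # Step 2.b: every candidate also appears in negated form.
    negatives = [("¬" + name, args) for name, args in positives]
    return positives, negatives


# Rule so far: GrandDaughter(x,y) with no preconditions; z is one new variable.
positives, negatives = candidate_literals(
    {"Female": 1, "Father": 2}, rule_vars=["x", "y"], new_vars=["z"]
)
for name, args in positives:
    print(f"{name}({','.join(args)})")
```

Female(z) is excluded here because z does not yet occur in the rule; it becomes a candidate only after a literal such as Father(y,z) has been added, which matches Step 4.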

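FOIL: Performance Evaluation Measure

The performance of a new rule is not defined by its entropy measure (as it is in the PERFORMANCE method of the Learn-One-Rule algorithm).

FOIL uses a gain measure, Foil_Gain, to decide which specialization of the rule to choose. The utility of each candidate is estimated from the number of bits required to encode all the positive bindings. [Eq. 1]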

$\operatorname{Foil\_Gain}(L, R) \equiv t\left(\log _{2} \frac{p_{1}}{p_{1}+n_{1}}-\log _{2} \frac{p_{0}}{p_{0}+n_{0}}\right)$

where, 
L is the candidate literal to add to rule R

p0 = number of positive bindings of R
n0 = number of negative bindings of R
p1 = number of positive bindings of R + L
n1 = number of negative bindings of R + L
t  = number of positive bindings of R also covered by R + L
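
As a worked illustration with made-up counts (these numbers are hypothetical and not taken from Fig 2): suppose R has p0 = 4 positive and n0 = 4 negative bindings, and adding L leaves p1 = 2 positive and n1 = 0 negative bindings, with t = 2. Then

$\operatorname{Foil\_Gain}(L, R) = 2\left(\log _{2} \frac{2}{2+0}-\log _{2} \frac{4}{4+4}\right) = 2\,(0-(-1)) = 2$

A literal that excluded nothing (p1 = 4, n1 = 4, t = 4) would instead score

$\operatorname{Foil\_Gain}(L, R) = 4\left(\log _{2} \frac{4}{4+4}-\log _{2} \frac{4}{4+4}\right) = 0$

so the measure favors literals that raise the proportion of positive bindings while still covering many of them.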

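The FOIL algorithm is another rule-based learning algorithm. It extends the Sequential Covering and Learn-One-Rule algorithms and uses a different performance measure (something other than entropy/information gain) to determine the best possible rule.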