33
Aprendizaje Supervisado de Reglas Aprendizaje Supervisado de Reglas Difusas mediante un Sistema Clasificador Evolutivo Estilo Michigan Albert Orriols-Puig 1,2 Albert Orriols Puig Jorge Casillas 2 Ester Bernadó-Mansilla 1 1 Grup de Recerca en Sistemes Intel·ligents Enginyeria i Arquitectura La Salle, Universitat Ramon Llull 2 Dept. de Ciencias de la Computación e IA Universidad de Granada

JAEM'2007: Aprendizaje Supervisado de Reglas Difusas mediante un Sistema Clasificador Evolutivo Estilo Michigan

Embed Size (px)

DESCRIPTION

 

Citation preview

Aprendizaje Supervisado de Reglas Aprendizaje Supervisado de Reglas Difusas mediante un Sistema

Clasificador Evolutivo Estilo Michigan

Albert Orriols-Puig1,2Albert Orriols PuigJorge Casillas2

Ester Bernadó-Mansilla1

1Grup de Recerca en Sistemes Intel·ligentsEnginyeria i Arquitectura La Salle, Universitat Ramon Llull

2Dept. de Ciencias de la Computación e IAUniversidad de Granada

Motivation

Michigan-style LCSs for supervised learning. Eg. UCS

Evolve online highly accurate models– Evolve online highly accurate models

– Competitive to the most-used machine learning techniques• (Bernadó et al, 03; Wilson, 02; Bacardit & Butz, 04; Butz, 06; Orriols & Bernadó, 07)

Main weakness: Intepretability of the rule sets– Continuous attributes represented with intervals: [li, ui] . Semantic-p [ i, i]

free variables

– Number of rules or classifiers• Reduction schemes (Wilson, 02; Fu & Davis, 02; Dixon et al., 03, Orriols & Bernadó, 2005)

Slide 2GRSI Enginyeria i Arquitectura la Salle

(Wilson, 02; Fu & Davis, 02; Dixon et al., 03, Orriols & Bernadó, 2005)

Motivation

Jorge’s Proposal:Let’s “fuzzify” UCS– Let’s fuzzify” UCS

• Change the rule representation to fuzzy rules

Framework on Michigan-style Learning Fuzzy-Classifier Systems (LFCS)

– (Valenzuela-Radón, 91 & 98)

– (Parodi & Bonelli, 93)

– (Furuhashi, Nakaoka & Uchikawa, 94)

– (Velasco, 98)

– (Ishibuchi, Nakashima & Murata, 99 & 05): First LFCS for pattern classification

– (Casillas, Carse & Bull, 07) Fuzzy-XCS

Slide 3GRSI Enginyeria i Arquitectura la Salle

Aim

Propose Fuzzy-UCSAccuracy based Michigan style LFCS– Accuracy-based Michigan-style LFCS

– Supervised learning scheme

– Derived from UCS (Bernadó & Garrell, 2003)

• Introduction of a linguistic fuzzy representation

• Modification of all operators that deal with rules

– We expect:We expect:• Achieve similar performance than UCS

• Higher interpretability• Higher interpretability

– Plus new opportunities:

Slide 4GRSI Enginyeria i Arquitectura la Salle

• Mine in uncertain environments

Outline

1 D i i f F UCS1. Description of Fuzzy-UCS

2 Experimental Methodology2. Experimental Methodology

3. Results3. Results

4. Conclusions

Slide 5GRSI Enginyeria i Arquitectura la Salle

Description of UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p4. Conclusions

Michigan-style LCS’s (Holland, 1975):– Derived from XCS (Wilson 1995) a reinforcement learning– Derived from XCS (Wilson, 1995), a reinforcement learning

method.

– Designed specifically for supervised learningDesigned specifically for supervised learning

Rule representation:C ti i bl t d i t l [l ]– Continuous variables represented as intervals: [li, ui]

– Eg:IF x1 Є [l1, u1] ^ x2 Є [l2, u2] … ^ xn Є[ln, nn] THEN class1

– Matching instance e: for all ei: li ≤ ei ≤ ui

IF x1 Є [l1, u1] x2 Є [l2, u2] … xn Є[ln, nn] THEN class1

– Set of parameters: Accuracy, Fitness, Numerosity, Experience, Correct set size

Slide 6GRSI Enginyeria i Arquitectura la Salle

Description of UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p4. Conclusions

Environment

M t h S t [M]P bl i t

Stream ofexamples

Population [P]

1 C A acc F num cs ts exp3 C A acc F num cs ts exp5 C A acc F num cs ts exp

Match Set [M]Problem instance+

output class

1 C A acc F num cs ts exp2 C A acc F num cs ts exp3 C A acc F num cs ts exp4 C A acc F num cs ts exp

Population [P]

Classifier

6 C A acc F num cs ts exp…

correct set4 C A acc F num cs ts exp5 C A acc F num cs ts exp6 C A acc F num cs ts exp

ClassifierParameters

UpdateMatch set generation

Correct Set [C]

correct setgeneration

ExperienceCorrectacc #

=Selection, Reproduction, mutation

Deletion 3 C A acc F num cs ts exp6 C A acc F num cs ts exp

Correct Set [C]

p

νaccFitness =Genetic AlgorithmIf there are no classfiers in [C], covering is triggered

Slide 7GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCSp y

Describe the different components1. Rule representation and matching1. Rule representation and matching

2. Learning interaction

3 Di t3. Discovery component

4. Fuzzy-UCS in test mode

Slide 8GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y

Rule representation

4. Conclusions

Rule representation– Linguistic fuzzy rules

– E.g.: IF x1 is A1 and x2 is A2 … and xn is An THEN class1

Disjunction of linguistic

All i bl h th ti

Disjunction of linguistic fuzzy terms

– All variables share the same semantics

– Example: Ai = {small, medium, large}

IF x1 is small and x2 is medium or large THEN class1

– Codification:

Slide 9GRSI Enginyeria i Arquitectura la Salle

IF [100 | 011] THEN class1

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y

How do we know if a given input is small, medium or large?

4. Conclusions

g p , g– Each linguistic term defined by a membership function

Belongs to medium with a degree of 0 8Belongs to medium with a degree of 0.8

Belongs to small with a degree of 0 2

ei

Belongs to small with a degree of 0.2

Triangular-shaped membership functions

Attribute valuemembership functions

Slide 10GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y

Matching degree uAk(e) [0,1]

4. Conclusions

g g A ( ) [ , ]

k: IF x1 is small and x2 is medium or large THEN class1

Example: (e1, e2)

0 8

0.2 0.2

0.8

T-conorm: bounded sum

e1 e2

max ( 1, 0.8 + 0.2) = 1

T-norm: productk

Slide 11GRSI Enginyeria i Arquitectura la Salle

uAk(e) = 1 * 0.2 = 0.2

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

The role of matching changes:• UCS: A rule matches or not an example (binary function)• Fuzzy-UCS: A rule matches an example with a certain degree

Slide 12GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Each classifier has the following parameters:1 Weight per class w :1. Weight per class wj:

• Soundness with which the rule predicts the class j.

• The class value is dynamic and corresponds to the class j with higher w• The class value is dynamic and corresponds to the class j with higher wj

2. Fitness:• Quality of the rule

3. Other parameters directly inherited from UCS:• numerosity

• Experience

Slide 13GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCSp y

Describe the different components1. Rule representation and matching1. Rule representation and matching

2. Learning interaction

3 Di t3. Discovery component

4. Fuzzy-UCS in test mode

Slide 14GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Learning interaction:– The environment provides an example e and its class c

– Match set creation: all classifiers that match with uAk(x) > 0

– Correct set creation: all classifiers that advocate cCorrect set creation: all classifiers that advocate c

– Covering: if there is not a classifier that maximally matches e• Create the classifier that match the input example with maximumCreate the classifier that match the input example with maximum

degree.

• Generalize the condition with probability P#

A2A1 A3For each variable:

Slide 15GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Parameters’ UpdateExperience:– Experience:

– Sum of correct matching per class j cmj:

Slide 16GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Parameters’ UpdateUse cm to update of the weights per each class:– Use cm to update of the weights per each class:

• Rule that only matches instances of class c:

• wc = 1

• For all the other classes j: wj = 0

• Rule that matches instances of all classes:

– Calculate the fitness

u e t at atc es sta ces o a c asses

• All weights wi ranging [0, 1]

Pressuring toward rules that correctly match instances of

only one classonly one class

Slide 17GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCSp y

Describe the different components1. Rule representation and matching1. Rule representation and matching

2. Learning interaction

3 Di t3. Discovery component

4. Fuzzy-UCS in test mode

Slide 18GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Discovery componentSteady state niched GA– Steady-state niched GA

– Roulette wheel selectionInstances that have a highergmatching degree have more

opportunities of being selected

Slide 19GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Discovery componentCrossover and mutation applied on the antecedent– Crossover and mutation applied on the antecedent

• 2 point crossoverIF [100 | 011] THEN classIF [100 | 011] THEN class1IF [101 | 100] THEN class1

• Mutation:

– Expansion IF [100 | 011] THEN class1 IF [101 | 011] THEN class1p

– Contraction

[ | ] 1 [ | ] 1

IF [100 | 011] THEN class1 IF [100 | 001] THEN class1

– Shift IF [100 | 011] THEN class1 IF [010 | 011] THEN class1

Slide 20GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCSp y

Describe the different components1. Rule representation and matching1. Rule representation and matching

2. Learning interaction

3 Di t3. Discovery component

4. Fuzzy-UCS in test mode

Slide 21GRSI Enginyeria i Arquitectura la Salle

Description of Fuzzy-UCS1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p y4. Conclusions

Class inference of a test example e– Combining the information of all rules yields better results than

taking a single rule for reasoning (Cordon et al. 1998)

• Inference:

– All experienced rules vote for the class they predict as: uAk(e) · Fk

– The most voted class is returned.

Slide 22GRSI Enginyeria i Arquitectura la Salle

Outline

1 D i i f F UCS1. Description of Fuzzy-UCS

2 Experimental Methodology2. Experimental Methodology

3. Results3. Results

4. Conclusions

Slide 23GRSI Enginyeria i Arquitectura la Salle

Experimental Methodology1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p gy4. Conclusions

Evaluating Fuzzy-UCS’ performance– Compare Fuzzy-UCS’ accuracy to:– Compare Fuzzy-UCS accuracy to:

• Three non-fuzzy learners: UCS, SMO, and C4.5

• Two fuzzy learners: Fuzzy LogitBoost and Fuzzy GP• Two fuzzy learners: Fuzzy LogitBoost and Fuzzy GP

– Default configuration for all methods

F UCS fi ti– Fuzzy-UCS configuration:iter = 100,000, N = 6400, F0 = 0.99, v=10, {θGA, θdel, θsub} = 50,x =0.8, u=0.04, P#=0.6x 0.8, u 0.04, P# 0.6

– Fuzzy learners: 5 linguistic labels per variable

10 fold cross validation– 10-fold cross-validation

– Averages over 10 runs

Slide 24GRSI Enginyeria i Arquitectura la Salle

Experimental Methodology1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i

p gy4. Conclusions

Data domains

#Inst #Fea #Re #In #No #Cl %Min %Max %MisAtt

annealing 898 38 6 0 32 5 0.9 76.2 0

balance 625 4 4 0 0 3 7.8 46.1 0

bupa 345 6 6 0 0 2 42 58 0

glass 214 9 9 0 0 6 4 2 35 5 oglass 214 9 9 0 0 6 4.2 35.5 o

heart-c 303 13 6 0 7 2 45,5 54.5 15,4

heart-s 270 13 13 0 0 2 44.4 56.6 0

iris 150 4 4 0 0 3 33.3 33.3 0

wbcd 699 9 0 9 0 2 34.5 65.5 11,1

wine 178 13 13 0 0 3 27 39 9 0wine 178 13 13 0 0 3 27 39.9 0

zoo 101 17 0 1 16 7 4 40.6 0

Slide 25GRSI Enginyeria i Arquitectura la Salle

Outline

1 D i i f F UCS1. Description of Fuzzy-UCS

2 Experimental Methodology2. Experimental Methodology

3. Results3. Results

4. Conclusions

Slide 26GRSI Enginyeria i Arquitectura la Salle

Results1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i4. Conclusions

• 1st objective: Competitive in terms of performance

Slide 27GRSI Enginyeria i Arquitectura la Salle

Results1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i4. Conclusions

• 2nd objective: Improve the interpretability

Example of rules evolved by UCS for iris

• 2nd objective: Improve the interpretability

Example of rules evolved by UCS for iris

Example of rules evolved by Fuzzy-UCS for iris – Linguistic terms: {XS, S, M, L, XL}

Slide 28GRSI Enginyeria i Arquitectura la Salle

Further work1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i4. Conclusions

Still large rule-sets!

Fuzzy UCS UCSFuzzy-UCS UCS

annealing 2769 4494b l 1212 2177balance 1212 2177bupa 1440 2961glass 2799 3359glass 2799 3359heart-c 3574 2977heart-s 2415 3735iris 480 1039wbcd 3130 2334wine 3686 3685zoo 773 1291

Slide 29GRSI Enginyeria i Arquitectura la Salle

Solution: New inference schemes

Further work1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i4. Conclusions

Still large rule-sets!

Fuzzy-UCS Fuzzy UCS UCSybest rule Fuzzy-UCS UCS

annealing 36 2769 4494b l 75 1212 2177balance 75 1212 2177bupa 39 1440 2961glass 36 2799 3359glass 36 2799 3359heart-c 46 3574 2977heart-s 62 2415 3735iris 7 480 1039wbcd 28 3130 2334wine 26 3686 3685zoo 10 773 1291

Slide 30GRSI Enginyeria i Arquitectura la Salle

Solution: New inference schemes

Outline

1 D i i f F UCS1. Description of Fuzzy-UCS

2 Experimental Methodology2. Experimental Methodology

3. Results3. Results

4. Conclusions

Slide 31GRSI Enginyeria i Arquitectura la Salle

Conclusions and Further Work1. Description of Fuzzy-UCS2. Experimental Methodology3. Results4 C l i4. Conclusions

Conclusions– We proposed a Michigan-style LFCS for supervised learning– Competitive with respect to:

• Some of the most-used machine learners: UCS, SMO, and C4.5• Recent proposals of Fuzzy-learners: Fuzzy LogitBoost and Fuzzy GP

– Improvement in terms of interpretability with respect to UCS

Further workEvolve more reduced populations– Evolve more reduced populations

– Enhance the comparison with new real-world problems– Compare to other LFCS– Compare to other LFCS– Exploit the incremental learning approach to dig large datasets

Slide 32GRSI Enginyeria i Arquitectura la Salle

Aprendizaje Supervisado de Reglas Aprendizaje Supervisado de Reglas Difusas mediante un Sistema

Clasificador Evolutivo Estilo Michigan

Albert Orriols-Puig1,2Albert Orriols PuigJorge Casillas2

Ester Bernadó-Mansilla1

1Grup de Recerca en Sistemes Intel·ligentsEnginyeria i Arquitectura La Salle, Universitat Ramon Llull

2Dept. de Ciencias de la Computación e IAUniversidad of Granada