
Principal Component Analysis

Courtesy: University of Louisville, CVIP Lab


    PCA

    PCA is

    A backbone of modern data analysis.

A black box that is widely used but poorly understood.

OK, let's dispel the magic behind this black box.


    PCA - Overview

    It is a mathematical tool from applied linear

    algebra.

It is a simple, non-parametric method of extracting relevant information from confusing data sets.

It provides a roadmap for how to reduce a complex data set to a lower dimension.


    What do we need under our BELT?!!!

Basics of statistical measures, e.g. variance and covariance.

    Basics of linear algebra:

- Matrices

- Vector space

- Basis

- Eigenvectors and eigenvalues


    Variance

A measure of the spread of the data in a data set with mean X̄.

    Variance is claimed to be the original statistical

    measure of spread of data.



Covariance

Variance - a measure of the deviation from the mean for points in one dimension, e.g., heights.

Covariance - a measure of how much each of the dimensions varies from the mean with respect to each other.

Covariance is measured between 2 dimensions to see if there is a relationship between the 2 dimensions, e.g., number of hours studied and grade obtained.

The covariance between one dimension and itself is the variance:

var(X) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})}{n - 1}

cov(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}
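As a quick check of these formulas, here is a minimal numpy sketch; the hours/marks numbers below are made up for illustration, not the data behind the 104.53 value mentioned on the next slide.

```python
import numpy as np

def sample_var(x):
    """Sample variance with the (n - 1) denominator, as in the formula above."""
    x = np.asarray(x, dtype=float)
    xbar = x.mean()
    return np.sum((x - xbar) * (x - xbar)) / (len(x) - 1)

def sample_cov(x, y):
    """Sample covariance between two dimensions, (n - 1) denominator."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

# Hypothetical data: hours studied vs. marks obtained
hours = [9, 15, 25, 14, 10, 18, 0, 16, 5, 19]
marks = [39, 56, 93, 61, 50, 75, 32, 85, 42, 70]
print(sample_var(hours))          # matches np.var(hours, ddof=1)
print(sample_cov(hours, marks))   # matches np.cov(hours, marks)[0, 1]
```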


    Covariance

    What is the interpretation of covariance calculations?

    Say you have a 2-dimensional data set

    -X: number of hours studied for a subject

    -Y: marks obtained in that subject

And assume the covariance value (between X and Y) is: 104.53

    What does this value mean?



Covariance

The exact value is not as important as its sign.

A positive value of covariance indicates that both dimensions increase or decrease together, e.g., as the number of hours studied increases, the grades in that subject also increase.

A negative value indicates that while one increases the other decreases, or vice-versa, e.g., active social life vs. performance in ECE Dept.

If the covariance is zero, the two dimensions are uncorrelated with each other, e.g., heights of students vs. grades obtained in a subject.


    Covariance

Why bother with calculating (expensive) covariance when we could just plot the 2 values to see their relationship?

Covariance calculations are used to find relationships between dimensions in high-dimensional data sets (usually greater than 3) where visualization is difficult.



    Covariance Matrix

Representing covariance among dimensions as a matrix, e.g., for 3 dimensions:

Properties:

- Diagonal: variances of the variables

- cov(X,Y) = cov(Y,X), hence the matrix is symmetrical about the diagonal (upper triangular)

- m-dimensional data will result in an m × m covariance matrix
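For instance, with numpy and some hypothetical 3-dimensional data, the full covariance matrix can be formed in one call and shows exactly these properties:

```python
import numpy as np

# Hypothetical data: 3 dimensions (rows) x 10 samples (columns)
rng = np.random.default_rng(0)
data = rng.normal(size=(3, 10))

# np.cov treats each row as one variable and uses the (n - 1) denominator
C = np.cov(data)
print(C.shape)                      # (3, 3): m x m for m = 3 dimensions
print(np.allclose(C, C.T))          # True: symmetric about the diagonal
print(np.allclose(np.diag(C),       # diagonal entries are the variances
                  np.var(data, axis=1, ddof=1)))
```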


    Transformation Matrices

Scale the vector (3,2) by a value of 2 to get (6,4), then multiply by the square transformation matrix. We see that the result is still scaled by 4.

WHY?

A vector consists of both length and direction. Scaling a vector only changes its length and not its direction. This is an important observation in the transformation of matrices, leading to the formation of eigenvectors and eigenvalues.

Irrespective of how much we scale (3,2) by, the solution (under the given transformation matrix) is always a multiple of 4.
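The transformation matrix itself appears only as a figure on the slide; a commonly used stand-in with exactly this behavior is A = [[2, 3], [2, 1]], for which (3, 2) is an eigenvector with eigenvalue 4:

```python
import numpy as np

# Hypothetical stand-in for the slide's transformation matrix (not shown in the text):
# (3, 2) is an eigenvector of A with eigenvalue 4.
A = np.array([[2.0, 3.0],
              [2.0, 1.0]])

v = np.array([3.0, 2.0])
print(A @ v)            # [12.  8.]  -> 4 * (3, 2)
print(A @ (2 * v))      # [24. 16.]  -> still 4 * (6, 4): scaling v does not change this
print(A @ (10 * v))     # [120. 80.] -> any multiple of (3, 2) is again scaled by 4
```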


    Eigenvalue Problem

The eigenvalue problem is any problem having the following form:

A · v = λ · v

A: m × m matrix
v: m × 1 non-zero vector
λ: scalar

Any value of λ for which this equation has a solution is called an eigenvalue of A, and the vector v which corresponds to this value is called an eigenvector of A.


    Calculating Eigenvectors & Eigenvalues

Simple matrix algebra shows that:

A · v = λ · v
A · v - λ · I · v = 0
(A - λ · I) · v = 0

Finding the roots of |A - λ · I| will give the eigenvalues, and for each of these eigenvalues there will be an eigenvector.
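In practice the eigenvalues and eigenvectors are computed numerically; a minimal numpy sketch, reusing the hypothetical matrix from the earlier example, is:

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [2.0, 1.0]])      # hypothetical matrix from the earlier sketch

# np.linalg.eig returns the eigenvalues and unit-length eigenvectors (as columns)
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)              # e.g. [ 4. -1.] (order is not guaranteed)

# Check A . v = lambda . v for each eigenvalue/eigenvector pair
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ v, lam * v))   # True, True
```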



    Example of a problem

We collected m parameters about 100 students:

    -Height

    -Weight

    -Hair color

    -Average grade

    -

We want to find the most important parameters that best describe a student.


    Example of a problem

Each student has a vector of data which describes him, of length m:

- (180, 70, purple, 84, ...)

We have n = 100 such vectors. Let's put them in one matrix, where each column is one student vector.

So we have an m × n matrix. This will be the input of our problem.


    Example of a problem

Every student is a vector that lies in an m-dimensional vector space spanned by an orthonormal basis.

All data/measurement vectors in this space are a linear combination of this set of unit-length basis vectors.


    Questions

How do we describe the most important features using math?

- Variance

How do we represent our data so that the most important features can be extracted easily?

- Change of basis


    Redundancy

Multiple sensors record the same dynamic information.

Consider a range of possible plots between two arbitrary measurement types r1 and r2.

Panel (a) depicts two recordings with no redundancy, i.e. they are uncorrelated, e.g. a person's height and his GPA.

However, in panel (c) both recordings appear to be strongly related, i.e. one can be expressed in terms of the other.


    Covariance Matrix

S_X = \frac{1}{n-1} X X^T, where X is the m × n matrix of zero-mean measurements (each row is one measurement type).

The ij-th element of S_X is the dot product between the vector of the i-th measurement type and the vector of the j-th measurement type.

- S_X is a square symmetric m × m matrix.

- The diagonal terms of S_X are the variances of particular measurement types.

- The off-diagonal terms of S_X are the covariances between measurement types.


Covariance Matrix

Computing S_X quantifies the correlations between all possible pairs of measurements. Between one pair of measurements, a large covariance corresponds to a situation like panel (c), while zero covariance corresponds to entirely uncorrelated data as in panel (a).
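A minimal numpy sketch of this computation (hypothetical data; rows are measurement types, columns are samples, consistent with the (n - 1) denominator used earlier):

```python
import numpy as np

# Hypothetical data: m = 2 measurement types, n = 10 samples (columns)
rng = np.random.default_rng(1)
X = rng.normal(size=(2, 10))

# Subtract the mean of each measurement type (each row)
X = X - X.mean(axis=1, keepdims=True)

n = X.shape[1]
S_X = (X @ X.T) / (n - 1)             # m x m matrix of dot products between row vectors

print(np.allclose(S_X, np.cov(X)))    # True: matches numpy's built-in covariance
```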


    PCA Process - STEP 1

Subtract the mean from each of the dimensions.

This produces a data set whose mean is zero.

Subtracting the mean makes the variance and covariance calculation easier by simplifying their equations.

The variance and covariance values are not affected by the mean value.

Suppose we have two measurement types X1 and X2, hence m = 2, and ten samples each, hence n = 10.
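Steps 2 and 3 (forming the covariance matrix and computing its eigenvectors, shown on the original slides as worked figures) are included here for continuity; a minimal numpy sketch of Step 1 onward, with hypothetical X1/X2 values rather than the slide's data, might look like:

```python
import numpy as np

# Hypothetical samples for the two measurement types X1 and X2 (m = 2, n = 10)
rng = np.random.default_rng(2)
X1 = rng.normal(loc=2.0, scale=1.0, size=10)
X2 = 0.8 * X1 + rng.normal(scale=0.3, size=10)   # correlated with X1 on purpose

# STEP 1: subtract the mean from each dimension -> zero-mean data, shape (m, n)
data = np.vstack([X1 - X1.mean(), X2 - X2.mean()])

# STEP 2: covariance matrix of the zero-mean data
S = np.cov(data)

# STEP 3: eigenvalues and eigenvectors of the covariance matrix
eigenvalues, eigenvectors = np.linalg.eig(S)
print(eigenvalues)       # two eigenvalues
print(eigenvectors)      # columns are the corresponding unit-length eigenvectors
```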


    PCA Process - STEP 4

Reduce dimensionality and form the feature vector.

The eigenvector with the highest eigenvalue is the principal component of the data set.

In our example, the eigenvector with the largest eigenvalue is the one that points down the middle of the data.

Once eigenvectors are found from the covariance matrix, the next step is to order them by eigenvalue, highest to lowest. This gives the components in order of significance.


    PCA Process - STEP 4

Now, if you'd like, you can decide to ignore the components of lesser significance.

You do lose some information, but if the eigenvalues are small, you don't lose much:

- m dimensions in your data

- calculate m eigenvectors and eigenvalues

- choose only the first r eigenvectors

- final data set has only r dimensions.
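Continuing the earlier sketch, ordering the eigenvectors and keeping only the first r of them might look like this (r = 1 here purely for illustration):

```python
import numpy as np

# Rebuild the small running example so this snippet runs on its own
rng = np.random.default_rng(2)
X1 = rng.normal(loc=2.0, scale=1.0, size=10)
X2 = 0.8 * X1 + rng.normal(scale=0.3, size=10)
data = np.vstack([X1 - X1.mean(), X2 - X2.mean()])
eigenvalues, eigenvectors = np.linalg.eig(np.cov(data))

# STEP 4: order components by eigenvalue, highest first, and keep the first r
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

r = 1                                   # e.g. keep a single principal component
feature_vector = eigenvectors[:, :r]    # shape (m, r): chosen eigenvectors as columns
print(feature_vector)
```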


    PCA Process - STEP 4

When the eigenvalues λ_i are sorted in descending order, the proportion of variance explained by the first r principal components is:

\frac{\sum_{i=1}^{r} \lambda_i}{\sum_{i=1}^{m} \lambda_i} = \frac{\lambda_1 + \lambda_2 + \cdots + \lambda_r}{\lambda_1 + \lambda_2 + \cdots + \lambda_p + \cdots + \lambda_m}

If the dimensions are highly correlated, there will be a small number of eigenvectors with large eigenvalues and r will be much smaller than m.

If the dimensions are not correlated, r will be as large as m and PCA does not help.
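As a quick numerical check, the same ratio can be computed directly from sorted eigenvalues (made-up values below):

```python
import numpy as np

# Hypothetical sorted eigenvalues of some covariance matrix (descending order)
eigenvalues = np.array([4.2, 0.9, 0.3, 0.1])

r = 2
explained = eigenvalues[:r].sum() / eigenvalues.sum()
print(f"first {r} components explain {explained:.1%} of the variance")
```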


    PCA Process - STEP 5

FinalData is the final data set, with data items in columns and dimensions along rows.

What does this give us?

The original data solely in terms of the vectors we chose.

We have changed our data from being in terms of the axes X1 and X2 to now being in terms of our 2 eigenvectors.
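The projection itself (FinalData = RowFeatureVector × RowZeroMeanData, as recalled on the reconstruction slide below) is a single matrix product; continuing the running sketch with hypothetical data:

```python
import numpy as np

# Rebuild the small running example so this snippet runs on its own
rng = np.random.default_rng(2)
X1 = rng.normal(loc=2.0, scale=1.0, size=10)
X2 = 0.8 * X1 + rng.normal(scale=0.3, size=10)
row_zero_mean_data = np.vstack([X1 - X1.mean(), X2 - X2.mean()])   # shape (m, n)

eigenvalues, eigenvectors = np.linalg.eig(np.cov(row_zero_mean_data))
order = np.argsort(eigenvalues)[::-1]
row_feature_vector = eigenvectors[:, order].T    # chosen eigenvectors as rows

# STEP 5: FinalData = RowFeatureVector x RowZeroMeanData
final_data = row_feature_vector @ row_zero_mean_data
print(final_data.shape)     # (2, 10): data items in columns, new dimensions in rows
```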


    PCA Process - STEP 5

FinalData (transpose: dimensions along columns)

newX1          newX2
0.827870186    0.175115307
1.77758033     0.142857227
0.992197494    0.384374989
0.274210416    0.130417207
1.67580142     0.209498461
0.912949103    0.175282444
0.0991094375   0.349824698
1.14457216     0.0464172582
0.438046137    0.0177646297
1.22382956     0.162675287


    Reconstruction of Original Data

Recall that: FinalData = RowFeatureVector × RowZeroMeanData

Then:

RowZeroMeanData = RowFeatureVector⁻¹ × FinalData

And thus:

RowOriginalData = (RowFeatureVector⁻¹ × FinalData) + OriginalMean

If we use unit eigenvectors, the inverse is the same as the transpose (hence, easier).
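Continuing the running sketch, the reconstruction (with unit eigenvectors, so the inverse is just the transpose) is another matrix product:

```python
import numpy as np

# Rebuild the running example so this snippet is self-contained
rng = np.random.default_rng(2)
X1 = rng.normal(loc=2.0, scale=1.0, size=10)
X2 = 0.8 * X1 + rng.normal(scale=0.3, size=10)
original = np.vstack([X1, X2])
original_mean = original.mean(axis=1, keepdims=True)
row_zero_mean_data = original - original_mean

eigenvalues, eigenvectors = np.linalg.eig(np.cov(row_zero_mean_data))
row_feature_vector = eigenvectors[:, np.argsort(eigenvalues)[::-1]].T
final_data = row_feature_vector @ row_zero_mean_data

# Unit eigenvectors: inverse == transpose, so reconstruction is just another product
row_original_data = row_feature_vector.T @ final_data + original_mean
print(np.allclose(row_original_data, original))   # True when all eigenvectors are kept
```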



    Next

We will practice with each other how to use this powerful mathematical tool in one of the computer vision applications, 2D face recognition. So stay excited!