
Sunday, 22 January 2023

PRINCIPAL COMPONENT ANALYSIS (PCA)

PCA: Dimension reduction analysis


-> It is a technique for feature extraction from a given data set.


-> A data set with N features has N principal components, one for each feature (dimension) of the data.


-> In many data sets, most of the variance (for example, 95% or more) is captured by the first few principal components, and sometimes by the first one alone.


-> Therefore, we keep only the first n of the N principal components; the value of n is determined by the precision (the share of variance retained) we are aiming for.


-> So, PCA reduces the N original features of the data to n principal components, where N >> n.


-> Consequently, this method is also known as dimension reduction analysis.


-> Example: Consider the data sets X and Y.


            X     =     1,     2,     3,     4,     5,     6,     7,     8,     9,     10.

            Y     =     1,     4,     9,    16,   25,   36,   49,   64,    81,  100.

Here, the number of features = 2  and the number of samples = 10.


The steps for computing PCA are given as follows:

Step 1:    Generate the covariance matrix for datasets X and Y. 


\begin{align}A_{(2 \times 2)} = \begin{bmatrix} Cov(X, X) & Cov(X, Y) \\ Cov(Y, X) & Cov(Y, Y) \end{bmatrix}\end{align}


\begin{align}Cov{(X, Y)}=\sum_{i=1}^N\frac{(x_i-\mu_X)(y_i-\mu_Y)}{N}\end{align}

where \mu_X and \mu_Y are the means of the data sets X and Y respectively, and N in this formula is the number of samples (here N = 10).



\begin{align}A_{(2 \times 2)} = \begin{bmatrix} 8.25 & 90.75 \\ 90.75 & 1051.05 \end{bmatrix}\end{align}
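
The covariance matrix can be cross-checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name population_cov is only illustrative and is not part of the original notes. It uses the 1/N normalisation of the formula above.

```python
import numpy as np

# Data from the example: X = 1..10 and Y = X squared.
x = np.arange(1, 11, dtype=float)
y = x ** 2


def population_cov(a, b):
    """Covariance with the 1/N normalisation used in the formula above."""
    return np.mean((a - a.mean()) * (b - b.mean()))


# Build the 2 x 2 covariance matrix entry by entry.
A = np.array([
    [population_cov(x, x), population_cov(x, y)],
    [population_cov(y, x), population_cov(y, y)],
])

# np.cov divides by N - 1 by default; bias=True switches it to 1/N.
A_numpy = np.cov(np.vstack([x, y]), bias=True)

print(A)
print(np.allclose(A, A_numpy))
```

Note that np.cov without bias=True would give slightly larger values, because it divides by N - 1 instead of N.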


Step 2:    Generate the characteristic equation by using the covariance matrix A_{(2 \times 2)}.


    Note: \det(A_{(2 \times 2)} - \lambda I) = 0 is the characteristic equation, where I is the identity (unit) matrix.


\begin{align}\det \begin{bmatrix} 8.25 - \lambda & 90.75 \\ 90.75 & 1051.05 - \lambda \end{bmatrix} = 0\end{align}


   \implies (8.25 - \lambda) (1051.05 - \lambda) - 90.75 * 90.75 = 0



\implies  \lambda^{2} - 1059.3 \lambda + 435.6 = 0    .  . .   (1)                  

\implies \lambda_{1}=1058.89,      \lambda_{2}=0.411375 .  

        Here \lambda_{1} and \lambda_{2} are the eigenvalues of the matrix A_{(2 \times 2)}.
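
As a quick check of the roots of equation (1), here is a small sketch, assuming NumPy; the variable names are only illustrative.

```python
import numpy as np

# Coefficients of equation (1): lambda^2 - 1059.3 * lambda + 435.6 = 0
coefficients = [1.0, -1059.3, 435.6]

# np.roots returns the roots of the polynomial; sort them largest first.
eigenvalues = np.sort(np.roots(coefficients).real)[::-1]
lambda_1, lambda_2 = eigenvalues

print(lambda_1, lambda_2)
```

The two roots should agree with the values of \lambda_{1} and \lambda_{2} quoted above, up to rounding.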

The first principal component is the direction of the eigenvector belonging to the largest eigenvalue, the second principal component belongs to the second-largest eigenvalue, and so on.
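
The ordering by eigenvalue is also what decides how many components to keep. The sketch below, assuming NumPy, computes the share of the total variance carried by each component; the 95% target is only an example threshold, not a value fixed by the method.

```python
import numpy as np

# Eigenvalues from Step 2, largest first.
eigenvalues = np.array([1058.89, 0.411375])

# Share of the total variance carried by each principal component.
explained = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained)

# Example target (an assumption): keep enough components for 95% of the variance.
target = 0.95
n_components = int(np.searchsorted(cumulative, target) + 1)

print(explained)
print(n_components)
```

For this data set the first component alone carries more than 99.9% of the variance, so one component is enough for the chosen target.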


Step 3:    Compute the eigenvectors corresponding to the eigenvalues.


        (A_{(2 \times 2)} - \lambda_{i} I) U_{i} = 0   . . . (2)


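        When \lambda_{1} = 1058.89, the equation (A_{(2 \times 2)} - \lambda_{1} I) U_{1} = 0 becomes

\begin{align}\begin{bmatrix} -1050.64 & 90.75 \\ 90.75 & -7.84 \end{bmatrix} \begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}\end{align}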


Now, equating the entries on both sides, we get:

-1050.64 * u_{1} + 90.75 * u_{2} = 0      . .  . (3)
and      90.75 * u_{1} - 7.84 * u_{2} = 0      ... (4)

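Equations (3) and (4) describe the same direction (up to rounding), so either one gives the eigenvector up to a scale factor k:

\begin{align}\begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix} = \begin{bmatrix} 90.75 k \\ 1050.64 k \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 7.84 k \\ 90.75 k \end{bmatrix}\end{align}

where k is any non-zero constant; here we take k = 1.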

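        When \lambda_{2} = 0.411375, the equation (A_{(2 \times 2)} - \lambda_{2} I) U_{2} = 0 becomes

\begin{align}\begin{bmatrix} 7.838625 & 90.75 \\ 90.75 & 1050.638625 \end{bmatrix} \begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}\end{align}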


Now, equating the entries on both sides, we get:

7.838625 * u_{1} + 90.75 * u_{2} = 0      .   .   .  (5)
and   
  90.75 * u_{1} + 1050.638625 * u_{2} = 0 . . . (6)

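Equations (5) and (6) describe the same direction (up to rounding), so either one gives the eigenvector up to a scale factor k:

\begin{align}\begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix} = \begin{bmatrix} 90.75 k \\ -7.84 k \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 1050.64 k \\ -90.75 k \end{bmatrix}\end{align}

where k is any non-zero constant; here we take k = 1.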
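
The hand-computed eigenvectors can be compared with a library routine. This is a sketch, assuming NumPy; the helper is_parallel and its tolerance are illustrative choices. A library returns unit-length vectors whose sign is arbitrary, so the comparison checks only the direction.

```python
import numpy as np

A = np.array([[8.25, 90.75],
              [90.75, 1051.05]])

# eigh is intended for symmetric matrices; it returns the eigenvalues in
# ascending order and the eigenvectors as the columns of the second result.
values, vectors = np.linalg.eigh(A)
u_small, u_large = vectors[:, 0], vectors[:, 1]

# Hand-computed directions from Step 3 with k = 1.
hand_large = np.array([90.75, 1050.64])   # for lambda_1 = 1058.89
hand_small = np.array([90.75, -7.84])     # for lambda_2 = 0.411375


def is_parallel(u, v, tol=1e-3):
    """True when u and v lie on the same line (2-D cross product near zero)."""
    cross = u[0] * v[1] - u[1] * v[0]
    return abs(cross) <= tol * np.linalg.norm(u) * np.linalg.norm(v)


print(is_parallel(u_large, hand_large))
print(is_parallel(u_small, hand_small))
```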

Step 4:    Compute the normalized eigenvectors.
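
A minimal sketch of this step, assuming that "normalized" means scaled to unit Euclidean length (the usual convention) and that NumPy is available. The function name normalize is only illustrative.

```python
import numpy as np

# Unnormalized eigenvectors from Step 3 with k = 1.
u_first = np.array([90.75, 1050.64])   # for lambda_1 = 1058.89
u_second = np.array([90.75, -7.84])    # for lambda_2 = 0.411375


def normalize(u):
    """Scale u to unit length by dividing it by its Euclidean norm."""
    return u / np.linalg.norm(u)


e_first = normalize(u_first)
e_second = normalize(u_second)

print(e_first)
print(e_second)
```

After this step both vectors have length 1, and because the covariance matrix is symmetric they are perpendicular to each other up to rounding.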








