
Sunday, 22 January 2023

PRINCIPAL COMPONENT ANALYSIS (PCA)

(PCA): Dimension-reduction analysis


-> It is a technique for feature extraction from a given data set.


-> For a data set described by N features, there are N principal components.


-> The first principal component captures the largest share of the variance in the data; for strongly correlated features, the first few components often account for around 95% of it.


-> Therefore, we select only the first n principal components; the number n is chosen according to the precision (fraction of variance) we are aiming for.


-> So, PCA reduces a data set described by N features to one described by n principal components, where N >> n.


-> Consequently, this method is also called dimension-reduction analysis (see the short sketch below).
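As a quick illustration of this idea, the following sketch (an addition, not part of the original derivation) uses NumPy and scikit-learn's PCA class; the synthetic data set, the 0.95 variance threshold, and the variable names are assumptions chosen only for the example.

import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data set: 100 samples with N = 10 correlated features,
# built from 2 underlying signals plus a small amount of noise.
rng = np.random.default_rng(seed=42)
base = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 10))
data = base @ mixing + 0.01 * rng.normal(size=(100, 10))

# Keep just enough principal components to explain 95% of the variance.
pca = PCA(n_components=0.95)
reduced = pca.fit_transform(data)

print(data.shape, "->", reduced.shape)      # far fewer than 10 columns remain
print(pca.explained_variance_ratio_)        # variance captured by each kept component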


-> Example: consider the following data sets X and Y.


            X     =     1,     2,     3,     4,     5,     6,     7,     8,     9,     10.

            Y     =     1,     4,     9,    16,   25,   36,   49,   64,    81,  100.

Here, the number of features = 2  and the number of samples = 10.


The steps for computing PCA are given as follows:

Step 1:    Generate the covariance matrix for datasets X and Y. 


\begin{align}A_{(2 \times 2)}=\begin{bmatrix} Cov(X, X) & Cov(X, Y) \\ Cov(Y, X) & Cov(Y, Y) \end{bmatrix}\end{align}


\begin{align}Cov{(X, Y)}=\sum_{i=1}^N\frac{(x_i-\mu_X)(y_i-\mu_Y)}{N}\end{align}

where \mu_X and \mu_Y are the means of the data sets X and Y, respectively.



\begin{align}A_{(2 \times 2)}=\begin{bmatrix} 8.25 & 90.75 \\ 90.75 & 1051.05 \end{bmatrix}\end{align}
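As a quick numerical check (an added sketch, not part of the original post), the same population covariance matrix, i.e. the formula above with the 1/N factor, can be computed with NumPy:

import numpy as np

x = np.arange(1, 11)          # 1, 2, ..., 10
y = x ** 2                    # 1, 4, ..., 100

# bias=True divides by N (population covariance), matching the formula above.
A = np.cov(x, y, bias=True)
print(A)                      # [[   8.25   90.75]
                              #  [  90.75 1051.05]]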


Step 2:    Generate the characteristic equation using the covariance matrix A_{(2 \times 2)}.


    Note:-  \det(A_{(2 \times 2)} - \lambda I) = 0 is the characteristic equation, where I is the identity (unit) matrix.


\begin{align}\det\begin{bmatrix} 8.25 - \lambda & 90.75 \\ 90.75 & 1051.05 - \lambda \end{bmatrix}=0\end{align}


   \implies (8.25 - \lambda) (1051.05 - \lambda) - 90.75 * 90.75 = 0



\implies  \lambda^{2} - 1059.3 \lambda + 435.6 = 0    .  . .   (1)                  

\implies \lambda_{1}=1058.89,      \lambda_{2}=0.411375 .  

        Here \lambda_{1} and \lambda_{2} are the eigenvalues of the matrix A_{(2 \times 2)}.

The first principal component is defined by the largest eigenvalue, the second principal component by the second-largest eigenvalue, and so on.
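These eigenvalues can be cross-checked numerically (again an added sketch, not from the original post), either by solving the characteristic polynomial of equation (1) or directly from the symmetric matrix A:

import numpy as np

A = np.array([[8.25, 90.75],
              [90.75, 1051.05]])

# Roots of the characteristic polynomial lambda^2 - 1059.3*lambda + 435.6 = 0 ...
print(np.roots([1.0, -1059.3, 435.6]))     # approx. [1058.89, 0.4114]

# ... agree with the eigenvalues of A.
print(np.linalg.eigvalsh(A))               # approx. [0.4114, 1058.89]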


Step 3:    Compute the eigenvectors corresponding to the eigenvalues.


        (A_{(2 X 2)} - \lambda_{i} I) U_{i} = 0   .  .  .  (2)


        When \lambda_{1} = 1058.89, equation (2) becomes

\begin{align}\begin{bmatrix} -1050.64 & 90.75 \\ 90.75 & -7.84 \end{bmatrix}\begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix}=\begin{bmatrix} 0 \\ 0 \end{bmatrix}\end{align}


Expanding the matrix product row by row, we get:

-1050.64 * u_{1} + 90.75 * u_{2} = 0      . .  . (3)
and      90.75 * u_{1} - 7.84 * u_{2} = 0      ... (4)

The Eigen Vectors corresponding to equations (3) and (4) are as follows:


\begin{align}\begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix}=\begin{bmatrix} 90.75\,k \\ 1050.64\,k \end{bmatrix}\quad\text{or}\quad\begin{bmatrix} 7.84\,k \\ 90.75\,k \end{bmatrix}\end{align}

where k is an arbitrary constant (take k = 1).

        When \lambda_{2} = 0.411375, equation (2) becomes

\begin{align}\begin{bmatrix} 7.838625 & 90.75 \\ 90.75 & 1050.638625 \end{bmatrix}\begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix}=\begin{bmatrix} 0 \\ 0 \end{bmatrix}\end{align}


Expanding the matrix product row by row, we get:

7.838625 * u_{1} + 90.75 * u_{2} = 0      .   .   .  (5)
and   
  90.75 * u_{1} + 1050.638625 * u_{2} = 0 . . . (6)

The Eigen Vectors corresponding to equations (5) and (6) are as follows: 


\begin{align}\begin{bmatrix} u_{1} \\ u_{2} \end{bmatrix}=\begin{bmatrix} 90.75\,k \\ -7.84\,k \end{bmatrix}\quad\text{or}\quad\begin{bmatrix} 1050.64\,k \\ -90.75\,k \end{bmatrix}\end{align}

where k is an arbitrary constant (take k = 1).
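The directions of these two eigenvectors can also be checked numerically (an added sketch, not part of the original post); np.linalg.eigh returns unit-length eigenvectors whose directions should match the hand-derived vectors above up to scaling and sign:

import numpy as np

A = np.array([[8.25, 90.75],
              [90.75, 1051.05]])

eigenvalues, eigenvectors = np.linalg.eigh(A)   # columns are unit-length eigenvectors
print(eigenvalues)                              # approx. [0.4114, 1058.89]
print(eigenvectors)

# Hand-derived (unnormalized) eigenvectors from equations (3)-(6):
u1 = np.array([90.75, 1050.64])                 # for lambda_1 = 1058.89
u2 = np.array([90.75, -7.84])                   # for lambda_2 = 0.411375
print(u1 / np.linalg.norm(u1))                  # approx. [ 0.0861, 0.9963]
print(u2 / np.linalg.norm(u2))                  # approx. [ 0.9963, -0.0861]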

Step 4:  Compute the normalized eigenvectors.
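Each eigenvector is normalized by dividing it by its length:

\begin{align}\hat{U}_{i}=\frac{U_{i}}{\lVert U_{i} \rVert},\qquad \lVert U_{i} \rVert=\sqrt{u_{1}^{2}+u_{2}^{2}}\end{align}

For \lambda_{1}, \lVert U_{1} \rVert = \sqrt{90.75^{2} + 1050.64^{2}} \approx 1054.55, giving \hat{U}_{1} \approx (0.0861,\; 0.9963); for \lambda_{2}, \lVert U_{2} \rVert \approx 91.09, giving \hat{U}_{2} \approx (0.9963,\; -0.0861).

As a final added sketch (standard practice, not shown in the original post), the centred data can then be projected onto the first normalized eigenvector to obtain the one-dimensional reduced representation:

import numpy as np

x = np.arange(1, 11)
y = x ** 2

# Centre the 2-feature data and project it onto the first principal component.
data = np.column_stack([x, y]).astype(float)
centred = data - data.mean(axis=0)

u1_hat = np.array([90.75, 1050.64]) / np.linalg.norm([90.75, 1050.64])
scores = centred @ u1_hat                       # one value per sample instead of two
print(scores)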








