PCA analysis for spectra: getting the principal component values (the reduced features)

tomy · July 27, 2024, 5:58pm

I’d like to extract reduced features from my spectra.
Each spectrum has 200 points; I need to reduce these features to a few principal components (for instance three: PCA1, PCA2, PCA3).

I prepared the code below:

    // 5 spectra
    Double_t spectra[5][200] = {
        {1.2, 0.8, 1.0, 1.5, 0.9, .... 0.7, 1.4, 0.6, 1.1, 1.3},
        {0.7, 0.9, 1.2, 0.6, 1.5, .... 1.1, 1.3, 1.4, 0.8, 0.7},
        {1.3, 1.4, 0.9, 1.1, 0.8, .... 1.0, 1.2, 0.7, 0.6, 1.5},
        {0.8, 1.2, 0.7, 1.5, 0.6, .... 1.3, 1.1, 0.9, 1.4, 0.9},
        {1.1, 1.0, 0.6, 1.4, 1.3, .... 1.5, 0.9, 0.7, 0.8, 1.2}
    };

    Int_t n = 200; // number of data points per spectrum
    Int_t m = 5;   // number of spectra

    TPrincipal* principal = new TPrincipal(n, "ND");

    for (Int_t i = 0; i < m; i++) {
        principal->AddRow(spectra[i]);
    }

    // We delete the data after use, since TPrincipal got it by now.
    //delete [] data;

    // Do the actual analysis
    principal->MakePrincipals();

But I do not know how to extract the PCA values (the principal components, i.e. the reduced features) for each spectrum.
How can I get them?

devajith · July 27, 2024, 6:24pm

Hi @tomy,

Thanks for the question.

Let me add @moneta in the loop.

Thanks,
Dev

Eddy_Offermann · July 28, 2024, 8:09am

Hi @tomy ,

Look at the following class methods.

Eigenvectors:

const TMatrixD *TPrincipal::GetEigenVectors() const

Eigenvalues (one per principal component):

const TVectorD *TPrincipal::GetEigenValues() const

-Eddy
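
Note that these two methods return the basis (the eigenvectors) and the variance along each axis (the eigenvalues), not yet the values for each spectrum. The reduced features of one observation are obtained by subtracting the mean, dividing by sigma when the "N" option is used, and projecting the result onto the leading eigenvectors. Below is a minimal sketch of that step in plain C++ without ROOT. The function name projectOntoComponents and the constants are made up for illustration, and the numbers are copied from the three-variable Print("MSVE") output posted further down in this thread. As far as I understand, TPrincipal::X2P(const Double_t *x, Double_t *p) performs this same projection inside ROOT, so treat the code only as an illustration of what it computes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Project one observation x onto the first nKeep eigenvectors.
// eigenvectors[i] holds the i-th eigenvector (the i-th column of the
// eigenvector matrix). This layout is an assumption of this sketch.
std::vector<double> projectOntoComponents(
    const std::vector<double>& x,
    const std::vector<double>& mean,
    const std::vector<double>& sigma,
    const std::vector<std::vector<double>>& eigenvectors,
    std::size_t nKeep)
{
    assert(x.size() == mean.size() && x.size() == sigma.size());
    assert(nKeep <= eigenvectors.size());

    std::vector<double> p(nKeep, 0.0);
    for (std::size_t i = 0; i < nKeep; ++i) {
        assert(eigenvectors[i].size() == x.size());
        for (std::size_t j = 0; j < x.size(); ++j) {
            // Centre and normalise variable j, then weight it by the
            // j-th entry of eigenvector i.
            p[i] += (x[j] - mean[j]) / sigma[j] * eigenvectors[i][j];
        }
    }
    return p;
}

// Illustrative values copied from the three-variable printout in this thread.
const std::vector<double> kMean  = {1.02, 1.06, 0.88};
const std::vector<double> kSigma = {0.2315, 0.2154, 0.2135};
const std::vector<std::vector<double>> kEigenvectors = {
    {-0.517169, -0.611229,  0.599113},   // Eigenvector # 0
    {-0.85241,   0.304832, -0.424823},   // Eigenvector # 1
    { 0.0770352, -0.730395, -0.678667}   // Eigenvector # 2
};
```

Calling projectOntoComponents(x, kMean, kSigma, kEigenvectors, nKeep) once per row returns nKeep numbers for that row, which are the reduced features of that observation.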

tomy · July 28, 2024, 10:02am

Thank you for the comment.

Yes, I also checked the Print() method, whose documentation explains the options:

// M Print mean values of original data
// S Print sigma values of original data
// E Print eigenvalues of covariance matrix
// V Print eigenvectors of covariance matrix

But I do not know which option shows which information.

If I set up a small example as below

    // 5 spectra
    Double_t spectra[5][10] = {
        {1.2, 0.8, 1.0, 1.5, 0.9, 0.7, 1.4, 0.6, 1.1, 1.3},
        {0.7, 0.9, 1.2, 0.6, 1.5, 1.1, 1.3, 1.4, 0.8, 0.7},
        {1.3, 1.4, 0.9, 1.1, 0.8, 1.0, 1.2, 0.7, 0.6, 1.5},
        {0.8, 1.2, 0.7, 1.5, 0.6, 1.3, 1.1, 0.9, 1.4, 0.9},
        {1.1, 1.0, 0.6, 1.4, 1.3, 1.5, 0.9, 0.7, 0.8, 1.2}
    };
    TPrincipal* principal = new TPrincipal(3, "ND");

I assumed that this would give me 3 principal components for each of the 5 spectra.

But when I call principal->Print("MSVE"), I get the output below.

I cannot get 3 PCs for the 5 spectra.

Is one of my settings wrong?

Variable #  | Mean Value |   Sigma    | Eigenvalue
-------------+------------+------------+------------
           0 |       1.02 |     0.2315 |     0.5251
           1 |       1.06 |     0.2154 |     0.2638
           2 |       0.88 |     0.2135 |     0.2112

Eigenvector # 0
Vector (3)  is as follows

     |        1  |
------------------
   0 |-0.517169
   1 |-0.611229
   2 |0.599113

Eigenvector # 1
Vector (3)  is as follows

     |        1  |
------------------
   0 |-0.85241
   1 |0.304832
   2 |-0.424823

Eigenvector # 2
Vector (3)  is as follows

     |        1  |
------------------
   0 |0.0770352
   1 |-0.730395
   2 |-0.678667

Eddy_Offermann · July 28, 2024, 11:20am

You seem to misunderstand the class. You are supplying 5 observations (data points), each with 10 variables. (Naming that 2-D array "spectra" confuses matters.) Therefore, you should do

TPrincipal* principal = new TPrincipal(10, "ND");

Please read the description of the TPrincipal class carefully.
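
To make the layout concrete: each AddRow call passes one observation, and the number given to the constructor is the number of variables per observation. The following plain C++ sketch (no ROOT; columnMeans and kSpectra are names made up here) computes the per-variable means of the 5 x 10 array posted above. The first three means come out as 1.02, 1.06 and 0.88, the same as the "Mean Value" column in the printout above, which suggests that the TPrincipal(3, "ND") run only looked at the first three columns of each row.

```cpp
#include <cassert>
#include <vector>

constexpr int kRows = 5;   // observations (one per AddRow call)
constexpr int kCols = 10;  // variables per observation

// The five rows from the small example in this thread.
const double kSpectra[kRows][kCols] = {
    {1.2, 0.8, 1.0, 1.5, 0.9, 0.7, 1.4, 0.6, 1.1, 1.3},
    {0.7, 0.9, 1.2, 0.6, 1.5, 1.1, 1.3, 1.4, 0.8, 0.7},
    {1.3, 1.4, 0.9, 1.1, 0.8, 1.0, 1.2, 0.7, 0.6, 1.5},
    {0.8, 1.2, 0.7, 1.5, 0.6, 1.3, 1.1, 0.9, 1.4, 0.9},
    {1.1, 1.0, 0.6, 1.4, 1.3, 1.5, 0.9, 0.7, 0.8, 1.2}
};

// Mean of each variable (column) over all observations (rows).
std::vector<double> columnMeans(const double data[][kCols], int nRows)
{
    assert(nRows > 0);
    std::vector<double> mean(kCols, 0.0);
    for (int i = 0; i < nRows; ++i) {
        for (int j = 0; j < kCols; ++j) {
            mean[j] += data[i][j];
        }
    }
    for (double& m : mean) {
        m /= nRows;
    }
    return mean;
}
```

With 10 variables there is one mean, one sigma and one eigenvalue per variable, so the corresponding printout has 10 rows rather than 3.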

tomy · July 28, 2024, 11:32am

Thank you for the kind comment.

In that case, I got the matrix components below.

But I still do not know which one is the first principal component for each dataset (there are 5 datasets).

Matrix components:
Component (1,1): 0.366
Component (1,2): -0.3248
Component (1,3): 0.2524
Component (1,4): -0.1448
Component (1,5): -0.3426
Component (1,6): 0.513
Component (1,7): -0.2739
Component (1,8): 0.1268
Component (1,9): -0.2131
Component (1,10): 0.3986
Component (2,1): 0.2267
Component (2,2): 0.09625
Component (2,3): 0.2401
Component (2,4): 0.7674
Component (2,5): 0.2294
Component (2,6): 0.1839
Component (2,7): 0.258
Component (2,8): 0.3472
Component (2,9): -0.1309
Component (2,10): -0.04564
Component (3,1): -0.3673
Component (3,2): -0.3902
...
...
...
Component (9,5): 0.1124
Component (9,6): 0.5117
Component (9,7): -0.3014
Component (9,8): 0.1541
Component (9,9): -0.07629
Component (9,10): -0.3269
Component (10,1): 0.3896
Component (10,2): -0.2969
Component (10,3): 0.2467
Component (10,4): -0.04195
Component (10,5): -0.1161
Component (10,6): -0.07383
Component (10,7): -0.1676
Component (10,8): 0.06008
Component (10,9): 0.4235
Component (10,10): -0.6837

Eddy_Offermann · July 28, 2024, 12:18pm

What you are printing here are the eigenvectors belonging to the eigenvalues.

I assume that you will use this principal decomposition for further analysis. In that case use the class methods I have given above.
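
For such further analysis, a common first step is to decide how many components to keep by looking at the fraction of the total variance that each eigenvalue carries. Here is a minimal sketch in plain C++, assuming the eigenvalues are sorted in decreasing order (as they are in the printout earlier in this thread). The helper names are hypothetical and not part of ROOT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Cumulative fraction of the total variance carried by the first k
// components, for k = 1 .. n. Assumes eigenvalues are sorted descending.
std::vector<double> cumulativeVarianceFraction(const std::vector<double>& eigenvalues)
{
    double total = 0.0;
    for (double v : eigenvalues) {
        total += v;
    }
    assert(total > 0.0);

    std::vector<double> cumulative;
    double running = 0.0;
    for (double v : eigenvalues) {
        running += v;
        cumulative.push_back(running / total);
    }
    return cumulative;
}

// Smallest number of leading components whose cumulative fraction
// reaches the requested threshold (for example 0.9 for 90 %).
std::size_t componentsNeeded(const std::vector<double>& eigenvalues, double threshold)
{
    const std::vector<double> c = cumulativeVarianceFraction(eigenvalues);
    for (std::size_t k = 0; k < c.size(); ++k) {
        if (c[k] >= threshold) {
            return k + 1;
        }
    }
    return c.size();
}
```

With the three eigenvalues from the earlier printout (0.5251, 0.2638, 0.2112), the first two components together carry about 79 % of the variance, so componentsNeeded returns 2 for a threshold of 0.75 and 3 for a threshold of 0.9.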
