Principal Component Analysis (PCA) is a widely used dimensionality reduction technique in machine learning and data analysis. It transforms a dataset into a new coordinate system in which the variance of the data is maximized along the new axes, referred to as principal components. In this way, PCA can reduce the dimensionality of high-dimensional data while retaining as much of the information (variance) as possible.
First, center the data by subtracting each feature’s mean from the corresponding feature values. This step ensures that the origin of the new coordinate system (the principal components) lies at the center of the data. Centered data: X_c = X - μ, where μ is the vector of feature means.
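The centering step can be sketched in NumPy as follows. This is a minimal illustration, assuming the data is stored with one sample per row and one feature per column; the helper name `center` and the toy array are hypothetical.

```python
import numpy as np

def center(X):
    """Subtract each feature's (column's) mean; return centered data and the means."""
    mu = X.mean(axis=0)   # one mean per feature
    return X - mu, mu     # broadcasting subtracts mu from every row

# Toy data (assumed for illustration): 3 samples, 2 features
X = np.array([[2.0, 0.0],
              [0.0, 2.0],
              [4.0, 4.0]])
X_c, mu = center(X)
```

After centering, every column of `X_c` has a mean of zero.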
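Next, compute the covariance matrix of the centered data. The covariance matrix shows how the features vary with one another, and the principal components are derived from it. Covariance matrix: Σ = (1/n) * (X_c^T * X_c) Where: n is the number of samples and X_c^T is the transpose of the centered data matrix X_c.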
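A short sketch of the covariance computation, using the 1/n scaling given in the formula above. The function name `covariance_matrix` and the toy data are assumptions for illustration. Note that NumPy's `np.cov` divides by n - 1 by default, so `bias=True` is needed to match the 1/n convention.

```python
import numpy as np

def covariance_matrix(X_c):
    """Covariance of centered data (rows = samples), using the 1/n scaling."""
    n = X_c.shape[0]
    return (X_c.T @ X_c) / n

# Toy data (assumed for illustration), centered first
X = np.array([[2.0, 0.0],
              [0.0, 2.0],
              [4.0, 4.0]])
X_c = X - X.mean(axis=0)
sigma = covariance_matrix(X_c)

# Cross-check against NumPy's built-in; bias=True selects the 1/n scaling
sigma_np = np.cov(X_c, rowvar=False, bias=True)
```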
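Compute the eigenvectors and eigenvalues of the covariance matrix. The eigenvectors are the principal components, while each eigenvalue gives the variance captured along its eigenvector; dividing an eigenvalue by the sum of all eigenvalues gives the proportion of total variance that component explains. Σ * V = λ * V, where V is an eigenvector and λ is its eigenvalue.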
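As a hedged sketch, the eigendecomposition can be done with `np.linalg.eigh`, which is intended for symmetric matrices such as a covariance matrix. It returns eigenvalues in ascending order and eigenvectors as the columns of the second result. The 2x2 matrix below is an assumed example.

```python
import numpy as np

# Example symmetric covariance matrix (assumed for illustration)
sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

# eigh: eigenvalues ascending, eigenvectors[:, i] pairs with eigenvalues[i]
eigenvalues, eigenvectors = np.linalg.eigh(sigma)

# Proportion of total variance explained by each component
explained_ratio = eigenvalues / eigenvalues.sum()
```

Each column satisfies Σ * V = λ * V, which can be checked by comparing `sigma @ eigenvectors` with `eigenvectors * eigenvalues`.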
Sort the eigenvalues in descending order and reorder their corresponding eigenvectors to match, so that the principal components capturing the most variance come first.
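One way to sort the eigenpairs together is to compute a single index order and apply it to both arrays. The helper name `sort_eigenpairs` and the sample values are hypothetical; the sketch assumes eigenvectors are stored as columns.

```python
import numpy as np

def sort_eigenpairs(eigenvalues, eigenvectors):
    """Sort eigenvalues descending and reorder eigenvector columns to match."""
    order = np.argsort(eigenvalues)[::-1]   # indices from largest to smallest
    return eigenvalues[order], eigenvectors[:, order]

# Sample unsorted eigenpairs (assumed for illustration)
vals = np.array([1.0, 3.0, 2.0])
vecs = np.eye(3)                            # column i pairs with vals[i]
sorted_vals, sorted_vecs = sort_eigenpairs(vals, vecs)
```

Using the same index array for both keeps every eigenvector paired with its own eigenvalue.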
Select the top k eigenvectors (principal components), those that account for most of the variance. Usually, k is chosen so that the selected components retain a large share of the total variance, such as 95% or 99%. This step reduces the dimensionality from the original feature space to the subspace spanned by the selected principal components.
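Choosing k from a variance threshold can be sketched with a cumulative sum over the sorted eigenvalues. The function name `choose_k` and the example eigenvalues are assumptions; the sketch expects eigenvalues already sorted in descending order.

```python
import numpy as np

def choose_k(sorted_eigenvalues, threshold=0.95):
    """Smallest k whose components explain at least `threshold` of total variance."""
    ratios = sorted_eigenvalues / sorted_eigenvalues.sum()
    cumulative = np.cumsum(ratios)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against floating-point rounding leaving the final sum just below 1.0
    return min(k, len(sorted_eigenvalues))

# Example eigenvalues in descending order (assumed for illustration)
eigenvalues = np.array([7.0, 2.0, 1.0])
k_95 = choose_k(eigenvalues, threshold=0.95)
k_80 = choose_k(eigenvalues, threshold=0.80)
```

Here the first two components explain about 90% of the variance, so an 80% threshold keeps two components while a 95% threshold keeps all three.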
Obtain the reduced-dimensional representation by projecting the centered data onto the chosen principal components. Transformed data: X_pca = X_c * V_k, where V_k is the matrix whose columns are the k selected eigenvectors.
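Putting the steps together, the projection can be sketched end to end as below. This is an illustrative sketch under the same assumptions as the formulas above (rows are samples, 1/n covariance scaling), not a replacement for a library implementation; the function name `pca_project` and the synthetic data are hypothetical.

```python
import numpy as np

def pca_project(X, k):
    """Project X onto its top-k principal components."""
    X_c = X - X.mean(axis=0)                           # center the data
    sigma = (X_c.T @ X_c) / X_c.shape[0]               # covariance matrix (1/n)
    eigenvalues, eigenvectors = np.linalg.eigh(sigma)  # eigenvalues ascending
    order = np.argsort(eigenvalues)[::-1]              # sort descending
    V_k = eigenvectors[:, order[:k]]                   # top-k eigenvectors as columns
    return X_c @ V_k                                   # X_pca = X_c * V_k

# Synthetic data (assumed): 100 samples, 3 features with very different spreads
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * np.array([5.0, 1.0, 0.1])
X_pca = pca_project(X, k=2)
```

The result has one row per sample and k columns. The columns are uncorrelated with each other, and the first carries the most variance.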
PCA is used for noise reduction, feature extraction, and data visualization, among other applications. It makes complex datasets easier to analyze and visualize while preserving most of the information in the data.