Understanding the Mathematics Behind Principal Component Analysis (PCA)




  • Source: Microbioz India

  • Date: 13 Nov,2023

Principal Component Analysis (PCA) is a widely used dimensionality reduction technique in machine learning and data analysis. It transforms a dataset into a new coordinate system whose axes, called principal components, are ordered so that the first captures the largest share of the data's variance, the second the next largest, and so on. PCA can therefore reduce the dimensionality of high-dimensional data while retaining as much of the variance (information) as possible.

Here is an overview of the mathematics behind Principal Component Analysis (PCA):

Data Centering:

First, subtract each feature’s mean from the corresponding feature values to center the data. This ensures that the origin of the new coordinate system (the principal components) lies at the center of the data.
Centered data: X_c = X - μ


  1. X_c: Centered data matrix
  2. X: Original data matrix
  3. μ: Mean vector of the original data
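The centering step can be sketched in a few lines of NumPy. The toy matrix below is hypothetical data chosen only for illustration, and the layout (samples in rows, features in columns) is an assumption that the rest of the formulas in this article are consistent with.

```python
import numpy as np

# Hypothetical toy data: 5 samples (rows) x 3 features (columns).
X = np.array([
    [2.5, 2.4, 1.2],
    [0.5, 0.7, 0.3],
    [2.2, 2.9, 1.1],
    [1.9, 2.2, 0.9],
    [3.1, 3.0, 1.5],
])

mu = X.mean(axis=0)  # per-feature mean vector, shape (3,)
X_c = X - mu         # broadcasting subtracts mu from every row

# After centering, each feature has a mean of zero (up to rounding).
print(np.allclose(X_c.mean(axis=0), 0.0))
```

Centering changes only the location of the data, not its spread, so the variances that PCA works with are unaffected.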


Covariance Matrix:

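Next, compute the covariance matrix of the centered data. It records how each pair of features varies together, and the principal components are derived from it. Some implementations divide by n - 1 instead of n (the unbiased sample estimate); this rescales the eigenvalues but leaves the principal components unchanged.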
Covariance matrix: Σ = (1/n) * (X_c^T * X_c)

  1. Σ: Covariance matrix
  2. n: Number of samples
  3. X_c: Centered data matrix
  4. X_c^T: Transpose of the centered data matrix
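As a rough sketch of this formula, the snippet below builds the covariance matrix by hand and compares it with NumPy's `np.cov`. The random data is hypothetical; `bias=True` is passed because `np.cov` divides by n - 1 by default, whereas the formula above divides by n.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))  # hypothetical data: 100 samples, 3 features
X_c = X - X.mean(axis=0)       # centered data
n = X_c.shape[0]               # number of samples

# Covariance matrix as in the formula: (1/n) * X_c^T * X_c, shape (3, 3).
Sigma = (X_c.T @ X_c) / n

# Cross-check against NumPy; rowvar=False means columns are features.
Sigma_np = np.cov(X_c, rowvar=False, bias=True)
print(np.allclose(Sigma, Sigma_np))
```

The result is a symmetric features-by-features matrix: diagonal entries are per-feature variances and off-diagonal entries are covariances between pairs of features.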


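Compute the eigenvectors and eigenvalues of the covariance matrix. The eigenvectors are the principal components (directions), and each eigenvalue is the variance of the data along its eigenvector. Dividing an eigenvalue by the sum of all eigenvalues gives the proportion of the total variance explained by that component.
Σ * V = V * Λ


  1. V: Matrix of eigenvectors (each column is an eigenvector)
  2. Λ: Diagonal matrix with the eigenvalues λ_i on its diagonal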
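One way to carry out this step is with `np.linalg.eigh`, which is designed for symmetric matrices such as a covariance matrix. This is a sketch on hypothetical random data, not the only possible approach. It checks that every column v_i of V satisfies Σ v_i = λ_i v_i.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))  # hypothetical data: 200 samples, 4 features
X_c = X - X.mean(axis=0)
Sigma = (X_c.T @ X_c) / X_c.shape[0]

# eigh returns real eigenvalues in ascending order and orthonormal
# eigenvectors as the columns of V.
eigvals, V = np.linalg.eigh(Sigma)

# V * eigvals scales column i of V by eigvals[i], so this compares
# Sigma v_i with lambda_i v_i for all columns at once.
print(np.allclose(Sigma @ V, V * eigvals))
```

Using `eigh` rather than the general `np.linalg.eig` avoids complex-valued results caused by rounding and guarantees mutually orthogonal eigenvectors.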

Sorting Eigenvalues:

Sort the eigenvalues in descending order and reorder their corresponding eigenvectors to match, so that the principal components explaining the most variance come first.
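A minimal sketch of the sorting step, again on hypothetical data: the key point is that the same index order must be applied to the eigenvalues and to the columns of the eigenvector matrix, otherwise the pairs get mismatched.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical data with deliberately different spread per feature.
X = rng.normal(size=(200, 4)) * np.array([3.0, 2.0, 1.0, 0.5])
X_c = X - X.mean(axis=0)
Sigma = (X_c.T @ X_c) / X_c.shape[0]
eigvals, V = np.linalg.eigh(Sigma)  # eigenvalues come back ascending

order = np.argsort(eigvals)[::-1]   # indices from largest to smallest
eigvals_sorted = eigvals[order]
V_sorted = V[:, order]              # reorder columns, not rows

print(eigvals_sorted)
```

After reordering, column i of `V_sorted` is still the eigenvector belonging to `eigvals_sorted[i]`.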

Dimensionality Reduction:

Keep the top k eigenvectors (principal components), choosing k so that they account for most of the variance. A common rule is to pick the smallest k that retains a large share of the total variance, such as 95% or 99%. This reduces the dimensionality from the original feature space to the k-dimensional subspace spanned by the selected principal components.
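The choice of k can be illustrated with a cumulative explained-variance calculation. The eigenvalues below are made-up numbers for illustration, and `choose_k` is a hypothetical helper name, not part of any library.

```python
import numpy as np

def choose_k(eigvals_sorted, threshold=0.95):
    """Smallest k whose components retain at least `threshold` of the variance.

    Assumes eigvals_sorted is already in descending order.
    """
    ratios = eigvals_sorted / eigvals_sorted.sum()  # explained variance ratios
    cumulative = np.cumsum(ratios)
    # argmax returns the first index where the condition is True.
    return int(np.argmax(cumulative >= threshold)) + 1

# Hypothetical eigenvalues, sorted in descending order (sum = 10).
eigvals_sorted = np.array([4.0, 3.0, 2.0, 0.6, 0.4])

k = choose_k(eigvals_sorted, threshold=0.95)
print(k)  # 4: the first four components explain about 96% of the variance
```

Here three components explain about 90% of the variance, which falls short of the 95% target, so a fourth is needed.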


Obtain the reduced-dimensional representation by projecting the centered data onto the chosen principal components.
Transformed data: X_pca = X_c * V_k


  1. X_pca: Reduced-dimensional data
  2. V_k: Matrix of the top k eigenvectors
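Putting the steps together, the projection can be sketched as a single function. `pca_project` is a hypothetical name used for illustration and the data is random. Library implementations such as scikit-learn's `PCA` typically use a singular value decomposition instead, so this is an illustration of the formulas above rather than a production routine.

```python
import numpy as np

def pca_project(X, k):
    """Project X (samples in rows) onto its top-k principal components."""
    X_c = X - X.mean(axis=0)                # center the data
    n = X_c.shape[0]
    Sigma = (X_c.T @ X_c) / n               # covariance matrix
    eigvals, V = np.linalg.eigh(Sigma)      # eigendecomposition (ascending)
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest
    V_k = V[:, order]                       # top-k eigenvectors as columns
    return X_c @ V_k, eigvals[order]        # X_pca and its k eigenvalues

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 5))  # hypothetical data: 150 samples, 5 features
X_pca, top_eigvals = pca_project(X, k=2)

print(X_pca.shape)  # (150, 2)
```

A useful sanity check is that the covariance matrix of `X_pca` is diagonal, with the selected eigenvalues on its diagonal: the new features are uncorrelated.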

PCA is used for noise reduction, feature extraction and data visualization, among other applications. By keeping most of the variance in fewer dimensions, it makes complex datasets easier to analyze and to visualize.
