Principal Component Analysis

In this post, we will learn about Principal Component Analysis (PCA) — a popular dimensionality reduction technique in Machine Learning. Our goal is to form an intuitive understanding of PCA without going into all the mathematical details.

At the time of writing this post, the population of the United States is roughly 325 million. You may think that millions of people will have millions of different ideas, opinions, and thoughts; after all, every person is unique. Right?

Wrong!

Humans are like sheep. We follow a herd. It’s sad but true.

Let’s say you select the top 20 political questions in the United States and ask millions of people to answer them with a yes or a no. Here are a few examples:

1. Do you support gun control?
2. Do you support a woman’s right to abortion?


And so on and so forth. Technically, you can get 2^20 = 1,048,576 different answer sets, because you have 20 questions and each question has to be answered with a yes or a no.

In practice, you will notice the set of answers people actually give is much, much smaller. In fact, you can replace the top 20 questions with a single question

“Are you a democrat or a republican?”

and predict the answers to the rest of the questions with a high degree of accuracy. So, this 20-dimensional data is compressed to a single dimension, and not much information is lost!

This is exactly what PCA allows us to do. In multi-dimensional data, it helps us find the directions that are most useful and contain the most information. In other words, it helps us extract the essential information from data by reducing the number of dimensions.

We will need some mathematical tools to understand PCA, so let’s begin with an important concept in statistics called variance.

What is variance?

The variance measures the spread of the data. In Figure 1 (a), the points have a high variance because they are spread out, but in Figure 1 (b), the points have a low variance because they are close together.
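
As a quick refresher, for n values x_1, x_2, …, x_n measured along a single axis, the sample variance is

$$\mathrm{Var}(x) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

where x̄ is the mean of the values. Roughly speaking, it is the average squared distance of the values from their mean; the larger it is, the more spread out the points are along that axis.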

Also, note that in Figure 1 (a) the variance is not the same in all directions. The direction of maximum variance is especially important. Let’s see why.

Why do we care about the direction of maximum variance?

Variance encodes the information contained in the data. For example, suppose you had 2D data represented by points with (x, y) coordinates. For n such points, you need 2n numbers to represent this data. Now consider a special case where for every data point the value along the y-axis is 0 (or constant). This is shown in Figure 2.

It is fair to say that there is no (or very little) information along the y-axis. You can compactly represent this data using n numbers to represent its value along the x-axis and only 1 common number to represent the constant along the y-axis. Because there is more variance along the x-axis, there is more information, and hence we have to use more numbers to represent this data. On the other hand, since there is no variance along the y-axis, a single number can be used to represent all information contained in n points along this axis.
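
You can check this numerically. Here is a minimal sketch using NumPy; the points below are made up purely for illustration:

```python
import numpy as np

# n 2D points: the x values vary, the y value is constant
points = np.array([[0.5, 2.0],
                   [1.7, 2.0],
                   [3.1, 2.0],
                   [4.6, 2.0],
                   [6.0, 2.0]])

# Sample variance along each axis (ddof=1)
var_x, var_y = points.var(axis=0, ddof=1)
print(var_x)  # large: the x-axis carries the information
print(var_y)  # 0.0: the y-axis carries no information

# Compact representation: n numbers for x plus 1 shared number for y
compact = (points[:, 0], points[0, 1])
```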

What is Principal Component Analysis?

Now consider a slightly more complicated dataset shown in Figure 3 using red dots. The data is spread in a shape that roughly looks like an ellipse. The major axis of the ellipse is the direction of maximum variance and as we know now, it is the direction of maximum information. This direction, represented by the blue line in Figure 3, is called the first principal component of the data.

The second principal component is the direction of maximum variance perpendicular to the direction of the first principal component. In 2D, there is only one direction that is perpendicular to the first principal component, and so that is the second principal component. This is shown in Figure 3 using a green line.

Now consider 3D data spread like an ellipsoid (shown in Figure 4). The first principal component is represented by the blue line. There is an entire plane that is perpendicular to the first principal component, so there are infinitely many directions to choose from; the second principal component is chosen as the direction of maximum variance within this plane. As you may have guessed, the third principal component is simply the direction perpendicular to both the first and second principal components.

PCA and Dimensionality Reduction

At the beginning of this post, we mentioned that the biggest motivation for PCA is dimensionality reduction. In other words, we want to capture the information contained in the data using fewer dimensions.

Let’s consider the 3D data shown in Figure 4. Every data point has 3 coordinates – x, y, and z – which represent its values along the X, Y, and Z axes. Notice that the three principal components are nothing but a new set of axes, because they are perpendicular to each other. We can call these axes, formed by the principal components, X’, Y’ and Z’.

In fact, you can rotate the X, Y, Z axes along with all the data points in 3D such that the X-axis aligns with the first principal component, the Y-axis aligns with the second principal component, and the Z-axis aligns with the third principal component. By applying this rotation we can transform any point (x, y, z) in the XYZ coordinate system to a point (x’, y’, z’) in the new X’Y’Z’ coordinate system. It is the same information presented in a different coordinate system, but the beauty of this new coordinate system X’Y’Z’ is that the information contained along X’ is the largest, followed by Y’ and then Z’. If you drop the coordinate z’ for every point (x’, y’, z’), we still retain most of the information, but now we need only two dimensions to represent the data.
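
As a concrete illustration, here is a minimal sketch using scikit-learn; the 3D points are synthetic and generated only for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic 3D points that vary a lot along two directions and barely along the third
rng = np.random.default_rng(0)
points_3d = rng.normal(size=(100, 3)) * np.array([5.0, 2.0, 0.1])

# Keep only the first two principal components (the X' and Y' axes)
pca = PCA(n_components=2)
points_2d = pca.fit_transform(points_3d)

print(points_2d.shape)                # (100, 2): same data, one fewer dimension
print(pca.explained_variance_ratio_)  # most of the variance is retained
```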

This may look like a small saving, but if you have 1000 dimensional data, you may be able to reduce the dimension dramatically to maybe just 20 dimensions. In addition to reducing the dimension, PCA will also remove noise in the data.

What are Eigen vectors and Eigen values of a matrix?

In the next section, we will explain step by step how PCA is calculated, but before we do that, we need to understand what Eigen vectors and Eigen values are.

Consider a 3×3 matrix A and a special vector v. Let us multiply the matrix A with the vector v and see why this vector is special. We find that

$$A\mathbf{v} = \lambda\mathbf{v}$$

where λ is a scalar. Notice, when we multiplied the matrix A with the vector v, it only changed the magnitude of the vector v by the factor λ but did not change its direction. There are only 3 directions (including the direction of v in the example above) for which the matrix A acts like a scalar multiplier. These three directions are the Eigen vectors of the matrix, and the scalar multipliers are the Eigen values.

So, an Eigen vector v of a matrix A is a vector whose direction does not change when the matrix is multiplied with it. In other words,

$$A\mathbf{v} = \lambda\mathbf{v}$$

where λ is a scalar (just a number) and is called the Eigen value corresponding to the Eigen vector v.
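
To make this concrete, here is a minimal sketch using NumPy. The matrix below is an arbitrary symmetric 3×3 matrix chosen only for illustration:

```python
import numpy as np

# An arbitrary symmetric 3x3 matrix (symmetric, so its Eigen values are real)
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# Columns of `vectors` are the Eigen vectors; `values` holds the Eigen values
values, vectors = np.linalg.eigh(A)

for lam, v in zip(values, vectors.T):
    # A @ v points in the same direction as v, only scaled by lambda
    print(np.allclose(A @ v, lam * v))  # True for each Eigen vector
```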

How to Calculate PCA?

Usually, you can easily find the principal components of given data using a linear algebra package of your choice. In the next post, we will learn how to use the PCA class in OpenCV. Here, we briefly explain the steps for calculating PCA so you get a sense of how it is implemented in various math packages.

Here are the steps for calculating PCA. We have explained the steps using 3D data for simplicity, but the same idea applies to any number of dimensions.

  1. Assemble a data matrix: The first step is to assemble all the data points into a matrix where each column is one data point. A data matrix D of n 3D points would look something like this:

    $$D = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \\ z_1 & z_2 & \cdots & z_n \end{bmatrix}$$

  2. Calculate Mean: The next step is to calculate the mean (average) of all data points. Note, if the data is 3D, the mean is also a 3D point with x, y, and z coordinates. Similarly, if the data is m-dimensional, the mean will also be m-dimensional. The mean μ is calculated as

    $$\boldsymbol{\mu} = \frac{1}{n} \sum_{i=1}^{n} \mathbf{d}_i$$

    where d_i is the i-th column (data point) of D.

  3. Subtract Mean from data matrix: We next create another matrix M by subtracting the mean from every data point (i.e. every column) of D:

    $$M = D - \begin{bmatrix} \boldsymbol{\mu} & \boldsymbol{\mu} & \cdots & \boldsymbol{\mu} \end{bmatrix}$$

  4. Calculate the Covariance matrix: Remember we want to find the direction of maximum variance. The covariance matrix captures the information about the spread of the data. The diagonal elements of a covariance matrix are the variances along the X, Y, and Z axes. The off-diagonal elements represent the covariance between two dimensions (X and Y, Y and Z, Z and X). The covariance matrix C is calculated as

    $$C = \frac{1}{n-1} M M^{T}$$

    where the superscript T represents the transpose operation. The matrix C is of size m × m, where m is the number of dimensions (which is 3 in our example). Figure 5 shows how the covariance matrix changes depending on the spread of data in different directions.
  5. Calculate the Eigen vectors and Eigen values of the covariance matrix: The principal components are the Eigen vectors of the covariance matrix. The first principal component is the Eigen vector corresponding to the largest Eigen value, the second principal component is the Eigen vector corresponding to the second largest Eigen value, and so on and so forth. The short sketch following this list puts all five steps together.
If you are more interested in why this procedure works, here is an excellent article titled A Geometric Interpretation of the Covariance Matrix.
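
Putting the five steps together, here is a minimal from-scratch sketch in NumPy; the data matrix is synthetic and the variable names are just illustrative:

```python
import numpy as np

# Step 1: data matrix D, one 3D point per column (n = 200 synthetic points)
rng = np.random.default_rng(1)
D = rng.normal(size=(3, 200)) * np.array([[5.0], [2.0], [0.3]])

# Step 2: mean of all data points (a 3D point)
mu = D.mean(axis=1, keepdims=True)

# Step 3: subtract the mean from every data point
M = D - mu

# Step 4: covariance matrix (3 x 3)
n = D.shape[1]
C = (M @ M.T) / (n - 1)

# Step 5: Eigen vectors / Eigen values of the covariance matrix,
# sorted so the first principal component comes first
values, vectors = np.linalg.eigh(C)
order = np.argsort(values)[::-1]
values, vectors = values[order], vectors[:, order]

# Dimensionality reduction: project onto the first two principal components
reduced = vectors[:, :2].T @ M
print(reduced.shape)  # (2, 200)
```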

There is another widespread dimensionality reduction and visualization technique called t-SNE.

