Data 100, Spring 2024 Discussion #13 Solutions

Note: Your TA will probably not cover all the problems. This is fine; the discussion worksheets are not designed to be finished within an hour. They are deliberately made slightly longer so they can serve as resources you can use to practice, reinforce, and build upon concepts discussed in lectures, labs, and homework.

PCA Basics

1. Consider the following dataset, where X is the corresponding $4 \times 3$ design matrix. The mean and variance of each feature are also provided.

    Observation    Feature 1    Feature 2    Feature 3
    1                 -3.59         7.39        -0.78
    2                 -8.37        -5.32         0.90
    3                  1.75        -0.61        -0.62
    4                 10.21        -1.46         0.50
    Mean               0            0            0
    Variance          47.56        21.35         0.51

Suppose we perform a singular value decomposition (SVD) on this data to obtain $X = USV^T$. (Note: $U$ and $V^T$ are not perfectly orthonormal due to rounding to 2 decimal places.)

$$
U = \begin{bmatrix} -0.25 & 0.81 & 0.20 \\ -0.61 & -0.56 & 0.24 \\ 0.13 & -0.06 & -0.85 \\ 0.74 & -0.18 & 0.41 \end{bmatrix},
\quad
S = \begin{bmatrix} 13.79 & 0 & 0 \\ 0 & 9.32 & 0 \\ 0 & 0 & 0.81 \end{bmatrix},
\quad
V^T = \begin{bmatrix} 1.00 & 0.02 & 0.00 \\ -0.02 & 0.99 & -0.13 \\ 0.00 & 0.13 & 0.99 \end{bmatrix}
$$

(a) Recall that $XV$ contains the principal components of dataset $X$, and that we can alternatively calculate it as $US$. Prove, using the definition from lecture, that $XV = US$.

Solution: Since $V$ is orthonormal, we know that $V^T V = I$. Starting with $X = USV^T$, we right-multiply both sides by $V$:

$$XV = USV^T V = USI = US$$

This completes the proof.

Staff Notes: This is a great time to stress the properties of $U$, $\Sigma$, and $V^T$. For example, $V^T$ being orthonormal means $V^{-1} = V^T$, what $S$ being a diagonal matrix means, etc.
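The identity in part (a) can also be checked numerically. The following is a minimal NumPy sketch (not from the original worksheet) that plugs in the rounded matrices above; because every entry carries 2-decimal rounding error, $XV$ and $US$ agree only approximately:

    import numpy as np

    # The design matrix and its (rounded) SVD factors from the problem statement.
    X = np.array([[-3.59,  7.39, -0.78],
                  [-8.37, -5.32,  0.90],
                  [ 1.75, -0.61, -0.62],
                  [10.21, -1.46,  0.50]])
    U = np.array([[-0.25,  0.81,  0.20],
                  [-0.61, -0.56,  0.24],
                  [ 0.13, -0.06, -0.85],
                  [ 0.74, -0.18,  0.41]])
    S = np.diag([13.79, 9.32, 0.81])
    Vt = np.array([[ 1.00,  0.02,  0.00],
                   [-0.02,  0.99, -0.13],
                   [ 0.00,  0.13,  0.99]])

    print(X @ Vt.T)                                # principal components via XV
    print(U @ S)                                   # the same components via US
    print(np.allclose(X @ Vt.T, U @ S, atol=0.1))  # True, up to rounding error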
(b) Compute the vector for the first principal component (round to 2 decimal places).

Solution: We compute the first principal component by multiplying $X$ by the first column of $V$ (the transpose of the first row of $V^T$) to get $\begin{bmatrix} -3.44 & -8.47 & 1.74 & 10.18 \end{bmatrix}^T$ (your values may differ slightly due to rounding). You can also compute the first PC by observing that $XV = US$; therefore, the first principal component is also the first column of $US$.

(c) What is the component score of the first principal component? In other words, how much variance does it capture of the original data $X$?

Solution: The variance captured by the $i$-th principal component of the original data $X$ is equal to

$$\frac{(i\text{-th singular value})^2}{\text{number of observations } n}$$

In this case, $n = 4$ and $\sigma_1 = 13.79$. Therefore, the component score can be computed as follows:

$$\frac{13.79^2}{4} = 47.54$$

(d) (Bonus) Given the results of (a), how can we interpret the rows of $V^T$? What do the values in these rows represent?

Solution: Each principal component of $X$ is a linear combination of $X$'s features. The rows of $V^T$ correspond to the weights of each feature in the linear combinations that make up their respective principal components.
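Parts (b) and (c) are straightforward to reproduce in code. Below is a short sketch, assuming NumPy; note that np.linalg.svd may flip the signs of whole columns of u (and the matching rows of vt), which flips the signs of a principal component but not the variance it captures.

    import numpy as np

    X = np.array([[-3.59,  7.39, -0.78],
                  [-8.37, -5.32,  0.90],
                  [ 1.75, -0.61, -0.62],
                  [10.21, -1.46,  0.50]])
    n = X.shape[0]  # number of observations, n = 4

    u, s, vt = np.linalg.svd(X, full_matrices=False)

    # (b) The first principal component is the first column of US,
    #     or equivalently X times the first column of V.
    first_pc = u[:, 0] * s[0]
    print(first_pc)             # approx. [-3.44, -8.47, 1.74, 10.18], up to sign

    # (c) Variance captured by the i-th PC is (i-th singular value)^2 / n.
    component_scores = s ** 2 / n
    print(component_scores[0])  # approx. 47.54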
Applications of PCA

2. Lillian wants to apply PCA to food_PCA, a dataset of food nutrition information, to understand the different food groups. She needs to preprocess her current dataset in order to use PCA.

(a) What are the appropriate preprocessing steps when performing PCA on a dataset?

A. Transform each row to have a magnitude of 1 (Normalization)
B. Transform each column to have a mean of 0 (Centering)
C. Transform each column to have a mean of 0 and a standard deviation of 1 (Standardization)
D. None of the above

Solution: We can use standardization (C) or centering (B) of the columns for PCA, since each column contains the values of a particular feature for many observations. Standardization ensures that the standard deviation of each collection of feature values is 1, so that the variability in each feature across the data points is on a uniform scale. Additionally, we cannot compute the covariance matrix correctly using SVD if the feature columns are not centered with mean 0. Choice (A) is incorrect because it doesn't make sense to preprocess by row in PCA, since PCA is all about finding combinations of features (columns), not combinations of rows.

(b) Assume you have correctly preprocessed your data using the correct response in part (a). Write a line of code that returns the first 3 principal components, assuming you have the correctly preprocessed food_PCA and the following variables returned by SVD.

    u, s, vt = np.linalg.svd(food_PCA, full_matrices=False)
    first_3_pcs = __________________________________
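The preview cuts off before the official answer to 2(b). One completion consistent with the $XV = US$ identity from Question 1, assuming the u, s, and vt returned above, would be:

    # Hypothetical completion: the principal components are the columns of US,
    # so slice out the first three. s is a 1-D array of singular values, and
    # the broadcast u[:, :3] * s[:3] equals u[:, :3] @ np.diag(s[:3]).
    first_3_pcs = u[:, :3] * s[:3]

An equivalent form is food_PCA @ vt.T[:, :3], since $XV = US$.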