CS156 LBA: Alexanderplatz PCA and Reconstruction.
For this assignment, I chose Alexanderplatz as my landmark of interest. I took several photos of it and performed PCA on them: I grayscaled the images, resized them, and reshaped them into flat arrays so that the entire dataset could be passed to Scikit-Learn's PCA module. I then used the module's inverse_transform method, together with NumPy reshaping, to recompose the images. Finally, I picked the image that lay furthest from the rest of the dataset in the 2D PCA space and reconstructed it separately.
Loading the data
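A minimal sketch of the loading step described above. The folder of Alexanderplatz photos is not included here, so the snippet synthesizes a few placeholder JPEGs to show the shape of the pipeline; the target resolution of 100x100 is a hypothetical choice, not necessarily the one used in the actual notebook.

```python
import tempfile
from pathlib import Path

import numpy as np
from PIL import Image

IMG_SIZE = (100, 100)  # hypothetical target resolution (width, height)

def load_grayscale_images(folder, size=IMG_SIZE):
    """Grayscale, resize, and flatten every JPEG in `folder`
    into one row of the data matrix."""
    rows = []
    for path in sorted(Path(folder).glob("*.jpg")):
        img = Image.open(path).convert("L").resize(size)  # "L" = 8-bit grayscale
        rows.append(np.asarray(img, dtype=float).ravel())
    return np.vstack(rows)

# The real notebook points this at a folder of photos; here we
# synthesize placeholder JPEGs so the sketch runs end to end.
with tempfile.TemporaryDirectory() as folder:
    for i in range(5):
        fake = np.random.randint(0, 256, size=(480, 640), dtype=np.uint8)
        Image.fromarray(fake).save(Path(folder) / f"photo_{i}.jpg")
    X = load_grayscale_images(folder)

print(X.shape)  # (n_images, n_pixels) = (5, 10000)
```

Each photo becomes one row of X, which is the layout Scikit-Learn's PCA expects.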
Performing PCA
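Roughly, the fitting step looks like the sketch below. A random matrix stands in for the flattened photo array, so the printed variance ratios will not match the figures discussed next; those come from the real photos.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for the flattened photo matrix (n_images x n_pixels);
# in the actual notebook this comes from the loading step above.
rng = np.random.default_rng(42)
X = rng.normal(size=(20, 10_000))

pca = PCA(n_components=2)
scores = pca.fit_transform(X)          # each photo as a point in 2D
print(pca.explained_variance_ratio_)   # per-component share of variance
```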
As we can see, the first two components explain 28% and 14% of the variance, respectively. Together, then, just two components explain 42% of the variance, which is not bad given the low dimension. We can compare this to 3-component PCA to see how much improvement a third component would buy us.
As we can see, the third component explains a further 7% of the variance. Depending on the scenario, we might consider this valuable enough to use 3 components, or judge it not informative enough and stick with 2.
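The 2-vs-3 comparison can be sketched as below, again on stand-in random data (the 42% and 49% cumulative figures in the text come from the real photos):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(20, 10_000))  # stand-in for the photo matrix

# Cumulative explained variance for 2 vs. 3 components.
for k in (2, 3):
    cum = PCA(n_components=k).fit(X).explained_variance_ratio_.sum()
    print(f"{k} components explain {cum:.0%} of the variance")
```

Adding components can only grow the cumulative explained variance; the question is whether the increment justifies the extra dimension.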
Image Reconstruction
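The reconstruction step (inverse_transform plus NumPy reshaping, as described in the introduction) can be sketched as follows; the 100x100 shape and the random stand-in data are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

H, W = 100, 100  # hypothetical resized image shape
rng = np.random.default_rng(42)
X = rng.normal(size=(20, H * W))  # stand-in for the flattened photos

pca = PCA(n_components=2)
scores = pca.fit_transform(X)          # project down to 2 components
X_hat = pca.inverse_transform(scores)  # lift back to pixel space
images = X_hat.reshape(-1, H, W)       # one 2D image per row, ready to plot
print(images.shape)
```

Each reconstructed image is the dataset mean plus that image's two component scores times the component directions, which is why the reconstructions look like blurred averages of the originals.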
"Outlier" Reconstruction
As we can see, the image is still fairly recognizable. There are two reasons for this. First, all images in the dataset were fairly similar to one another, which helped the fit stage of the PCA detect patterns. Second, as the original images show, there is a clear distinction between the main object of interest (the tower and nearby buildings) and the sky, which was mostly uniform gray and therefore contributed little noise. This also explains why this particular image was an outlier: my camera angle here was slightly lower, so the stores and buildings were not captured as well as in the other images, and more of the frame was sky, which the PCA clearly picked up on. It is quite possible that the extreme value on the x-axis relates to properties of the sky's patterns, since the sky is most prominent here, but we cannot guarantee that.