Application: data visualization by projection on a line
Senate voting data
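Visualization of high-dimensional data via projection

As seen in the picture above, simply plotting the raw data is often not very informative. We can instead try to visualize the data set by projecting each data point (each row or column of the matrix) on, say, a one-, two- or three-dimensional space. Each “view” corresponds to a particular projection, that is, a particular one-, two- or three-dimensional subspace on which we choose to project the data. Let us detail what it means to project on a one-dimensional set, that is, on a line.

Projecting on a line lets us assign a single number, or “score”, to each data point via a scalar product. We choose a (normalized) direction $u \in \mathbf{R}^m$ and a scalar $v$, which together define the affine scoring function $f(x) = u^T x + v$. Applying it to the data points $x_1, \ldots, x_n \in \mathbf{R}^m$, we thus obtain a vector of values $(f_1, \ldots, f_n)$, with $f_j = u^T x_j + v$. It is convenient to choose $v$ so that these values have zero mean. The zero-mean condition implies

$$0 = \sum_{j=1}^n (u^T x_j + v) = n \, (u^T \hat{x} + v), \quad \text{that is,} \quad v = -u^T \hat{x},$$

where $\hat{x} := \frac{1}{n} \sum_{j=1}^n x_j$ is the vector of sample averages of the different data points. With this choice, the scores become $f_j = u^T (x_j - \hat{x})$.

In order to be able to compare the relative merits of different directions, we can assume, without loss of generality, that the direction vector $u$ is normalized (so that $\|u\|_2 = 1$).

Note that our definition of $f$ is consistent with the idea of projecting on a line: when $\|u\|_2 = 1$, the projection of $x - \hat{x}$ on the line through the origin with direction $u$ is $(u^T (x - \hat{x})) \, u = f(x) \, u$, so $f(x)$ is the coordinate of the projected point along $u$.

In the Senate voting example above, a particular projection (that is, a direction in $\mathbf{R}^m$) assigns a single score to each data point, so that the whole data set is represented as values on a line.

Examples

Projection on a random direction

Projection on the “all-ones” vector

Clearly, not all directions are “good”, in the sense of producing informative plots. A general principle for choosing an “informative” direction is discussed separately. For this data set, however, a good guess is the direction that corresponds to the “average bill”. That is, we choose the direction $u$ to be parallel to the vector of ones, scaled to have unit Euclidean norm: $u = \frac{1}{\sqrt{m}} \mathbf{1}$.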
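To make the scoring concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the code used to produce the plots: the toy data, the helper names `normalize` and `centered_scores`, and the seed are all hypothetical. Each data point is scored by its scalar product with a unit-norm direction, after subtracting the average data point so that the scores have zero mean. Two directions are tried: a random one and the all-ones vector.

```python
import math
import random


def normalize(u):
    """Scale u to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in u))
    if norm == 0.0:
        raise ValueError("direction must be nonzero")
    return [c / norm for c in u]


def centered_scores(points, u):
    """Return the zero-mean scores u^T (x - x_hat), one per data point x.

    points: list of n data points, each a list of m numbers.
    u: direction of length m (normalized inside the function).
    """
    n, m = len(points), len(points[0])
    u = normalize(u)
    # x_hat: average of the data points, coordinate by coordinate.
    x_hat = [sum(x[i] for x in points) / n for i in range(m)]
    return [sum(u[i] * (x[i] - x_hat[i]) for i in range(m)) for x in points]


# Hypothetical toy data: 4 data points with 3 entries each (+1 / -1 votes).
points = [
    [1, 1, -1],
    [1, -1, -1],
    [-1, 1, 1],
    [1, 1, 1],
]
m = len(points[0])

# Projection on a random direction (seeded so the run is repeatable).
rng = random.Random(0)
random_scores = centered_scores(points, [rng.gauss(0.0, 1.0) for _ in range(m)])

# Projection on the all-ones direction; normalize() rescales it to unit norm.
ones_scores = centered_scores(points, [1.0] * m)

print(random_scores)
print(ones_scores)
```

With the all-ones direction, each score is proportional to the sum of a data point's entries minus the average of those sums, so data points with the same total receive the same score. Centering guarantees that the scores sum to zero whatever direction is chosen.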