CHAPTER 4 - SCALING A DATA MATRIX IN A LOW-DIMENSIONAL EUCLIDEAN SPACE  pp. 183-268

By Michael J. Greenacre and Leslie G. Underhill


INTRODUCTION

The principles underlying data scaling originated in the field of psychology. Direct measurement of intelligence, aggressiveness, depression and other mental states is impossible, whereas it is fairly easy to observe various manifestations of these states. For example, a typical IQ test consists of a set of questions designed to test various aspects of the underlying quality which the researcher calls “intelligence”. Every person who undergoes such a test generates a vector of responses, and these are then converted into a single value, which is called the IQ of that person. The process whereby this is achieved is called scaling, because each person becomes a point on a single dimension, in this case the IQ scale. Geometrically, the original data vector may be considered a point in a high-dimensional space, and the process of scaling maps this point to a point in a one-dimensional subspace. These ideas can be generalized to mapping data vectors to points in a two-dimensional subspace, so that the original vectors are “scaled” as points in a plane. Whether the dimensionality of the final subspace of representation is one, two or higher, the underlying principles are the same. Firstly, scaling transforms data points of very high dimensionality to points of much lower dimensionality. Secondly, the original data vectors are often points in a non-Euclidean space, that is, distances between them are not defined in the usual physical way.
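The geometric idea can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the response data are invented, the equal-weight direction is only one possible scaling rule (it is not how an actual IQ score is computed), and the names `responses`, `scale_to_subspace`, `line` and `plane` are hypothetical. The sketch treats each person's responses as a point in five-dimensional space and maps it first to a one-dimensional subspace, then to a plane, by ordinary orthogonal projection.

```python
import numpy as np

# Invented responses: 4 people x 5 test items (1 = correct, 0 = incorrect).
# Each row is one person's data vector, a point in 5-dimensional space.
responses = np.array([
    [1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
], dtype=float)


def scale_to_subspace(data, basis):
    """Return coordinates of each row of `data` in the subspace spanned
    by the columns of `basis`, which are assumed to be orthonormal."""
    basis = np.asarray(basis, dtype=float)
    # Check the orthonormality assumption before projecting.
    gram = basis.T @ basis
    if not np.allclose(gram, np.eye(basis.shape[1])):
        raise ValueError("columns of basis must be orthonormal")
    return data @ basis


# Assumed one-dimensional scale: equal weight on every item (unit length).
first_axis = np.ones(5) / np.sqrt(5)
line = scale_to_subspace(responses, first_axis.reshape(5, 1))

# Assumed second axis, orthogonal to the first: early items versus late items.
second_axis = np.array([1, 1, 0, -1, -1]) / 2.0
plane = scale_to_subspace(responses, np.column_stack([first_axis, second_axis]))

print(line.ravel())   # one value per person: a position on a single scale
print(plane)          # two values per person: a point in a plane
```

With equal weights, a person's position on the line is proportional to the number of correct answers, so the one-dimensional scale here is simply a rescaled total score. The sketch uses the usual Euclidean inner product throughout; it does not address the case in which the original vectors lie in a non-Euclidean space.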