Supervised and Unsupervised Classification: Differences and Important Algorithms (Especially for GATE-Geospatial 2022)


The main use of remote sensing data is to classify the many features in a scene (usually presented as an image) into meaningful categories or classes. The image then becomes a thematic map (the theme is selectable, e.g., land use, geology, vegetation types, or rainfall). An unsupervised classification separates features solely on their spectral properties. A supervised classification uses prior or acquired knowledge of the classes in a scene to set up training sites, from which the spectral characteristics of each class are estimated and identified.

Supervised Classification

A supervised classification shows the distribution of the named (identified) classes, as established by an investigator who knows their nature from field observations. In conducting the classification, representative pixels of each class are grouped into one or more training sites. Statistics computed from these sites serve as references against which pixels of unknown class are compared.

  • Maximum Likelihood Classification: Maximum likelihood classification is a statistical decision criterion that assists in the classification of overlapping signatures; each pixel is assigned to the class to which it has the highest probability of belonging. The maximum likelihood classifier is generally considered to give more accurate results than parallelepiped classification; however, it is much slower because of the extra computations. That accuracy comes with a caveat: it assumes that the classes in the input data have a Gaussian distribution and that the signatures were well selected, which is not always a safe assumption.
  • Minimum Distance Classification: The minimum distance classifier assigns each pixel to the class whose mean vector is nearest to it in spectral space. In the software implementation described here, image data on a database file are classified using a set of up to 256 class signature segments, as specified by the signature parameter. Each segment stores the signature data for one class. Only the mean vector in each class signature segment is used. Other data, such as standard deviations and covariance matrices, are ignored (the maximum likelihood classifier, by contrast, uses them). The result of the classification is a theme map written to a specified database image channel. A theme map encodes each class with a unique grey level, which is specified when the class signature is created. If the theme map is later transferred to the display, a pseudo-colour table should be loaded so that each class is represented by a different colour.
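The minimum distance rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the two-band mean vectors, and the function name `minimum_distance_classify` are hypothetical, and Euclidean distance is assumed as the spectral distance measure.

```python
import math

def minimum_distance_classify(pixel, class_means):
    """Return the class whose mean vector is nearest to the pixel.

    Only the mean vectors are used; variances and covariances are ignored.
    Euclidean distance is an assumption made for this sketch.
    """
    return min(class_means, key=lambda name: math.dist(pixel, class_means[name]))

# Hypothetical class signatures: one mean vector per class, two bands each.
class_means = {
    "water": (20.0, 10.0),
    "vegetation": (40.0, 120.0),
    "bare_soil": (110.0, 90.0),
}

# Hypothetical pixels to classify.
pixels = [(25.0, 15.0), (45.0, 110.0), (100.0, 95.0)]
labels = [minimum_distance_classify(p, class_means) for p in pixels]
print(labels)
```

Because only the means enter the decision, a class with a wide spread and a class with a tight spread are treated identically, which is the main limitation compared with the maximum likelihood classifier.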

The Concept of Maximum Likelihood Method
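The maximum likelihood decision rule can be illustrated with a small Python sketch. It assumes, as the method itself does, that each class is Gaussian. To keep the example short it additionally assumes that bands are independent (a diagonal covariance matrix), which is a simplification and not part of the method as described. The training values and function names are hypothetical.

```python
import math
from statistics import mean, pvariance

def train_signature(samples):
    """Estimate per-band mean and variance from one class's training pixels."""
    bands = list(zip(*samples))
    return [mean(b) for b in bands], [pvariance(b) for b in bands]

def log_likelihood(pixel, signature):
    """Gaussian log-likelihood of a pixel, assuming independent bands.

    Each band variance must be greater than zero.
    """
    means, variances = signature
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for x, m, v in zip(pixel, means, variances)
    )

def maximum_likelihood_classify(pixel, signatures):
    """Assign the pixel to the class with the highest likelihood."""
    return max(signatures, key=lambda name: log_likelihood(pixel, signatures[name]))

# Hypothetical single-band training sites: one tight class, one broad class.
training = {
    "tight": [(0.0,), (1.0,), (-1.0,), (0.0,)],
    "broad": [(10.0,), (0.0,), (20.0,), (10.0,)],
}
signatures = {name: train_signature(s) for name, s in training.items()}

pixel = (4.0,)
ml_label = maximum_likelihood_classify(pixel, signatures)
# For comparison: the class whose mean is nearest (minimum distance rule).
nearest_mean = min(signatures, key=lambda n: abs(pixel[0] - signatures[n][0][0]))
print(ml_label, nearest_mean)
```

In this example the pixel lies closer to the mean of the tight class, yet it is far more probable under the broad class once the spread is taken into account. This is how the classifier handles overlapping signatures, and it is also why the result depends on how well the Gaussian assumption holds.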

Unsupervised Classification

This system of classification does not use training data as the basis of classification. Instead, it relies on algorithms that examine the unknown pixels in the image and aggregate them into a number of classes based on the natural groupings or clusters present in the image. The classes that result from this type of classification are spectral classes. Unsupervised classification is the identification, labelling, and mapping of these natural classes. This method is usually used when little information about the data is available before classification. There are several mathematical strategies for representing the clusters of data in spectral space.

  • Sequential Clustering: In this method, the pixels are analyzed one at a time, pixel by pixel and line by line. The spectral distance between each analyzed pixel and each previously defined cluster mean is calculated. If the distance to the nearest cluster mean is greater than some threshold value, the pixel begins a new cluster; otherwise it is added to the nearest existing cluster, in which case that cluster's mean is recalculated. If too many clusters are formed, clusters are merged by adjusting the threshold distance between cluster means.
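A minimal Python sketch of the sequential clustering pass described above is given here for illustration. Euclidean distance, the function name, and the sample values are assumptions; the final merging step is left out to keep the example short.

```python
import math

def sequential_clustering(pixels, threshold):
    """Single pass over the pixels in the order given.

    Returns a cluster index for each pixel and the final cluster means.
    Merging of clusters is not implemented in this sketch.
    """
    means, counts, labels = [], [], []
    for pixel in pixels:
        if means:
            distances = [math.dist(pixel, m) for m in means]
            nearest = min(range(len(means)), key=distances.__getitem__)
        if not means or distances[nearest] > threshold:
            # Too far from every existing cluster: start a new one.
            means.append([float(x) for x in pixel])
            counts.append(1)
            labels.append(len(means) - 1)
        else:
            # Add the pixel to the nearest cluster and update its running mean.
            counts[nearest] += 1
            n = counts[nearest]
            means[nearest] = [m + (x - m) / n for m, x in zip(means[nearest], pixel)]
            labels.append(nearest)
    return labels, means

# Hypothetical two-band pixels, listed in scan order.
pixels = [(10, 10), (12, 11), (50, 52), (11, 9), (49, 50)]
labels, means = sequential_clustering(pixels, threshold=10.0)
print(labels)
print(means)
```

Note that the result depends on the order in which pixels are visited, since each cluster mean shifts as pixels are added to it.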
  • Statistical Clustering: It takes into account the spatial relationship between adjacent pixels. The algorithm uses 3 x 3 windows in which all pixels have similar vectors in spectral space. The process has two steps:
    • Testing for homogeneity within the window of pixels under consideration.
    • Cluster merging and deletion.

Here the windows are moved one at a time through the image without overlapping. The mean and standard deviation are calculated for each band of the window. The smaller the standard deviation for a given band, the greater the homogeneity of the window. These values are then compared with user-specified parameters that set the upper and lower limits of the standard deviation. If the window passes the homogeneity test, it forms a cluster. Clusters are created until their number exceeds the user-defined maximum number of clusters, at which point some are merged or deleted according to their weighting and spectral distances.
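The homogeneity test on non-overlapping windows can be sketched as follows. This is an illustrative sketch only: the image layout (a list of rows of pixel tuples), the function name, the use of the population standard deviation, and the sample values are all assumptions, and the merging and deletion step is not shown.

```python
from statistics import mean, pstdev

def homogeneous_windows(image, lower, upper, size=3):
    """Return (row, col, band_means) for each window that passes the test.

    Windows are non-overlapping and size x size pixels. A window passes
    when the standard deviation of every band lies within [lower, upper].
    """
    rows, cols = len(image), len(image[0])
    passed = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            bands = list(zip(*window))
            stds = [pstdev(b) for b in bands]
            if all(lower <= s <= upper for s in stds):
                passed.append((r, c, [mean(b) for b in bands]))
    return passed

# Hypothetical 3 x 6 two-band image: a uniform left window, a mixed right window.
image = [
    [(10, 20), (11, 20), (10, 21), (10, 20), (90, 15), (40, 80)],
    [(10, 19), (10, 20), (11, 20), (70, 30), (12, 60), (55, 5)],
    [(11, 21), (10, 20), (10, 20), (30, 90), (80, 40), (20, 25)],
]
clusters = homogeneous_windows(image, lower=0.0, upper=2.0)
print(clusters)
```

Only the left window passes, so it alone would seed a cluster; the mixed window on the right is rejected because its per-band standard deviations exceed the upper limit.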

  • RGB Clustering: It is a quick method for 3-band, 8-bit data. The algorithm plots all pixels in spectral space and then divides this space into 32 x 32 x 32 clusters. A cluster is required to contain a minimum number of pixels to become a class. RGB clustering is not biased toward any part of the data.
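As a rough illustration of RGB clustering, the sketch below partitions 8-bit, 3-band spectral space into 32 x 32 x 32 cells, so that each cell spans 8 digital numbers per band, and keeps only the cells that hold a minimum number of pixels. The function name, the sample pixels, the threshold of 2, and the convention of labelling unclassified pixels as 0 are assumptions made for this example.

```python
from collections import Counter

def rgb_clustering(pixels, bins=32, min_pixels=2):
    """Label each 3-band, 8-bit pixel by the spectral-space cell it falls in.

    Cells with fewer than min_pixels pixels do not become classes;
    their pixels receive label 0 (an assumed convention).
    """
    width = 256 // bins  # 8 digital numbers per cell side when bins is 32
    cells = [tuple(v // width for v in p) for p in pixels]
    counts = Counter(cells)
    kept = [cell for cell, n in counts.most_common() if n >= min_pixels]
    class_ids = {cell: i + 1 for i, cell in enumerate(kept)}
    return [class_ids.get(cell, 0) for cell in cells]

# Hypothetical pixels: (red, green, blue) values in the range 0-255.
pixels = [
    (10, 200, 30), (12, 203, 28), (14, 201, 31),
    (250, 250, 250),
    (100, 50, 60), (101, 52, 63),
]
rgb_labels = rgb_clustering(pixels)
print(rgb_labels)
```

Since each pixel is assigned by integer division alone, the outcome does not depend on the order in which pixels are read, which is what makes the method fast.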
