The Confusion Matrix: Evaluating the Accuracy of a Classification and Determining the Errors in Classification (Especially for GATE-Geospatial 2022)


Ground Truth and Classification Accuracy Assessment

  • Ground truth, or field survey, is done in order to observe and collect information about the actual conditions on the ground at a test site and to determine the relationship between remotely sensed data and the objects being observed. It is recommended to collect ground truth at the same time as data acquisition, or at least within a period in which the environmental conditions do not change.
  • Classification accuracy assessment is a general term for comparing the classification to geographical data that are assumed to be true to determine the accuracy of the classification process. Usually, the assumed true data are derived from ground truth. It is usually not practical to ground truth or otherwise test every pixel of a classified image. Therefore, a set of reference pixels is usually used.
  • Reference pixels are points on the classified image for which the actual ground data are (or will be) known. Reference pixels are selected randomly (Congalton, 1991).
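As a concrete illustration, the random selection of reference pixels can be sketched as follows. The image size, class labels, and sample size here are arbitrary assumptions for the example, not values from the source:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical classified image: class labels 1-4 on a 100 x 100 grid
classified = rng.integers(1, 5, size=(100, 100))

# Draw a simple random sample of reference pixel locations
n_reference = 50
rows = rng.integers(0, classified.shape[0], size=n_reference)
cols = rng.integers(0, classified.shape[1], size=n_reference)
reference_pixels = list(zip(rows, cols))

# The mapped class at each reference pixel is then compared against
# the class observed on the ground (by field visit) at that location.
mapped_classes = classified[rows, cols]
```

In practice the sample may also be stratified by class so that rare classes receive enough reference pixels, but simple random sampling is the baseline case.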

Evaluating the Accuracy of a Classification

The basic idea is to compare the predicted classification (supervised or unsupervised) of each pixel with the actual classification as discovered by ground truth. A good review of methods is given by Congalton (1991). There are four kinds of accuracy information:

  1. Nature of the errors: what kinds of information are confused?
  2. The frequency of the errors: how often do they occur?
  3. The magnitude of the errors: how bad are they? E.g., confusing old-growth with second-growth forest is not as ‘bad’ an error as confusing water with forest.
  4. Source of errors: why did the error occur?

The Confusion Matrix: Determining the Errors in Classification

The analyst selects a sample of pixels and then visits the sites (or vice versa), and builds a confusion matrix (IDRISI module CONFUSE). This is used to determine the nature and frequency of errors.

ENVI Confusion Matrix
  • Columns = ground data (assumed ‘correct’)
  • Rows = map data (classified by the automatic procedure)
  • Cells of the matrix = count of the number of observations for each (ground, map) combination
  • Diagonal elements = agreement between ground and map; ideal is a matrix with all zero off-diagonals
  • Errors of omission (related to the map producer՚s accuracy) = incorrect in column/total in column. Measures how well the map maker was able to represent the ground features.
  • Errors of commission (related to the map user՚s accuracy) = incorrect in row/total in row. Measures how likely the map user is to encounter correct information while using the map.
  • Overall map accuracy = total on diagonal/grand total
  • A statistical test of the classification accuracy for the whole map or individual cells is possible using the Kappa index of agreement.
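The accuracy measures listed above can be computed directly from the matrix. A minimal sketch in Python, using a small hypothetical confusion matrix (the class labels and counts are invented for illustration; rows are map data and columns are ground data, matching the convention above):

```python
import numpy as np

# Hypothetical confusion matrix: rows = map (classified) data,
# columns = ground (reference) data.
cm = np.array([
    [35,  2,  1],   # mapped as class 1
    [ 4, 40,  3],   # mapped as class 2
    [ 1,  3, 31],   # mapped as class 3
])

grand_total = cm.sum()
diagonal = np.diag(cm)

# Overall map accuracy = total on diagonal / grand total
overall_accuracy = diagonal.sum() / grand_total

# Producer's accuracy (per class) = correct in column / total in column;
# errors of omission = 1 - producer's accuracy
producers_accuracy = diagonal / cm.sum(axis=0)

# User's accuracy (per class) = correct in row / total in row;
# errors of commission = 1 - user's accuracy
users_accuracy = diagonal / cm.sum(axis=1)

# Kappa index of agreement: corrects overall accuracy for the
# agreement expected by chance alone
expected_agreement = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / grand_total**2
kappa = (overall_accuracy - expected_agreement) / (1 - expected_agreement)

print(f"overall accuracy: {overall_accuracy:.3f}")
print(f"kappa: {kappa:.3f}")
```

Note that kappa is always less than or equal to overall accuracy, since it discounts the portion of agreement that random assignment would produce.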
Accuracy Metrics