## Probability plotting

A graph of the relationship between values x and the cumulative distribution function F(x) on an arithmetic scale is usually unsuitable when extreme values are of interest. The probabilities associated with extreme values are relatively small, and they are very hard to read from such a plot.

A special type of graph used for presenting the relationship between probability and data values is known as the probability plot. Probability plots are designed for particular theoretical distributions by transforming the scale of the probability axis so that a given distribution is represented by a straight line. This is achieved by introducing a so-called reduced variable y, which is a transform of F(x) and is linearly related to x.

Usually, the horizontal axis of the probability plot shows the reduced variable on a linear scale, and it is accompanied by a second horizontal axis showing the non-exceedance probability F(x) or the exceedance probability 1 - F(x) on a non-linear scale. Values x of the random variable are usually plotted on the vertical axis. Specially designed graph paper, ruled with a vertical grid for probabilities and a horizontal grid for data values, is called probability paper.

Normal probability plot. This plot represents the normal distribution as a straight line. Here, the reduced variable is the standard normal variable z, which is linearly related to the normal variable x through the well-known relationship:

$$z = \frac{x - m}{s}$$

where m and s are the mean and standard deviation of variable x. Therefore, the normal probability plot is a plot of x against z. For each z there is a unique value of the cumulative distribution function $F_Z(z) = F_X(x)$ (e.g. see tables of the normal distribution). An additional axis can then be constructed for F(x) by marking F-values at the corresponding z-values. For example, to make a tick-mark for 0.9 on the F-axis, one has to measure 1.282 on the z-axis (because $F_Z(1.282) = 0.9$).
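The tick-mark construction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `normal_axis_ticks` and the chosen list of probabilities are assumptions, and the inverse of the standard normal distribution function is taken from the standard-library `statistics.NormalDist` class instead of printed tables.

```python
from statistics import NormalDist

def normal_axis_ticks(probabilities):
    """Return (F, z) pairs: the position on the linear z-axis of each F tick."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    return [(F, std_normal.inv_cdf(F)) for F in probabilities]

# Illustrative choice of probabilities to label on the F-axis (an assumption).
ticks = normal_axis_ticks([0.01, 0.1, 0.5, 0.9, 0.99])

for F, z in ticks:
    # Each F label is drawn at distance z from the origin of the z-axis.
    print(f"F = {F:<5} -> z = {z:+.3f}")
```

The tick for F = 0.9 lands at z of about 1.282, in agreement with the example in the text, and the ticks are symmetric about z = 0, where F = 0.5.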

Log-normal probability plot. A normal probability plot in which the axis for x has a logarithmic scale is used for plotting the log-normal distribution. The linear relationship behind this plot is between the standard normal variable z and the log-transform of the variable x:

$$z = \frac{\ln x - m_y}{s_y}$$

where $m_y$ and $s_y$ are the mean and standard deviation of the variable y = ln x.
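As a minimal sketch of this linear relationship, the snippet below assumes it has the form z = (ln x - m_y) / s_y and inverts it to obtain x for a given probability. The function name `lognormal_quantile` and the parameter values are hypothetical and serve only as an illustration.

```python
import math
from statistics import NormalDist

def lognormal_quantile(F, m_y, s_y):
    """Value x with non-exceedance probability F, from ln x = m_y + s_y * z."""
    z = NormalDist().inv_cdf(F)  # standard normal variable for probability F
    return math.exp(m_y + s_y * z)

# Hypothetical parameters of y = ln x, chosen only for illustration.
m_y, s_y = 2.0, 0.5

for F in (0.1, 0.5, 0.9):
    z = NormalDist().inv_cdf(F)
    x = lognormal_quantile(F, m_y, s_y)
    # ln x is linear in z, so the points (z, ln x) fall on one straight line.
    print(f"F = {F}: z = {z:+.3f}, x = {x:.3f}, ln x = {math.log(x):.3f}")
```

Under these assumptions, the line of ln x against z has intercept m_y and slope s_y, which is why a logarithmic scale for x turns the log-normal distribution into a straight line.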

Gumbel (EV1) probability plot. The Gumbel or EV1 distribution is represented by a straight line on this plot. The Gumbel standard variable y is used as the reduced variable and is linearly related to the Gumbel variable x:

$$y = \frac{x - u}{a}$$

where u and a are parameters of the Gumbel distribution. The Gumbel standard variable y and the cumulative distribution function F are related through $y = -\ln(-\ln F)$, which is used to construct the F-axis corresponding to the y-axis.
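The construction of the F-axis from y = -ln(-ln F) can be sketched as follows. The function names are hypothetical, and `gumbel_quantile` additionally assumes that the linear relationship has the form x = u + a * y, with a acting as a scale parameter; if a different parameterization is intended, only that function changes.

```python
import math

def gumbel_reduced_variate(F):
    """Gumbel standard variable y for non-exceedance probability F (0 < F < 1)."""
    return -math.log(-math.log(F))

def gumbel_quantile(F, u, a):
    """Value x for probability F, ASSUMING the linear form x = u + a * y."""
    return u + a * gumbel_reduced_variate(F)

# Positions of F tick marks on the linear y-axis (illustrative probabilities).
for F in (0.1, 0.5, 0.9, 0.99):
    print(f"F = {F:<4} -> y = {gumbel_reduced_variate(F):+.3f}")
```

Unlike the normal case, the ticks are not symmetric: F = exp(-1), about 0.368, falls at y = 0, and upper-tail probabilities are stretched more than lower-tail ones.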

Application. Probability plots and papers are very useful for graphical inspection of the goodness-of-fit between theoretical and empirical distributions. Consider, for example, log-normal probability paper, on which the log-normal distribution is represented by a straight line. If the empirical distribution of the sample data plots close to a straight line on the log-normal paper, this indicates that the data can be fitted with the log-normal distribution. If, on the other hand, the empirical distribution does not appear as a straight line on the log-normal paper, the data should not be fitted with the log-normal distribution.
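A rough numerical counterpart of this visual check might look like the sketch below. Several things here are assumptions not taken from the text: the empirical probabilities use the Weibull plotting position i / (n + 1), the sample values are made up purely for illustration, and the correlation coefficient of the plotted points is only an informal proxy for "close to a straight line", not a formal goodness-of-fit test.

```python
import math
from statistics import NormalDist

def lognormal_plot_points(sample):
    """Return (z, ln x) points of the empirical distribution for a log-normal plot."""
    n = len(sample)
    std_normal = NormalDist()
    points = []
    for i, x in enumerate(sorted(sample), start=1):
        F = i / (n + 1)  # ASSUMED plotting position (Weibull formula)
        points.append((std_normal.inv_cdf(F), math.log(x)))
    return points

def pearson_r(points):
    """Linear correlation coefficient of the plotted points."""
    n = len(points)
    mean_z = sum(z for z, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_zy = sum((z - mean_z) * (y - mean_y) for z, y in points)
    s_zz = sum((z - mean_z) ** 2 for z, _ in points)
    s_yy = sum((y - mean_y) ** 2 for _, y in points)
    return s_zy / math.sqrt(s_zz * s_yy)

# Made-up sample values, for illustration only (not real observations).
sample = [12.0, 18.5, 22.1, 27.4, 33.0, 41.2, 55.8, 73.9, 110.3]
points = lognormal_plot_points(sample)
print(f"correlation of (z, ln x) points: {pearson_r(points):.3f}")
```

A coefficient close to 1 suggests that the points lie near a straight line, but the plot itself should still be inspected, since systematic curvature in the tails can hide behind a high correlation.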