A skewed distribution is a probability distribution that is asymmetric. Unlike a normal distribution, which has the symmetric shape of a bell curve, a skewed distribution is shifted to one side, with a longer tail on one side of the median than on the other. The side opposite the long tail has a clustered peak of values where the majority of data points occur. A skewed distribution is usually classified as having either a positive skew or a negative skew, depending on the direction in which its tail extends.
Generally, a skewed distribution is said to have positive skew if the tail of the curve is longer on the right side than on the left side. Such a distribution is also described as skewed to the right because the data points extend farther on the right side. Positively skewed curves have the largest number of values toward the left side of the curve.
In contrast, negatively skewed distributions have the most data points on the right side of the curve. These curves have longer tails on the left side, so they are said to be skewed to the left. An important rule in determining the direction of skew is to consider the length of the tail rather than the location of the peak, the mean, or the median. This is because the skew is ultimately caused by the farthest outlying values, which stretch the curve out toward that side of the graph.
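The direction of skew can also be checked numerically. The following is a minimal Python sketch, assuming the moment-based skewness coefficient (the third central moment divided by the second central moment raised to the power 1.5) as the measure. The function name and the sample values are made up for illustration.

```python
def skewness(values):
    """Moment-based skewness coefficient: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical sample: most values cluster on the left, a few stretch right.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12, 20]

# Mirroring the sample moves the long tail to the left side.
left_skewed = [-x for x in right_skewed]

print(skewness(right_skewed))  # positive: long tail on the right
print(skewness(left_skewed))   # negative: long tail on the left
```

A positive result corresponds to a longer right tail and a negative result to a longer left tail. A perfectly symmetric sample gives a value of zero.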
Understanding the properties of a skewed distribution is important in many statistical applications. Many people assume that data follow a bell curve, or normal distribution, and therefore that the data have zero skewness. When that assumption is wrong, it can lead them to misinterpret the actual distribution.
A skewed distribution is asymmetric, so rules of thumb built on the normal distribution do not carry over to it. A skewed distribution still has a single standard deviation, just as a normal distribution does, but that one number no longer describes the spread on both sides of the center equally well. Because the two sides are not mirror images of each other, an interval of one standard deviation on either side of the mean does not cover matching shares of the data. To describe each side separately, the spread can be measured below and above the mean, for example with lower and upper semideviations. That spread is generally larger on the side with the longer tail because the data extend farther on that side than on the side with the shorter tail.
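One way to make the idea of a separate spread on each side concrete is the semideviation, which measures the spread of only the values below the mean or only the values above it. The sketch below is illustrative, not a standard library routine: the function name and sample values are made up, and it assumes the population form (dividing by the full sample size), under which the two squared semideviations add up to the ordinary variance.

```python
import math
import statistics


def semideviations(values):
    """Return (lower, upper) semideviations about the mean.

    Both divide by the full sample size, so lower**2 + upper**2
    equals the population variance.
    """
    n = len(values)
    mean = sum(values) / n
    lower = math.sqrt(sum((x - mean) ** 2 for x in values if x < mean) / n)
    upper = math.sqrt(sum((x - mean) ** 2 for x in values if x > mean) / n)
    return lower, upper


# Hypothetical right-skewed sample: the long tail is on the right.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12, 20]

lower, upper = semideviations(data)
overall = statistics.pstdev(data)  # the single population standard deviation

# For this sample the upper value is the larger one, matching the longer tail.
print(lower, upper, overall)
```

The sample still has exactly one standard deviation. The two semideviations split it into the part contributed by each side of the mean.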