What do you call the measure of central tendency that occupies the middle position in an array of values?

Measures of Central Tendency

Descriptive measures that are used to indicate where the center, the middle property, or the most typical value of a set of data lies are called measures of central tendency, often referred to as averages. The three most important measures of central tendency are the mean, the median, and the mode.

Summation Notations

In statistics, it is necessary to work with sums of numerical values, and to express these we make use of a standard form.

Let us consider the exam scores of Michael Tan in his statistics subject:

76, 45, 51, 27, 6, 76, 62, 12, 2

In mathematical form, we let the letter $X$ denote the values in a set of data. For the exam scores above, we have

\[X_1=76,\ X_2=45,\ X_3=51,\ X_4=27,\ X_5=6,\ X_6=76,\ X_7=62,\ X_8=12,\ X_9=2.\]

The numbers 1, 2, 3, 4, 5, 6, 7, 8, and 9 written below the $X$s are called subscripts. Using this notation, the sum of the exam score data can be expressed symbolically as

\[X_1+X_2+X_3+X_4+X_5+X_6+X_7+X_8+X_9.\]

Summation notation provides a shorthand for this last sum. The notation uses the symbol $\Sigma$ (Greek capital letter “sigma”) to denote that the subscripted values are to be added. We can write the sum of the scores as

\[\sum_{i=1}^{9} X_i\]

which we read “summation of $X_i$, $i$ going from 1 to 9”. We can simply use $\sum X$, read “summation of $X$” or “the sum of $X$ values”. Thus,

\[\sum_{i=1}^{n} X_i = X_1+X_2+X_3+\cdots+X_n\]

where $X_i = X_1, X_2, X_3, \ldots, X_n$. The subscript $i$ may be omitted together with $n$ if no ambiguity will result. Hence, for the exam score data,

\[\sum_{i=1}^{9} X_i = X_1+X_2+X_3+X_4+X_5+X_6+X_7+X_8+X_9 = 76+45+51+27+6+76+62+12+2 = 357\]

Rules for Summation Notation Symbols

Rule 1. The summation of the sum of two or more variables is the sum of their summations. Thus,

\[\sum_{i=1}^{n} (X_i+Y_i+Z_i) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i + \sum_{i=1}^{n} Z_i\]

Proof: expanding the left side and grouping, we have

\[\sum_{i=1}^{n} (X_i+Y_i+Z_i) = (X_1+Y_1+Z_1)+(X_2+Y_2+Z_2)+(X_3+Y_3+Z_3)+\cdots+(X_n+Y_n+Z_n)\]

\[= (X_1+X_2+\cdots+X_n)+(Y_1+Y_2+\cdots+Y_n)+(Z_1+Z_2+\cdots+Z_n) = \sum_{i=1}^{n} X_i + \sum_{i=1}^{n} Y_i + \sum_{i=1}^{n} Z_i\]
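As a quick check of the summation notation, the sketch below adds up the nine exam scores and then tests Rule 1 numerically. The lists `xs`, `ys`, and `zs` are hypothetical values chosen only for illustration; they do not come from the text.

```python
# Exam scores X_1, ..., X_9 from the text.
scores = [76, 45, 51, 27, 6, 76, 62, 12, 2]

# Sum of X_i for i = 1, ..., 9.
total = sum(scores)
print(total)  # 357

# Rule 1, checked on hypothetical values for X, Y, and Z.
xs = [1, 2, 3]
ys = [10, 20, 30]
zs = [100, 200, 300]

lhs = sum(x + y + z for x, y, z in zip(xs, ys, zs))  # sum of (X_i + Y_i + Z_i)
rhs = sum(xs) + sum(ys) + sum(zs)                    # sum X_i + sum Y_i + sum Z_i
print(lhs == rhs)  # True
```

A numerical check on one data set is not a proof, but it shows what each side of Rule 1 computes.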

Mean, Median, and Mode

A measure of central tendency is an important aspect of quantitative data. It is an estimate of a “typical” value.

Three of the many ways to measure central tendency are the mean, median and mode.

There are other measures, such as a trimmed mean, that we do not discuss here.

Mean

The mean is the average of the data.

Sample Mean

Let $x_1, x_2, \ldots, x_n$ be our sample. The sample mean is usually denoted by $\bar{x}$:

\[\bar{x}=\sum_{i=1}^n \dfrac{x_i}{n}=\dfrac{1}{n}\sum_{i=1}^n x_i\]

where $n$ is the sample size and the $x_i$ are the measurements. One may need to use the sample mean to estimate the population mean, since usually only a random sample is drawn and we don't know the population mean.
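The formula can be sketched in a few lines of Python. The function name `sample_mean` and the measurements in `sample` are hypothetical and chosen only for illustration.

```python
def sample_mean(xs):
    """Return the sample mean: the sum of the measurements divided by n."""
    n = len(xs)
    if n == 0:
        raise ValueError("the sample mean is undefined for an empty sample")
    return sum(xs) / n


sample = [2, 4, 6, 8]  # hypothetical measurements
print(sample_mean(sample))  # 5.0

# The other form of the formula, summing x_i / n term by term, agrees.
print(sum(x / len(sample) for x in sample))  # 5.0
```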

The sample mean is a statistic and a population mean is a parameter. Review the definitions of statistic and parameter in Lesson 0.2.

Note on Notation

What if we used $y_i$ for our measurements instead of $x_i$? Is this a problem? No. The formula would simply look like this: \[\bar{y}=\sum_{i=1}^n \dfrac{y_i}{n}=\dfrac{1}{n}\sum_{i=1}^n y_i\] The formulas are exactly the same. The letters that you select to denote the measurements are up to you. For instance, many textbooks use $y$ instead of $x$ to denote the measurements. The point is to understand how the calculation that is expressed in the formula works. In this case, the formula calculates the mean by summing all of the observations and dividing by the number of observations. There is some notation that you will come to see as standard, e.g., $n$ will always denote the sample size. We will make a point of letting you know what these are. However, when it comes to the variables, these labels can (and do) vary.

Median

The median is the middle value of the ordered data.

The most important step in finding the median is to first order the data from smallest to largest.

Steps to finding the median for a set of data:

  1. Arrange the data in increasing order, i.e. smallest to largest.
  2. Find the location of the median in the ordered data by $\frac{n+1}{2}$, where $n$ is the sample size.
  3. The value that represents the location found in Step 2 is the median.

Note on Odd or Even Sample Sizes

If the sample size is an odd number then the location point will produce a median that is an observed value. If the sample size is an even number, then the location will require one to take the mean of two numbers to calculate the median. The result may or may not be an observed value as the example below illustrates.
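The three steps, together with the odd/even note, can be sketched as follows. The function name `median_by_location` and the two small data sets are hypothetical; the code follows the $(n+1)/2$ location rule rather than any particular library's implementation.

```python
def median_by_location(data):
    """Median via the (n + 1) / 2 location rule."""
    ordered = sorted(data)         # Step 1: smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("the median is undefined for an empty data set")
    location = (n + 1) / 2         # Step 2: 1-based location of the median
    lower = int(location)          # position at or just below the location
    if location == lower:          # odd n: the location is a whole number
        return ordered[lower - 1]  # Step 3 (converted to a 0-based index)
    # Even n: the location falls halfway between two neighbouring positions,
    # so take the mean of those two values.
    return (ordered[lower - 1] + ordered[lower]) / 2


print(median_by_location([7, 1, 5]))     # 5
print(median_by_location([7, 1, 5, 3]))  # 4.0
```

In the second call the median 4.0 is not an observed value, which is the situation the note above describes.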

Mode

The mode is the value that occurs most often in the data. It is important to note that there may be more than one mode in the dataset.
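Because a data set may have more than one mode, a sketch of the calculation should return every value that attains the highest frequency. The helper name `modes` and the sample lists are hypothetical.

```python
from collections import Counter


def modes(data):
    """Return every value that attains the highest frequency, in sorted order."""
    if not data:
        return []
    counts = Counter(data)      # value -> number of occurrences
    top = max(counts.values())  # the highest frequency
    return sorted(value for value, count in counts.items() if count == top)


print(modes([1, 2, 2, 3]))     # [2]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
```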

Example 1-5: Test Scores

Consider the aptitude test scores of ten children below:

95, 78, 69, 91, 82, 76, 76, 86, 88, 80

Find the mean, median, and mode.

Answer

Mean

\[\bar{x}=\frac{1}{10}[95+78+69+91+82+76+76+86+88+80]=82.1\]

Median

First, order the data.

69, 76, 76, 78, 80, 82, 86, 88, 91, 95

With n = 10, the median position is found by (10 + 1) / 2 = 5.5. Thus, the median is the average of the fifth (80) and sixth (82) ordered values, so the median = 81.

Mode

The most frequent value in this data set is 76. Therefore the mode is 76.
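These three answers can be double-checked with Python's standard `statistics` module. This is only a verification sketch; the variable name `scores` is an assumption.

```python
import statistics

# Aptitude test scores of the ten children.
scores = [95, 78, 69, 91, 82, 76, 76, 86, 88, 80]

print(statistics.mean(scores))    # 82.1
print(statistics.median(scores))  # 81.0
print(statistics.mode(scores))    # 76
```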

Note! Mean, median and mode are usually not equal.

Effects of Outliers

One shortcoming of the mean is that means are easily affected by extreme values. Measures that are not that affected by extreme values are called resistant. Measures that are affected by extreme values are called sensitive.

Example 1-6: Test Scores Cont'd

Using the data from Example 1-5, how would the mean and median change, if the entry 91 is mistakenly recorded as 9?

Answer

The data set would be

9, 69, 76, 76, 78, 80, 82, 86, 88, 95

Mean

\[\bar{x}=\frac{1}{10}[9+78+69+95+82+76+76+86+88+80]=73.9\]

The mean would be 73.9, which is very different from 82.1.

Median

Let us see the effect of the mistake on the median value.

The data set [with 91 coded as 9] in increasing order is:

9, 69, 76, 76, 78, 80, 82, 86, 88, 95

The median is the average of the fifth (78) and sixth (80) ordered values, so the median = 79.

The medians of the two sets (81 and 79) are not that different. Therefore the median is not that affected by the extreme value 9.

The mean is a sensitive measure (or sensitive statistic) and the median is a resistant measure (or resistant statistic).
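The comparison can be reproduced with a short sketch that recomputes both statistics on the original and the miscoded data. The variable names are assumptions; the data are those of Examples 1-5 and 1-6.

```python
import statistics

original = [95, 78, 69, 91, 82, 76, 76, 86, 88, 80]
# Same data with the entry 91 mistakenly recorded as 9.
miscoded = [9 if x == 91 else x for x in original]

mean_shift = statistics.mean(original) - statistics.mean(miscoded)
median_shift = statistics.median(original) - statistics.median(miscoded)

print(statistics.mean(miscoded), statistics.median(miscoded))  # 73.9 79.0
print(round(mean_shift, 1), median_shift)                      # 8.2 2.0
```

One miscoded value moves the mean by about 8.2 points but the median by only 2.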

After reading this lesson you should know that there are quite a few options when one wants to describe central tendency. In future lessons, we talk mainly about the mean. However, we need to be aware of one of its shortcomings, which is that it is easily affected by extreme values.

Unless data points are known mistakes, one should not remove them from the data set! One should keep the extreme points and use more resistant measures. For example, use the sample median to estimate the population median. We will discuss methods using the median in Lesson 11.

Adding and Multiplying Constants

What happens to the mean and median if we add a constant to each observation in a data set, or multiply each observation by a constant?

Consider for example if an instructor curves an exam by adding five points to each student’s score. What effect does this have on the mean and the median? The result of adding a constant to each value has the intended effect of altering the mean and median by the constant.

For example, if 5 were added to each of the 10 aptitude scores above, the mean of this new data set would be 87.1 (the original mean of 82.1 plus 5) and the new median would be 86 (the original median of 81 plus 5).

Similarly, if each observed data value were multiplied by a constant, the new mean and median would change by a factor of this constant. Returning to the 10 aptitude scores, if all of the original scores were doubled, then the new mean and new median would be double the original mean and median. As we will learn shortly, the effect is not the same on the variance!
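Both effects can be checked on the ten aptitude scores. In this sketch the names `curved` and `doubled` are assumptions, and the constants 5 and 2 are the ones used in the text.

```python
import statistics

scores = [95, 78, 69, 91, 82, 76, 76, 86, 88, 80]

curved = [x + 5 for x in scores]   # add a constant to every score
doubled = [2 * x for x in scores]  # multiply every score by a constant

print(statistics.mean(curved), statistics.median(curved))    # 87.1 86.0
print(statistics.mean(doubled), statistics.median(doubled))  # 164.2 162.0
```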

Looking Ahead!

Why would you want to know this? One reason, especially for those moving onward to more applied statistics (e.g. Regression, ANOVA), is transforming data. For many applied statistical methods, a required assumption is that the data is normal, or very nearly bell-shaped. When the data is not normal, statisticians will transform the data using one of numerous techniques, e.g. a logarithmic transformation. We just need to remember that the original data was transformed!
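As a rough illustration of a logarithmic transformation, the sketch below applies a base-10 logarithm to a small, strongly right-skewed data set. The values in `raw` are hypothetical and deliberately extreme, and base 10 is an arbitrary choice made for readability.

```python
import math
import statistics

# Hypothetical right-skewed data (not from the text).
raw = [1, 10, 100, 1000, 10000]
logged = [math.log10(x) for x in raw]

# Before the transformation the mean is far above the median.
print(statistics.mean(raw), statistics.median(raw))  # 2222.2 100

# After the transformation the mean and median are both about 2.
print(statistics.mean(logged), statistics.median(logged))
```

Any summary computed on `logged` is on the log scale, which is why it matters to remember that the original data was transformed.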

Shape

The shape of the data helps us to determine the most appropriate measure of central tendency. The three most important descriptions of shape are Symmetric, Left-skewed, and Right-skewed. Skewness is a measure of the degree of asymmetry of the distribution.

Symmetric

  • mean, median, and mode are all the same here
  • no skewness is apparent
  • the distribution is described as symmetric
Figure: a symmetrical distribution, in which mean = median = mode.

Left-Skewed or Skewed Left

  • mean < median
  • long tail on the left
Figure: a distribution skewed to the left, with the mean, median, and mode marked.

Right-skewed or Skewed Right

  • mean > median
  • long tail on the right
Figure: a distribution skewed to the right, with the mean, median, and mode marked.

Note! When one has very skewed data, it is better to use the median as the measure of central tendency, since the median is not much affected by extreme values.
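The mean-versus-median comparisons listed for each shape can be turned into a rough check. This is only a heuristic sketch: the function name `skew_direction` and the sample data are hypothetical, and equal mean and median do not by themselves prove that a distribution is symmetric.

```python
import statistics


def skew_direction(data):
    """Compare the mean with the median as a rough indicator of shape."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        return "right-skewed"     # long tail on the right pulls the mean up
    if mean < median:
        return "left-skewed"      # long tail on the left pulls the mean down
    return "roughly symmetric"    # mean equals median


print(skew_direction([1, 2, 3, 4, 20]))     # right-skewed
print(skew_direction([1, 17, 18, 19, 20]))  # left-skewed
print(skew_direction([1, 2, 3, 4, 5]))      # roughly symmetric
```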

Application: The Skewed Nature of Salary Data

Salary distributions are almost always right-skewed, with a few people that make the most money. To illustrate this, consider your favorite sports team or even the company for which you work. There will be one or two players or personnel that earn the “big bucks”, followed by others who earn less. This will produce a shape that is skewed to the right. Knowing this can be a useful aid in negotiating a higher salary.

When one interviews for a position and the discussion gets around to compensation, it is common for the interviewer to state an offer that is “typical for someone in your position”. That is, they are offering you the average salary for someone with your particular skill set (e.g. little experience). But is this average the mode, median, or mean? The company (for whom business is business!) will want to pay you the least they can, while you prefer to earn the most you can. Since salaries tend to be skewed to the right, the offer will most likely reflect the mode or median. You simply need to ask which “average” the offer refers to and what the mean salary is, since the mean would be the highest of the three values. Once you have these averages, you can begin to negotiate toward the highest number.
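To see how far apart the three “averages” can be, consider the sketch below. The salaries are entirely hypothetical (in thousands of dollars), with a single high earner producing the right skew.

```python
import statistics

# Hypothetical annual salaries in thousands of dollars (not from the text).
salaries = [40, 40, 45, 50, 55, 60, 250]

print(statistics.mode(salaries))            # 40
print(statistics.median(salaries))          # 50
print(round(statistics.mean(salaries), 1))  # 77.1
```

In this made-up example the mode is the lowest of the three figures and the mean is the highest, the ordering the paragraph above relies on.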


Which measure of central tendency identifies the value in the middle?

The median is the middle value in a distribution when the values are arranged in ascending or descending order. The median divides the distribution in half (there are 50% of observations on either side of the median value). In a distribution with an odd number of observations, the median value is the middle value.

Which measure of central tendency is found halfway in the data set?

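The median lies halfway through the ordered data set. When the number of observations is even, it is the mean of the two middle values, and the mean of two numbers is the number that is halfway between them.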

What do you call the measure of central tendency that is referred to as the most frequently occurring value in a given set?

The mode. It is the value that occurs most often in the data, and a data set may have more than one mode. It should not be confused with the (arithmetic) mean, or average, which is the most popular and well-known measure of central tendency. The mean can be used with both discrete and continuous data, although it is most often used with continuous data.

What statistical measure is in the middle of the interval?

The median is the value exactly in the middle of your data set. To find the middle position, take the value at (n+1)/2, where n is the total number of values.
