STATISTICAL MEASUREMENT

CHAPTER ONE
INTRODUCTION TO MEASUREMENT AND STATISTICS
"Statistics can be fun or at least they don't need to be feared." Many folks have trouble believing this premise. Often, individuals walk into their first statistics class experiencing emotions ranging from slight anxiety to borderline panic. It is important to remember, however, that the basic mathematical concepts that are required to understand introductory statistics are not prohibitive for any university student. The key to doing well in any statistics course can be summarized by two words, "KEEP UP!". If you do not understand a concept--reread the material, do the practice questions, and do not be afraid to ask your professor for clarification or help. This is important because the material discussed four weeks from today will be based on material discussed today. If you keep on top of the material and relax a little bit, you might even find you enjoy this introduction to basic measurements and statistics.
With that preface out of the way, we can now get down to the business of discussing, "What do the terms measurement and statistic mean?" and "Why should we study measurement and statistics?"





What is a Statistic?
Statistics are part of our everyday life. In 1903, science fiction author H. G. Wells stated, "Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write." Wells was quite prophetic, as the ability to think and reason about statistical information is not a luxury in today's information and technological age. Anyone who lacks fundamental statistical literacy, reasoning, and thinking skills may find they are unprepared to meet the needs of future employers or to navigate information presented in the news and media. On the most basic level, all one needs to do is open a newspaper, turn on the TV, examine the baseball box scores, or even just read a bank statement (hopefully, not in the newspaper) to see statistics in use on a daily basis.
Statistics in and of themselves are not anxiety producing. For example, most individuals (particularly those familiar with baseball) will not experience anxiety when a player's batting average is displayed on the television screen. The "batting average" is a statistic, but because we know what it means and how to interpret it, we do not find it particularly frightening. The idea of statistics is often anxiety provoking simply because it is a tool with which we are unfamiliar. Therefore, let us examine what is meant by the term statistic; Kuzma (1984) provides a formal definition:
A body of techniques and procedures dealing with the collection, organization, analysis, interpretation, and presentation of information that can be stated numerically.
Perhaps an example will clarify this definition. Say, for example, we wanted to know the level of job satisfaction nurses experience working on various units within a particular hospital (e.g., psychiatric, cardiac care, obstetrics, etc.). The first thing we would need to do is collect some data. We might have all the nurses on a particular day complete a job satisfaction questionnaire. We could ask such questions as "On a scale of 1 (not satisfied) to 10 (highly satisfied), how satisfied are you with your job?" We might examine employee turnover rates for each unit during the past year. We also could examine absentee records for a two-month period, since decreased job satisfaction is correlated with higher absenteeism. Once we have collected the data, we would then organize it. In this case, we would organize it by nursing unit.
Absenteeism Data by Unit (in Days)

Psychiatric   Cardiac Care   Obstetrics
     3              8             4
     6              9             4
     4             10             3
     7              8             5
     5             10             4
Mean = 5       Mean = 9      Mean = 4
Thus far, we have collected our data and we have organized it by hospital unit. You will also notice from the table above that we have performed a simple analysis. We found the mean (you probably know it by the name "average") absenteeism rate for each unit. In other words, we added up the responses and divided by the number of them. Next, we would interpret our data. In this case, we might conclude that the nurses on the cardiac care unit are less satisfied with their job as indicated by the high absenteeism rate (We would compare this result to the analyses of our other data measures before we reached a conclusion). We could then present our results at a conference, in a journal, or to the hospital administration. This might lead to further research concerning job satisfaction on cardiac care units or higher pay for nurses working in this area of specialization.
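As a minimal sketch of this analysis step (in Python, which the text itself does not use; the data are simply the values from the table above), the mean for each unit is the sum of the absenteeism values divided by the number of nurses:

    # Absenteeism (in days) for the five nurses on each unit, from the table above.
    absenteeism = {
        "Psychiatric": [3, 6, 4, 7, 5],
        "Cardiac Care": [8, 9, 10, 8, 10],
        "Obstetrics": [4, 4, 3, 5, 4],
    }

    # The mean is the sum of the responses divided by how many there are.
    for unit, days in absenteeism.items():
        mean = sum(days) / len(days)
        print(unit, "mean absenteeism =", mean, "days")

Running this prints means of 5.0, 9.0, and 4.0 days, matching the table.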
This example clarifies the process underlying statistical analyses and interpretation. The various techniques and procedures used in the process described above are what make up the content of this chapter. Thus, we will first learn a little bit about the process of data collection/research design. Second, we will examine the use and interpretation of basic statistical analyses used within the context of varying data and design types. And finally, we will examine the process of data presentation.
To further our understanding of the term statistics, it is important to be aware that statistics can be divided into two general categories: descriptive and inferential statistics. Each of these will be discussed below.
Descriptive statistics are used to organize or summarize a particular set of measurements. In other words, a descriptive statistic will describe that set of measurements. For example, in our study above, the mean described the absenteeism rates of five nurses on each unit. The U.S. census represents another example of descriptive statistics. In this case, the information that is gathered concerning gender, race, income, etc. is compiled to describe the population of the United States at a given point in time. A baseball player's batting average is another example of a descriptive statistic. It describes the player's past success at hitting the baseball up to that point in time. What these three examples have in common is that they organize, summarize, and describe a set of measurements.
Inferential statistics use data gathered from a sample to make inferences about the larger population from which the sample was drawn. For example, we could take the information gained from our nursing satisfaction study and make inferences to all hospital nurses. We might infer that cardiac care nurses as a group are less satisfied with their jobs as indicated by absenteeism rates. Opinion polls and television ratings systems represent other uses of inferential statistics. For example, a limited number of people are polled during an election and then this information is used to describe voters as a whole.
What is Measurement ?
Normally, when one hears the term measurement, one may think in terms of measuring the length of something (e.g., the length of a piece of wood) or measuring a quantity of something (e.g., a cup of flour). This represents a limited use of the term measurement. In statistics, the term measurement is used more broadly and is more appropriately framed as scales of measurement. Scales of measurement refer to ways in which variables/numbers are defined and categorized. Each scale of measurement has certain properties which in turn determine which statistical analyses are appropriate. The four scales of measurement are nominal, ordinal, interval, and ratio.
Nominal: Categorical data and numbers that are simply used as identifiers or names represent a nominal scale of measurement. Numbers on the back of a baseball jersey and your social security number are examples of nominal data. If I conduct a study and I include gender as a variable, I may code Female as 1 and Male as 2, or vice versa, when I enter my data into the computer. Thus, I am using the numbers 1 and 2 to represent categories of data.
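To illustrate (a hypothetical Python sketch; the coding scheme is just the one described above), assigning these numeric labels is a simple mapping, and the resulting numbers carry no quantitative meaning:

    # Hypothetical coding scheme: the numbers are identifiers only,
    # so arithmetic on them (e.g., averaging the codes) would be meaningless.
    gender_codes = {"Female": 1, "Male": 2}

    participants = ["Female", "Male", "Female"]
    coded = [gender_codes[g] for g in participants]
    print(coded)  # [1, 2, 1]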
Ordinal: An ordinal scale of measurement represents an ordered series of relationships or a rank order. Individuals competing in a contest may be fortunate enough to achieve first, second, or third place. First, second, and third place represent ordinal data. If Roscoe takes first and Wilbur takes second, we do not know if the competition was close; we only know that Roscoe outperformed Wilbur. Likert-type scales (such as "On a scale of 1 to 10, with one being no pain and ten being high pain, how much pain are you in today?") also represent ordinal data. Fundamentally, these scales do not represent a measurable quantity. An individual may respond 8 to this question and be in less pain than someone else who responded 5. A person who responds 4 is not necessarily in exactly half as much pain as one who responds 8. All we know from these data is that an individual who responds 6 is in less pain than if they had responded 8 and in more pain than if they had responded 4. Therefore, Likert-type scales only represent a rank ordering.
Interval: A scale that represents quantity and has equal units, but for which zero represents simply an additional point of measurement, is an interval scale. The Fahrenheit scale is a clear example of the interval scale of measurement. Thus, 60 degrees Fahrenheit or -10 degrees Fahrenheit represent interval data. Elevation measured relative to sea level is another example of an interval scale. With each of these scales there are direct, measurable quantities with equality of units. In addition, zero does not represent the absolute lowest value. Rather, it is a point on the scale with numbers both above and below it (for example, -10 degrees Fahrenheit).
Ratio: The ratio scale of measurement is similar to the interval scale in that it also represents quantity and has equality of units. However, this scale also has an absolute zero (no numbers exist below zero). Very often, physical measures will represent ratio data (for example, height and weight). If one is measuring the length of a piece of wood in centimeters, there is quantity, equal units, and that measure cannot go below zero centimeters. A negative length is not possible.
The table below will help clarify the fundamental differences between the four scales of measurement:
            Indicates    Indicates Direction   Indicates Amount   Absolute
            Difference   of Difference         of Difference      Zero
Nominal         X
Ordinal         X                X
Interval        X                X                     X
Ratio           X                X                     X               X
You will notice in the above table that only the ratio scale meets the criteria for all four properties of scales of measurement.
Interval and Ratio data are sometimes referred to as parametric, and Nominal and Ordinal data are referred to as nonparametric. Parametric means that the data meet certain requirements with respect to parameters of the population (for example, the data are normally distributed, paralleling the normal or bell curve). In addition, it means that the numbers can be added, subtracted, multiplied, and divided. Parametric data are analyzed using statistical techniques identified as Parametric Statistics. As a rule, there are more statistical technique options for the analysis of parametric data, and parametric statistics are considered more powerful than nonparametric statistics. Nonparametric data lack those same parameters and cannot be added, subtracted, multiplied, or divided. For example, it does not make sense to add two Social Security numbers to get a third person. Nonparametric data are analyzed using Nonparametric Statistics.
As a rule, ordinal data are considered nonparametric and cannot be added, etc. Again, it does not make sense to add together first and second place in a race--one does not get third place. However, many assessment devices and tests (e.g., intelligence scales) as well as Likert-type scales represent ordinal data but are often treated as if they were interval data. For example, the "average" amount of pain that a person reports on a Likert-type scale over the course of a day would be computed by adding the reported pain levels taken over the course of the day and dividing by the number of times the question was answered (for instance, hypothetical reported levels of 6, 4, 8, and 5 would yield an "average" of 23 / 4 = 5.75). Theoretically, as this represents ordinal data, this computation should not be done.
As stated above, many measures (e.g., personality, intelligence, psychosocial, etc.) within psychology and the health sciences represent ordinal data. IQ scores may be computed for a group of individuals. They will represent differences between individuals and the direction of those differences, but they lack the property of indicating the amount of the differences. Psychologists have no way of truly measuring and quantifying intelligence. An individual with an IQ of 70 does not have exactly half the intelligence of an individual with an IQ of 140. Indeed, even if two individuals both score 120 on an IQ test, they may not really have identical levels of intelligence across all abilities. Therefore, IQ scales should theoretically be treated as ordinal data.
In both of the above illustrations, the statement is made that the data should theoretically be treated as ordinal. In practice, however, they are usually treated as if they represent parametric (interval or ratio) data. This opens up the possibility of using parametric statistical techniques with these data, along with the benefits associated with the use of those techniques.
CHAPTER TWO
IMPORTANCE OF STATISTICAL MEASUREMENT
Measures of central tendency are very useful in statistics. They are important for the following reasons:
(i) To find representative value:
Measures of central tendency or averages give us one value for the distribution and this value represents the entire distribution. In this way averages convert a group of figures into one value.
(ii) To condense data:
Collected and classified figures are vast. To condense these figures we use average. Average converts the whole set of figures into just one figure and thus helps in condensation.
(iii) To make comparisons:
To make comparisons of two or more than two distributions, we have to find the representative values of these distributions. These representative values are found with the help of measures of the central tendency.
(iv) Helpful in further statistical analysis:
Many techniques of statistical analysis, such as Measures of Dispersion, Measures of Skewness, Measures of Correlation, and Index Numbers, are based on measures of central tendency. That is why measures of central tendency are also called measures of the first order.
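As a brief illustration of the measures this chapter refers to (a Python sketch using the standard library and a made-up set of scores), the mean, median, and mode can be computed as follows:

    import statistics

    # Hypothetical scores used only to illustrate the three common measures.
    scores = [4, 7, 7, 8, 10, 12, 15]

    print("Mean:", statistics.mean(scores))      # arithmetic average = 9
    print("Median:", statistics.median(scores))  # middle value when ordered = 8
    print("Mode:", statistics.mode(scores))      # most frequent value = 7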
CHAPTER THREE

MEASURES OF RELATIONSHIP

Chapter 5 of the textbook introduced you to the two most widely used measures of relationship: the Pearson product-moment correlation and the Spearman rank-order correlation. We will be covering these statistics in this section, as well as other measures of relationship among variables.

What is a Relationship?

Correlation coefficients are measures of the degree of relationship between two or more variables. When we talk about a relationship, we are talking about the manner in which the variables tend to vary together. For example, if one variable tends to increase at the same time that another variable increases, we would say there is a positive relationship between the two variables. If one variable tends to decrease as another variable increases, we would say that there is a negative relationship between the two variables. It is also possible that the variables might be unrelated to one another, so that there is no predictable change in one variable based on knowing about changes in the other variable.
As a child grows from an infant into a toddler into a young child, both the child's height and weight tend to change. Those changes are not always tightly locked to one another, but they do tend to occur together. So if we took a sample of children from a few weeks old to 3 years old and measured the height and weight of each child, we would likely see a positive relationship between the two.
A relationship between two variables does not necessarily mean that one variable causes the other. When we see a relationship, there are three possible causal interpretations. If we label the variables A and B, A could cause B, B could cause A, or some third variable (we will call it C) could cause both A and B. With the relationship between height and weight in children, it is likely that the general growth of children, which increases both height and weight, accounts for the observed correlation. It is very foolish to assume that the presence of a correlation implies a causal relationship between the two variables. There is an extended discussion of this issue in Chapter 7 of the text.

Scatter Plots and Linear Relationships

A helpful way to visualize a relationship between two variables is to construct a scatter plot, which you were briefly introduced to in our discussion of graphical techniques. A scatter plot represents each set of paired scores on a two-dimensional graph, in which the dimensions are defined by the variables. For example, if we wanted to create a scatter plot of our sample of 100 children for the variables of height and weight, we would start by drawing the X and Y axes, labeling one height and the other weight, and marking off the scales so that the range on these axes is sufficient to handle the range of scores in our sample. Let's suppose that our first child is 27 inches tall and 21 pounds. We would find the point on the weight axis that represents 21 pounds and the point on the height axis that represents 27 inches. Where these two points cross, we would put a dot that represents the combination of height and weight for that child, as sketched below.
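A minimal sketch of how such a scatter plot might be drawn (assuming Python with the matplotlib library and a small made-up sample, since the full data set of 100 children is not given here):

    import matplotlib.pyplot as plt

    # Hypothetical (height, weight) pairs standing in for the sample of children.
    heights = [24, 27, 30, 33, 36]   # inches
    weights = [14, 21, 24, 28, 31]   # pounds

    plt.scatter(heights, weights)    # one dot per child
    plt.xlabel("Height (inches)")
    plt.ylabel("Weight (pounds)")
    plt.title("Height versus weight")
    plt.show()

Each dot marks where a child's height and weight cross, and the upward drift of the dots reflects the positive relationship described above.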

 Pearson Product-Moment Correlation

The Pearson product-moment correlation was devised by Karl Pearson in 1895, and it is still the most widely used correlation coefficient. The history behind the mathematical development of this index is fascinating, but you need not know that history to understand how the Pearson correlation works.
The Pearson product-moment correlation is an index of the degree of linear relationship between two variables that are both measured on at least an ordinal scale of measurement. The index is structured so that a correlation of 0.00 means that there is no linear relationship, a correlation of +1.00 means that there is a perfect positive relationship, and a correlation of -1.00 means that there is a perfect negative relationship. As you move from zero to either end of this scale, the strength of the relationship increases. You can think of the strength of a linear relationship as how tightly the data points in a scatter plot cluster around a straight line. In a perfect relationship, either negative or positive, the points all fall on a single straight line. We will see examples of that later. The symbol for the Pearson correlation is a lowercase r, which is often subscripted with the names of the two variables. For example, rxy would stand for the correlation between the variables X and Y.
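Computationally, r is the sum of the cross-products of the deviations from the two means, divided by the square root of the product of the sums of squared deviations. A minimal Python sketch (the function and the height/weight numbers are illustrative assumptions, not taken from the text):

    import math

    def pearson_r(x, y):
        """Pearson product-moment correlation between two equal-length lists."""
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        syy = sum((yi - mean_y) ** 2 for yi in y)
        return sxy / math.sqrt(sxx * syy)

    # Hypothetical height/weight data; a strong positive r is expected.
    heights = [24, 27, 30, 33, 36]
    weights = [14, 21, 24, 28, 31]
    print(round(pearson_r(heights, weights), 2))   # approximately 0.99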

The Phi Coefficient

The Phi coefficient is an index of the degree of relationship between two variables that are measured on a nominal scale. Because variables measured on a nominal scale are simply classified by type, rather than measured in the more general sense, there is no such thing as a linear relationship. Nevertheless, it is possible to see if there is a relationship. For example, suppose you want to study the relationship between religious background and occupation. You have a classification system for religion that includes Catholic, Protestant, Muslim, Other, and Agnostic/Atheist. You have also developed a classification for occupations that includes Unskilled Laborer, Skilled Laborer, Clerical, Middle Manager, Small Business Owner, and Professional/Upper Management. You want to see whether the distribution of religious preferences differs by occupation, which is just another way of saying that there is a relationship between these two variables.
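In its simplest form, the phi coefficient is computed from a 2 x 2 table (two categories on each variable); the multi-category religion-by-occupation example above would call for an extension of the same idea. A minimal Python sketch with made-up cell counts:

    import math

    def phi_coefficient(a, b, c, d):
        """Phi for a 2 x 2 table with cell counts arranged as:
               a  b
               c  d
        """
        return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

    # Hypothetical counts: rows = two religious groups, columns = two occupation groups.
    print(round(phi_coefficient(30, 10, 15, 25), 2))   # approximately 0.38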

Advanced Correlational Techniques

Correlational techniques are immensely flexible and can be extended dramatically to solve various kinds of statistical problems. Covering the details of these advanced correlational techniques is beyond the scope of this text and website. However, we have included brief discussions of several advanced correlational techniques on the Student Resource Website, including multidimensional scaling, path analysis, taxonomic search techniques, and statistical analysis of neuroimages.

Nonlinear Correlational Procedures

The vast majority of correlational techniques used in psychology are linear correlations. However, there are times when one can expect to find nonlinear relationships and would like to apply statistical procedures to capture such complex relationships. This topic is far too complex to cover here. The interested student will want to consult advanced statistical textbooks that specialize in regression analyses. 
There are two words of caution that we want to offer about using such nonlinear correlational procedures. The first is that, although it is relatively easy to do the computations using modern statistical software, you should not use these procedures unless you actually understand them and their pitfalls. It is easy to misuse the techniques and to be fooled into believing things that are not true from a naive analysis of the output of computer programs.
The second word of caution is that there should be a strong theoretical reason to expect a nonlinear relationship if you are going to use nonlinear correlational procedures. Many psychophysiological processes are by their nature nonlinear, so using nonlinear correlations in studying those processes makes complete sense. But for most psychological processes, there is no good theoretical reason to expect a nonlinear relationship.
CONCLUSION
To summarize, five key reasons to study statistics are: to be able to conduct research effectively, to read and evaluate journal articles, to further develop critical thinking and analytic skills, to act as an informed consumer, and to know when you need to hire outside statistical help.




