Respond to the following in a minimum of 175 words:
An experimenter is examining the relationship between age and self-disclosure. A large sample of participants that are 25 to 35 years old and participants that are 65 to 75 years old are compared, and significant differences are found with younger participants disclosing much more than older people. The researcher reports an effect size of .34. What does this mean?
Part 2: See attachment… Week Six Homework Exercise.
Part 3: See attachment… Developmental Research Matrix. I only have to answer one question: “Describe the research method. What will you do? What instruments will you use to measure sexual attitudes?”
- Contrast the three ways of describing results: comparing group percentages, correlating scores, and comparing group means.
- Describe a frequency distribution, including the various ways to display a frequency distribution.
- Describe the measures of central tendency and variability.
- Define a correlation coefficient.
- Define effect size.
- Describe the use of a regression equation and a multiple correlation to predict behavior.
- Discuss how a partial correlation addresses the third-variable problem.
- Summarize the purpose of structural equation models.
Statistics help us understand data collected in research investigations in two ways: First, statistics are used to describe the data. Second, statistics are used to make inferences and draw conclusions, on the basis of sample data, about a population. We examine descriptive statistics and correlation in this chapter; inferential statistics are discussed in Chapter 13. This chapter will focus on the underlying logic and general procedures for making statistical decisions. Specific calculations for a variety of statistics are provided in Appendix C.
SCALES OF MEASUREMENT: A REVIEW
Before looking at any statistics, we need to review the concept of scales of measurement. Whenever a variable is studied, the researcher must create an operational definition of the variable and devise two or more levels of the variable. Recall from Chapter 5 that the levels of the variable can be described using one of four scales of measurement: nominal, ordinal, interval, and ratio. The scale used determines the types of statistics that are appropriate when the results of a study are analyzed. Also recall that the meaning of a particular score on a variable depends on which type of scale was used when the variable was measured or manipulated.
The levels of nominal scale variables have no numerical, quantitative properties. The levels are simply different categories or groups. Most independent variables in experiments are nominal, for example, as in an experiment that compares behavioral and cognitive therapies for depression. Variables such as gender, eye color, hand dominance, college major, and marital status are nominal scale variables; left-handed and right-handed people differ from each other, but not in a quantitative way.
Variables with ordinal scale levels exhibit minimal quantitative distinctions. We can rank order the levels of the variable being studied from lowest to highest. The clearest example of an ordinal scale is one that asks people to make rank-ordered judgments. For example, you might ask people to rank the most important problems facing your state today. If education is ranked first, health care second, and crime third, you know the order but you do not know how strongly people feel about each problem: Education and health care may be very close together in seriousness with crime a distant third. With an ordinal scale, the intervals between each of the items are probably not equal.
Interval scale and ratio scale variables have much more detailed quantitative properties. With an interval scale variable, the intervals between the levels are equal in size. The difference between 1 and 2 on the scale, for example, is the same as the difference between 2 and 3. Interval scales generally have five or more quantitative levels. You might ask people to rate their mood on a 7-point scale ranging from a “very negative” to a “very positive” mood. There is no absolute zero point that indicates an “absence” of mood.
In the behavioral sciences, it is often difficult to know precisely whether an ordinal or an interval scale is being used. However, it is often useful to assume that the variable is being measured on an interval scale because interval scales allow for more sophisticated statistical treatments than do ordinal scales. Of course, if the measure is a rank ordering (for example, a rank ordering of students in a class on the basis of popularity), an ordinal scale clearly is being used.
Ratio scale variables have both equal intervals and an absolute zero point that indicates the absence of the variable being measured. Time, weight, length, and other physical measures are the best examples of ratio scales. Interval and ratio scale variables are conceptually different; however, the statistical procedures used to analyze data with such variables are identical. An important implication of interval and ratio scales is that data can be summarized using the mean, or arithmetic average. It is possible to provide a number that reflects the mean amount of a variable—for example, the “average mood of people who won a contest was 5.1” or the “mean weight of the men completing the weight loss program was 187.7 pounds.”
Scales of measurement have important implications for the way that the results of research investigations are described and analyzed. Most research focuses on the study of relationships between variables. Depending on the way that the variables are studied, there are three basic ways of describing the results: (1) comparing group percentages, (2) correlating scores of individuals on two variables, and (3) comparing group means.
Comparing Group Percentages
Suppose you want to know whether males and females differ in their interest in travel. In your study, you ask males and females whether they like or dislike travel. To describe your results, you will need to calculate the percentage of females who like to travel and compare this with the percentage of males who like to travel. Suppose you tested 50 females and 50 males and found that 40 of the females and 30 of the males indicated that they like to travel. In describing your findings, you would report that 80% of the females like to travel in comparison with 60% of the males. Thus, a relationship between the gender and travel variables appears to exist. Note that we are focusing on percentages because the travel variable is nominal: Liking and disliking are simply two different categories.
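The percentage comparison above can be sketched in a few lines of Python. The counts come from the travel example; the names `liked`, `tested`, and `percent_liking` are illustrative choices, not part of the text.

```python
# Counts from the travel example: 50 people tested per group.
liked = {"female": 40, "male": 30}
tested = {"female": 50, "male": 50}


def percent_liking(group: str) -> float:
    """Return the percentage of a group that reported liking travel."""
    return 100 * liked[group] / tested[group]


for group in tested:
    print(f"{group}: {percent_liking(group):.0f}% like to travel")
```

Percentages are used here because the travel variable is nominal; there are no scores to average.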
After describing your data, the next step would be to perform a statistical analysis to determine whether there is a statistically significant difference between the males and females. Statistical significance is discussed in Chapter 13; statistical analysis procedures are described in Appendix C.
Correlating Individual Scores
A second type of analysis is needed when you do not have distinct groups of subjects. Instead, individuals are measured on two variables, and each variable has a range of numerical values. For example, we will consider an analysis of data on the relationship between location in a classroom and grades in the class: Do people who sit near the front receive higher grades?
Comparing Group Means
Much research is designed to compare the mean responses of participants in two or more groups. For example, in an experiment designed to study the effect of exposure to an aggressive adult, children in one group might observe an adult “model” behaving aggressively while children in a control group do not. Each child then plays alone for 10 minutes in a room containing a number of toys, while observers record the number of times the child behaves aggressively during play. Aggression is a ratio scale variable because there are equal intervals and a true zero on the scale.
In this case, you would be interested in comparing the mean number of aggressive acts by children in the two conditions to determine whether the children who observed the model were more aggressive than the children in the control condition. Hypothetical data from such an experiment in which there were 10 children in each condition are shown in Table 12.1; the scores in the table represent the number of aggressive acts by each child. In this case, the mean aggression score in the model group is 5.20 and the mean score in the no-model condition is 3.10.
TABLE 12.1 Scores on aggression measure in a hypothetical experiment on modeling and aggression
For all types of data, it is important to understand your results by carefully describing the data collected. We begin by constructing frequency distributions.
When analyzing results, researchers start by constructing a frequency distribution of the data. A frequency distribution indicates the number of individuals who receive each possible score on a variable. Frequency distributions of exam scores are familiar to most college students—they tell how many students received a given score on the exam. Along with the number of individuals associated with each response or score, it is useful to examine the percentage associated with this number.
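As a rough sketch, a frequency distribution can be built by counting how often each score occurs and converting each count to a percentage. The exam scores below are hypothetical and are not taken from the text; `frequency_distribution` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical exam scores for 10 students (not from the text).
scores = [7, 8, 8, 9, 9, 9, 10, 10, 6, 8]


def frequency_distribution(values):
    """Map each score to (count, percentage of all scores), in score order."""
    counts = Counter(values)
    n = len(values)
    return {
        score: (counts[score], 100 * counts[score] / n)
        for score in sorted(counts)
    }


dist = frequency_distribution(scores)
for score, (count, pct) in dist.items():
    print(f"score {score}: {count} student(s), {pct:.0f}%")
```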
Graphing Frequency Distributions
It is often useful to graphically depict frequency distributions. Let’s examine several types of graphs: pie chart, bar graph, and frequency polygon.
Pie charts Pie charts divide a whole circle, or “pie,” into “slices” that represent relative percentages. Figure 12.1 shows a pie chart depicting a frequency distribution in which 70% of people like to travel and 30% dislike travel. Because there are two pieces of information to graph, there are two slices in this pie. Pie charts are particularly useful when representing nominal scale information. In the figure, the number of people who chose each response has been converted to a percentage—the simple number could have been displayed instead, of course. Pie charts are most commonly used to depict simple descriptions of categories for a single variable. They are useful in applied research reports and articles written for the general public. Articles in scientific journals require more complex information displays.
Bar graphs Bar graphs use a separate and distinct bar for each piece of information. Figure 12.2 represents the same information about travel using a bar graph. In this graph, the x or horizontal axis shows the two possible responses. The y or vertical axis shows the number who chose each response, and so the height of each bar represents the number of people who responded to the “like” and “dislike” options.
Bar graph displaying data obtained in two groups
Frequency polygons Frequency polygons use a line to represent the distribution of frequencies of scores. This is most useful when the data represent interval or ratio scales as in the modeling and aggression data shown in Table 12.1. Here we have a clear numeric scale of the number of aggressive acts during the observation period. Figure 12.3 graphs the data from the hypothetical experiment using two frequency polygons—one for each group. The solid line represents the no-model group, and the dotted line stands for the model group.
Histograms A histogram uses bars to display a frequency distribution for a quantitative variable. In this case, the scale values are continuous and show increasing amounts on a variable such as age, blood pressure, or stress. Because the values are continuous, the bars are drawn next to each other. A histogram is shown in Figure 12.4 using data from the model group in Table 12.1.
What can you discover by examining frequency distributions? First, you can directly observe how your participants responded. You can see what scores are most frequent, and you can look at the shape of the distribution of scores. You can tell whether there are any outliers—scores that are unusual, unexpected, or very different from the scores of other participants. In an experiment, you can compare the distribution of scores in the groups.
Frequency polygons illustrating the distributions of scores in Table 12.1
Note: Each frequency polygon is anchored at scores that were not obtained by anyone (0 and 6 in the no-model group; 2 and 8 in the model group).
Histogram showing frequency of responses in the model group
In addition to examining the distribution of scores, you can calculate descriptive statistics. Descriptive statistics allow researchers to make precise statements about the data. Two statistics are needed to describe the data. A single number can be used to describe the central tendency, or how participants scored overall. Another number describes the variability, or how widely the distribution of scores is spread. These two numbers summarize the information contained in a frequency distribution.
A central tendency statistic tells us what the sample as a whole, or on the average, is like. There are three measures of central tendency: the mean, the median, and the mode. The mean of a set of scores is obtained by adding all the scores and dividing by the number of scores. It is symbolized as X̄; in scientific reports, it is abbreviated as M. The mean is an appropriate indicator of central tendency only when scores are measured on an interval or ratio scale, because the actual values of the numbers are used in calculating the statistic. In Table 12.1, the mean score for the no-model group is 3.10 and for the model group is 5.20. Note that the Greek letter Σ (sigma) in Table 12.1 is statistical notation for summing a set of numbers. Thus, ΣX is shorthand for “sum of the values in a set of scores.”
The median is the score that divides the group in half (with 50% scoring below and 50% scoring above the median). In scientific reports, the median is abbreviated as Mdn. The median is appropriate when scores are on an ordinal scale because it takes into account only the rank order of the scores. It is also useful with interval and ratio scale variables, however. The median for the no-model group is 3 and for the model group is 5.
The mode is the most frequent score. The mode is the only measure of central tendency that is appropriate if a nominal scale is used. The mode does not use the actual values on the scale, but simply indicates the most frequently occurring value. There are two modal values for the no-model group—3 and 4 occur equally frequently. The mode for the model group is 5.
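The three measures can be checked with Python's standard `statistics` module. The scores of Table 12.1 are not reproduced in this text, so the two lists below are illustrative values chosen to agree with the means, medians, and modes reported above; treat them as an assumption rather than the original table.

```python
import statistics

# Illustrative scores (assumed), chosen to match the reported statistics.
no_model = [1, 2, 2, 3, 3, 3, 4, 4, 4, 5]
model = [3, 4, 5, 5, 5, 5, 6, 6, 6, 7]


def central_tendency(scores):
    """Return the mean, the median, and every modal value of the scores."""
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "modes": statistics.multimode(scores),  # lists all ties
    }


for name, scores in (("no-model", no_model), ("model", model)):
    print(name, central_tendency(scores))
```

`multimode` is used instead of `mode` because the no-model group has two equally frequent values.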
The median or mode can be a better indicator of central tendency than the mean if a few unusual scores bias the mean. For example, the median family income of a county or state is usually a better measure of central tendency than the mean family income. Because a relatively small number of individuals have extremely high incomes, using the mean would make it appear that the “average” person makes more money than is actually the case.
We can also determine how much variability exists in a set of scores. A measure of variability is a number that characterizes the amount of spread in a distribution of scores. One such measure is the standard deviation, symbolized as s, which indicates the average deviation of scores from the mean. Income is a good example. The Census Bureau reports that the median U.S. household income in 2012 was $53,046 (http://quickfacts.census.gov/qfd/states/00000.html). Suppose that you live in a community that matches the U.S. median and there is very little variation around that median (i.e., every household earns something close to $53,046); your community would have a smaller standard deviation in household income compared to another community in which the median income is the same but there is a lot more variation (e.g., where many people earn $15,000 per year and many others $5 million per year). It is possible for measures of central tendency in two communities to be close with the variability differing substantially.
In scientific reports, the standard deviation is abbreviated as SD. It is derived by first calculating the variance, symbolized as s² (the standard deviation is the square root of the variance). The standard deviation of a set of scores is small when most people have similar scores close to the mean. The standard deviation becomes larger as more people have scores that lie farther from the mean value. For the model group, the standard deviation is 1.14, which tells us that most scores in that condition lie within 1.14 units above or below the mean, that is, between 4.06 and 6.34. Thus, the mean and the standard deviation provide a great deal of information about the distribution. Note that, as with the mean, the calculation of the standard deviation uses the actual values of the scores; thus, the standard deviation is appropriate only for interval and ratio scale variables.
Another measure of variability is the range, which is simply the difference between the highest score and the lowest score. The range for both the model and no-model groups is 4.
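The variance, standard deviation, and range can be sketched as follows. The score lists are assumed, illustrative values consistent with the statistics reported above (the table itself is not reproduced here), and the sample formula with n − 1 in the denominator is assumed because it reproduces the reported standard deviation of 1.14 for the model group.

```python
import statistics

# Illustrative scores (assumed), chosen to match the reported statistics.
no_model = [1, 2, 2, 3, 3, 3, 4, 4, 4, 5]
model = [3, 4, 5, 5, 5, 5, 6, 6, 6, 7]


def variability(scores):
    """Return the sample variance, sample standard deviation, and range."""
    return {
        "variance": statistics.variance(scores),  # divides by n - 1
        "sd": statistics.stdev(scores),           # square root of variance
        "range": max(scores) - min(scores),
    }


for name, scores in (("no-model", no_model), ("model", model)):
    stats = variability(scores)
    print(f"{name}: SD = {stats['sd']:.2f}, range = {stats['range']}")
```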
Graphing relationships between variables was discussed briefly in Chapter 4. A common way to graph relationships between variables is to use a bar graph or a line graph. Figure 12.5 is a bar graph depicting the means for the model and no-model groups. The levels of the independent variable (no-model and model) are represented on the horizontal x axis, and the dependent variable values are shown on the vertical y axis. For each group, a point is placed along the y axis that represents the mean for the groups, and a bar is drawn to visually represent the mean value. Bar graphs are used when the values on the x axis are nominal categories (e.g., a no-model and a model condition). Line graphs are used when the values on the x axis are numeric (e.g., marijuana use over time, as shown in Figure 7.1). In line graphs, a line is drawn to connect the data points to represent the relationship between the variables.
Graph of the results of the modeling experiment showing mean aggression scores
Choosing the scale for a bar graph allows a common manipulation that is sometimes used by scientists and all too commonly used by advertisers. The trick is to exaggerate the distance between points on the measurement scale to make the results appear more dramatic than they really are. Suppose, for example, that a cola company (cola A) conducts a taste test that shows 52% of the participants prefer cola A and 48% prefer cola B. How should the cola company present these results? The two bar graphs in Figure 12.6 show the most honest method, as well as one that is considerably more dramatic. It is always wise to look carefully at the numbers on the scales depicted in graphs.
Two ways to graph the same data
CORRELATION COEFFICIENTS: DESCRIBING THE STRENGTH OF RELATIONSHIPS
It is important to know whether a relationship between variables is relatively weak or strong. A correlation coefficient is a statistic that describes how strongly variables are related to one another. You are probably most familiar with the Pearson product-moment correlation coefficient, which is used when both variables have interval or ratio scale properties. The Pearson product-moment correlation coefficient is called the Pearson r. Values of a Pearson r can range from 0.00 to ±1.00. Thus, the Pearson r provides information about the strength of the relationship and the direction of the relationship. A correlation of 0.00 indicates that there is no relationship between the variables. The nearer a correlation is to 1.00 (plus or minus), the stronger is the relationship. Indeed, a 1.00 correlation is sometimes called a perfect relationship because the two variables go together in a perfect fashion. The sign of the Pearson r tells us about the direction of the relationship; that is, whether there is a positive relationship or a negative relationship between the variables.
Data from studies examining similarities of intelligence test scores among siblings illustrate the connection between the magnitude of a correlation coefficient and the strength of a relationship. The correlation between scores of monozygotic (identical) twins reared together is .86 and the correlation for monozygotic twins reared apart is .74, demonstrating a strong similarity of test scores in these pairs of individuals. The correlation for dizygotic (fraternal) twins reared together is less strong, at .59. The correlation among non-twin siblings raised together is .46, and the correlation among non-twin siblings reared apart is .24. Data such as these are important in ongoing research on the relative influence of heredity and environment on intelligence (Devlin, Daniels, & Roeder, 1997; Kaplan, 2012).
There are several different types of correlation coefficients. Each coefficient is calculated somewhat differently depending on the measurement scale that applies to the two variables. As noted, the Pearson r correlation coefficient is appropriate when the values of both variables are on an interval or ratio scale. We will now focus on the details of the Pearson product-moment correlation coefficient.
Pearson r Correlation Coefficient
To calculate a correlation coefficient, we need to obtain pairs of observations from each subject. Thus, each individual has two scores, one on each of the variables. Table 12.2 shows fictitious data for 10 students measured on the variables of classroom seating pattern and exam grade. Students in the first row receive a seating score of 1, those in the second row receive a 2, and so on. Once we have made our observations, we can see whether the two variables are related. Do the variables go together in a systematic fashion?
TABLE 12.2 Pairs of scores for 10 participants on seating pattern and exam scores (fictitious data)
The Pearson r provides two types of information about the relationship between the variables. The first is the strength of the relationship; the second is the direction of the relationship. As noted previously, the values of r can range from 0.00 to ±1.00. The absolute size of r is the coefficient that indicates the strength of the relationship. A value of 0.00 indicates that there is no relationship. The nearer r is to 1.00 (plus or minus), the stronger is the relationship. The plus and minus signs indicate whether there is a positive linear or negative linear relationship between the two variables. It is important to remember that it is the size of the correlation coefficient, not the sign, that indicates the strength of the relationship. Thus, a correlation coefficient of −.54 indicates a stronger relationship than does a coefficient of +.45.
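A minimal sketch of the calculation, using the standard deviation-score formula for the Pearson r, may make the two kinds of information concrete. The seating and exam values are hypothetical stand-ins, not the data of Table 12.2, and `pearson_r` is an illustrative name.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation for paired scores x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical data: seating row and exam score for five students.
rows = [1, 2, 3, 4, 5]
exams = [95, 85, 80, 70, 65]

r = pearson_r(rows, exams)
print(f"r = {r:.2f}")  # the sign gives direction, the size gives strength

# Strength depends on absolute size, not on sign.
print(abs(-0.54) > abs(0.45))
```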
Scatterplots The data in Table 12.2 can be visualized in a scatterplot in which each pair of scores is plotted as a single point in a diagram. Figure 12.7 shows two scatterplots. The values of the first variable are depicted on the x axis, and the values of the second variable are shown on the y axis. These scatterplots show a perfect positive relationship (+1.00) and a perfect negative relationship (−1.00). You can easily see why these are perfect relationships: The scores on the two variables fall on a straight line that is on the diagonal of the diagram. Each person’s score on one variable correlates precisely with his or her score on the other variable. If we know an individual’s score on one of the variables, we can predict exactly what his or her score will be on the other variable. Such “perfect” relationships are rarely observed in reality.
Scatterplots of perfect (±1.00) relationships
The scatterplots in Figure 12.8 show patterns of correlation you are more likely to encounter in exploring research findings. The first diagram shows pairs of scores with a positive correlation of +.65; the second diagram shows a negative relationship, −.77. The data points in these two scatterplots reveal a general pattern of either a positive or negative relationship, but the relationships are not perfect. You can make a general prediction in the first diagram, for instance, that the higher the score on one variable, the higher the score on the second variable. However, even if you know a person’s score on the first variable, you cannot perfectly predict what that person’s score will be on the second variable. To confirm this, take a look at value 1 on variable x (the horizontal axis) in the positive scatterplot. Looking along the vertical y axis, you will see that two individuals had a score of 1. One of these had a score of 1 on variable y, and the other had a score of 3. The data points do not fall on the perfect diagonal shown in Figure 12.7. Instead, there is a variation (scatter) from the perfect diagonal line.
Scatterplots depicting patterns of correlation
The third diagram shows a scatterplot in which there is absolutely no correlation (r = 0.00). The points fall all over the diagram in a completely random pattern. Thus, scores on variable x are not related to scores on variable y.
The fourth diagram has been left blank so that you can plot the scores from the data in Table 12.2. The x (horizontal) axis has been labeled for the seating pattern variable, and the y (vertical) axis for the exam score variable. To complete the scatterplot, you will need to plot the 10 pairs of scores. For each individual in the sample, find the score on the seating pattern variable; then go up from that point until you are level with that person’s exam score on the y axis. A point placed there will describe the score on both variables. There will be 10 points on the finished scatterplot.
The correlation coefficient calculated from these data shows a negative relationship between the variables (r = −.88). In other words, as the seating distance from the front of the class increases, the exam score decreases. Although these data are fictitious, a negative relationship has been reported in research on this topic (Benedict & Hoag, 2004; Brooks & Rebata, 1991).
Restriction of range It is important that the researcher sample from the full range of possible values of both variables. If the range of possible values is restricted, the magnitude of the correlation coefficient is reduced. For example, if the range of seating pattern scores is restricted to the first two rows, you will not get an accurate picture of the relationship between seating pattern and exam score. In fact, when only scores of students sitting in the first two rows are considered, the correlation between the two variables is exactly 0.00. With a restricted range comes restricted variability in the scores and thus less variability that can be explained. Figure 12.9 illustrates a scatterplot with the entire range of X values represented and with a portion of those values missing because of restriction of range.
The problem of restriction of range occurs when the individuals in your sample are very similar on the variable you are studying. If you are studying age as a variable, for instance, testing only 6- and 7-year-olds will reduce your chances of finding age effects. Likewise, trying to study the correlates of intelligence will be almost impossible if everyone in your sample is very similar in intelligence (e.g., the senior class of a prestigious private college).
Left scatterplot—positive correlation with entire range of values. Right scatterplot—no correlation with restricted range of values
Curvilinear relationship The Pearson product-moment correlation coefficient (r) is designed to detect only linear relationships. If the relationship is curvilinear, as in the scatterplot shown in Figure 12.10, the correlation coefficient will not indicate the existence of a relationship. The Pearson r correlation coefficient calculated from these data is exactly 0.00, even though the two variables clearly are related.
Because a relationship may be curvilinear, it is important to construct a scatterplot in addition to looking at the magnitude of the correlation coefficient. The scatterplot is valuable because it gives a visual indication of the shape of the relationship. Computer programs for statistical analysis will usually display scatterplots and can show you how well the data fit to a linear or curvilinear relationship. When the relationship is curvilinear, another type of correlation coefficient must be used to determine the strength of the relationship.
Scatterplot of a curvilinear relationship (Pearson product-moment correlation coefficient = 0.00)
We have presented the Pearson r correlation coefficient as the appropriate way to describe the relationship between two variables with interval or ratio scale properties. Researchers also want to describe the strength of relationships between variables in all studies. Effect size refers to the strength of association between variables. The Pearson r correlation coefficient is one indicator of effect size; it indicates the strength of the linear association between two variables. In an experiment with two or more treatment conditions, other types of correlation coefficients can be calculated to indicate the magnitude of the effect of the independent variable on the dependent variable. For example, in our experiment on the effects of witnessing an aggressive model on children’s aggressive behavior, we compared the means of two groups. In addition to knowing the means, it is valuable to know the effect size. An effect size correlation coefficient can be calculated for the modeling and aggression experiment. In this case, the effect size correlation value is .69. As with all correlation coefficients, the values of this effect size correlation can range from 0.00 to 1.00 (we do not need to worry about the direction of relationship, so plus and minus values are not used).
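One common way to obtain an effect size correlation for two groups is to convert the t statistic, using r = √(t² / (t² + df)). The text does not state which formula produced .69, so this is an assumption, and the score lists are illustrative values consistent with the reported means and standard deviation rather than the original table. Under those assumptions the sketch reproduces the reported value.

```python
import math
import statistics

# Illustrative scores (assumed), chosen to match the reported statistics.
no_model = [1, 2, 2, 3, 3, 3, 4, 4, 4, 5]
model = [3, 4, 5, 5, 5, 5, 6, 6, 6, 7]


def effect_size_r(group_a, group_b):
    """Effect size r from an independent-groups t test (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    # Squaring t discards the sign, so the result lies between 0 and 1.
    return math.sqrt(t ** 2 / (t ** 2 + df))


print(f"effect size r = {effect_size_r(model, no_model):.2f}")
```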
The advantage of reporting effect size is that it provides us with a scale of values that is consistent across all types of studies. The values range from 0.00 to 1.00, irrespective of the variables used, the particular research design selected, or the number of participants studied. You might be wondering what correlation coefficients should be considered indicative of small, medium, and large effects. A general guide is that correlations near .15 (about .10 to .20) are considered small, those near .30 are medium, and correlations above .40 are large.
It is sometimes preferable to report the squared value of a correlation coefficient; instead of r, you will see r². Thus, if the obtained r = .50, the reported r² = .25. Why transform the value of r? The reason is that the transformation changes the obtained r to a percentage. The percentage value represents the percent of variance in one variable that is accounted for by the second variable. Values of r² can range from 0.00 (0%) to 1.00 (100%). The r² value is sometimes referred to as the percent of shared variance between the two variables. What does this mean, exactly? Recall the concept of variability in a set of scores: if you measured the weight of a random sample of American adults, you would observe variability in that weights would range from relatively low weights to relatively high weights. If you are studying factors that contribute to people’s weight, you would want to examine the relationship between weights and scores on the contributing variable. One such variable might be gender: In actuality, the correlation between gender and weight is about .70 (with males weighing more than females). That means that 49% (squaring .70) of the variability in weight is accounted for by variability in gender. You have therefore explained 49% of the variability in the weights, but there is still 51% of the variability that is not accounted for. This variability might be accounted for by other variables, such as the weights of the biological mother and father, prenatal stress, diet, and exercise. In an ideal world, you could account for 100% of the variability in weights if you had enough information on all other variables that contribute to people’s weights: Each variable would make an incremental contribution until all the variability is accounted for.
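The conversion from a correlation to a percentage of shared variance is a single squaring step, sketched below. The first two values come from the passage above and .34 is the effect size in the age and self-disclosure question; `shared_variance_percent` is an illustrative name.

```python
def shared_variance_percent(r: float) -> float:
    """Percent of variance in one variable accounted for by the other."""
    return 100 * r ** 2


for r in (0.50, 0.70, 0.34):
    explained = shared_variance_percent(r)
    print(f"r = {r:.2f}: {explained:.1f}% shared, "
          f"{100 - explained:.1f}% unaccounted for")
```

Squaring shrinks every value below 1.00, so even a correlation that counts as medium by the guide above leaves most of the variability unexplained.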
Regression equations are calculations used to predict a person’s score on one variable when that person’s score on another variable is already known. They are essentially “prediction equations” that are based on known information about the relationship between the two variables. For example, after discovering that seating pattern and exam score are related, a regression equation may be calculated that predicts anyone’s exam score based only on information about where the person sits in the class. The general form of a regression equation is:
Y = a + bX
where Y is the score we wish to predict, X is the known score, a is a constant, and b is a weighting adjustment factor that is multiplied by X (it is the slope of the line created with this equation). In our seating–exam score example, the following regression equation is calculated from the data:
Y = 99 + (−8)X
Thus, if we know a person’s score on X (seating), we can insert that into the equation and predict what that person’s exam score (Y) will be. If the person’s X score is 2 (by sitting in the second row), we can predict that Y = 99 + (−8)(2) = 99 + (−16), or that the person’s exam score will be 83. Through the use of regression equations such as these, colleges can use SAT scores to predict college grades.
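The prediction step can be written as a small Python function. This is a minimal sketch: the name `predict_exam_score` is hypothetical, the constant of 99 comes from the text, and the slope of −8 is inferred from the text's value of −16 for a person in row 2.

```python
def predict_exam_score(row, a=99.0, b=-8.0):
    """Predict exam score Y from seating row X using Y = a + bX.

    a = 99 is taken from the text; b = -8 is inferred from the text's
    worked example, where X = 2 contributes -16.
    """
    return a + b * row


print(predict_exam_score(2))  # 83.0
```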
When researchers are interested in predicting some future behavior (called the criterion variable) on the basis of a person’s score on some other variable (called the predictor variable), it is first necessary to demonstrate that there is a reasonably high correlation between the criterion and predictor variables. The regression equation then provides the method for making predictions on the basis of the predictor variable score only.
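The text does not show how the constant a and the weight b are obtained from data. One common approach is ordinary least squares, sketched below in Python. The function name `fit_regression_line` and the four data points are made up for illustration; the scores were chosen so that the fitted line reproduces the seating equation discussed earlier.

```python
def fit_regression_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares, both as deviations from the means.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    b = sum_xy / sum_xx        # slope: weight applied to the predictor
    a = mean_y - b * mean_x    # constant: predicted Y when X is 0
    return a, b


# Hypothetical data: seating row (predictor) and exam score (criterion).
rows = [1, 2, 3, 4]
scores = [92, 80, 78, 66]

a, b = fit_regression_line(rows, scores)
print(a, b)  # 99.0 -8.0
```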
Thus far we have focused on the correlation between two variables at a time. Researchers recognize that a number of different variables may be related to a given behavior (this is the same point noted above in the discussion of factors that contribute to weight). A technique called multiple correlation is used to combine a number of predictor variables to increase the accuracy of prediction of a given criterion or outcome variable.
A multiple correlation (symbolized as R to distinguish it from the simple r) is the correlation between a combined set of predictor variables and a single criterion variable. Taking all of the predictor variables into account usually permits greater accuracy of prediction than if any single predictor is considered alone. For example, applicants to graduate school in psychology could be evaluated on a combined set of predictor variables using multiple correlation. The predictor variables might be (1) college grades, (2) scores on the Graduate Record Exam Aptitude Test, (3) scores on the Graduate Record Exam Psychology Test, and (4) favorability of letters of recommendation. No one of these factors is a perfect predictor of success in graduate school, but this combination of variables can yield a more accurate prediction. The multiple correlation is usually higher than the correlation between any one of the predictor variables and the criterion or outcome variable.
In actual practice, predictions would be made with an extension of the regression equation technique discussed previously. A multiple regression equation can be calculated that takes the following form:
Y = a + b1X1 + b2X2 + ... + bnXn
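where Y is the criterion variable, X1 to Xn are the predictor variables, a is a constant, and b1 to bn are weights that are multiplied by scores on the predictor variables. For example, a regression equation for graduate school admissions would be:
Predicted success in graduate school = a + b1(college grades) + b2(score on GRE Aptitude Test) + b3(score on GRE Psychology Test) + b4(favorability of recommendation letters)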
Researchers use multiple regression to study basic research topics. For example, Ajzen and Fishbein (1980) developed a model called the “theory of reasoned action” that uses multiple correlation and regression to predict specific behavioral intentions (e.g., to attend church on Sunday, buy a certain product, or join an alcohol recovery program) on the basis of two predictor variables. These are (1) attitude toward the behavior and (2) perceived normative pressure to engage in the behavior. Attitude is one’s own evaluation of the behavior, and normative pressure comes from other people such as parents and friends. In one study, Codd and Cohen (2003) found that the multiple correlation between college students’ intention to seek help for alcohol problems and the combined predictors of attitude and norm was .35. The regression equation was as follows:
This equation is somewhat different from those described previously. In basic research, you are not interested in predicting an exact score (such as an exam score or GPA), and so the mathematical calculations can assume that all variables are measured on the same scale. When this is done, the weighting factor reflects the magnitude of the correlation between the criterion variable and each predictor variable. In the help-seeking example, the weight for the attitude predictor is somewhat higher than the weight for the norm predictor; this shows that, in this case, attitudes are more important as a predictor of intention than are norms. However, for other behaviors, attitudes may be less important than norms.
It is also possible to visualize the regression equation. In the help-seeking example, the relationships among variables could be diagrammed as follows:
You should note that the squared multiple correlation coefficient (R²) is interpreted in much the same way as the squared correlation coefficient (r²). That is, R² tells you the percentage of variability in the criterion variable that is accounted for by the combined set of predictor variables. Again, this value will be higher than that of any of the single predictors by themselves.
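For the special case of two predictors, the standardized weights and the multiple correlation can be computed directly from the three simple correlations. The Python sketch below uses the standard two-predictor formulas; the function name `two_predictor_regression` and the three correlations are hypothetical and are not taken from any study cited here.

```python
from math import sqrt


def two_predictor_regression(r_y1, r_y2, r_12):
    """Standardized weights, R squared, and R for a criterion and two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion
    r_12: correlation between the two predictors
    """
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared, sqrt(r_squared)


# Hypothetical correlations, chosen only for illustration.
beta1, beta2, r_squared, multiple_r = two_predictor_regression(0.40, 0.30, 0.20)
print(round(beta1, 2), round(beta2, 2), round(r_squared, 2), round(multiple_r, 2))
```

With these made-up values, the multiple correlation (about .46) is higher than either simple correlation (.40 and .30), and the first predictor carries the larger standardized weight.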
THE THIRD-VARIABLE PROBLEM
Researchers face the third-variable problem in nonexperimental research when some uncontrolled third variable may be responsible for the relationship between the two variables of interest. For example, a finding that people who exercise more have lower anxiety levels could be due to a third variable such as income. High income may cause both exercise and low anxiety (see Chapter 4). The problem does not exist in properly designed and executed experimental research, because all extraneous variables are controlled either by keeping the variables constant or by using randomization.
A technique called partial correlation provides a way of statistically controlling third variables. A partial correlation is a correlation between the two variables of interest, with the influence of the third variable removed from, or “partialed out of,” the original correlation. This provides an indication of what the correlation between the primary variables would be if the third variable were held constant. This is not the same as actually keeping the variable constant, but it is a useful approximation.
Suppose a researcher is interested in a measure of number of bedrooms per person as an index of household crowding—a high number indicates that more space is available for each person in the household. After obtaining this information, the researcher gives a cognitive test to children living in these households. The correlation between bedrooms per person and test scores is .50. Thus, children in more spacious houses score higher on the test. The researcher suspects that a third variable may be operating. Social class could influence both housing and performance on this type of test. If social class is measured, it can be included in a partial correlation calculation that looks at the relationship between bedrooms per person and test scores with social class held constant. To calculate a partial correlation, you need to have scores on the two primary variables of interest and the third variable that you want to examine.
When a partial correlation is calculated, you can compare the partial correlation with the original correlation to see if the third variable did have an effect. Is our original correlation of .50 substantially reduced when social class is held constant? Figure 12.11 shows two different partial correlations. In both, there is a .50 correlation between bedrooms per person and test score. The first partial correlation between bedrooms per person and test scores drops to .09 when social class is held constant because social class is so highly correlated with the primary variables. However, the partial correlation in the second example remains high at .49 because the correlations with social class are relatively small. Thus, the outcome of the partial correlation depends on the magnitude of the correlations between the third variable and the two variables of primary interest.
Figure 12.11. Two partial correlations between bedrooms per person and performance
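The way a partial correlation depends on the third variable can be sketched in Python with the standard first-order partial correlation formula. The function name `partial_correlation` is hypothetical. The .50 correlation comes from the text, but the correlations with social class (.60 and .75 in the first case, .10 and .15 in the second) are assumed values, since the figure's numbers do not appear in the text; they were chosen because they reproduce the reported partial correlations of .09 and .49.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y with third variable Z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# X = bedrooms per person, Y = test score, Z = social class.
# r_xy = .50 is from the text; the correlations with Z are assumptions.
strong_third_variable = partial_correlation(0.50, 0.60, 0.75)
weak_third_variable = partial_correlation(0.50, 0.10, 0.15)

print(round(strong_third_variable, 2))  # 0.09
print(round(weak_third_variable, 2))    # 0.49
```

When the third variable correlates strongly with both primary variables, most of the original correlation disappears; when it correlates weakly, the partial correlation stays close to the original .50.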
STRUCTURAL EQUATION MODELING
Advances in statistical methods have resulted in a complex set of techniques to examine models that specify a set of relationships among variables using quantitative nonexperimental methods (see Raykov & Marcoulides, 2000; Ullman, 2007). Structural equation modeling (SEM) is a general term to refer to these techniques. The methods of SEM are beyond the scope of this book, but you will likely encounter some research findings that use SEM; thus, it is worthwhile to provide an overview. A model is an expected pattern of relationships among a set of variables. The proposed model is based on a theory of how the variables are causally related to one another. After data have been collected, statistical methods can be applied to examine how closely the proposed model actually “fits” the obtained data.
Researchers typically present path diagrams to visually represent the models being tested. Such diagrams show the theoretical causal paths among the variables. The multiple regression diagram on attitudes and intentions shown previously is a path diagram of a very simple model. The theory of reasoned action evolved into a more complex “theory of planned behavior” that has an additional construct to predict behavior. Huchting, Lac, and LaBrie (2008) studied alcohol consumption among 247 sorority members at a private university. They measured four variables at the beginning of the study: (1) attitude toward alcohol consumption based on how strongly the women believed that consuming alcohol had positive consequences, (2) subjective norm or perceived alcohol consumption of other sorority members, (3) perceived lack of behavioral control based on beliefs about the degree of difficulty in refraining from drinking alcohol, and (4) intention to consume alcohol based on the amount that the women expected to consume during the next 30 days. The sorority members were contacted one month later to provide a measure of actual behavior—the amount of alcohol consumed during the previous 30 days.
The theory of planned behavior predicts that attitude, subjective norm, and behavioral control will each predict the behavioral intention to consume alcohol. Intention will in turn predict actual behavior. The researchers used structural equation modeling techniques to study this model. It is easiest to visualize the results using the path diagram shown in Figure 12.12. In the path diagram, arrows leading from one variable to another depict the paths that relate the variables in the model. The arrows indicate a causal sequence. Note that the model specifies that attitude, subjective norm, and behavioral control are related to intention and that intention in turn causes actual behavior. The statistical analysis provides what are termed path coefficients; these are similar to the standardized weights derived in the regression equations described previously. They indicate the strength of a relationship on our familiar −1.00 to +1.00 scale.
In Figure 12.12, you can see that both attitude and subjective norm were significant predictors of intention to consume alcohol. However, the predicted relationship between behavioral control and intention was not significant (therefore, the path is depicted as a dashed line). Intention was strongly related to actual behavior. Note also that behavioral control had a direct path to behavior; this indicates that difficulty in controlling alcohol consumption is directly related to actual consumption.
Figure 12.12. Structural model based on data from Huchting, Lac, and LaBrie (2008)
Besides illustrating how variables are related, a final application of SEM in the Huchting et al. study was to evaluate how closely the obtained data fit the specified model. The researchers concluded that the model did in fact closely fit the data. The lack of a path from behavioral control to intention is puzzling and will lead to further research.
There are many other applications of SEM. For example, researchers can compare two competing models in terms of how well each fits obtained data. Researchers can also examine much more complex models that contain many more variables. These techniques allow us to study nonexperimental data in more complex ways. This type of research leads to a better understanding of the complex networks of relationships among variables.
In the next chapter we turn from description of data to making decisions about statistical significance. These two topics are of course related. The topic of effect size that was described in this chapter is also very important when evaluating statistical significance.
Bar graph (p. 246)
Central tendency (p. 248)
Correlation coefficient (p. 251)
Criterion variable (p. 257)
Descriptive statistics (p. 248)
Effect size (p. 256)
Frequency distribution (p. 246)
Frequency polygons (p. 247)
Histogram (p. 247)
Interval scales (p. 243)
Mean (p. 248)
Median (p. 248)
Mode (p. 249)
Multiple correlation (p. 258)
Nominal scales (p. 243)
Ordinal scales (p. 243)
Partial correlation (p. 260)
Pearson product-moment correlation coefficient (p. 251)
Pie chart (p. 246)
Predictor variable (p. 257)
Range (p. 249)
Ratio scales (p. 244)
Regression equations (p. 257)
Restriction of range (p. 254)
Scatterplot (p. 252)
Standard deviation (p. 249)
Structural equation modeling (SEM) (p. 261)
Variability (p. 249)
Variance (p. 249)