Introduction to Linear Regression and Correlation Analysis
Goals
After this, you should be able to:
• Calculate and interpret the simple correlation between two variables
• Determine whether the correlation is significant
• Calculate and interpret the simple linear regression equation for a set of data
• Understand the assumptions behind regression analysis
• Determine whether a regression model is significant
• Calculate and interpret confidence intervals for the regression coefficients
• Recognize regression analysis applications for purposes of prediction and description
• Recognize some potential problems if regression analysis is used incorrectly
• Recognize …
Regression analysis is used to:
– Predict the value of a dependent variable based on the value of at least one independent variable
– Explain the impact of changes in an independent variable on the dependent variable
Dependent variable: the variable we wish to explain
Independent variable: the variable used to explain the dependent variable
Simple Linear Regression Model
• Only one independent variable, x
• Relationship between x and y is described by a linear function
• Changes in y are assumed to be caused by changes in x
Types of Regression Models
• Positive linear relationship
• Negative linear relationship
• Relationship not linear
• No relationship
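The relationship types listed above can be illustrated numerically with the sample correlation coefficient r. The sketch below is only an illustration: the function name pearson_r and the three small data sets are made up for this example, and the formula is the usual Pearson correlation, which the slides themselves do not spell out.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample (Pearson) correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data, chosen only to show each pattern.
x = [-2, -1, 0, 1, 2]
y_positive = [-3, -1, 1, 3, 5]   # y rises with x
y_negative = [5, 3, 1, -1, -3]   # y falls as x rises
y_curved = [4, 1, 0, 1, 4]       # y = x squared, related but not linear

r_positive = pearson_r(x, y_positive)
r_negative = pearson_r(x, y_negative)
r_curved = pearson_r(x, y_curved)
print(r_positive, r_negative, r_curved)
```

In the curved case r is zero even though y depends entirely on x, which is why a scatter plot should be checked before concluding that there is no relationship.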
Population Linear Regression
The population regression model:
y = β0 + β1x + ε
where:
• y is the dependent variable
• β0 is the population y intercept
• β1 is the population slope coefficient
• x is the independent variable
• ε is the random error term, or residual
β0 + β1x is the linear component; ε is the random error component.
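One way to get a feel for the population model is to generate data from it. The following is a minimal sketch, assuming normally distributed errors with mean zero. The function name simulate_population and the parameter values (beta0 = 2.0, beta1 = 0.5, sigma = 1.0) are hypothetical choices for illustration, not values from the slides.

```python
import random


def simulate_population(xs, beta0, beta1, sigma, seed=0):
    """Return y values drawn from y = beta0 + beta1*x + error.

    Each error is an independent draw from a normal distribution with
    mean 0 and standard deviation sigma (an assumption of this sketch).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]


xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = simulate_population(xs, beta0=2.0, beta1=0.5, sigma=1.0)

# With sigma = 0 the random error component vanishes and only the
# linear component beta0 + beta1*x remains.
ys_no_error = simulate_population(xs, beta0=2.0, beta1=0.5, sigma=0.0)

for x, y, line in zip(xs, ys, ys_no_error):
    print(x, round(y, 3), line, round(y - line, 3))  # last column is the error
```

The last printed column is the random error for each observation: the gap between the simulated y and the linear component at that x.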
Linear Regression Assumptions
• Error values (ε) are statistically independent
• Error values are normally distributed for any given value of x
• The probability distribution of the errors is normal
• The probability distribution of the errors has constant variance
• The underlying relationship between the x variable and the y variable is linear
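The constant-variance assumption above can be checked informally by comparing how spread out the residuals are at low and high values of x. The sketch below is a rough diagnostic under stated assumptions, not a formal test. The helper names residuals and spread_by_half, the data, and the line (b0 = 1, b1 = 2) are all hypothetical.

```python
from statistics import pstdev


def residuals(xs, ys, b0, b1):
    """Observed y minus the value predicted by the line b0 + b1*x."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def spread_by_half(xs, res):
    """Standard deviation of residuals in the lower-x and upper-x halves."""
    pairs = sorted(zip(xs, res))          # order residuals by x
    mid = len(pairs) // 2
    low = [r for _, r in pairs[:mid]]
    high = [r for _, r in pairs[mid:]]
    return pstdev(low), pstdev(high)


# Hypothetical data scattered around the line y = 1 + 2x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3.5, 4.5, 7.5, 8.5, 11.5, 12.5, 15.5, 16.5]

res = residuals(xs, ys, b0=1.0, b1=2.0)
low_spread, high_spread = spread_by_half(xs, res)
print(res)
print(low_spread, high_spread)
```

Similar spreads in the two halves are consistent with constant variance. A spread that grows or shrinks with x suggests the assumption may not hold.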
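Population Linear Regression (continued)
[Figure: the population regression model y = β0 + β1x + ε plotted against x, showing a line with intercept β0 and slope β1. For a given xi, the predicted value of y lies on the line, and the random error εi is the vertical distance between the observed value of y and that predicted value.]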
Estimated Regression Model
The sample regression line provides an estimate of the population regression line.
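A sample regression line can be computed from data with the standard least-squares formulas, b1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and b0 = ȳ − b1·x̄. The slides do not state these formulas at this point, so the sketch below is an assumption about the estimation method. The function names fit_line and predict and the five data points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for the line y-hat = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # estimated slope
    b0 = mean_y - b1 * mean_x    # estimated intercept
    return b0, b1


def predict(b0, b1, x):
    """Predicted value of y for a given x on the fitted line."""
    return b0 + b1 * x


# Hypothetical sample of five (x, y) observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
y_hat_6 = predict(b0, b1, 6)
print(round(b0, 4), round(b1, 4), round(y_hat_6, 4))
```

Here b0 and b1 play the role of sample estimates of β0 and β1. A different sample from the same population would generally give slightly different values.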