Linear Regression Models
SPSS for Windows®: Intermediate & Advanced Applied Statistics
Zayed University Office of Research, SPSS for Windows® Workshop Series
Presented by Dr. Maher Khelifa, Associate Professor, Department of Humanities and Social Sciences, College of Arts and Sciences
© Dr. Maher Khelifa
Bivariate Linear Regression (Simple Linear Regression)
Understanding Bivariate Linear Regression
Many statistical indices summarize information about particular phenomena under study.
For example, the Pearson correlation coefficient (r) summarizes the magnitude of a linear relationship between pairs of variables.
However, one major scientific research objective is to “explain”, “predict”, or …
The parameters β0 and β1 are constants describing the functional relationship in the population. The value of β1 identifies the change along the Y scale expected for every unit change in fixed values of X (it represents the slope, or degree of steepness). The value of β0 identifies an adjustment constant due to scale differences in measuring X and Y (the intercept, or the place on the Y axis through which the straight line passes; it is the value of Y when X = 0). ε (epsilon) represents an error component for each individual: the portion of the Y score that cannot be accounted for by its systematic relationship with values of X.
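As a rough illustration of how the intercept, slope, and error component combine, the following Python sketch generates Y scores from fixed X values. The parameter values, the normally distributed error, and the function names are assumptions made for this example only; none of them come from the workshop slides.

```python
import random


def systematic_part(beta0, beta1, x):
    """Predictable part of a Y score for a fixed X: beta0 + beta1 * x."""
    return beta0 + beta1 * x


def simulate_scores(beta0, beta1, xs, error_sd, seed=0):
    """Return one simulated Y per X as beta0 + beta1 * x + epsilon.

    epsilon is drawn from a normal distribution with mean 0 and standard
    deviation error_sd (an assumption made for this illustration).
    """
    rng = random.Random(seed)
    ys = []
    for x in xs:
        epsilon = rng.gauss(0.0, error_sd)  # error component for this individual
        ys.append(systematic_part(beta0, beta1, x) + epsilon)
    return ys


# Hypothetical population parameters and fixed X values.
beta0, beta1 = 2.0, 0.5
xs = [0, 1, 2, 3, 4]

# The intercept is the systematic value of Y when X = 0.
intercept_check = systematic_part(beta0, beta1, 0)

# The slope is the expected change in Y for a one-unit change in X.
slope_check = systematic_part(beta0, beta1, 3) - systematic_part(beta0, beta1, 2)

ys = simulate_scores(beta0, beta1, xs, error_sd=1.0)
for x, y in zip(xs, ys):
    print(x, round(y, 3), round(y - systematic_part(beta0, beta1, x), 3))
```

Each printed row shows X, the simulated Y, and the part of Y left over after the systematic relationship with X is removed, which is the error component for that row.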
• The formula Y = β0 + β1X + ε can be thought of as Yi = Y’ + εi, where β0 + β1Xi defines the predictable part of any Y score for fixed values of X, and Y’ is the predicted score.
• The mathematical equation for the sample general linear model is: Yi = b0 + b1Xi + ei.
• In this equation, the values of b0 and b1 can be thought of as the values that maximize the explanatory power or predictive accuracy of X in relation to Y. In maximizing explanatory power or predictive accuracy, these values minimize prediction error. If Y represents an individual’s score on the criterion variable and Y’ is the predicted score, then Y − Y’ = error score (e) or the …
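The idea that the sample coefficients minimize prediction error can be sketched in Python. This example assumes the ordinary least-squares criterion (minimizing the sum of squared error scores), which the visible slide text does not spell out. The data values and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Return (b0, b1) that minimize the sum of squared errors Y - Y'."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # sample slope
    b0 = mean_y - b1 * mean_x    # sample intercept
    return b0, b1


def error_scores(xs, ys, b0, b1):
    """Error score e = Y - Y' for each case, where Y' = b0 + b1 * X."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def sum_squared_error(xs, ys, b0, b1):
    """Total squared prediction error for a candidate line."""
    return sum(e ** 2 for e in error_scores(xs, ys, b0, b1))


# Hypothetical scores on a predictor X and a criterion Y.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
best = sum_squared_error(xs, ys, b0, b1)

# Any other line should predict at least as badly as the fitted one.
worse_slope = sum_squared_error(xs, ys, b0, b1 + 0.1)
worse_intercept = sum_squared_error(xs, ys, b0 + 0.1, b1)

print("b0 =", round(b0, 3), "b1 =", round(b1, 3))
print("SSE at fit:", round(best, 3))
print("SSE with shifted slope:", round(worse_slope, 3))
print("SSE with shifted intercept:", round(worse_intercept, 3))
```

Shifting either coefficient away from the fitted value increases the total squared error, which is the sense in which b0 and b1 maximize predictive accuracy under this criterion.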