Multiple Regression (Part 3) Diagnostics + Solutions - R


https://www.r-bloggers.com/multiple-regression-part-3-diagnostics/
http://www.r-exercises.com/2017/03/11/multiple-regression-part-3-diagnostics-solutions/



In the exercises below we cover some more material on multiple regression diagnostics in R. This includes added-variable (partial-regression) plots, component+residual (partial-residual) plots, CERES plots, VIF values, tests for heteroscedasticity (non-constant variance), tests for Normality, and a test for autocorrelation of residuals. These are perhaps not as common as what we have seen in Multiple Regression (Part 2), but their help in investigating our model’s assumptions is valuable.
Answers to the exercises are available here.
If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.
Multiple Regression (Part 2) Diagnostics can be found here.
As usual, we will be using the dataset state.x77, which is part of the state datasets available in R. (Additional information about the dataset can be obtained by running help(state.x77).)
First, please run the following code to obtain and format the data as usual:
data(state)
state77 <- as.data.frame(state.x77)
names(state77)[4] <- "Life.Exp"
names(state77)[6] <- "HS.Grad"
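As a quick check (an addition, not part of the original text), you can confirm that the fourth and sixth columns were renamed:
names(state77)   #the 4th and 6th entries should now read "Life.Exp" and "HS.Grad"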
Exercise 1
For the model with Life.Exp as dependent variable, and HS.Grad and Murder as predictors, suppose we would like to study the marginal effect of each predictor variable, given that the other predictor is in the model.
a. Use a function from the car package to obtain added-variable (partial-regression) plots for this purpose.
b. Re-create the added-variable plots from part a., labeling the two most influential points in the plots (according to Mahalanobis distance).




Learn more about multiple linear regression in the online course Linear regression in R for Data Scientists. In this course you will learn how to:
  • Model basic and complex real world problems using linear regression
  • Understand when models are performing poorly and correct it
  • Design complex models for hierarchical data
  • And much more
Exercise 2
a. Illiteracy is highly correlated with both HS.Grad and Murder. To illustrate problems that occur when multicollinearity exists, suppose we would like to study the marginal effect of Illiteracy (only), given that HS.Grad and Murder are in the model. Use a function from the car package to get the relevant added-variable plot.
b. From the correlation matrix in the previous Exercise Set, we know that Population and Area are the least strongly correlated variables with Life.Exp. Create added-variable plots for each of these two variables, given that all other six variables are in the model.
Exercise 3
Consider the model with HS.Grad, Murder, Income, and Area as predictors. Create component+residual (partial-residual) plots for this model.
Exercise 4
Create CERES plots for the model inwards Exercise 3.
Exercise 5
As an example of high collinearities, compute VIF (Variance Inflation Factor) values for a model with Life.Exp as the response that includes all the variables as predictors. Which variables appear to be causing the most problems?
Exercise 6
Using a function from the package lmtest, conduct a Breusch-Pagan test for heteroscedasticity (non-constant variance) for the model in Exercise 1.
Exercise 7
Re-do the test in the previous exercise by using a function from the car package.
Exercise 8
The test in Exercise 6 (and 7) is for linear forms of heteroscedasticity. To test for nonlinear heteroscedasticity (e.g., a “bowtie-shape” in a residual plot), conduct White’s test.
Exercise 9
a. Conduct the Kolmogorov-Smirnov normality test for the residuals from the model in Exercise 1.
b. Now conduct the Shapiro-Wilk normality test.
Note: More Normality tests can be found in the nortest package.
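For instance (a sketch, an addition to this note, assuming the nortest package is installed), the Lilliefors-corrected Kolmogorov-Smirnov test and the Anderson-Darling test can be applied to the residuals of the Exercise 1 model:
library(nortest)
fit <- lm(Life.Exp ~ HS.Grad+Murder, data=state77)   #the Exercise 1 model
lillie.test(fit$residuals)   #Lilliefors (Kolmogorov-Smirnov) normality test
ad.test(fit$residuals)       #Anderson-Darling normality test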
Exercise 10
For example purposes only, conduct the Durbin-Watson test for autocorrelation in residuals. (NOTE: This test is ONLY appropriate when the response variable is a time series, or somehow time-related (e.g., ordered by data collection time).)


_______________________________________________________


Below are the solutions to these exercises on Multiple Regression (part 3).
data(state)
state77 <- as.data.frame(state.x77)
names(state77)[4] <- "Life.Exp"
names(state77)[6] <- "HS.Grad"

####################
#                  #
#    Exercise 1    #
#                  #
####################
#a.
library(car)
m1 <- lm(Life.Exp ~ HS.Grad+Murder, data=state77)
avPlots(m1)
(Plot: added-variable plots for HS.Grad and Murder)
#Note that the slope of the line is positive in the HS.Grad plot, and negative in the Murder plot, as expected.

#b.
avPlots(m1, id.method=list("mahal"), id.n=2)
(Plot: added-variable plots with the two most influential points labeled)
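A side note (an addition to the original solution): in car versions 3.0 and later, the id.method and id.n arguments were consolidated into a single id argument, so an equivalent call is:
#Equivalent call under car >= 3.0, where id.method/id.n were merged into id=
avPlots(m1, id=list(method="mahal", n=2))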
####################
#                  #
#    Exercise 2    #
#                  #
####################
#a.
with(state77, avPlot(lm(Life.Exp ~ HS.Grad+Murder+Illiteracy), variable=Illiteracy))
(Plot: added-variable plot for Illiteracy)
#Note that the slope is positive, opposite to what is expected.

#b.
avPlots(lm(Life.Exp ~ ., data=state77), terms= ~ Population+Area)
(Plot: added-variable plots for Population and Area)
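To make the multicollinearity problem from part a. concrete (a sketch added here, not part of the original solution), compare the coefficient of Illiteracy on its own with its coefficient once the collinear predictors HS.Grad and Murder are in the model; the sign flips from negative to positive, matching the added-variable plot above:
#Sketch (an addition): the sign flip behind the Exercise 2a plot
coef(lm(Life.Exp ~ Illiteracy, data=state77))["Illiteracy"]                 #negative
coef(lm(Life.Exp ~ HS.Grad+Murder+Illiteracy, data=state77))["Illiteracy"]  #positive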
####################
#                  #
#    Exercise 3    #
#                  #
####################
crPlots(lm(Life.Exp ~ HS.Grad+Murder+Income+Area, data=state77))
(Plot: component+residual plots for HS.Grad, Murder, Income, and Area)
#We see that there seems to be a problem with linearity for Income and Area (which could be due to the outlier in the lower right corner of both plots).

####################
#                  #
#    Exercise 4    #
#                  #
####################
ceresPlots(lm(Life.Exp ~ HS.Grad+Murder+Income+Area, data=state77))
(Plot: CERES plots for HS.Grad, Murder, Income, and Area)
#Here, there is not much difference from the plots in Exercise 3 (although, in general, CERES plots are "less prone to leakage of nonlinearity among the predictors").

####################
#                  #
#    Exercise 5    #
#                  #
####################
vif(lm(Life.Exp ~ ., data=state77))
## Population     Income Illiteracy     Murder    HS.Grad      Frost 
##   1.499915   1.992680   4.403151   2.616472   3.134887   2.358206 
##       Area 
##   1.789764
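As a sketch of where these values come from (an addition, not part of the original solution): each VIF equals 1/(1 - R^2) from regressing that predictor on all the other predictors. For Illiteracy:
#Sketch (an addition): reproduce the Illiteracy VIF by hand
#VIF_j = 1/(1 - R^2_j), where R^2_j regresses predictor j on the other predictors
r2 <- summary(lm(Illiteracy ~ . - Life.Exp, data=state77))$r.squared
1/(1 - r2)   #should match the Illiteracy value above (about 4.40)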
#Some authors advocate that a vif>2.5 is a cause for concern, while others cite vif>4 or vif>10. According to these criteria, Illiteracy, Murder, and HS.Grad are the most problematic (in the presence of all the other predictors).

####################
#                  #
#    Exercise 6    #
#                  #
####################
library(lmtest)
bptest(m1)
## 
##  studentized Breusch-Pagan test
## 
## data:  m1
## BP = 2.9728, df = 2, p-value = 0.2262
#There is no evidence of heteroscedasticity (of the type that depends on a linear combination of the predictors).

####################
#                  #
#    Exercise 7    #
#                  #
####################
ncvTest(m1)
## Non-constant Variance Score Test 
## Variance formula: ~ fitted.values 
## Chisquare = 0.01065067    Df = 1     p = 0.9178026
#Note that the results are different from Exercise 6 because bptest (by default) uses studentized residuals (which is preferred for robustness) and assumes the error variance depends on a linear combination of the predictors, whereas ncvTest (by default) uses regular residuals and assumes the error variance depends on the fitted values.

#ncvTest(m1) is equivalent to bptest(m1, varformula= ~ m1$fitted, studentize=F, data=state77)

####################
#                  #
#    Exercise 8    #
#                  #
####################
bptest(m1, varformula= ~ I(HS.Grad^2)+I(Murder^2)+HS.Grad*Murder, data=state77)
## 
##  studentized Breusch-Pagan test
## 
## data:  m1
## BP = 6.7384, df = 5, p-value = 0.2408
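A note on the output above (an addition): df = 5 because the variance formula expands to five regressors; HS.Grad*Murder contributes both main effects plus the interaction, alongside the two squared terms:
#Sketch (an addition): the five variance regressors behind df = 5 above
attr(terms(~ I(HS.Grad^2)+I(Murder^2)+HS.Grad*Murder), "term.labels")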
####################
#                  #
#    Exercise 9    #
#                  #
####################
#a.
ks.test(m1$residuals, "pnorm")
## 
##  One-sample Kolmogorov-Smirnov test
## 
## data:  m1$residuals
## D = 0.15546, p-value = 0.1603
## alternative hypothesis: two-sided
#There is no evidence that the residuals are not Normal.

#b.
shapiro.test(m1$residuals)
## 
##  Shapiro-Wilk normality test
## 
## data:  m1$residuals
## W = 0.96961, p-value = 0.2231
#Again, there is no evidence of nonnormality.

####################
#                  #
#    Exercise 10   #
#                  #
####################
durbinWatsonTest(m1)
##  lag Autocorrelation D-W Statistic p-value
##    1      0.04919151        1.8495   0.582
##  Alternative hypothesis: rho != 0
#There is no evidence of lag-1 autocorrelation in the residuals.
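As a final side note (an addition, not part of the original solutions), the lmtest package loaded in Exercise 6 provides the same test through dwtest; unlike durbinWatsonTest, which obtains its p-value by bootstrapping, dwtest computes it analytically:
#Sketch (an addition): the Durbin-Watson test via lmtest
#dwtest accepts a fitted lm object
dwtest(m1)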

