Branches of mechanical engineering: Multiple Regression (Part 2) – Diagnostics + Solutions - R

https://www.r-bloggers.com/multiple-regression-part-2-diagnostics/
http://www.r-exercises.com/2017/01/26/multiple-regression-part-2-diagnostics-solutions/


Multiple Regression is one of the most widely used methods in statistical modelling. However, despite its many benefits, it is often used without checking the underlying assumptions. This can lead to results which can be misleading or even completely wrong. Therefore, applying diagnostics to detect any strong violations of the assumptions is important. In the exercises below we cover some material on multiple regression diagnostics in R.
Answers to the exercises are available here.
If you obtain a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.
Multiple Regression (Part 1) can be found here.
We will be using the dataset state.x77, which is part of the state datasets available in R. (Additional information about the dataset can be obtained by running help(state.x77).)
Exercise 1
a. Load the state datasets.
b. Convert the state.x77 dataset to a dataframe.
c. Rename the Life Exp variable to Life.Exp, and HS Grad to HS.Grad. (This avoids problems with referring to these variables when specifying a model.)
d. Produce the correlation matrix.
e. Create a scatterplot matrix for the variables Life.Exp, HS.Grad, Murder, and Frost.
Exercise 2
a. Fit the model with Life.Exp as dependent variable, and HS.Grad and Murder as predictors.
b. Obtain the residuals.
c. Obtain the fitted values.
Exercise 3
a. Create a residual plot (residuals vs. fitted values).
b. Create the same residual plot using the plot command on the lm object from Exercise 2.




Learn more about multiple linear regression in the online courses Linear regression in R for Data Scientists, Statistics with R – advanced level, and Linear Regression and Modeling.
Exercise 4
Create plots of the residuals vs. each of the predictor variables.
Exercise 5
a. Create a Normality plot.
b. Create the same plot using the plot command on the lm object from Exercise 2.
Exercise 6
a. Obtain the studentized residuals.
b. Do there seem to be any outliers?
Exercise 7
a. Obtain the leverage value for each observation and plot them.
b. Obtain the conventional threshold for leverage values. Are any observations influential?
Exercise 8
a. Obtain DFFITS values.
b. Obtain the conventional threshold. Are any observations influential?
c. Obtain DFBETAS values.
d. Obtain the conventional threshold. Are any observations influential?
Exercise 9
a. Obtain Cook’s distance values and plot them.
b. Obtain the same plot using the plot command on the lm object from Exercise 2.
c. Obtain the threshold value. Are any observations influential?
Exercise 10
Create the Influence Plot using a function from the car package.


_____________________________________________________________


Below are the solutions to these exercises on Multiple Regression (part 2).
####################
#                  #
#    Exercise 1    #
#                  #
####################
#a.
data(state)

#b.
state77 <- as.data.frame(state.x77)

#c.
names(state77)[4] <- "Life.Exp"
names(state77)[6] <- "HS.Grad"

#d.
round(cor(state77),3) #displays correlations to three decimal places
##            Population Income Illiteracy Life.Exp Murder HS.Grad  Frost
## Population      1.000  0.208      0.108   -0.068  0.344  -0.098 -0.332
## Income          0.208  1.000     -0.437    0.340 -0.230   0.620  0.226
## Illiteracy      0.108 -0.437      1.000   -0.588  0.703  -0.657 -0.672
## Life.Exp       -0.068  0.340     -0.588    1.000 -0.781   0.582  0.262
## Murder          0.344 -0.230      0.703   -0.781  1.000  -0.488 -0.539
## HS.Grad        -0.098  0.620     -0.657    0.582 -0.488   1.000  0.367
## Frost          -0.332  0.226     -0.672    0.262 -0.539   0.367  1.000
## Area            0.023  0.363      0.077   -0.107  0.228   0.334  0.059
##              Area
## Population  0.023
## Income      0.363
## Illiteracy  0.077
## Life.Exp   -0.107
## Murder      0.228
## HS.Grad     0.334
## Frost       0.059
## Area        1.000

#e.
#one-sided formula: the leading ~ is required
pairs(~ Life.Exp + HS.Grad + Murder + Frost, data=state77, gap=0)
####################
#                  #
#    Exercise 2    #
#                  #
####################
#a.
model <- lm(Life.Exp ~ HS.Grad + Murder, data=state77)
summary(model)
##
## Call:
## lm(formula = Life.Exp ~ HS.Grad + Murder, data = state77)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1.66758 -0.41801  0.05602  0.55913  2.05625
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 70.29708    1.01567  69.213  < 2e-16 ***
## HS.Grad      0.04389    0.01613   2.721  0.00909 **
## Murder      -0.23709    0.03529  -6.719 2.18e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7959 on 47 degrees of freedom
## Multiple R-squared:  0.6628, Adjusted R-squared:  0.6485
## F-statistic:  46.2 on 2 and 47 DF,  p-value: 8.016e-12

#b.
resids <- model$residuals

#c.
fitted <- model$fitted.values

####################
#                  #
#    Exercise 3    #
#                  #
####################
#a.
plot(fitted,resids,main="Residual Plot",xlab="Fitted Values",ylab="Residuals")
abline(h=0,col="red")
#b.
plot(model,which=1)
####################
#                  #
#    Exercise 4    #
#                  #
####################
par(mfrow=c(1,2)) # draw the 2 plots side by side

plot(state77$HS.Grad,resids,main="Residuals vs. HS.Grad",xlab="HS.Grad",ylab="Residuals")
abline(h=0,col="red")
plot(state77$Murder,resids,main="Residuals vs. Murder",xlab="Murder",ylab="Residuals")
abline(h=0,col="red")
par(mfrow=c(1,1)) # restore to the default

####################
#                  #
#    Exercise 5    #
#                  #
####################
#a.
qqnorm(resids,ylab="Residuals")
qqline(resids)
#b.
plot(model,which=2)
####################
#                  #
#    Exercise 6    #
#                  #
####################
#a.
stzed <- rstudent(model)

#b.
stzed[abs(stzed) > 2]
##    Hawaii     Maine
##  2.835488 -2.249583
####################
#                  #
#    Exercise 7    #
#                  #
####################
#a.
lever <- hat(model.matrix(model))
plot(lever)
#b.
#obtain the threshold
thresh2 <- 2*length(model$coefficients)/length(lever)

#print leverage values above threshold
lever[lever > thresh2]
## [1] 0.1728282 0.1571342
#print corresponding state names
rownames(state77)[which(lever > thresh2)]
## [1] "Alaska" "Nevada"
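The two-step lookup above (values first, then state names) can be collapsed by using hatvalues(), which returns the leverages as a vector named by observation. The following is a minimal sketch rather than part of the original solutions; the names h, p, n, thresh_lev and high_lev are chosen here for illustration.

```r
# Rebuild the model so the block is self-contained
data(state)
state77 <- as.data.frame(state.x77)
names(state77)[4] <- "Life.Exp"
names(state77)[6] <- "HS.Grad"
model <- lm(Life.Exp ~ HS.Grad + Murder, data = state77)

h <- hatvalues(model)          # leverages, named by state
p <- length(coef(model))       # number of estimated coefficients (incl. intercept)
n <- nobs(model)               # number of observations

# The leverages sum to p, so the average leverage is p/n;
# the conventional cut-off used above is twice that average.
thresh_lev <- 2 * p / n
high_lev <- h[h > thresh_lev]
high_lev
```

Because the result keeps its names, the flagged states can be read directly from the printed vector.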
####################
#                  #
#    Exercise 8    #
#                  #
####################
#a.
dffits1 <- dffits(model)

#b.
thresh3 <- 2*sqrt(length(model$coefficients)/length(dffits1))
#only positive DFFITS values are compared with the threshold here
dffits1[dffits1 > thresh3]
##    Hawaii
## 0.6182642
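The DFFITS threshold is usually applied to the absolute value, since an observation can pull its own fitted value down as well as up. The comparison above only catches large positive values. The sketch below is a variation on the original solution, not a replacement for it; the name flagged_dffits is an assumption made for this example.

```r
# Rebuild the model so the block is self-contained
data(state)
state77 <- as.data.frame(state.x77)
names(state77)[4] <- "Life.Exp"
names(state77)[6] <- "HS.Grad"
model <- lm(Life.Exp ~ HS.Grad + Murder, data = state77)

dffits1 <- dffits(model)
p <- length(coef(model))
n <- nobs(model)
thresh3 <- 2 * sqrt(p / n)     # same conventional threshold as above

# Compare |DFFITS| with the threshold so that large negative values are flagged too
flagged_dffits <- dffits1[abs(dffits1) > thresh3]
flagged_dffits
```

With this version, states that have a large negative DFFITS are listed alongside Hawaii.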
#c.
dfbetas1 <- dfbetas(model)

#d.
thresh4 <- 2/sqrt(length(dfbetas1[,1]))
#only positive DFBETAS values are compared with the threshold here
dfbetas1[dfbetas1[,1] > thresh4,1]  #for intercept
##    Alaska    Nevada
## 0.7036719 0.7400875
dfbetas1[dfbetas1[,2] > thresh4,2]  #for HS.Grad
##     California         Hawaii South Carolina  West Virginia
##      0.3993425      0.4430609      0.3855697      0.3613108
dfbetas1[dfbetas1[,3] > thresh4,3]  #for Murder
## California      Maine      Texas
##  0.3491024  0.4441167  0.3038958
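The three per-coefficient checks can also be done in one pass over the whole DFBETAS matrix, using absolute values so that negative influences are included. This is a sketch under that assumption; db, thresh_db, idx and flagged_db are names introduced here, not part of the original solutions.

```r
# Rebuild the model so the block is self-contained
data(state)
state77 <- as.data.frame(state.x77)
names(state77)[4] <- "Life.Exp"
names(state77)[6] <- "HS.Grad"
model <- lm(Life.Exp ~ HS.Grad + Murder, data = state77)

db <- dfbetas(model)                 # one row per state, one column per coefficient
thresh_db <- 2 / sqrt(nrow(db))      # same conventional threshold as above

# Row/column positions of every |DFBETAS| value above the threshold
idx <- which(abs(db) > thresh_db, arr.ind = TRUE)

flagged_db <- data.frame(
  state       = rownames(db)[idx[, "row"]],
  coefficient = colnames(db)[idx[, "col"]],
  dfbetas     = db[idx],
  row.names   = NULL
)
flagged_db
```

Each row of the resulting data frame names a state, the coefficient it affects, and the signed DFBETAS value.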
####################
#                  #
#    Exercise 9    #
#                  #
####################
#a.
cooksd <- cooks.distance(model)
plot(cooksd,ylab="Cook's Distance")
#b.
plot(model,which=4)
#c.
thresh <- 4/length(resids)
cooksd[cooksd > thresh]
##         Alaska         Hawaii          Maine         Nevada South Carolina
##     0.20282640     0.11081779     0.09477421     0.22879279     0.09423789
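Cook's distance combines the two quantities examined in Exercises 6 and 7: the size of the residual and the leverage. One common textbook form is D_i = (r_i^2 / p) * h_i / (1 - h_i), where r_i is the standardized (internally studentized) residual, h_i the leverage, and p the number of coefficients. The sketch below checks that form against cooks.distance(); the name cooks_manual is an assumption made for this example.

```r
# Rebuild the model so the block is self-contained
data(state)
state77 <- as.data.frame(state.x77)
names(state77)[4] <- "Life.Exp"
names(state77)[6] <- "HS.Grad"
model <- lm(Life.Exp ~ HS.Grad + Murder, data = state77)

r <- rstandard(model)          # standardized (internally studentized) residuals
h <- hatvalues(model)          # leverages
p <- length(coef(model))       # number of coefficients

# Cook's distance from residual size and leverage
cooks_manual <- (r^2 / p) * h / (1 - h)

# Compare with the built-in calculation
all.equal(unname(cooks_manual), unname(cooks.distance(model)))

# Observations above the 4/n threshold used above
cooks_manual[cooks_manual > 4 / nobs(model)]
```

This makes it easier to see why Alaska and Nevada (high leverage) and Hawaii (large residual) all appear among the flagged observations.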
####################
#                  #
#    Exercise 10   #
#                  #
####################
library(car)
influencePlot(model, main="Influence Plot")
##          StudRes        Hat     CookD
## Alaska -1.743144 0.17282816 0.2028264
## Hawaii  2.835488 0.04538585 0.1108178
## Nevada -1.977284 0.15713419 0.2287928




Source: http://engdashboard.blogspot.com/