Assignment 7

[1] "Year"      "Lag1"      "Lag2"      "Lag3"      "Lag4"      "Lag5"     
[7] "Volume"    "Today"     "Direction"
      Year           Lag1                Lag2                Lag3          
 Min.   :2001   Min.   :-4.922000   Min.   :-4.922000   Min.   :-4.922000  
 1st Qu.:2002   1st Qu.:-0.639500   1st Qu.:-0.639500   1st Qu.:-0.640000  
 Median :2003   Median : 0.039000   Median : 0.039000   Median : 0.038500  
 Mean   :2003   Mean   : 0.003834   Mean   : 0.003919   Mean   : 0.001716  
 3rd Qu.:2004   3rd Qu.: 0.596750   3rd Qu.: 0.596750   3rd Qu.: 0.596750  
 Max.   :2005   Max.   : 5.733000   Max.   : 5.733000   Max.   : 5.733000  
      Lag4                Lag5              Volume           Today          
 Min.   :-4.922000   Min.   :-4.92200   Min.   :0.3561   Min.   :-4.922000  
 1st Qu.:-0.640000   1st Qu.:-0.64000   1st Qu.:1.2574   1st Qu.:-0.639500  
 Median : 0.038500   Median : 0.03850   Median :1.4229   Median : 0.038500  
 Mean   : 0.001636   Mean   : 0.00561   Mean   :1.4783   Mean   : 0.003138  
 3rd Qu.: 0.596750   3rd Qu.: 0.59700   3rd Qu.:1.6417   3rd Qu.: 0.596750  
 Max.   : 5.733000   Max.   : 5.73300   Max.   :3.1525   Max.   : 5.733000  
 Direction 
 Down:602  
 Up  :648  
           
           
           
           


Call:
glm(formula = Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + 
    Volume, family = binomial, data = Smarket)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-1.446  -1.203   1.065   1.145   1.326  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.126000   0.240736  -0.523    0.601
Lag1        -0.073074   0.050167  -1.457    0.145
Lag2        -0.042301   0.050086  -0.845    0.398
Lag3         0.011085   0.049939   0.222    0.824
Lag4         0.009359   0.049974   0.187    0.851
Lag5         0.010313   0.049511   0.208    0.835
Volume       0.135441   0.158360   0.855    0.392

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1731.2  on 1249  degrees of freedom
Residual deviance: 1727.6  on 1243  degrees of freedom
AIC: 1741.6

Number of Fisher Scoring iterations: 3
        1         2         3         4         5 
0.5070841 0.4814679 0.4811388 0.5152224 0.5107812 
        Direction
glm.pred Down  Up
    Down  145 141
    Up    457 507
[1] 0.5216
        Direction.2005
glm.pred Down Up
    Down   77 97
    Up     34 44
[1] 0.4801587
        Direction.2005
glm.pred Down  Up
    Down   35  35
    Up     76 106
[1] 0.5595238
[1] 0.5824176
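
The four bracketed values above can be reproduced directly from the three tables. The sketch below rebuilds each table by hand and recomputes the numbers; it assumes, as the table headers suggest, that rows are predictions and columns are the true directions. The helper names `make_cm` and `accuracy` are illustrative and do not come from the original script.

```r
# Confusion matrices copied from the output above
# (rows = predicted direction, columns = actual direction)
make_cm <- function(v) {
  matrix(v, nrow = 2, byrow = TRUE,
         dimnames = list(pred = c("Down", "Up"), actual = c("Down", "Up")))
}
cm_all   <- make_cm(c(145, 141, 457, 507))  # first table, 1250 observations
cm_2005a <- make_cm(c(77, 97, 34, 44))      # first 2005 table, 252 observations
cm_2005b <- make_cm(c(35, 35, 76, 106))     # second 2005 table, 252 observations

# Accuracy = correct predictions (diagonal) / all predictions
accuracy <- function(cm) sum(diag(cm)) / sum(cm)

acc_all   <- accuracy(cm_all)     # 0.5216
acc_2005a <- accuracy(cm_2005a)   # 0.4801587
acc_2005b <- accuracy(cm_2005b)   # 0.5595238

# Share of predicted "Up" days that were actually "Up" in the last table
prec_up <- cm_2005b["Up", "Up"] / sum(cm_2005b["Up", ])  # 0.5824176
```

The last value is a precision for the "Up" class rather than an accuracy: it looks only at the days the model predicted "Up".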

a. LDA Requirements: LDA assumes that, within each class, the predictors follow a (multivariate) normal distribution with a class-specific mean and a covariance matrix that is the same for all classes.
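
To make the shared-variance assumption concrete, here is a minimal base R sketch of LDA with a single predictor. The data are hypothetical (not from Smarket), and the names `delta` and `pred` are illustrative. Each class gets a linear score built from its own mean, the pooled variance, and its prior, and the class with the largest score is predicted.

```r
# Hypothetical one-predictor data for two classes (not from Smarket)
x <- c(1.0, 1.5, 2.0, 2.5, 4.0, 4.5, 5.0, 5.5)
y <- factor(c("A", "A", "A", "A", "B", "B", "B", "B"))

# Class means and priors
mu    <- sapply(split(x, y), mean)          # named vector: A = 1.75, B = 4.75
prior <- as.numeric(table(y)) / length(y)   # 0.5 for each class

# Pooled variance: one variance shared by all classes, as LDA assumes
n <- length(x)
K <- nlevels(y)
sigma2 <- sum((x - mu[as.character(y)])^2) / (n - K)

# Linear discriminant score of each class at a new point x0
delta <- function(x0) x0 * mu / sigma2 - mu^2 / (2 * sigma2) + log(prior)

# Predict the class with the largest score
pred <- function(x0) names(which.max(delta(x0)))

pred(3.0)  # "A"
pred(3.5)  # "B"
```

With equal priors the boundary sits halfway between the two class means (3.25 here), which is why 3.0 and 3.5 fall on opposite sides.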

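b. Differences between LDA and Logistic Regression: Both methods produce a linear decision boundary; they differ in how the coefficients are estimated. LDA assumes normally distributed predictors with a common covariance matrix and estimates the class means and that covariance, which is more efficient when the assumptions hold. Logistic regression estimates the coefficients by maximum likelihood without assuming normality, so it is more robust when that assumption fails. Neither method captures a non-linear class boundary unless transformed predictors are added.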

c. ROC (Receiver Operating Characteristic): A plot of a classifier's performance across all classification thresholds, with the true positive rate on the vertical axis and the false positive rate on the horizontal axis. The area under the curve (AUC) summarizes how well the model separates the classes: 0.5 corresponds to random guessing and 1 to perfect separation.
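
As an illustration of how the curve is built, the base R sketch below sweeps the threshold over a small set of hypothetical scores (not from Smarket) and computes the AUC with the trapezoid rule. The function names `roc_curve` and `auc_trapezoid` are illustrative. In practice a dedicated package would normally be used instead.

```r
# Hypothetical predicted probabilities and true labels (1 = positive)
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
labels <- c(1,   1,   0,   1,   0,    1,   0,   0)

roc_curve <- function(scores, labels) {
  # Sweep thresholds from high to low; predict positive when score >= threshold
  thresholds <- c(Inf, sort(unique(scores), decreasing = TRUE))
  tpr <- sapply(thresholds, function(t) sum(scores >= t & labels == 1) / sum(labels == 1))
  fpr <- sapply(thresholds, function(t) sum(scores >= t & labels == 0) / sum(labels == 0))
  data.frame(threshold = thresholds, fpr = fpr, tpr = tpr)
}

auc_trapezoid <- function(roc) {
  # Area under the piecewise-linear curve through the (fpr, tpr) points
  k <- nrow(roc)
  sum(diff(roc$fpr) * (roc$tpr[-1] + roc$tpr[-k]) / 2)
}

roc <- roc_curve(scores, labels)
auc <- auc_trapezoid(roc)  # 0.8125
```

The curve starts at (0, 0), where nothing is predicted positive, and ends at (1, 1), where everything is. The AUC of 0.8125 equals the fraction of positive/negative pairs in which the positive case has the higher score (13 of 16 here).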

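d. Sensitivity and Specificity: Sensitivity is the true positive rate, TP / (TP + FN), the fraction of actual positives that are classified as positive. Specificity is the true negative rate, TN / (TN + FP), the fraction of actual negatives that are classified as negative.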
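
As a worked example, sensitivity (the fraction of actual positives predicted positive) and specificity (the fraction of actual negatives predicted negative) can be computed from the last 2005 confusion matrix in the output above. The sketch assumes "Up" is the positive class, a choice the assignment does not state, and the variable names are illustrative.

```r
# Last 2005 confusion matrix from the output above
# (rows = predicted direction, columns = actual direction)
cm <- matrix(c(35, 35, 76, 106), nrow = 2, byrow = TRUE,
             dimnames = list(pred = c("Down", "Up"), actual = c("Down", "Up")))

# Assumption: "Up" is treated as the positive class
TP <- cm["Up", "Up"]      # predicted Up, actually Up       (106)
FN <- cm["Down", "Up"]    # predicted Down, actually Up     (35)
TN <- cm["Down", "Down"]  # predicted Down, actually Down   (35)
FP <- cm["Up", "Down"]    # predicted Up, actually Down     (76)

sensitivity <- TP / (TP + FN)  # 106 / 141, about 0.752
specificity <- TN / (TN + FP)  # 35 / 111, about 0.315
```

Under this assumption the model catches about three quarters of the "Up" days but correctly identifies less than a third of the "Down" days.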

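e. Classification Metrics: The most important metric depends on the relative cost of false positives and false negatives. In medical screening, for example, a missed case (false negative) is usually the costlier error, so sensitivity is prioritized. When trading on predicted "Up" days, acting on a false positive costs money, so precision (the fraction of predicted positives that are correct) matters more.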

This summary provides a clear overview of the topics covered in Chapter 4, highlighting the theoretical differences and practical implications of different classification methods.