Data

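A GAM fitted to Furrycat’s data shows a strong relationship between to-hit and cleverness. Cleverness is by far the most significant smooth term (F ≈ 3098, p < 2e-16); fierceness is only marginally significant (p = 0.028), and no other attribute or armor is significant at the 0.05 level.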

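library(mgcv)  # provides gam() and s()

# One smooth term per attribute, plus armor as a parametric term
model <- gam(
  formula = to_hit ~
    s(hardiness) +
    s(fortitude) +
    s(dexterity) +
    s(endurance) +
    s(intellect) +
    s(cleverness) +
    s(courage) +
    s(dependability) +
    s(power) +
    s(fierceness) +
    armor,
  family = gaussian(),
  data = normalized_df
)
summary(model)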
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## to_hit ~ s(hardiness) + s(fortitude) + s(dexterity) + s(endurance) + 
##     s(intellect) + s(cleverness) + s(courage) + s(dependability) + 
##     s(power) + s(fierceness) + armor
## 
## Parametric coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.3034716  0.0004154 730.535   <2e-16 ***
## armor       -0.0015757  0.0012075  -1.305    0.193    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                    edf Ref.df        F p-value    
## s(hardiness)     1.720  2.181    0.452  0.6548    
## s(fortitude)     1.000  1.000    0.503  0.4785    
## s(dexterity)     1.000  1.000    0.557  0.4559    
## s(endurance)     1.223  1.414    0.152  0.7195    
## s(intellect)     2.459  3.094    1.406  0.2389    
## s(cleverness)    2.137  2.708 3097.739  <2e-16 ***
## s(courage)       1.000  1.000    0.661  0.4168    
## s(dependability) 1.000  1.000    0.355  0.5515    
## s(power)         1.000  1.000    1.024  0.3123    
## s(fierceness)    1.000  1.000    4.861  0.0281 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.995   Deviance explained = 99.6%
## GCV = 4.0975e-05  Scale est. = 3.9254e-05  n = 370

The relationship is easy to see in a scatter plot of to-hit against cleverness.

library(ggplot2)

ggplot(normalized_df, aes(x = cleverness, y = to_hit)) +
  geom_point() +
  ggtitle("Cleverness vs To-Hit")

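The GAM’s fitted smooth for cleverness (the sixth smooth term, hence select = 6) is close to linear, with an effective degrees of freedom of about 2.1.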

plot(model, select = 6)

A linear model with cleverness as the only predictor shows high correlation (R-squared of 0.9953) and small residuals (residual standard error of 0.006327).

model <- lm(to_hit ~ cleverness, data = normalized_df)
summary(model)
## 
## Call:
## lm(formula = to_hit ~ cleverness, data = normalized_df)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.076748 -0.002354  0.000207  0.002735  0.044038 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.953e-01  5.088e-04   383.9   <2e-16 ***
## cleverness  6.448e-04  2.322e-06   277.7   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.006327 on 368 degrees of freedom
## Multiple R-squared:  0.9953, Adjusted R-squared:  0.9952 
## F-statistic: 7.713e+04 on 1 and 368 DF,  p-value: < 2.2e-16

The fitted linear model looks like this.
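A plot of this kind could be reproduced with something like the sketch below. It uses base R graphics rather than ggplot2 to stay dependency-free. The helper name `plot_linear_fit` is hypothetical and not part of the original analysis; it only assumes a data frame with numeric `cleverness` and `to_hit` columns, as `normalized_df` has.

```r
# Hypothetical helper: scatter plot of to_hit against cleverness with the
# least-squares line drawn on top. `df` is assumed to contain numeric
# columns named `cleverness` and `to_hit`.
plot_linear_fit <- function(df) {
  fit <- lm(to_hit ~ cleverness, data = df)
  plot(df$cleverness, df$to_hit,
       xlab = "cleverness", ylab = "to_hit",
       main = "Cleverness vs To-Hit with linear fit")
  abline(fit, col = "red", lwd = 2)  # overlay the fitted straight line
  invisible(fit)                     # return the model without printing it
}

# Usage, assuming normalized_df is loaded as in the analysis above:
# plot_linear_fit(normalized_df)
```

Returning the fitted model invisibly lets the same call feed `summary()` or `residuals()` if a closer look at the fit is wanted.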

Conclusion

To-hit is roughly captured by the following formula, using the coefficients from the linear model above: \(\text{to-hit} \approx 0.195 + 0.0006448 \times \text{cleverness}\).
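As a quick illustration, the linear approximation can be wrapped in a small function. This is a minimal sketch: the intercept and slope are copied from the `lm()` summary output above, the function name `predict_to_hit` is hypothetical, and the cleverness values are arbitrary examples, since the range of cleverness in the data is not stated here.

```r
# Coefficients copied from the lm() summary output (Estimate column).
intercept <- 0.1953
slope <- 0.0006448

# Hypothetical helper: approximate to-hit for one or more cleverness values
# using the fitted straight line.
predict_to_hit <- function(cleverness) {
  intercept + slope * cleverness
}

# Arbitrary example inputs, chosen for illustration only.
predict_to_hit(c(100, 500, 1000))
```

With these coefficients, a cleverness of 500 gives a to-hit of roughly 0.52. The estimate only holds within the range of cleverness values the model was fitted on.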