Data

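A GAM fitted to Furrycat’s data shows a strong relationship between to-hit and cleverness. Cleverness is the only term with a large effect (F ≈ 3471, p < 2e-16); fierceness is marginally significant (p = 0.034), and no other term is significant.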

library(mgcv)  # provides gam() and s()

model <- gam(
  formula = to_hit ~
    s(hardiness) +
    s(fortitude) + 
    s(dexterity) + 
    s(endurance) +
    s(intellect) + 
    s(cleverness) + 
    s(courage) + 
    s(dependability) +
    s(power) +
    s(fierceness) +
    armor,
  family = gaussian(),
  data = normalized_df
)
summary(model)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## to_hit ~ s(hardiness) + s(fortitude) + s(dexterity) + s(endurance) + 
##     s(intellect) + s(cleverness) + s(courage) + s(dependability) + 
##     s(power) + s(fierceness) + armor
## 
## Parametric coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.2958052  0.0003678  804.34   <2e-16 ***
## armor       -0.0015207  0.0011267   -1.35    0.178    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                    edf Ref.df        F p-value    
## s(hardiness)     1.667  2.106    0.594  0.5718    
## s(fortitude)     1.000  1.000    0.325  0.5688    
## s(dexterity)     1.000  1.000    1.251  0.2640    
## s(endurance)     1.506  1.868    0.441  0.5492    
## s(intellect)     2.549  3.198    1.678  0.1672    
## s(cleverness)    2.167  2.749 3470.664  <2e-16 ***
## s(courage)       1.000  1.000    0.589  0.4432    
## s(dependability) 1.000  1.000    0.226  0.6349    
## s(power)         1.000  1.000    0.751  0.3867    
## s(fierceness)    1.000  1.000    4.540  0.0337 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.996   Deviance explained = 99.6%
## GCV = 3.7584e-05  Scale est. = 3.6128e-05  n = 410

The relationship is easy to see in a scatter plot of to-hit against cleverness.

library(ggplot2)

ggplot(normalized_df, aes(x = cleverness, y = to_hit)) +
  geom_point() +
  ggtitle("Cleverness vs To-Hit")

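The GAM's partial-effect plot for cleverness, the sixth smooth term, is close to linear: its estimated 2.167 degrees of freedom indicate only slight curvature.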

plot(model, select = 6)
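
One way to check how close a smooth is to a straight line is to fit the same response once with a smooth term and once with a plain linear term, then compare the two fits. The sketch below does this on simulated data, not on Furrycat's data: sim_df, its coefficients, and its noise level are assumptions chosen only to resemble the fit above.

```r
library(mgcv)

set.seed(1)

# Simulated stand-in for normalized_df (assumption: NOT the real data).
n <- 400
sim_df <- data.frame(cleverness = runif(n, 0, 500))
sim_df$to_hit <- 0.195 + 0.00065 * sim_df$cleverness + rnorm(n, sd = 0.006)

# Same response, modelled with a smooth term and with a linear term.
smooth_fit <- gam(to_hit ~ s(cleverness), data = sim_df)
linear_fit <- gam(to_hit ~ cleverness, data = sim_df)

# Effective degrees of freedom of the smooth; values near 1 mean the
# smooth has been penalized down to (almost) a straight line.
summary(smooth_fit)$edf

# Similar AIC values suggest the linear term is sufficient.
AIC(smooth_fit, linear_fit)
```

If the smooth's effective degrees of freedom stay close to 1 and the AIC values are similar, replacing the smooth with a linear term loses little.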

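A linear model of to-hit on cleverness alone fits almost as well as the full GAM: R-squared is 0.9955 and the residual standard error is about 0.006, although the most extreme residuals (-0.077 and 0.044) are far larger than the quartiles.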

model <- lm(to_hit ~ cleverness, data = normalized_df)
summary(model)
## 
## Call:
## lm(formula = to_hit ~ cleverness, data = normalized_df)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.076845 -0.002284  0.000310  0.002575  0.044197 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 1.952e-01  4.484e-04   435.2   <2e-16 ***
## cleverness  6.453e-04  2.140e-06   301.5   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.006085 on 408 degrees of freedom
## Multiple R-squared:  0.9955, Adjusted R-squared:  0.9955 
## F-statistic: 9.091e+04 on 1 and 408 DF,  p-value: < 2.2e-16

The fitted model looks like this.

Conclusion

To-hit is closely approximated by a linear function of cleverness, \(\text{to-hit} \approx 0.195 + 0.0006453 \times \text{cleverness}\).
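
As a quick way to apply this approximation, the sketch below wraps the coefficients from the linear model summary in a small R function. The name predict_to_hit and the example cleverness values are hypothetical and chosen only for illustration; the result is on the same scale as to_hit in normalized_df.

```r
# Coefficients copied from the lm() summary above.
intercept <- 0.1952
slope <- 0.0006453

# Hypothetical helper: approximate to-hit from cleverness.
predict_to_hit <- function(cleverness) {
  intercept + slope * cleverness
}

# Illustrative cleverness values (not taken from the data set).
example_cleverness <- c(0, 100, 500)
predict_to_hit(example_cleverness)
```

Because the function is vectorized, it can also be applied to a whole column, for example predict_to_hit(normalized_df$cleverness), to compare the approximation with the observed values.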