Bio-Engineering in Star Wars: Galaxies let players craft custom creatures (“pets”) by combining DNA samples. The creature’s level (CL) governed who could tame the pet and how powerful it was. The 2003 BE Guide noted that CL “mostly depends on damage, resists, and the CL of the donors,” but the exact algorithm was never published.
This document reverse-engineers that algorithm from the Furrycat archive. The headline findings:
- Level is computed by two distinct formulas, switching at fortitude >= 500 (a threshold which agrees with the actual armor flag on 99.6% of creatures).
- Combined, the two formulas reach R² ≈ 0.98 and residual SD ≈ 1.8 levels on 370 creatures (79 armored, 291 unarmored).
A single GAM across all creatures gives a quick picture of which attributes move level.
library(mgcv)  # gam() and s()

model.gam.everything <- gam(
level ~
s(hardiness) +
s(fortitude) +
s(dexterity) +
s(endurance) +
s(intellect) +
s(cleverness) +
s(dependability) +
s(courage) +
s(fierceness) +
s(power) +
s(kinetic) +
s(energy) +
s(blast) +
s(heat) +
s(cold) +
s(electricity) +
s(acid) +
s(stun),
data = normalized_df,
family = gaussian()
)
summary(model.gam.everything)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## level ~ s(hardiness) + s(fortitude) + s(dexterity) + s(endurance) +
## s(intellect) + s(cleverness) + s(dependability) + s(courage) +
## s(fierceness) + s(power) + s(kinetic) + s(energy) + s(blast) +
## s(heat) + s(cold) + s(electricity) + s(acid) + s(stun)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.05135 0.06939 332.2 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(hardiness) 2.774 3.579 4.803 0.001739 **
## s(fortitude) 8.925 8.994 92.676 < 2e-16 ***
## s(dexterity) 7.381 8.354 7.183 < 2e-16 ***
## s(endurance) 1.000 1.000 9.498 0.002248 **
## s(intellect) 3.023 3.854 23.165 < 2e-16 ***
## s(cleverness) 4.658 5.777 14.774 < 2e-16 ***
## s(dependability) 1.090 1.171 6.456 0.011768 *
## s(courage) 4.612 5.632 4.316 0.000485 ***
## s(fierceness) 1.000 1.000 6.345 0.012290 *
## s(power) 8.165 8.785 18.242 < 2e-16 ***
## s(kinetic) 8.115 8.758 5.838 2.50e-07 ***
## s(energy) 8.453 8.892 50.694 < 2e-16 ***
## s(blast) 1.000 1.000 3.715 0.054860 .
## s(heat) 1.000 1.000 0.027 0.869184
## s(cold) 2.969 3.733 9.033 2.46e-06 ***
## s(electricity) 1.000 1.000 12.828 0.000399 ***
## s(acid) 3.651 4.474 4.538 0.001255 **
## s(stun) 1.000 1.000 11.723 0.000704 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.991 Deviance explained = 99.3%
## GCV = 2.2033 Scale est. = 1.7816 n = 370
Several attributes appear important: fortitude, power, kinetic, energy, and dexterity all earn high-edf, strongly non-linear smooths at p < 2e-16, with hardiness, intellect, cleverness, and several other resists also clearly significant.
And a handful land on ~1.00 effective degrees of freedom (shrunk to a straight, nearly flat line): endurance, fierceness, blast, heat, electricity. We treat those as noise attributes for level purposes.
The fortitude smooth is the most interesting term. Its shape will turn out to encode the two-formula structure.
To avoid over-fitting to small per-attribute weight differences, we collapse correlated attributes into synthetic features:
- average_hdi = (hardiness + dexterity + intellect) / 3 — these three attributes have similar weights against level; collapsing them prevents the model from over-attributing influence to whichever happens to correlate most with fortitude.
- kinen = (kinetic + energy) / 2 — kinetic and energy share the in-game 60% cap; treating them as one resist class respects that mechanic.
- nonkinen = (blast + heat + cold + electricity + acid + stun) / 6 — the other six resists, none of which are individually significant after collapsing.

These are derived in R/data.R and ride along on normalized_df.
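A minimal base-R sketch of the three derived features, on an invented single-creature stat line:

```r
# Invented single-creature stat line; in the real pipeline these features are
# built in R/data.R and attached to normalized_df.
creature <- list(
  hardiness = 800, dexterity = 700, intellect = 600,
  kinetic = 60, energy = 20,
  blast = 10, heat = 0, cold = 30, electricity = 0, acid = 0, stun = 20
)
average_hdi <- with(creature, (hardiness + dexterity + intellect) / 3)  # 700
kinen       <- with(creature, (kinetic + energy) / 2)                   # 40
nonkinen    <- with(creature, (blast + heat + cold +
                               electricity + acid + stun) / 6)          # 10
```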
A re-fit GAM with these collapsed features reproduces the same picture with far fewer terms:
model.gam <- gam(
level ~
s(average_hdi) +
s(fortitude) +
s(cleverness) +
s(power) +
s(kinen) +
s(nonkinen),
data = normalized_df
)
summary(model.gam)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## level ~ s(average_hdi) + s(fortitude) + s(cleverness) + s(power) +
## s(kinen) + s(nonkinen)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.05135 0.09015 255.7 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(average_hdi) 5.600 6.785 36.79 <2e-16 ***
## s(fortitude) 8.919 8.994 79.20 <2e-16 ***
## s(cleverness) 7.694 8.556 15.32 <2e-16 ***
## s(power) 2.371 3.040 56.96 <2e-16 ***
## s(kinen) 4.282 5.267 57.10 <2e-16 ***
## s(nonkinen) 2.356 2.978 44.41 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.985 Deviance explained = 98.6%
## GCV = 3.2935 Scale est. = 3.0067 n = 370
Holding all other attributes at their mean, the fortitude smooth shows the discontinuity that motivates the two-formula split:
The smooth is decreasing below fortitude=500 and increasing above it. We’ll quantify and verify that breakpoint shortly.
For creatures with fortitude >= 500, a plain linear regression already fits remarkably well.
linear.fit.level.armor <- lm(
level ~ average_hdi + fortitude + cleverness + power + kinen + nonkinen,
data = armor_df
)
summary(linear.fit.level.armor)
##
## Call:
## lm(formula = level ~ average_hdi + fortitude + cleverness + power +
## kinen + nonkinen, data = armor_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1643 -1.0873 0.1926 1.0277 4.3139
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -21.331842 3.491081 -6.110 4.60e-08 ***
## average_hdi 0.027648 0.004965 5.568 4.18e-07 ***
## fortitude 0.056252 0.006059 9.285 6.18e-14 ***
## cleverness 0.024034 0.003182 7.552 1.05e-10 ***
## power 0.015740 0.002460 6.398 1.39e-08 ***
## kinen 0.096920 0.018767 5.164 2.06e-06 ***
## nonkinen 0.085904 0.015458 5.557 4.36e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.93 on 72 degrees of freedom
## Multiple R-squared: 0.9818, Adjusted R-squared: 0.9802
## F-statistic: 645.6 on 6 and 72 DF, p-value: < 2.2e-16
Residual diagnostics:
shapiro.test(rs.armor)
##
## Shapiro-Wilk normality test
##
## data: rs.armor
## W = 0.98336, p-value = 0.3941
library(lmtest)  # Breusch-Pagan test
bptest(linear.fit.level.armor)
##
## studentized Breusch-Pagan test
##
## data: linear.fit.level.armor
## BP = 6.1042, df = 6, p-value = 0.4116
The armored residuals are small (±4 levels), normally distributed, and homoscedastic. This is consistent with the residuals being driven by crafting-system randomness rather than a missing variable.
A clean, game-dev-friendly version of the armored formula:
level = -23
+ 0.01 * hardiness
+ 0.06 * fortitude
+ 0.005 * dexterity
+ 0.01 * intellect
+ 0.025 * cleverness
+ 0.015 * power
+ 0.10 * kinen
+ 0.08 * nonkinen
R² ≈ 0.98, residual SD ≈ 1.8 levels.
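As a quick sanity check, the rounded formula evaluated on an invented (but plausible) armored stat line:

```r
# Rounded armored formula on an invented stat line (kinen and nonkinen are
# the collapsed resist features defined earlier).
hardiness <- 1000; fortitude <- 600; dexterity <- 400; intellect <- 200
cleverness <- 200; power <- 400; kinen <- 50; nonkinen <- 25

level <- -23 +
  0.01  * hardiness + 0.06  * fortitude +
  0.005 * dexterity + 0.01  * intellect +
  0.025 * cleverness + 0.015 * power +
  0.10  * kinen + 0.08  * nonkinen
level  # 10 + 36 + 2 + 2 + 5 + 6 + 5 + 2 - 23 = 45
```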
The same recipe on unarmored creatures gives a worse fit, and the residuals clearly violate normality.
linear.fit.level.noarmor <- lm(
level ~ average_hdi + fortitude + cleverness + power + kinen + nonkinen,
data = no_armor_df
)
summary(linear.fit.level.noarmor)
##
## Call:
## lm(formula = level ~ average_hdi + fortitude + cleverness + power +
## kinen + nonkinen, data = no_armor_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.0819 -1.2764 -0.0060 0.9886 7.5673
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.481758 0.415248 20.43 <2e-16 ***
## average_hdi 0.027228 0.002052 13.27 <2e-16 ***
## fortitude -0.018734 0.001335 -14.04 <2e-16 ***
## cleverness 0.027003 0.002131 12.67 <2e-16 ***
## power 0.013156 0.001308 10.06 <2e-16 ***
## kinen 0.114275 0.008455 13.52 <2e-16 ***
## nonkinen 0.060657 0.006034 10.05 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.14 on 284 degrees of freedom
## Multiple R-squared: 0.9375, Adjusted R-squared: 0.9362
## F-statistic: 709.9 on 6 and 284 DF, p-value: < 2.2e-16
shapiro.test(rs.noarmor)
##
## Shapiro-Wilk normality test
##
## data: rs.noarmor
## W = 0.98048, p-value = 0.0005241
bptest(linear.fit.level.noarmor)
##
## studentized Breusch-Pagan test
##
## data: linear.fit.level.noarmor
## BP = 46.786, df = 6, p-value = 2.064e-08
Three things stand out from this naive unarmored fit:

- The fortitude coefficient comes out negative (−0.019); more on that below.
- The residuals reject both normality (Shapiro-Wilk p ≈ 5e-4) and homoscedasticity (Breusch-Pagan p ≈ 2e-8), unlike the armored fit.
- A heavy left tail in the residuals turned out to be a coherent cluster: low-CL unarmored creatures with ~10k Health and a kinetic resist near the 60% cap — exactly the builds shortchanged by the kinen average.
## Uber-CL10-style cluster (level <= 12 and kinetic >= 50):
## n = 39
## mean kinetic = 57.8, mean energy = -51.7
## mean kinen = (k+e)/2 = 3.0
## mean naive residual = -0.45 (negative = over-predicted)
The Bio-Engineer “trick”: stack kinetic resist near the 60% cap, accept a heavy negative energy resist as the cost. The naive kinen = (k+e)/2 averages a near-cap positive against a heavily-negative number and credits the small remainder linearly — so the formula thinks the creature has ~5% “average” resist and predicts CL ~13, when in practice the energy vulnerability erases the resist’s value entirely. The game evidently penalizes the vulnerability harder than (k+e)/2 does.
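The arithmetic of that failure, with an invented resist line:

```r
# Uber-CL10-style resist line: kinetic stacked to the cap, energy deeply
# negative as the crafting cost (numbers invented for illustration).
kinetic <- 60
energy  <- -50
kinen <- (kinetic + energy) / 2
kinen  # 5: the naive average still credits a small positive "resist"
```

The naive term hands back 5 points of credit; the data say the energy vulnerability should erase that credit entirely.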
We test three symmetric formulations of the kinetic/energy term against the kinen baseline. All four use the same other predictors; only the resist term changes.
| ID | Resist term | Mechanic |
|---|---|---|
| M0 | kinen = (k+e)/2 | naive average |
| M5 | kinen_pos + kinen_neg (split positive / negative half) | asymmetric credit on positive vs penalty on negative |
| M6 | pmax(kinen, 0) | average resist, but no negative credit |
| M7 | pmax(pmin(k, e), 0) | use the weaker side, floored at zero |
All four candidate terms are symmetric in kinetic and energy — both resists share the 60% in-engine cap, so treating them as a single resist class is mechanically motivated.
no_armor_df <- no_armor_df %>%
mutate(
kinen_pos = pmax(kinen, 0),
kinen_neg = pmin(kinen, 0),
ke_floor = pmax(pmin(kinetic, energy), 0)
)
m0 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + kinen + nonkinen,
data = no_armor_df)
m5 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + kinen_pos + kinen_neg + nonkinen,
data = no_armor_df)
m6 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + kinen_pos + nonkinen,
data = no_armor_df)
m7 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = no_armor_df)
| Model | R² | Resid SD | AIC | Uber-CL10 SD |
|---|---|---|---|---|
| M0 kinen | 0.9380 | 2.109 | 1279.2 | 1.174 |
| M5 kinen_pos+kinen_neg | 0.9495 | 1.903 | 1221.3 | 1.422 |
| M6 pmax(kinen,0) | 0.9492 | 1.908 | 1221.0 | 1.413 |
| M7 pmax(pmin(k,e),0) | 0.9518 | 1.859 | 1205.7 | 0.752 |
M7 is unambiguously best on every metric: highest R², lowest residual SD, lowest AIC, and crucially the tightest fit on the uber-CL10 cluster itself (SD 0.75 vs 1.17 for the naive baseline). M5 and M6 capture the cluster’s mean but not its tightness — a creature with kinetic = 60, energy = -50 still gets pmax(kinen, 0) = 5 of credit it apparently shouldn’t.
The mechanism reading: vulnerability on either kinetic or energy erases all kin/eng resist credit. Roughly half of unarmored creatures get zero credit from this term because they have a vulnerability on one side.
We promote M7 as the canonical unarmored form. The fitted coefficients:
summary(m7)
##
## Call:
## lm(formula = level ~ hardiness + fortitude + dexterity + intellect +
## cleverness + power + ke_floor + nonkinen, data = no_armor_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.9320 -1.0468 -0.1114 0.9451 8.3337
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.873163 0.384465 20.478 < 2e-16 ***
## hardiness 0.013015 0.001638 7.947 4.61e-14 ***
## fortitude -0.019361 0.001453 -13.320 < 2e-16 ***
## dexterity 0.004101 0.001827 2.245 0.0256 *
## intellect 0.010102 0.002249 4.492 1.03e-05 ***
## cleverness 0.024797 0.002902 8.544 8.29e-16 ***
## power 0.015077 0.001201 12.556 < 2e-16 ***
## ke_floor 0.166537 0.009827 16.948 < 2e-16 ***
## nonkinen 0.054300 0.005645 9.618 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.885 on 282 degrees of freedom
## Multiple R-squared: 0.9518, Adjusted R-squared: 0.9505
## F-statistic: 696.5 on 8 and 282 DF, p-value: < 2.2e-16
Q-Q residuals:
The Shapiro-Wilk test still rejects normality (p ≈ 3e-6), so there is additional structure left in the unarmored residuals beyond the vulnerability mechanism — see “Persistent Outliers” below.
The two-formula split assumes a hard discontinuity at fortitude=500. We test it four ways: a single global linear regression with no break, a GAM smooth, a data-driven segmented regression, and the proposed hard split.
df <- normalized_df %>%
mutate(ke_floor = pmax(pmin(kinetic, energy), 0))
# Single global linear (no break)
m_global <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = df)
# GAM with smooth fortitude
m_gam <- gam(level ~ s(fortitude) + hardiness + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = df)
# Segmented regression (data-driven breakpoint)
library(segmented)
seg_in <- lm(level ~ fortitude + hardiness + dexterity + intellect +
cleverness + power + ke_floor + nonkinen, data = df)
m_seg <- segmented(seg_in, seg.Z = ~ fortitude, psi = 500)
# Two formulas, hard split at fort = 500
arm_df <- df %>% dplyr::filter(armor == 1 & fortitude >= 500)
una_df <- df %>% dplyr::filter(armor == 0)
m_arm <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = arm_df)
m_una <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = una_df)
resid_split <- c(resid(m_arm), resid(m_una))
ss_split <- sum(resid_split^2)
n_split <- length(resid_split)
r2_split <- 1 - ss_split / sum((c(arm_df$level, una_df$level) -
mean(c(arm_df$level, una_df$level)))^2)
sd_split <- sd(resid_split)
# AIC for the joint two-formula model: sum of AICs (each is independent)
aic_split <- AIC(m_arm) + AIC(m_una)
| Model | R² | Resid SD | AIC |
|---|---|---|---|
| Single global linear (no break) | 0.9266 | 3.852 | 2067.0 |
| GAM smooth fortitude | 0.9784 | 2.046 | 1614.2 |
| Segmented regression (break at fort=445) | 0.9768 | 2.165 | 1644.6 |
| Two formulas, hard split at fort=500 | 0.9823 | 1.892 | 1560.2 |
The hard split at fortitude=500 beats the single global linear by ~500 AIC points and beats a flexible GAM smooth by ~50. That second margin is the important one: it says the relationship has a true discontinuity at 500, not just a slope change. Classic step function in the game code.
The sign flip across the break is iron-clad:
| Subset | n | Fortitude coef | p-value |
|---|---|---|---|
| Unarmored (fortitude < 500) | 291 | -0.01936 | < 2e-16 |
| Armored (fortitude >= 500) | 79 | 0.05730 | < 2e-16 |
Both coefficients are extremely significant, and they have opposite signs.
The unarmored fortitude coefficient is negative. Taken at face value, this would say that a creature with more fortitude has a lower level — counterintuitive enough to deserve scrutiny. There are three families of explanation:

1. A statistical artifact: collinearity bookkeeping, or a double-count through the fortitude → effective-resist channel.
2. A genuine negative fortitude term in the unarmored branch of the retail formula.
3. An equivalent parameterization in which the armored branch’s fortitude credit makes unarmored fortitude read as a relative penalty (“armor pays for itself”).

We can rule out (1) with the data, but the data alone cannot distinguish (2) from (3). Four tests, in order.
Within the unarmored subset, fortitude correlates 0.85 with hardiness and 0.84 with health — Bio-Engineers wire these up together when crafting. If the negative-fortitude coefficient is bookkeeping for the hardiness correlation, dropping fortitude from the model should let hardiness pick up the signal with no loss of fit.
m_una_full <- m_una
m_una_no_fort <- lm(level ~ hardiness + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = una_df)
shift_compare <- data.frame(
Term = c("hardiness", "dexterity", "intellect",
"cleverness", "power", "ke_floor", "nonkinen"),
With_fortitude = coef(m_una_full)[c("hardiness","dexterity","intellect",
"cleverness","power","ke_floor","nonkinen")],
Without_fortitude = coef(m_una_no_fort)[c("hardiness","dexterity","intellect",
"cleverness","power","ke_floor","nonkinen")]
)
shift_compare$Shift <- shift_compare$Without_fortitude - shift_compare$With_fortitude
knitr::kable(shift_compare, digits = 4, row.names = FALSE,
caption = "Coefficient shifts when fortitude is dropped from the unarmored model")
| Term | With_fortitude | Without_fortitude | Shift |
|---|---|---|---|
| hardiness | 0.0130 | -0.0011 | -0.0141 |
| dexterity | 0.0041 | 0.0031 | -0.0010 |
| intellect | 0.0101 | 0.0188 | 0.0087 |
| cleverness | 0.0248 | 0.0171 | -0.0077 |
| power | 0.0151 | 0.0094 | -0.0057 |
| ke_floor | 0.1665 | 0.1845 | 0.0179 |
| nonkinen | 0.0543 | 0.0316 | -0.0227 |
The hardiness coefficient collapses from +0.013 to effectively zero (−0.001) — exactly what the artifact hypothesis predicts. So far, the picture is consistent with the negative coefficient being collinearity bookkeeping.
A manual VIF (variance inflation factor) confirms the collinearity is real:
manual_vif <- function(model) {
X <- model.matrix(model)[, -1, drop = FALSE]
vifs <- sapply(seq_len(ncol(X)), function(j) {
fit <- lm(X[, j] ~ X[, -j])
1 / (1 - summary(fit)$r.squared)
})
setNames(vifs, colnames(X))
}
round(manual_vif(m_una_full), 2)
## hardiness fortitude dexterity intellect cleverness power ke_floor
## 6.36 4.72 5.72 9.76 7.54 2.31 1.75
## nonkinen
## 1.66
Fortitude itself is at moderate VIF (~5); intellect and cleverness are worse. This test is suggestive of artifact, but not decisive — the fact that hardiness can absorb the signal does not prove the signal is nothing but hardiness.
If the negative coefficient is pure artifact, constraining the fortitude coefficient to be ≥ 0 (equivalent to setting it to zero, since the unconstrained estimate is on the wrong side of the constraint) should cost no predictive performance. Hardiness will absorb the signal and the model will fit just as well.
By KKT conditions for inequality-constrained least squares: when the unconstrained estimate violates the constraint, the constrained optimum sets the violating coefficient to its boundary (zero) and re-fits the others. So the constrained regression is mathematically identical to the drop-fortitude lm above; we just look at the predictive metrics this time.
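The boundary behavior is easy to demonstrate on synthetic data (a toy sketch, unrelated to the creature dataset): when the unconstrained estimate of a coefficient is negative, box-constrained least squares pins it at zero and reproduces the drop-variable fit.

```r
# Toy demonstration of the KKT argument: the unconstrained coefficient on x2
# is negative, so constrained least squares (b2 >= 0) pins b2 at the boundary
# and the remaining coefficients match lm(y ~ x1).
set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)                # collinear with x1
y  <- 2 + 1.0 * x1 - 0.5 * x2 + rnorm(n, sd = 0.5)

rss <- function(b) sum((y - b[1] - b[2] * x1 - b[3] * x2)^2)
con <- optim(c(0, 0, 0), rss, method = "L-BFGS-B",
             lower = c(-Inf, -Inf, 0))             # constrain b2 >= 0
drop_fit <- lm(y ~ x1)

con$par[3]                               # 0: pinned at the boundary
max(abs(con$par[1:2] - coef(drop_fit)))  # ~0: identical to dropping x2
```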
una_df$resid_unc <- resid(m_una_full)
una_df$resid_con <- resid(m_una_no_fort)
cat(sprintf("Unconstrained M7: R^2 = %.4f, resid SD = %.3f, AIC = %.1f\n",
summary(m_una_full)$r.squared, sd(resid(m_una_full)),
AIC(m_una_full)))
## Unconstrained M7: R^2 = 0.9518, resid SD = 1.859, AIC = 1205.7
cat(sprintf("Constrained (fort >= 0): R^2 = %.4f, resid SD = %.3f, AIC = %.1f\n",
summary(m_una_no_fort)$r.squared, sd(resid(m_una_no_fort)),
AIC(m_una_no_fort)))
## Constrained (fort >= 0): R^2 = 0.9215, resid SD = 2.373, AIC = 1345.7
cat(sprintf("Delta R^2 = %+.4f, delta SD = %+.3f, delta AIC = %+.1f\n",
summary(m_una_no_fort)$r.squared - summary(m_una_full)$r.squared,
sd(resid(m_una_no_fort)) - sd(resid(m_una_full)),
AIC(m_una_no_fort) - AIC(m_una_full)))
## Delta R^2 = -0.0303, delta SD = +0.514, delta AIC = +140.0
The cost is real and substantial: ~3 percentage points of R², half a level of residual SD, and ~140 AIC points. The artifact hypothesis predicts ≈ zero cost — this is a clear rejection.
The cost is not uniformly distributed. Where does it land?
| Fortitude band | n | SD (unc) | SD (con) | Mean resid (unc) | Mean resid (con) |
|---|---|---|---|---|---|
| 0-100 | 110 | 1.513 | 1.933 | 0.019 | 0.636 |
| 100-200 | 30 | 1.806 | 2.476 | -0.133 | 0.227 |
| 200-300 | 35 | 1.975 | 2.158 | 0.336 | 1.092 |
| 300-400 | 24 | 1.740 | 2.062 | 0.125 | -0.036 |
| 400-500 | 91 | 2.228 | 2.497 | -0.155 | -1.253 |
| NA | 1 | NA | NA | 1.270 | -0.131 |
The constrained model develops a clean U-shape across fortitude bands: positive bias at low fortitude (under-prediction by ~0.6 levels in the 0–100 band) and negative bias at high fortitude (over-prediction by ~1.3 levels in the 400–500 band). That’s the residual signature of a missing linear term in fortitude — the negative fortitude coefficient was filling exactly that hole.
The uber-CL10 cluster is the most affected:
uber_idx <- una_df$level <= 12 & una_df$kinetic >= 50
cat(sprintf("Uber-CL10 cluster (n = %d):\n", sum(uber_idx)))
## Uber-CL10 cluster (n = 39):
cat(sprintf(" Unconstrained: mean resid %+.2f, SD %.2f\n",
mean(una_df$resid_unc[uber_idx]),
sd(una_df$resid_unc[uber_idx])))
## Unconstrained: mean resid -0.35, SD 0.75
cat(sprintf(" Constrained: mean resid %+.2f, SD %.2f\n",
mean(una_df$resid_con[uber_idx]),
sd(una_df$resid_con[uber_idx])))
## Constrained: mean resid -1.86, SD 0.75
Same spread, but a 1.5-level downward bias. The negative fortitude coefficient was specifically pulling these creatures from “predicted ~12” down to “actually CL 10.”
A more subtle artifact possibility: in the game, kinetic and energy resists come from two sources — effective resists derived directly from fortitude (floor(fortitude / 10) for unarmored) and special resists from DNA samples. When both are present, special wins and effective drops out. For creatures whose displayed kinetic/energy is effective, the column M7 consumes is literally a function of fortitude — a potential double-count that could create a spurious coefficient.
unarm <- normalized_df %>% dplyr::filter(armor == 0) %>%
mutate(
ke_floor = pmax(pmin(kinetic, energy), 0),
k_effective = !is.na(kinetic.effective) & kinetic.effective != 0,
e_effective = !is.na(energy.effective) & energy.effective != 0,
k_special = !is.na(kinetic.special) & kinetic.special != 0,
e_special = !is.na(energy.special) & energy.special != 0
)
# Verify: kinetic.effective = floor(fortitude / 10) for unarmored
unarm_eff <- unarm %>% dplyr::filter(k_effective)
exact_match <- sum(unarm_eff$kinetic.effective == floor(unarm_eff$fortitude / 10))
cat(sprintf("kinetic.effective matches floor(fortitude/10): %d / %d\n",
exact_match, nrow(unarm_eff)))
## kinetic.effective matches floor(fortitude/10): 34 / 40
# Refit on the all-special subset (no fortitude->resist algebraic relationship)
all_special <- unarm %>%
dplyr::filter(!k_effective & !e_effective &
(k_special | kinetic == 0) &
(e_special | energy == 0))
m_special <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = all_special)
cat(sprintf("Full unarmored fortitude coef: %.5f (n=%d)\n",
coef(m_una)["fortitude"], nobs(m_una)))
## Full unarmored fortitude coef: -0.01936 (n=291)
cat(sprintf("All-special fortitude coef: %.5f (n=%d)\n",
coef(m_special)["fortitude"], nobs(m_special)))
## All-special fortitude coef: -0.01998 (n=239)
The fortitude coefficient is essentially identical on the all-special subset, where there is no algebraic relationship between fortitude and the kinetic/energy values M7 consumes. The effective-resist double-count is ruled out as the cause.
(For completeness: zeroing out effective resists from ke_floor makes the fit worse by ~40 AIC. The level formula appears to credit effective and special resists at the same rate, using whichever value is displayed.)
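For concreteness, the displayed-resist rule described above can be sketched as follows (the function name and the NA-for-absent convention are ours, not the dataset’s):

```r
# Hedged sketch of the displayed-resist rule: a special (DNA-sampled) resist
# wins; otherwise an unarmored creature displays the effective resist
# floor(fortitude / 10).
displayed_kinetic <- function(fortitude, special = NA) {
  if (!is.na(special) && special != 0) special
  else floor(fortitude / 10)
}

displayed_kinetic(455)                 # effective resist: 45
displayed_kinetic(455, special = 60)   # special wins: 60
displayed_kinetic(455, special = -50)  # vulnerabilities are special too: -50
```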
The strongest test: regularize the regression and let cross-validation decide how much credibility each coefficient deserves. Lasso and elastic-net penalize the absolute size of coefficients, so a fragile coefficient — one that’s only there to bookkeep a collinearity — should shrink toward zero or get dropped entirely. A robust signal survives.
library(glmnet)
predictors <- c("hardiness","fortitude","dexterity","intellect",
"cleverness","power","ke_floor","nonkinen")
X_una <- as.matrix(una_df[, predictors])
y_una <- una_df$level
set.seed(2026)
cv_ridge <- cv.glmnet(X_una, y_una, alpha = 0.0, nfolds = 10, standardize = TRUE)
cv_enet <- cv.glmnet(X_una, y_una, alpha = 0.5, nfolds = 10, standardize = TRUE)
cv_lasso <- cv.glmnet(X_una, y_una, alpha = 1.0, nfolds = 10, standardize = TRUE)
reg_compare <- data.frame(
Term = c("(Intercept)", predictors),
OLS = round(coef(m_una), 5),
Ridge_min = round(as.vector(coef(cv_ridge, s = "lambda.min")), 5),
ElasticNet_min = round(as.vector(coef(cv_enet, s = "lambda.min")), 5),
Lasso_min = round(as.vector(coef(cv_lasso, s = "lambda.min")), 5)
)
knitr::kable(reg_compare, row.names = FALSE,
caption = "Cross-validated coefficients at lambda.min, original scale")
| Term | OLS | Ridge_min | ElasticNet_min | Lasso_min |
|---|---|---|---|---|
| (Intercept) | 7.87316 | 8.56827 | 7.93384 | 7.92975 |
| hardiness | 0.01302 | 0.00627 | 0.01234 | 0.01239 |
| fortitude | -0.01936 | -0.01180 | -0.01860 | -0.01860 |
| dexterity | 0.00410 | 0.00619 | 0.00417 | 0.00408 |
| intellect | 0.01010 | 0.01309 | 0.01058 | 0.01056 |
| cleverness | 0.02480 | 0.02005 | 0.02426 | 0.02436 |
| power | 0.01508 | 0.01238 | 0.01483 | 0.01481 |
| ke_floor | 0.16654 | 0.16228 | 0.16675 | 0.16675 |
| nonkinen | 0.05430 | 0.04601 | 0.05312 | 0.05307 |
Lasso and elastic-net at the cross-validated optimum essentially reproduce the OLS solution, including the −0.019 fortitude coefficient. Ridge halves both ends of the (hardiness, fortitude) pair but keeps the negative sign on fortitude.
The lasso path is even more telling. Walking from λ=∞ down to λ=0, predictors enter the model in order of how much signal they carry:
| Predictor | EnteredAtLambda | EnteredSign |
|---|---|---|
| intellect | 6.98803 | + |
| cleverness | 4.81658 | + |
| ke_floor | 4.38869 | + |
| power | 1.73099 | + |
| nonkinen | 1.30943 | + |
| dexterity | 0.68274 | + |
| fortitude | 0.39069 | - |
| hardiness | 0.26929 | + |
fortitude enters at λ ≈ 0.39 with the negative sign already, and stays negative across the entire path. Crucially, hardiness enters last — even after fortitude. The “fragile” coefficient of the collinear pair is hardiness, not fortitude.
Putting the four tests together:
| Test | Predicted under “artifact” | Observed |
|---|---|---|
| Drop fortitude | No fit loss; another coef absorbs it | Hardiness flips to ~0; R² drops 3 pts |
| Constrained regression | No fit loss | R² drops 3 pts; U-shape in residuals |
| Effective resists | Coefficient explained by fort→resist channel | Coefficient identical on all-special subset |
| Elastic-net (lasso) | fortitude shrinks to zero | fortitude stays at −0.019; hardiness enters last |
The artifact hypothesis is rejected. The negative fortitude coefficient is doing real, irreducible predictive work — it cannot be reshuffled away by another coefficient and it cannot be regularized to zero without paying a real penalty.
That leaves two live mechanisms:

1. A genuine negative fortitude term hard-coded in the unarmored branch of the retail formula.
2. An “armor pays for itself” parameterization, in which the fortitude credit lives in the armored branch and unarmored creatures see a relative penalty.

The data does not let us choose between these two — both predict exactly the empirical coefficient pattern we observe. But for the purposes of reproducing the retail formula, the choice does not matter: the coefficient is real, and the SWGEmu re-implementation should keep it.
Put the two formulas together with a hard switch at fortitude=500:
source("R/creature_level_model.R")
normalized_df$predicted_level <- custom_model(normalized_df)
normalized_df$residuals <- normalized_df$level - normalized_df$predicted_level
cat(sprintf("Combined: n=%d, R^2=%.4f, residual SD=%.3f\n",
nrow(normalized_df),
1 - sum(normalized_df$residuals^2) /
sum((normalized_df$level - mean(normalized_df$level))^2),
sd(normalized_df$residuals)))
## Combined: n=370, R^2=0.9844, residual SD=1.775
Armored (fortitude >= 500) — clean fractions, R² ≈ 0.98:
level = -23
+ 0.01 * hardiness
+ 0.06 * fortitude
+ 0.005 * dexterity
+ 0.01 * intellect
+ 0.025 * cleverness
+ 0.015 * power
+ 0.10 * kinen
+ 0.08 * nonkinen
with kinen = (kinetic + energy) / 2 and nonkinen = (blast + heat + cold + electricity + acid + stun) / 6.
Unarmored (fortitude < 500) — M8 with cleverness hinge, R² ≈ 0.96:
level = 8.13
+ 0.012 * hardiness
- 0.019 * fortitude
+ 0.004 * dexterity
+ 0.011 * intellect
+ 0.020 * cleverness
+ 0.016 * power
+ 0.170 * pmax(pmin(kinetic, energy), 0)
+ 0.050 * nonkinen
+ 0.106 * pmax(cleverness - 400, 0)
Two structural features beyond the linear baseline:

- The kinetic/energy floor, pmax(pmin(kinetic, energy), 0): a vulnerability on either side erases all kin/eng resist credit.
- The cleverness hinge at 400: apex pets gain an extra ~0.106 CL per cleverness point above the knot.
Combined (both formulas, n=370): R² ≈ 0.98, residual SD ≈ 1.8 levels.
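A self-contained sketch of the combined predictor (custom_model in R/creature_level_model.R is the canonical version; coefficients as quoted above):

```r
# Two-formula level predictor with the hard switch at fortitude = 500
# (a sketch of custom_model in R/creature_level_model.R).
predict_level <- function(s) {
  kinen    <- (s$kinetic + s$energy) / 2
  nonkinen <- (s$blast + s$heat + s$cold +
               s$electricity + s$acid + s$stun) / 6
  if (s$fortitude >= 500) {
    # armored branch
    -23 +
      0.01  * s$hardiness  + 0.06  * s$fortitude +
      0.005 * s$dexterity  + 0.01  * s$intellect +
      0.025 * s$cleverness + 0.015 * s$power +
      0.10  * kinen + 0.08 * nonkinen
  } else {
    # unarmored branch (M8)
    8.13 +
      0.012 * s$hardiness  - 0.019 * s$fortitude +
      0.004 * s$dexterity  + 0.011 * s$intellect +
      0.020 * s$cleverness + 0.016 * s$power +
      0.170 * pmax(pmin(s$kinetic, s$energy), 0) +
      0.050 * nonkinen +
      0.106 * pmax(s$cleverness - 400, 0)
  }
}
```

With hardiness = 1000, fortitude = 600 and everything else zero, the armored branch gives CL 23; an all-zero stat line falls through to the unarmored intercept of 8.13.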
Three creatures resisted M7 stubbornly: rancor 6j048r2a (CL48, residual +8.16), razor cat pdefjush (CL49, +5.40), and a sibling rancor 0iafqb8b (CL48, +5.10). All three sit at the apex of the unarmored CL distribution, with cleverness above 400. M7’s linear cleverness term (coef ~0.025) wasn’t enough to lift them.
Scan for a single hinge pmax(cleverness − K, 0) over K ∈ [200, 475] in steps of 25:
no_armor_df <- normalized_df %>% dplyr::filter(armor == 0)
no_armor_df$ke_floor <- pmax(pmin(no_armor_df$kinetic, no_armor_df$energy), 0)
no_armor_df$nk <- (no_armor_df$blast + no_armor_df$heat +
no_armor_df$cold + no_armor_df$electricity +
no_armor_df$acid + no_armor_df$stun) / 6
m7_form <- level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nk
fit_m7 <- lm(m7_form, data = no_armor_df)
knot_scan <- data.frame(K = seq(200, 475, by = 25))
knot_scan$dAIC <- sapply(knot_scan$K, function(K) {
no_armor_df$.h <- pmax(no_armor_df$cleverness - K, 0)
fit <- update(fit_m7, ~ . + .h, data = no_armor_df)
AIC(fit_m7) - AIC(fit)
})
knot_scan$coef <- sapply(knot_scan$K, function(K) {
no_armor_df$.h <- pmax(no_armor_df$cleverness - K, 0)
coef(update(fit_m7, ~ . + .h, data = no_armor_df))[".h"]
})
print(knot_scan)
## K dAIC coef
## 1 200 33.16757 0.02431164
## 2 225 34.86292 0.02619438
## 3 250 36.03381 0.02845901
## 4 275 39.48425 0.03285469
## 5 300 40.42823 0.03778011
## 6 325 40.83226 0.04445046
## 7 350 38.14335 0.05330615
## 8 375 35.87911 0.06884239
## 9 400 40.54167 0.10577139
## 10 425 42.84150 0.17599156
## 11 450 36.67013 0.37060579
## 12 475 19.17646 8.61611519
The dAIC plateau is wide (K=300..425 all give roughly equivalent fit), with an in-sample peak at K=425 (dAIC ≈ +43, coefficient ≈ 0.18). That’s the headline result — a cleverness hinge worth ~+40 AIC over M7.
The optimal K is loose because only ~10 creatures sit above clev=400. We need to verify the hinge isn’t an artifact of a single creature.
10-fold CV (50 reps; full code in R/cv_cleverness_hinge.R): M7 baseline CV-RMSE is 1.929; the hinge family beats it by 0.09–0.14 RMSE for any K in [200, 425]. Best K=425 gives CV-RMSE 1.788 — about 50× the standard error. Out-of-sample improvement is unambiguous.
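The CV machinery reduces to a short base-R helper (R/cv_cleverness_hinge.R is the canonical code; this sketch assumes a data frame with a level column):

```r
# 10-fold CV-RMSE for a level-model formula; one rep, base R only.
cv_rmse <- function(form, df, k = 10, seed = 1) {
  set.seed(seed)
  fold <- sample(rep_len(seq_len(k), nrow(df)))   # random fold assignment
  sq_err <- unlist(lapply(seq_len(k), function(i) {
    fit <- lm(form, data = df[fold != i, , drop = FALSE])
    (df$level[fold == i] -
       predict(fit, newdata = df[fold == i, , drop = FALSE]))^2
  }))
  sqrt(mean(sq_err))
}
```

Run once per candidate K with the hinge term pmax(cleverness - K, 0) added to the M7 formula, then average RMSE over repeated fold assignments.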
Leave-one-out on the top-cleverness creatures: removing rancor 6j048r2a shifts the in-sample best K from 425 → 275 and the coefficient from 0.176 → 0.028. That single creature is the anchor for K=425. K=400, however, retains 88% of its coefficient when 6j048r2a is removed (0.106 → 0.093), and 86% when all three rancors are removed.
Bootstrap of the K=400 coefficient (200 reps): median 0.105, 95% CI [0.057, 0.129] — well clear of zero and stable.
Raw-HTML check on 6j048r2a: level=48 is correct, template stats (clev=476, power=338) are internally consistent with the recorded samples, and no plausible single-digit typo on level / cleverness / power makes the residual go away while remaining physically craftable.
no_armor_df$clev_h400 <- pmax(no_armor_df$cleverness - 400, 0)
fit_m8 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nk + clev_h400,
data = no_armor_df)
cat(sprintf("M8 unarmored: R^2=%.4f SD=%.4f AIC=%.2f\n",
summary(fit_m8)$r.squared, sd(resid(fit_m8)), AIC(fit_m8)))
## M8 unarmored: R^2=0.9584 SD=1.7279 AIC=1165.13
print(round(coef(fit_m8), 5))
## (Intercept) hardiness fortitude dexterity intellect cleverness
## 8.13225 0.01230 -0.01940 0.00444 0.01139 0.01951
## power ke_floor nk clev_h400
## 0.01561 0.16965 0.05038 0.10577
Effective slope above clev 400 is 0.020 + 0.106 ≈ 0.125 — apex pets gain CL more aggressively per cleverness point than mid-tier ones.
K=400 is chosen over the in-sample optimum K=425 because (a) it survives single-point removal, (b) it has a tight bootstrap CI, and (c) it’s a plausible game-design round number. Promoting K=425 would have meant banking on 6j048r2a.
A nice side-effect: the rancor “+5 skin adjustment” that earlier phases needed turns out to be a high-cleverness artifact. Mean rancor residual under M7 was +5.25; under M8 it is +1.48. The rancor row in the skin-adjustments table is retired.
If the cleverness hinge is a global game mechanic, the armored formula should benefit from the same term. The armored set has more high-cleverness creatures than unarmored (26 with clev ≥ 400 vs 10), so it’s a fair test.
arm_df <- normalized_df %>% dplyr::filter(armor == 1, fortitude >= 500)
arm_form <- level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power +
kinetic + energy +
blast + heat + cold + electricity + acid + stun
fit_arm <- lm(arm_form, data = arm_df)
# Knot scan
arm_scan <- data.frame(K = seq(200, 500, by = 50))
arm_scan$dAIC <- sapply(arm_scan$K, function(K) {
arm_df$.h <- pmax(arm_df$cleverness - K, 0)
AIC(fit_arm) - AIC(update(fit_arm, ~ . + .h, data = arm_df))
})
arm_scan$coef <- sapply(arm_scan$K, function(K) {
arm_df$.h <- pmax(arm_df$cleverness - K, 0)
coef(update(fit_arm, ~ . + .h, data = arm_df))[".h"]
})
print(arm_scan)
## K dAIC coef
## 1 200 8.649897 0.020299713
## 2 250 7.989614 0.016828796
## 3 300 5.057604 0.013344742
## 4 350 2.121336 0.010062670
## 5 400 -1.063663 0.004974901
## 6 450 -1.926995 0.001541793
## 7 500 -1.908498 0.002162013
The armored knot scan goes the wrong way. The full scan in R/test_hinge_on_armored.R extends below the table shown here: its in-sample best is K=100 with a tiny dAIC of about +11 (vs +43 for unarmored), and dAIC declines monotonically as K rises — already negative by K=400.
A free-coefficient hinge at K=400 has coef ≈ 0.005 (1/20 of the unarmored coefficient), ANOVA p ≈ 0.39, and a bootstrap 95% CI of roughly [−0.009, 0.018] — zero sits comfortably inside it. Forcing the unarmored coefficient (0.106) onto the armored formula collapses its R² from 0.987 to 0.925.
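Pinning a coefficient at a fixed value, as in the "force 0.106" comparison, is what `offset()` is for: the term contributes to the fitted values but consumes no degree of freedom and gets no estimated coefficient. A minimal sketch on synthetic data (the real comparison runs on `arm_df` with the full armored formula):

```r
# offset() pins a slope at a known value; compare against the free fit
set.seed(2)
d <- data.frame(x = runif(100), clev = runif(100, 0, 650))
d$y <- 1 + 2 * d$x + 0.02 * d$clev + rnorm(100, 0, 0.1)

free   <- lm(y ~ x + clev + pmax(clev - 400, 0), data = d)
forced <- lm(y ~ x + clev + offset(0.106 * pmax(clev - 400, 0)), data = d)
```

If the forced fit's R² collapses relative to the free fit, the fixed value is incompatible with the data — which is exactly what happens on the armored set.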
Conclusion: the cleverness hinge is structurally unarmored-specific. It’s not a stat-driven scaling that would surface anywhere cleverness gets high. This parallels the negative-fortitude finding — both are unique to the unarmored regime, consistent with the “armor pays for itself” interpretation. Armored creatures already pick up CL from fortitude (+0.06) and the steeper kinetic/energy terms; they don’t need the hinge. Unarmored apex pets, lacking that armor-derived credit, get it back through the cleverness hinge.
Full code in R/test_hinge_on_armored.R.
Two falumpaset outliers survive M8 with large positive residuals:
## serial skin level hardiness fortitude cleverness power pred_m8 resid_m8
## 1 01oatm1v falumpaset 32 575 406 191 447 26.57 5.43
## 2 1lc95n55 falumpaset 30 557 439 193 292 21.24 8.76
Both have low cleverness (≈ 192), so the hinge can’t help them. Raw HTML confirms the levels and template stats are internally consistent — no transcription error. Both share an unusual attack profile (“Dizzy/Strong poison/Ranged”) that the other 8 falumpasets don’t have, and across the unarmored set sa2=strong_poison does have a mean residual of +1.09. But this signal is almost certainly a confound: strong_poison comes mainly from the Aggression and Psychology sample-source slots in BE crafting, so the attack flag is a marker for template-source-slot composition (which the model doesn’t use), not an attack-to-level mechanism.
The more interesting observation is that these two falumpasets sit at the tail of a broader population pattern — high-hardiness, low-cleverness, CL30+ creatures that the linear model under-predicts:
## serial skin level hardiness cleverness power pred_m8 resid_m8
## 1 1lc95n55 falumpaset 30 557 193 292 21.24 8.76
## 2 01oatm1v falumpaset 32 575 191 447 26.57 5.43
## 3 1gnl98ni razor_cat 30 552 233 436 25.63 4.37
## 4 h854qtlr gnort 30 591 241 399 25.98 4.02
## 5 1f8gec1e kima 25 512 103 225 22.42 2.58
## 6 s29dv1mh durni 26 606 108 367 24.04 1.96
Quick scans suggest at least two more candidate hinges sitting in the M8 residuals.
Neither is promoted. They’re roughly 1/3 the size of the cleverness hinge effect, may overlap mechanistically (capturing one underlying mechanism, not multiple independent terms), and would need the same CV / sanity-check workflow before being trusted. The data is consistent with the game computing CL from stat bands plus piecewise structure rather than a single linear formula — the cleverness K=400 hinge is just the most identifiable slice because the dataset spans its knot evenly. Code in R/investigate_falumpaset_anomalies.R.
Earlier phases noted that residuals seemed concentrated near the fortitude=500 boundary, suggesting a possible “light armor” middle regime. Three tests under M8:
Cutoff scan: refit the partition for cutoffs T ∈ [460, 540] in steps of 10.
df <- normalized_df %>%
dplyr::mutate(ke_floor = pmax(pmin(kinetic, energy), 0),
nk = (blast + heat + cold + electricity + acid + stun) / 6,
clev_h400 = pmax(cleverness - 400, 0))
scan_cutoff <- function(T) {
un <- df %>% dplyr::filter(fortitude < T)
ar <- df %>% dplyr::filter(fortitude >= T)
if (nrow(un) < 50 || nrow(ar) < 30) return(NULL)
fit_u <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nk + clev_h400,
data = un)
fit_a <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power +
kinetic + energy + blast + heat + cold +
electricity + acid + stun, data = ar)
sse <- sum(resid(fit_u)^2) + sum(resid(fit_a)^2)
n <- nrow(un) + nrow(ar)
p <- length(coef(fit_u)) + length(coef(fit_a)) + 2
data.frame(T = T, n_un = nrow(un), n_ar = nrow(ar),
sd = sqrt(sse / n),
aic = n * log(sse/n) + 2 * p)
}
cutoff_scan <- do.call(rbind, lapply(seq(460, 540, by = 10), scan_cutoff))
cutoff_scan$dAIC <- min(cutoff_scan$aic) - cutoff_scan$aic
print(cutoff_scan)
## T n_un n_ar sd aic dAIC
## 1 460 276 94 1.930536 540.7704 -93.43871
## 2 470 281 89 1.909047 532.4869 -85.15526
## 3 480 285 85 1.759753 472.2284 -24.89672
## 4 490 288 82 1.727086 458.3623 -11.03067
## 5 500 290 80 1.701532 447.3317 0.00000
## 6 510 306 64 2.062877 589.8353 -142.50363
## 7 520 315 55 2.371146 692.8964 -245.56476
## 8 530 326 44 2.827763 823.2195 -375.88786
## 9 540 333 37 2.991871 864.9653 -417.63367
The fort=500 cutoff is a knife-edge optimum. Moving it 10 either way costs at least 11 AIC; raising it to 510 costs 142 AIC, to 520 costs ~250. The asymmetry tells us the unarmored formula extrapolates roughly linearly into the boundary, but classifying a true-armored creature as unarmored blows up the prediction.
Three-segment piecewise (low/mid/high with breaks at 450 and 540): worse than the two-segment split by ΔAIC ≈ −56. There is no “light armor” middle regime hiding in the data.
Boundary-band residuals: under M8 the bands have slightly elevated SD (~2.0 vs 1.7 baseline) but no systematic bias. The original M7-era claim “the boundary fits worst” was an artifact of the high-DPS under-prediction sitting in this region; M8 incidentally fixed it.
Joining creatures.csv to templates.csv and cross-tabulating the armor flag (creature side) against fortitude >= 500 (template side) on the unfiltered raw data:
raw_creatures <- readr::read_csv("data/clean/furrycat/creatures.csv",
show_col_types = FALSE)
raw_templates <- readr::read_csv("data/clean/furrycat/templates.csv",
show_col_types = FALSE)
joined_raw <- raw_creatures %>%
dplyr::inner_join(raw_templates %>%
dplyr::select(serial, fortitude),
by = c("template_id" = "serial"))
table(armor_flag = joined_raw$armor,
fort_ge_500 = joined_raw$fortitude >= 500)
## fort_ge_500
## armor_flag FALSE TRUE
## 0 374 1
## 1 1 85
Out of 461 raw creatures, only 2 disagreements exist: kjvtv7d7 (template fort=408 with “Light” armor specified, already filtered as bad data) and 3f0lpuko (template fort=501 with “Light” armor specified, but the creature output has empty/vulnerable AR in the raw HTML). In both disagreements the unarmored M8 formula fits better than the armored formula — the creature output is the ground truth, and a BE template specifying armor doesn’t guarantee the crafted output gets the AR slot.
The principled splitter is therefore armor == 0 from the creature output, not fortitude < 500 from the template. Empirically the two rules disagree on a handful of creatures (one in our filtered set), so the in-sample fit is essentially identical. custom_model keeps the fortitude split for now; an SWGEmu re-implementation should branch on the actual creature armor state, which the engine knows directly.
After M8, the unarmored set has SD ≈ 1.73. Most outliers are within the crafting-noise envelope. The structural ones that remain:
- The two falumpasets (1lc95n55 CL30, 01oatm1v CL32): +8.76 / +5.43. Discussed in Phase 17 — likely the tail of the high-hardiness / low-cleverness pattern.
- dcnqk0gj (razor_cat CL15, predicted ~20): over-predicted by ~5. Phase 10 candidate; possibly a HAM floor effect or the same broader hinge structure suggested in Phase 17.

A 10-creature “data-quality suspects” sensitivity analysis (low-health-for-skin entries plus the 6 effective-resist formula mismatches) shifts coefficients by less than 0.01 each. The formula is not being driven by a small number of suspect rows.
int calculateCreatureLevel(const Creature* c) {
float kinen = (c->kinetic + c->energy) / 2.0f;
float nonkinen = (c->blast + c->heat + c->cold +
c->electricity + c->acid + c->stun) / 6.0f;
float level;
if (c->fortitude >= 500) {
// ARMORED FORMULA
level = -23.0f
+ 0.01f * c->hardiness
+ 0.06f * c->fortitude
+ 0.005f * c->dexterity
+ 0.01f * c->intellect
+ 0.025f * c->cleverness
+ 0.015f * c->power
+ 0.1f * kinen
+ 0.08f * nonkinen;
} else {
// UNARMORED FORMULA — M8 = M7 weaker-resist-floor + cleverness hinge.
// ke_floor = max(0, min(kinetic, energy)): if either side has a
// vulnerability, the kin/eng resist credit collapses to zero.
// clev_hinge: above cleverness 400, level scales more aggressively.
float ke_floor = fmaxf(0.0f, fminf(c->kinetic, c->energy));
float clev_hinge = fmaxf(0.0f, c->cleverness - 400.0f);
level = 8.13f
+ 0.012f * c->hardiness
- 0.019f * c->fortitude
+ 0.004f * c->dexterity
+ 0.011f * c->intellect
+ 0.020f * c->cleverness
+ 0.016f * c->power
+ 0.170f * ke_floor
+ 0.050f * nonkinen
+ 0.106f * clev_hinge;
}
if (level < 1.0f) level = 1.0f;
return (int)(level + 0.5f); // round to nearest
}
The R counterpart of this pseudocode lives in R/creature_level_model.R as custom_model().
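For quick experimentation outside the engine, the pseudocode transliterates directly to R. This sketch is illustrative — `custom_model()` in R/creature_level_model.R remains the reference implementation:

```r
# R transliteration of calculateCreatureLevel(); s is a named numeric vector
# holding the attributes and resists the formula uses.
estimate_level <- function(s) {
  nonkinen <- mean(s[c("blast", "heat", "cold", "electricity", "acid", "stun")])
  if (s[["fortitude"]] >= 500) {
    # Armored formula
    kinen <- (s[["kinetic"]] + s[["energy"]]) / 2
    lvl <- -23 + 0.01 * s[["hardiness"]] + 0.06 * s[["fortitude"]] +
           0.005 * s[["dexterity"]] + 0.01 * s[["intellect"]] +
           0.025 * s[["cleverness"]] + 0.015 * s[["power"]] +
           0.1 * kinen + 0.08 * nonkinen
  } else {
    # Unarmored formula: weaker-resist floor plus cleverness hinge at 400
    ke_floor   <- max(0, min(s[["kinetic"]], s[["energy"]]))
    clev_hinge <- max(0, s[["cleverness"]] - 400)
    lvl <- 8.13 + 0.012 * s[["hardiness"]] - 0.019 * s[["fortitude"]] +
           0.004 * s[["dexterity"]] + 0.011 * s[["intellect"]] +
           0.020 * s[["cleverness"]] + 0.016 * s[["power"]] +
           0.170 * ke_floor + 0.050 * nonkinen + 0.106 * clev_hinge
  }
  max(1L, as.integer(floor(lvl + 0.5)))  # round to nearest, floor of CL 1
}
```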