Data

A GAM of Furrycat’s data shows a strong relationship between damage and power.

Note that damage spread was changed around patch 14.1, so many of the examples in the Furrycat data set are no longer valid.

The complete data set shows this immediately: there is a rather striking segment starting at around \(damage_{min} \approx 300\).

For the purposes of fitting the data, we will remove these pre-patch examples.

post_patch_dataset <- normalized_df %>%
  filter((damage_spread > 10 & power >= 380) | power < 380)
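
To make the filter's logic concrete, here is a minimal base-R sketch on made-up rows. The toy values are hypothetical, and it assumes `damage_spread` is simply `damage_high - damage_low`, which may not match how `normalized_df` actually defines it.

```r
# Hypothetical rows, only to illustrate which cases the filter keeps.
toy <- data.frame(
  power       = c(200, 400, 400, 600),
  damage_low  = c(170, 310, 315, 420),
  damage_high = c(180, 320, 340, 550)
)

# Assumption: spread is the gap between the high and low damage values.
toy$damage_spread <- toy$damage_high - toy$damage_low

# Same condition as the dplyr filter above, written with base R:
# rows below power 380 are always kept; rows at or above 380 are kept
# only when their spread is wider than 10.
kept <- subset(toy, (damage_spread > 10 & power >= 380) | power < 380)
print(kept)
```

In this toy set only the second row is dropped: its power is at least 380 but its spread is still 10, which is the pattern the filter treats as pre-patch.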

A naive generalized additive model shows high influence from power.

model_low <- gam(
  formula = damage_low ~
    s(hardiness) +
    s(fortitude) + 
    s(dexterity) + 
    s(endurance) +
    s(intellect) + 
    s(cleverness) + 
    s(courage) + 
    s(dependability) +
    s(power) +
    s(fierceness) +
    armor,
  family = gaussian(),
  data = post_patch_dataset
)
summary(model_low)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## damage_low ~ s(hardiness) + s(fortitude) + s(dexterity) + s(endurance) + 
##     s(intellect) + s(cleverness) + s(courage) + s(dependability) + 
##     s(power) + s(fierceness) + armor
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  148.479      0.264 562.384   <2e-16 ***
## armor         -1.003      1.077  -0.931    0.352    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                    edf Ref.df        F p-value    
## s(hardiness)     2.683  3.421    2.588  0.0732 .  
## s(fortitude)     1.000  1.000    0.081  0.7765    
## s(dexterity)     3.551  4.439    3.102  0.0129 *  
## s(endurance)     1.042  1.082    0.431  0.5581    
## s(intellect)     1.000  1.000    3.882  0.0497 *  
## s(cleverness)    1.000  1.000    0.476  0.4907    
## s(courage)       1.000  1.000    0.104  0.7478    
## s(dependability) 2.219  2.840    1.527  0.2778    
## s(power)         8.757  8.979 4863.284  <2e-16 ***
## s(fierceness)    1.000  1.000    0.059  0.8076    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.998   Deviance explained = 99.8%
## GCV = 18.766  Scale est. = 17.312    n = 326

\(damage_{high}\), in turn, is best expressed in terms of \(damage_{low}\).

model_high <- gam(
  formula = damage_high ~ s(damage_low),
  family = gaussian(),
  data = post_patch_dataset
)
summary(model_high)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## damage_high ~ s(damage_low)
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  161.933      0.178   909.9   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                 edf Ref.df     F p-value    
## s(damage_low) 8.817  8.987 39921  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.999   Deviance explained = 99.9%
## GCV = 10.646  Scale est. = 10.326    n = 326

A simple change-point detection and segmentation analysis gives the piecewise-defined regressions.

model <- lm(damage_low ~ power, data = post_patch_dataset)
summary(model)
## 
## Call:
## lm(formula = damage_low ~ power, data = post_patch_dataset)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -64.691  -5.621   1.410   6.342  12.957 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 17.605418   0.814543   21.61   <2e-16 ***
## power        0.742374   0.003722  199.47   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.73 on 324 degrees of freedom
## Multiple R-squared:  0.9919, Adjusted R-squared:  0.9919 
## F-statistic: 3.979e+04 on 1 and 324 DF,  p-value: < 2.2e-16
segmented.model <- segmented(model, seg.Z = ~power, psi = 380)
summary(segmented.model)
## 
##  ***Regression Model with Segmented Relationship(s)***
## 
## Call: 
## segmented.lm(obj = model, seg.Z = ~power, psi = 380)
## 
## Estimated Break-Point(s):
##                Est. St.Err
## psi1.power 368.222  6.649
## 
## Coefficients of the linear terms:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 12.917708   0.528739   24.43   <2e-16 ***
## power        0.778108   0.002831  274.84   <2e-16 ***
## U1.power    -0.249704   0.012071  -20.69       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.161 on 322 degrees of freedom
## Multiple R-Squared: 0.9972,  Adjusted R-squared: 0.9972 
## 
## Boot restarting based on 6 samples. Last fit:
## Convergence attained in 2 iterations (rel. change 8.5245e-14)
intercept(segmented.model)
## $power
##               Est.
## intercept1  12.918
## intercept2 104.860
slope(segmented.model)
## $power
##           Est.   St.Err. t value CI(95%).l CI(95%).u
## slope1 0.77811 0.0028312 274.840   0.77254   0.78368
## slope2 0.52840 0.0117340  45.032   0.50532   0.55149

The segmented fit matches the data visually.

What about \(damage_{high}\)? That’s even simpler, especially if expressed in terms of \(damage_{low}\).

model <- lm(damage_high ~ damage_low, data = post_patch_dataset)
segmented.model <- segmented(model, seg.Z = ~damage_low, psi=300)
summary(segmented.model)
## 
##  ***Regression Model with Segmented Relationship(s)***
## 
## Call: 
## segmented.lm(obj = model, seg.Z = ~damage_low, psi = 300)
## 
## Estimated Break-Point(s):
##                   Est. St.Err
## psi1.damage_low 301.8  1.314
## 
## Coefficients of the linear terms:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   10.073887   0.361536   27.86   <2e-16 ***
## damage_low     1.001367   0.002265  442.17   <2e-16 ***
## U1.damage_low  1.011465   0.016866   59.97       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.326 on 322 degrees of freedom
## Multiple R-Squared: 0.999,  Adjusted R-squared: 0.999 
## 
## Boot restarting based on 6 samples. Last fit:
## Convergence attained in 2 iterations (rel. change 8.7371e-13)
intercept(segmented.model)
## $damage_low
##                Est.
## intercept1   10.074
## intercept2 -295.190
slope(segmented.model)
## $damage_low
##          Est.   St.Err. t value CI(95%).l CI(95%).u
## slope1 1.0014 0.0022647  442.17   0.99691    1.0058
## slope2 2.0128 0.0167140  120.43   1.98000    2.0457
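
As a sanity check on how the pieces of the segmented output relate, the second segment's slope and intercept can be recomputed from the first segment, the `U1` term, and the break-point. This is a sketch using numbers copied from the summaries above; the data frame and its column names are only illustrative.

```r
# Values copied from the two segmented summaries above.
fits <- data.frame(
  fit        = c("damage_low ~ power", "damage_high ~ damage_low"),
  intercept1 = c(12.917708, 10.073887),
  slope1     = c(0.778108, 1.001367),
  u1         = c(-0.249704, 1.011465),  # change in slope at the break
  psi        = c(368.222, 301.8)        # estimated break-point
)

# A continuous two-segment line y = b0 + b1 * x + u1 * max(x - psi, 0)
# has, to the right of psi, slope b1 + u1 and intercept b0 - u1 * psi.
fits$slope2     <- fits$slope1 + fits$u1
fits$intercept2 <- fits$intercept1 - fits$u1 * fits$psi

print(fits[, c("fit", "slope2", "intercept2")])
```

The recomputed values agree with the reported `slope2` and `intercept2` up to the rounding of the printed break-points, which confirms that both fits are continuous at their breaks.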

The segmented fit for \(damage_{high}\) also matches the data.

Conclusion

Damage is roughly captured by the following formulas:

\[ \begin{equation} damage_{low} \approx \left\{\begin{array}{lr} \max(13 + (7/9) * power, 30), & 0 < power \le 369 \\ 300 + 0.5 * (power - 369), & power > 369 \end{array}\right. \end{equation} \]

\[ \begin{equation} damage_{high} \approx \left\{\begin{array}{lr} 10 + damage_{low}, & damage_{low} \le 300 \\ 2 * damage_{low} - 290, & damage_{low} > 300 \end{array}\right. \end{equation} \]
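
For anyone who wants numbers rather than formulas, the fitted segments can be evaluated directly. The sketch below uses the unrounded coefficients from the segmented fits (not the rounded formulas) and ignores the lower floor on \(damage_{low}\); the function names are hypothetical.

```r
# Coefficients copied from the segmented fits reported earlier.
predict_damage_low <- function(power) {
  ifelse(power <= 368.222,
         12.918 + 0.77811 * power,    # first segment
         104.860 + 0.52840 * power)   # second segment
}

predict_damage_high <- function(damage_low) {
  ifelse(damage_low <= 301.8,
         10.074 + 1.0014 * damage_low,
         -295.190 + 2.0128 * damage_low)
}

power <- c(100, 368, 500)
low   <- predict_damage_low(power)
high  <- predict_damage_high(low)
print(data.frame(power, low, high))
```

These are unrounded estimates, so they will not match in-game values exactly.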

Furthermore, \(damage_{high}\) rounds to the nearest \(10\), and \(damage_{low}\) (sometimes) rounds to the nearest \(5\). The rounding of these values needs more investigation. My belief is that \(damage_{high}\) is computed from the unrounded value of \(damage_{low}\), and that the two values are then rounded differently. I do consider some of the data to be of suspect quality, however, especially where the difference between \(damage_{high}\) and \(damage_{low}\) is allegedly \(5\).
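
The rounding hypothesis can be sketched as follows. This only illustrates the belief stated above (high damage derived from the unrounded low value, then each value rounded to its own step); it is not confirmed behavior, and `round_to` is a hypothetical helper.

```r
# Round x to the nearest multiple of step.
round_to <- function(x, step) round(x / step) * step

power <- 250                       # example value in the first segment

# First-segment formulas from the conclusion above.
low_raw  <- 13 + (7 / 9) * power   # unrounded damage_low
high_raw <- 10 + low_raw           # based on the unrounded low value

low_shown  <- round_to(low_raw, 5)    # damage_low to the nearest 5
high_shown <- round_to(high_raw, 10)  # damage_high to the nearest 10

print(c(low = low_shown, high = high_shown, spread = high_shown - low_shown))
```

Under this hypothesis the displayed spread need not be exactly \(10\) even when the underlying spread is, which is one way odd-looking differences could arise in the data.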