Data

A GAM of Furrycat’s data shows a strong relationship between damage and power.

Note that damage spread was changed near patch 14.1, so many of the examples in the Furrycat data set are not valid.

The change is immediately visible in the complete data set, where a rather striking segment begins at around \(damage_{min} \approx 300\).

For the purposes of fitting the data, we will remove these pre-patch examples, identified as rows with power of at least 380 whose damage spread is still 10 or less.

post_patch_dataset <- normalized_df %>%
  filter((damage_spread > 10 & power >= 380) | power < 380)
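To see what the filter keeps and drops, here is a minimal sketch on a few invented rows. The `toy` data frame and its values are hypothetical, it assumes `damage_spread` is the difference between high and low damage, and it uses base R indexing in place of `dplyr::filter` so that it runs without extra packages.

```r
# Hypothetical rows, invented only to exercise the filter condition.
toy <- data.frame(
  power         = c(200, 379, 380, 450, 450),
  damage_spread = c(10,  10,  10,  10,  50)
)

# Same logical condition as the dplyr filter above:
# low-power rows are always kept; high-power rows only if the spread exceeds 10.
keep <- (toy$damage_spread > 10 & toy$power >= 380) | toy$power < 380
post_patch_toy <- toy[keep, ]
post_patch_toy
```

The two high-power rows with a spread of 10 are treated as pre-patch and dropped; the other three rows remain.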

A naive generalized additive model shows high influence from power.

model_low <- gam(
  formula = damage_low ~
    s(hardiness) +
    s(fortitude) + 
    s(dexterity) + 
    s(endurance) +
    s(intellect) + 
    s(cleverness) + 
    s(courage) + 
    s(dependability) +
    s(power) +
    s(fierceness) +
    armor,
  family = gaussian(),
  data = post_patch_dataset
)
summary(model_low)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## damage_low ~ s(hardiness) + s(fortitude) + s(dexterity) + s(endurance) + 
##     s(intellect) + s(cleverness) + s(courage) + s(dependability) + 
##     s(power) + s(fierceness) + armor
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 157.3054     0.2924 537.947   <2e-16 ***
## armor        -1.2650     1.0974  -1.153     0.25    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                    edf Ref.df        F p-value    
## s(hardiness)     1.689  2.128    2.705  0.0746 .  
## s(fortitude)     1.000  1.000    0.757  0.3851    
## s(dexterity)     2.987  3.740    3.203  0.0143 *  
## s(endurance)     1.000  1.000    0.118  0.7310    
## s(intellect)     1.000  1.000    3.606  0.0587 .  
## s(cleverness)    1.000  1.000    0.063  0.8016    
## s(courage)       1.000  1.000    0.298  0.5853    
## s(dependability) 2.318  2.957    1.660  0.2142    
## s(power)         8.592  8.945 4456.709  <2e-16 ***
## s(fierceness)    1.000  1.000    0.087  0.7682    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.998   Deviance explained = 99.8%
## GCV = 19.673  Scale est. = 18.051    n = 286

And \(damage_{high}\) is best expressed in terms of \(damage_{low}\).

model_high <- gam(
  formula = damage_high ~ s(damage_low),
  family = gaussian(),
  data = post_patch_dataset
)
summary(model_high)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## damage_high ~ s(damage_low)
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  171.346      0.196   874.3   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                 edf Ref.df     F p-value    
## s(damage_low) 8.805  8.985 33106  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.999   Deviance explained = 99.9%
## GCV = 11.374  Scale est. = 10.984    n = 286

A simple change-point detection and segmentation analysis gives the piecewise-defined regressions.

model <- lm(damage_low ~ power, data = post_patch_dataset)
summary(model)
## 
## Call:
## lm(formula = damage_low ~ power, data = post_patch_dataset)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -63.546  -5.810   1.459   6.495  13.262 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 18.088064   0.941345   19.21   <2e-16 ***
## power        0.740082   0.004125  179.40   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.035 on 284 degrees of freedom
## Multiple R-squared:  0.9913, Adjusted R-squared:  0.9912 
## F-statistic: 3.219e+04 on 1 and 284 DF,  p-value: < 2.2e-16
segmented.model <- segmented(model, seg.Z = ~power, psi = 380)
summary(segmented.model)
## 
##  ***Regression Model with Segmented Relationship(s)***
## 
## Call: 
## segmented.lm(obj = model, seg.Z = ~power, psi = 380)
## 
## Estimated Break-Point(s):
##                Est. St.Err
## psi1.power 367.009   6.33
## 
## Meaningful coefficients of the linear terms:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 12.343985   0.591159   20.88   <2e-16 ***
## power        0.780646   0.003071  254.22   <2e-16 ***
## U1.power    -0.252538   0.011690  -21.60       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.063 on 282 degrees of freedom
## Multiple R-Squared: 0.9973,  Adjusted R-squared: 0.9972 
## 
## Boot restarting based on 6 samples. Last fit:
## Convergence attained in 3 iterations (rel. change 7.5492e-16)
intercept(segmented.model)
## $power
##               Est.
## intercept1  12.344
## intercept2 105.030
slope(segmented.model)
## $power
##           Est.   St.Err. t value CI(95%).l CI(95%).u
## slope1 0.78065 0.0030708 254.210   0.77460   0.78669
## slope2 0.52811 0.0112790  46.822   0.50591   0.55031

The segmented fit agrees visually with the data.
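The second segment's slope and intercept can be recovered by hand from the coefficient table. This is a minimal sketch with the constants copied from the output above; it relies on `U1.power` being the change in slope past the break-point, and on the two lines meeting at `psi1.power`. The variable names are illustrative only.

```r
# Constants copied from the segmented summary output.
intercept1 <- 12.343985   # (Intercept)
slope1     <- 0.780646    # power
slope_diff <- -0.252538   # U1.power: change in slope after the break-point
psi        <- 367.009     # psi1.power: estimated break-point

# Slope of the second segment.
slope2 <- slope1 + slope_diff

# Continuity at psi: intercept1 + slope1 * psi == intercept2 + slope2 * psi
intercept2 <- intercept1 - slope_diff * psi

round(c(slope2 = slope2, intercept2 = intercept2), 3)
```

Both values agree with the `slope()` and `intercept()` output to within the rounding of the printed break-point.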

What about \(damage_{high}\)? That’s even simpler, especially if expressed in terms of \(damage_{low}\).

model <- lm(damage_high ~ damage_low, data = post_patch_dataset)
segmented.model <- segmented(model, seg.Z = ~damage_low, psi=300)
summary(segmented.model)
## 
##  ***Regression Model with Segmented Relationship(s)***
## 
## Call: 
## segmented.lm(obj = model, seg.Z = ~damage_low, psi = 300)
## 
## Estimated Break-Point(s):
##                     Est. St.Err
## psi1.damage_low 301.612  1.363
## 
## Meaningful coefficients of the linear terms:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   10.585290   0.422076   25.08   <2e-16 ***
## damage_low     0.999041   0.002552  391.43   <2e-16 ***
## U1.damage_low  1.013791   0.017460   58.06       NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.438 on 282 degrees of freedom
## Multiple R-Squared: 0.999,  Adjusted R-squared: 0.999 
## 
## Boot restarting based on 6 samples. Last fit:
## Convergence attained in 6 iterations (rel. change 4.0206e-12)
intercept(segmented.model)
## $damage_low
##                Est.
## intercept1   10.585
## intercept2 -295.190
slope(segmented.model)
## $damage_low
##           Est.   St.Err. t value CI(95%).l CI(95%).u
## slope1 0.99904 0.0025523  391.43   0.99402    1.0041
## slope2 2.01280 0.0172720  116.54   1.97880    2.0468

This fit also agrees with the data.
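The fitted model can also be evaluated by hand and compared with the rounded-off rules given in the conclusion. A minimal sketch, assuming the usual hinge form of a one-break segmented regression; `predict_hinge` is a hypothetical helper, not part of the `segmented` package, and the constants are copied from the summary output above.

```r
# Hinge form: intercept + slope * x + slope_diff * max(x - psi, 0)
predict_hinge <- function(x, intercept, slope, slope_diff, psi) {
  intercept + slope * x + slope_diff * pmax(x - psi, 0)
}

# Evaluate the fitted damage_high model on either side of the break-point.
damage_low_values <- c(200, 350)
fitted_high <- predict_hinge(
  x = damage_low_values,
  intercept = 10.585290, slope = 0.999041,
  slope_diff = 1.013791, psi = 301.612
)

# Simplified rules: damage_low + 10 below 300, 2 * damage_low - 290 above.
simple_high <- c(200 + 10, 2 * 350 - 290)

data.frame(damage_low_values, fitted_high, simple_high)
```

At both points the simplified rule lands within one damage point of the fitted value.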

Conclusion

Damage is roughly captured by the following formulas:

\[ \begin{equation} damage_{low} \approx \left\{\begin{array}{lr} max(13 + (7/9) * power, 30), & 0 < power \le 369 \\ 300 + 0.5 * (power - 369), & power > 369 \end{array}\right. \end{equation} \]

\[ \begin{equation} damage_{high} \approx \left\{\begin{array}{lr} 10 + damage_{low}, & damage_{low} \le 300 \\ 2 * damage_{low} - 290, & damage_{low} > 300 \end{array}\right. \end{equation} \]
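As a rough sketch, the two approximations can be written as vectorized R functions. The function names are hypothetical. The upper segment of \(damage_{low}\) is written as `300 + 0.5 * (power - 369)`, an assumption chosen so that it joins the lower segment at power 369 and stays close to the fitted second segment (intercept about 105, slope about 0.53).

```r
# Approximate damage_low from power (unrounded).
damage_low_approx <- function(power) {
  lower <- pmax(13 + (7 / 9) * power, 30)  # segment for 0 < power <= 369
  upper <- 300 + 0.5 * (power - 369)       # segment for power > 369 (assumed form)
  ifelse(power <= 369, lower, upper)
}

# Approximate damage_high from the unrounded damage_low.
damage_high_approx <- function(damage_low) {
  ifelse(damage_low <= 300, damage_low + 10, 2 * damage_low - 290)
}

power <- c(9, 369, 469)
low   <- damage_low_approx(power)
high  <- damage_high_approx(low)
data.frame(power, low, high)
```

Both pieces of each function meet at their break-point, so the approximations are continuous.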

Furthermore, \(damage_{high}\) rounds to the nearest \(10\), and \(damage_{low}\) (sometimes) rounds to the nearest \(5\). My belief is that \(damage_{high}\) is computed from the unrounded value of \(damage_{low}\), and that each value is then rounded differently. More investigation into this rounding behavior is needed; I also consider some of the data to be of suspect quality, especially where the difference between \(damage_{high}\) and \(damage_{low}\) is allegedly \(5\).
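The rounding hypothesis can be sketched as follows. This is only an illustration under the stated belief: `round_to` is a hypothetical helper, the unrounded inputs are invented, and the game's actual rounding rule is not confirmed. R's `round()` resolves exact ties toward the even digit, which may differ from the game, so the example values avoid ties.

```r
# Round x to the nearest multiple of step.
round_to <- function(x, step) round(x / step) * step

# Hypothetical unrounded damage_low values.
low_raw <- c(153, 157, 318)

# damage_high computed from the unrounded damage_low, as hypothesized.
high_raw <- ifelse(low_raw <= 300, low_raw + 10, 2 * low_raw - 290)

shown <- data.frame(
  low_raw,
  low_shown  = round_to(low_raw, 5),    # damage_low to the nearest 5
  high_shown = round_to(high_raw, 10)   # damage_high to the nearest 10
)
shown$spread_shown <- shown$high_shown - shown$low_shown
shown
```

Under this reading, rounding the two values separately can make the displayed spread in the lower segment 5 or 15 instead of 10, which may be worth checking against the rows flagged as suspect.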