A GAM of Furrycat’s data shows a strong relationship between damage and power.
Note that damage spread was changed near patch 14.1, so many of the examples in the Furrycat data set are not valid.
We can see this immediately in the complete data set as well, which shows a rather striking segment starting at around \(damage_{min} \approx 300\).
For the purposes of fitting the data, we will remove these pre-patch examples.
post_patch_dataset <- normalized_df %>%
  filter((damage_spread > 10 & power >= 380) | power < 380)
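To make the filter's logic concrete, here is a minimal sketch on a handful of made-up rows. The `toy` data frame and its values are hypothetical and are not taken from the Furrycat data. The predicate keeps every row below `power = 380` and, at or above it, only rows whose spread exceeds 10. The presumed reason is that pre-patch rows keep a spread of 10 even at high power, which is what the filter treats as the pre-patch signature.

```r
# Hypothetical toy rows; the real columns come from normalized_df.
toy <- data.frame(
  power         = c(120, 379, 380, 450, 500),
  damage_spread = c(10,  10,  10,  55,  10)
)

# Same predicate as the filter() call, written with base R indexing.
keep <- (toy$damage_spread > 10 & toy$power >= 380) | toy$power < 380
post_patch_toy <- toy[keep, ]

# High-power rows with a spread of exactly 10 are dropped.
post_patch_toy$power
```

Base R indexing is used here only so the sketch runs without loading dplyr; the behaviour should match the `filter()` call for rows without missing values.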
A naive generalized additive model shows high influence from power.
model_low <- gam(
  formula = damage_low ~
    s(hardiness) +
    s(fortitude) +
    s(dexterity) +
    s(endurance) +
    s(intellect) +
    s(cleverness) +
    s(courage) +
    s(dependability) +
    s(power) +
    s(fierceness) +
    armor,
  family = gaussian(),
  data = post_patch_dataset
)
summary(model_low)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## damage_low ~ s(hardiness) + s(fortitude) + s(dexterity) + s(endurance) +
## s(intellect) + s(cleverness) + s(courage) + s(dependability) +
## s(power) + s(fierceness) + armor
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 157.3054 0.2924 537.947 <2e-16 ***
## armor -1.2650 1.0974 -1.153 0.25
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(hardiness) 1.689 2.128 2.705 0.0746 .
## s(fortitude) 1.000 1.000 0.757 0.3851
## s(dexterity) 2.987 3.740 3.203 0.0143 *
## s(endurance) 1.000 1.000 0.118 0.7310
## s(intellect) 1.000 1.000 3.606 0.0587 .
## s(cleverness) 1.000 1.000 0.063 0.8016
## s(courage) 1.000 1.000 0.298 0.5853
## s(dependability) 2.318 2.957 1.660 0.2142
## s(power) 8.592 8.945 4456.709 <2e-16 ***
## s(fierceness) 1.000 1.000 0.087 0.7682
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.998 Deviance explained = 99.8%
## GCV = 19.673 Scale est. = 18.051 n = 286
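Since `s(power)` dominates the summary, a natural follow-up is to compare the full model against one that uses power alone. The sketch below does this on simulated data, not on the Furrycat data: `toy`, `full`, and `reduced` are hypothetical names, and the simulated response simply follows a two-segment line in power plus noise. It only illustrates the comparison technique and says nothing about the real data set.

```r
library(mgcv)

set.seed(1)
n <- 200

# Simulated stand-in data: damage depends on power only; fierceness is noise.
toy <- data.frame(
  power      = runif(n, 10, 600),
  fierceness = runif(n, 0, 100)
)
toy$damage_low <- ifelse(
  toy$power <= 369,
  13 + (7 / 9) * toy$power,
  300 + 0.5 * (toy$power - 369)
) + rnorm(n, sd = 4)

full    <- gam(damage_low ~ s(power) + s(fierceness), data = toy)
reduced <- gam(damage_low ~ s(power), data = toy)

# If the extra smooth adds nothing, the two fits should be nearly equivalent.
AIC(full, reduced)
summary(reduced)$r.sq
```

On the real data, the same comparison would use `post_patch_dataset` and the full formula from `model_low`.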
\(damage_{high}\), in turn, is best expressed in terms of \(damage_{low}\).
model_high <- gam(
formula = damage_high ~ s(damage_low),
family = gaussian(),
data = post_patch_dataset
)
summary(model_high)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## damage_high ~ s(damage_low)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 171.346 0.196 874.3 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(damage_low) 8.805 8.985 33106 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.999 Deviance explained = 99.9%
## GCV = 11.374 Scale est. = 10.984 n = 286
A simple change-point detection and segmentation analysis gives the piecewise-defined regressions.
model <- lm(damage_low ~ power, data = post_patch_dataset)
summary(model)
##
## Call:
## lm(formula = damage_low ~ power, data = post_patch_dataset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -63.546 -5.810 1.459 6.495 13.262
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.088064 0.941345 19.21 <2e-16 ***
## power 0.740082 0.004125 179.40 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.035 on 284 degrees of freedom
## Multiple R-squared: 0.9913, Adjusted R-squared: 0.9912
## F-statistic: 3.219e+04 on 1 and 284 DF, p-value: < 2.2e-16
segmented.model <- segmented(model, seg.Z = ~power, psi = 380)
summary(segmented.model)
##
## ***Regression Model with Segmented Relationship(s)***
##
## Call:
## segmented.lm(obj = model, seg.Z = ~power, psi = 380)
##
## Estimated Break-Point(s):
## Est. St.Err
## psi1.power 367.009 6.33
##
## Meaningful coefficients of the linear terms:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.343985 0.591159 20.88 <2e-16 ***
## power 0.780646 0.003071 254.22 <2e-16 ***
## U1.power -0.252538 0.011690 -21.60 NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.063 on 282 degrees of freedom
## Multiple R-Squared: 0.9973, Adjusted R-squared: 0.9972
##
## Boot restarting based on 6 samples. Last fit:
## Convergence attained in 3 iterations (rel. change 7.5492e-16)
intercept(segmented.model)
## $power
## Est.
## intercept1 12.344
## intercept2 105.030
slope(segmented.model)
## $power
## Est. St.Err. t value CI(95%).l CI(95%).u
## slope1 0.78065 0.0030708 254.210 0.77460 0.78669
## slope2 0.52811 0.0112790 46.822 0.50591 0.55031
The segmented fit matches the data visually.
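As a sanity check on how the segmented output hangs together, the second segment's slope and intercept can be recovered from the first segment, the slope change `U1.power`, and the breakpoint, because the two lines must meet at the breakpoint. The numbers below are copied from the output above; the variable names are only illustrative.

```r
# Values copied from the segmented summary output above.
psi        <- 367.009     # estimated break-point in power
intercept1 <- 12.343985
slope1     <- 0.780646
slope_diff <- -0.252538   # U1.power: change in slope after the break

# The second slope is the first slope plus the change.
slope2 <- slope1 + slope_diff

# Continuity at psi: intercept1 + slope1 * psi == intercept2 + slope2 * psi
intercept2 <- intercept1 - slope_diff * psi

round(slope2, 5)
round(intercept2, 2)
```

Both values agree with the reported `slope2` and `intercept2` up to the rounding of the printed breakpoint.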
What about \(damage_{high}\)? That is even simpler, especially when expressed in terms of \(damage_{low}\).
model <- lm(damage_high ~ damage_low, data = post_patch_dataset)
segmented.model <- segmented(model, seg.Z = ~damage_low, psi=300)
summary(segmented.model)
##
## ***Regression Model with Segmented Relationship(s)***
##
## Call:
## segmented.lm(obj = model, seg.Z = ~damage_low, psi = 300)
##
## Estimated Break-Point(s):
## Est. St.Err
## psi1.damage_low 301.612 1.363
##
## Meaningful coefficients of the linear terms:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.585290 0.422076 25.08 <2e-16 ***
## damage_low 0.999041 0.002552 391.43 <2e-16 ***
## U1.damage_low 1.013791 0.017460 58.06 NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.438 on 282 degrees of freedom
## Multiple R-Squared: 0.999, Adjusted R-squared: 0.999
##
## Boot restarting based on 6 samples. Last fit:
## Convergence attained in 6 iterations (rel. change 4.0206e-12)
intercept(segmented.model)
## $damage_low
## Est.
## intercept1 10.585
## intercept2 -295.190
slope(segmented.model)
## $damage_low
## Est. St.Err. t value CI(95%).l CI(95%).u
## slope1 0.99904 0.0025523 391.43 0.99402 1.0041
## slope2 2.01280 0.0172720 116.54 1.97880 2.0468
This fit also matches the data.
Damage is roughly captured by the following formulas:
\[ \begin{equation} damage_{low} \approx \left\{\begin{array}{lr} \max(13 + (7/9) * power, 30), & 0 < power \le 369 \\ 300 + 0.5 * (power - 369), & power > 369 \end{array}\right. \end{equation} \]
\[ \begin{equation} damage_{high} \approx \left\{\begin{array}{lr} 10 + damage_{low}, & damage_{low} \le 300 \\ 2 * damage_{low} - 290, & damage_{low} > 300 \end{array}\right. \end{equation} \]
Both formulas are continuous at their breakpoints: \(damage_{low} = 300\) at \(power = 369\), and \(damage_{high} = 310\) at \(damage_{low} = 300\), in line with the fitted segments.
Furthermore, \(damage_{high}\) rounds to the nearest \(10\), and \(damage_{low}\) (sometimes) rounds to the nearest \(5\). The rounding needs more investigation; my belief is that \(damage_{high}\) is computed from the unrounded value of \(damage_{low}\), and that the two values are then rounded differently. I also consider some of the data to be of suspect quality, especially where the difference between \(damage_{high}\) and \(damage_{low}\) is allegedly \(5\).
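The approximate formulas can be written as two small vectorized R functions. This is a sketch under stated assumptions, not the game's actual implementation: the function names are hypothetical, no rounding is applied (the rounding behaviour is still unclear), and the upper branch of \(damage_{low}\) is anchored at the breakpoint as `300 + 0.5 * (power - 369)`, the reading consistent with the fitted second segment (intercept of about 105, slope of about 0.53).

```r
# Piecewise approximation of damage_low from power (unrounded).
damage_low_est <- function(power) {
  ifelse(
    power <= 369,
    pmax(13 + (7 / 9) * power, 30),   # lower segment, floored at 30
    300 + 0.5 * (power - 369)         # upper segment, anchored at the break
  )
}

# Piecewise approximation of damage_high from damage_low (unrounded).
damage_high_est <- function(damage_low) {
  ifelse(damage_low <= 300, damage_low + 10, 2 * damage_low - 290)
}

power <- c(10, 180, 369, 469)
low   <- damage_low_est(power)
high  <- damage_high_est(low)
data.frame(power, low, high)
```

For these inputs the estimates come out at roughly 30, 153, 300, and 350 for \(damage_{low}\), and 40, 163, 310, and 410 for \(damage_{high}\). Comparing them with observed rows should use a tolerance of a few points, given the unresolved rounding.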