A GAM of Furrycat’s data shows a strong relationship between damage and power.
Note that damage spread was changed near patch 14.1, so many of the examples in the Furrycat data set are not valid.
We can see this immediately in the complete data set as well, where there is a rather striking segment starting at around \(damage_{min} \approx 300\).
For the purposes of fitting the data, we will remove these pre-patch examples.
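library(dplyr)  # provides %>% and filter()
# Keep every row below 380 power; at or above 380, keep only rows with a spread above 10.
post_patch_dataset <- normalized_df %>%
  filter((damage_spread > 10 & power >= 380) | power < 380)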
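To make the filter's logic concrete, here is a minimal base-R sketch on made-up rows. The `toy` data frame and its values are hypothetical and only illustrate which combinations of `power` and `damage_spread` survive the condition; they are not taken from the Furrycat data.

```r
# Toy rows; the values are invented purely to exercise the filter condition.
toy <- data.frame(
  power         = c(200, 379, 380, 450, 450),
  damage_spread = c(10, 10, 10, 10, 60)
)

# Same logical condition as the dplyr filter, written in base R.
keep <- (toy$damage_spread > 10 & toy$power >= 380) | toy$power < 380
post_patch_toy <- toy[keep, ]

print(post_patch_toy)
```

Rows below 380 power are always kept, while rows at or above 380 power are kept only when their spread exceeds 10.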
A naive generalized additive model shows high influence from power.
library(mgcv)  # provides gam() and s()
model_low <- gam(
  formula = damage_low ~
    s(hardiness) +
    s(fortitude) +
    s(dexterity) +
    s(endurance) +
    s(intellect) +
    s(cleverness) +
    s(courage) +
    s(dependability) +
    s(power) +
    s(fierceness) +
    armor,
  family = gaussian(),
  data = post_patch_dataset
)
summary(model_low)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## damage_low ~ s(hardiness) + s(fortitude) + s(dexterity) + s(endurance) +
## s(intellect) + s(cleverness) + s(courage) + s(dependability) +
## s(power) + s(fierceness) + armor
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 148.479 0.264 562.384 <2e-16 ***
## armor -1.003 1.077 -0.931 0.352
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(hardiness) 2.683 3.421 2.588 0.0732 .
## s(fortitude) 1.000 1.000 0.081 0.7765
## s(dexterity) 3.551 4.439 3.102 0.0129 *
## s(endurance) 1.042 1.082 0.431 0.5581
## s(intellect) 1.000 1.000 3.882 0.0497 *
## s(cleverness) 1.000 1.000 0.476 0.4907
## s(courage) 1.000 1.000 0.104 0.7478
## s(dependability) 2.219 2.840 1.527 0.2778
## s(power) 8.757 8.979 4863.284 <2e-16 ***
## s(fierceness) 1.000 1.000 0.059 0.8076
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.998 Deviance explained = 99.8%
## GCV = 18.766 Scale est. = 17.312 n = 326
And \(damage_{high}\) is best expressed in terms of \(damage_{low}\).
model_high <- gam(
  formula = damage_high ~ s(damage_low),
  family = gaussian(),
  data = post_patch_dataset
)
summary(model_high)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## damage_high ~ s(damage_low)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 161.933 0.178 909.9 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(damage_low) 8.817 8.987 39921 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.999 Deviance explained = 99.9%
## GCV = 10.646 Scale est. = 10.326 n = 326
A simple change-point detection and segmentation analysis gives the piecewise-defined regressions.
model <- lm(damage_low ~ power, data = post_patch_dataset)
summary(model)
##
## Call:
## lm(formula = damage_low ~ power, data = post_patch_dataset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -64.691 -5.621 1.410 6.342 12.957
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.605418 0.814543 21.61 <2e-16 ***
## power 0.742374 0.003722 199.47 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.73 on 324 degrees of freedom
## Multiple R-squared: 0.9919, Adjusted R-squared: 0.9919
## F-statistic: 3.979e+04 on 1 and 324 DF, p-value: < 2.2e-16
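library(segmented)  # provides segmented(), intercept() and slope()
segmented.model <- segmented(model, seg.Z = ~power, psi = 380)
summary(segmented.model)
##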
## ***Regression Model with Segmented Relationship(s)***
##
## Call:
## segmented.lm(obj = model, seg.Z = ~power, psi = 380)
##
## Estimated Break-Point(s):
## Est. St.Err
## psi1.power 368.222 6.649
##
## Coefficients of the linear terms:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.917708 0.528739 24.43 <2e-16 ***
## power 0.778108 0.002831 274.84 <2e-16 ***
## U1.power -0.249704 0.012071 -20.69 NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.161 on 322 degrees of freedom
## Multiple R-Squared: 0.9972, Adjusted R-squared: 0.9972
##
## Boot restarting based on 6 samples. Last fit:
## Convergence attained in 2 iterations (rel. change 8.5245e-14)
intercept(segmented.model)
## $power
## Est.
## intercept1 12.918
## intercept2 104.860
slope(segmented.model)
## $power
## Est. St.Err. t value CI(95%).l CI(95%).u
## slope1 0.77811 0.0028312 274.840 0.77254 0.78368
## slope2 0.52840 0.0117340 45.032 0.50532 0.55149
The segmented fit matches the data visually.
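As a check on the output above, the second segment can be recovered from the reported coefficients. This assumes the usual `segmented` parameterization, in which `U1.power` is the change in slope at the breakpoint \(\psi\); the names \(slope_2\) and \(intercept_2\) simply mirror the `slope2` and `intercept2` rows of the output.

\[ \begin{aligned} slope_2 &= slope_1 + U_1 = 0.778108 - 0.249704 = 0.528404 \\ intercept_2 &= intercept_1 - U_1 \cdot \psi = 12.917708 + 0.249704 \cdot 368.222 \approx 104.86 \end{aligned} \]

Both values agree with the reported `slope2` and `intercept2` up to rounding of the printed breakpoint.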
What about \(damage_{high}\)? That’s even simpler, especially if expressed in terms of \(damage_{low}\).
model <- lm(damage_high ~ damage_low, data = post_patch_dataset)
segmented.model <- segmented(model, seg.Z = ~damage_low, psi=300)
summary(segmented.model)
##
## ***Regression Model with Segmented Relationship(s)***
##
## Call:
## segmented.lm(obj = model, seg.Z = ~damage_low, psi = 300)
##
## Estimated Break-Point(s):
## Est. St.Err
## psi1.damage_low 301.8 1.314
##
## Coefficients of the linear terms:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.073887 0.361536 27.86 <2e-16 ***
## damage_low 1.001367 0.002265 442.17 <2e-16 ***
## U1.damage_low 1.011465 0.016866 59.97 NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.326 on 322 degrees of freedom
## Multiple R-Squared: 0.999, Adjusted R-squared: 0.999
##
## Boot restarting based on 6 samples. Last fit:
## Convergence attained in 2 iterations (rel. change 8.7371e-13)
intercept(segmented.model)
## $damage_low
## Est.
## intercept1 10.074
## intercept2 -295.190
slope(segmented.model)
## $damage_low
## Est. St.Err. t value CI(95%).l CI(95%).u
## slope1 1.0014 0.0022647 442.17 0.99691 1.0058
## slope2 2.0128 0.0167140 120.43 1.98000 2.0457
This also fits the data.
Damage is roughly captured by the following formulas:
\[ \begin{equation} damage_{low} \approx \left\{\begin{array}{lr} \max(13 + (7/9) \cdot power, 30), & 0 < power \le 369 \\ 300 + 0.5 \cdot (power - 369), & power > 369 \end{array}\right. \end{equation} \]
\[ \begin{equation} damage_{high} \approx \left\{\begin{array}{lr} 10 + damage_{low}, & damage_{low} \le 300 \\ 2 \cdot damage_{low} - 290, & damage_{low} > 300 \end{array}\right. \end{equation} \]
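The piecewise formulas can be sketched as two small base-R functions. The function names are hypothetical. The upper \(damage_{low}\) branch is written here as `300 + 0.5 * (power - 369)`, an assumption chosen so that it joins the lower branch at `power = 369` and stays close to the fitted second segment (intercept about 105, slope about 0.53). Rounding is deliberately left out.

```r
# Approximate damage_low from power (unrounded), following the piecewise formula.
damage_low_approx <- function(power) {
  lower <- pmax(13 + (7 / 9) * power, 30)   # floor of 30 at very low power
  upper <- 300 + 0.5 * (power - 369)        # assumed form: continuous at power = 369
  ifelse(power <= 369, lower, upper)
}

# Approximate damage_high from (unrounded) damage_low.
damage_high_approx <- function(damage_low) {
  ifelse(damage_low <= 300, 10 + damage_low, 2 * damage_low - 290)
}

power <- c(10, 100, 369, 500)
low   <- damage_low_approx(power)
high  <- damage_high_approx(low)

print(data.frame(power, low, high))
```

At `power = 369` both branches of each formula meet (a low of 300 and a high of 310), so the approximation has no jump at the breakpoints.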
Furthermore, \(damage_{high}\) rounds to the nearest \(10\), and \(damage_{low}\) (sometimes) rounds to the nearest \(5\). My belief is that \(damage_{high}\) is computed from the non-rounded value of \(damage_{low}\), and that the two values are then rounded differently. More investigation into this rounding behavior is needed; however, I consider some of the data to be of suspect quality, especially where the difference between \(damage_{high}\) and \(damage_{low}\) is allegedly \(5\).
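The rounding hypothesis can be explored with a small sketch. Everything here is an assumption made for illustration: the helper `round_to`, the choice of round-half-up, and the raw values, none of which come from the data set. The sketch takes \(damage_{high}\) from the unrounded \(damage_{low}\) (using the lower branch, \(10 + damage_{low}\)) and only then rounds each value to its own step.

```r
# Hypothetical helper: round x to the nearest multiple of `step` (half rounds up).
round_to <- function(x, step) floor(x / step + 0.5) * step

raw_low  <- c(143, 147, 152.3)   # made-up unrounded damage_low values
raw_high <- raw_low + 10         # lower branch of the damage_high formula

shown_low  <- round_to(raw_low, 5)    # damage_low rounded to the nearest 5
shown_high <- round_to(raw_high, 10)  # damage_high rounded to the nearest 10

print(data.frame(raw_low, shown_low, shown_high, spread = shown_high - shown_low))
```

Under these assumptions the displayed spread need not be exactly 10 even when the unrounded spread is, so this kind of check may help separate rounding effects from genuinely bad records.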