Bio-Engineering in Star Wars: Galaxies let players craft custom creatures (“pets”) by combining DNA samples. The creature’s level (CL) governed who could tame the pet and how powerful it was. The 2003 BE Guide noted that CL “mostly depends on damage, resists, and the CL of the donors,” but the exact algorithm was never published.
This document reverse-engineers that algorithm from the Furrycat archive. The headline findings:
- Level is computed by two distinct formulas, switching at fortitude >= 500 (a threshold which agrees with the actual armor flag on 99.6% of creatures).
- Combined, the two formulas reach R² ≈ 0.98 and residual SD ≈ 1.8 levels on 370 creatures (79 armored, 291 unarmored).
A single GAM across all creatures gives a quick picture of which attributes move level.
library(mgcv)  # gam() and s()

model.gam.everything <- gam(
level ~
s(hardiness) +
s(fortitude) +
s(dexterity) +
s(endurance) +
s(intellect) +
s(cleverness) +
s(dependability) +
s(courage) +
s(fierceness) +
s(power) +
s(kinetic) +
s(energy) +
s(blast) +
s(heat) +
s(cold) +
s(electricity) +
s(acid) +
s(stun),
data = normalized_df,
family = gaussian()
)
summary(model.gam.everything)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## level ~ s(hardiness) + s(fortitude) + s(dexterity) + s(endurance) +
## s(intellect) + s(cleverness) + s(dependability) + s(courage) +
## s(fierceness) + s(power) + s(kinetic) + s(energy) + s(blast) +
## s(heat) + s(cold) + s(electricity) + s(acid) + s(stun)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.05135 0.06939 332.2 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(hardiness) 2.774 3.579 4.803 0.001739 **
## s(fortitude) 8.925 8.994 92.676 < 2e-16 ***
## s(dexterity) 7.381 8.354 7.183 < 2e-16 ***
## s(endurance) 1.000 1.000 9.498 0.002248 **
## s(intellect) 3.023 3.854 23.165 < 2e-16 ***
## s(cleverness) 4.658 5.777 14.774 < 2e-16 ***
## s(dependability) 1.090 1.171 6.456 0.011768 *
## s(courage) 4.612 5.632 4.316 0.000485 ***
## s(fierceness) 1.000 1.000 6.345 0.012290 *
## s(power) 8.165 8.785 18.242 < 2e-16 ***
## s(kinetic) 8.115 8.758 5.838 2.50e-07 ***
## s(energy) 8.453 8.892 50.694 < 2e-16 ***
## s(blast) 1.000 1.000 3.715 0.054860 .
## s(heat) 1.000 1.000 0.027 0.869184
## s(cold) 2.969 3.733 9.033 2.46e-06 ***
## s(electricity) 1.000 1.000 12.828 0.000399 ***
## s(acid) 3.651 4.474 4.538 0.001255 **
## s(stun) 1.000 1.000 11.723 0.000704 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.991 Deviance explained = 99.3%
## GCV = 2.2033 Scale est. = 1.7816 n = 370
Several attributes appear important: fortitude, power, kinetic, energy, and dexterity all earn high-edf, strongly non-linear smooths at p < 2e-16, with hardiness, intellect, cleverness, and several other resists also clearly significant.
And a handful land on ~1.00 effective degrees of freedom (shrunk to a straight, nearly flat line): endurance, fierceness, blast, heat, electricity. We treat those as noise attributes for level purposes.
The fortitude smooth is the most interesting term. Its shape will turn out to encode the two-formula structure.
To avoid over-fitting to small per-attribute weight differences, we collapse correlated attributes into synthetic features:
- average_hdi = (hardiness + dexterity + intellect) / 3 — these three attributes have similar weights against level; collapsing them prevents the model from over-attributing influence to whichever happens to correlate most with fortitude.
- kinen = (kinetic + energy) / 2 — kinetic and energy share the in-game 60% cap; treating them as one resist class respects that mechanic.
- nonkinen = (blast + heat + cold + electricity + acid + stun) / 6 — the other six resists, none of which are individually significant after collapsing.

These are derived in R/data.R and ride along on normalized_df.
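A minimal base-R sketch of the three derived features, on an invented single-creature stat line:

```r
# Invented single-creature stat line; in the real pipeline these features are
# built in R/data.R and attached to normalized_df.
creature <- list(
  hardiness = 800, dexterity = 700, intellect = 600,
  kinetic = 60, energy = 20,
  blast = 10, heat = 0, cold = 30, electricity = 0, acid = 0, stun = 20
)
average_hdi <- with(creature, (hardiness + dexterity + intellect) / 3)  # 700
kinen       <- with(creature, (kinetic + energy) / 2)                   # 40
nonkinen    <- with(creature, (blast + heat + cold +
                               electricity + acid + stun) / 6)          # 10
```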
A re-fit GAM with these collapsed features reproduces the same picture with far fewer terms:
model.gam <- gam(
level ~
s(average_hdi) +
s(fortitude) +
s(cleverness) +
s(power) +
s(kinen) +
s(nonkinen),
data = normalized_df
)
summary(model.gam)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## level ~ s(average_hdi) + s(fortitude) + s(cleverness) + s(power) +
## s(kinen) + s(nonkinen)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.05135 0.09015 255.7 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(average_hdi) 5.600 6.785 36.79 <2e-16 ***
## s(fortitude) 8.919 8.994 79.20 <2e-16 ***
## s(cleverness) 7.694 8.556 15.32 <2e-16 ***
## s(power) 2.371 3.040 56.96 <2e-16 ***
## s(kinen) 4.282 5.267 57.10 <2e-16 ***
## s(nonkinen) 2.356 2.978 44.41 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.985 Deviance explained = 98.6%
## GCV = 3.2935 Scale est. = 3.0067 n = 370
Holding all other attributes at their mean, the fortitude smooth shows the discontinuity that motivates the two-formula split:
The smooth is decreasing below fortitude=500 and increasing above it. We’ll quantify and verify that breakpoint shortly.
For creatures with fortitude >= 500, a plain linear regression already fits remarkably well.
linear.fit.level.armor <- lm(
level ~ average_hdi + fortitude + cleverness + power + kinen + nonkinen,
data = armor_df
)
summary(linear.fit.level.armor)
##
## Call:
## lm(formula = level ~ average_hdi + fortitude + cleverness + power +
## kinen + nonkinen, data = armor_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.1643 -1.0873 0.1926 1.0277 4.3139
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -21.331842 3.491081 -6.110 4.60e-08 ***
## average_hdi 0.027648 0.004965 5.568 4.18e-07 ***
## fortitude 0.056252 0.006059 9.285 6.18e-14 ***
## cleverness 0.024034 0.003182 7.552 1.05e-10 ***
## power 0.015740 0.002460 6.398 1.39e-08 ***
## kinen 0.096920 0.018767 5.164 2.06e-06 ***
## nonkinen 0.085904 0.015458 5.557 4.36e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.93 on 72 degrees of freedom
## Multiple R-squared: 0.9818, Adjusted R-squared: 0.9802
## F-statistic: 645.6 on 6 and 72 DF, p-value: < 2.2e-16
Residual diagnostics:
shapiro.test(rs.armor)
##
## Shapiro-Wilk normality test
##
## data: rs.armor
## W = 0.98336, p-value = 0.3941
library(lmtest)  # Breusch-Pagan test
bptest(linear.fit.level.armor)
##
## studentized Breusch-Pagan test
##
## data: linear.fit.level.armor
## BP = 6.1042, df = 6, p-value = 0.4116
The armored residuals are small (±4 levels), normally distributed, and homoscedastic. This is consistent with the residuals being driven by crafting-system randomness rather than a missing variable.
A clean, game-dev-friendly version of the armored formula:
level = -23
+ 0.01 * hardiness
+ 0.06 * fortitude
+ 0.005 * dexterity
+ 0.01 * intellect
+ 0.025 * cleverness
+ 0.015 * power
+ 0.10 * kinen
+ 0.08 * nonkinen
R² ≈ 0.98, residual SD ≈ 1.8 levels.
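As a quick sanity check, the rounded formula evaluated on an invented (but plausible) armored stat line:

```r
# Rounded armored formula on an invented stat line (kinen and nonkinen are
# the collapsed resist features defined earlier).
hardiness <- 1000; fortitude <- 600; dexterity <- 400; intellect <- 200
cleverness <- 200; power <- 400; kinen <- 50; nonkinen <- 25

level <- -23 +
  0.01  * hardiness + 0.06  * fortitude +
  0.005 * dexterity + 0.01  * intellect +
  0.025 * cleverness + 0.015 * power +
  0.10  * kinen + 0.08  * nonkinen
level  # 10 + 36 + 2 + 2 + 5 + 6 + 5 + 2 - 23 = 45
```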
The same recipe on unarmored creatures gives a worse fit, and the residuals clearly violate normality.
linear.fit.level.noarmor <- lm(
level ~ average_hdi + fortitude + cleverness + power + kinen + nonkinen,
data = no_armor_df
)
summary(linear.fit.level.noarmor)
##
## Call:
## lm(formula = level ~ average_hdi + fortitude + cleverness + power +
## kinen + nonkinen, data = no_armor_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.0819 -1.2764 -0.0060 0.9886 7.5673
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.481758 0.415248 20.43 <2e-16 ***
## average_hdi 0.027228 0.002052 13.27 <2e-16 ***
## fortitude -0.018734 0.001335 -14.04 <2e-16 ***
## cleverness 0.027003 0.002131 12.67 <2e-16 ***
## power 0.013156 0.001308 10.06 <2e-16 ***
## kinen 0.114275 0.008455 13.52 <2e-16 ***
## nonkinen 0.060657 0.006034 10.05 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.14 on 284 degrees of freedom
## Multiple R-squared: 0.9375, Adjusted R-squared: 0.9362
## F-statistic: 709.9 on 6 and 284 DF, p-value: < 2.2e-16
shapiro.test(rs.noarmor)
##
## Shapiro-Wilk normality test
##
## data: rs.noarmor
## W = 0.98048, p-value = 0.0005241
bptest(linear.fit.level.noarmor)
##
## studentized Breusch-Pagan test
##
## data: linear.fit.level.noarmor
## BP = 46.786, df = 6, p-value = 2.064e-08
Three things stand out from this naive unarmored fit:

- The fortitude coefficient comes out negative (−0.019); more on that below.
- The residuals reject both normality (Shapiro-Wilk p ≈ 5e-4) and homoscedasticity (Breusch-Pagan p ≈ 2e-8), unlike the armored fit.
- A heavy left tail in the residuals turned out to be a coherent cluster: low-CL unarmored creatures with ~10k Health and a kinetic resist near the 60% cap — exactly the builds shortchanged by the kinen average.
## Uber-CL10-style cluster (level <= 12 and kinetic >= 50):
## n = 39
## mean kinetic = 57.8, mean energy = -51.7
## mean kinen = (k+e)/2 = 3.0
## mean naive residual = -0.45 (negative = over-predicted)
The Bio-Engineer “trick”: stack kinetic resist near the 60% cap, accept a heavy negative energy resist as the cost. The naive kinen = (k+e)/2 averages a near-cap positive against a heavily-negative number and credits the small remainder linearly — so the formula thinks the creature has ~5% “average” resist and predicts CL ~13, when in practice the energy vulnerability erases the resist’s value entirely. The game evidently penalizes the vulnerability harder than (k+e)/2 does.
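The arithmetic of that failure, with an invented resist line:

```r
# Uber-CL10-style resist line: kinetic stacked to the cap, energy deeply
# negative as the crafting cost (numbers invented for illustration).
kinetic <- 60
energy  <- -50
kinen <- (kinetic + energy) / 2
kinen  # 5: the naive average still credits a small positive "resist"
```

The naive term hands back 5 points of credit; the data say the energy vulnerability should erase that credit entirely.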
We test three symmetric formulations of the kinetic/energy term against the kinen baseline. All four use the same other predictors; only the resist term changes.
| ID | Resist term | Mechanic |
|---|---|---|
| M0 | kinen = (k+e)/2 | naive average |
| M5 | kinen_pos + kinen_neg (split positive / negative half) | asymmetric credit on positive vs penalty on negative |
| M6 | pmax(kinen, 0) | average resist, but no negative credit |
| M7 | pmax(pmin(k, e), 0) | use the weaker side, floored at zero |
All four candidate terms are symmetric in kinetic and energy — both resists share the 60% in-engine cap, so treating them as a single resist class is mechanically motivated.
no_armor_df <- no_armor_df %>%
mutate(
kinen_pos = pmax(kinen, 0),
kinen_neg = pmin(kinen, 0),
ke_floor = pmax(pmin(kinetic, energy), 0)
)
m0 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + kinen + nonkinen,
data = no_armor_df)
m5 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + kinen_pos + kinen_neg + nonkinen,
data = no_armor_df)
m6 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + kinen_pos + nonkinen,
data = no_armor_df)
m7 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = no_armor_df)
| Model | R² | Resid SD | AIC | Uber-CL10 SD |
|---|---|---|---|---|
| M0 kinen | 0.9380 | 2.109 | 1279.2 | 1.174 |
| M5 kinen_pos+kinen_neg | 0.9495 | 1.903 | 1221.3 | 1.422 |
| M6 pmax(kinen,0) | 0.9492 | 1.908 | 1221.0 | 1.413 |
| M7 pmax(pmin(k,e),0) | 0.9518 | 1.859 | 1205.7 | 0.752 |
M7 is unambiguously best on every metric: highest R², lowest residual SD, lowest AIC, and crucially the tightest fit on the uber-CL10 cluster itself (SD 0.75 vs 1.17 for the naive baseline). M5 and M6 capture the cluster’s mean but not its tightness — a creature with kinetic = 60, energy = -50 still gets pmax(kinen, 0) = 5 of credit it apparently shouldn’t.
The mechanism reading: vulnerability on either kinetic or energy erases all kin/eng resist credit. Roughly half of unarmored creatures get zero credit from this term because they have a vulnerability on one side.
We promote M7 as the canonical unarmored form. The fitted coefficients:
summary(m7)
##
## Call:
## lm(formula = level ~ hardiness + fortitude + dexterity + intellect +
## cleverness + power + ke_floor + nonkinen, data = no_armor_df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.9320 -1.0468 -0.1114 0.9451 8.3337
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.873163 0.384465 20.478 < 2e-16 ***
## hardiness 0.013015 0.001638 7.947 4.61e-14 ***
## fortitude -0.019361 0.001453 -13.320 < 2e-16 ***
## dexterity 0.004101 0.001827 2.245 0.0256 *
## intellect 0.010102 0.002249 4.492 1.03e-05 ***
## cleverness 0.024797 0.002902 8.544 8.29e-16 ***
## power 0.015077 0.001201 12.556 < 2e-16 ***
## ke_floor 0.166537 0.009827 16.948 < 2e-16 ***
## nonkinen 0.054300 0.005645 9.618 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.885 on 282 degrees of freedom
## Multiple R-squared: 0.9518, Adjusted R-squared: 0.9505
## F-statistic: 696.5 on 8 and 282 DF, p-value: < 2.2e-16
Q-Q residuals:
The Shapiro-Wilk test still rejects normality (p ≈ 3e-6), so there is additional structure left in the unarmored residuals beyond the vulnerability mechanism — see “Persistent Outliers” below.
The two-formula split assumes a hard discontinuity at fortitude=500. We test it four ways: a single global linear regression with no break, a GAM smooth, a data-driven segmented regression, and the proposed hard split.
df <- normalized_df %>%
mutate(ke_floor = pmax(pmin(kinetic, energy), 0))
# Single global linear (no break)
m_global <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = df)
# GAM with smooth fortitude
m_gam <- gam(level ~ s(fortitude) + hardiness + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = df)
# Segmented regression (data-driven breakpoint)
library(segmented)
seg_in <- lm(level ~ fortitude + hardiness + dexterity + intellect +
cleverness + power + ke_floor + nonkinen, data = df)
m_seg <- segmented(seg_in, seg.Z = ~ fortitude, psi = 500)
# Two formulas, hard split at fort = 500
arm_df <- df %>% dplyr::filter(armor == 1 & fortitude >= 500)
una_df <- df %>% dplyr::filter(armor == 0)
m_arm <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = arm_df)
m_una <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = una_df)
resid_split <- c(resid(m_arm), resid(m_una))
ss_split <- sum(resid_split^2)
n_split <- length(resid_split)
r2_split <- 1 - ss_split / sum((c(arm_df$level, una_df$level) -
mean(c(arm_df$level, una_df$level)))^2)
sd_split <- sd(resid_split)
# AIC for the joint two-formula model: sum of AICs (each is independent)
aic_split <- AIC(m_arm) + AIC(m_una)
| Model | R² | Resid SD | AIC |
|---|---|---|---|
| Single global linear (no break) | 0.9266 | 3.852 | 2067.0 |
| GAM smooth fortitude | 0.9784 | 2.046 | 1614.2 |
| Segmented regression (break at fort=445) | 0.9768 | 2.165 | 1644.6 |
| Two formulas, hard split at fort=500 | 0.9823 | 1.892 | 1560.2 |
The hard split at fortitude=500 beats the single global linear by ~500 AIC points and beats a flexible GAM smooth by ~50. That second margin is the important one: it says the relationship has a true discontinuity at 500, not just a slope change. Classic step function in the game code.
The sign flip across the break is iron-clad:
| Subset | n | Fortitude coef | p-value |
|---|---|---|---|
| Unarmored (fortitude < 500) | 291 | -0.01936 | < 2e-16 |
| Armored (fortitude >= 500) | 79 | 0.05730 | < 2e-16 |
Both coefficients are extremely significant, and they have opposite signs.
The unarmored fortitude coefficient is negative. Taken at face value, this would say that a creature with more fortitude has a lower level — counterintuitive enough to deserve scrutiny. There are three families of explanation:

1. A statistical artifact: collinearity bookkeeping, or a double-count through the fortitude → effective-resist channel.
2. A genuine negative fortitude term in the unarmored branch of the retail formula.
3. An equivalent parameterization in which the armored branch’s fortitude credit makes unarmored fortitude read as a relative penalty (“armor pays for itself”).

We can rule out (1) with the data, but the data alone cannot distinguish (2) from (3). Four tests, in order.
Within the unarmored subset, fortitude correlates 0.85 with hardiness and 0.84 with health — Bio-Engineers wire these up together when crafting. If the negative-fortitude coefficient is bookkeeping for the hardiness correlation, dropping fortitude from the model should let hardiness pick up the signal with no loss of fit.
m_una_full <- m_una
m_una_no_fort <- lm(level ~ hardiness + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = una_df)
shift_compare <- data.frame(
Term = c("hardiness", "dexterity", "intellect",
"cleverness", "power", "ke_floor", "nonkinen"),
With_fortitude = coef(m_una_full)[c("hardiness","dexterity","intellect",
"cleverness","power","ke_floor","nonkinen")],
Without_fortitude = coef(m_una_no_fort)[c("hardiness","dexterity","intellect",
"cleverness","power","ke_floor","nonkinen")]
)
shift_compare$Shift <- shift_compare$Without_fortitude - shift_compare$With_fortitude
knitr::kable(shift_compare, digits = 4, row.names = FALSE,
caption = "Coefficient shifts when fortitude is dropped from the unarmored model")
| Term | With_fortitude | Without_fortitude | Shift |
|---|---|---|---|
| hardiness | 0.0130 | -0.0011 | -0.0141 |
| dexterity | 0.0041 | 0.0031 | -0.0010 |
| intellect | 0.0101 | 0.0188 | 0.0087 |
| cleverness | 0.0248 | 0.0171 | -0.0077 |
| power | 0.0151 | 0.0094 | -0.0057 |
| ke_floor | 0.1665 | 0.1845 | 0.0179 |
| nonkinen | 0.0543 | 0.0316 | -0.0227 |
The hardiness coefficient collapses from +0.013 to effectively zero (−0.001) — exactly what the artifact hypothesis predicts. So far, the picture is consistent with the negative coefficient being collinearity bookkeeping.
A manual VIF (variance inflation factor) confirms the collinearity is real:
manual_vif <- function(model) {
X <- model.matrix(model)[, -1, drop = FALSE]
vifs <- sapply(seq_len(ncol(X)), function(j) {
fit <- lm(X[, j] ~ X[, -j])
1 / (1 - summary(fit)$r.squared)
})
setNames(vifs, colnames(X))
}
round(manual_vif(m_una_full), 2)
## hardiness fortitude dexterity intellect cleverness power ke_floor
## 6.36 4.72 5.72 9.76 7.54 2.31 1.75
## nonkinen
## 1.66
Fortitude itself is at moderate VIF (~5); intellect and cleverness are worse. This test is suggestive of artifact, but not decisive — the fact that hardiness can absorb the signal does not prove the signal is nothing but hardiness.
If the negative coefficient is pure artifact, constraining the fortitude coefficient to be ≥ 0 (equivalent to setting it to zero, since the unconstrained estimate is on the wrong side of the constraint) should cost no predictive performance. Hardiness will absorb the signal and the model will fit just as well.
By KKT conditions for inequality-constrained least squares: when the unconstrained estimate violates the constraint, the constrained optimum sets the violating coefficient to its boundary (zero) and re-fits the others. So the constrained regression is mathematically identical to the drop-fortitude lm above; we just look at the predictive metrics this time.
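The boundary behavior is easy to demonstrate on synthetic data (a toy sketch, unrelated to the creature dataset): when the unconstrained estimate of a coefficient is negative, box-constrained least squares pins it at zero and reproduces the drop-variable fit.

```r
# Toy demonstration of the KKT argument: the unconstrained coefficient on x2
# is negative, so constrained least squares (b2 >= 0) pins b2 at the boundary
# and the remaining coefficients match lm(y ~ x1).
set.seed(42)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.9 * x1 + rnorm(n, sd = 0.3)                # collinear with x1
y  <- 2 + 1.0 * x1 - 0.5 * x2 + rnorm(n, sd = 0.5)

rss <- function(b) sum((y - b[1] - b[2] * x1 - b[3] * x2)^2)
con <- optim(c(0, 0, 0), rss, method = "L-BFGS-B",
             lower = c(-Inf, -Inf, 0))             # constrain b2 >= 0
drop_fit <- lm(y ~ x1)

con$par[3]                               # 0: pinned at the boundary
max(abs(con$par[1:2] - coef(drop_fit)))  # ~0: identical to dropping x2
```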
una_df$resid_unc <- resid(m_una_full)
una_df$resid_con <- resid(m_una_no_fort)
cat(sprintf("Unconstrained M7: R^2 = %.4f, resid SD = %.3f, AIC = %.1f\n",
summary(m_una_full)$r.squared, sd(resid(m_una_full)),
AIC(m_una_full)))
## Unconstrained M7: R^2 = 0.9518, resid SD = 1.859, AIC = 1205.7
cat(sprintf("Constrained (fort >= 0): R^2 = %.4f, resid SD = %.3f, AIC = %.1f\n",
summary(m_una_no_fort)$r.squared, sd(resid(m_una_no_fort)),
AIC(m_una_no_fort)))
## Constrained (fort >= 0): R^2 = 0.9215, resid SD = 2.373, AIC = 1345.7
cat(sprintf("Delta R^2 = %+.4f, delta SD = %+.3f, delta AIC = %+.1f\n",
summary(m_una_no_fort)$r.squared - summary(m_una_full)$r.squared,
sd(resid(m_una_no_fort)) - sd(resid(m_una_full)),
AIC(m_una_no_fort) - AIC(m_una_full)))
## Delta R^2 = -0.0303, delta SD = +0.514, delta AIC = +140.0
The cost is real and substantial: ~3 percentage points of R², half a level of residual SD, and ~140 AIC points. The artifact hypothesis predicts ≈ zero cost — this is a clear rejection.
The cost is not uniformly distributed. Where does it land?
| Fortitude band | n | SD (unc) | SD (con) | Mean resid (unc) | Mean resid (con) |
|---|---|---|---|---|---|
| 0-100 | 110 | 1.513 | 1.933 | 0.019 | 0.636 |
| 100-200 | 30 | 1.806 | 2.476 | -0.133 | 0.227 |
| 200-300 | 35 | 1.975 | 2.158 | 0.336 | 1.092 |
| 300-400 | 24 | 1.740 | 2.062 | 0.125 | -0.036 |
| 400-500 | 91 | 2.228 | 2.497 | -0.155 | -1.253 |
| NA | 1 | NA | NA | 1.270 | -0.131 |
The constrained model develops a clean U-shape across fortitude bands: positive bias at low fortitude (under-prediction by ~0.6 levels in the 0–100 band) and negative bias at high fortitude (over-prediction by ~1.3 levels in the 400–500 band). That’s the residual signature of a missing linear term in fortitude — the negative fortitude coefficient was filling exactly that hole.
The uber-CL10 cluster is the most affected:
uber_idx <- una_df$level <= 12 & una_df$kinetic >= 50
cat(sprintf("Uber-CL10 cluster (n = %d):\n", sum(uber_idx)))
## Uber-CL10 cluster (n = 39):
cat(sprintf(" Unconstrained: mean resid %+.2f, SD %.2f\n",
mean(una_df$resid_unc[uber_idx]),
sd(una_df$resid_unc[uber_idx])))
## Unconstrained: mean resid -0.35, SD 0.75
cat(sprintf(" Constrained: mean resid %+.2f, SD %.2f\n",
mean(una_df$resid_con[uber_idx]),
sd(una_df$resid_con[uber_idx])))
## Constrained: mean resid -1.86, SD 0.75
Same spread, but a 1.5-level downward bias. The negative fortitude coefficient was specifically pulling these creatures from “predicted ~12” down to “actually CL 10.”
A more subtle artifact possibility: in the game, kinetic and energy resists come from two sources — effective resists derived directly from fortitude (floor(fortitude / 10) for unarmored) and special resists from DNA samples. When both are present, special wins and effective drops out. For creatures whose displayed kinetic/energy is effective, the column M7 consumes is literally a function of fortitude — a potential double-count that could create a spurious coefficient.
unarm <- normalized_df %>% dplyr::filter(armor == 0) %>%
mutate(
ke_floor = pmax(pmin(kinetic, energy), 0),
k_effective = !is.na(kinetic.effective) & kinetic.effective != 0,
e_effective = !is.na(energy.effective) & energy.effective != 0,
k_special = !is.na(kinetic.special) & kinetic.special != 0,
e_special = !is.na(energy.special) & energy.special != 0
)
# Verify: kinetic.effective = floor(fortitude / 10) for unarmored
unarm_eff <- unarm %>% dplyr::filter(k_effective)
exact_match <- sum(unarm_eff$kinetic.effective == floor(unarm_eff$fortitude / 10))
cat(sprintf("kinetic.effective matches floor(fortitude/10): %d / %d\n",
exact_match, nrow(unarm_eff)))
## kinetic.effective matches floor(fortitude/10): 34 / 40
# Refit on the all-special subset (no fortitude->resist algebraic relationship)
all_special <- unarm %>%
dplyr::filter(!k_effective & !e_effective &
(k_special | kinetic == 0) &
(e_special | energy == 0))
m_special <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nonkinen,
data = all_special)
cat(sprintf("Full unarmored fortitude coef: %.5f (n=%d)\n",
coef(m_una)["fortitude"], nobs(m_una)))
## Full unarmored fortitude coef: -0.01936 (n=291)
cat(sprintf("All-special fortitude coef: %.5f (n=%d)\n",
coef(m_special)["fortitude"], nobs(m_special)))
## All-special fortitude coef: -0.01998 (n=239)
The fortitude coefficient is essentially identical on the all-special subset, where there is no algebraic relationship between fortitude and the kinetic/energy values M7 consumes. The effective-resist double-count is ruled out as the cause.
(For completeness: zeroing out effective resists from ke_floor makes the fit worse by ~40 AIC. The level formula appears to credit effective and special resists at the same rate, using whichever value is displayed.)
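For concreteness, the displayed-resist rule described above can be sketched as follows (the function name and the NA-for-absent convention are ours, not the dataset’s):

```r
# Hedged sketch of the displayed-resist rule: a special (DNA-sampled) resist
# wins; otherwise an unarmored creature displays the effective resist
# floor(fortitude / 10).
displayed_kinetic <- function(fortitude, special = NA) {
  if (!is.na(special) && special != 0) special
  else floor(fortitude / 10)
}

displayed_kinetic(455)                 # effective resist: 45
displayed_kinetic(455, special = 60)   # special wins: 60
displayed_kinetic(455, special = -50)  # vulnerabilities are special too: -50
```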
The strongest test: regularize the regression and let cross-validation decide how much credibility each coefficient deserves. Lasso and elastic-net penalize the absolute size of coefficients, so a fragile coefficient — one that’s only there to bookkeep a collinearity — should shrink toward zero or get dropped entirely. A robust signal survives.
library(glmnet)
predictors <- c("hardiness","fortitude","dexterity","intellect",
"cleverness","power","ke_floor","nonkinen")
X_una <- as.matrix(una_df[, predictors])
y_una <- una_df$level
set.seed(2026)
cv_ridge <- cv.glmnet(X_una, y_una, alpha = 0.0, nfolds = 10, standardize = TRUE)
cv_enet <- cv.glmnet(X_una, y_una, alpha = 0.5, nfolds = 10, standardize = TRUE)
cv_lasso <- cv.glmnet(X_una, y_una, alpha = 1.0, nfolds = 10, standardize = TRUE)
reg_compare <- data.frame(
Term = c("(Intercept)", predictors),
OLS = round(coef(m_una), 5),
Ridge_min = round(as.vector(coef(cv_ridge, s = "lambda.min")), 5),
ElasticNet_min = round(as.vector(coef(cv_enet, s = "lambda.min")), 5),
Lasso_min = round(as.vector(coef(cv_lasso, s = "lambda.min")), 5)
)
knitr::kable(reg_compare, row.names = FALSE,
caption = "Cross-validated coefficients at lambda.min, original scale")
| Term | OLS | Ridge_min | ElasticNet_min | Lasso_min |
|---|---|---|---|---|
| (Intercept) | 7.87316 | 8.56827 | 7.93384 | 7.92975 |
| hardiness | 0.01302 | 0.00627 | 0.01234 | 0.01239 |
| fortitude | -0.01936 | -0.01180 | -0.01860 | -0.01860 |
| dexterity | 0.00410 | 0.00619 | 0.00417 | 0.00408 |
| intellect | 0.01010 | 0.01309 | 0.01058 | 0.01056 |
| cleverness | 0.02480 | 0.02005 | 0.02426 | 0.02436 |
| power | 0.01508 | 0.01238 | 0.01483 | 0.01481 |
| ke_floor | 0.16654 | 0.16228 | 0.16675 | 0.16675 |
| nonkinen | 0.05430 | 0.04601 | 0.05312 | 0.05307 |
Lasso and elastic-net at the cross-validated optimum essentially reproduce the OLS solution, including the −0.019 fortitude coefficient. Ridge halves both ends of the (hardiness, fortitude) pair but keeps the negative sign on fortitude.
The lasso path is even more telling. Walking from λ=∞ down to λ=0, predictors enter the model in order of how much signal they carry:
| Predictor | EnteredAtLambda | EnteredSign |
|---|---|---|
| intellect | 6.98803 | + |
| cleverness | 4.81658 | + |
| ke_floor | 4.38869 | + |
| power | 1.73099 | + |
| nonkinen | 1.30943 | + |
| dexterity | 0.68274 | + |
| fortitude | 0.39069 | - |
| hardiness | 0.26929 | + |
fortitude enters at λ ≈ 0.39 with the negative sign already, and stays negative across the entire path. Crucially, hardiness enters last — even after fortitude. The “fragile” coefficient of the collinear pair is hardiness, not fortitude.
Putting the four tests together:
| Test | Predicted under “artifact” | Observed |
|---|---|---|
| Drop fortitude | No fit loss; another coef absorbs it | Hardiness flips to ~0; R² drops 3 pts |
| Constrained regression | No fit loss | R² drops 3 pts; U-shape in residuals |
| Effective resists | Coefficient explained by fort→resist channel | Coefficient identical on all-special subset |
| Elastic-net (lasso) | fortitude shrinks to zero | fortitude stays at −0.019; hardiness enters last |
The artifact hypothesis is rejected. The negative fortitude coefficient is doing real, irreducible predictive work — it cannot be reshuffled away by another coefficient and it cannot be regularized to zero without paying a real penalty.
That leaves two live mechanisms:

1. A genuine negative fortitude term hard-coded in the unarmored branch of the retail formula.
2. An “armor pays for itself” parameterization, in which the fortitude credit lives in the armored branch and unarmored creatures see a relative penalty.

The data does not let us choose between these two — both predict exactly the empirical coefficient pattern we observe. But for the purposes of reproducing the retail formula, the choice does not matter: the coefficient is real, and the SWGEmu re-implementation should keep it.
Put the two formulas together with a hard switch at fortitude=500:
source("R/creature_level_model.R")
normalized_df$predicted_level <- custom_model(normalized_df)
normalized_df$residuals <- normalized_df$level - normalized_df$predicted_level
cat(sprintf("Combined: n=%d, R^2=%.4f, residual SD=%.3f\n",
nrow(normalized_df),
1 - sum(normalized_df$residuals^2) /
sum((normalized_df$level - mean(normalized_df$level))^2),
sd(normalized_df$residuals)))
## Combined: n=370, R^2=0.9844, residual SD=1.775
Armored (fortitude >= 500) — clean fractions, R² ≈ 0.98:
level = -23
+ 0.01 * hardiness
+ 0.06 * fortitude
+ 0.005 * dexterity
+ 0.01 * intellect
+ 0.025 * cleverness
+ 0.015 * power
+ 0.10 * kinen
+ 0.08 * nonkinen
with kinen = (kinetic + energy) / 2 and nonkinen = (blast + heat + cold + electricity + acid + stun) / 6.
Unarmored (fortitude < 500) — M8 with cleverness hinge, R² ≈ 0.96:
level = 8.13
+ 0.012 * hardiness
- 0.019 * fortitude
+ 0.004 * dexterity
+ 0.011 * intellect
+ 0.020 * cleverness
+ 0.016 * power
+ 0.170 * pmax(pmin(kinetic, energy), 0)
+ 0.050 * nonkinen
+ 0.106 * pmax(cleverness - 400, 0)
Two structural features beyond the linear baseline:

- The kinetic/energy floor, pmax(pmin(kinetic, energy), 0): a vulnerability on either side erases all kin/eng resist credit.
- The cleverness hinge at 400: apex pets gain an extra ~0.106 CL per cleverness point above the knot.
Combined (both formulas, n=370): R² ≈ 0.98, residual SD ≈ 1.8 levels.
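A self-contained sketch of the combined predictor (custom_model in R/creature_level_model.R is the canonical version; coefficients as quoted above):

```r
# Two-formula level predictor with the hard switch at fortitude = 500
# (a sketch of custom_model in R/creature_level_model.R).
predict_level <- function(s) {
  kinen    <- (s$kinetic + s$energy) / 2
  nonkinen <- (s$blast + s$heat + s$cold +
               s$electricity + s$acid + s$stun) / 6
  if (s$fortitude >= 500) {
    # armored branch
    -23 +
      0.01  * s$hardiness  + 0.06  * s$fortitude +
      0.005 * s$dexterity  + 0.01  * s$intellect +
      0.025 * s$cleverness + 0.015 * s$power +
      0.10  * kinen + 0.08 * nonkinen
  } else {
    # unarmored branch (M8)
    8.13 +
      0.012 * s$hardiness  - 0.019 * s$fortitude +
      0.004 * s$dexterity  + 0.011 * s$intellect +
      0.020 * s$cleverness + 0.016 * s$power +
      0.170 * pmax(pmin(s$kinetic, s$energy), 0) +
      0.050 * nonkinen +
      0.106 * pmax(s$cleverness - 400, 0)
  }
}
```

With hardiness = 1000, fortitude = 600 and everything else zero, the armored branch gives CL 23; an all-zero stat line falls through to the unarmored intercept of 8.13.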
Three creatures resisted M7 stubbornly: rancor 6j048r2a (CL48, residual +8.16), razor cat pdefjush (CL49, +5.40), and a sibling rancor 0iafqb8b (CL48, +5.10). All three sit at the apex of the unarmored CL distribution, with cleverness above 400. M7’s linear cleverness term (coef ~0.025) wasn’t enough to lift them.
Scan for a single hinge pmax(cleverness − K, 0) over K ∈ [200, 475] in steps of 25:
no_armor_df <- normalized_df %>% dplyr::filter(armor == 0)
no_armor_df$ke_floor <- pmax(pmin(no_armor_df$kinetic, no_armor_df$energy), 0)
no_armor_df$nk <- (no_armor_df$blast + no_armor_df$heat +
no_armor_df$cold + no_armor_df$electricity +
no_armor_df$acid + no_armor_df$stun) / 6
m7_form <- level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nk
fit_m7 <- lm(m7_form, data = no_armor_df)
knot_scan <- data.frame(K = seq(200, 475, by = 25))
knot_scan$dAIC <- sapply(knot_scan$K, function(K) {
no_armor_df$.h <- pmax(no_armor_df$cleverness - K, 0)
fit <- update(fit_m7, ~ . + .h, data = no_armor_df)
AIC(fit_m7) - AIC(fit)
})
knot_scan$coef <- sapply(knot_scan$K, function(K) {
no_armor_df$.h <- pmax(no_armor_df$cleverness - K, 0)
coef(update(fit_m7, ~ . + .h, data = no_armor_df))[".h"]
})
print(knot_scan)
## K dAIC coef
## 1 200 33.16757 0.02431164
## 2 225 34.86292 0.02619438
## 3 250 36.03381 0.02845901
## 4 275 39.48425 0.03285469
## 5 300 40.42823 0.03778011
## 6 325 40.83226 0.04445046
## 7 350 38.14335 0.05330615
## 8 375 35.87911 0.06884239
## 9 400 40.54167 0.10577139
## 10 425 42.84150 0.17599156
## 11 450 36.67013 0.37060579
## 12 475 19.17646 8.61611519
The dAIC plateau is wide (K=300..425 all give roughly equivalent fit), with an in-sample peak at K=425 (dAIC ≈ +43, coefficient ≈ 0.18). That’s the headline result — a cleverness hinge worth ~+40 AIC over M7.
The optimal K is loose because only ~10 creatures sit above clev=400. We need to verify the hinge isn’t an artifact of a single creature.
10-fold CV (50 reps; full code in R/cv_cleverness_hinge.R): M7 baseline CV-RMSE is 1.929; the hinge family beats it by 0.09–0.14 RMSE for any K in [200, 425]. Best K=425 gives CV-RMSE 1.788 — about 50× the standard error. Out-of-sample improvement is unambiguous.
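The CV machinery reduces to a short base-R helper (R/cv_cleverness_hinge.R is the canonical code; this sketch assumes a data frame with a level column):

```r
# 10-fold CV-RMSE for a level-model formula; one rep, base R only.
cv_rmse <- function(form, df, k = 10, seed = 1) {
  set.seed(seed)
  fold <- sample(rep_len(seq_len(k), nrow(df)))   # random fold assignment
  sq_err <- unlist(lapply(seq_len(k), function(i) {
    fit <- lm(form, data = df[fold != i, , drop = FALSE])
    (df$level[fold == i] -
       predict(fit, newdata = df[fold == i, , drop = FALSE]))^2
  }))
  sqrt(mean(sq_err))
}
```

Run once per candidate K with the hinge term pmax(cleverness - K, 0) added to the M7 formula, then average RMSE over repeated fold assignments.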
Leave-one-out on the top-cleverness creatures: removing rancor 6j048r2a shifts the in-sample best K from 425 → 275 and the coefficient from 0.176 → 0.028. That single creature is the anchor for K=425. K=400, however, retains 88% of its coefficient when 6j048r2a is removed (0.106 → 0.093), and 86% when all three rancors are removed.
Bootstrap of the K=400 coefficient (200 reps): median 0.105, 95% CI [0.057, 0.129] — well clear of zero and stable.
Raw-HTML check on 6j048r2a: level=48 is correct, template stats (clev=476, power=338) are internally consistent with the recorded samples, and no plausible single-digit typo on level / cleverness / power makes the residual go away while remaining physically craftable.
no_armor_df$clev_h400 <- pmax(no_armor_df$cleverness - 400, 0)
fit_m8 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nk + clev_h400,
data = no_armor_df)
cat(sprintf("M8 unarmored: R^2=%.4f SD=%.4f AIC=%.2f\n",
summary(fit_m8)$r.squared, sd(resid(fit_m8)), AIC(fit_m8)))
## M8 unarmored: R^2=0.9584 SD=1.7279 AIC=1165.13
print(round(coef(fit_m8), 5))
## (Intercept) hardiness fortitude dexterity intellect cleverness
## 8.13225 0.01230 -0.01940 0.00444 0.01139 0.01951
## power ke_floor nk clev_h400
## 0.01561 0.16965 0.05038 0.10577
Effective slope above clev 400 is 0.020 + 0.106 ≈ 0.125 — apex pets gain CL more aggressively per cleverness point than mid-tier ones.
K=400 is chosen over the in-sample optimum K=425 because (a) it survives single-point removal, (b) it has a tight bootstrap CI, and (c) it’s a plausible game-design round number. Promoting K=425 would have meant banking on 6j048r2a.
A nice side-effect: the rancor “+5 skin adjustment” that earlier phases needed turns out to be a high-cleverness artifact. Mean rancor residual under M7 was +5.25; under M8 it is +1.48. The rancor row in the skin-adjustments table is retired.
If the cleverness hinge is a global game mechanic, the armored formula should benefit from the same term. The armored set has more high-cleverness creatures than unarmored (26 with clev ≥ 400 vs 10), so it’s a fair test.
arm_df <- normalized_df %>% dplyr::filter(armor == 1, fortitude >= 500)
arm_form <- level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power +
kinetic + energy +
blast + heat + cold + electricity + acid + stun
fit_arm <- lm(arm_form, data = arm_df)
# Knot scan
arm_scan <- data.frame(K = seq(200, 500, by = 50))
arm_scan$dAIC <- sapply(arm_scan$K, function(K) {
arm_df$.h <- pmax(arm_df$cleverness - K, 0)
AIC(fit_arm) - AIC(update(fit_arm, ~ . + .h, data = arm_df))
})
arm_scan$coef <- sapply(arm_scan$K, function(K) {
arm_df$.h <- pmax(arm_df$cleverness - K, 0)
coef(update(fit_arm, ~ . + .h, data = arm_df))[".h"]
})
print(arm_scan)
## K dAIC coef
## 1 200 8.649897 0.020299713
## 2 250 7.989614 0.016828796
## 3 300 5.057604 0.013344742
## 4 350 2.121336 0.010062670
## 5 400 -1.063663 0.004974901
## 6 450 -1.926995 0.001541793
## 7 500 -1.908498 0.002162013
The armored knot scan goes the wrong way. The full scan in R/test_hinge_on_armored.R extends below the table shown here: its in-sample best is K=100 with a tiny dAIC of about +11 (vs +43 for unarmored), and dAIC declines monotonically as K rises — already negative by K=400.
A free-coefficient hinge at K=400 has coef ≈ 0.005 (1/20 of the unarmored coefficient), ANOVA p ≈ 0.39, and a bootstrap 95% CI of roughly [−0.009, 0.018] — zero sits comfortably inside it. Forcing the unarmored coefficient (0.106) onto the armored formula collapses its R² from 0.987 to 0.925.
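Pinning a coefficient at a fixed value, as in the "force 0.106" comparison, is what `offset()` is for: the term contributes to the fitted values but consumes no degree of freedom and gets no estimated coefficient. A minimal sketch on synthetic data (the real comparison runs on `arm_df` with the full armored formula):

```r
# offset() pins a slope at a known value; compare against the free fit
set.seed(2)
d <- data.frame(x = runif(100), clev = runif(100, 0, 650))
d$y <- 1 + 2 * d$x + 0.02 * d$clev + rnorm(100, 0, 0.1)

free   <- lm(y ~ x + clev + pmax(clev - 400, 0), data = d)
forced <- lm(y ~ x + clev + offset(0.106 * pmax(clev - 400, 0)), data = d)
```

If the forced fit's R² collapses relative to the free fit, the fixed value is incompatible with the data — which is exactly what happens on the armored set.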
Conclusion: the cleverness hinge is structurally unarmored-specific. It’s not a stat-driven scaling that would surface anywhere cleverness gets high. This parallels the negative-fortitude finding — both are unique to the unarmored regime, consistent with the “armor pays for itself” interpretation. Armored creatures already pick up CL from fortitude (+0.06) and the steeper kinetic/energy terms; they don’t need the hinge. Unarmored apex pets, lacking that armor-derived credit, get it back through the cleverness hinge.
Full code in R/test_hinge_on_armored.R.
Two falumpaset outliers survive M8 with large positive residuals:
## serial skin level hardiness fortitude cleverness power pred_m8 resid_m8
## 1 01oatm1v falumpaset 32 575 406 191 447 26.57 5.43
## 2 1lc95n55 falumpaset 30 557 439 193 292 21.24 8.76
Both have low cleverness (≈ 192), so the hinge can’t help them. Raw HTML confirms the levels and template stats are internally consistent — no transcription error. Both share an unusual attack profile (“Dizzy/Strong poison/Ranged”) that the other 8 falumpasets don’t have, and across the unarmored set sa2=strong_poison does have a mean residual of +1.09. But this signal is almost certainly a confound: strong_poison comes mainly from the Aggression and Psychology sample-source slots in BE crafting, so the attack flag is a marker for template-source-slot composition (which the model doesn’t use), not an attack-to-level mechanism.
The more interesting observation is that these two falumpasets sit at the tail of a broader population pattern — high-hardiness, low-cleverness, CL30+ creatures that the linear model under-predicts:
## serial skin level hardiness cleverness power pred_m8 resid_m8
## 1 1lc95n55 falumpaset 30 557 193 292 21.24 8.76
## 2 01oatm1v falumpaset 32 575 191 447 26.57 5.43
## 3 1gnl98ni razor_cat 30 552 233 436 25.63 4.37
## 4 h854qtlr gnort 30 591 241 399 25.98 4.02
## 5 1f8gec1e kima 25 512 103 225 22.42 2.58
## 6 s29dv1mh durni 26 606 108 367 24.04 1.96
Quick scans suggest at least two more candidate hinges sitting in the M8 residuals.
Neither is promoted. They’re roughly 1/3 the size of the cleverness hinge effect, may overlap mechanistically (capturing one underlying mechanism, not multiple independent terms), and would need the same CV / sanity-check workflow before being trusted. The data is consistent with the game computing CL from stat bands plus piecewise structure rather than a single linear formula — the cleverness K=400 hinge is just the most identifiable slice because the dataset spans its knot evenly. Code in R/investigate_falumpaset_anomalies.R.
Earlier phases noted that residuals seemed concentrated near the fortitude=500 boundary, suggesting a possible “light armor” middle regime. Three tests under M8:
Cutoff scan: refit the partition for cutoffs T ∈ [460, 540] in steps of 10.
df <- normalized_df %>%
dplyr::mutate(ke_floor = pmax(pmin(kinetic, energy), 0),
nk = (blast + heat + cold + electricity + acid + stun) / 6,
clev_h400 = pmax(cleverness - 400, 0))
scan_cutoff <- function(T) {
un <- df %>% dplyr::filter(fortitude < T)
ar <- df %>% dplyr::filter(fortitude >= T)
if (nrow(un) < 50 || nrow(ar) < 30) return(NULL)
fit_u <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power + ke_floor + nk + clev_h400,
data = un)
fit_a <- lm(level ~ hardiness + fortitude + dexterity + intellect +
cleverness + power +
kinetic + energy + blast + heat + cold +
electricity + acid + stun, data = ar)
sse <- sum(resid(fit_u)^2) + sum(resid(fit_a)^2)
n <- nrow(un) + nrow(ar)
p <- length(coef(fit_u)) + length(coef(fit_a)) + 2
data.frame(T = T, n_un = nrow(un), n_ar = nrow(ar),
sd = sqrt(sse / n),
aic = n * log(sse/n) + 2 * p)
}
cutoff_scan <- do.call(rbind, lapply(seq(460, 540, by = 10), scan_cutoff))
cutoff_scan$dAIC <- min(cutoff_scan$aic) - cutoff_scan$aic
print(cutoff_scan)
## T n_un n_ar sd aic dAIC
## 1 460 276 94 1.930536 540.7704 -93.43871
## 2 470 281 89 1.909047 532.4869 -85.15526
## 3 480 285 85 1.759753 472.2284 -24.89672
## 4 490 288 82 1.727086 458.3623 -11.03067
## 5 500 290 80 1.701532 447.3317 0.00000
## 6 510 306 64 2.062877 589.8353 -142.50363
## 7 520 315 55 2.371146 692.8964 -245.56476
## 8 530 326 44 2.827763 823.2195 -375.88786
## 9 540 333 37 2.991871 864.9653 -417.63367
The fort=500 cutoff is a knife-edge optimum. Moving it 10 either way costs at least 11 AIC; raising it to 510 costs 142 AIC, to 520 costs ~250. The asymmetry tells us the unarmored formula extrapolates roughly linearly into the boundary, but classifying a true-armored creature as unarmored blows up the prediction.
Three-segment piecewise (low/mid/high with breaks at 450 and 540): worse than the two-segment split by ΔAIC ≈ −56. There is no “light armor” middle regime hiding in the data.
Boundary-band residuals: under M8 the bands have slightly elevated SD (~2.0 vs 1.7 baseline) but no systematic bias. The original M7-era claim “the boundary fits worst” was an artifact of the high-DPS under-prediction sitting in this region; M8 incidentally fixed it.
Joining creatures.csv to templates.csv and cross-tabulating the armor flag (creature side) against fortitude >= 500 (template side) on the unfiltered raw data:
raw_creatures <- readr::read_csv("data/clean/furrycat/creatures.csv",
show_col_types = FALSE)
raw_templates <- readr::read_csv("data/clean/furrycat/templates.csv",
show_col_types = FALSE)
joined_raw <- raw_creatures %>%
dplyr::inner_join(raw_templates %>%
dplyr::select(serial, fortitude),
by = c("template_id" = "serial"))
table(armor_flag = joined_raw$armor,
fort_ge_500 = joined_raw$fortitude >= 500)
## fort_ge_500
## armor_flag FALSE TRUE
## 0 374 1
## 1 1 85
Out of 461 raw creatures, only 2 disagreements exist: kjvtv7d7 (template fort=408 with “Light” armor specified, already filtered as bad data) and 3f0lpuko (template fort=501 with “Light” armor specified, but the creature output has empty/vulnerable AR in the raw HTML). In both disagreements the unarmored M8 formula fits better than the armored formula — the creature output is the ground truth, and a BE template specifying armor doesn’t guarantee the crafted output gets the AR slot.
The principled splitter is therefore armor == 0 from the creature output, not fortitude < 500 from the template. Empirically the two rules disagree on a handful of creatures (one in our filtered set), so the in-sample fit is essentially identical. custom_model keeps the fortitude split for now; an SWGEmu re-implementation should branch on the actual creature armor state, which the engine knows directly.
After M8, the unarmored set has SD ≈ 1.73. Most outliers are within the crafting-noise envelope. The structural ones that remain:
- The two falumpasets (1lc95n55 CL30, 01oatm1v CL32): +8.76 / +5.43. Discussed in Phase 17 — likely the tail of the high-hardiness / low-cleverness pattern.
- dcnqk0gj (razor_cat CL15, predicted ~20): over-predicted by ~5. Phase 10 candidate; possibly a HAM floor effect or the same broader hinge structure suggested in Phase 17.

A 10-creature “data-quality suspects” sensitivity analysis (low-health-for-skin entries plus the 6 effective-resist formula mismatches) shifts coefficients by less than 0.01 each. The formula is not being driven by a small number of suspect rows.
int calculateCreatureLevel(const Creature* c) {
float kinen = (c->kinetic + c->energy) / 2.0f;
float nonkinen = (c->blast + c->heat + c->cold +
c->electricity + c->acid + c->stun) / 6.0f;
float level;
if (c->fortitude >= 500) {
// ARMORED FORMULA
level = -23.0f
+ 0.01f * c->hardiness
+ 0.06f * c->fortitude
+ 0.005f * c->dexterity
+ 0.01f * c->intellect
+ 0.025f * c->cleverness
+ 0.015f * c->power
+ 0.1f * kinen
+ 0.08f * nonkinen;
} else {
// UNARMORED FORMULA — M8 = M7 weaker-resist-floor + cleverness hinge.
// ke_floor = max(0, min(kinetic, energy)): if either side has a
// vulnerability, the kin/eng resist credit collapses to zero.
// clev_hinge: above cleverness 400, level scales more aggressively.
float ke_floor = fmaxf(0.0f, fminf(c->kinetic, c->energy));
float clev_hinge = fmaxf(0.0f, c->cleverness - 400.0f);
level = 8.13f
+ 0.012f * c->hardiness
- 0.019f * c->fortitude
+ 0.004f * c->dexterity
+ 0.011f * c->intellect
+ 0.020f * c->cleverness
+ 0.016f * c->power
+ 0.170f * ke_floor
+ 0.050f * nonkinen
+ 0.106f * clev_hinge;
}
if (level < 1.0f) level = 1.0f;
return (int)(level + 0.5f); // round to nearest
}
The R counterpart of this pseudocode lives in R/creature_level_model.R as custom_model().
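For quick experimentation outside the engine, the pseudocode transliterates directly to R. This sketch is illustrative — `custom_model()` in R/creature_level_model.R remains the reference implementation:

```r
# R transliteration of calculateCreatureLevel(); s is a named numeric vector
# holding the attributes and resists the formula uses.
estimate_level <- function(s) {
  nonkinen <- mean(s[c("blast", "heat", "cold", "electricity", "acid", "stun")])
  if (s[["fortitude"]] >= 500) {
    # Armored formula
    kinen <- (s[["kinetic"]] + s[["energy"]]) / 2
    lvl <- -23 + 0.01 * s[["hardiness"]] + 0.06 * s[["fortitude"]] +
           0.005 * s[["dexterity"]] + 0.01 * s[["intellect"]] +
           0.025 * s[["cleverness"]] + 0.015 * s[["power"]] +
           0.1 * kinen + 0.08 * nonkinen
  } else {
    # Unarmored formula: weaker-resist floor plus cleverness hinge at 400
    ke_floor   <- max(0, min(s[["kinetic"]], s[["energy"]]))
    clev_hinge <- max(0, s[["cleverness"]] - 400)
    lvl <- 8.13 + 0.012 * s[["hardiness"]] - 0.019 * s[["fortitude"]] +
           0.004 * s[["dexterity"]] + 0.011 * s[["intellect"]] +
           0.020 * s[["cleverness"]] + 0.016 * s[["power"]] +
           0.170 * ke_floor + 0.050 * nonkinen + 0.106 * clev_hinge
  }
  max(1L, as.integer(floor(lvl + 0.5)))  # round to nearest, floor of CL 1
}
```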