Overview

Bio-Engineering in Star Wars: Galaxies let players craft custom creatures (“pets”) by combining DNA samples. The creature’s level (CL) governed who could tame the pet and how powerful it was. The 2003 BE Guide noted that CL “mostly depends on damage, resists, and the CL of the donors,” but the exact algorithm was never published.

This document reverse-engineers that algorithm from the Furrycat archive. The headline findings:

  1. The game uses two completely different formulas, selected by whether the creature has armor. The split lives at template fortitude >= 500 (which agrees with the actual armor flag on 99.6% of creatures).
  2. The unarmored formula needs a vulnerability-aware resist term (the weaker of kinetic/energy, floored at zero).
  3. The unarmored formula also needs a cleverness hinge at K=400 to capture how apex-DPS pets scale at the top of the level distribution. This term is structurally specific to the unarmored regime — fitting the same hinge to the armored set yields a coefficient indistinguishable from zero.

Combined, the two formulas reach R² ≈ 0.98 and residual SD ≈ 1.8 levels on 370 creatures (79 armored, 291 unarmored).

An Initial GAM

A single GAM across all creatures gives a quick picture of which attributes move level.

library(mgcv)  # provides gam()

model.gam.everything <- gam(
  level ~
    s(hardiness) +
    s(fortitude) +
    s(dexterity) +
    s(endurance) +
    s(intellect) +
    s(cleverness) +
    s(dependability) +
    s(courage) +
    s(fierceness) +
    s(power) +
    s(kinetic) +
    s(energy)  +
    s(blast) +
    s(heat) +
    s(cold) +
    s(electricity) +
    s(acid) +
    s(stun),
  data = normalized_df,
  family = gaussian()
)
summary(model.gam.everything)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## level ~ s(hardiness) + s(fortitude) + s(dexterity) + s(endurance) + 
##     s(intellect) + s(cleverness) + s(dependability) + s(courage) + 
##     s(fierceness) + s(power) + s(kinetic) + s(energy) + s(blast) + 
##     s(heat) + s(cold) + s(electricity) + s(acid) + s(stun)
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 23.05135    0.06939   332.2   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                    edf Ref.df      F  p-value    
## s(hardiness)     2.774  3.579  4.803 0.001739 ** 
## s(fortitude)     8.925  8.994 92.676  < 2e-16 ***
## s(dexterity)     7.381  8.354  7.183  < 2e-16 ***
## s(endurance)     1.000  1.000  9.498 0.002248 ** 
## s(intellect)     3.023  3.854 23.165  < 2e-16 ***
## s(cleverness)    4.658  5.777 14.774  < 2e-16 ***
## s(dependability) 1.090  1.171  6.456 0.011768 *  
## s(courage)       4.612  5.632  4.316 0.000485 ***
## s(fierceness)    1.000  1.000  6.345 0.012290 *  
## s(power)         8.165  8.785 18.242  < 2e-16 ***
## s(kinetic)       8.115  8.758  5.838 2.50e-07 ***
## s(energy)        8.453  8.892 50.694  < 2e-16 ***
## s(blast)         1.000  1.000  3.715 0.054860 .  
## s(heat)          1.000  1.000  0.027 0.869184    
## s(cold)          2.969  3.733  9.033 2.46e-06 ***
## s(electricity)   1.000  1.000 12.828 0.000399 ***
## s(acid)          3.651  4.474  4.538 0.001255 ** 
## s(stun)          1.000  1.000 11.723 0.000704 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.991   Deviance explained = 99.3%
## GCV = 2.2033  Scale est. = 1.7816    n = 370

Several attributes appear important: fortitude dominates (F ≈ 93), with energy, intellect, power, and cleverness well behind but clearly significant.

A handful land on ~1.00 effective degrees of freedom with weak effects: endurance, fierceness, blast, heat, electricity. We treat those as noise attributes for level purposes.

The fortitude smooth is the most interesting term. Its shape will turn out to encode the two-formula structure.

Feature Engineering

To avoid over-fitting to small per-attribute weight differences, we collapse correlated attributes into synthetic features:

  1. average_hdi = (hardiness + dexterity + intellect) / 3 — these three attributes have similar weights against level; collapsing them prevents the model from over-attributing influence to whichever happens to correlate most with fortitude.
  2. kinen = (kinetic + energy) / 2 — kinetic and energy share the in-game 60% cap; treating them as one resist class respects that mechanic.
  3. nonkinen = (blast + heat + cold + electricity + acid + stun) / 6 — the other six resists, none of which are individually significant after collapsing.

These are derived in R/data.R and ride along on normalized_df.
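As a quick sketch, here are the three synthetic features computed for a single hypothetical creature (all attribute values invented for illustration):

```r
# Hypothetical creature; attribute values are invented for illustration
creature <- list(hardiness = 600, dexterity = 450, intellect = 300,
                 kinetic = 60, energy = -50,
                 blast = 10, heat = 0, cold = 20,
                 electricity = 0, acid = 5, stun = 25)

average_hdi <- with(creature, (hardiness + dexterity + intellect) / 3)  # 450
kinen       <- with(creature, (kinetic + energy) / 2)                   # 5
nonkinen    <- with(creature, (blast + heat + cold + electricity +
                               acid + stun) / 6)                        # 10
```

Note how the kinen average rewards a near-cap kinetic resist even when it is paired with a heavy energy vulnerability — a point that becomes important for the unarmored model later.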

A re-fit GAM with these collapsed features reproduces the same picture with far fewer terms:

model.gam <- gam(
  level ~
    s(average_hdi) +
    s(fortitude) +
    s(cleverness) +
    s(power) +
    s(kinen) +
    s(nonkinen),
  data = normalized_df
)
summary(model.gam)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## level ~ s(average_hdi) + s(fortitude) + s(cleverness) + s(power) + 
##     s(kinen) + s(nonkinen)
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 23.05135    0.09015   255.7   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                  edf Ref.df     F p-value    
## s(average_hdi) 5.600  6.785 36.79  <2e-16 ***
## s(fortitude)   8.919  8.994 79.20  <2e-16 ***
## s(cleverness)  7.694  8.556 15.32  <2e-16 ***
## s(power)       2.371  3.040 56.96  <2e-16 ***
## s(kinen)       4.282  5.267 57.10  <2e-16 ***
## s(nonkinen)    2.356  2.978 44.41  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.985   Deviance explained = 98.6%
## GCV = 3.2935  Scale est. = 3.0067    n = 370

Holding all other attributes at their mean, the fortitude smooth shows the discontinuity that motivates the two-formula split:

The smooth is decreasing below fortitude=500 and increasing above it. We’ll quantify and verify that breakpoint shortly.

Armored Creatures

For creatures with fortitude >= 500, a plain linear regression already fits remarkably well.

linear.fit.level.armor <- lm(
  level ~ average_hdi + fortitude + cleverness + power + kinen + nonkinen,
  data = armor_df
)
summary(linear.fit.level.armor)
## 
## Call:
## lm(formula = level ~ average_hdi + fortitude + cleverness + power + 
##     kinen + nonkinen, data = armor_df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.1643 -1.0873  0.1926  1.0277  4.3139 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -21.331842   3.491081  -6.110 4.60e-08 ***
## average_hdi   0.027648   0.004965   5.568 4.18e-07 ***
## fortitude     0.056252   0.006059   9.285 6.18e-14 ***
## cleverness    0.024034   0.003182   7.552 1.05e-10 ***
## power         0.015740   0.002460   6.398 1.39e-08 ***
## kinen         0.096920   0.018767   5.164 2.06e-06 ***
## nonkinen      0.085904   0.015458   5.557 4.36e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.93 on 72 degrees of freedom
## Multiple R-squared:  0.9818, Adjusted R-squared:  0.9802 
## F-statistic: 645.6 on 6 and 72 DF,  p-value: < 2.2e-16

Residual diagnostics:

rs.armor <- resid(linear.fit.level.armor)
shapiro.test(rs.armor)
## 
##  Shapiro-Wilk normality test
## 
## data:  rs.armor
## W = 0.98336, p-value = 0.3941
library(lmtest)  # provides bptest()
bptest(linear.fit.level.armor)
## 
##  studentized Breusch-Pagan test
## 
## data:  linear.fit.level.armor
## BP = 6.1042, df = 6, p-value = 0.4116

The armored residuals are small (±4 levels), normally distributed, and homoscedastic. This is consistent with the residuals being driven by crafting-system randomness rather than a missing variable.

A clean, game-dev-friendly version of the armored formula:

level = -23
      + 0.01   * hardiness
      + 0.06   * fortitude
      + 0.005  * dexterity
      + 0.01   * intellect
      + 0.025  * cleverness
      + 0.015  * power
      + 0.10   * kinen
      + 0.08   * nonkinen

R² ≈ 0.98, residual SD ≈ 1.8 levels.
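A worked evaluation of the clean-fraction formula above, on a hypothetical armored creature (all input values invented; kinen and nonkinen as defined in the feature-engineering section):

```r
# Clean-fraction armored formula from above; example values are invented
armored_level <- function(hardiness, fortitude, dexterity, intellect,
                          cleverness, power, kinen, nonkinen) {
  -23 + 0.01 * hardiness + 0.06 * fortitude + 0.005 * dexterity +
    0.01 * intellect + 0.025 * cleverness + 0.015 * power +
    0.10 * kinen + 0.08 * nonkinen
}

armored_level(hardiness = 700, fortitude = 600, dexterity = 500,
              intellect = 400, cleverness = 400, power = 500,
              kinen = 40, nonkinen = 30)
## [1] 50.4
```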

Unarmored Creatures: the Naive Fit

The same recipe on unarmored creatures gives a worse fit, and the residuals clearly violate normality.

linear.fit.level.noarmor <- lm(
  level ~ average_hdi + fortitude + cleverness + power + kinen + nonkinen,
  data = no_armor_df
)
summary(linear.fit.level.noarmor)
## 
## Call:
## lm(formula = level ~ average_hdi + fortitude + cleverness + power + 
##     kinen + nonkinen, data = no_armor_df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.0819 -1.2764 -0.0060  0.9886  7.5673 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  8.481758   0.415248   20.43   <2e-16 ***
## average_hdi  0.027228   0.002052   13.27   <2e-16 ***
## fortitude   -0.018734   0.001335  -14.04   <2e-16 ***
## cleverness   0.027003   0.002131   12.67   <2e-16 ***
## power        0.013156   0.001308   10.06   <2e-16 ***
## kinen        0.114275   0.008455   13.52   <2e-16 ***
## nonkinen     0.060657   0.006034   10.05   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.14 on 284 degrees of freedom
## Multiple R-squared:  0.9375, Adjusted R-squared:  0.9362 
## F-statistic: 709.9 on 6 and 284 DF,  p-value: < 2.2e-16

rs.noarmor <- resid(linear.fit.level.noarmor)
shapiro.test(rs.noarmor)
## 
##  Shapiro-Wilk normality test
## 
## data:  rs.noarmor
## W = 0.98048, p-value = 0.0005241
bptest(linear.fit.level.noarmor)
## 
##  studentized Breusch-Pagan test
## 
## data:  linear.fit.level.noarmor
## BP = 46.786, df = 6, p-value = 2.064e-08

Three things stand out from this naive unarmored fit:

  1. Fortitude has a negative coefficient in unarmored creatures. This is reproducible across modelling approaches. We come back to it below.
  2. The residuals are not normal. The Q-Q plot shows a heavy left tail — creatures that the model over-predicts.
  3. The over-predicted creatures cluster at low CL, with a recognizable pattern that isn’t captured by the naive kinen average.

The “Uber CL10” Cluster

A heavy left tail in the residuals turned out to be a coherent cluster: low-CL unarmored creatures with ~10k Health and a kinetic resist near the 60% cap.

## Uber-CL10-style cluster (level <= 12 and kinetic >= 50):
##   n = 39
##   mean kinetic = 57.8, mean energy = -51.7
##   mean kinen = (k+e)/2 = 3.0
##   mean naive residual = -0.45 (negative = over-predicted)

The Bio-Engineer “trick”: stack kinetic resist near the 60% cap and accept a heavy negative energy resist as the cost. The naive kinen = (k+e)/2 averages a near-cap positive against a heavily negative number and credits the small remainder linearly — so the formula sees ~3% “average” resist and over-predicts the creature’s level, when in practice the energy vulnerability erases the resist’s value entirely. The game evidently penalizes the vulnerability harder than (k+e)/2 does.
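The difference is easy to see on the cluster’s archetypal resist profile (kinetic = 60, energy = -50); the floored-minimum form tested below zeroes the credit entirely:

```r
# Archetypal uber-CL10 resist profile: near-cap kinetic, heavy energy vulnerability
k <- 60
e <- -50

naive   <- (k + e) / 2        # 5: small positive credit survives the average
floored <- max(min(k, e), 0)  # 0: the vulnerability erases all credit
```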

Vulnerability-Aware Resists

We test three symmetric formulations of the kinetic/energy term against the kinen baseline. All four use the same other predictors; only the resist term changes.

ID  Resist term                                      Mechanic
M0  kinen = (k+e)/2                                  naive average
M5  kinen_pos + kinen_neg (positive/negative split)  asymmetric credit on positive vs penalty on negative
M6  pmax(kinen, 0)                                   average resist, but no negative credit
M7  pmax(pmin(k, e), 0)                              use the weaker side, floored at zero

All of the variants are symmetric in kinetic and energy — both share the 60% in-engine cap, so treating them as a single resist class is mechanically motivated.

no_armor_df <- no_armor_df %>%
  mutate(
    kinen_pos = pmax(kinen, 0),
    kinen_neg = pmin(kinen, 0),
    ke_floor  = pmax(pmin(kinetic, energy), 0)
  )

m0 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                 cleverness + power + kinen + nonkinen,
         data = no_armor_df)
m5 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                 cleverness + power + kinen_pos + kinen_neg + nonkinen,
         data = no_armor_df)
m6 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                 cleverness + power + kinen_pos + nonkinen,
         data = no_armor_df)
m7 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                 cleverness + power + ke_floor + nonkinen,
         data = no_armor_df)

Resist-term variants on the unarmored data

Model                    R²      Resid SD  AIC     Uber-CL10 SD
M0  kinen                0.9380  2.109     1279.2  1.174
M5  kinen_pos+kinen_neg  0.9495  1.903     1221.3  1.422
M6  pmax(kinen,0)        0.9492  1.908     1221.0  1.413
M7  pmax(pmin(k,e),0)    0.9518  1.859     1205.7  0.752

M7 is unambiguously best on every metric: highest R², lowest residual SD, lowest AIC, and crucially the tightest fit on the uber-CL10 cluster itself (SD 0.75 vs 1.17 baseline). M5 and M6 capture the cluster’s mean but not its tightness — a creature with kinetic = 60, energy = -50 still gets pmax(kinen, 0) = 5 of credit it apparently shouldn’t.

The mechanism reading: vulnerability on either kinetic or energy erases all kin/eng resist credit. Roughly half of unarmored creatures get zero credit from this term because they have a vulnerability on one side.

We promote M7 as the canonical unarmored form. The fitted coefficients:

summary(m7)
## 
## Call:
## lm(formula = level ~ hardiness + fortitude + dexterity + intellect + 
##     cleverness + power + ke_floor + nonkinen, data = no_armor_df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.9320 -1.0468 -0.1114  0.9451  8.3337 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  7.873163   0.384465  20.478  < 2e-16 ***
## hardiness    0.013015   0.001638   7.947 4.61e-14 ***
## fortitude   -0.019361   0.001453 -13.320  < 2e-16 ***
## dexterity    0.004101   0.001827   2.245   0.0256 *  
## intellect    0.010102   0.002249   4.492 1.03e-05 ***
## cleverness   0.024797   0.002902   8.544 8.29e-16 ***
## power        0.015077   0.001201  12.556  < 2e-16 ***
## ke_floor     0.166537   0.009827  16.948  < 2e-16 ***
## nonkinen     0.054300   0.005645   9.618  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.885 on 282 degrees of freedom
## Multiple R-squared:  0.9518, Adjusted R-squared:  0.9505 
## F-statistic: 696.5 on 8 and 282 DF,  p-value: < 2.2e-16

Q-Q residuals:

The Shapiro-Wilk test still rejects normality (p ≈ 3e-6), so there is additional structure left in the unarmored residuals beyond the vulnerability mechanism — see “Persistent Outliers” below.

Verifying the fortitude=500 Breakpoint

The two-formula split assumes a hard discontinuity at fortitude=500. We test it four ways: a single global linear regression with no break, a GAM smooth, a data-driven segmented regression, and the proposed hard split.

df <- normalized_df %>%
  mutate(ke_floor = pmax(pmin(kinetic, energy), 0))

# Single global linear (no break)
m_global <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                       cleverness + power + ke_floor + nonkinen,
               data = df)

# GAM with smooth fortitude
m_gam <- gam(level ~ s(fortitude) + hardiness + dexterity + intellect +
                     cleverness + power + ke_floor + nonkinen,
             data = df)

# Segmented regression (data-driven breakpoint), via the segmented package
library(segmented)
seg_in <- lm(level ~ fortitude + hardiness + dexterity + intellect +
                     cleverness + power + ke_floor + nonkinen, data = df)
m_seg  <- segmented(seg_in, seg.Z = ~ fortitude, psi = 500)

# Two formulas, hard split at fort = 500
arm_df <- df %>% dplyr::filter(armor == 1 & fortitude >= 500)
una_df <- df %>% dplyr::filter(armor == 0)
m_arm  <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                     cleverness + power + ke_floor + nonkinen,
             data = arm_df)
m_una  <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                     cleverness + power + ke_floor + nonkinen,
             data = una_df)

resid_split <- c(resid(m_arm), resid(m_una))
ss_split    <- sum(resid_split^2)
n_split     <- length(resid_split)
r2_split    <- 1 - ss_split / sum((c(arm_df$level, una_df$level) -
                                   mean(c(arm_df$level, una_df$level)))^2)
sd_split    <- sd(resid_split)
# AIC for the joint two-formula model: sum of AICs (each is independent)
aic_split   <- AIC(m_arm) + AIC(m_una)

Breakpoint test (level ~ stats with different fortitude treatments)

Model                                     R²      Resid SD  AIC
Single global linear (no break)           0.9266  3.852     2067.0
GAM smooth fortitude                      0.9784  2.046     1614.2
Segmented regression (break at fort=445)  0.9768  2.165     1644.6
Two formulas, hard split at fort=500      0.9823  1.892     1560.2

The hard split at fortitude=500 beats the single global linear by ~500 AIC points and beats a flexible GAM smooth by ~50. That second margin is the important one: it says the relationship has a true discontinuity at 500, not just a slope change. Classic step function in the game code.

The sign flip across the break is iron-clad:

Fortitude coefficient on either side of the break

Subset                       n    Fortitude coef  p-value
Unarmored (fortitude < 500)  291  -0.01936        ≈ 0
Armored (fortitude >= 500)   79   +0.05730        ≈ 0

Both coefficients are extremely significant, and they have opposite signs.

The Negative-Fortitude Coefficient: Artifact, Mechanism, or Bug?

The unarmored fortitude coefficient is negative. Taken at face value, this would say that a creature with more fortitude has a lower level — counterintuitive enough to deserve scrutiny. There are three families of explanation:

  1. Artifact: the coefficient is not really about fortitude — it is bookkeeping that compensates for a collinearity (e.g. hardiness and fortitude correlate 0.85). Drop fortitude or shrink it, and another predictor should absorb the signal with no loss of fit.
  2. Mechanism: the game’s level formula literally subtracts a fortitude term for unarmored creatures. Counterintuitive as gameplay, but mechanically possible.
  3. Bug or balance lever: SWG ran a major creature-system rebalance post-launch (pre-launch pets could equip medium armor; that was stripped). A developer could have flipped the sign on the unarmored fortitude term either accidentally or as a deliberate “armor must pay for itself” damper.

We can rule out (1) with the data, but the data alone cannot distinguish (2) from (3). Four tests, in order.

Test 1: drop fortitude — does another coefficient absorb it?

Within the unarmored subset, fortitude correlates 0.85 with hardiness and 0.84 with health — Bio-Engineers wire these up together when crafting. If the negative-fortitude coefficient is bookkeeping for the hardiness correlation, dropping fortitude from the model should let hardiness pick up the signal with no loss of fit.

m_una_full <- m_una
m_una_no_fort <- lm(level ~ hardiness + dexterity + intellect +
                            cleverness + power + ke_floor + nonkinen,
                    data = una_df)

shift_compare <- data.frame(
  Term = c("hardiness", "dexterity", "intellect",
           "cleverness", "power", "ke_floor", "nonkinen"),
  With_fortitude = coef(m_una_full)[c("hardiness","dexterity","intellect",
                                      "cleverness","power","ke_floor","nonkinen")],
  Without_fortitude = coef(m_una_no_fort)[c("hardiness","dexterity","intellect",
                                            "cleverness","power","ke_floor","nonkinen")]
)
shift_compare$Shift <- shift_compare$Without_fortitude - shift_compare$With_fortitude
knitr::kable(shift_compare, digits = 4, row.names = FALSE,
             caption = "Coefficient shifts when fortitude is dropped from the unarmored model")

Coefficient shifts when fortitude is dropped from the unarmored model

Term        With_fortitude  Without_fortitude  Shift
hardiness   0.0130          -0.0011            -0.0141
dexterity   0.0041          0.0031             -0.0010
intellect   0.0101          0.0188             0.0087
cleverness  0.0248          0.0171             -0.0077
power       0.0151          0.0094             -0.0057
ke_floor    0.1665          0.1845             0.0179
nonkinen    0.0543          0.0316             -0.0227

The hardiness coefficient collapses from +0.013 to essentially zero — exactly what the artifact hypothesis predicts. So far, the picture is consistent with the negative coefficient being collinearity bookkeeping.

A manual VIF (variance inflation factor) confirms the collinearity is real:

manual_vif <- function(model) {
  X <- model.matrix(model)[, -1, drop = FALSE]
  vifs <- sapply(seq_len(ncol(X)), function(j) {
    fit <- lm(X[, j] ~ X[, -j])
    1 / (1 - summary(fit)$r.squared)
  })
  setNames(vifs, colnames(X))
}
round(manual_vif(m_una_full), 2)
##  hardiness  fortitude  dexterity  intellect cleverness      power   ke_floor 
##       6.36       4.72       5.72       9.76       7.54       2.31       1.75 
##   nonkinen 
##       1.66

Fortitude itself is at moderate VIF (~5); intellect and cleverness are worse. This test is suggestive of artifact, but not decisive — the fact that hardiness can absorb the signal does not prove the signal is nothing but hardiness.

Test 2: constrained regression — what is the predictive cost of forcing fortitude ≥ 0?

If the negative coefficient is pure artifact, forcing fortitude ≥ 0 (equivalent to setting it to zero, since the unconstrained estimate is on the wrong side of the constraint) should cost no predictive performance. Hardiness will absorb the signal and the model will fit just as well.

By KKT conditions for inequality-constrained least squares: when the unconstrained estimate violates the constraint, the constrained optimum sets the violating coefficient to its boundary (zero) and re-fits the others. So the constrained regression is mathematically identical to the drop-fortitude lm above; we just look at the predictive metrics this time.
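A tiny simulated check of that equivalence (all data and names here are invented): constrain the violating coefficient to be nonnegative via box-constrained optimization, and compare against simply dropping the term.

```r
# Demo (simulated): when the unconstrained estimate of a coefficient is negative,
# constraining it to >= 0 lands on the boundary and matches dropping the term.
set.seed(1)
x1 <- rnorm(100)
x2 <- 0.8 * x1 + rnorm(100)                     # deliberately collinear pair
y  <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(100, sd = 0.1)

rss <- function(b) sum((y - b[1] - b[2] * x1 - b[3] * x2)^2)
con <- optim(c(0, 0, 0), rss, method = "L-BFGS-B",
             lower = c(-Inf, -Inf, 0))          # force coef(x2) >= 0
drop_fit <- lm(y ~ x1)                          # simply drop x2 instead

con$par[3]                                      # sits on the boundary (zero)
max(abs(con$par[1:2] - unname(coef(drop_fit)))) # agrees to numerical tolerance
```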

una_df$resid_unc <- resid(m_una_full)
una_df$resid_con <- resid(m_una_no_fort)

cat(sprintf("Unconstrained M7: R^2 = %.4f, resid SD = %.3f, AIC = %.1f\n",
            summary(m_una_full)$r.squared, sd(resid(m_una_full)),
            AIC(m_una_full)))
## Unconstrained M7: R^2 = 0.9518, resid SD = 1.859, AIC = 1205.7
cat(sprintf("Constrained (fort >= 0): R^2 = %.4f, resid SD = %.3f, AIC = %.1f\n",
            summary(m_una_no_fort)$r.squared, sd(resid(m_una_no_fort)),
            AIC(m_una_no_fort)))
## Constrained (fort >= 0): R^2 = 0.9215, resid SD = 2.373, AIC = 1345.7
cat(sprintf("Delta R^2 = %+.4f, delta SD = %+.3f, delta AIC = %+.1f\n",
            summary(m_una_no_fort)$r.squared - summary(m_una_full)$r.squared,
            sd(resid(m_una_no_fort)) - sd(resid(m_una_full)),
            AIC(m_una_no_fort) - AIC(m_una_full)))
## Delta R^2 = -0.0303, delta SD = +0.514, delta AIC = +140.0

The cost is real and substantial: ~3 percentage points of R², half a level of residual SD, and ~140 AIC points. The artifact hypothesis predicts ≈ zero cost — this is a clear rejection.

The cost is not uniformly distributed. Where does it land?

Residuals by fortitude band, unconstrained vs constrained

Fortitude band  n    SD (unc)  SD (con)  Mean resid (unc)  Mean resid (con)
0-100           110  1.513     1.933     0.019             0.636
100-200         30   1.806     2.476     -0.133            0.227
200-300         35   1.975     2.158     0.336             1.092
300-400         24   1.740     2.062     0.125             -0.036
400-500         91   2.228     2.497     -0.155            -1.253
NA              1    NA        NA        1.270             -0.131

The constrained model develops a clear bias gradient across fortitude bands: positive bias at low fortitude (under-prediction by ~0.6 levels in the 0–100 band) and negative bias at high fortitude (over-prediction by ~1.3 levels in the 400–500 band). That is the residual signature of a missing linear term in fortitude. The negative fortitude term was filling exactly that hole.
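A small simulation (invented data) shows that this banded residual pattern is exactly what omitting a linear term produces:

```r
# Demo (simulated): omit a linear term and the residuals tilt across its bands
set.seed(7)
x <- runif(300, 0, 500)                  # stands in for fortitude
z <- rnorm(300)                          # stands in for the other predictors
y <- 5 + 2 * z - 0.019 * x + rnorm(300)  # true model subtracts a linear x term

r <- resid(lm(y ~ z))                    # fit that omits the x term
band_means <- tapply(r, cut(x, c(0, 250, 500)), mean)
band_means                               # positive bias low band, negative high band
```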

The uber-CL10 cluster is the most affected:

uber_idx <- una_df$level <= 12 & una_df$kinetic >= 50
cat(sprintf("Uber-CL10 cluster (n = %d):\n", sum(uber_idx)))
## Uber-CL10 cluster (n = 39):
cat(sprintf("  Unconstrained: mean resid %+.2f, SD %.2f\n",
            mean(una_df$resid_unc[uber_idx]),
            sd(una_df$resid_unc[uber_idx])))
##   Unconstrained: mean resid -0.35, SD 0.75
cat(sprintf("  Constrained:   mean resid %+.2f, SD %.2f\n",
            mean(una_df$resid_con[uber_idx]),
            sd(una_df$resid_con[uber_idx])))
##   Constrained:   mean resid -1.86, SD 0.75

Same spread, but a 1.5-level downward bias. The negative fortitude coefficient was specifically pulling these creatures from “predicted ~12” down to “actually CL 10.”

Test 3: effective vs special resists

A more subtle artifact possibility: in the game, kinetic and energy resists come from two sources — effective resists derived directly from fortitude (floor(fortitude / 10) for unarmored) and special resists from DNA samples. When both are present, special wins and effective drops out. For creatures whose displayed kinetic/energy is effective, the column M7 consumes is literally a function of fortitude — a potential double-count that could create a spurious coefficient.
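A sketch of the display rule as just described (the function name and example values are hypothetical; the rule itself — special overrides effective, effective = floor(fortitude / 10) for unarmored — is the one stated above):

```r
# Displayed kinetic resist for an unarmored creature: special wins if present,
# otherwise the fortitude-derived effective value is shown
displayed_kinetic <- function(fortitude, kinetic_special = NA) {
  if (!is.na(kinetic_special)) kinetic_special else floor(fortitude / 10)
}

displayed_kinetic(437)       # no special sample -> effective: 43
displayed_kinetic(437, 60)   # special sample present -> special wins: 60
```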

unarm <- normalized_df %>% dplyr::filter(armor == 0) %>%
  mutate(
    ke_floor    = pmax(pmin(kinetic, energy), 0),
    k_effective = !is.na(kinetic.effective) & kinetic.effective != 0,
    e_effective = !is.na(energy.effective)  & energy.effective  != 0,
    k_special   = !is.na(kinetic.special)   & kinetic.special   != 0,
    e_special   = !is.na(energy.special)    & energy.special    != 0
  )

# Verify: kinetic.effective = floor(fortitude / 10) for unarmored
unarm_eff <- unarm %>% dplyr::filter(k_effective)
exact_match <- sum(unarm_eff$kinetic.effective == floor(unarm_eff$fortitude / 10))
cat(sprintf("kinetic.effective matches floor(fortitude/10): %d / %d\n",
            exact_match, nrow(unarm_eff)))
## kinetic.effective matches floor(fortitude/10): 34 / 40
# Refit on the all-special subset (no fortitude->resist algebraic relationship)
all_special <- unarm %>%
  dplyr::filter(!k_effective & !e_effective &
                (k_special | kinetic == 0) &
                (e_special | energy == 0))
m_special <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                        cleverness + power + ke_floor + nonkinen,
                data = all_special)

cat(sprintf("Full unarmored fortitude coef:  %.5f (n=%d)\n",
            coef(m_una)["fortitude"], nobs(m_una)))
## Full unarmored fortitude coef:  -0.01936 (n=291)
cat(sprintf("All-special  fortitude coef:    %.5f (n=%d)\n",
            coef(m_special)["fortitude"], nobs(m_special)))
## All-special  fortitude coef:    -0.01998 (n=239)

The fortitude coefficient is essentially identical on the all-special subset, where there is no algebraic relationship between fortitude and the kinetic/energy values M7 consumes. The effective-resist double-count is ruled out as the cause.

(For completeness: zeroing out effective resists from ke_floor makes the fit worse by ~40 AIC. The level formula appears to credit effective and special resists at the same rate, using whichever value is displayed.)

Test 4: elastic-net under cross-validation

The strongest test: regularize the regression and let cross-validation decide how much credibility each coefficient deserves. Lasso and elastic-net penalize the absolute size of coefficients, so a fragile coefficient — one that’s only there to bookkeep a collinearity — should shrink toward zero or get dropped entirely. A robust signal survives.

library(glmnet)

predictors <- c("hardiness","fortitude","dexterity","intellect",
                "cleverness","power","ke_floor","nonkinen")
X_una <- as.matrix(una_df[, predictors])
y_una <- una_df$level

set.seed(2026)
cv_ridge <- cv.glmnet(X_una, y_una, alpha = 0.0, nfolds = 10, standardize = TRUE)
cv_enet  <- cv.glmnet(X_una, y_una, alpha = 0.5, nfolds = 10, standardize = TRUE)
cv_lasso <- cv.glmnet(X_una, y_una, alpha = 1.0, nfolds = 10, standardize = TRUE)

reg_compare <- data.frame(
  Term = c("(Intercept)", predictors),
  OLS = round(coef(m_una), 5),
  Ridge_min = round(as.vector(coef(cv_ridge, s = "lambda.min")), 5),
  ElasticNet_min = round(as.vector(coef(cv_enet, s = "lambda.min")), 5),
  Lasso_min = round(as.vector(coef(cv_lasso, s = "lambda.min")), 5)
)
knitr::kable(reg_compare, row.names = FALSE,
             caption = "Cross-validated coefficients at lambda.min, original scale")

Cross-validated coefficients at lambda.min, original scale

Term         OLS       Ridge_min  ElasticNet_min  Lasso_min
(Intercept)  7.87316   8.56827    7.93384         7.92975
hardiness    0.01302   0.00627    0.01234         0.01239
fortitude    -0.01936  -0.01180   -0.01860        -0.01860
dexterity    0.00410   0.00619    0.00417         0.00408
intellect    0.01010   0.01309    0.01058         0.01056
cleverness   0.02480   0.02005    0.02426         0.02436
power        0.01508   0.01238    0.01483         0.01481
ke_floor     0.16654   0.16228    0.16675         0.16675
nonkinen     0.05430   0.04601    0.05312         0.05307

Lasso and elastic-net at the cross-validated optimum essentially reproduce the OLS solution, including the −0.019 fortitude coefficient. Ridge halves both ends of the (hardiness, fortitude) pair but keeps the negative sign on fortitude.

The lasso path is even more telling. Walking from λ=∞ down to λ=0, predictors enter the model in order of how much signal they carry:

Lasso entry order: when each predictor first becomes nonzero

Predictor   EnteredAtLambda  EnteredSign
intellect   6.98803          +
cleverness  4.81658          +
ke_floor    4.38869          +
power       1.73099          +
nonkinen    1.30943          +
dexterity   0.68274          +
fortitude   0.39069          -
hardiness   0.26929          +

fortitude enters at λ ≈ 0.39 with the negative sign already, and stays negative across the entire path. Crucially, hardiness enters last — even after fortitude. The “fragile” coefficient of the collinear pair is hardiness, not fortitude.

Reading the evidence

Putting the four tests together:

Test                    Predicted under “artifact”                    Observed
Drop fortitude          No fit loss; another coef absorbs it          Hardiness flips to ~0; R² drops 3 pts
Constrained regression  No fit loss                                   R² drops 3 pts; residual bias tilts across fortitude bands
Effective resists       Coefficient explained by fort→resist channel  Coefficient identical on all-special subset
Elastic-net (lasso)     fortitude shrinks to zero                     fortitude stays at −0.019; hardiness enters last

The artifact hypothesis is rejected. The negative fortitude coefficient is doing real, irreducible predictive work — it cannot be reshuffled away by another coefficient and it cannot be regularized to zero without paying a real penalty.

That leaves two live mechanisms:

  • Real game-formula term. The unarmored CL formula literally subtracts a fortitude term. A “fortitude penalty” reads as counterintuitive gameplay, but it is mechanically possible — and it would explain the BE-folklore observation that pushing fortitude toward 499 minimized its CL contribution.
  • Bug or post-rebalance damper. Pre-launch SWG allowed pets to equip medium armor; that was removed in a major creature-system rebalance. Either the developer flipped the sign accidentally, or introduced the negative term deliberately as an “armor pays for itself” damper to keep heavy-resist unarmored pets from being too tameable. Either way, the empirical coefficient is what the retail game ran, and that is what an emulator must reproduce.

The data does not let us choose between these two — both predict exactly the empirical coefficient pattern we observe. But for the purposes of reproducing the retail formula, the choice does not matter: the coefficient is real, and the SWGEmu re-implementation should keep it.

Combined Fit: Both Formulas

Put the two formulas together with a hard switch at fortitude=500:

source("R/creature_level_model.R")

normalized_df$predicted_level <- custom_model(normalized_df)
normalized_df$residuals       <- normalized_df$level - normalized_df$predicted_level

cat(sprintf("Combined: n=%d, R^2=%.4f, residual SD=%.3f\n",
            nrow(normalized_df),
            1 - sum(normalized_df$residuals^2) /
                sum((normalized_df$level - mean(normalized_df$level))^2),
            sd(normalized_df$residuals)))
## Combined: n=370, R^2=0.9844, residual SD=1.775

Final Formulas

Armored (fortitude >= 500) — clean fractions, R² ≈ 0.98:

level = -23
      + 0.01   * hardiness
      + 0.06   * fortitude
      + 0.005  * dexterity
      + 0.01   * intellect
      + 0.025  * cleverness
      + 0.015  * power
      + 0.10   * kinen
      + 0.08   * nonkinen

with kinen = (kinetic + energy) / 2 and nonkinen = (blast + heat + cold + electricity + acid + stun) / 6.

Unarmored (fortitude < 500) — M8 with cleverness hinge, R² ≈ 0.96:

level = 8.13
      + 0.012  * hardiness
      - 0.019  * fortitude
      + 0.004  * dexterity
      + 0.011  * intellect
      + 0.020  * cleverness
      + 0.016  * power
      + 0.170  * pmax(pmin(kinetic, energy), 0)
      + 0.050  * nonkinen
      + 0.106  * pmax(cleverness - 400, 0)

Two structural features beyond the linear baseline:

  • The vulnerability-aware resist term pmax(pmin(kinetic, energy), 0): the weaker of the kinetic/energy resists, floored at zero, so a vulnerability on either side collapses the credit.
  • The cleverness hinge pmax(cleverness − 400, 0): apex-DPS pets above cleverness 400 gain level at a steeper slope (Phase 15).

Combined (both formulas, n=370): R² ≈ 0.98, residual SD ≈ 1.8 levels.

Phase 15: High-DPS Under-Prediction — A Cleverness Hinge at 400

Three creatures resisted M7 stubbornly: rancor 6j048r2a (CL48, residual +8.16), razor cat pdefjush (CL49, +5.40), and a sibling rancor 0iafqb8b (CL48, +5.10). All three sit at the apex of the unarmored CL distribution, with cleverness above 400. M7’s linear cleverness term (coef ~0.025) wasn’t enough to lift them.

Scan for a single hinge pmax(cleverness − K, 0) over K ∈ [200, 475] in steps of 25:

no_armor_df <- normalized_df %>% dplyr::filter(armor == 0)
no_armor_df$ke_floor <- pmax(pmin(no_armor_df$kinetic, no_armor_df$energy), 0)
no_armor_df$nk       <- (no_armor_df$blast + no_armor_df$heat +
                          no_armor_df$cold + no_armor_df$electricity +
                          no_armor_df$acid + no_armor_df$stun) / 6

m7_form <- level ~ hardiness + fortitude + dexterity + intellect +
                   cleverness + power + ke_floor + nk
fit_m7  <- lm(m7_form, data = no_armor_df)

knot_scan <- data.frame(K = seq(200, 475, by = 25))
knot_scan$dAIC <- sapply(knot_scan$K, function(K) {
  no_armor_df$.h <- pmax(no_armor_df$cleverness - K, 0)
  fit <- update(fit_m7, ~ . + .h, data = no_armor_df)
  AIC(fit_m7) - AIC(fit)
})
knot_scan$coef <- sapply(knot_scan$K, function(K) {
  no_armor_df$.h <- pmax(no_armor_df$cleverness - K, 0)
  coef(update(fit_m7, ~ . + .h, data = no_armor_df))[".h"]
})
print(knot_scan)
##      K     dAIC       coef
## 1  200 33.16757 0.02431164
## 2  225 34.86292 0.02619438
## 3  250 36.03381 0.02845901
## 4  275 39.48425 0.03285469
## 5  300 40.42823 0.03778011
## 6  325 40.83226 0.04445046
## 7  350 38.14335 0.05330615
## 8  375 35.87911 0.06884239
## 9  400 40.54167 0.10577139
## 10 425 42.84150 0.17599156
## 11 450 36.67013 0.37060579
## 12 475 19.17646 8.61611519

The dAIC plateau is wide (K=300..425 all give roughly equivalent fit), with an in-sample peak at K=425 (dAIC ≈ +43, coefficient ≈ 0.18). That’s the headline result — a cleverness hinge worth ~+40 AIC over M7.

Cross-validation and sanity check

The optimal K is loose because only ~10 creatures sit above clev=400. We need to verify the hinge isn’t an artifact of a single creature.

10-fold CV (50 reps; full code in R/cv_cleverness_hinge.R): M7 baseline CV-RMSE is 1.929; the hinge family beats it by 0.09–0.14 RMSE for any K in [200, 425]. The best, K=425, gives CV-RMSE 1.788 — an improvement of about 0.14, roughly 50× its standard error across replicates. The out-of-sample improvement is unambiguous.

Leave-one-out on the top-cleverness creatures: removing rancor 6j048r2a shifts the in-sample best K from 425 → 275 and the coefficient from 0.176 → 0.028. That single creature is the anchor for K=425. K=400, however, retains 88% of its coefficient when 6j048r2a is removed (0.106 → 0.093), and 86% when all three rancors are removed.

Bootstrap of the K=400 coefficient (200 reps): median 0.105, 95% CI [0.057, 0.129] — well clear of zero and stable.

Raw-HTML check on 6j048r2a: level=48 is correct, template stats (clev=476, power=338) are internally consistent with the recorded samples, and no plausible single-digit typo on level / cleverness / power makes the residual go away while remaining physically craftable.

Decision: M8 = M7 + the K=400 cleverness hinge

no_armor_df$clev_h400 <- pmax(no_armor_df$cleverness - 400, 0)
fit_m8 <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                     cleverness + power + ke_floor + nk + clev_h400,
             data = no_armor_df)
cat(sprintf("M8 unarmored: R^2=%.4f  SD=%.4f  AIC=%.2f\n",
            summary(fit_m8)$r.squared, sd(resid(fit_m8)), AIC(fit_m8)))
## M8 unarmored: R^2=0.9584  SD=1.7279  AIC=1165.13
print(round(coef(fit_m8), 5))
## (Intercept)   hardiness   fortitude   dexterity   intellect  cleverness 
##     8.13225     0.01230    -0.01940     0.00444     0.01139     0.01951 
##       power    ke_floor          nk   clev_h400 
##     0.01561     0.16965     0.05038     0.10577

Effective slope above clev 400 is 0.020 + 0.106 = 0.126 — apex pets gain CL more aggressively per cleverness point than mid-tier ones.

K=400 is chosen over the in-sample optimum K=425 because (a) it survives single-point removal, (b) it has a tight bootstrap CI, and (c) it’s a plausible game-design round number. Promoting K=425 would have meant banking on 6j048r2a.

A nice side-effect: the rancor “+5 skin adjustment” that earlier phases needed turns out to be a high-cleverness artifact. Mean rancor residual under M7 was +5.25; under M8 it is +1.48. The rancor row in the skin-adjustments table is retired.

Phase 16: Is the Hinge Pervasive? Test on the Armored Set

If the cleverness hinge is a global game mechanic, the armored formula should benefit from the same term. The armored set has more high-cleverness creatures than unarmored (26 with clev ≥ 400 vs 10), so it’s a fair test.

arm_df <- normalized_df %>% dplyr::filter(armor == 1, fortitude >= 500)
arm_form <- level ~ hardiness + fortitude + dexterity + intellect +
                    cleverness + power +
                    kinetic + energy +
                    blast + heat + cold + electricity + acid + stun
fit_arm <- lm(arm_form, data = arm_df)

# Knot scan
arm_scan <- data.frame(K = seq(200, 500, by = 50))
arm_scan$dAIC <- sapply(arm_scan$K, function(K) {
  arm_df$.h <- pmax(arm_df$cleverness - K, 0)
  AIC(fit_arm) - AIC(update(fit_arm, ~ . + .h, data = arm_df))
})
arm_scan$coef <- sapply(arm_scan$K, function(K) {
  arm_df$.h <- pmax(arm_df$cleverness - K, 0)
  coef(update(fit_arm, ~ . + .h, data = arm_df))[".h"]
})
print(arm_scan)
##     K      dAIC        coef
## 1 200  8.649897 0.020299713
## 2 250  7.989614 0.016828796
## 3 300  5.057604 0.013344742
## 4 350  2.121336 0.010062670
## 5 400 -1.063663 0.004974901
## 6 450 -1.926995 0.001541793
## 7 500 -1.908498 0.002162013

The armored knot scan goes the wrong way. Within the scanned range the in-sample best is K=200 with a tiny dAIC of about +9 (vs +43 for unarmored), and the dAIC falls steadily as K rises — already negative by K=400.

A free-coefficient hinge at K=400 has coef ≈ 0.005 (1/20 of the unarmored coefficient), ANOVA p ≈ 0.39, and a bootstrap 95% CI of roughly [−0.009, 0.018] — zero sits comfortably inside it. Forcing the unarmored coefficient (0.106) onto the armored formula collapses its R² from 0.987 to 0.925.

Conclusion: the cleverness hinge is structurally unarmored-specific. It’s not a stat-driven scaling that would surface anywhere cleverness gets high. This parallels the negative-fortitude finding — both are unique to the unarmored regime, consistent with the “armor pays for itself” interpretation. Armored creatures already pick up CL from fortitude (+0.06) and the steeper kinetic/energy terms; they don’t need the hinge. Unarmored apex pets, lacking that armor-derived credit, get it back through the cleverness hinge.

Full code in R/test_hinge_on_armored.R.

Phase 17: Remaining Anomalies and a Broader Hypothesis

Two falumpaset outliers survive M8 with large positive residuals:

##     serial       skin level hardiness fortitude cleverness power pred_m8
## 1 01oatm1v falumpaset    32       575       406        191   447   26.57
## 2 1lc95n55 falumpaset    30       557       439        193   292   21.24
##   resid_m8
## 1     5.43
## 2     8.76

Both have low cleverness (≈ 192) so the hinge can’t help them. Raw HTML confirms the levels and template stats are internally consistent — no transcription error. Both share an unusual attack profile (“Dizzy/Strong poison/Ranged”) that the other 8 falumpasets don’t have, and across the unarmored set sa2=strong_poison does have a mean residual of +1.09. But this signal is almost certainly a confound: strong_poison comes mainly from the Aggression and Psychology sample-source slots in BE crafting, so the attack flag is a marker for template-source-slot composition (which the model doesn’t use), not an attack-→-level mechanism.

The more interesting observation is that these two falumpasets sit at the tail of a broader population pattern — high-hardiness, low-cleverness, CL30+ creatures that the linear model under-predicts:

##     serial       skin level hardiness cleverness power pred_m8 resid_m8
## 1 1lc95n55 falumpaset    30       557        193   292   21.24     8.76
## 2 01oatm1v falumpaset    32       575        191   447   26.57     5.43
## 3 1gnl98ni  razor_cat    30       552        233   436   25.63     4.37
## 4 h854qtlr      gnort    30       591        241   399   25.98     4.02
## 5 1f8gec1e       kima    25       512        103   225   22.42     2.58
## 6 s29dv1mh      durni    26       606        108   367   24.04     1.96

Quick scans suggest at least two more candidate hinges sitting in the M8 residuals. Neither is promoted: they're roughly 1/3 the size of the cleverness-hinge effect, may overlap mechanistically (capturing one underlying mechanism, not multiple independent terms), and would need the same CV / sanity-check workflow before being trusted. The data is consistent with the game computing CL from stat bands plus piecewise structure rather than a single linear formula — the cleverness K=400 hinge is just the most identifiable slice because the dataset spans its knot evenly. Code in R/investigate_falumpaset_anomalies.R.

Phase 18: Near-Armor-Border Zone — Resolved Under M8

Earlier phases noted that residuals seemed concentrated near the fortitude=500 boundary, suggesting a possible “light armor” middle regime. Three tests under M8:

Cutoff scan: refit the partition for cutoffs T ∈ [460, 540] in steps of 10.

df <- normalized_df %>%
  dplyr::mutate(ke_floor = pmax(pmin(kinetic, energy), 0),
                nk = (blast + heat + cold + electricity + acid + stun) / 6,
                clev_h400 = pmax(cleverness - 400, 0))

scan_cutoff <- function(T) {
  un <- df %>% dplyr::filter(fortitude < T)
  ar <- df %>% dplyr::filter(fortitude >= T)
  if (nrow(un) < 50 || nrow(ar) < 30) return(NULL)
  fit_u <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                       cleverness + power + ke_floor + nk + clev_h400,
              data = un)
  fit_a <- lm(level ~ hardiness + fortitude + dexterity + intellect +
                       cleverness + power +
                       kinetic + energy + blast + heat + cold +
                       electricity + acid + stun, data = ar)
  sse <- sum(resid(fit_u)^2) + sum(resid(fit_a)^2)
  n <- nrow(un) + nrow(ar)
  p <- length(coef(fit_u)) + length(coef(fit_a)) + 2
  data.frame(T = T, n_un = nrow(un), n_ar = nrow(ar),
             sd = sqrt(sse / n),
             aic = n * log(sse/n) + 2 * p)
}
cutoff_scan <- do.call(rbind, lapply(seq(460, 540, by = 10), scan_cutoff))
cutoff_scan$dAIC <- min(cutoff_scan$aic) - cutoff_scan$aic
print(cutoff_scan)
##     T n_un n_ar       sd      aic       dAIC
## 1 460  276   94 1.930536 540.7704  -93.43871
## 2 470  281   89 1.909047 532.4869  -85.15526
## 3 480  285   85 1.759753 472.2284  -24.89672
## 4 490  288   82 1.727086 458.3623  -11.03067
## 5 500  290   80 1.701532 447.3317    0.00000
## 6 510  306   64 2.062877 589.8353 -142.50363
## 7 520  315   55 2.371146 692.8964 -245.56476
## 8 530  326   44 2.827763 823.2195 -375.88786
## 9 540  333   37 2.991871 864.9653 -417.63367

The fort=500 cutoff is a knife-edge optimum. Moving it 10 either way costs at least 11 AIC; raising it to 510 costs 142 AIC, to 520 costs ~250. The asymmetry tells us the unarmored formula extrapolates roughly linearly into the boundary, but classifying a true-armored creature as unarmored blows up the prediction.

Three-segment piecewise (low/mid/high with breaks at 450 and 540): worse than two-segment by ΔAIC ≈ −56. There is no “light armor middle regime” hiding in the data.

Boundary-band residuals: under M8 the bands have slightly elevated SD (~2.0 vs 1.7 baseline) but no systematic bias. The original M7-era claim “the boundary fits worst” was an artifact of the high-DPS under-prediction sitting in this region; M8 incidentally fixed it.

A small discovery: armor flag vs fortitude

Joining creatures.csv to templates.csv and cross-tabulating the armor flag (creature side) against fortitude >= 500 (template side) on the unfiltered raw data:

raw_creatures <- readr::read_csv("data/clean/furrycat/creatures.csv",
                                 show_col_types = FALSE)
raw_templates <- readr::read_csv("data/clean/furrycat/templates.csv",
                                 show_col_types = FALSE)
joined_raw <- raw_creatures %>%
  dplyr::inner_join(raw_templates %>%
                      dplyr::select(serial, fortitude),
                    by = c("template_id" = "serial"))
table(armor_flag = joined_raw$armor,
      fort_ge_500 = joined_raw$fortitude >= 500)
##           fort_ge_500
## armor_flag FALSE TRUE
##          0   374    1
##          1     1   85

Out of 461 raw creatures, only 2 disagreements exist: kjvtv7d7 (template fort=408 with “Light” armor specified, already filtered as bad data) and 3f0lpuko (template fort=501 with “Light” armor specified, but the creature output has empty/vulnerable AR in the raw HTML). In both disagreements the unarmored M8 formula fits better than the armored formula — the creature output is the ground truth, and a BE template specifying armor doesn’t guarantee the crafted output gets the AR slot.

The principled splitter is therefore armor == 0 from the creature output, not fortitude < 500 from the template. Empirically the two rules disagree on a handful of creatures (one in our filtered set), so the in-sample fit is essentially identical. custom_model keeps the fortitude split for now; an SWGEmu re-implementation should branch on the actual creature armor state, which the engine knows directly.

Persistent Outliers After M8

After M8, the unarmored set has SD ≈ 1.73. Most outliers are within the crafting-noise envelope. The structural ones that remain are the two falumpasets from Phase 17 (1lc95n55 at +8.8 and 01oatm1v at +5.4).

A 10-creature “data-quality suspects” sensitivity analysis (low-health-for-skin entries plus the 6 effective-resist formula mismatches) shifts coefficients by less than 0.01 each. The formula is not being driven by a small number of suspect rows.

Implementation Pseudocode

int calculateCreatureLevel(const Creature* c) {
    float kinen = (c->kinetic + c->energy) / 2.0f;
    float nonkinen = (c->blast + c->heat + c->cold +
                      c->electricity + c->acid + c->stun) / 6.0f;

    float level;

    if (c->fortitude >= 500) {
        // ARMORED FORMULA
        level = -23.0f
              + 0.01f   * c->hardiness
              + 0.06f   * c->fortitude
              + 0.005f  * c->dexterity
              + 0.01f   * c->intellect
              + 0.025f  * c->cleverness
              + 0.015f  * c->power
              + 0.1f    * kinen
              + 0.08f   * nonkinen;
    } else {
        // UNARMORED FORMULA — M8 = M7 weaker-resist-floor + cleverness hinge.
        // ke_floor = max(0, min(kinetic, energy)): if either side has a
        // vulnerability, the kin/eng resist credit collapses to zero.
        // clev_hinge: above cleverness 400, level scales more aggressively.
        float ke_floor   = fmaxf(0.0f, fminf(c->kinetic, c->energy));
        float clev_hinge = fmaxf(0.0f, c->cleverness - 400.0f);

        level = 8.13f
              + 0.012f  * c->hardiness
              - 0.019f  * c->fortitude
              + 0.004f  * c->dexterity
              + 0.011f  * c->intellect
              + 0.020f  * c->cleverness
              + 0.016f  * c->power
              + 0.170f  * ke_floor
              + 0.050f  * nonkinen
              + 0.106f  * clev_hinge;
    }

    if (level < 1.0f)  level = 1.0f;

    return (int)(level + 0.5f);   // round to nearest
}

The R counterpart of this pseudocode lives in R/creature_level_model.R as custom_model().