Hello Robert, thank you very much for the correction! You are right, I’ve just edited the article:)
About your question: when choosing the best model among those with the same number of variables, we can use RSS or R² computed on the whole dataset. The point of splitting into train and test sets (or, better, of using cross-validation) is to prevent the model from being over-parametrized: adding more parameters always lowers the training error, but past a certain threshold the test error starts increasing, so when comparing models with different numbers of parameters we cannot rely on the training RSS (or any other training-error measure). That’s why, once provided with a set of models with different numbers of parameters, we apply cross-validation, or at least compute an adjusted error metric (such as adjusted R², AIC, BIC, or Mallows’ Cp) that penalizes the model for each additional parameter.
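To make this concrete, here is a minimal sketch using scikit-learn on a synthetic dataset (both are my own choices for illustration, not part of the article): it shows the training R² climbing as features are added, while the cross-validated R² peaks near the true number of informative features and then flattens or degrades.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data: 10 candidate predictors, only 3 actually informative.
# shuffle=False keeps the informative features in the first columns.
X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                       noise=10.0, shuffle=False, random_state=0)

# For each model size k, compare:
# - training R² (never decreases as k grows for nested models)
# - 5-fold cross-validated R² (stops improving once we over-fit)
for k in range(1, 11):
    model = LinearRegression()
    train_r2 = model.fit(X[:, :k], y).score(X[:, :k], y)
    cv_r2 = cross_val_score(model, X[:, :k], y, cv=5, scoring="r2").mean()
    print(f"{k} features: train R² = {train_r2:.3f}, CV R² = {cv_r2:.3f}")
```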
I hope this is clear!