Interpretable machine learning is increasingly used in oncology, yet feature attributions from supervised models (e.g., Random Forest, XGBoost) can be unstable and prone to bias when grounded solely in SHAP explanations. We contrast target-prediction accuracy with the reliability of feature-importance estimates, assessing stability via feature-elimination tests on TCGA data (705 samples, 1936 features). While the supervised models achieved only modest gains (Random Forest ROC-AUC: 0.8851 to 0.8865; XGBoost: 0.8681 to 0.8695), their feature-selection stability was low (6/10 and 3/10 top features retained, respectively). Unsupervised and target-agnostic methods were markedly more stable: Feature Agglomeration, Highly Variable Gene Selection, and Spearman correlation each retained 10/10 features with essentially unchanged performance (ROC-AUC 0.8823, 0.8823, and 0.8766, respectively). We recommend combining unsupervised selection criteria with causal study design and external validation to mitigate model-specific biases.
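To make the stability notion concrete, the following is a minimal illustrative sketch (not the authors' code) of one way to score "k/10 retained": rank features by mean |SHAP| value, refit on resampled data, and count how many of the original top-10 features reappear. The use of XGBoost, bootstrap resampling, and the specific parameters here are assumptions; the paper's feature-elimination protocol may differ.

```python
# Illustrative sketch only: top-10 stability of SHAP-based feature rankings
# under resampling. Assumes a feature matrix X (e.g., 705 x 1936, as in the
# TCGA setting) and a binary label vector y.
import numpy as np
import shap
from sklearn.utils import resample
from xgboost import XGBClassifier

def top10_shap_features(X, y, seed):
    """Fit a model on a bootstrap resample and return the indices of the
    10 features with the largest mean |SHAP| value."""
    Xb, yb = resample(X, y, random_state=seed)
    model = XGBClassifier(n_estimators=200, random_state=seed).fit(Xb, yb)
    shap_values = shap.TreeExplainer(model).shap_values(Xb)
    importance = np.abs(shap_values).mean(axis=0)  # per-feature importance
    return set(np.argsort(importance)[-10:])

def retained_out_of_10(X, y, n_runs=10):
    """Average number of the reference run's top-10 features that are
    retained across repeated resampled runs -- a simple 'k/10' score."""
    reference = top10_shap_features(X, y, seed=0)
    overlaps = [len(reference & top10_shap_features(X, y, seed=s))
                for s in range(1, n_runs)]
    return float(np.mean(overlaps))
```

The same scoring scheme can be applied to target-agnostic rankings (e.g., variance-based or Spearman-correlation-based feature scores) by swapping out the ranking function, which is what allows a direct stability comparison across supervised and unsupervised selection methods.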