Studies in Marine Pollution Bulletin increasingly apply machine learning and explainable AI to pollution assessment and shellfish-poisoning risk, exemplified by PCA-based source apportionment and SHAP-based feature attribution. However, linear PCA may misrepresent structure in inherently nonlinear environmental data, and existing studies often treat model-derived feature importances as evidence of true associations without assessing consistency or dose-response relationships. This paper clarifies that supervised models have two distinct accuracies, prediction accuracy and feature-importance accuracy, of which only the former can be validated against ground truth. Using a Basque coastal dataset (8195 instances, 14 features) with chlorophyll-a as a proxy for paralytic shellfish poisoning risk, we introduce a leave-top-1-out procedure to test ranking stability. Random Forest and XGBoost, with and without SHAP, show pronounced instability, indicating biased, model-dependent importances. In contrast, unsupervised and non-target-prediction methods yield perfectly stable rankings while matching or exceeding supervised performance, supporting the routine use of stability, consistency, dose-response, and linearity checks in environmental ML studies.
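
The following is a minimal sketch, not the authors' code, of how a leave-top-1-out stability check might be implemented: rank features by importance, drop the top-ranked feature, re-rank the remainder, and compare the orderings. It assumes scikit-learn's RandomForestRegressor with impurity-based importances; the function names, column names, and target variable are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def importance_ranking(X: pd.DataFrame, y, seed: int = 0) -> list:
    """Fit a Random Forest and return features ordered by impurity importance."""
    model = RandomForestRegressor(n_estimators=300, random_state=seed).fit(X, y)
    order = np.argsort(model.feature_importances_)[::-1]
    return list(X.columns[order])


def leave_top1_out(X: pd.DataFrame, y):
    """Rank all features, drop the top-ranked one, then re-rank the remainder.

    A stable method should reproduce the original ordering of the surviving
    features; large reshuffles indicate model-dependent importances.
    """
    full = importance_ranking(X, y)
    reduced = importance_ranking(X.drop(columns=[full[0]]), y)
    return full[1:], reduced


# Example usage (hypothetical feature columns and chlorophyll-a target):
# expected, observed = leave_top1_out(df[feature_cols], df["chl_a"])
# n_agree = sum(a == b for a, b in zip(expected, observed))
# print(f"Rank agreement after removing top feature: {n_agree}/{len(expected)}")
```

The same comparison could be repeated with SHAP values or with XGBoost in place of the impurity importances used here; the stability criterion (agreement between the pre- and post-removal orderings) stays the same.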