BACKGROUND
The development of models using deep learning (DL) to assess pressure injuries from wound images has recently gained attention. Creating enough supervised data is important for improving performance but is time-consuming. Therefore, the development of models that can achieve high performance with limited supervised data is desirable.

MATERIALS AND METHODS
This retrospective observational study utilized DL and included patients who received medical examinations for sacral pressure injuries between February 2017 and December 2021. Images were labeled according to the DESIGN-R® classification. Three artificial intelligence (AI) models for assessing pressure injury depth (Categorical, Binary, and Combined classification models) were created with a convolutional neural network, and their performance was compared.

RESULTS
A set of 414 pressure injury images in five depth stages (d0 to D4) was analyzed. The Combined classification model showed superior performance (F1-score, 0.868). The Categorical classification model frequently misclassified d1 and d2 as d0 (d0 precision, 0.503), but showed high performance for D3 and D4 (F1-score, 0.986 and 0.966, respectively). The Binary classification model showed high performance in differentiating between d0 and d1-D4 (F1-score, 0.895); however, its performance decreased as the number of evaluation steps increased.

CONCLUSION
The Combined classification model displayed superior performance without increasing the supervised data, which can be attributed to the use of the high-performance Binary classification model for the initial d0 evaluation and the subsequent use of the Categorical classification model with fewer evaluation steps. Understanding the unique characteristics of classification methods and deploying them appropriately can enhance AI model performance.
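
The two-stage logic attributed to the Combined classification model can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name `combined_predict`, the 0.5 threshold, and the choice to reuse five-class probabilities with d0 excluded are all assumptions, since the abstract does not specify how the second stage was built.

```python
from typing import Sequence

# Depth stages named in the abstract (DESIGN-R classification).
STAGES = ("d0", "d1", "d2", "D3", "D4")


def combined_predict(
    p_d0: float,
    categorical_probs: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; not stated in the abstract
) -> str:
    """Hypothetical cascade: binary d0 check first, then a categorical choice.

    p_d0              -- binary model's probability that the image is d0
    categorical_probs -- categorical model's probabilities, ordered as STAGES
    """
    if len(categorical_probs) != len(STAGES):
        raise ValueError("expected one probability per stage")

    # Step 1: the binary model decides d0 versus d1-D4.
    if p_d0 >= threshold:
        return "d0"

    # Step 2: d0 is ruled out, so pick the most probable stage among d1-D4.
    best = max(range(1, len(STAGES)), key=lambda i: categorical_probs[i])
    return STAGES[best]


# The categorical model alone would favor d0 here, but the binary model
# has ruled d0 out, so the cascade falls back to the next best stage.
print(combined_predict(0.2, [0.6, 0.25, 0.05, 0.05, 0.05]))  # prints "d1"
```

Under these assumptions, the categorical model's tendency to label d1 and d2 as d0 no longer affects the final output, because the d0 decision is delegated entirely to the binary model.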