PURPOSE
This study investigated the potential of deep convolutional neural networks (CNN) for automatic classification of FP-CIT SPECT in multi-site or multi-camera settings with variable image characteristics.

METHODS
The study included FP-CIT SPECT of 645 subjects from the Parkinson's Progression Markers Initiative (PPMI): 207 healthy controls and 438 Parkinson's disease patients. SPECT images were smoothed with an isotropic 18-mm Gaussian kernel, resulting in 3 different PPMI settings: (i) original (unsmoothed), (ii) smoothed, and (iii) a mixed setting comprising all original and all smoothed images. A deep CNN with 2,872,642 parameters was trained, validated, and tested separately for each setting using 10 random splits with 60/20/20% allocation to the training/validation/test samples. The putaminal specific binding ratio (SBR) was computed using either a standard anatomical ROI predefined in MNI space (AAL atlas) or the hottest voxels (HV) analysis. The cutoff for each SBR measure was trained (ROC analysis, Youden criterion) using the same random splits as for the CNN. The CNN and the SBR measures trained in the mixed PPMI setting were also tested in an independent sample from clinical routine patient care (149 patients with non-neurodegenerative and 149 with neurodegenerative parkinsonian syndrome).

RESULTS
Both SBR measures performed worse in the mixed PPMI setting than in the pure PPMI settings (e.g., AAL-SBR accuracy = 0.900 ± 0.029 in the mixed setting versus 0.957 ± 0.017 and 0.952 ± 0.015 in the original and smoothed settings, both p < 0.01). In contrast, the CNN showed similar accuracy in all PPMI settings (0.967 ± 0.018, 0.972 ± 0.014, and 0.955 ± 0.009 in the mixed, original, and smoothed settings). Similar results were obtained in the clinical sample. After training in the mixed PPMI setting, only the CNN provided acceptable performance in the clinical sample.

CONCLUSIONS
These findings provide proof of concept that a deep CNN can be trained to be robust with respect to variable site-, camera-, or scan-specific image characteristics without a large loss of diagnostic accuracy compared with mono-site/mono-camera settings. We hypothesize that a single CNN can be used to support the interpretation of FP-CIT SPECT at many different sites using different acquisition hardware and/or reconstruction software with only minor harmonization of acquisition and reconstruction protocols.
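The SBR baseline described in the methods (a cutoff chosen by ROC analysis with the Youden criterion on a 60/20/20% random split) can be illustrated with a minimal sketch. It is not the study's implementation. The function names, the use of midpoints between observed SBR values as candidate cutoffs, and the rule "SBR below cutoff means patient" are assumptions made here for illustration.

```python
import random


def split_60_20_20(n, seed):
    """Shuffle indices 0..n-1 and split them 60/20/20 (hypothetical helper)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = (60 * n) // 100
    n_val = (20 * n) // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def youden_cutoff(sbr_values, labels):
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    labels: 1 = patient, 0 = control. Assumption: a case is called
    'patient' when its SBR is below the cutoff (reduced binding).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    distinct = sorted(set(sbr_values))
    if len(distinct) < 2:
        raise ValueError("at least two distinct SBR values are required")
    # Candidate cutoffs: midpoints between consecutive distinct SBR values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best_cut, best_j = None, -2.0
    for cut in candidates:
        tp = sum(1 for v, y in zip(sbr_values, labels) if y == 1 and v < cut)
        tn = sum(1 for v, y in zip(sbr_values, labels) if y == 0 and v >= cut)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


def accuracy(sbr_values, labels, cutoff):
    """Fraction of cases classified correctly with the given cutoff."""
    correct = sum(
        1 for v, y in zip(sbr_values, labels) if (1 if v < cutoff else 0) == y
    )
    return correct / len(labels)


if __name__ == "__main__":
    # Toy SBR values, purely illustrative (not study data).
    sbr = [0.5, 0.6, 0.7, 1.5, 1.6, 1.7]
    lab = [1, 1, 1, 0, 0, 0]
    cut, j = youden_cutoff(sbr, lab)
    print(cut, j, accuracy(sbr, lab, cut))
```

In such a scheme the cutoff is fixed on the training sample and then applied unchanged to the test sample. A single global cutoff of this kind may be sensitive to the spatial resolution of the images, which would fit the reported drop in SBR accuracy in the mixed setting.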