Purpose: To develop and evaluate a new deep learning MR denoising method that leverages quantitative noise distribution information from the reconstruction process to improve denoising performance and generalization.

Methods: This retrospective study trained 14 different transformer and convolutional models with two backbone architectures on a large dataset of 2,885,236 images from 96,605 cardiac retro-gated cine complex series acquired at 3T. The proposed training scheme, termed SNRAware, leverages knowledge of the MRI reconstruction process to improve denoising performance by (1) simulating large, high-quality, and diverse synthetic datasets, and (2) providing quantitative information about the noise distribution to the model. In-distribution testing was performed on a hold-out dataset of 3000 samples, with performance measured using PSNR and SSIM and with an ablation comparison against training without the noise augmentation. Out-of-distribution tests were conducted on cardiac real-time cine, first-pass cardiac perfusion, and neuro and spine MRI, all acquired at 1.5T, to test model generalization across imaging sequences, dynamically changing contrast, different anatomies, and field strengths.

Results: The in-distribution tests showed that SNRAware training resulted in the best performance for all 14 models tested, better than models trained without the proposed synthetic data generation process or without knowledge of the noise distribution. Models trained without any reconstruction knowledge performed worst. The improvement was architecture agnostic and was shown for both convolutional and transformer attention-based models; among them, the transformer models outperformed their convolutional counterparts, and training with 3D input tensors improved performance over using only 2D images. The best model found in the in-distribution test generalized well to out-of-distribution samples, delivering 6.5x and 2.9x CNR improvement for real-time cine and perfusion imaging, respectively. Further, a model trained entirely on cardiac cine data generalized well to a T1 MPRAGE neuro 3D scan and T2 TSE spine MRI.

Conclusions: An SNRAware training scheme was proposed to leverage information from the MRI reconstruction process in deep learning denoising training, resulting in improved performance and good generalization properties.
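
For readers unfamiliar with the metrics named above, the following is a minimal NumPy sketch of PSNR (used for in-distribution testing) and of a contrast-to-noise ratio (CNR, reported as an improvement factor for out-of-distribution testing). It is an illustration under stated assumptions, not the study's evaluation code: the function names `psnr` and `cnr`, the ROI mask arguments, and the particular CNR definition (absolute difference of two region means divided by the standard deviation of a noise region) are hypothetical choices, and the inputs are assumed to be magnitude images.

```python
import numpy as np


def psnr(reference, estimate, data_range=None):
    """Peak signal-to-noise ratio in dB between a reference and an estimate.

    Assumption: real-valued (magnitude) images. If data_range is not given,
    the dynamic range of the reference is used.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    if data_range is None:
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)


def cnr(image, roi_a, roi_b, roi_noise):
    """Contrast-to-noise ratio between two regions of interest.

    Assumption: CNR = |mean(A) - mean(B)| / std(noise region); the abstract
    does not state the definition used. The ROI arguments are boolean masks
    with the same shape as the image.
    """
    image = np.asarray(image, dtype=np.float64)
    contrast = abs(image[roi_a].mean() - image[roi_b].mean())
    return contrast / image[roi_noise].std()
```

Under this definition, a CNR improvement factor such as those reported above would correspond to `cnr(denoised, ...) / cnr(original, ...)` evaluated with the same three masks on both images.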