Construction of a pre-trained Swin Transformer model and analysis of its diagnostic efficacy in diabetic retinopathy

WANG Gang; WANG Hong-min; WANG Shan-zhi; ZHU Yong-jun; LIU Ming-jie

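Cite this article: WANG Gang, WANG Hong-min, WANG Shan-zhi, ZHU Yong-jun, LIU Ming-jie. Construction of a pre-trained Swin Transformer model and analysis of its diagnostic efficacy in diabetic retinopathy[J]. Chinese Journal of New Clinical Medicine, 2023, 16(4): 360-365.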

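College of Information Science and Technology, Bohai University, Liaoning 121031, China (WANG Gang, WANG Hong-min); Department of Nephrology, the First Affiliated Hospital of Hainan Medical University, Haikou 570100, China (WANG Shan-zhi, ZHU Yong-jun); Department of Immunology, College of Basic Medical Sciences, Jinzhou Medical University, Liaoning 121031, China (LIU Ming-jie)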

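Abstract:

［Abstract］　Objective　To construct a pre-trained Swin Transformer model and analyze its efficacy in diagnosing diabetic retinopathy. Methods　The training dataset of the APTOS 2019 Blindness Detection competition was downloaded from a data modeling and data analysis competition platform (https://www.kaggle.com/competitions/aptos2019-blindness-detection). The OpenCV image processing library was used to augment the data by changing brightness, flipping at different angles, and applying histogram equalization, yielding 9 160 color fundus images as the complete dataset. A pre-trained Swin Transformer model was constructed to classify the images by lesion grade, and its results were compared with the training results of four pre-trained neural network models: Vision Transformer, EfficientNetV2, ResNet-50, and GoogLeNet. The model was also compared with a non-pre-trained Swin Transformer model with randomly initialized parameters to analyze the effect of pre-training. Results　The pre-trained Swin Transformer model achieved a quadratic weighted Kappa of 0.977 and an accuracy of 94.6%. Its accuracy was 1.9%, 2.3%, 5.4%, and 7.1% higher than that of the Vision Transformer, EfficientNetV2, ResNet-50, and GoogLeNet models, respectively. Compared with the Swin Transformer model without pre-training, accuracy was 4.4% higher and the number of training epochs was reduced by nearly 400. Conclusion　The pre-trained Swin Transformer model diagnoses diabetic retinopathy with high accuracy and has good clinical application value.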

Keywords: Deep learning; Swin Transformer model; Diabetic retinopathy; Pre-training; Intelligent medicine

DOI: 10.3969/j.issn.1674-3806.2023.04.10

Classification number: R 770.41

Funding: National Natural Science Foundation of China (No. 82060143)
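
The Methods name three augmentation operations (brightness change, flipping, histogram equalization) but give no parameters. The following is a minimal sketch of those three operations, written with NumPy stand-ins rather than the OpenCV calls the authors used, so that it runs without an image library. The brightness factors (1.2 and 0.8), the choice of flip directions, and the per-channel equalization are all assumptions for illustration; the function names are hypothetical.

```python
import numpy as np


def adjust_brightness(img: np.ndarray, factor: float) -> np.ndarray:
    """Scale pixel intensities by `factor`, clipping to the uint8 range."""
    scaled = img.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)


def flip(img: np.ndarray, mode: str) -> np.ndarray:
    """Flip an H x W x C image: 'h' left-right, 'v' up-down, 'hv' both."""
    if mode == "h":
        return img[:, ::-1]
    if mode == "v":
        return img[::-1, :]
    if mode == "hv":
        return img[::-1, ::-1]
    raise ValueError(f"unknown flip mode: {mode!r}")


def equalize_hist(channel: np.ndarray) -> np.ndarray:
    """Histogram-equalize a single uint8 channel via its cumulative histogram."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]   # CDF value at the darkest occupied bin
    total = cdf[-1]
    if total == cdf_min:        # constant image: nothing to equalize
        return channel.copy()
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[channel]         # apply the lookup table pixel by pixel


def augment(img: np.ndarray) -> dict:
    """Return augmented variants of one fundus image (parameters are assumed)."""
    equalized = np.stack(
        [equalize_hist(img[..., c]) for c in range(img.shape[-1])], axis=-1
    )
    return {
        "brighter": adjust_brightness(img, 1.2),  # assumed factor
        "darker": adjust_brightness(img, 0.8),    # assumed factor
        "flip_h": flip(img, "h"),
        "flip_v": flip(img, "v"),
        "equalized": equalized,                   # assumed per-channel
    }
```

With OpenCV, the corresponding calls would likely be `cv2.convertScaleAbs`, `cv2.flip`, and `cv2.equalizeHist`; which of these the authors used, and with what settings, is not stated in the abstract.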

Construction of a pre-trained Swin Transformer model and analysis of its diagnostic efficacy in diabetic retinopathy

WANG Gang, WANG Hong-min, WANG Shan-zhi, et al.

College of Information Science and Technology, Bohai University, Liaoning 121031, China

Abstract:

［Abstract］　Objective　To construct a pre-trained Swin Transformer model and to analyze its efficacy in the diagnosis of diabetic retinopathy. Methods　The training dataset of the APTOS 2019 Blindness Detection competition was downloaded from a data modeling and data analysis competition platform (https://www.kaggle.com/competitions/aptos2019-blindness-detection). The OpenCV image processing library was used to augment the data by changing the brightness, flipping at different angles and equalizing the histogram, and a total of 9 160 color fundus images were obtained as the complete dataset. A pre-trained Swin Transformer model was constructed to classify the images by lesion grade, and its results were compared with the training results of four pre-trained neural network models: Vision Transformer, EfficientNetV2, ResNet-50 and GoogLeNet. In addition, the pre-trained model was compared with a non-pre-trained Swin Transformer model with randomly initialized parameters to analyze the impact of pre-training. Results　The quadratic weighted Kappa value of the pre-trained Swin Transformer model was 0.977, and its accuracy was 94.6%. This accuracy was 1.9%, 2.3%, 5.4% and 7.1% higher than that of the Vision Transformer, EfficientNetV2, ResNet-50 and GoogLeNet models, respectively. Compared with the non-pre-trained Swin Transformer model, the pre-trained model improved accuracy by 4.4% and required nearly 400 fewer training epochs. Conclusion　The pre-trained Swin Transformer model has high diagnostic accuracy for diabetic retinopathy and good clinical application value.

Key words: Deep learning; Swin Transformer model; Diabetic retinopathy; Pre-training; Intelligent medicine
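
The Results report a quadratic weighted Kappa, which measures agreement between predicted and reference grades while penalizing a disagreement by the squared distance between the two grades. As a sketch of how such a value can be computed, the function below uses the standard definition, kappa = 1 - sum(w_ij * O_ij) / sum(w_ij * E_ij) with w_ij = (i - j)^2 / (N - 1)^2, where O is the observed confusion matrix and E is the matrix expected from the two marginal histograms. The function name, the default of five grades (0 to 4), and the handling of the degenerate single-class case are assumptions; this is not the authors' evaluation code.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes=5):
    """Quadratic weighted kappa for integer grades in range(n_classes)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Observed confusion matrix: rows are reference grades, columns predictions.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    # Marginal histograms of reference and predicted grades.
    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            numerator += weight * observed[i][j]
            denominator += weight * true_hist[i] * pred_hist[j] / n

    if denominator == 0.0:
        # Assumed convention: both raters used one and the same single grade.
        return 1.0
    return 1.0 - numerator / denominator
```

For example, `quadratic_weighted_kappa([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2)` gives 0.5. In practice, scikit-learn's `cohen_kappa_score(y_true, y_pred, weights="quadratic")` computes the same statistic.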