File(s) under embargo: 5 month(s), 21 day(s) until file(s) become available
Transformer‐based model for lung adenocarcinoma subtypes
Background
Lung cancer has the highest morbidity and mortality rates among all types of cancer. Histological subtypes serve as crucial markers for the development of lung cancer and have significant clinical value for cancer diagnosis, prognosis, and prediction of treatment response. However, existing studies only dichotomize normal and cancerous tissues, failing to capture the unique characteristics of tissue sections and cancer types.

Purpose
We therefore classify lung adenocarcinoma (LAD) cancer tissues into five subtypes (acinar, lepidic, micropapillary, papillary, and solid) based on section data from whole‐slide images. In addition, a novel model called HybridNet was designed to improve classification performance.

Methods
HybridNet primarily consists of two interactive streams: a Transformer and a convolutional neural network (CNN). The Transformer stream captures rich global representations using a self‐attention mechanism, while the CNN stream extracts local semantic features to optimize image details. Specifically, while the two streams run in parallel, the feature maps of the Transformer stream serve as weights and are combined with those of the CNN stream backbone through a weighted sum; at the end of the parallel stage, the final features of the two streams are concatenated to obtain more discriminative semantic information.

Results
Experimental results on a private LAD dataset showed that HybridNet achieved 95.12% classification accuracy, and the accuracy for the five histological subtypes (acinar, lepidic, micropapillary, papillary, and solid) reached 94.5%, 97.1%, 94%, 91%, and 99%, respectively. On the public BreakHis dataset, HybridNet achieved the best results in three evaluation metrics: accuracy, recall, and F1‐score, with 92.40%, 90.63%, and 91.43%, respectively.

Conclusions
Classifying LAD into five subtypes assists pathologists in selecting appropriate treatments and enables them to predict tumor mutation burden (TMB) and analyze the spatial distribution of immune checkpoint proteins based on this and other clinical data. In addition, the proposed HybridNet fuses CNN and Transformer information several times, which improves the accuracy of subtype classification, and it also shows satisfactory performance on a public dataset, indicating some generalization ability.
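The dual‐stream fusion described under Methods can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the Transformer feature maps are turned into weights, so the sigmoid gate, the residual form of the weighted sum, the global average pooling, and the names `fuse_step` and `final_features` are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, used here to map features to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def fuse_step(cnn_feat, trans_feat):
    """One intermediate fusion step (hypothetical form).

    The Transformer feature map is squashed into a gate in (0, 1), which
    weights the CNN feature map; the weighted map is then summed with the
    original CNN map. Both inputs have shape (channels, height, width).
    """
    gate = sigmoid(trans_feat)          # assumption: sigmoid gating
    return cnn_feat + gate * cnn_feat   # assumption: residual weighted sum


def final_features(cnn_feat, trans_feat):
    """Final fusion: pool each stream over space, then concatenate."""
    cnn_vec = cnn_feat.mean(axis=(1, 2))      # assumption: global average pooling
    trans_vec = trans_feat.mean(axis=(1, 2))
    return np.concatenate([cnn_vec, trans_vec])


# Toy feature maps standing in for the two streams' outputs.
rng = np.random.default_rng(0)
cnn = rng.standard_normal((8, 4, 4))
trans = rng.standard_normal((8, 4, 4))

fused = fuse_step(cnn, trans)
vec = final_features(fused, trans)
print(fused.shape, vec.shape)
```

In this sketch the concatenated vector has twice the channel count of either stream and would feed a classifier over the five subtypes; in a trained model the gate and the pooling would sit inside learned layers rather than being fixed functions.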
Funding
National Natural Science Foundation of China. Grant Numbers: 62302279, 61773246, 81871508
Natural Science Foundation of Shandong Province. Grant Numbers: ZR2018, ZB0419
Author affiliation
College of Science & Engineering Comp' & Math' Sciences
Version
- AM (Accepted Manuscript)