Neoadjuvant chemotherapy (NAC) is the primary method for reducing tumor burden and metastasis; in the treatment of breast cancer, it may provide additional opportunities
for breast-conserving surgery. Preoperative assessment of pathological complete response (PCR) to NAC is important for developing individualized treatment approaches and predicting
patient prognosis. Compared to magnetic resonance imaging (MRI) and mammography, ultrasonography (US) has the advantages of simplicity, flexibility, and real-time imaging. Moreover, it does not require radiation and can image the tumor at multiple time points during NAC treatment. Recently, deep learning radiomics models based on multi-time-point US images for the
prediction of NAC effectiveness have been proposed. To further improve the prediction performance, we carefully designed four supporting modules for our proposed dual-input transformer (DiT): an isolated tokens-to-token patch embedding module, a shared
position embedding module, a time embedding module, and a weighted average pooling feature representation module. The design of each module considers the characteristics of the US images at multiple
time points. We validated our model on a retrospective US dataset of 484 cases from two centers, between which data consistency is limited. Patients were allocated to training (n =
298), validation (n = 99), and external test (n = 88) sets. The results show that our model achieves better performance than the Siamese CNN and the standard tokens-to-token vision
transformer without using multi-time-point images. The ablation study also confirmed the effectiveness of each module designed for DiT.
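
As a rough illustration of how three of the modules named above could fit together, the following minimal sketch adds a shared position embedding and a per-time-point time embedding to the tokens of two US images, then pools the joint token sequence with a weighted average. It is not the DiT implementation: the function names, the softmax-normalized pooling weights, and the plain-list data layout are all assumptions made for illustration, and the tokens-to-token patch embedding and the transformer encoder itself are omitted.

```python
import math
from typing import List

Vector = List[float]


def add_embeddings(tokens: List[Vector], pos_emb: List[Vector],
                   time_emb: Vector) -> List[Vector]:
    """Add a position embedding (one per token) and a single time
    embedding (one per acquisition time point) to every token."""
    return [
        [t + p + e for t, p, e in zip(tok, pos, time_emb)]
        for tok, pos in zip(tokens, pos_emb)
    ]


def weighted_average_pool(tokens: List[Vector], scores: List[float]) -> Vector:
    """Pool tokens into one feature vector using softmax-normalized scores.
    In a trained model the scores would be learned; here they are inputs
    (an assumption of this sketch)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # subtract max for stability
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(tokens[0])
    return [sum(w * tok[d] for w, tok in zip(weights, tokens))
            for d in range(dim)]


def dual_input_features(pre_tokens: List[Vector], post_tokens: List[Vector],
                        pos_emb: List[Vector], time_embs: List[Vector],
                        scores: List[float]) -> Vector:
    """Hypothetical dual-input pipeline: both time points reuse the SAME
    position embedding but receive DIFFERENT time embeddings."""
    pre = add_embeddings(pre_tokens, pos_emb, time_embs[0])
    post = add_embeddings(post_tokens, pos_emb, time_embs[1])
    joint = pre + post  # concatenate the two token sequences
    return weighted_average_pool(joint, scores)
```

With equal scores the pooling reduces to a plain mean over all tokens, so the time embedding is the only thing distinguishing the two acquisitions in this toy setting.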