
Machine learning prediction models for different stages of non-small cell lung cancer based on tongue and tumor marker: a pilot study

Abstract

Objective

To analyze the tongue features of NSCLC at different stages and their correlation with tumor markers, and to investigate the feasibility of establishing prediction models for different stages of NSCLC based on tongue features and tumor markers.

Methods

Tongue images were collected from non-advanced NSCLC patients (n = 109) and advanced NSCLC patients (n = 110) and analyzed to obtain tongue features, and the correlation between tongue features and tumor markers was analyzed for each stage group. On this basis, six classifiers (decision tree, logistic regression, SVM, random forest, naive Bayes, and neural network) were used to establish prediction models for different stages of NSCLC based on tongue features and tumor markers.

Results

There were statistically significant differences in tongue features between the non-advanced and advanced NSCLC groups. In the advanced NSCLC group, more indexes showed statistically significant correlations between tongue features and tumor markers than in the non-advanced NSCLC group, and the correlations were stronger. Among the machine learning methods, support vector machine (SVM), decision tree, and logistic regression performed poorly in distinguishing the stages of NSCLC. Neural network, random forest, and naive Bayes classified the combined tongue feature, tumor marker, and baseline data set better. Their classification accuracies were 0.767 ± 0.081, 0.718 ± 0.062, and 0.688 ± 0.070, respectively, and their AUCs were 0.793 ± 0.086, 0.779 ± 0.075, and 0.771 ± 0.072, respectively.

Conclusions

There were statistically significant differences in tongue features between different stages of NSCLC, and tongue features were more closely correlated with tumor markers in advanced NSCLC. Because each carries limited information, single data sources (baseline, tongue features, or tumor markers alone) could not identify the different stages of NSCLC in this pilot study. With the exception of logistic regression, the machine learning methods built on the tumor marker and baseline data sets distinguished the stages of NSCLC more effectively once tongue image data were added. This finding requires further verification in large-sample studies.


Introduction

The International Agency for Research on Cancer (IARC) released the most recent global cancer data [1] in 2020, revealing that lung cancer is the most common cancer in men, the second most common cancer in women after breast cancer, and the leading cause of cancer death. Non-small cell lung cancer (NSCLC) is the most common histological type of lung cancer, accounting for 80–85% of all lung cancer cases, with high morbidity and mortality [2]. The 5-year survival rate of lung cancer patients is 10–20%, and prevention, screening, treatment, and reducing the economic burden of treatment have become urgent problems [3]. Early detection, diagnosis, and treatment of NSCLC are critical for improving patient prognosis and survival rates. NSCLC patients at different clinical stages receive different treatments, and their prognosis varies. Surgery is an effective treatment option for early lung cancer. In patients with locally advanced lung cancer, surgery can also reduce the tumor burden and, in conjunction with postoperative radiotherapy and chemotherapy, effectively extend survival [4]. Treatment options for patients in advanced stages are limited because of tumor metastasis. Traditional Chinese Medicine (TCM) has specific characteristics and benefits in the treatment of advanced lung cancer: it can effectively reduce symptoms, stabilize tumors, and improve patients’ quality of life [5]. It is therefore important to evaluate the clinical stage of NSCLC patients effectively. At present, clinical staging of NSCLC relies primarily on imaging and histological methods, with histological examination serving as the gold standard. However, histological examination is invasive, complicated, and costly; it can harm patients and even lead to tumor proliferation, so its use is limited. Finding a non-invasive, safe, reliable, and simple staging approach for NSCLC is therefore critical.

Tongue diagnosis is an important part of TCM diagnosis and one of its distinctive features. Studies have shown that the appearance of the tongue can reflect physiological and pathological changes in the body to some extent and is closely related to a person’s overall health status. Tongue characteristics of patients with chronic kidney disease (CKD) correlate with the disease itself, and evaluating the tongue image features of CKD patients with an automated tongue diagnosis system can give clinicians valuable information for early detection and diagnosis of CKD [6]. The color, shape, and thickness of the tongue coating, as well as the color of the tongue body, are correlated with the development of diabetes. Li Jun et al. showed that tongue image features can significantly improve the prediction accuracy of diabetes risk models [7]. Tongue diagnosis also has clinical potential in predicting the risk and severity of gastroesophageal reflux disease (GERD); it is expected to serve as an initial screening indicator for upper gastrointestinal diseases and to assist doctors in non-invasive early diagnosis of GERD [8]. In addition, the color value and thickness of the tongue coating during menstruation are significantly lower in patients with primary dysmenorrhea (PD) than in controls, and the tongue image features obtained by a computerized tongue image analysis system can serve as an auxiliary method for syndrome differentiation, evaluation of therapeutic effects, and prediction of prognosis in PD [9]. With the advancement of TCM diagnostic information technology in recent years, the modernization of TCM has met new opportunities and challenges. A variety of tongue diagnostic instruments are widely used in clinical practice, and objective data acquisition and analysis technology based on standardized tongue diagnosis has gradually matured.

The key technologies of tongue diagnosis include the separation of tongue body and tongue coating and feature extraction. In modern tongue diagnosis research, digital image processing is widely used to extract color and texture features, and various machine learning methods are used for analysis, with good results [10,11,12,13]. Wang X et al. [14] established a diagnostic model of tooth-marked tongue based on a deep convolutional neural network; the model showed good validity and generalization and provides an objective and convenient computer-assisted tongue diagnosis method for tracking disease progression and evaluating efficacy. Xu Q et al. [15] segmented tongue images with a deep neural network and established a multi-task joint learning model. Li J et al. [7] established a diabetes risk warning model based on tongue images using a stacking model and a ResNet50 model, and the results showed that combining tongue image data with machine learning gave high classification efficiency. Digital tongue diagnosis has become one of the focuses of modern TCM research. With the rapid development of artificial intelligence, machine learning methods such as logistic regression [16], support vector machine (SVM) [17], and neural network [12], along with other data mining methods, have been widely used in medical research. Quantitative diagnosis through such mathematical models has promoted the development of information-based intelligent diagnosis in TCM.

Serum tumor marker detection has great clinical value in the early diagnosis, efficacy evaluation, and prognosis assessment of lung cancer. It is widely used in clinical research and plays an important role in monitoring recurrence and metastasis. The clinical value of carcinoembryonic antigen (CEA), carbohydrate antigen 125 (CA-125), carbohydrate antigen 199 (CA-199), alpha-fetoprotein (AFP), neuron-specific enolase (NSE), cytokeratin 19 fragment (CYFRA21-1), and carbohydrate antigen 15-3 (CA15-3) in lung cancer has attracted wide attention [18]. Studies have shown that serum ferritin (SF), squamous cell carcinoma-associated antigen (SCC), NSE, CEA, and CYFRA21-1 are highly expressed in NSCLC and have important clinical value in evaluating clinicopathology, and that combined detection of these five tumor markers can improve the diagnostic value for NSCLC [19]. Zhang H et al. [20] established a prediction model for EGFR mutation in NSCLC based on tumor markers and CT features, and the model combining both was more accurate than models using tumor markers or CT features alone.

On this basis, this pilot study analyzed the tongue features of NSCLC at different stages and the correlation between tongue features and tumor markers, and attempted to establish prediction models for different stages of NSCLC from tongue features and tumor markers using several machine learning methods. The aim was to explore a new, non-invasive, and efficient method for distinguishing the stages of NSCLC, in order to promote early detection, diagnosis, and treatment and to improve the survival rate and prognosis of patients with NSCLC. As an exploratory pilot study, it focused on assessing the feasibility of the methodology, emphasized the accuracy and reliability of data collection, description, and analysis, and aimed to provide data and references for subsequent in-depth studies.

Materials and methods

Study design and subjects

From July 2020 to March 2022, 324 lung cancer patients were recruited from the department of oncology of Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, and their case information, including medical record number, name, gender, medical history, and diagnosis, was collected. Ethical approval was obtained from the Ethics Committee of Longhua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine (registration number 2020LCSY083). Professionally trained graduate students collected standardized tongue image and tumor marker data. A total of 219 NSCLC patients were included in this study: 109 patients with stage I, II, or III disease formed the non-advanced NSCLC group, and 110 patients with stage IV disease formed the advanced NSCLC group. After receiving a clear pathological diagnosis, all patients were informed about the study and signed informed consent. The research flow chart is shown in Fig. 1.

Fig. 1 Flow chart

Diagnostic criteria

Diagnosis followed the “Clinical Practice Guidelines for Lung Cancer Screening” issued by the National Comprehensive Cancer Network (NCCN) [21], and histological classification of lung cancer followed the fourth edition of the World Health Organization (WHO) “Classification of Lung Tumors” [22, 23].

Inclusion and exclusion criteria

Inclusion criteria: (1) NSCLC diagnosed by pathology or cytology; (2) age ranging from 18 to 90 years; (3) clear pathological staging diagnosis; (4) complete tongue image; and (5) informed and signed informed consent.

Exclusion criteria include: (1) patients who did not meet the inclusion criteria; (2) pregnant or breastfeeding patients; (3) patients with other malignant tumors; (4) patients with systemic acute and chronic infections; and (5) patients with mental illness, unwilling to cooperate, or poor study compliance.

Collecting clinical data

TFDA-1 intelligent tongue diagnosis instrument

The Tongue Face Diagnosis Analysis-1 (TFDA-1) digital tongue and face diagnosis instrument, developed by the project team of the National Key Research and Development Program “TCM Intelligent Tongue Diagnosis System Research and Development” (no. 2017YFC1703301), was used to collect the tongue images of patients, and the tongue image analysis system TDAS was used to analyze the images and obtain objective tongue features. The TFDA-1 instrument is shown in Fig. 2 (A) and Fig. 2 (B), and the TDAS system is shown in Fig. 3.

Fig. 2 TFDA-1 digital tongue and face diagnosis instrument. A: front; B: profile

Fig. 3 TDAS tongue image analysis system

All tongue images were collected by researchers with standardized training to ensure consistent and accurate collection. The collection procedure was as follows: (1) set the shooting parameters and sterilize the instrument with alcohol; (2) instruct the subject to place the chin on the mandibular rest of the digital tongue and face diagnosis instrument, relax naturally, open the mouth, and stretch out the tongue so that the tongue body is relaxed, the tongue surface is flat, and the tip points downward, then touch the center of the tongue image in the camera to complete the acquisition; (3) examine the photographed tongue image to ensure that the tongue body is complete and not tense and that there is no fogging, light leakage, overexposure, or underexposure. Images that do not meet these requirements must be re-shot.

Introduction to features of tongue diagnosis

The tongue color indexes are derived from four color spaces: RGB, HSI, Lab, and YCrCb. R (Red), G (Green), and B (Blue) represent the three primary colors, with values ranging from 0 to 255. “H” stands for hue, an angle in the range [0, 2π] with red at 0, green at 2π/3, and blue at 4π/3; “S” stands for saturation and “I” for intensity. “L” stands for lightness, with values from 0 (pure black) to 100 (pure white); “a” stands for the green-red axis and “b” for the blue-yellow axis, each with a value range of [-128, 127]. “Y” stands for luminance, which ranges from 16 to 235, and “Cr” and “Cb” denote chrominance: Cr is the difference between the red part of the RGB input signal and the brightness value of the RGB signal, that is, the degree of offset of the current color toward red, and Cb is the corresponding difference for the blue part, that is, the degree of offset toward blue. Cr and Cb range from 16 to 240. CON (Contrast), ASM (Angular Second Moment), ENT (Entropy), and MEAN are the tongue texture indexes. perAll and perPart are the tongue coating indexes, where perAll is the ratio of the tongue coating area to the total tongue area and perPart is the ratio of the coating area to the uncoated tongue area. In this study, the prefix “TB-” refers to the tongue body and “TC-” to the tongue coating. To better reflect the continuity of the data and reveal regularities and real differences, this study rotated TB-H and TC-H by 180° and redefined the H value after rotation.
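The effect of the 180° hue rotation can be illustrated with a small sketch. The source does not give the exact redefinition of H, so the formula here (add π, then wrap modulo 2π) is an assumption, and `rotate_hue` is a hypothetical helper name. The point is only that reddish hues, which straddle the 0 / 2π wrap-around, become numerically adjacent after rotation.

```python
import math

TWO_PI = 2 * math.pi


def rotate_hue(h: float) -> float:
    """Rotate a hue angle in radians by 180 degrees and wrap into [0, 2*pi).

    Assumption: the study's "redefined H" is taken to be (H + pi) mod 2*pi.
    """
    return (h + math.pi) % TWO_PI


# Two reddish hues that sit on either side of the 0 / 2*pi wrap-around.
h_low, h_high = 0.05, TWO_PI - 0.05

# Before rotation the two similar colors look far apart numerically.
gap_before = abs(h_high - h_low)

# After rotation they land next to each other around pi.
gap_after = abs(rotate_hue(h_high) - rotate_hue(h_low))
```

With these made-up values, `gap_before` is close to 2π while `gap_after` is about 0.10, which is why group means and tests on the rotated hue are easier to interpret.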

The tongue features were extracted automatically by computer batch processing, which had good stability. Data preprocessing in this paper mainly addressed outliers. We used the box plot method to identify outliers, in which the interquartile range (IQR) is the difference between the third (upper) and first (lower) quartiles (IQR = Q3 − Q1). The upper and lower boundaries are the outlier cutoff points: the upper cutoff is Q3 + 1.5 × IQR, and the lower cutoff is Q1 − 1.5 × IQR.
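The box plot rule above can be sketched in a few lines. This is a minimal illustration rather than the study's implementation: the quartile interpolation method (`method="inclusive"`) is an assumption, since statistical packages differ in how they compute quartiles, and the sample values are made up.

```python
import statistics


def iqr_cutoffs(values):
    """Return (lower, upper) outlier cutoffs: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    # Assumption: "inclusive" quartile interpolation; other packages may differ.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values):
    """Return the values that fall outside the box plot cutoffs."""
    lower, upper = iqr_cutoffs(values)
    return [v for v in values if v < lower or v > upper]


# Made-up feature values with one obviously extreme observation.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
lower, upper = iqr_cutoffs(sample)
outliers = find_outliers(sample)
```

For this sample the quartiles are 3.25 and 7.75, so the cutoffs are -3.5 and 14.5 and only the value 100 is flagged. The source does not state how flagged values were handled afterwards.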

In addition, the tumor markers of patients were obtained from the Hospital Information System (HIS), and the specific indexes included CA50, CA242, AFP, NSE, CA72-4, CYFRA21-1, SCC, CEA, CA125, CA15-3, and CA19-9.

Statistical analysis

SPSS 25.0 was used for statistical analysis. Count data were expressed as N (%), and the Pearson χ2 test or Fisher’s exact test was used for comparison between groups. Measurement data that followed a normal distribution were expressed as “X ± SD”, and those that did not were expressed as “Median (P25, P75)”. The t-test was used for data that met normality and homogeneity of variance, and the independent-sample Kruskal-Wallis U test was used otherwise. Correlation heat maps were drawn with GraphPad Prism 8.0. All tests were two-tailed, and P < 0.05 was considered statistically significant.

Modeling with machine learning methods

In this experiment, six machine learning classification algorithms were used to establish differential diagnosis models for different stages of NSCLC: decision tree, support vector machine (SVM), random forest, neural network, naive Bayes, and logistic regression. Classification models were built on six data sets from patients with different clinical stages of NSCLC (“baseline”, “tumor marker”, “tongue feature”, “tumor marker and baseline”, “tongue feature and tumor marker”, and “tongue feature and tumor marker and baseline”), and each model made a two-class prediction. Baseline data here mainly included age and sex. All machine learning was carried out in R. Except for random forest, all machine learning methods were trained on scaled data, normalized with the Z-score method given in Eq. (6).

$$Z=\frac{X-\mu }{\sigma }$$
(6)

where X denotes an element of a data vector, µ the mean of that vector, and \(\sigma\) its standard deviation.
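As a small worked instance of Eq. (6), the sketch below standardizes one feature. Two details are assumptions not stated in the source: the standard deviation is the sample version (n − 1 denominator), and µ and σ are estimated on the training portion and then reused for the test portion so that no test information leaks into scaling. The function names and values are illustrative.

```python
import statistics


def fit_scaler(train_values):
    """Estimate mu and sigma for one feature from the training data."""
    mu = statistics.mean(train_values)
    # Assumption: sample standard deviation (n - 1 denominator).
    sigma = statistics.stdev(train_values)
    return mu, sigma


def z_score(values, mu, sigma):
    """Apply Z = (X - mu) / sigma element by element."""
    return [(x - mu) / sigma for x in values]


# Made-up values for a single feature.
train = [2.0, 4.0, 6.0]
mu, sigma = fit_scaler(train)

scaled_train = z_score(train, mu, sigma)  # [-1.0, 0.0, 1.0]
scaled_test = z_score([8.0], mu, sigma)   # [2.0], using training mu and sigma
```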

We used ten-fold cross-validation to screen and confirm the best parameters for each model. The optimal parameters for each model can be found in Supplementary material 1. After the optimal parameters were confirmed, they were locked, and the data were resampled 10 times, with each resampled testing set taking 30% of the total sample and the training set 70%, so that the evaluation results did not depend on a single split. The modeling was thus repeated 10 times for each data set, and the “Mean (Standard Deviations)” of the 10 classification results was used to describe the model’s classification performance, which reduces errors caused by an unrepresentative test set.
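The repeated 70/30 resampling can be sketched as follows. This is an illustration under stated assumptions, not the study's R code: the data are synthetic, the one-feature threshold classifier (`fit_threshold`, `predict_threshold`) is a hypothetical stand-in for the tuned models, and only accuracy is tracked.

```python
import random
import statistics


def split_indices(n, test_frac, rng):
    """Shuffle indices 0..n-1 and return (train_idx, test_idx)."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = round(n * test_frac)
    return idx[n_test:], idx[:n_test]


def repeated_holdout(X, y, fit, predict, n_repeats=10, test_frac=0.3, seed=0):
    """Repeat a random train/test split and summarize test accuracy."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_repeats):
        train_idx, test_idx = split_indices(len(y), test_frac, rng)
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        accuracies.append(correct / len(test_idx))
    return statistics.mean(accuracies), statistics.stdev(accuracies), accuracies


def fit_threshold(X_train, y_train):
    """Toy stand-in model: cut halfway between the two class means."""
    mean0 = statistics.mean(x for x, lab in zip(X_train, y_train) if lab == 0)
    mean1 = statistics.mean(x for x, lab in zip(X_train, y_train) if lab == 1)
    return (mean0 + mean1) / 2, mean1 >= mean0


def predict_threshold(model, X_test):
    """Predict class 1 on the side of the threshold where class 1 lies."""
    threshold, one_is_high = model
    return [int((x > threshold) == one_is_high) for x in X_test]


# Synthetic one-feature data: two overlapping Gaussian classes.
data_rng = random.Random(42)
X = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
X += [data_rng.gauss(1.5, 1.0) for _ in range(100)]
y = [0] * 100 + [1] * 100

mean_acc, sd_acc, accs = repeated_holdout(X, y, fit_threshold, predict_threshold)
```

Reporting the mean and standard deviation over the 10 splits, as the study does, shows how much a single 70/30 split could mislead on a sample of this size.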

Accuracy, Precision, F1-score, Sensitivity, Specificity, and AUC were used as evaluation indexes. AUC is the area under the ROC curve; for a useful classifier it ranges from 0.5 to 1, and the higher the value, the better the classification. Sensitivity, also known as the true positive rate or Recall, measures how well a diagnostic method detects disease: the greater the sensitivity, the lower the likelihood of a missed diagnosis. Specificity, also known as the true negative rate, measures how well negative cases are recognized: the higher the specificity, the lower the likelihood of a false positive diagnosis. Accuracy is the proportion of correctly classified test instances among all test instances. Precision is the ratio of the number of correctly classified positive cases to the number of cases classified as positive. F1-score is the harmonic mean of Recall (Sensitivity) and Precision and evaluates the two together. The evaluation indexes are given by the following formulas:

$$Accuracy=\frac{TP+TN}{TP+TN+FP+FN}\times 100\text{\%}$$
(1)
$$Precision=\frac{TP}{TP+FP}\times 100\text{\%}$$
(2)
$$Sensitivity=\frac{TP}{TP+FN}\times 100\text{\%}$$
(3)
$$Specificity=\frac{TN}{TN+FP}\times 100\text{\%}$$
(4)
$$F1=\frac{2\times Precision\times Sensitivity}{Precision+Sensitivity}$$
(5)

TP (True Positive) refers to a positive sample predicted as positive by the model. TN (True Negative) refers to a negative sample predicted by the model to be negative. FP (False Positive) refers to a negative sample predicted to be positive by the model; FN (False Negative) refers to a positive sample predicted to be negative by the model.
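Equations (1) to (5) can be checked with a short sketch that takes the four confusion-matrix counts. The counts and scores below are made up for illustration and are not the study's results. The AUC helper uses the pairwise-ranking definition (the probability that a positive case scores higher than a negative one, with ties counted as one half), which is one standard way to compute the area under the ROC curve; the source does not say how AUC was computed.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute Eqs. (1)-(5) as fractions (multiply by 100 for percentages)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts: 60 positive and 40 negative test cases.
metrics = classification_metrics(tp=40, tn=30, fp=10, fn=20)

# Hypothetical predicted scores for positive and negative cases.
auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

For these counts, accuracy is 0.70, precision 0.80, sensitivity about 0.667, specificity 0.75, and F1 about 0.727; the example scores give an AUC of 8/9.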

Results

Baseline data

The baseline data of NSCLC of the two groups with different stages were shown in Table 1.

Table 1 Baseline data table

There were more men than women in the non-advanced NSCLC group and more women than men in the advanced NSCLC group, and the difference in gender distribution between the two groups was statistically significant. In addition, patients in the advanced NSCLC group were older than those in the non-advanced NSCLC group, and the age difference between the two groups was statistically significant.

Statistical analysis of tongue feature

The statistical analysis results of tongue features in different clinical stages of NSCLC were shown in Table 2.

Table 2 Statistical analysis of tongue features [Mean (SD), Median (P25, P75)]

In order to facilitate the observation of the distribution of tongue features with statistically significant differences, GraphPad Prism 8.0 software was used to draw its violin diagram, as shown in Fig. 4.

Fig. 4 Violin diagram of tongue features in the non-advanced NSCLC and the advanced NSCLC groups. *P < 0.05 and **P < 0.01 vs. the non-advanced NSCLC group.

According to the statistical results, eight tongue features differed significantly between the non-advanced NSCLC and the advanced NSCLC groups: TB-B, TB-H, TC-H, TB-L, TB-a, TC-b, TB-Y, and TC-Cb. There was no statistically significant difference in the texture indexes or tongue coating indexes between the two groups.

Correlation analysis of tongue feature and tumor marker

To understand the correlation between TCM and Western medicine indexes in patients with different stages of NSCLC, and whether this correlation differs between stages, the study analyzed the correlation between tongue features and tumor markers separately in the non-advanced NSCLC and the advanced NSCLC groups. A total of 107 patients (66 in the non-advanced NSCLC group and 41 in the advanced NSCLC group) had complete tongue feature and tumor marker data. Indexes with a correlation coefficient ≥ 0.3 were used to draw correlation heat maps. The statistical results and the correlation heat map of the non-advanced NSCLC group are shown in Table 3 and Fig. 5, respectively, and those of the advanced NSCLC group are shown in Table 4 and Fig. 6, respectively.
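The screening of feature-marker pairs for the heat maps can be sketched as below. Several points are assumptions: the source does not name the correlation coefficient, so Pearson's r is used here for illustration; the 0.3 threshold is applied to the absolute value, since negative coefficients are also reported; and the values are made up. `strong_pairs` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def strong_pairs(features, markers, threshold=0.3):
    """Return (feature, marker, r) for every pair with |r| >= threshold."""
    pairs = []
    for f_name, f_vals in features.items():
        for m_name, m_vals in markers.items():
            r = pearson_r(f_vals, m_vals)
            if abs(r) >= threshold:
                pairs.append((f_name, m_name, r))
    return pairs


# Made-up values for five patients; real data would have one row per patient.
features = {"TC-H": [1, 2, 3, 4, 5], "TB-L": [2, 2, 3, 2, 2]}
markers = {"CA125": [5, 4, 3, 2, 1]}

pairs = strong_pairs(features, markers)
```

In this toy example only the TC-H and CA125 pair passes the threshold (r = −1), while TB-L is uncorrelated with CA125 and is dropped.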

Table 3 Correlation analysis between tongue feature and tumor marker in the non-advanced NSCLC group

Fig. 5 Heat map of correlation between tongue feature and tumor marker in the non-advanced NSCLC group

Table 4 Correlation analysis between tongue feature and tumor marker in the advanced NSCLC group

Fig. 6 Heat map of correlation between tongue feature and tumor marker in the advanced NSCLC group

In the non-advanced NSCLC group, CA125 was significantly correlated with TC-H, TC-b, and TC-Cb, with correlation coefficients of −0.395, −0.371, and 0.355 (P < 0.01), and CA72-4 was correlated with TB-H, TC-H, and TC-b, with correlation coefficients of 0.305, 0.363, and 0.344 (P < 0.05, P < 0.01). In the advanced NSCLC group, CA72-4 was correlated with TB-H, TC-H, TC-b, and TC-Cb, with correlation coefficients of −0.558, −0.403, −0.380, and 0.347, respectively (P < 0.05, P < 0.01); NSE was correlated with TB-a, TB-L, and TB-Y, with correlation coefficients of −0.403, −0.400, and −0.394 (P < 0.05, P < 0.01); and CA125 was correlated with TB-G and TB-Y, with correlation coefficients of 0.357 and 0.329 (P < 0.05).

Classification model of NSCLC with different clinical stages

Six machine learning classifiers (decision tree, logistic regression, SVM, random forest, naive Bayes, and neural network) were used to establish classification models for non-advanced and advanced NSCLC based on tongue feature, tumor marker, and baseline data. Each data set was sampled 10 times for each classifier, and the “Mean (Standard Deviations)” of each evaluation index was used to evaluate model performance. The classification results of the models are shown in Table 5.

Table 5 Classification results of each model based on different data sets [Mean (Standard Deviations)]

ROC curves of the models based on the six data sets were shown in Fig. 7.

Fig. 7 ROC curves of each classifier based on six data sets. A: Baseline; B: Tongue feature; C: Tumor marker; D: Tumor marker and baseline; E: Tongue and Tumor marker; F: Tongue and Tumor marker and Baseline

Gini scores were used to rank the importance of variables. For the model built with the random forest method, the importance ranking of the top 15 variables is shown in Fig. 8.

Fig. 8 Variable importance based on Random Forest
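The idea behind a Gini-based importance score can be sketched with the impurity decrease of a single split. This is a simplified illustration: the assumption here is that the "Gini scores" correspond to the usual random forest measure, in which the impurity decreases attributed to a variable are accumulated over all splits and trees. The labels are made up, and `gini_decrease` is a hypothetical helper name.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        len(left) / n * gini_impurity(left)
        + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Made-up stage labels: 0 = non-advanced, 1 = advanced.
parent = [0, 0, 1, 1]

# A split that separates the classes perfectly removes all impurity.
perfect = gini_decrease(parent, [0, 0], [1, 1])  # 0.5

# A split that leaves both children mixed removes none.
useless = gini_decrease(parent, [0, 1], [0, 1])  # 0.0
```

A variable whose splits produce large decreases across many trees ranks high in a plot like Fig. 8.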

The neural network model based on “tongue feature and tumor marker and baseline” data set had the best classification efficiency, and the confusion matrix of its model was shown in Table 6.

Table 6 Confusion matrix of the neural network model

The results showed that classification performance varied with both the classifier and the data set. Among the machine learning methods tested, SVM, decision tree, and logistic regression performed poorly in distinguishing the stages of NSCLC. The decision tree performed best on the tongue feature and tumor marker data set, with an accuracy of 0.658 ± 0.072 and an AUC of 0.658 ± 0.104. SVM performed best on the tongue feature and tumor marker and baseline data set, with an accuracy of 0.736 ± 0.074 and an AUC of 0.655 ± 0.056. Logistic regression performed best on the baseline data set, with an accuracy of 0.627 ± 0.054 and an AUC of 0.667 ± 0.065. Neural network, random forest, and naive Bayes classified the tongue feature and tumor marker and baseline data set better. Their classification accuracies were 0.767 ± 0.081, 0.718 ± 0.062, and 0.688 ± 0.070, respectively, and their AUCs were 0.793 ± 0.086, 0.779 ± 0.075, and 0.771 ± 0.072, respectively.

Discussion

Analysis of tongue features in different stages of NSCLC

There were statistically significant differences in tongue features between the non-advanced NSCLC and the advanced NSCLC groups. The differing indexes were all color space indexes: TB-B, TB-H, TC-H, TB-L, TB-a, TC-b, TB-Y, and TC-Cb. The differences between the two groups were mainly reflected in the intensity and hue of the tongue body, the hue of the tongue coating, and the color of the tongue body and tongue coating. The TB-B, TB-H, TB-L, TB-Y, TC-H, and TC-b indexes were higher in the non-advanced NSCLC group than in the advanced NSCLC group, indicating that the tongue body of the non-advanced NSCLC group was brighter and the tongue coating more yellow. The advanced NSCLC group had higher TB-a and TC-Cb values than the non-advanced NSCLC group, indicating that the tongue body of the advanced NSCLC group was more reddish purple or cyanotic. The texture indexes and tongue coating indexes did not differ statistically between the two groups, indicating that these features could not distinguish the stages of NSCLC.

Correlation analysis of tongue feature and tumor marker in different stages of NSCLC

Tongue feature is TCM data, while tumor marker is Western medicine data. Due to the differences in concepts and methods of TCM and Western medicine, the relationship between them has not been systematically established. The essence of the relationship between TCM and Western medicine is the mechanism of disease and syndrome. The correlation analysis of tongue feature and tumor marker in this study will aid in the establishment of a bridge between TCM and Western medicine, allowing for a deeper understanding of the internal mechanism of disease and syndrome, as well as improve the accuracy of disease and syndrome diagnosis. According to the findings of the study, there was a link between TCM and Western medicine in patients with various stages of NSCLC. In the advanced NSCLC group, the number of indexes with statistically significant correlations between tongue feature and tumor marker was significantly higher than in the non-advanced NSCLC group, and the correlations were stronger.

Some studies have linked CA125, CA15-3, CA19-9, CA72-4, and CYFRA21-1 to lung adenocarcinoma metastasis [24], but no studies have confirmed that CA125 can be used as a prognostic marker, and only a small number have discussed its prognostic value in advanced cancer [25]. Other studies have shown that NSE is an important prognostic factor for advanced locally metastatic NSCLC [18, 26]. In the non-advanced NSCLC group, CA125 was significantly correlated with TC-H, TC-b, and TC-Cb, and CA72-4 was significantly correlated with TB-H, TC-H, and TC-b. In the advanced NSCLC group, CA72-4 was significantly correlated with TB-H, TC-H, TC-b, and TC-Cb, and CA125 was significantly correlated with TB-G and TB-Y. These results indicate that CA125 and CA72-4 were related to tongue brightness and yellow tongue coating in both groups of NSCLC patients. The difference was that CA125 was correlated with TC-Cb in the non-advanced NSCLC group, whereas CA72-4 was correlated with TC-Cb in the advanced NSCLC group; TC-Cb is a typical characteristic index of the purple tongue. Furthermore, in the advanced NSCLC group, NSE was significantly correlated with TB-a, TB-L, and TB-Y, indicating that NSE was associated with tongue brightness and redness in patients with advanced NSCLC.

Analysis of modeling in different stages of NSCLC

Research on modern diagnostic technology and artificial intelligence has greatly promoted the objectification, standardization, and intelligent development of TCM. By continuously learning from large data sets, machine learning and data mining methods can provide better clinical diagnosis, efficacy evaluation, and prediction models, as well as new methodological support for disease and syndrome research. Integrating TCM disease and syndrome research with artificial intelligence can promote the development of intelligent TCM clinical decision-making and efficacy evaluation models with significant practical implications and promising application prospects [27]. This study employed six classifiers (decision tree, logistic regression, SVM, random forest, naive Bayes, and neural network) to establish classification and diagnosis models for different stages of NSCLC based on tongue feature, tumor marker, tongue feature and tumor marker, and tongue feature and tumor marker and baseline data. The results showed that classifiers based solely on tongue features or solely on tumor markers produced poor classification results with a high rate of missed diagnosis. Classification improved when tongue feature, tumor marker, and baseline data were combined. This implies that tongue features or tumor markers alone might not be able to distinguish the stages of NSCLC, or that the result might be affected by the sample size, and that multidimensional data should be combined and analyzed comprehensively to obtain better classification results. The neural network based on the tongue feature and tumor marker and baseline data performed best when predicting the stage of NSCLC, which suggests that the neural network model deserves priority in differential diagnosis.

Conclusions

There were statistically significant differences in the tongue feature of NSCLC patients at different clinical stages. In advanced NSCLC patients, there was a stronger correlation between tongue feature and tumor marker. Because each provides only limited information, single data sources (baseline, tongue feature, or tumor marker alone) cannot be used to identify the different stages of NSCLC. With the exception of logistic regression, the machine learning methods based on tumor marker and baseline data sets can effectively improve the differential diagnosis efficiency for different stages of NSCLC when tongue image data are added. However, further verification in future studies with large samples is still needed.

Data Availability

The datasets generated and analyzed during the current study are not publicly available because the data are confidential and form an important component of the National Key Technology R&D Program of the 13th Five-Year Plan (no. 2017YFC1703301) in China, but they are available from the corresponding author on reasonable request.

Abbreviations

IARC: International Agency for Research on Cancer

NSCLC: Non-small cell lung cancer

TCM: Traditional Chinese Medicine

SVM: Support vector machine

CEA: Carcinoembryonic antigen

CA125: Carbohydrate antigen 125

CA199: Carbohydrate antigen 199

AFP: Alpha-fetoprotein

NSE: Neuron-specific enolase

CYFRA21-1: Cytokeratin 19 fragment

CA15-3: Carbohydrate antigen 15-3

SF: Serum ferritin

SCC: Squamous cell carcinoma-associated antigen

GUI: Graphical User Interface

NCCN: National Comprehensive Cancer Network

TFDA-1: Tongue Face Diagnosis Analysis-1

CON: Contrast

ASM: Angular Second Moment

ENT: Entropy

TB: Tongue Body

TC: Tongue Coating

TP: True Positive

TN: True Negative

FP: False Positive

FN: False Negative

References

  1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and Mortality Worldwide for 36 cancers in 185 Countries[J]. CA Cancer J Clin. 2021;71(3):209–49.

  2. Liu G, Pei F, Yang F, Li L, Amin AD, Liu S, et al. Role of Autophagy and apoptosis in non-small-cell lung Cancer[J]. Int J Mol Sci. 2017;18(2).

  3. Wood DE, Kazerooni EA, Baum SL, Eapen GA, Ettinger DS, Hou L, et al. Lung Cancer Screening, Version 3.2018, NCCN Clinical Practice Guidelines in Oncology[J]. J Natl Compr Canc Netw. 2018;16(4):412–41.

  4. Yongjun J, Bingying Z, Taiping H, Yong Y, Nan Y, Haifeng D, et al. Effect of a New Model-Based Reconstruction Algorithm for evaluating early peripheral lung Cancer with submillisievert chest computed Tomography[J]. J Comput Assist Tomogr. 2019;43(3):428–33.

  5. Chen S, Bao Y, Xu J, Zhang X, He S, Zhang Z, et al. Efficacy and safety of TCM combined with chemotherapy for SCLC: a systematic review and meta-analysis[J]. J Cancer Res Clin Oncol. 2020;146(11):2913–35.

  6. Chen JM, Chiu PF, Wu FM, Hsu PC, Deng LJ, Chang CC, et al. The tongue features associated with chronic kidney disease[J]. Med (Baltim). 2021;100(9):e25037.

  7. Li J, Chen Q, Hu X, Yuan P, Cui L, Tu L, et al. Establishment of noninvasive diabetes risk prediction model based on tongue features and machine learning techniques[J]. Int J Med Inform. 2021;149:104429.

  8. Wu TC, Lu CN, Hu WL, Wu KL, Chiang JY, Sheen JM, et al. Tongue diagnosis indices for gastroesophageal reflux disease: a cross-sectional, case-controlled observational study[J]. Med (Baltim). 2020;99(29):e20471.

  9. Kim J, Lee H, Kim H, Kim JY, Kim KH. Differences in the Tongue Features of Primary Dysmenorrhea Patients and Controls over a Normal Menstrual Cycle[J]. Evid Based Complement Alternat Med. 2017;2017:6435702.

  10. Li J, Yuan P, Hu X, Huang J, Cui L, Cui J, et al. A tongue features fusion approach to predicting prediabetes and diabetes with machine learning[J]. J Biomed Inform. 2021;115:103693.

  11. Shi Y, Yao X, Xu J, Hu X, Tu L, Lan F, et al. A New Approach of fatigue classification based on data of Tongue and Pulse with Machine Learning[J]. Front Physiol. 2021;12:708742.

  12. Li X, Zhang Y, Cui Q, Yi X, Zhang Y. Tooth-marked Tongue Recognition using multiple Instance Learning and CNN Features[J]. IEEE Trans Cybern. 2019;49(2):380–7.

  13. Jiang T, Guo XJ, Tu LP, Lu Z, Cui J, Ma XX, et al. Application of computer tongue image analysis technology in the diagnosis of NAFLD[J]. Comput Biol Med. 2021;135:104622.

  14. Wang X, Liu J, Wu C, Liu J, Li Q, Chen Y, et al. Artificial intelligence in tongue diagnosis: using deep convolutional neural network for recognizing unhealthy tongue with tooth-mark[J]. Comput Struct Biotechnol J. 2020;18:973–80.

  15. Xu Q, Zeng Y, Tang W, Peng W, Xia T, Li Z, et al. Multi-task Joint Learning Model for Segmenting and Classifying Tongue images using a deep neural Network[J]. IEEE J Biomed Health Inform. 2020;24(9):2481–9.

  16. Zhang K, Geng W, Zhang S. Network-based logistic regression integration method for biomarker identification[J]. BMC Syst Biol. 2018;12(Suppl 9):135.

  17. Liu C, Cheng Y. An application of the support Vector Machine for Attribute-By-Attribute classification in cognitive Diagnosis[J]. Appl Psychol Meas. 2018;42(1):58–72.

  18. Abbas M, Kassim SA, Habib M, Li X, Shi M, Wang ZC, et al. Clinical evaluation of serum tumor markers in patients with Advanced-Stage Non-Small Cell Lung Cancer treated with Palliative Chemotherapy in China[J]. Front Oncol. 2020;10:800.

  19. Xu Y, Debing Z, Weiwei W, Mingwei T, Jun Q. Diagnostic value of five tumor markers in non-small cell lung cancer[J]. Clin Res Pract. 2021;6(34):28–32.

  20. Zhang H, He M, Wan R, Zhu L, Chu X. Establishment and Evaluation of EGFR Mutation Prediction Model Based on Tumor Markers and CT Features in NSCLC[J]. J Healthc Eng. 2022;2022:8089750.

  21. Wood DE. National Comprehensive Cancer Network (NCCN) Clinical Practice Guidelines for Lung Cancer Screening[J]. Thorac Surg Clin. 2015;25(2):185–97.

  22. Brambilla E, Travis WD, Colby TV, Corrin B, Shimosato Y. The new World Health Organization classification of lung tumours[J]. Eur Respir J. 2001;18(6):1059–68.

  23. Micke P, Mattsson JS, Djureinovic D, Nodin B, Jirström K, Tran L, et al. The impact of the Fourth Edition of the WHO classification of lung tumours on histological classification of Resected Pulmonary NSCCs[J]. J Thorac Oncol. 2016;11(6):862–72.

  24. Chen ZQ, Huang LS, Zhu B. Assessment of Seven Clinical Tumor Markers in Diagnosis of Non-Small-Cell Lung Cancer[J]. Dis Markers. 2018;2018:9845123.

  25. Yu D, Du K, Liu T, Chen G. Prognostic value of tumor markers, NSE, CA125 and SCC, in operable NSCLC Patients[J]. Int J Mol Sci. 2013;14(6):11145–56.

  26. Wu M, Liu X, Fang J, An T, Wang J. [Clinical and prognostic significance of serum CEA, NSE, CYFRA211, CA125 and CA199 levels in patients with advanced non-small cell lung cancer][J]. Zhongguo Fei Ai Za Zhi. 2001;4(5):357–9.

  27. Zhang J, Qian J, Yang T, Dong HY, Wang RJ. Analysis and recognition of characteristics of digitized tongue pictures and tongue coating texture based on fractal theory in traditional chinese medicine[J]. Comput Assist Surg (Abingdon). 2019;24(sup1):62–71.

Acknowledgements

The authors are especially thankful for the positive support received from the Longhua Hospital affiliated to Shanghai University of Traditional Chinese Medicine and all medical staff involved.

Funding

This research was funded by the National Key Research and Development Program of China (2017YFC1703301), the National Natural Science Foundation of China (82305090), the Science and Technology Commission of Shanghai Municipality (22YF1448900), the Shanghai Municipal Health Commission (20234Y0168), the Shanghai Municipal Education Commission Budget Project (2021LK029), and the Space Medical Experiment Project (HYZHXM05001). The funders were not involved in the preparation of this manuscript or the decision to submit it for publication.

Author information

Contributions

SYL and WH, as co-first authors, contributed equally to this work and participated in the writing. SYL and XJT designed the study; YXH and LJ performed the data analysis; LJY and CY performed the data collection; XJT and LLS contributed to the critical discussion and manuscript revision. All authors approved the submitted version.

Corresponding authors

Correspondence to Lingshuang Liu or Jiatuo Xu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethics approval and consent to participate

Ethical approval was obtained from the Longhua Hospital affiliated to Shanghai University of Traditional Chinese Medicine Hospital Ethics Committee (registration number 2020LCSY083). All methods were performed in accordance with the relevant guidelines and regulations as stated by the Longhua Hospital affiliated to Shanghai University of Traditional Chinese Medicine Hospital Ethics Committee. Written informed consent was obtained from all patients.

Consent for publication

Not applicable.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Shi, Y., Wang, H., Yao, X. et al. Machine learning prediction models for different stages of non-small cell lung cancer based on tongue and tumor marker: a pilot study. BMC Med Inform Decis Mak 23, 197 (2023). https://0-doi-org.brum.beds.ac.uk/10.1186/s12911-023-02266-5

  • DOI: https://0-doi-org.brum.beds.ac.uk/10.1186/s12911-023-02266-5

Keywords