Copyright
©The Author(s) 2020.
World J Clin Oncol. Nov 24, 2020; 11(11): 918-934
Published online Nov 24, 2020. doi: 10.5306/wjco.v11.i11.918
Table 1 Number of oral cancer cases from various anatomical sites
ICD-O-3 codes | Sites | Number of cases |
C000 | External upper lip | 413 |
C001 | External lower lip | 2444 |
C002 | External lip, NOS | 92 |
C003 | Mucosa of upper lip | 104 |
C004 | Mucosa of lower lip | 567 |
C005 | Mucosa of lip, NOS | 29 |
C006 | Commissure of lip | 85 |
C008 | Overlapping lesion of lip | 46 |
C009 | Lip, NOS (excludes skin of lip C44.0) | 153 |
C019 | Base of tongue, NOS | 10840 |
C020 | Dorsal surface of tongue, NOS | 652 |
C021 | Border of tongue | 2632 |
C022 | Ventral surface of tongue, NOS | 1688 |
C023 | Anterior 2/3 of tongue, NOS | 2807 |
C024 | Lingual tonsil | 170 |
C028 | Overlapping lesion of tongue | 581 |
C029 | Tongue, NOS | 3050 |
C030 | Upper gum | 821 |
C031 | Lower gum | 1680 |
C039 | Gum, NOS | 210 |
C040 | Anterior floor of mouth | 1362 |
C041 | Lateral floor of mouth | 352 |
C048 | Overlapping lesion of floor of mouth | 136 |
C049 | Floor of mouth, NOS | 2284 |
C050 | Hard palate | 1155 |
C051 | Soft palate, NOS (excludes nasopharyngeal surface of soft palate C11.3) | 1301 |
C052 | Uvula | 180 |
C058 | Overlapping lesion of palate | 206 |
C059 | Palate, NOS | 154 |
C060 | Cheek mucosa | 1787 |
C061 | Vestibule of mouth | 134 |
C062 | Retromolar area | 1413 |
C068 | Overlapping lesion of other and unspecified parts of mouth | 142 |
C069 | Mouth, NOS | 487 |
C079 | Parotid gland | 7111 |
C080 | Submandibular gland | 1149 |
C081 | Sublingual gland | 94 |
C088 | Overlapping lesion of major salivary glands | 6 |
C089 | Major salivary gland, NOS (excludes minor salivary gland, NOS C06.9) | 287 |
C090 | Tonsillar fossa | 1735 |
C091 | Tonsillar pillar | 888 |
C098 | Overlapping lesion of tonsil | 109 |
C099 | Tonsil, NOS (excludes lingual tonsil C02.4 and pharyngeal tonsil C11.1) | 9521 |
C100 | Vallecula | 282 |
C101 | Anterior surface of epiglottis | 88 |
C102 | Lateral wall of oropharynx | 184 |
C103 | Posterior wall of oropharynx | 246 |
C104 | Branchial cleft (site of neoplasm) | 37 |
C108 | Overlapping lesion of oropharynx | 277 |
C109 | Oropharynx, NOS | 940 |
C129 | Pyriform sinus | 1707 |
C130 | Postcricoid region | 78 |
C131 | Hypopharyngeal aspect of aryepiglottic fold, NOS (excludes laryngeal aspect of aryepiglottic fold C32.1) | 214 |
C132 | Posterior wall of hypopharynx | 250 |
C138 | Overlapping lesion of hypopharynx | 113 |
C139 | Hypopharynx, NOS | 816 |
C739 | Thyroid gland | 111425 |
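Table 1 lists topography codes without the decimal point (for example, C024), whereas the exclusion notes in the site names use the dotted form (C02.4). The following is a minimal sketch of converting between the two forms, assuming every code is the letter C followed by three digits. The helper names `to_compact` and `to_dotted` are hypothetical and are not part of any SEER tooling.

```python
import re

# Dotted ICD-O-3 topography code (e.g. "C02.4") and compact form (e.g. "C024").
_DOTTED = re.compile(r"^C\d{2}\.\d$")
_COMPACT = re.compile(r"^C\d{3}$")


def to_compact(code: str) -> str:
    """Drop the decimal point: 'C02.4' -> 'C024'."""
    code = code.strip().upper()
    if _COMPACT.match(code):
        return code  # already in the form used by Table 1
    if not _DOTTED.match(code):
        raise ValueError(f"not an ICD-O-3 topography code: {code!r}")
    return code.replace(".", "")


def to_dotted(code: str) -> str:
    """Restore the decimal point: 'C024' -> 'C02.4'."""
    code = code.strip().upper()
    if _DOTTED.match(code):
        return code
    if not _COMPACT.match(code):
        raise ValueError(f"not an ICD-O-3 topography code: {code!r}")
    return f"{code[:3]}.{code[3]}"


if __name__ == "__main__":
    print(to_compact("C02.4"))  # lingual tonsil, as written in Table 1
    print(to_dotted("C739"))    # thyroid gland, dotted form
```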
Table 2 List of all 10 variables included in the final machine learning model building and validation
Variables | Variable description |
Age at diagnosis | This data item represents the age of the patient at diagnosis for this cancer. The code is three digits and represents the patient’s actual age in years |
Year of diagnosis | The year of diagnosis is the year the tumor was first diagnosed by a recognized medical practitioner, whether clinically or microscopically confirmed |
Month of diagnosis | The month of diagnosis is the month the tumor was first diagnosed by a recognized medical practitioner, whether clinically or microscopically confirmed |
Primary site | This data item identifies the site in which the primary tumor originated. See the International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3)[18] for topography codes. The decimal point is eliminated |
CS tumor size | Information on tumor size. Available for 2004-2015 diagnosis years. Earlier cases may be converted and new codes added which weren't available for use prior to the current version of CS. For more information, see http://seer.cancer.gov/seerstat/variables/seer/ajcc-stage [19] |
CS extension | Information on extension of the tumor. Available for 2004-2015 diagnosis years. Earlier cases may be converted and new codes added which weren't available for use prior to the current version of CS. For more information, see http://seer.cancer.gov/seerstat/variables/seer/ajcc-stage [19] |
CS lymph nodes eval | Available for 2004-2015, but not required for the entire timeframe. Will be blank in cases not collected. For more information, see http://seer.cancer.gov/seerstat/variables/seer/ajcc-stage [19] |
Derived AJCC stage group | This is the AJCC “Stage Group” component that is derived from CS detailed site-specific codes, using the CS algorithm, effective with 2004-2015 diagnosis years. See the CS site-specific schema for details (http://seer.cancer.gov/seerstat/variables/seer/ajcc-stage)[19] |
RX Summ-surg prim site | Surgery of primary site describes a surgical procedure that removes and/or destroys tissue of the primary site performed as part of the initial work-up or first course of therapy |
Site recode ICD-O-3/WHO 2008 | A recode based on primary site and ICD-O-3 Histology in order to make analyses of site/histology groups easier. For example, the lymphomas are excluded from stomach and Kaposi and mesothelioma are separate categories based on histology. For more information, see http://seer.cancer.gov/siterecode/icdo3_dwhoheme/index.html [20] |
Table 3 Demographic characteristics of the sample (n = 177714)
Variable | Mean | SD | Median | n | % |
Survival months/mo | 60.35 | 40.98 | 54.00 | | |
Age at diagnosis/yr | 54.62 | 16.10 | 55.00 | | |
Tumor size/(ID, cm) | 22.56 | 21.74 | 19.00 | | |
Marital status | | | | | |
Single | | | | 35688 | 20.08 |
Married | | | | 110480 | 62.17 |
Separated | | | | 1746 | 0.98 |
Divorced | | | | 16401 | 9.23 |
Widowed | | | | 13055 | 7.35 |
Unmarried or domestic partner | | | | 344 | 0.19 |
Sex | | | | | |
Male | | | | 72179 | 40.62 |
Female | | | | 105535 | 59.38 |
Race | | | | | |
White | | | | 148556 | 83.60 |
Black | | | | 16051 | 9.03 |
Other | | | | 13107 | 7.38 |
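The percentage column in Table 3 appears to be each category count divided by the sample size (n = 177714). A minimal sketch of that arithmetic for the marital status rows, assuming two-decimal rounding; the function name `percent_share` is hypothetical:

```python
def percent_share(counts):
    """Return each count as a percentage of the total, rounded to 2 decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


# Marital status counts copied from Table 3.
marital_status = {
    "Single": 35688,
    "Married": 110480,
    "Separated": 1746,
    "Divorced": 16401,
    "Widowed": 13055,
    "Unmarried or domestic partner": 344,
}

if __name__ == "__main__":
    print(sum(marital_status.values()))  # total across categories
    for name, pct in percent_share(marital_status).items():
        print(f"{name}: {pct:.2f}%")
```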
Table 4 Machine learning model performance
Performance indicators | Linear regression | Decision tree | Random forest | XGBoost |
MSE | 647.49 | 538.30 | 489.58 | 486.55 |
RMSE | 25.45 | 23.20 | 22.13 | 22.06 |
MAE | 18.21 | 14.45 | 13.63 | 13.55 |
R² score | 0.620 | 0.681 | 0.709 | 0.711 |
Adjusted R² score | 0.620 | 0.681 | 0.709 | 0.711 |
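The indicators in Table 4 are related by standard definitions: RMSE is the square root of MSE, and the adjusted coefficient of determination rescales the unadjusted one by (n - 1)/(n - p - 1) for n observations and p predictors. The sketch below computes them in plain Python. It is an illustration of the standard formulas, not the authors' code. The size of the evaluation set is not given in these tables, so n = 177714 (the Table 3 sample size) and p = 10 (the Table 2 variable count) are assumptions used only to show why the two scores can agree to three decimals.

```python
import math


def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def regression_metrics(y_true, y_pred, n_features):
    """Return MSE, RMSE, MAE, R^2 and adjusted R^2 for paired value lists."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - (mse * n) / ss_tot                         # 1 - SS_res / SS_tot
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "r2": r2,
        "adjusted_r2": adjusted_r2(r2, n, n_features),
    }


if __name__ == "__main__":
    # Consistency check on the reported values: RMSE should equal sqrt(MSE).
    for model, mse in [("Linear regression", 647.49), ("XGBoost", 486.55)]:
        print(model, round(math.sqrt(mse), 2))
    # Assumed n and p (see the note above); the penalty factor is close to 1.
    print(round(adjusted_r2(0.620, n=177714, p=10), 3))
```

With a sample this large relative to the number of predictors, the adjustment factor differs from 1 by less than 0.01%, which is consistent with the identical R² and adjusted R² rows.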
- Citation: Hung M, Park J, Hon ES, Bounsanga J, Moazzami S, Ruiz-Negrón B, Wang D. Artificial intelligence in dentistry: Harnessing big data to predict oral cancer survival. World J Clin Oncol 2020; 11(11): 918-934
- URL: https://www.wjgnet.com/2218-4333/full/v11/i11/918.htm
- DOI: https://dx.doi.org/10.5306/wjco.v11.i11.918