Supplementary information
Supplementary Fig. 1 Quality control of sequencing data.
(a) Distribution of mapped reads across the 998 samples with valid sequencing data. A total of 31 samples (3.1%) with <100k mapped reads were filtered out. (b) MAPD distribution of the 967 samples that passed the mapped-read filter. A total of 387 samples (38.8% of all 998 samples) with an MAPD ≥0.25 were filtered out, leaving 580 samples (58.1%) that qualified for downstream analysis. (c) Examples of unqualified samples with either <100k mapped reads or an MAPD ≥0.25.
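The two-step filter described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the thresholds come from the caption, while the function name, constant names, and example samples are hypothetical.

```python
# Thresholds as stated in the caption; the names are illustrative assumptions.
MIN_MAPPED_READS = 100_000
MAX_MAPD = 0.25


def passes_qc(mapped_reads, mapd):
    """Keep a sample only if it has >=100k mapped reads and an MAPD below 0.25."""
    return mapped_reads >= MIN_MAPPED_READS and mapd < MAX_MAPD


# Made-up samples, for illustration only.
samples = [
    {"id": "S1", "mapped_reads": 250_000, "mapd": 0.12},  # passes both filters
    {"id": "S2", "mapped_reads": 80_000, "mapd": 0.10},   # too few mapped reads
    {"id": "S3", "mapped_reads": 300_000, "mapd": 0.31},  # MAPD too high
]
qualified = [s["id"] for s in samples if passes_qc(s["mapped_reads"], s["mapd"])]

# Cross-check of the sample counts reported in the caption.
total, low_reads, high_mapd = 998, 31, 387
after_read_filter = total - low_reads            # samples entering the MAPD filter
after_mapd_filter = after_read_filter - high_mapd  # samples kept for analysis
```

The reported percentages (3.1%, 38.8%, 58.1%) are all consistent with a denominator of 998 samples.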
Supplementary Fig. 2 CNA differences between HOL and DOL.
(a) Normalized CNA counts across the genome in HOL and DOL. Copy number gains and losses in each chromosome were counted and then divided by the number of HOL or DOL samples. (b) Distribution of the CNA ratio across the genome in HOL (left panel) and DOL (right panel). Copy number gains and losses for each chromosome were counted and then divided by the total CNA counts in HOL or DOL. (c) Copy number gain and loss ratio in HOL (left panel) and DOL (right panel). (d) Percentage of male and female CNA patients.
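Panels (a) and (b) use two different normalizations of the same per-chromosome counts. The sketch below shows the distinction under the assumption that each CNA event is recorded as a (chromosome, type) pair; the function names and the example events are hypothetical.

```python
from collections import Counter


def counts_per_sample(events, n_samples):
    """Panel (a) style: per-chromosome gain/loss counts divided by the group size."""
    return {key: n / n_samples for key, n in Counter(events).items()}


def share_of_total(events):
    """Panel (b) style: per-chromosome gain/loss counts divided by all CNA events."""
    counts = Counter(events)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


# Made-up events for a group of 4 samples, for illustration only.
events = [("chr3", "gain"), ("chr3", "gain"), ("chr9", "loss"), ("chr3", "loss")]
per_sample = counts_per_sample(events, n_samples=4)
ratio = share_of_total(events)
```

The first normalization allows groups of different sizes to be compared, while the second sums to 1 within a group and shows how CNAs are distributed across chromosomes.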
Supplementary Fig. 3 Correlation between CNAscore and COUNTscore.
The presence of a CNAscore >4.3 and a COUNTscore >7 was defined as a severe CNA region (red).
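The definition in this caption amounts to a conjunction of two strict thresholds. A minimal sketch, with a hypothetical function name and the cut-offs taken from the caption:

```python
def is_severe_cna(cna_score, count_score, cna_cutoff=4.3, count_cutoff=7):
    """True when both scores strictly exceed their cut-offs, as in the caption."""
    return cna_score > cna_cutoff and count_score > count_cutoff


# Made-up score pairs, for illustration only.
examples = [(5.0, 8), (5.0, 7), (4.3, 9), (2.1, 3)]
flags = [is_severe_cna(c, n) for c, n in examples]
```

Samples that exceed only one of the two cut-offs fall outside the severe region.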
Supplementary Fig. 4 CNA differences among FD, LR, and MT.
(a) Normalized CNA counts across the genome in FD, LR, and MT. Copy number gains and losses in each chromosome were counted and then divided by the number of samples in FD, LR, or MT. (b) Distribution of the CNA ratio across the genome in FD (left panel), LR (middle panel), and MT (right panel). Copy number gains and losses in each chromosome were counted and then divided by the total CNA counts in FD, LR, or MT. (c) Copy number gain and loss ratio in FD (upper left panel), LR (upper right panel), and MT (lower panel). (d) Boxplots showing the distributions of CNAscore (upper panel) and COUNTscore (lower panel) in DOL and MT.
Supplementary Fig. 5 CNA profiles of four LR patients and four MT patients with
Supplementary Fig. 6 Discussion on the copy number cut-offs.
(a) Distribution of copy numbers of all the bins. The copy number cut-offs are shown as a red solid line (<1.6; >2.3, used in the paper), a black dashed line (<1.8; >2.2, lenient cut-off), and a green dashed line (<1.4; >2.5, stringent cut-off). (b) COUNTscore_lenient distribution of the FD, LR, and MT groups (*** p<0.001, t-test). (c) COUNTscore_stringent distribution of the FD, LR, and MT groups (*** p<0.001, ** 0.001<p<0.01, t-test).
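The effect of the three cut-off pairs on a single bin can be sketched as below. The thresholds are those listed in panel (a); the dictionary, function name, and example copy numbers are hypothetical, and the sketch only labels bins, since the COUNTscore computation itself is not described in this caption.

```python
# (loss threshold, gain threshold) for each scheme, as listed in panel (a).
CUTOFFS = {
    "paper": (1.6, 2.3),
    "lenient": (1.8, 2.2),
    "stringent": (1.4, 2.5),
}


def call_bin(copy_number, scheme="paper"):
    """Label a bin as 'loss', 'gain', or 'neutral' under the chosen cut-offs."""
    low, high = CUTOFFS[scheme]
    if copy_number < low:
        return "loss"
    if copy_number > high:
        return "gain"
    return "neutral"


# A bin at copy number 1.7 is a loss only under the lenient cut-offs.
calls_1_7 = {scheme: call_bin(1.7, scheme) for scheme in CUTOFFS}
# A bin at copy number 2.4 is a gain under the paper and lenient cut-offs.
calls_2_4 = {scheme: call_bin(2.4, scheme) for scheme in CUTOFFS}
```

Bins close to the diploid value of 2 are therefore the ones whose calls change between schemes.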
Supplementary Table 1 Summary of histopathological and clinical information of all OL patients.
See the attached file.
Supplementary Table 2 CNA ratio according to patient characteristics.
Characteristics     Patient number (%)                      p
                    Total    CNA          CNA-free
Sex
  Female            132      59 (44.6)    73 (55.4)    0.592
  Male              128      53 (41.4)    75 (58.6)
Age
  <40               46       13 (28.2)    33 (71.8)    0.017
  40-49             53       20 (37.7)    33 (62.3)
  50-59             77       32 (41.6)    45 (58.4)
  60-69             65       34 (52.3)    31 (47.7)
  >70               19       13 (68.4)    6 (31.6)
Site
  Tongue            100      45 (45.0)    55 (55.0)    0.239
  Gingiva           34       16 (47.1)    18 (52.9)
  Cheek             106      43 (40.6)    63 (59.4)
  Floor of mouth    4        2 (50.0)     2 (50.0)
  Palate            7        5 (71.4)     2 (28.6)
  Lip               9        1 (11.1)     8 (88.9)
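The table does not state which test produced the p values. As a hedged sketch, the code below assumes a Pearson chi-square test without continuity correction and applies it to the Sex rows; the function name is hypothetical, and the closed form used for the p value holds only for a 2x2 table (1 degree of freedom).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Sex rows: Female 59 CNA / 73 CNA-free, Male 53 CNA / 75 CNA-free.
chi2_sex, p_sex = chi_square_2x2(59, 73, 53, 75)
```

Under this assumption the Sex rows give a p value of about 0.59, in line with the tabulated 0.592. The Age and Site comparisons have more than two rows and would need the general chi-square distribution, or an exact test for the sparse Site rows.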