Predictive Analysis of Methylation Patterns in Oral Squamous Cell Carcinoma (OSCC) Using Machine Learning

Sarkar, Debasree

RESEARCH ARTICLE

Debasree Sarkar¹,*

The Open Bioinformatics Journal • 07 Oct 2025 • RESEARCH ARTICLE • DOI: 10.2174/0118750362423160250929075233

Introduction

Oral and oropharyngeal cancers are the most common head and neck cancers, and over 90% of them originate from squamous cells in the mouth and throat. Major risk factors for Oral Squamous Cell Carcinoma (OSCC), which kills over 100,000 patients annually, include chronic tobacco and alcohol use, inflammation, viral infections, betel quid chewing, and genetic predisposition. Epigenetic mechanisms such as DNA methylation can silence tumor suppressor genes, contributing to cancer progression and affecting patient outcomes in OSCC. This study aimed to identify prominent methylation signatures that can distinguish OSCC from normal cells.

Methods

Machine learning algorithms such as Support Vector Machine (SVM), Random Forest (RF), and Multilayer Perceptron (MLP) were implemented with R packages and trained on a balanced dataset of M-values of methylated CpG sites from 46 matched OSCC and normal adjacent tissue samples.
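For readers unfamiliar with M-values, the following is a minimal Python sketch of the standard logit transform that relates a methylation beta value (the fraction methylated, between 0 and 1) to an M-value, M = log2(beta / (1 - beta)). It is illustrative only: the study used R, this section does not describe its preprocessing, and the function names and the clipping constant `eps` are assumptions introduced here.

```python
import math


def beta_to_m(beta: float, eps: float = 1e-6) -> float:
    """Convert a methylation beta value (0..1) to an M-value.

    eps is a hypothetical clipping constant that keeps the logarithm
    finite when beta is exactly 0 or 1; it is not taken from the study.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m: float) -> float:
    """Inverse transform: recover the beta value from an M-value."""
    return 2.0 ** m / (2.0 ** m + 1.0)


# A half-methylated site (beta = 0.5) maps to M = 0; values below and
# above 0.5 map symmetrically to negative and positive M-values.
for beta in (0.1, 0.5, 0.9):
    print(f"beta={beta:.1f} -> M={beta_to_m(beta):+.3f}")
```

M-values are unbounded and spread out the extremes of the beta scale, which is one common reason they are preferred as input to statistical tests and classifiers.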

Results

The MLP model achieved the highest accuracy, 92% on the training dataset and 100% on the blind dataset, even with a reduced feature set of just 10 significantly differentially methylated CpG sites.
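One way to arrive at a small feature set like this is to rank CpG sites by how strongly their M-values differ between matched tumor and normal samples and keep the top few. The sketch below uses a paired t-statistic for that ranking. This is an assumption for illustration: the section does not say which test or significance threshold the study used, and the CpG identifiers, data values, and function names are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(tumor, normal):
    """Paired t-statistic for one CpG site across matched samples."""
    diffs = [t - n for t, n in zip(tumor, normal)]
    avg, sd = mean(diffs), stdev(diffs)
    if sd == 0.0:
        # All pairs differ by the same amount: treat a nonzero shift as
        # infinitely strong evidence and a zero shift as none.
        return math.copysign(math.inf, avg) if avg != 0.0 else 0.0
    return avg / (sd / math.sqrt(len(diffs)))


def top_k_sites(m_values, k=10):
    """Return the k CpG IDs with the largest absolute paired t-statistic.

    m_values maps a CpG ID to a (tumor M-values, normal M-values) pair,
    with both lists in the same patient order.
    """
    scores = {cpg: abs(paired_t(t, n)) for cpg, (t, n) in m_values.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: three sites measured in four matched pairs.
toy = {
    "cg_a": ([0.1, -0.2, 0.3, 0.0], [0.0, -0.1, 0.2, 0.1]),
    "cg_b": ([2.1, 1.8, 2.4, 2.0], [-1.0, -1.2, -0.9, -1.1]),
    "cg_c": ([0.5, 1.5, -0.5, 0.9], [0.0, 0.2, 0.1, 0.3]),
}
print(top_k_sites(toy, k=2))
```

Ranking by a raw statistic does not by itself establish significance; a real analysis would also need p-values corrected for the number of CpG sites tested.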

Discussion

Despite the high burden of oral cancer in South America and an alarming rise in the number of cases, research in this area is sorely lacking. This work addresses that gap with a machine learning-based analysis of methylation patterns, an established major factor in the disease, in oral cancer datasets obtained from Brazilian patients. However, the lack of experimental evidence supporting the results is a significant limitation of this study.

Conclusion

A highly accurate and generalizable machine learning model was developed using the Multilayer Perceptron with multiple layers (MLP-ml) algorithm; it achieved an accuracy of 95% on an independent validation dataset of 15 OSCC tumors and 7 non-tumor adjacent tissue samples. Machine learning algorithms can therefore provide valuable insights into biological datasets that standard bioinformatics workflows may overlook.
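The validation figures can be put in concrete terms with a little arithmetic. The sketch below, which assumes the accuracy was reported rounded to a whole percent, lists every count of correct predictions out of 15 + 7 = 22 samples that would round to 95%. The function name is hypothetical, and the section does not report a confusion matrix, so nothing here says which class any error fell in.

```python
def counts_matching_accuracy(n_samples: int, reported_pct: int) -> list:
    """Counts of correct predictions whose accuracy rounds to reported_pct."""
    return [
        correct
        for correct in range(n_samples + 1)
        if round(100 * correct / n_samples) == reported_pct
    ]


n_tumor, n_normal = 15, 7          # validation set sizes from the text
n_total = n_tumor + n_normal       # 22 samples in total

print(counts_matching_accuracy(n_total, 95))  # prints [21]
```

Under that rounding assumption, 95% on this dataset corresponds to 21 of 22 samples classified correctly (21/22 ≈ 95.5%). With a validation set this small, each single sample shifts the accuracy by about 4.5 percentage points.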

Keywords: DNA methylation, Oral cancer, Methylome, Machine learning, Random forest, Multilayer perceptron, Support vector machine.

© 2025 The Author(s). Published by Bentham Open.
