Radiumhospitalets legater
helsesorost
Oslo University Hospital

Integrating whole genome sequencing, methylation, gene expression and topological associated domain information in regulatory mutation prediction: a study of follicular lymphoma.


AUTHORS:

Amna Farooq 1, Gunhild Trøen 1, Jan Delabie 3 and Junbai Wang 2,4*

1. Department of Pathology, Oslo University Hospital - Norwegian Radium Hospital, Oslo, Norway

2. Institute for Clinical Medicine, University of Oslo, Norway.

3. Laboratory Medicine Program, University Health Network and University of Toronto, Toronto, Ontario, Canada.

4. Department of Clinical Molecular Biology (EpiGen), Akershus University Hospital, Lørenskog, Norway.

*To whom correspondence should be addressed. Email: junbai.wang@medisin.uio.no

ABSTRACT:

A major challenge in human genetics is the analysis of the interplay between genetic and epigenetic factors in a multifactorial disease such as cancer. To understand this interplay, genetic and epigenetic features must be analyzed comprehensively. Here, a novel methodology is proposed to investigate genome-wide regulatory mechanisms in cancer, using follicular lymphoma (FL) as an example. In the first phase, a new machine-learning method is designed to identify differentially methylated regions (DMRs) by computing six attributes. In the second phase, an integrative data analysis method is developed to study regulatory mutations in FL by considering differential methylation information together with DNA sequence variation, differential gene expression, 3D organization of the genome (e.g., topologically associated domains, TADs), and enriched biological pathways. The resulting mutation block-gene pairs are then ranked to identify the most significant ones. By this approach, ~159 predicted mutation block-gene pairs with possible relevance to FL were identified. Notably, BCL2 and BCL6 were identified as top-ranking FL-related genes, with several mutation blocks and DMRs acting on their regulatory regions. Two additional genes, CDCA4 and CTSO, were also ranked near the top, with significant DNA sequence variation and differential methylation in neighboring areas, pointing towards their potential use as biomarkers for FL. This work provides a novel method for combining genomic and epigenomic information to investigate genome-wide gene regulatory mechanisms in cancer, and contributes to devising novel treatment strategies.
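
To make the idea of a "mutation block-gene pair" more concrete, the following is a minimal sketch of one plausible way to form such pairs: a mutation block is paired with every gene that lies in the same TAD. This is an illustration only, not the implementation used in the study. The pairing rule, the names Interval, find_tad and pair_blocks_with_genes, and all coordinates are assumptions made for this example.

```python
from collections import namedtuple

# Hypothetical record type for this sketch: a named genomic interval.
Interval = namedtuple("Interval", ["name", "chrom", "start", "end"])


def find_tad(feature, tads):
    """Return the name of the first TAD that fully contains the feature, or None."""
    for tad in tads:
        if (tad.chrom == feature.chrom
                and tad.start <= feature.start
                and feature.end <= tad.end):
            return tad.name
    return None


def pair_blocks_with_genes(blocks, genes, tads):
    """Pair each mutation block with every gene located in the same TAD.

    Assumption for illustration: sharing a TAD is the only pairing criterion.
    """
    # Index genes by the TAD that contains them.
    genes_by_tad = {}
    for gene in genes:
        tad_name = find_tad(gene, tads)
        if tad_name is not None:
            genes_by_tad.setdefault(tad_name, []).append(gene.name)

    # Emit one (block, gene) pair per gene sharing the block's TAD.
    pairs = []
    for block in blocks:
        tad_name = find_tad(block, tads)
        for gene_name in genes_by_tad.get(tad_name, []):
            pairs.append((block.name, gene_name))
    return pairs


# Toy data with made-up coordinates (not taken from the study).
tads = [Interval("tad1", "chr1", 0, 1000), Interval("tad2", "chr1", 1000, 2000)]
genes = [Interval("geneA", "chr1", 100, 300), Interval("geneB", "chr1", 1200, 1500)]
blocks = [
    Interval("block1", "chr1", 400, 450),
    Interval("block2", "chr1", 1600, 1650),
    Interval("block3", "chr2", 10, 20),  # lies in no TAD, so it yields no pair
]

pairs = pair_blocks_with_genes(blocks, genes, tads)
print(pairs)  # [('block1', 'geneA'), ('block2', 'geneB')]
```

In a real analysis, each pair would additionally carry the evidence described above (differential methylation, differential expression, pathway enrichment) before ranking; the linear scan over TADs is kept here for readability and would normally be replaced by an interval index.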

SUPPLEMENTARY MATERIAL:

The following supplementary files contain detailed results data. The data can be used to replicate the results reported in the main study, or as a resource for further research on follicular lymphoma.

REFERENCES:

  1. Batmanov, K., J. Delabie, and J. Wang, BayesPI-BAR2: A New Python Package for Predicting Functional Non-coding Mutations in Cancer Patient Cohorts. Front Genet, 2019. 10: p. 282.
  2. Batmanov, K., et al., Integrative whole-genome sequence analysis reveals roles of regulatory mutations in BCL6 and BCL2 in follicular lymphoma. Scientific Reports, 2017.
  3. Farooq, A., et al., HMST-Seq-Analyzer: A new python tool for differential methylation and hydroxymethylation analysis in various DNA methylation sequencing data. Computational and structural biotechnology journal, 2020. 18: p. 2877-2889.
  4. Hoffman, M.M., et al., Integrative annotation of chromatin elements from ENCODE data. Nucleic Acids Res, 2013. 41(2): p. 827-41.
  5. Kretzmer, H., et al., DNA methylome analysis in Burkitt and follicular lymphomas identifies differentially methylated regions linked to somatic mutation and transcriptional control. Nat Genet, 2015. 47(11): p. 1316-25.
  6. Zhao, Y., et al., TPM, FPKM, or normalized counts? A comparative study of quantification measures for the analysis of RNA-seq data from the NCI patient-derived models repository. Journal of translational medicine, 2021. 19(1): p. 1-15.