Studies of inherited predisposition to cancer

File: Houlston RS DSc thesis.pdf (490.12 kB, Adobe PDF)
Title: Studies of inherited predisposition to cancer
Author(s): Houlston, Richard S.
Item Type: Thesis or dissertation
Abstract: Statement on how my publications have contributed to the advancement of knowledge and learning in the field
EARLY RESEARCH
The research studies undertaken for my MD thesis were concerned with the impact of genetic variation in the apolipoprotein B (Apo‐B) and Apo‐E genes on lipoprotein metabolism. I demonstrated that the catabolic rate of low density lipoprotein (LDL) is influenced by variation in the Apo‐B gene and that this directly impacts on circulating levels of LDL; notably, that polymorphic variation in Apo‐B is a determinant of LDL levels in the general population and that germline mutation of the gene can result in a clinical phenotype analogous to familial hypercholesterolaemia.
This period of research was followed by a Clinical Fellowship from the Imperial Cancer Research Fund, during which I worked on the genetic epidemiology of breast, ovarian and colorectal cancer (CRC). Using large datasets I calculated age‐specific familial risks for these cancers and, using statistical modelling, determined the most likely genetic models of inherited predisposition to each. Using these data, a screening programme for the relatives of patients with CRC, based on their calculated lifetime risk, was developed and implemented as part of the North East Thames Regional Clinical Genetics Service. It was demonstrated that family history of cancer could be used to identify those at risk of colonic cancer and to target appropriate screening. In the CRC family history clinic, colonoscopic surveillance was shown to detect a high number of premalignant colonic polyps in those calculated to be at high genetic risk. An analogous programme was subsequently implemented for individuals with a family history of breast cancer. In this clinical setting, use of family history was shown to be an effective means of identifying women at risk of breast cancer who might benefit from early mammographic surveillance.
The findings and experience demonstrated the clinical utility of dedicated family cancer clinics and established a framework for their operation within the United Kingdom. Collectively, work undertaken during this period formed the basis of a PhD thesis.
CURRENT RESEARCH INTERESTS AND ACHIEVEMENTS
I am currently Professor of Molecular and Population Genetics, Division of Genetics and Epidemiology, at the Institute of Cancer Research. The central theme of my research is to understand the biological basis of inherited susceptibility to cancer, specifically through the identification and characterisation of genetic variation influencing cancer risk. Over the past 40 years the study of inherited susceptibility to cancer has proved to be a most informative area of cancer research, and it continues to be so. The results of this research, namely the identification of new susceptibility genes, provide an understanding of the mechanisms of tumour biology and offer potential targets for novel therapeutic interventions. The ability to identify those at increased risk is of clinical relevance, in terms of primary and secondary interventions. Finally, given the difficulties in unambiguously identifying causative exposures for some cancers, genetic associations are likely to be increasingly valuable via the functional links they reveal, which either endorse current aetiological hypotheses or suggest new ones that merit testing via gene‐environment specific hypotheses.
Application for Degree of DSc – Richard S Houlston
IDENTIFICATION OF DISEASE GENES THROUGH POSITIONAL CLONING
My first work in the area of identification of cancer genes was on Juvenile Polyposis (JPS), a rare classical Mendelian dominant condition conferring a high risk of CRC. We undertook a large family collection based on strict ascertainment criteria, which ultimately, through international collaboration, led to the identification of SMAD4 as the disease gene for JPS.
Its identification has allowed families to be tested for JPS in the diagnostic and predictive settings in routine clinical practice. A similar emphasis on detailed family ascertainment allowed us to conduct a genome‐wide linkage scan of families segregating hereditary leiomyomatosis and renal cancer (HLRCC), and to localise the gene for HLRCC to 1q42.3‐q43. Subsequently, as part of an international collaboration, we identified through positional cloning germline mutations in the gene encoding the Krebs cycle enzyme fumarate hydratase (FH) as the basis of HLRCC. While HLRCC is a rare disease, the demonstration that mutation of FH can be a basis for tumour development provided contributing evidence of pseudo‐hypoxic drive in the development of cancer.
Subsequent to these studies we were one of the first research groups to implement high‐density SNP arrays to conduct genome‐wide linkage scans. While dense SNP marker maps provide increased power and improved localisation of disease genes, their high density leads to the problem of linkage disequilibrium (LD) between markers inflating linkage statistics. To address this issue we developed the program SNPLINK, which incorporates LD between markers into the likelihood calculations so that expected haplotype frequencies are correctly estimated. Prior to performing linkage scans of CRC and chronic lymphocytic leukaemia (CLL) families to search for novel disease loci, we extensively piloted the use of high‐density SNP arrays. Firstly, we evaluated a pre‐release version of the Affymetrix 10K array, successfully using this technology to localise and identify PTF1A as a cause of recessive cerebellar and pancreatic agenesis.
Secondly, we validated the use of production Affymetrix 10K arrays, identifying disease genes for a number of Mendelian disorders including recessive paroxysmal nocturnal haemoglobinuria, vasculitis, merosin‐positive congenital muscular dystrophy, craniosynostosis and dominantly inherited renal dysplasia. Following on from these studies we analysed 206 CLL families, identifying a disease locus at 2q21.2 and thereby providing evidence for high‐moderate penetrance susceptibility to CLL. Through mutation screening we subsequently demonstrated that germline mutation of the gene CXCR4, which maps to 2q21.2, is implicated in a subset of familial CLL.
To investigate Mendelian susceptibility to CRC we conducted a genome‐wide linkage screen of 104 CRC families in which germline mutations in the then known CRC predisposition genes had been excluded. The strongest evidence of linkage was provided by loci at 9p, 3p24, 13q31 and 17q24, albeit non‐significant at a genome‐wide threshold. These data indicate that if additional high‐moderate penetrance susceptibility genes exist for CRC, each accounts for only a small fraction of the familial risk.
IDENTIFICATION OF COMMON RISK VARIANTS FOR CANCER
Over the last decade the opportunities for identifying high‐risk cancer genes through classical linkage and positional cloning strategies have dwindled. This has sparked renewed interest in the notion that much of the inherited predisposition to cancer is mediated through various combinations of common and rare disease‐causing genetic variants. Anticipating the requirement for large datasets of cancer cases to search for these classes of cancer susceptibility gene, we developed the necessary infrastructure and protocols for large‐scale sample collection, together with the establishment of international consortia. These initiatives have allowed us to generate world‐class biobanks for multiple cancer types.
Initially our search for common genetic variants affecting cancer risk was largely conducted using a candidate gene approach, making use of analytical platform technologies for parallel processing of genotyping. The spectrum of mutations in Mendelian disease genes, coupled with issues of statistical power, provided us with a rationale for pursuing association analyses targeting non‐synonymous SNPs (nsSNPs). To increase the likelihood of identifying predisposition variants we classified the predicted functionality of all nsSNPs catalogued by dbSNP in genes relevant to the biology of cancer, through cross‐referencing of KEGG and other databases. In addition to studies of CRC, breast cancer and CLL, our study of lung cancer provided evidence for a role for variants of the IGF and BAT3 in disease aetiology.
Subsequent to studies focused on candidate genes, and using our infrastructure, we have been at the vanguard of implementing genome‐wide association studies (GWAS) to identify common, low‐penetrance loci for cancer without prior knowledge of location and function. We have successfully led GWAS of CRC, lung cancer, glioma, meningioma, CLL, acute lymphoblastic leukaemia (ALL), Hodgkin’s lymphoma (HL) and multiple myeloma (MM), being the first research group to identify novel common disease variants for each malignancy. Importantly, prior to our studies no specific risk factors for ALL, CLL and MM had been identified. Collectively our work in this field of research has so far resulted in the identification of over 50 novel cancer loci, constituting a large fraction (perhaps ~25%) of all common variants thus far identified by all researchers worldwide. In the design of GWAS we have shown, in our studies of CRC and CLL, the theoretical value of using cases enriched for genetic susceptibility by virtue of family history as a means of increasing study power to identify novel disease‐causing variants through association‐based analyses.
As well as vindicating the hypothesis of low penetrance susceptibility to cancer, our findings have provided insight into the biological basis of tumourigenesis, emphasising the role of genetic variation in developmental genes as determinants of cancer susceptibility for a number of malignancies. One of the anticipated deliverables from GWAS was that the identification of risk variants would provide fresh insights into cancer biology. Indeed, few of the genes implicated by GWAS had previously been evaluated in targeted association studies. Through the work of our group, insights into new pathways of tumourigenesis are emerging; for example, we have demonstrated that genetic variation in the B‐cell developmental genes IKZF1, CEBPE, IRF4, IRF8, SP140, BAK1 and ARID5B determines the risk of B‐cell tumours. Furthermore, we have shown that many of the risk variants have cis‐acting effects on gene expression rather than influencing risk through sequence variation in the expressed proteins.
Similarly, several of the 20 CRC loci we have identified provide strong evidence for the involvement of components of the transforming growth factor‐β (TGFβ) superfamily signalling pathway in CRC development, most notably BMP2, BMP4, SMAD7 and GREM1. Intriguingly, we have demonstrated an association between chromosome Xp22.2, encompassing the developmental gene SHROOM2, and CRC risk. This represents the first evidence for a role of X‐chromosome variation in predisposition to a non‐sex‐specific cancer characterised by different incidence rates in men and women. While the well established HLA association with HL represents a very strong genetic effect, our identification of risk variants at 2p16.1, 8q24.21 and 10p14 has implicated important roles for networks involving MYC, GATA3 and the NFκB pathway in HL disease aetiology.
Our findings in glioma have also highlighted the importance of variation in genes encoding components of the CDKN2A‐CDK4 signalling pathway in tumour development. Moreover, this pathway, elucidated through the extended interaction network of CDKN2A, incorporates TERT (through mutual interaction with HSP90) and other genes (including CCDC26) identified as risk factors. Gliomas are heterogeneous and, following on from initial gene discovery experiments, we have shown that the risk variants at 5p15.33 (TERT), 8q24.21 (CCDC26) and 20q13.33 (RTEL1) have subtype‐specific effects, consistent with a different aetiological basis for the various glioma histologies. Similarly, our work on the other major primary brain tumour, meningioma, has implicated dysfunctional Wnt signalling as a biological basis for tumour development by virtue of variation in MLLT10. This, together with our previous observation of a relationship between BRIP1 variation and meningioma risk, represents the only robust genetic associations for this tumour type reported thus far. Most recently the GWAS of MM we performed has identified three loci influencing MM risk. Our observation that genetic variation affecting the CDCA7L and ULK4 genes affects MM risk has for the first time directly implicated genetically determined deregulation of MYC and mTOR‐mediated autophagy in the aetiology of the disease.
Importantly, our studies have provided the first direct and robust evidence for inherited genetic susceptibility to ALL, CLL, MM and lung cancer. This was especially significant for lung cancer, a malignancy frequently cited as solely attributable to environmental exposure. While it has long been postulated that individuals may differ in their genetic susceptibility to develop lung cancer in response to genotoxic insult, prior to our studies evidence for such an assertion was lacking.
Through our GWAS of lung cancer we initially identified risk loci at 5p15.33 (TERT/CLPTM1L), 6p21.33, and two at 15q25.1 (CHRNA5‐CHRNA3‐CHRNB4). These data thus provided the first evidence for common genetic susceptibility to lung cancer. We have subsequently shown through collaborative pooled analyses of other GWAS datasets that variation in RAD52 and CDKN2A are risk factors for squamous lung cancer and that the 5p15.33 (TERT) association is specific for risk of lung adenocarcinoma.
Most SNP associations identified to date have been tumour specific, which is consistent with the epidemiological observation that most familial cancer risks are tumour specific. Evidence for pleiotropic effects (reflecting generic effects or lineage‐specific effects) has been provided by our work showing that the 5p15.33 (TERT–CLPTM1L) locus influences the risk of many tumour types, including glioma and lung cancer, and that the locus at 9p21.3 (CDKN2A–CDKN2B) influences the risk of both glioma and ALL in addition to melanoma.
In contrast to high‐penetrance susceptibility, the effect of low‐penetrance variants on tumour phenotype may be limited or absent. Exploration of the relationship between SNP genotypes and tumour phenotypes is still in its infancy, but some associations are becoming apparent. One of the most striking genotype–phenotype relationships identified to date is the 10q21.2 (ARID5B) ALL association, which seems to be highly selective for the subset of B‐cell precursor ALL with hyperdiploidy, directly implicating the developmental gene ARID5B in disease phenotype.
Given the difficulties in unambiguously identifying causative exposures for many cancers, genetic associations have the potential to endorse current aetiological hypotheses or suggest new ones that merit testing through gene‐ and/or environment‐specific hypotheses.
Demonstration of an effect on cancer risk that is mediated by an environmental exposure has been provided by the nicotinic acetylcholine receptor (CHRNA3–CHRNA5) locus, which is associated with lung cancer through genetic variation influencing an individual’s propensity to smoke. Since the 15q25.1 association signal is a consequence of the D398N substitution in CHRNA5, which causes decreased response to nicotine agonists, there is evidence for genetically‐determined nicotine dependency with biological plausibility.
Our finding that variation in developmental genes is commonly the basis of GWAS signals for cancer generally favours such variation impacting early in tumourigenesis rather than on late disease expression. Testing for association between risk variants and progenitor lesions theoretically offers the means of examining such an assertion. For most tumours this is unfortunately not possible. For CLL it is possible, since we have shown that monoclonal CLL‐phenotype cells (monoclonal B‐cell lymphocytosis, MBL) are detectable in ~3% of otherwise healthy persons and represent the progenitor lesion for CLL. Our observation that the risk variants for CLL affect the risk of MBL is consistent with the variants impacting on early stage development of CLL rather than disease progression per se.
Modelling all the single nucleotide polymorphisms (SNPs) in a GWAS simultaneously provides a means of deriving an unbiased estimate of the heritability explained by common variation. In concert with our gene discovery efforts using GWAS‐based strategies we have applied this form of statistical modelling to ALL and CLL GWAS datasets to derive estimates of heritability. Through these analyses we have shown that 59% of the total variation in CLL risk can be accounted for by common genetic variation, thereby providing the first direct data for a polygenic basis for susceptibility to CLL.
Prior to our GWAS of ALL, evidence for genetic susceptibility, although speculated, was distinctly lacking, and much of the contemporary thinking was that, given indirect evidence for the role of infection in disease aetiology, any genetic susceptibility would be HLA‐associated. Our analyses have quashed this hypothesis and have shown that 28% of the variance in childhood ALL risk can be ascribed to common variation.
Genetic and functional basis of GWAS signals
To investigate whether associations may have arisen owing to independent correlation of a tagSNP with more than one functional SNP, we searched for novel CRC susceptibility variants close to the BMP‐pathway genes GREM1, BMP4 and BMP2. We have identified independent CRC predisposition SNPs close to BMP4 and BMP2. Near GREM1 we also found, using fine‐mapping, that the previously identified association between tagSNP rs4779584 and CRC was a consequence of two independent signals. As exemplified by the MLH1 ‐93G>A polymorphism, there is increasing evidence that high‐ and low‐penetrance variants can map to the same gene. Further support for such an assertion is provided by our observation of inactivating BMP4 mutations as a basis of familial CRC. Our work in this area has thus served to emphasise that genetic fine‐mapping studies can deconvolute associations, thus explaining some of the apparently missing heritability of common diseases.
A long‐term outcome of GWAS is that knowledge gained about the underlying molecular basis of CRC may lead to the development of innovative therapeutic and preventative measures. Many of the risk loci identified thus far by GWAS map to non‐coding regions of the genome. For example, the 8q24.21 region is one of the most intriguing and important loci to emerge from GWAS. The genomic interval harbours independent loci with different tumour specificities, including ones for CRC, CLL and HL.
The region to which these cancer associations map is, however, bereft of genes or protein‐coding transcripts. Identification of the causal basis of association signals identified through GWAS is challenging, and we have played a major role in elucidating the functional basis for the CRC associations. This work has underscored the role of inherited differential gene expression as a determinant of cancer susceptibility, rather than the mutation impacting on protein sequence that has typified Mendelian cancer susceptibility. Using a combinatorial approach of bioinformatics and molecular studies we have demonstrated that the genomic region harbouring the 8q23.3 variant for CRC influences expression of eukaryotic translation initiation factor 3, subunit H (EIF3H), and that increased expression of EIF3H increases CRC growth and invasiveness, thus providing a biological mechanism for the 8q23.3 association. Similarly, our work has demonstrated cis‐acting regulation of the TGF‐beta signalling gene SMAD7 as a basis for the 18q21 association, and cis‐acting regulation of BMP4 as a basis for the 14q22.2 association. At 8q24.21 we showed that rs6983267 was maximally associated with CRC risk. A collaboration with Lauri Aaltonen (Helsinki University) led to the demonstration that rs6983267 annotates an enhancer element affecting the binding of the Wnt‐regulated transcription factor TCF4. This, coupled with the finding that rs6983267 directly interacts with the MYC promoter, has provided a biological basis for the 8q24 association.
GENETIC EPIDEMIOLOGICAL ANALYSES
Studies of CLL
Although familial clustering of CLL had long been recognised, direct evidence for inherited genetic susceptibility had been lacking. We have made a significant contribution to establishing that inherited genetic factors play a role in the development of CLL.
Even prior to our linkage and association studies, through the ascertainment of striking families segregating CLL we provided an overwhelming case for the existence of genetic predisposition. Furthermore, our observations on the repertoire and frequency of IGVH usage in familial and sporadic CLL favour a genetic basis to CLL development rather than a simple environmental aetiology. Coupled with these, our studies of MBL have provided insight into disease development.
Risk prediction and statistical modelling of familial risk
The distribution of disease alleles and the genotypic risk they confer are crucial for determining their applicability to relative or absolute risk circumstances. Diagnostic testing for highly penetrant mutations is now part of standard clinical care for cancer families in many countries. The identification of individuals at increased risk allows the targeting of cancer prevention strategies and can increasingly influence cancer treatment. Our work on the estimation of familial cancer risks has been largely focused on those relating to CRC. In addition to deriving age‐specific familial CRC risks we have stratified risks by molecular features in cancers. These analyses indicate that the majority of the familial CRC risk associated with microsatellite unstable CRC is a consequence of germline MMR mutation/variation. However, ~70% of the familial CRC risk currently remains unexplained. Statistical modelling of this “missing heritability” was shown to be compatible with polygenic/recessive susceptibility.
Using the largest dataset to date we have derived age‐specific CRC risks associated with mutations in the base‐excision repair gene MUTYH. We have shown that while biallelic mutations are associated with a high CRC risk, penetrance is incomplete at age 60. We have also shown that MUTYH mutation screening should be directed to patients with APC‐negative polyposis and early‐onset proximal microsatellite stable CRC, and that an expanded clinical phenotype needs to be recognised.
This information is directly relevant to clinical counselling.
There have been limited data on the spectrum and risk of cancer associated with the germline serine/threonine protein kinase 11 (STK11) mutations that cause Peutz‐Jeghers syndrome (PJS). To address this deficiency we analysed the incidence of cancer in 240 individuals with PJS possessing germline mutations in STK11, the largest study of its type to date. The most common cancers represented were gastrointestinal in origin: gastroesophageal, small bowel, colorectal and pancreatic. In women, the risk of breast cancer was substantially increased. Our analysis showed similar cancer risks in missense and truncating mutation carriers. These results quantified the spectrum of cancer risk associated with STK11 germline mutations in the context of PJS and provide the most comprehensive data for defining surveillance regimens thus far.
Public health value of common variants on cancer risk
At present, the power of models that incorporate all known common risk alleles for individual‐level risk prediction is limited, although there is clearly potential for this to improve substantially as more variants are found. This may in turn have important health implications for the provision of cancer screening; for example, in determining who should undergo colonoscopy. Nevertheless, by modelling the public health potential of risk profiling using composites of risk variants for CRC, we have demonstrated that stratification of the population into CRC risk categories is feasible, allowing targeted prevention and surveillance within the population to be optimally configured.
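The idea of risk profiling with composites of common variants can be illustrated with a minimal sketch. It assumes a log‐additive model in which each risk allele multiplies risk by its per‐allele odds ratio, genotypes in Hardy‐Weinberg equilibrium, and independent variants. The allele frequencies, odds ratios and function names below are hypothetical placeholders chosen for illustration; they are not values or methods taken from the thesis.

```python
import math
import random

# Hypothetical variants: (risk allele frequency, per-allele odds ratio).
# Illustrative placeholders only, not estimates from the thesis.
VARIANTS = [(0.30, 1.10), (0.50, 1.15), (0.20, 1.20), (0.40, 1.08), (0.10, 1.25)]


def log_risk_score(genotype, variants=VARIANTS):
    """Sum of risk-allele counts weighted by log odds ratio (log-additive model)."""
    return sum(count * math.log(odds) for count, (_, odds) in zip(genotype, variants))


def simulate_genotype(rng, variants=VARIANTS):
    """Draw 0, 1 or 2 risk alleles per variant, assuming Hardy-Weinberg equilibrium."""
    return [sum(rng.random() < freq for _ in range(2)) for freq, _ in variants]


def stratify(scores, n_bins=5):
    """Assign each score to a risk category 0..n_bins-1 by rank in the population."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    bins = [0] * len(scores)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(scores)
    return bins


def mean_relative_risk_by_bin(scores, bins, n_bins=5):
    """Mean relative risk in each category, scaled to the population mean risk."""
    risks = [math.exp(s) for s in scores]
    pop_mean = sum(risks) / len(risks)
    result = []
    for b in range(n_bins):
        members = [r for r, assigned in zip(risks, bins) if assigned == b]
        result.append(sum(members) / len(members) / pop_mean)
    return result


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the simulation is repeatable
    scores = [log_risk_score(simulate_genotype(rng)) for _ in range(10000)]
    bins = stratify(scores)
    for b, rr in enumerate(mean_relative_risk_by_bin(scores, bins)):
        print(f"risk category {b}: mean relative risk {rr:.2f}")
```

With only a handful of modest‐effect variants the spread of relative risk across categories is small, which is in keeping with the statement above that individual‐level prediction is currently limited but should improve as more variants are found.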
Publication Date: 2012
Date Awarded: 2012
URI: http://hdl.handle.net/10044/1/11640
Department: Faculty of Medicine
Publisher: Imperial College London
Qualification Level: Other
Qualification Name: Doctor of Science (DSc)
Appears in Collections: DSc Awards



Items in Spiral are protected by copyright, with all rights reserved, unless otherwise indicated.
