Genome-wide association studies have found SNPs at 17q22 to be associated with breast cancer risk. To identify potential causal variants related to breast cancer risk, we performed a high-resolution fine-mapping analysis that involved genotyping 517 SNPs using a custom Illumina iSelect array (iCOGS), followed by imputation of genotypes for 3,134 SNPs, in more than 89,000 participants of European ancestry from the Breast Cancer Association Consortium (BCAC). We identified 28 highly correlated common variants, in a 53 kb region spanning two introns of the STXBP4 gene, that are strong candidates for driving breast cancer risk (lead SNP rs2787486: OR = 0.92; CI 0.90-0.94; P = 8.96 × 10⁻¹⁵) and are correlated with two previously reported risk-associated variants at this locus, SNPs rs6504950 (OR = 0.94, P = 2.04 × 10⁻⁹, r² = 0.73 with lead SNP) and rs1156287 (OR = 0.93, P = 3.41 × 10⁻¹¹, r² = 0.83 with lead SNP). Analyses indicate only one causal SNP in the region, and several enhancer elements targeting STXBP4 are located within the 53 kb association signal. Expression studies in breast tumor tissues found SNP rs2787486 to be associated with increased STXBP4 expression, suggesting this may be a target gene of this locus.
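As a rough consistency check on the lead-SNP statistics above, the odds ratio and its confidence interval can be converted back into an approximate Wald z-score and two-sided P value. The sketch below assumes the interval is a symmetric 95% interval on the log-odds scale (the abstract does not state the confidence level), and the function name `wald_from_or_ci` is hypothetical. Because the published inputs are rounded to two decimals, the result only approximates the reported P value.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate log-OR, standard error, z and two-sided P from an OR and its CI.

    Assumes a symmetric CI on the log-odds scale; z_crit = 1.959964
    corresponds to a 95% interval (an assumption, not stated in the abstract).
    """
    beta = math.log(odds_ratio)
    # The CI width on the log scale spans 2 * z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Rounded values reported for lead SNP rs2787486.
beta, se, z, p = wald_from_or_ci(0.92, 0.90, 0.94)
print(f"log-OR = {beta:.4f}, SE = {se:.4f}, z = {z:.2f}, P = {p:.2e}")
```

With these rounded inputs the z-score comes out near -7.5, which gives a P value within roughly an order of magnitude of the reported one. A gap of that size is expected when the standard error is reconstructed from two-decimal interval bounds.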
Genome-wide association studies (GWASs) have revealed increased breast cancer risk associated with multiple genetic variants at 5p12. Here, we report the fine mapping of this locus using data from 104,660 subjects from 50 case-control studies in the Breast Cancer Association Consortium (BCAC). With data for 3,365 genotyped and imputed SNPs across a 1 Mb region (positions 44,394,495-45,364,167; NCBI build 37), we found evidence for at least three independent signals: the strongest signal, consisting of a single SNP rs10941679, was associated with risk of estrogen-receptor-positive (ER+) breast cancer (per-g allele OR for ER+ = 1.15; 95% CI 1.13-1.18; p = 8.35 × 10⁻³⁰). After adjustment for rs10941679, we detected signal 2, consisting of 38 SNPs more strongly associated with ER-negative (ER-) breast cancer (lead SNP rs6864776: per-a allele OR for ER- = 1.10; 95% CI 1.05-1.14; conditional p = 1.44 × 10⁻¹²), and a single signal 3 SNP (rs200229088: per-t allele OR for ER+ = 1.12; 95% CI 1.09-1.15; conditional p = 1.12 × 10⁻⁵). Expression quantitative trait locus analysis in normal breast tissues and breast tumors showed that the g (risk) allele of rs10941679 was associated with increased expression of FGF10 and MRPS30. Functional assays demonstrated that SNP rs10941679 maps to an enhancer element that physically interacts with the FGF10 and MRPS30 promoter regions in breast cancer cell lines. FGF10 is an oncogene that binds to FGFR2 and is overexpressed in ∼10% of human breast cancers, whereas MRPS30 plays a key role in apoptosis. These data suggest that the strongest signal of association at 5p12 is mediated through coordinated activation of FGF10 and MRPS30, two candidate genes for breast cancer pathogenesis.
Genome-wide association studies have identified breast cancer risk variants in over 150 genomic regions, but the mechanisms underlying risk remain largely unknown. These regions were explored by combining association analysis with in silico genomic feature annotations. We defined 205 independent risk-associated signals, each with its set of credible causal variants. In parallel, we used a Bayesian approach (PAINTOR) that combines genetic association, linkage disequilibrium and enriched genomic features to determine variants with high posterior probabilities of being causal. Potentially causal variants were significantly over-represented in active gene regulatory regions and transcription factor binding sites. We applied our INQUISIT pipeline for prioritizing genes as targets of those potentially causal variants, using gene expression (expression quantitative trait loci), chromatin interaction and functional annotations. Known cancer drivers, transcription factors and genes in the developmental, apoptosis, immune system and DNA integrity checkpoint gene ontology pathways were over-represented among the highest-confidence target genes.
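To make the idea of a credible set concrete, the sketch below shows one common construction: rank the variants in a signal by posterior probability of causality and keep the smallest set whose cumulative probability reaches a chosen coverage. This is an illustration only and is not necessarily how the credible sets in this study were defined. The function name `credible_set`, the 95% coverage level, the variant identifiers and the posterior values are all hypothetical.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest set of variants whose posteriors sum to >= coverage.

    `posteriors` maps variant ID -> posterior probability of being causal
    within a single signal. Values are renormalized to sum to 1 first.
    """
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")
    # Rank variants from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return chosen


# Made-up posterior probabilities for five hypothetical variants in one signal.
example = {"var_a": 0.60, "var_b": 0.25, "var_c": 0.08, "var_d": 0.05, "var_e": 0.02}
selected = credible_set(example)
print(selected)
```

In this toy example the first three variants accumulate 0.93, so a fourth is needed to reach 95% coverage. Tighter posteriors, such as those sharpened by functional annotations, would shrink the set.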
We analyzed 3,872 common genetic variants across the ESR1 locus (encoding estrogen receptor α) in 118,816 subjects from three international consortia. We found evidence for at least five independent causal variants, each associated with different phenotype sets, including estrogen receptor (ER+ or ER-) and human ERBB2 (HER2+ or HER2-) tumor subtypes, mammographic density and tumor grade. The best candidate causal variants for ER- tumors lie in four separate enhancer elements, and their risk alleles reduce expression of ESR1, RMND1 and CCDC170, whereas the risk alleles of the strongest candidates for the remaining independent causal variant disrupt a silencer element and putatively increase ESR1 and RMND1 expression.