Editorial Open Access
Copyright ©2007 Baishideng Publishing Group Co., Limited. All rights reserved.
World J Gastroenterol. Nov 7, 2007; 13(41): 5421-5431
Published online Nov 7, 2007. doi: 10.3748/wjg.v13.i41.5421
Genetic epidemiology of primary sclerosing cholangitis
Tom H Karlsen, Erik Schrumpf, Kirsten Muri Boberg
Author contributions: All authors contributed equally to the work.
Correspondence to: Dr. Tom H Karlsen, Medical Department, Rikshospitalet-Radiumhospitalet Medical Center, N-0027 Oslo, Norway. t.h.karlsen@klinmed.uio.no
Telephone: +47-23074226 Fax: +47-23073510
Received: June 6, 2007
Revised: July 31, 2007
Accepted: August 14, 2007
Published online: November 7, 2007


The aetiology of primary sclerosing cholangitis (PSC) is not known. A more than 80-fold increased risk of PSC among first-degree relatives emphasizes the importance of genetic factors. Genetic associations within the human leukocyte antigen (HLA) complex on chromosome 6p21 were detected in PSC 25 years ago. Subsequent studies have substantiated beyond doubt that one or more genetic variants located within this genetic region are important. The true identities of these variants, however, remain to be identified. Several candidate genes at other chromosomal loci have also been investigated. However, according to strict criteria for what may be denominated a susceptibility gene in complex diseases, no such gene exists for PSC today. This review summarises present knowledge on the genetic susceptibility to PSC, as well as genetic associations with disease progression and clinical subsets of particular interest (inflammatory bowel disease and cholangiocarcinoma).

Key Words: Primary sclerosing cholangitis; Genetic associations; Human leukocyte antigens; Cholangiocarcinoma; Inflammatory bowel disease


Primary sclerosing cholangitis (PSC) is a chronic inflammatory condition of unknown aetiology, characterised by progressive strictures of the intra- and extrahepatic bile ducts and eventually liver cirrhosis and liver failure[1,2]. No effective medical treatment is currently available[3,4], and PSC is the major indication for liver transplantation in the Scandinavian countries as well as the fifth leading indication for liver transplantation in the United States[5,6]. Population-based studies of disease frequency are available from Norway, Great Britain and The United States[7-9], and indicate comparable incidence (0.9-1.3 per 100000/year) and prevalence (8.5-14.2 per 100000) rates for these populations. The prevalence of PSC is probably lower in Southern European and Asian populations[10]. In contrast to the female predominance of many autoimmune diseases, approximately 2/3 of the PSC patients are male[11]. Affected individuals are young (less than 40 years at time of diagnosis), and median survival from time of diagnosis by cholangiography to death or liver transplantation is approximately 12 years[11].

Up to 80% of the PSC patients of Northern European origin have concurrent inflammatory bowel disease (IBD)[10]. The frequency in Southern Europe and Asia is lower (around 50% and 35%, respectively)[12-14]. According to standard criteria[15], the IBD phenotype in PSC has mainly been classified as ulcerative colitis (UC), although an association with colonic Crohn's disease also exists[16,17]. The increased frequency of a variety of other autoimmune diseases (e.g., type 1 diabetes) among patients with PSC does not seem related to the increase in IBD[18]. There is also an increased risk of cancer among the patients with PSC, not only cholangiocarcinoma of the biliary tract (approximately 13%-14% in Scandinavia)[19,20], but also other gastrointestinal malignancies (i.e., pancreatic and colorectal cancer)[19]. The diagnosis of cholangiocarcinoma is difficult because the cholangiographic changes may look similar to those found in PSC without cholangiocarcinoma[21]. As a result, the cancer is often recognised at an advanced stage when treatment by liver transplantation does not improve survival[22].

Smoking is the only environmental factor known to influence PSC susceptibility and is associated with a reduced risk of the disease[23]. Several genetic risk factors, however, have been repeatedly described throughout the 25 years since they were first detected[24,25]. The present editorial aims to summarise present knowledge on statistical associations between genetic variants and risk of PSC or particular characteristics of PSC. In genetic epidemiology, disease characteristics under study are called phenotypes. Etymologically, the pheno-prefix refers to "visible" or "evident". Phenotypes, also referred to as traits, may be dichotomous (e.g., PSC/healthy) or quantitative (e.g., the level of alkaline phosphatase in a blood sample from a PSC patient). The clinical definition of a disease is primarily made to decide whether a particular treatment or follow-up may be indicated for a patient or not. This practical aspect means that PSC as a clinical "diagnosis" does not necessarily equal the ideal "phenotype" for genetic association studies. The disease phenotype in such studies should be as homogeneous as possible, simply because the presence of irrelevant phenotypes in a study population will reduce the strength of effects to be identified. The clinical phenotype of PSC is compound (Figure 1).

Figure 1 Primary sclerosing cholangitis (PSC) is a patchwork of different phenotypes in addition to the bile duct involvement. Most important are inflammatory bowel disease (IBD), malignancy and other autoimmune diseases. PSC is distinct from secondary sclerosing cholangitis (SSC).

In other diseases, susceptibility genes have been identified through genome-wide linkage scans followed by fine-mapping[26-28]. In PSC, the lack of families with affected sibling pairs has not allowed such studies to position susceptibility loci[28]. The search for PSC susceptibility genes has thus focused on plausible candidates with regard to function[25]. As a general basis for interpreting candidate gene association studies, an introduction to important concepts of such studies will be given, followed by a presentation and discussion of studies performed in PSC. We searched PubMed for relevant articles published up until the end of April 2007. We also reviewed the reference lists of identified articles, as well as those of major immunogenetics and hepatology conferences held over the last 2 years.


In genetic terms, PSC is considered a complex trait, meaning that polymorphisms in several genes along with environmental factors are required for disease development[27]. Heritability for a disease is measured by (a) concordance rates in monozygotic versus dizygotic twins and (b) relative risk in siblings of a patient (λs = prevalence among siblings divided by the general population prevalence). For monogenic disorders, λs ranges from several hundreds to several thousands, whereas values in complex traits are usually below 100. A strong genetic contribution to overall risk of PSC is supported by λs values of approximately 100[29], as compared with values of 15-35 for Crohn's disease and 6-9 for UC[30].
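As a worked illustration of the sibling relative risk defined above, the following minimal sketch computes λs from two prevalence figures. The population prevalence of 10 per 100000 lies within the range quoted earlier in this article; the sibling prevalence of 1% is a purely hypothetical value chosen so that the result matches the approximate λs of 100, and is not a measured figure.

```python
def sibling_relative_risk(sibling_prevalence: float,
                          population_prevalence: float) -> float:
    """Return lambda_s: prevalence among siblings of a patient divided by
    the prevalence in the general population."""
    if not 0 < population_prevalence <= 1:
        raise ValueError("population_prevalence must be in (0, 1]")
    if not 0 <= sibling_prevalence <= 1:
        raise ValueError("sibling_prevalence must be in [0, 1]")
    return sibling_prevalence / population_prevalence


# Illustrative inputs only (the sibling prevalence is an assumption).
population_prevalence = 10 / 100_000   # within the 8.5-14.2 per 100000 range
sibling_prevalence = 1 / 100           # hypothetical value

lambda_s = sibling_relative_risk(sibling_prevalence, population_prevalence)
print(f"lambda_s = {lambda_s:.0f}")
```

Because the denominator is so small for a rare disease, even a sibling prevalence of around 1% translates into a very large relative risk.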

Polymorphisms are genetic variants that have arisen from mutational events in DNA[31]. Conventionally, to be denominated a polymorphism, a mutant variant should occur at a frequency of > 0.01 in the general population. A particular nucleotide (or nucleotide sequence) at a polymorphism is defined as an allele. The combination of alleles on the two chromosomes is termed the genotype of the individual at that position. A distinct combination of two or more alleles of polymorphisms that occur together on the same chromosome is defined as a haplotype.

When a mutation arises in a chromosomal region, it does so on a background of particular DNA variants that are already present in the population, i.e., the mutation is linked to these surrounding alleles by the integrity of the DNA molecule. Over time, recombination tends to separate a mutant allele from the alleles of the surrounding DNA. At the population level, the positive association that remains between particular alleles at linked polymorphisms is called linkage disequilibrium (LD), meaning that these alleles occur more frequently together than would be expected from their population frequencies. Recombination ultimately leads to loss of LD unless there is a selective advantage of particular allele combinations.
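The statement that alleles in LD "occur more frequently together than would be expected from their population frequencies" can be made concrete with the standard pairwise LD measures D, D' and r². The sketch below is illustrative only: the function name and the example frequencies are assumptions, not data from any PSC study.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float):
    """Return (D, D', r^2) for two biallelic polymorphisms.

    p_ab : frequency of the haplotype carrying allele A and allele B
    p_a  : population frequency of allele A
    p_b  : population frequency of allele B
    """
    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two alleles were independent.
    d = p_ab - p_a * p_b

    # D' rescales D by its maximum possible magnitude given the allele
    # frequencies, so that complete LD gives |D'| = 1.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two alleles.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical example: both alleles have frequency 0.20, but they are
# found together on 15% of chromosomes instead of the expected 4%.
d, d_prime, r2 = linkage_disequilibrium(p_ab=0.15, p_a=0.20, p_b=0.20)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

With independent alleles (haplotype frequency equal to the product of the allele frequencies), all three measures are zero, which corresponds to the complete loss of LD described above.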

The relationship between disease phenotype and three of the genetic concepts described (polymorphisms, alleles and haplotypes) is the subject of genetic association studies. That is, the aim of genetic epidemiology is to identify alleles (or in diploid terms, genotypes) of polymorphisms that are associated with an increase or decrease in risk of disease or a particular characteristic of a disease. The advantage of LD is that not all polymorphisms in a genetic region have to be genotyped to detect an association. This is because the causative variant will reside on the same haplotypes as other polymorphisms and can be indirectly detected by typing for these. The disadvantage of LD is that it may be almost impossible to determine which of a series of alleles in LD on a haplotype is actually the causative variant. Most of the genetic variation (> 99%) in the human genome is believed to be without any phenotypic consequence[32].


Because of the low prevalence, a major limiting factor for statistical power in studies of PSC susceptibility genes is sample size. Figure 2 illustrates the statistical power as a function of the effect size (odds ratio; OR) and allele frequency of a genetic variant in the largest PSC population studied so far (n = 365)[33]. Two issues require mentioning. First, very weak effects (OR ≈ 1.0-1.3) are likely to be missed, even for populations of this size. Second, rare variants of importance for PSC susceptibility (allele frequency < 0.01) are likely to be missed unless the OR of the variant is very high (or low; ORs < 1 were not plotted for clarity).

Figure 2 Statistical power (α = 0.05) for different odds ratios and allele frequencies in a study of 365 patients and 365 controls, i.e., the number of alleles in each group is 2n = 730.
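The dependence of power on odds ratio and allele frequency can be approximated with a simple calculation. The sketch below assumes a two-sided comparison of allele frequencies between 730 case alleles and 730 control alleles using the normal approximation for two proportions. This is an assumption made for illustration and is not necessarily the method used to produce Figure 2; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_frequency(control_freq: float, odds_ratio: float) -> float:
    """Allele frequency among cases implied by the control frequency and
    an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def allelic_power(control_freq: float, odds_ratio: float,
                  n_alleles_per_group: int = 730, alpha: float = 0.05) -> float:
    """Approximate power to detect an allele frequency difference
    (normal approximation; the negligible opposite tail is ignored)."""
    if not 0 < control_freq < 1:
        raise ValueError("control_freq must be strictly between 0 and 1")
    n = n_alleles_per_group
    p0 = control_freq
    p1 = case_allele_frequency(p0, odds_ratio)
    p_bar = (p0 + p1) / 2

    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n)           # under no effect
    se_alt = sqrt(p0 * (1 - p0) / n + p1 * (1 - p1) / n)  # under the effect
    return NormalDist().cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


# Illustrative scenarios: a weak effect, a strong effect and a rare variant.
for freq, or_ in [(0.20, 1.2), (0.20, 2.0), (0.005, 2.0)]:
    print(f"allele frequency {freq}, OR {or_}: "
          f"power ~ {allelic_power(freq, or_):.2f}")
```

Under these assumptions, a common variant with OR = 2.0 is detected almost with certainty, whereas a weak effect (OR = 1.2) or a rare variant (frequency 0.005) has power well below 50%, in line with the two issues raised in the text.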

An important controversy regarding the prospects of mapping the genetic predisposition to complex diseases is not related to statistical power, but the possible complexity of allelic variation at a susceptibility locus. Supporters of the "common-disease/common-variant" hypothesis argue that common diseases arise due to polymorphisms that are common (i.e., allele frequency > 0.10[34]) in the background population. Supporters of the "multiple rare variants" hypothesis point to the complexity observed at susceptibility loci in monogenic disorders, where multiple rare alleles define a similar phenotype (e.g., the hundreds of disease causing alleles at the cystic fibrosis transmembrane-conductance regulator locus)[35]. Possibly, susceptibility genes in complex diseases that are defined by multiple rare variants cannot be identified using regular LD based approaches[36]. Although PSC is relatively rare, the main HLA haplotypes that confer risk are relatively common (e.g., the frequency of the PSC associated ancestral HLA haplotype 8.1 is > 0.10 in Scandinavia[37]).

The abundance of false positive genetic association studies (i.e., type I statistical errors) represents a problem of legitimacy for this type of study design[38]. Simply using a P-value < 0.05 as "evidence" to distinguish between a "positive" and "negative" finding in these studies can be questioned[39]. The problem is partly related to the many statistical tests performed in these studies. The so-called Bonferroni correction (multiplying P-values by the number of comparisons that have been performed) is the most widely accepted strategy to account for this problem.
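The Bonferroni correction described above amounts to a one-line calculation. The following minimal sketch uses hypothetical P-values, not results from any study; corrected values are capped at 1 because a probability cannot exceed 1.

```python
def bonferroni_adjust(p_values):
    """Multiply each P-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical P-values from four association tests in one study.
raw = [0.001, 0.02, 0.04, 0.30]
adjusted = bonferroni_adjust(raw)

for p, p_adj in zip(raw, adjusted):
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"raw P = {p:.3f}  corrected P = {p_adj:.3f}  ({verdict})")
```

In this example, three of the four uncorrected P-values fall below 0.05, but only one remains below that threshold after correction.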

The Bonferroni approach has limitations. Due to the many tests that are theoretically possible throughout the genome, it can be argued that conservative significance levels of 10⁻⁵ or even 10⁻⁸ should be used for all tests[38,40]. Achieving such significance levels would require patient collections simply not available for rare diseases like PSC. The most recent proposal is that so-called permutation testing (in Latin, "permutare" means "change completely") within a dataset is the preferable strategy to take account of multiple testing[41]. In permutation tests, case/control assignment is shuffled randomly using a computer and tests are run over and over again to count how often the permuted dataset achieves the effect observed in the correctly ordered dataset. If the permuted dataset achieves an effect equal to or stronger than that observed in the original dataset in 500 out of 10000 analyses, this means that the probability of a type I error for a finding is 5%.
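The permutation procedure can be sketched for a single polymorphism as follows. The test statistic (absolute difference in carrier frequency between cases and controls), the function name and the toy data are all assumptions made for illustration; published analyses typically permute across many markers at once.

```python
import random


def permutation_p_value(carrier, is_case, n_permutations=10_000, seed=0):
    """Empirical P-value: the fraction of label shuffles whose effect is
    equal to or stronger than the effect in the correctly ordered data.

    carrier : list of 0/1 flags (individual carries the allele or not)
    is_case : list of True/False flags (patient or control)
    """
    rng = random.Random(seed)  # fixed seed for reproducibility

    def effect(labels):
        cases = [c for c, lab in zip(carrier, labels) if lab]
        controls = [c for c, lab in zip(carrier, labels) if not lab]
        return abs(sum(cases) / len(cases) - sum(controls) / len(controls))

    observed = effect(is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # reassign case/control status at random
        if effect(labels) >= observed - 1e-12:  # tolerance for float ties
            hits += 1
    return hits / n_permutations


# Toy data: 30 of 40 cases and 10 of 40 controls carry the allele.
carrier = [1] * 30 + [0] * 10 + [1] * 10 + [0] * 30
is_case = [True] * 40 + [False] * 40
print(permutation_p_value(carrier, is_case, n_permutations=2_000))
```

A returned value of 0.05 corresponds to the situation described above, in which 500 out of 10000 shuffled datasets match or exceed the observed effect.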

The problem of statistical significance in genetic association studies philosophically relates to the problem of causality for which criteria relevant to modern medicine were proposed by Sir Austen Bradford Hill in a classic essay in 1965[42]. These criteria point to factors in addition to the probability from statistical association tests (e.g., biological plausibility) that are required for a causal relationship to be established. This is also argued for in so-called Bayesian statistics, where the prior probability of a genetic variant to be associated (e.g., non-synonymous polymorphism in a gene whose function is relevant to the disease phenotype), is accounted for when deciding on the posterior probability of whether or not a finding is valid[38]. In sum, circumstantial evidence (from functional studies or mouse models) is required to support findings if a genetic variant should be considered causative in terms of contributing to a disease phenotype[28], whatever statistical evidence is available.
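The Bayesian argument can be illustrated with a direct application of Bayes' rule: the posterior probability that a "significant" finding reflects a true association depends on the prior probability, the power of the study and the significance threshold. The sketch below is a simplified illustration, and the prior probabilities used are hypothetical values rather than estimates for any particular gene.

```python
def posterior_probability_true(prior: float, power: float,
                               alpha: float = 0.05) -> float:
    """Probability that a significant association is real, by Bayes' rule.

    prior : prior probability that the variant is truly associated
    power : probability of a significant result if the association is real
    alpha : probability of a significant result if it is not (type I error)
    """
    true_positive = power * prior
    false_positive = alpha * (1 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical priors: an arbitrary variant versus a strong functional
# candidate, both tested with 80% power at alpha = 0.05.
for label, prior in [("arbitrary variant", 0.01),
                     ("plausible candidate", 0.25)]:
    post = posterior_probability_true(prior, power=0.8)
    print(f"{label}: prior {prior:.2f} -> posterior {post:.2f}")
```

With the same P-value threshold, a finding for a variant with a low prior probability is far more likely to be a false positive than a finding for a biologically plausible candidate, which is why supporting functional evidence matters.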


The HLA complex stretches across 7.6 million base pairs (bp) of DNA on the short arm of chromosome 6 and contains 252 expressed protein-coding genes, of which 28% are potentially related to immunological functions[43]. Throughout evolution of this genetic region[44], duplications have led to several gene clusters containing genes of similar function (Figure 3)[43]. HLA class I molecules (i.e., HLA-A, -B and -C) are expressed on all nucleated cells in the body and present intracellular/endogenous antigens to CD8+ T-lymphocytes. HLA class I molecules also serve as ligands for inhibitory killer immunoglobulin-like receptors (KIRs) on natural killer (NK) cells and γδ T-lymphocytes[45,46]. HLA class II molecules are expressed on antigen presenting cells (e.g., macrophages and dendritic cells) and present extracellular/exogenous antigens to CD4+ T-lymphocytes[45].

Figure 3 Schematic outline of the HLA complex on chromosome 6. Distances are arbitrary. By convention, the extended HLA complex stretches from the centromeric border of the HLA class II loci (HLA-DP) to the telomeric limit of the histone gene cluster more than 4 million bp from HLA-A[43,120]. Centromeric to the HLA-DQ loci, a region with intense recombination can be found ("recombination hot-spot")[132].

Sequence-based HLA nomenclature was established in 1987[45]. The locus name is followed by an asterisk and two pairs of digits. The first pair of digits denominates the main type and is often similar to the serological type (e.g., DRB1*03 is the same as serological DR3, but DRB1*13 is only one of the DR6 alleles). The second pair of digits denominates the subtype (e.g., DRB1*0301 and DRB1*1301). Further definition is possible, since null alleles are suffixed by "N", and polymorphisms that do not alter the amino acid sequence of the peptide binding groove give rise to the fifth, sixth and seventh digits. As a result, a complete sequence-based HLA allele name represents the haplotype of all alleles at all polymorphisms within the HLA gene at that chromosome.
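The naming scheme just described can be expressed as a small parser. This is a minimal sketch that handles only the form outlined in the text (locus, asterisk, two-digit main type, optional two-digit subtype, optional further digits and an optional "N" suffix); the class and function names are hypothetical, and names that follow other conventions (for example MICA*008) are deliberately rejected.

```python
import re
from typing import NamedTuple, Optional


class HlaAllele(NamedTuple):
    locus: str                   # e.g. "DRB1"
    main_type: str               # first pair of digits, e.g. "03"
    subtype: Optional[str]       # second pair of digits, e.g. "01"
    extra_digits: Optional[str]  # fifth to seventh digits, if present
    is_null: bool                # True when the name ends in "N"


_PATTERN = re.compile(
    r"(?:HLA-)?(?P<locus>[A-Z]+\d*)\*(?P<main>\d{2})"
    r"(?:(?P<subtype>\d{2})(?P<extra>\d{1,3})?)?(?P<null>N)?"
)


def parse_hla_allele(name: str) -> HlaAllele:
    """Split an allele name such as 'DRB1*0301' into its components."""
    match = _PATTERN.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a recognised allele name: {name!r}")
    return HlaAllele(
        locus=match["locus"],
        main_type=match["main"],
        subtype=match["subtype"],
        extra_digits=match["extra"],
        is_null=match["null"] is not None,
    )


for name in ["DRB1*0301", "DRB1*13", "HLA-B*0801"]:
    print(name, "->", parse_hla_allele(name))
```

Parsing "DRB1*13" yields a main type without a subtype, mirroring the point that a two-digit name identifies a group of alleles rather than a single sequence.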

LD between alleles at the HLA class I and II loci defines ancestral HLA haplotypes (AHs), which are named after the HLA-B allele they contain (e.g., the most common haplotype with HLA-B*08 is called AH8.1)[44]. Alleles of other genes are in LD with these ancestral haplotypes, and the co-occurrence of particular alleles across the entire HLA complex on one chromosome is called an extended HLA haplotype[47]. At the population level, the degree of conservation varies between different extended HLA haplotypes[48]. As examples of this phenomenon, an extended HLA haplotype with the HLA-B*08 and DRB1*0301 alleles (i.e., the AH8.1) is remarkably conserved in the Northern European population, whereas haplotypes carrying DRB1*04 alleles are considerably less conserved and may not even qualify for the denomination "extended haplotypes"[49].

An HLA association in PSC was first identified for HLA-B8 (i.e., HLA-B*0801) and DR3 (i.e., DRB1*0301)[24,50]. Later studies have verified that PSC associations exist also for the other alleles of the AH8.1 (the HLA-A1 allele[51], the HLA-C7 allele[52], the major histocompatibility complex class I chain-related A (MICA) *008/5.1 allele[53,54], and the tumour necrosis factor alpha (TNFα) promoter -308 A allele[55,56]). This haplotype is associated with a wide range of autoimmune diseases[57,58]. A cross-European study (Norway, Sweden, Great Britain, Italy and Spain) concluded that a consistent, positive HLA class II association in PSC probably exists also for a haplotype that carries the DR6 (i.e., DRB1*1301) allele[37]. In individuals negative for DR3 and DR6, an association with haplotypes that carry the DR2 (i.e., DRB1*1501) allele can be found. Negative associations with HLA class II alleles have been reported for the DR4, DR7 and DR11 alleles[37,59,60], although primarily in populations of Northern European origin[56]. In Southern Europe, the picture is even more complex, since the DR4 allele seems to be consistently in LD with a predisposing variant in Italy[37,56], whereas a protective effect is noted in Spain[37].

Due to strong LD, an important question in HLA genetics is whether genetic associations are due to variation in the HLA class I or II genes (meaning that they arise because the patients are able to present particular antigens to the T-cell receptor)[61], or due to variation in neighbouring genes[62]. There is some degree of amino acid sequence similarity between several of the PSC-associated HLA class II polypeptide variants[59,63]. However, no consistency has been found regarding these similarities[59]. The proposal of leucine at position 38 of the DRβ polypeptide as a critical determinant for PSC susceptibility relies heavily on the strong DRB3*0101 association in Northern European populations[63]. An early suggestion that a common denominator between haplotypes with the DRB1*0301 and DRB1*1301 alleles could be the DRB3*0101 allele (serologically DRw52a) was later withdrawn[64,65]. Another study found that the DRB1*1301-DRB3*0202 haplotype association is as strong as the DRB1*1301-DRB3*0101 association[37]. Taken together, the most interesting proposal of a single amino acid position in defining risk of PSC may rather relate to a protective effect in carriers of proline at position 55 of the DQβ polypeptide, which is common for DQ3 alleles known to be in LD with the protective DR4, DR7 and DR11 alleles[59]. However, no consistent risk allele is defined by this position[59], and to what extent the HLA class II molecules are of primary importance in the PSC pathogenesis should probably not be concluded based on present evidence.

The PSC-associated MICA*008/5.1 allele has been proposed as a common denominator between the PSC-associated A*01-C*07-B*08-DRB1*0301-DQB1*0201 and A*03-C*07-B*07-DRB1*1501-DQB1*0602 haplotypes[53,59]. MICA functions as a ligand for the activating NKG2D receptor on NK cells[66]. It was recently recognised that the two risk haplotypes in question share alleles not only at MICA, but also at the neighbouring HLA-B and -C loci, when these are defined according to the KIR binding properties of the HLA class I molecules[67]. The PSC-associated HLA-B and -C KIR ligand genotypes may result in decreased inhibition of NK cells and several subsets of T-lymphocytes that express KIRs[46,68]. Such combinations of KIR and HLA class I ligand variants have been shown to increase susceptibility to other autoimmune diseases[46]. How the PSC-associated MICA*008/5.1 allele may cause disease is not known. This allele is also associated with an increased risk of other autoimmune conditions[69,70], and may thus also result in an increased activity of cells expressing the NKG2D receptor, acting in synergy with the loss of inhibition resulting from the PSC-associated HLA class I ligand genotypes. The fact that the MICA 5.1 allele was recently shown to confer protection against cholangiocarcinoma is in line with an activating effect[71]. Some studies report an increased frequency of NK cells in the portal infiltrate of patients with PSC when compared with other liver diseases[72,73], and also in the intestinal mucosa of patients with PSC without IBD compared with IBD patients without liver disease[74]. Taken together with the genetic findings in this region of the HLA complex (Figure 3), further studies on the role of these cells in PSC seem warranted.

In sum, the HLA association in PSC is likely to be complex. Multiple risk variants may exist[25], some of which may be associated not only with PSC, but autoimmunity in general.


Summarising the published genetic association studies in PSC, it seems proven beyond doubt that one or more genetic variants located within the HLA complex are important. The true identities of these variants, as discussed above, are not known. The situation is even less clear with regard to other susceptibility loci. Given the large number of protein coding genes in the human genome (25000-35000)[32], selecting candidate genes for association studies is an extremely difficult task. According to strict criteria for what may be denominated a susceptibility gene in complex diseases (consistent statistical evidence, functional consequence of identified mutation, relevant tissue expression, etc.)[28], no such gene exists for PSC. A summary of studies performed is given in Table 1. So far, most attention has been given to genes known to be of importance in other autoimmune diseases. The association between PSC and IBD has also inspired some of the studies, as well as the observation of PSC-like changes in cystic fibrosis[75].

Table 1 Candidate gene studies performed in PSC.
Gene | Chromosome | N (PSC) | Primary finding | Reference | Replication finding | Reference

Two of the negative findings are of particular interest and will be discussed in greater detail. First, studies in limited populations (n < 50) have pointed to a non-significant increase of particular multidrug resistance gene 3 (MDR3) variants among PSC patients as compared with healthy controls[86,87]. Knock-out mice for this phospholipid transporter gene (called mdr2 in mice) spontaneously develop hepatic lesions resembling PSC[92], possibly due to loss of protection of the biliary epithelium from toxic bile acids. Second, it cannot be formally ruled out that the 32 bp deletion of the chemokine receptor 5 (CCR5) gene and the E/E genotype of the K469E SNP in the intercellular adhesion molecule 1 (ICAM-1) gene may confer population-specific effects[80,81,93]. Both genes are plausible candidate genes in PSC. CCR5 may be involved in the recruitment of intestinally activated lymphocytes via portal expression of CCR5 ligands (e.g., the macrophage inflammatory protein-1α and β), and ICAM-1 may play a similar role in recruiting leukocytes to an inflamed liver by interacting with the β2-integrin ligand. The negative findings in the replication series referred to in Table 1 make it unlikely that genetic variants of these receptors are of primary importance in the pathogenesis of PSC. The receptors may, however, still be involved in the disease process along with other CCRs and adhesion molecules [e.g., CCR9 and the mucosal addressin cell adhesion molecule 1 (MAdCAM-1)[94,95]].


The most prominent features of PSC along with the biliary changes are inflammatory bowel disease, cholangiocarcinoma and other autoimmune diseases (Figure 1).

The increased frequency of autoimmune diseases among patients with PSC is possibly due to the increased frequency of the AH8.1 among the patients[58,96]. Similarly, an increased frequency of IBD risk alleles among patients with PSC could contribute to the co-occurrence of these two phenotypes. Several IBD susceptibility genes have been identified during the last 6 years through the application of genome-wide linkage screens and subsequent fine-mapping approaches[26]. To determine if the high frequency of IBD among patients with PSC could be due to genetic risk factors shared with IBD in general, we recently genotyped key polymorphisms of known IBD susceptibility genes in a large cohort of Scandinavian PSC patients[97]. The following genes were studied: caspase activating recruitment domain 15 (CARD15), toll-like receptor 4 (TLR-4), caspase activating recruitment domain 4 (CARD4), solute carrier family 22, member 4 and 5 (SLC22A4 and SLC22A5), Drosophila discs large homolog 5 (DLG5) and multidrug resistance gene 1 (MDR1)[26,98]. No significant PSC associations were detected for any of the investigated polymorphisms[97]. These negative findings add to notions that the IBD phenotype in PSC may be a "third" IBD phenotype[99], possibly distinct from UC and Crohn's disease not only in clinical presentation, but also with regard to genetic susceptibility.

It is of interest to know whether genetic associations detected in PSC may be of particular importance for the IBD phenotype among the PSC patients or patients with IBD in general. In a recent study of HLA alleles in PSC and UC patients of the same ethnicity[100], the only parallel association detected was a protective effect of the DRB1*0404 allele, more pronounced among the PSC patients than among the patients with UC without liver disease. No association with any of the main PSC risk alleles (DRB1*0301, DRB1*1301 or DRB1*1501) was found among the regular UC patients. Interestingly, a non-significant trend towards a higher frequency of the DRB1*1501 allele was noted among the patients with PSC and concurrent IBD compared with PSC patients without IBD, and the possibility should be held open that this HLA haplotype may harbour genetic variants of particular importance for the IBD phenotype in PSC. A similar notion can be made with regard to the MMP3 5A allele association detected by Satsangi et al[79]. Although the replication study by Wiencke et al[78] failed to confirm an overall association with PSC susceptibility, a significant association was evident when PSC patients with UC were compared with UC patients without liver disease.

The study by Wiencke et al[78] also detected a possible association between cholangiocarcinoma and the MMP1 1G allele. Although the number of patients with cholangiocarcinoma in this series was too small for conclusive statistics to be performed (n = 15), the 100% occurrence of this allele among the cholangiocarcinoma patients warrants future replication attempts in other study populations. Recently, a highly significant association between polymorphisms in the NKG2D gene and cholangiocarcinoma in PSC was detected[71]. Previous studies have highlighted the importance of this activating NK cell receptor in protection against other cancer types[66]. Persistent exposure to effector molecules of inflammatory pathways (e.g., IL-6[101]), along with chronic cholestasis[102], is probably important for the malignant transformation of cholangiocytes. The study by Melum et al[33] points to the possible role of NK cell activity in protection against neoplastic cells. Polymorphisms of the NKG2D gene along with other parameters may also prove important in identifying PSC patients at a particularly low risk of developing cholangiocarcinoma.


There is an increasing interest in so-called "modifier genes" in complex diseases (as compared with "susceptibility genes"), initiated by the recognition of the influence of such genes on disease expression (e.g., severity) in monogenic disorders like cystic fibrosis and haemochromatosis[103-105]. Modifier genes may point to biochemical and physiological systems of relevance to prognosis and are therefore of great clinical interest. Although PSC should be considered a progressive condition culminating in death or liver transplantation in most cases[106], the clinical course for each individual patient varies considerably[107,108]. In terms of disease course, indicators of PSC severity (e.g., portal hypertension and need for liver transplantation) are more likely to represent a particular disease stage than to serve as valid measures of disease progression. The most precise strategy for investigating effects of genotypes on disease course in PSC is thus to compare absolute survival time (defined as time from diagnosis until death or liver transplantation) using Kaplan-Meier analyses, or to calculate the relative risk for death and/or liver transplantation from Cox regressions[109,110].
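For readers unfamiliar with the Kaplan-Meier method, the following minimal sketch shows how the survival curve is computed from follow-up times, with death or liver transplantation as the event and patients still alive without transplantation treated as censored. The follow-up data are invented for illustration, and the function is a bare-bones estimator without confidence intervals or group comparison.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability), ...] at each event time.

    times  : follow-up time from diagnosis for each patient (e.g. years)
    events : 1 if death or liver transplantation occurred at that time,
             0 if the patient was censored (event-free at last follow-up)
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Invented follow-up data for six patients (years, event flag).
times = [2, 4, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0, 1]
for t, s in kaplan_meier(times, events):
    print(f"year {t}: estimated survival {s:.3f}")
```

Comparing genotype groups then amounts to computing one such curve per group; a formal comparison would additionally require a log-rank test or a Cox regression, which are outside the scope of this sketch.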

We have recently observed that genetic variants of the steroid and xenobiotic receptor (SXR) are associated with a more aggressive disease course in PSC[110]. The SXR is a ligand-dependent transcription factor known to mediate protection against bile acid-induced liver injury in cholestatic animal models[111,112]. In this perspective, our data may suggest that the activity of bile acid detoxification systems could be of importance for disease progression in PSC. Interestingly, the SXR ligand rifampicin has been used in the treatment of cholestatic pruritus[113], and it has also been shown that ursodeoxycholic acid is able to activate SXR in human hepatocytes[114]. However, the SXR may also influence inflammatory pathways via the pro-inflammatory transcription factor nuclear factor kappa B (NF-κB)[115], as well as liver fibrogenesis and thus cirrhosis via direct effects on hepatic stellate cells and Kupffer cells[116]. Further studies are needed to clarify the functional consequences of various polymorphisms of the SXR gene in patients with PSC.

The SXR variants associated with death or liver transplantation in our study were not associated with PSC susceptibility[110]. However, modifier effects have also been observed for some of the disease-associated variants in the HLA complex. The first observation was made by Gow et al[117], who described an unusually aggressive disease progression in four patients carrying the DR4 allele. Later, Boberg et al[109] found that DR4 positive patients have an increased risk of cholangiocarcinoma, but do not formally experience an accelerated disease progression. In this study, an increased risk of death or liver transplantation was observed in patients heterozygous for the DR3-DQ2 haplotype. As long as the causative variants along the HLA haplotypes in question have not been identified, one can only hypothesize upon a biological explanation for these observations. Given the complexity of the HLA associations in PSC, it is even possible that the variants within this region that are important for disease progression differ from those primarily important for disease susceptibility. However, for the same reasons it has been difficult to pinpoint susceptibility genes in this region (strong LD, multiple genes of immunological relevance, etc.), such modifier genes may prove hard to identify conclusively.


Although several important findings have been made during the past 25 years since the first genetic association study in PSC was performed[24], PSC remains an enigmatic disease and future studies are warranted. With an ever increasing availability of methods for efficient genotyping of polymorphisms[118], a critical limitation for such studies in PSC is the availability of well-characterised patient materials. Collaborative efforts will be necessary to achieve patient collections required for detecting the modest effects (Figure 2), as well as for replicating results of uncertain validity[33]. Such collaborations are now being undertaken in other diseases[119], and have successfully aided in clarifying genetic associations found in PSC[37].

In terms of future research strategies, several proposals can be made. First, dissection of the widely replicated HLA-associated susceptibility to PSC should be considered a priority. Detailed maps of genetic markers in this region are now available[120]. It is anticipated that the systematic application of such marker maps in populations of an appropriate size may lead to the identification of true, disease causing variants in this difficult region[62].

Second, some biological pathways are pointed to by existing findings (e.g., the possible importance of bile acid homeostasis in influencing disease progression), and further candidate gene studies of critical components of these systems may identify additional risk factors. There is increasing awareness of the importance of interaction between polymorphisms in functionally related genes in complex diseases, i.e., epistasis[121,122]. In some cases, epistatic considerations have proven necessary for the detection of effects from genetic variation on a phenotype of interest[123,124]. These observations have implications for study design in future candidate gene studies in PSC. Polymorphisms not only in single genes, but in relevant panels of several genes encoding proteins with closely related functions, should be investigated.

Finally, two recent advances in the genetic research field now make genome-wide studies feasible also for case-control materials. First, the human haplotype map project (HAPMAP) was recently completed[125]. In the project, 3.9 million SNPs have been genotyped in families of three different ethnicities (at the time of writing). Results from the project enable researchers worldwide to efficiently select SNPs throughout the genome that are likely to capture genetic variation of interest to a project[126,127]. Second, although costs are high, genotyping technology now allows for the typing of hundreds of thousands of SNPs simultaneously in the same DNA sample[118]. Emerging reports provide proof-of-concept for genome-wide case-control studies[128,129]. However, there are still statistical problems to be solved regarding the many tests performed and risk of false positive results[130]. As evident from Figure 2, only strong effects may be detectable, and prospects may not yet justify the costs. However, sooner or later genome-wide studies seem warranted, also in PSC. Possibly, PSC susceptibility genes will be identified that would otherwise never have been included in hypothesis-driven candidate gene studies of the type performed so far[131].


