Published online Nov 7, 2014. doi: 10.3748/wjg.v20.i41.15374
Revised: June 30, 2014
Accepted: July 24, 2014
Processing time: 195 Days and 19.9 Hours
AIM: To validate the Montreal classification system for Crohn’s disease (CD) and ulcerative colitis (UC) within the Netherlands.
METHODS: A selection of 20 de-identified medical records with an appropriate representation of the inflammatory bowel disease (IBD) subphenotypes was scored by 30 observers with different professions (gastroenterologists specialised in IBD, gastroenterologists in training and IBD nurses) and different levels of experience with IBD patient care. Patients were classified according to the Montreal classification. In addition, participants were asked to score extra-intestinal manifestations (EIM) and disease severity in CD based on their clinical judgment. The inter-observer agreement was calculated as the percentage of correct answers (answers identical to the “expert evaluation”) and as Fleiss’ kappa (κ). Kappa cut-offs: < 0.4, poor; 0.41-0.6, moderate; 0.61-0.8, good; > 0.8, excellent.
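The two agreement measures named above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the toy ratings and the handling of the boundary value κ = 0.4 (treated here as "poor") are assumptions made for illustration only. It uses the standard Fleiss' kappa formula for a fixed number of observers per record.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for `ratings`: one list of labels per record,
    one label per observer (same number of observers for every record)."""
    n_records = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j] = number of observers assigning record i to category j
    counts = [[Counter(row)[c] for c in categories] for row in ratings]
    # Observed agreement per record, then averaged over records
    p_i = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_records
    # Chance agreement from the overall proportion of each category
    p_j = [
        sum(row[j] for row in counts) / (n_records * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)
    if p_e == 1.0:
        # Every observer used the same single category: kappa is undefined.
        raise ValueError("kappa undefined when only one category is used")
    return (p_bar - p_e) / (1 - p_e)


def interpret_kappa(kappa):
    """Map kappa to the cut-offs in the abstract.
    Assumption: the boundary value 0.4 itself is treated as 'poor'."""
    if kappa <= 0.4:
        return "poor"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "good"
    return "excellent"


def percent_correct(ratings, expert):
    """Percentage of observer answers identical to the expert evaluation."""
    total = sum(len(row) for row in ratings)
    correct = sum(
        sum(1 for label in row if label == gold)
        for row, gold in zip(ratings, expert)
    )
    return 100.0 * correct / total


# Hypothetical toy data: 2 records, 3 observers, diagnosis CD or UC.
toy_ratings = [["CD", "CD", "UC"], ["CD", "UC", "UC"]]
toy_expert = ["CD", "UC"]
toy_kappa = fleiss_kappa(toy_ratings, ["CD", "UC"])
toy_percent = percent_correct(toy_ratings, toy_expert)
```

In this sketch the percentage of correct answers depends on the expert evaluation, whereas κ depends only on how the observers agree with each other, so the two measures can diverge for the same item.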
RESULTS: The inter-observer agreement was excellent for diagnosis (κ = 0.96), perianal disease (κ = 0.92) and disease location in CD (κ = 0.82), and good for age of onset (κ = 0.67), upper gastrointestinal disease (κ = 0.62), disease behaviour in CD (κ = 0.79) and disease extent in UC (κ = 0.65). Agreement on disease severity in UC was poor (κ = 0.23). The additional items showed good inter-observer agreement for EIM (κ = 0.68) and moderate agreement for disease severity in CD (κ = 0.44). The percentages of correct answers reflected the inter-observer agreement well across all Montreal items (> 80%), except for disease severity (48%-74%). IBD nurses were significantly worse at scoring upper gastrointestinal disease in CD than gastroenterologists (P = 0.008) and gastroenterologists in training (P = 0.040). Observers with less than 10 years of experience with IBD patient care were significantly better at scoring UC severity than observers with 10-20 years (P = 0.003) and more than 20 years (P = 0.003) of experience. Observers with 10-20 years of experience were significantly better at scoring upper gastrointestinal disease in CD than observers with less than 10 years (P = 0.007) and more than 20 years (P = 0.007) of experience.
CONCLUSION: We found a good to excellent inter-observer agreement for all Montreal items except for disease severity in UC (poor).
Core tip: According to our study, the Montreal classification is a reliable system for classifying phenotypes in inflammatory bowel disease, except for disease severity in ulcerative colitis. The inter-observer agreement for scoring Crohn’s disease severity was moderate. This highlights the need for accurate medical reporting and the use of additional parameters to define and classify disease severity. Such alterations are necessary to ensure high-quality data in multicentre prospective data collections.