1
Sakamaki K, Morita Y, Iba K, Kamiura T, Yoshida S, Ogawa N, Suganami H, Tsuchiya S, Fukimbara S. Multiplicity Adjustment and Sample Size Calculation in Clinical Trials with Multiple Endpoints: An Industry Survey of Current Practices in Japan. Ther Innov Regul Sci 2020; 54:1097-1105. [PMID: 32030692 DOI: 10.1007/s43441-020-00126-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2019] [Accepted: 01/24/2020] [Indexed: 12/01/2022]
Abstract
BACKGROUND Two issues in clinical trials with multiple endpoints were surveyed: (1) the terminology of multiple endpoints, the relationship between rare events and endpoints, and the differences in multiplicity adjustment between regions, and (2) the current practice on multiplicity adjustment and sample size calculation. This article summarizes the results of the survey on the second issue. METHODS Eligible trials for this survey fulfilled the following conditions: (1) confirmatory phase 3 trial; (2) use of multiple primary endpoints, co-primary endpoints, key secondary endpoint(s) or composite endpoint(s); (3) inclusion of Japanese participants; and (4) protocols created in 2010 or later. The survey was conducted at member companies of the Japan Pharmaceutical Manufacturers Association from October 2017 to November 2017. RESULTS Useable responses were obtained from 78 trials in 13 companies based in Japan and 9 companies based in other countries. The Bonferroni procedure was mostly used in clinical trials with multiple primary endpoints, while multiple testing procedures that consider a hierarchy of endpoints or a structure of hypotheses were used in clinical trials with key secondary endpoint(s). In sample size calculation, we can consider the probability of study success, such as the probability of statistical significance in at least one comparison of primary endpoints; however, other probabilities were also considered. This survey reveals that multiplicity adjustment and the correlation of endpoints were not always considered in sample size calculation. CONCLUSIONS In clinical trials with multiple endpoints, clinical importance was considered when determining multiple testing procedures. Challenges remain with the definition of power, the consideration of multiple testing procedures and the correlation between endpoints in sample size calculation.
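The "probability of study success" mentioned in the abstract can be made concrete with a small sketch. The code below is an illustration only, not the survey's method: the function names are hypothetical, the labels "disjunctive" (at least one endpoint significant) and "conjunctive" (all endpoints significant) are common terms that do not appear in the abstract, and the endpoints are assumed to be independent, which is precisely the simplification the abstract warns is often left unexamined.

```python
from math import prod


def disjunctive_power(marginal_powers):
    """P(at least one endpoint is significant), assuming independent endpoints."""
    return 1.0 - prod(1.0 - p for p in marginal_powers)


def conjunctive_power(marginal_powers):
    """P(all endpoints are significant), assuming independent endpoints."""
    return prod(marginal_powers)


# Hypothetical marginal powers for two primary endpoints, each assumed to be
# already evaluated at its multiplicity-adjusted significance level.
powers = [0.80, 0.80]
at_least_one = disjunctive_power(powers)
all_endpoints = conjunctive_power(powers)
print(f"at least one: {at_least_one:.2f}, all: {all_endpoints:.2f}")
```

With these illustrative inputs the two definitions give 0.96 and 0.64, which shows why the chosen definition of power matters for the sample size. With correlated endpoints the values would differ and would need a joint distribution rather than a simple product.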
Affiliation(s)
- Kentaro Sakamaki
  - Department of Biostatistics and Bioinformatics, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8655, Japan.
  - Center for Data Science, Yokohama City University, Yokohama, Japan.
- Yusuke Morita
  - Data Science Expert Committee, Drug Evaluation Committee, Japan Pharmaceutical Manufacturers Association, Tokyo, Japan
  - Clinical Data Science and Affairs, Kyorin Pharmaceutical Co., Ltd., Tokyo, Japan
- Katsuhiro Iba
  - Data Science Expert Committee, Drug Evaluation Committee, Japan Pharmaceutical Manufacturers Association, Tokyo, Japan
  - Department of Biometrics, Otsuka Pharmaceutical Co., Ltd., Tokyo, Japan
- Toshifumi Kamiura
  - Data Science Expert Committee, Drug Evaluation Committee, Japan Pharmaceutical Manufacturers Association, Tokyo, Japan
  - Data Science Department, Nippon Shinyaku Co., Ltd., Kyoto, Japan
- Seitaro Yoshida
  - Data Science Expert Committee, Drug Evaluation Committee, Japan Pharmaceutical Manufacturers Association, Tokyo, Japan
  - Clinical Information & Intelligence Department, Chugai Pharmaceutical Co., Ltd., Tokyo, Japan
- Naoyuki Ogawa
  - Data Science Expert Committee, Drug Evaluation Committee, Japan Pharmaceutical Manufacturers Association, Tokyo, Japan
  - Clinical Development Department, Sanwa Kagaku Kenkyusho Co., Ltd., Nagoya, Japan
- Hideki Suganami
  - Data Science Expert Committee, Drug Evaluation Committee, Japan Pharmaceutical Manufacturers Association, Tokyo, Japan
  - Clinical Data Science Department, Kowa Company, Ltd., Tokyo, Japan
- Satoru Tsuchiya
  - Data Science Expert Committee, Drug Evaluation Committee, Japan Pharmaceutical Manufacturers Association, Tokyo, Japan
  - Data Science, Sumitomo Dainippon Pharma Co., Ltd., Tokyo, Japan
- Satoru Fukimbara
  - Data Science Expert Committee, Drug Evaluation Committee, Japan Pharmaceutical Manufacturers Association, Tokyo, Japan
  - Data Science, Ono Pharmaceutical Co., Ltd., Osaka, Japan

2
Ye Z, Wang Z, Hou Y. Does Bonferroni correction "rescue" the deviation from Hardy-Weinberg equilibrium? Forensic Sci Int Genet 2020; 46:102254. [PMID: 32006894 DOI: 10.1016/j.fsigen.2020.102254] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/27/2019] [Revised: 01/15/2020] [Accepted: 01/19/2020] [Indexed: 11/25/2022]
Abstract
The application of Bonferroni correction (BC) has been constantly controversial; nevertheless, in forensic population genetics research, it is common to apply it to Hardy-Weinberg equilibrium (HWE) tests referring to multiple loci. This letter aimed to discuss the problems of applying BC to HWE tests involving multiple loci by surveying population genetics research studies published over the last 10 years (2009-2019) from two major forensic genetic journals: Forensic Science International: Genetics (FSIG) and the International Journal of Legal Medicine (IJLM). The results showed that there was no uniform standard of whether to apply BC to HWE tests or not, and researchers commonly did not provide any explanation for the observation of deviations from HWE. Despite its widespread use in population genetics, BC may not guarantee a prudent result due to an irrelevant null hypothesis, reluctance to reject the null hypothesis, different interpretations of identical p-values, and inflated type Ⅱ error. We recommended a notable two-step approach suggested by Waples to evaluate the results of HWE tests: 1) identifying causes of departures from HWE and 2) evaluating the biological consequences of HW departures. In addition, for forensic researchers, we suggested that if a certain degree of deviation from HWE does occur, the first step to take should involve checking the technique and genotyping results carefully rather than recklessly using BC. In conclusion, according to the purpose of forensic population research, applying BC to HWE tests is unnecessary; rather, an unadjusted α should be used. BC does not "rescue" the deviation from HWE. To "rescue" it indeed, directly discussing the possible explanation for each departure from HWE and simply describing what has been done sequentially and why should be enough for readers to reach a reasonable conclusion even without the help of Bonferroni methods.
Affiliation(s)
- Ziwei Ye
  - Institute of Forensic Medicine, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, China
- Zheng Wang
  - Institute of Forensic Medicine, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, China
- Yiping Hou
  - Institute of Forensic Medicine, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu 610041, China.

3
Lemoine NP. Moving beyond noninformative priors: why and how to choose weakly informative priors in Bayesian analyses. OIKOS 2019. [DOI: 10.1111/oik.05985] [Citation(s) in RCA: 171] [Impact Index Per Article: 28.5] [Indexed: 12/01/2022]

4
Liu F, Tong T, Huang D, Yuan W, Li D, Lin J, Cai S, Xu Y, Chen W, Sun Y, Zhuang J. CapeOX perioperative chemotherapy versus postoperative chemotherapy for locally advanced resectable colon cancer: protocol for a two-period randomised controlled phase III trial. BMJ Open 2019; 9:e017637. [PMID: 30700474 PMCID: PMC6352769 DOI: 10.1136/bmjopen-2017-017637] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/21/2022]
Abstract
INTRODUCTION Adjuvant chemotherapy with the CapeOX regimen is now widely used for treating colorectal cancer. However, prior studies have demonstrated better efficacy of pre-operative/neoadjuvant chemotherapy without an increase in safety risks. METHODS AND ANALYSIS This multicentre, open-label, parallel-group, randomised, controlled, phase III study aims to compare the efficacy and safety of perioperative CapeOX chemotherapy with postoperative CapeOX chemotherapy for treating patients with locally advanced R0 resectable colon cancers in China. In total, 1370 eligible patients will be randomised to: the test group, up to four cycles (every 3 weeks is a cycle, Q3W) of chemotherapy plus radical surgery plus up to four cycles of post-operative chemotherapy; or the control group, radical surgery first, then up to eight cycles of chemotherapy. In each cycle, oxaliplatin will be given at a dose of 130 mg/m² through continuous IV infusion for 2 hours on the first day. From day 1 to day 14, capecitabine will be taken orally every morning and evening at a dose of 1000 mg/m²/d. The primary outcome measure is the 3-year disease-free survival. The objective response rate, R0 resection rate, overall survival, and adverse events will also be measured as secondary endpoints. The study may include two periods. If results of period 1 are not favourable, period 2 will be initiated, recruiting genetically sensitive patients and repeating the same process as in period 1. ETHICS AND DISSEMINATION Informed consent will be required from, and provided by, all subjects. The study protocol has been approved by the independent ethics committee of Shanghai Fudan University Cancer Centre. This study will clearly demonstrate the potential benefit of perioperative chemotherapy with the CapeOX regimen. Results will be shared among all the participating centres, and with policymakers and the academic community to promote the clinical management of colon cancer. TRIAL REGISTRATION NUMBER NCT03125980.
Affiliation(s)
- Fangqi Liu
  - Department of Colorectal Surgery, Fudan University Shanghai Cancer Centre, Shanghai, China
  - Department Of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Tong Tong
  - Department Of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
  - Department of Radiology, Fudan University Shanghai Cancer Centre, Shanghai, China
- Dan Huang
  - Department Of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
  - Department of Pathology, Fudan University Shanghai Cancer Centre, Shanghai, China
- Weitang Yuan
  - Department of Anorectal Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Dechuan Li
  - Department of Colorectal Surgery, Zhejiang Cancer Hospital, Hangzhou, Zhejiang, China
- Jianjiang Lin
  - Department of Anorectal Surgery, The First Affiliated Hospital of Zhejiang University, Hangzhou, China
- Sanjun Cai
  - Department of Colorectal Surgery, Fudan University Shanghai Cancer Centre, Shanghai, China
  - Department Of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Ye Xu
  - Department of Colorectal Surgery, Fudan University Shanghai Cancer Centre, Shanghai, China
  - Department Of Oncology, Shanghai Medical College, Fudan University, Shanghai, China
- Wenbin Chen
  - Department of Anorectal Surgery, The First Affiliated Hospital of Zhejiang University, Hangzhou, China
- Yueming Sun
  - Department of Colorectal Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Jing Zhuang
  - Department of General Surgery, Zhengzhou University Cancer Hospital, Zhengzhou, China

5
Fowler D, Hodgekins J, French P, Marshall M, Freemantle N, McCrone P, Everard L, Lavis A, Jones PB, Amos T, Singh S, Sharma V, Birchwood M. Social recovery therapy in combination with early intervention services for enhancement of social recovery in patients with first-episode psychosis (SUPEREDEN3): a single-blind, randomised controlled trial. Lancet Psychiatry 2018; 5:41-50. [PMID: 29242000 PMCID: PMC5818038 DOI: 10.1016/s2215-0366(17)30476-5] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Received: 03/28/2017] [Revised: 10/30/2017] [Accepted: 11/02/2017] [Indexed: 10/28/2022]
Abstract
BACKGROUND Provision of early intervention services has increased the rate of social recovery in patients with first-episode psychosis; however, many individuals have continuing severe and persistent problems with social functioning. We aimed to assess the efficacy of early intervention services augmented with social recovery therapy in patients with first-episode psychosis. The primary hypothesis was that social recovery therapy plus early intervention services would lead to improvements in social recovery. METHODS We did this single-blind, phase 2, randomised controlled trial (SUPEREDEN3) at four specialist early intervention services in the UK. We included participants who were aged 16-35 years, had non-affective psychosis, had been clients of early intervention services for 12-30 months, and had persistent and severe social disability, defined as engagement in less than 30 h per week of structured activity. Participants were randomly assigned (1:1), via computer-generated randomisation with permuted blocks (sizes of four to six), to receive social recovery therapy plus early intervention services or early intervention services alone. Randomisation was stratified by sex and recruitment centre (Norfolk, Birmingham, Lancashire, and Sussex). By necessity, participants were not masked to group allocation, but allocation was concealed from outcome assessors. The primary outcome was time spent in structured activity at 9 months, as measured by the Time Use Survey. Analysis was by intention to treat. This trial is registered with ISRCTN, number ISRCTN61621571. FINDINGS Between Oct 1, 2012, and June 20, 2014, we randomly assigned 155 participants to receive social recovery therapy plus early intervention services (n=76) or early intervention services alone (n=79); the intention-to-treat population comprised 154 patients. At 9 months, 143 (93%) participants had data for the primary outcome. 
Social recovery therapy plus early intervention services was associated with an increase in structured activity of 8·1 h (95% CI 2·5-13·6; p=0·0050) compared with early intervention services alone. No adverse events were deemed attributable to study therapy. INTERPRETATION Our findings show a clinically important benefit of enhanced social recovery on structured activity in patients with first-episode psychosis who received social recovery therapy plus early intervention services. Social recovery therapy might be useful in improving functional outcomes in people with first-episode psychosis, particularly in individuals not motivated to engage in existing psychosocial interventions targeting functioning, or who have comorbid difficulties preventing them from doing so. FUNDING National Institute for Health Research.
Affiliation(s)
- David Fowler
  - Psychology Department, University of Sussex, Brighton, UK.
- Jo Hodgekins
  - Norwich Medical School, University of East Anglia, Norwich, UK
- Paul French
  - Psychosis Research Unit, Greater Manchester Mental Health NHS Trust, Manchester, UK; Institute of Health and Psychology, University of Liverpool, Liverpool, UK
- Max Marshall
  - Lancashire Care NHS Foundation Trust, Preston, UK
- Paul McCrone
  - Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Linda Everard
  - Birmingham and Solihull NHS Mental Health Foundation Trust, Birmingham, UK
- Anna Lavis
  - University of Birmingham, Birmingham, UK
- Tim Amos
  - University of Bristol, Bristol, UK
- Vimal Sharma
  - University of Chester, Chester, UK; Cheshire and Wirral Partnership NHS Foundation Trust, Chester, UK

6
Heston TF, King JM. Predictive power of statistical significance. World J Methodol 2017; 7:112-116. [PMID: 29354483 PMCID: PMC5746664 DOI: 10.5662/wjm.v7.i4.112] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 10/28/2017] [Revised: 11/23/2017] [Accepted: 12/04/2017] [Indexed: 02/06/2023]
Abstract
A statistically significant research finding should not be defined as a P-value of 0.05 or less, because this definition does not take into account study power. Statistical significance was originally defined by Fisher RA as a P-value of 0.05 or less. According to Fisher, any finding that is likely to occur by random variation no more than 1 in 20 times is considered significant. Neyman J and Pearson ES subsequently argued that Fisher’s definition was incomplete. They proposed that statistical significance could only be determined by analyzing the chance of incorrectly considering a study finding to be significant (a Type I error) or incorrectly considering a study finding to be insignificant (a Type II error). Their definition of statistical significance is also incomplete because the error rates are considered separately, not together. A better definition of statistical significance is the positive predictive value of a P-value, which is equal to the power divided by the sum of power and the P-value. This definition is more complete and relevant than Fisher’s or Neyman-Pearson’s definitions, because it takes into account both concepts of statistical significance. Using this definition, a statistically significant finding requires a P-value of 0.05 or less when the power is at least 95%, and a P-value of 0.032 or less when the power is 60%. To achieve statistical significance, P-values must be adjusted downward as the study power decreases.
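The formula stated in the abstract, positive predictive value = power / (power + P-value), can be checked against the two thresholds it quotes. The following is a minimal sketch: the function names are hypothetical, and the 0.95 target is an assumption inferred from the abstract's own numbers (a P-value of 0.05 at 95% power gives exactly 0.95), not a value the abstract states explicitly.

```python
def p_value_ppv(power: float, p_value: float) -> float:
    """Positive predictive value as described in the abstract: power / (power + P)."""
    if not 0.0 < power <= 1.0:
        raise ValueError("power must be in (0, 1]")
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be in [0, 1]")
    return power / (power + p_value)


def max_p_for_target_ppv(power: float, target_ppv: float = 0.95) -> float:
    """Largest P-value that still reaches target_ppv at the given power.

    Solving power / (power + P) >= target for P gives
    P <= power * (1 - target) / target.
    The 0.95 default is an assumption inferred from the abstract's examples.
    """
    return power * (1.0 - target_ppv) / target_ppv


ppv_at_95_power = p_value_ppv(power=0.95, p_value=0.05)
threshold_at_60_power = max_p_for_target_ppv(power=0.60)
print(f"PPV at 95% power, P=0.05: {ppv_at_95_power:.3f}")
print(f"P threshold at 60% power: {threshold_at_60_power:.3f}")
```

Under that assumed target, the threshold at 60% power rounds to 0.032, which matches the figure quoted in the abstract.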
Affiliation(s)
- Thomas F Heston
  - Department of Family Medicine, University of Washington, Seattle, WA 98195-6340, United States
  - Department of Medical Education and Clinical Sciences, Elson S. Floyd College of Medicine, Washington State University, Spokane, WA 99210-1495, United States
- Jackson M King
  - Department of Medical Education and Clinical Sciences, Elson S. Floyd College of Medicine, Washington State University, Spokane, WA 99210-1495, United States

7
Ghooi RB, Bhosale N, Wadhwani R, Divate P, Divate U. Assessment and classification of protocol deviations. Perspect Clin Res 2016; 7:132-6. [PMID: 27453830 PMCID: PMC4936072 DOI: 10.4103/2229-3485.184817] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/04/2022]
Abstract
INTRODUCTION Deviations from the approved trial protocol are common during clinical trials. They have been conventionally classified as deviations or violations, depending on their impact on the trial. METHODS A new method has been proposed by which deviations are classified in five grades from 1 to 5. A deviation of Grade 1 has no impact on the subjects' well-being or on the quality of data. At the maximum, a deviation Grade 5 leads to the death of the subject. This method of classification was applied to deviations noted in the center over the last 3 years. RESULTS It was observed that most deviations were of Grades 1 and 2, with fewer falling in Grades 3 and 4. There were no deviations that led to the death of the subject (Grade 5). DISCUSSION This method of classification would help trial managers decide on the action to be taken on the occurrence of deviations, which would be based on their impact.
Affiliation(s)
- Neelambari Bhosale
  - Jehangir Clinical Development Centre, Jehangir Hospital, Pune, Maharashtra, India
- Reena Wadhwani
  - Jehangir Clinical Development Centre, Jehangir Hospital, Pune, Maharashtra, India
- Pathik Divate
  - Jehangir Clinical Development Centre, Jehangir Hospital, Pune, Maharashtra, India
- Uma Divate
  - Jehangir Clinical Development Centre, Jehangir Hospital, Pune, Maharashtra, India

8

9
Streiner DL. Best (but oft-forgotten) practices: the multiple problems of multiplicity-whether and how to correct for many statistical tests. Am J Clin Nutr 2015; 102:721-8. [PMID: 26245806 DOI: 10.3945/ajcn.115.113548] [Citation(s) in RCA: 218] [Impact Index Per Article: 21.8] [Received: 04/21/2015] [Accepted: 07/10/2015] [Indexed: 11/14/2022]
Abstract
Testing many null hypotheses in a single study results in an increased probability of detecting a significant finding just by chance (the problem of multiplicity). Debates have raged over many years with regard to whether to correct for multiplicity and, if so, how it should be done. This article first discusses how multiple tests lead to an inflation of the α level, then explores the following different contexts in which multiplicity arises: testing for baseline differences in various types of studies, having >1 outcome variable, conducting statistical tests that produce >1 P value, taking multiple "peeks" at the data, and unplanned, post hoc analyses (i.e., "data dredging," "fishing expeditions," or "P-hacking"). It then discusses some of the methods that have been proposed for correcting for multiplicity, including single-step procedures (e.g., Bonferroni); multistep procedures, such as those of Holm, Hochberg, and Šidák; false discovery rate control; and resampling approaches. Note that these various approaches describe different aspects and are not necessarily mutually exclusive. For example, resampling methods could be used to control the false discovery rate or the family-wise error rate (as defined later in this article). However, the use of one of these approaches presupposes that we should correct for multiplicity, which is not universally accepted, and the article presents the arguments for and against such "correction." The final section brings together these threads and presents suggestions with regard to when it makes sense to apply the corrections and how to do so.
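As a rough illustration of the inflation of the α level and of two of the corrections the abstract names, the sketch below computes the family-wise error rate for m tests and applies the Bonferroni and Holm procedures. It is not taken from the article: the function names and the example P-values are hypothetical, and the error-rate formula assumes the tests are independent.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """P(at least one false positive) across m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni(p_values, alpha=0.05):
    """Single-step procedure: reject H_i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down procedure: compare the k-th smallest P-value
    (k = 0, 1, ...) with alpha / (m - k) and stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # every larger P-value is retained as well
    return reject


# Hypothetical P-values from four tests in one study.
p_values = [0.010, 0.013, 0.040, 0.300]
fwer_10_tests = familywise_error_rate(0.05, 10)
bonferroni_result = bonferroni(p_values)
holm_result = holm(p_values)
print(f"FWER for 10 tests at alpha=0.05: {fwer_10_tests:.3f}")
print("Bonferroni:", bonferroni_result)
print("Holm:      ", holm_result)
```

With these inputs Bonferroni rejects only the first hypothesis while Holm rejects the first two, which reflects the general point that the step-down procedure is never less powerful than the single-step one at the same family-wise level.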
Affiliation(s)
- David L Streiner
  - Department of Psychiatry and Behavioral Neurosciences, McMaster University, Hamilton, Canada, and Department of Psychiatry, University of Toronto, Toronto, Canada

10
Armstrong RA. When to use the Bonferroni correction. Ophthalmic Physiol Opt 2014; 34:502-8. [DOI: 10.1111/opo.12131] [Citation(s) in RCA: 1269] [Impact Index Per Article: 115.4] [Received: 01/06/2014] [Accepted: 03/13/2014] [Indexed: 12/19/2022]

11
Xie C. Relations among Three Parametric Multiple Testing Methods for Correlated Tests. J STAT COMPUT SIM 2014; 84:812-818. [PMID: 24659830 DOI: 10.1080/00949655.2012.729212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Abstract
Multiple endpoints in clinical trials are usually correlated. To control the family-wise type I error rate, both Huque and Alosh's flexible fixed-sequence (FFS) testing method and Li and Mehrotra's adaptive alpha allocation approach (4A) have taken into account correlations among endpoints. I suggested a weighted multiple testing correction (WMTC) for correlated tests and compared it with FFS. However, the relationship between the 4A method and the FFS method or the relationship between the 4A method and the WMTC method has not been studied. In this paper, simulations are conducted to investigate these relationships. Tentative guidelines to help choose an appropriate method are provided.
Affiliation(s)
- Changchun Xie
  - Division of Epidemiology and Biostatistics, Department of Environmental Health, University of Cincinnati; Center for Clinical and Translational Science and Training, University of Cincinnati, Ohio 45267, USA

12
Hickey R, Vouche M, Sze DY, Hohlastos E, Collins J, Schirmang T, Memon K, Ryu RK, Sato K, Chen R, Gupta R, Resnick S, Carr J, Chrisman HB, Nemcek AA, Vogelzang RL, Lewandowski RJ, Salem R. Cancer concepts and principles: primer for the interventional oncologist-part I. J Vasc Interv Radiol 2013; 24:1157-64. [PMID: 23809510 DOI: 10.1016/j.jvir.2013.04.024] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 01/15/2003] [Revised: 04/21/2013] [Accepted: 04/22/2013] [Indexed: 01/22/2023]
Abstract
A sophisticated understanding of the rapidly changing field of oncology, including a broad knowledge of oncologic disease and the therapies available to treat them, is fundamental to the interventional radiologist providing oncologic therapies, and is necessary to affirm interventional oncology as one of the four pillars of cancer care alongside medical, surgical, and radiation oncology. The first part of this review intends to provide a concise overview of the fundamentals of oncologic clinical trials, including trial design, methods to assess therapeutic response, common statistical analyses, and the levels of evidence provided by clinical trials.
Affiliation(s)
- Ryan Hickey
  - Department of Radiology and Division of Interventional Oncology, Northwestern University, Chicago, IL 60611, USA

13
Xie C, Lu X, Pogue J, Chen DG. Weighted Multiple Testing Correction for Correlated Binary Endpoints. COMMUN STAT-SIMUL C 2013. [DOI: 10.1080/03610918.2012.674599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]

14
Li H, Sankoh AJ, D'Agostino RB. Extension of adaptive alpha allocation methods for strong control of the family-wise error rate. Stat Med 2012; 32:181-95. [DOI: 10.1002/sim.5485] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 11/04/2010] [Accepted: 04/03/2012] [Indexed: 11/08/2022]
Affiliation(s)
- Haihong Li
  - Vertex Pharmaceuticals; 130 Waverly Street, Cambridge; MA; 02139; U.S.A
- Abdul J. Sankoh
  - Vertex Pharmaceuticals; 130 Waverly Street, Cambridge; MA; 02139; U.S.A

15
Gronseth GS, Ashman E. The AAN response to evidence-based medicine: promise and pitfalls. Mult Scler 2012; 18:949-50. [DOI: 10.1177/1352458512448449] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Affiliation(s)
- Gary S. Gronseth
  - Professor of Neurology, University of Kansas Evidence-based Medicine Methodologist, American Academy of Neurology, USA
- Eric Ashman
  - 673rd Medical Group, 5955 Zeamer Avenue, Alaska, USA

16
Xie C. Weighted multiple testing correction for correlated tests. Stat Med 2011; 31:341-52. [DOI: 10.1002/sim.4434] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 01/11/2011] [Revised: 08/06/2011] [Accepted: 09/12/2011] [Indexed: 11/08/2022]
Affiliation(s)
- Changchun Xie
  - Department of Clinical Epidemiology and Biostatistics; McMaster University; Population Health Research Institute, Hamilton Health Sciences, and McMaster University; Hamilton Ontario Canada

17
Affiliation(s)
- David L Streiner
  - Department of Psychiatry and Behavioural Neurosciences, Hamilton, ON, Canada; Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON, Canada; Department of Psychiatry, University of Toronto, Toronto, ON, Canada.
- Geoffrey R Norman
  - Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON, Canada

18
French B, Joo J, Geller NL, Kimmel SE, Rosenberg Y, Anderson JL, Gage BF, Johnson JA, Ellenberg JH. Statistical design of personalized medicine interventions: the Clarification of Optimal Anticoagulation through Genetics (COAG) trial. Trials 2010; 11:108. [PMID: 21083927 PMCID: PMC3000386 DOI: 10.1186/1745-6215-11-108] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.1] [Received: 07/28/2010] [Accepted: 11/17/2010] [Indexed: 11/16/2022]
Abstract
Background There is currently much interest in pharmacogenetics: determining variation in genes that regulate drug effects, with a particular emphasis on improving drug safety and efficacy. The ability to determine such variation motivates the application of personalized drug therapies that utilize a patient's genetic makeup to determine a safe and effective drug at the correct dose. To ascertain whether a genotype-guided drug therapy improves patient care, a personalized medicine intervention may be evaluated within the framework of a randomized controlled trial. The statistical design of this type of personalized medicine intervention requires special considerations: the distribution of relevant allelic variants in the study population; and whether the pharmacogenetic intervention is equally effective across subpopulations defined by allelic variants. Methods The statistical design of the Clarification of Optimal Anticoagulation through Genetics (COAG) trial serves as an illustrative example of a personalized medicine intervention that uses each subject's genotype information. The COAG trial is a multicenter, double blind, randomized clinical trial that will compare two approaches to initiation of warfarin therapy: genotype-guided dosing, the initiation of warfarin therapy based on algorithms using clinical information and genotypes for polymorphisms in CYP2C9 and VKORC1; and clinical-guided dosing, the initiation of warfarin therapy based on algorithms using only clinical information. Results We determine an absolute minimum detectable difference of 5.49% based on an assumed 60% population prevalence of zero or multiple genetic variants in either CYP2C9 or VKORC1 and an assumed 15% relative effectiveness of genotype-guided warfarin initiation for those with zero or multiple genetic variants. Thus we calculate a sample size of 1238 to achieve a power level of 80% for the primary outcome. 
We show that reasonable departures from these assumptions may decrease statistical power to 65%. Conclusions In a personalized medicine intervention, the minimum detectable difference used in sample size calculations is not a known quantity, but rather an unknown quantity that depends on the genetic makeup of the subjects enrolled. Given the possible sensitivity of sample size and power calculations to these key assumptions, we recommend that they be monitored during the conduct of a personalized medicine intervention. Trial Registration clinicaltrials.gov: NCT00839657
Affiliation(s)
- Benjamin French
- Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, 423 Guardian Drive, Philadelphia, Pennsylvania 19104, USA.
|
19
|
Turk DC, Dworkin RH, McDermott MP, Bellamy N, Burke LB, Chandler JM, Cleeland CS, Cowan P, Dimitrova R, Farrar JT, Hertz S, Heyse JF, Iyengar S, Jadad AR, Jay GW, Jermano JA, Katz NP, Manning DC, Martin S, Max MB, McGrath P, McQuay HJ, Quessy S, Rappaport BA, Revicki DA, Rothman M, Stauffer JW, Svensson O, White RE, Witter J. Analyzing multiple endpoints in clinical trials of pain treatments: IMMPACT recommendations. Pain 2008; 139:485-493. [DOI: 10.1016/j.pain.2008.06.025] [Citation(s) in RCA: 163] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2008] [Revised: 06/11/2008] [Accepted: 06/30/2008] [Indexed: 11/15/2022]
|
20
|
Sjögren P, Nilsson E, Forsell M, Johansson O, Hoogstraate J. A systematic review of the preventive effect of oral hygiene on pneumonia and respiratory tract infection in elderly people in hospitals and nursing homes: effect estimates and methodological quality of randomized controlled trials. J Am Geriatr Soc 2008; 56:2124-30. [PMID: 18795989 DOI: 10.1111/j.1532-5415.2008.01926.x] [Citation(s) in RCA: 218] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
The objective of this study was to investigate the preventive effect of oral hygiene on pneumonia and respiratory tract infection, focusing on elderly people in hospitals and nursing homes, by systematically reviewing effect estimates and methodological quality of randomized controlled trials (RCTs) and to provide an overview of additional clinical studies in this area. Literature searches were conducted in the Medline database, the Cochrane library databases, and by hand-searching reference lists. Included publications were analyzed for intervention (or topic) studied, main conclusions, strength of evidence, and study design. RCTs were further analyzed for effect magnitudes and methodological details. Absolute risk reductions (ARRs) and numbers needed to treat (NNTs) were calculated. Fifteen publications fulfilled the inclusion criteria. There was a wide variation in the design and quality of the studies included. The RCTs revealed positive preventive effects of oral hygiene on pneumonia and respiratory tract infection in hospitalized elderly people and elderly nursing home residents, with ARRs from 6.6% to 11.7% and NNTs from 8.6 to 15.3 individuals. The non-RCT studies contributed to inconclusive evidence on the association and correlation between oral hygiene and pneumonia or respiratory tract infection in elderly people. Mechanical oral hygiene has a preventive effect on mortality from pneumonia, and non-fatal pneumonia in hospitalized elderly people and elderly nursing home residents. Approximately one in 10 cases of death from pneumonia in elderly nursing home residents may be prevented by improving oral hygiene. Future research in this area should be focused on high-quality RCTs with appropriate sample size calculations.
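The ARR and NNT figures quoted in the abstract above are linked by the relation NNT = 1/ARR. A minimal Python sketch of that arithmetic follows; the function names and the event rates in the usage lines are illustrative assumptions, not data from the review.

```python
def absolute_risk_reduction(control_event_rate, treated_event_rate):
    """ARR: event rate without the intervention minus event rate with it."""
    return control_event_rate - treated_event_rate


def number_needed_to_treat(arr):
    """NNT is the reciprocal of the ARR (ARR expressed as a proportion)."""
    if arr <= 0:
        raise ValueError("NNT is only defined here for a positive ARR")
    return 1.0 / arr


# Hypothetical event rates, chosen only to illustrate the calculation.
arr = absolute_risk_reduction(0.20, 0.10)
nnt = number_needed_to_treat(arr)
```

Applied to the reported ARR bounds of 6.6% and 11.7%, the reciprocal gives roughly 15.2 and 8.5; the small gap to the published 15.3 and 8.6 is presumably due to rounding of the ARRs.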
|
21
|
Huque MF, Alosh M. A flexible fixed-sequence testing method for hierarchically ordered correlated multiple endpoints in clinical trials. J Stat Plan Inference 2008. [DOI: 10.1016/j.jspi.2007.06.009] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
22
|
Farcomeni A. A review of modern multiple hypothesis testing, with particular attention to the false discovery proportion. Stat Methods Med Res 2007; 17:347-88. [PMID: 17698936 DOI: 10.1177/0962280206079046] [Citation(s) in RCA: 102] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
In the last decade a growing amount of statistical research has been devoted to multiple testing, motivated by a variety of applications in medicine, bioinformatics, genomics, brain imaging, etc. Research in this area is focused on developing powerful procedures even when the number of tests is very large. This paper attempts to review research in modern multiple hypothesis testing with particular attention to the false discovery proportion, loosely defined as the number of false rejections divided by the number of rejections. We review the main ideas, stepwise and augmentation procedures, and resampling-based testing. We also discuss the problem of dependence among the test statistics. Simulations compare the procedures with one another and with Bayesian methods. We illustrate the procedures in applications in DNA microarray data analysis. Finally, a few possibilities for further research are highlighted.
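The false discovery proportion described in the abstract above can be sketched in a few lines of Python. The Benjamini-Hochberg step-up rule is used here only as one familiar example of a stepwise procedure; it is an assumption of this sketch, not necessarily the procedure the review favours, and the p-values and truth labels are invented for illustration.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Step-up rule: reject the k smallest p-values, where k is the largest
    rank with p_(k) <= k * q / m. Returns one boolean per hypothesis."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k = rank
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


def false_discovery_proportion(rejected, is_true_null):
    """False rejections divided by rejections (taken as 0 if nothing is rejected)."""
    n_rejected = sum(rejected)
    if n_rejected == 0:
        return 0.0
    n_false = sum(1 for r, null in zip(rejected, is_true_null) if r and null)
    return n_false / n_rejected


# Invented p-values and ground truth (known only in a simulation setting).
pvals = [0.001, 0.008, 0.039, 0.041, 0.60]
truth = [False, True, False, True, True]  # True means the null really holds
decisions = benjamini_hochberg(pvals, q=0.05)
fdp = false_discovery_proportion(decisions, truth)
```

In real data the truth labels are unknown, so the proportion itself is unobservable; procedures of this kind aim to control its expectation or its tail probability.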
|
23
|
Sloan JA, Dueck A. Issues for Statisticians in Conducting Analyses and Translating Results for Quality of Life End Points in Clinical Trials. J Biopharm Stat 2007; 14:73-96. [PMID: 15027501 DOI: 10.1081/bip-120028507] [Citation(s) in RCA: 120] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
Abstract
Quality of life (QOL) end points in pharmaceutical clinical trials are at a crossroads. On the one hand, much has been learned in recent years of how to efficiently and effectively measure patient QOL. On the other hand, investigators and regulatory agencies still struggle with exactly how to assess the results of QOL end points and other patient-reported outcomes. Statisticians are often left in the position of having to bridge the gap between investigators who want to assess patient QOL and regulatory bodies who want a sound scientific rationale and analysis plan for doing so. Unfortunately, little has been written specifically for the statistical audience to assist in this translation. The purpose of this paper is to attempt to bridge this gap. We will describe the language and methods that have been successful in translating the psychometric and statistical challenges into understandable findings for investigators and regulatory agencies. One of the most important advances is the development of a general guideline for assessing clinical significance, namely the "half standard deviation" method based on the empirical rule effect size (ERES) approach. We populate the paper with concrete examples of how QOL data need not be treated any different, in terms of statistical analysis, than tumor response or other clinical end points.
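As a rough illustration of the "half standard deviation" guideline mentioned in the abstract above, the Python sketch below takes half of the sample standard deviation of baseline scores as the threshold for a clinically meaningful change. The choice of baseline scores as the reference distribution, the function names, and the numbers are all assumptions made for this example.

```python
from statistics import stdev


def half_sd_threshold(baseline_scores):
    """Half of the sample standard deviation of the reference scores."""
    return 0.5 * stdev(baseline_scores)


def is_clinically_meaningful(observed_change, baseline_scores):
    """True if the absolute change reaches the half-SD threshold."""
    return abs(observed_change) >= half_sd_threshold(baseline_scores)


# Hypothetical QOL scores on a 0-100 scale.
baseline = [50, 60, 70, 80, 90]
threshold = half_sd_threshold(baseline)
```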
Affiliation(s)
- J A Sloan
- Department of Health Sciences Research, Mayo Clinic Cancer Center, Rochester, Minnesota 55905, USA.
|
24
|
Kraus CN, Zalkikar J, Powers JH. Levofloxacin and Macrolides for Treatment of Legionnaires Disease: Multiple Comparisons Give Few Answers. Clin Infect Dis 2005; 41:416; author reply 416-7. [PMID: 16007549 DOI: 10.1086/431768] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
|
25
|
Abstract
Multiplicity problems emerge from investigators looking at many additional endpoints and treatment group comparisons. Thousands of potential comparisons can emanate from one trial. Investigators might only report the significant comparisons, an unscientific practice if unwitting, and fraudulent if intentional. Researchers must report all the endpoints analysed and treatments compared. Some statisticians propose statistical adjustments to account for multiplicity. Simply defined, they test for no effects in all the primary endpoints undertaken versus an effect in one or more of those endpoints. In general, statistical adjustments for multiplicity provide crude answers to an irrelevant question. However, investigators should use adjustments when the clinical decision-making argument rests solely on one or more of the primary endpoints being significant. In these cases, adjustments somewhat rescue scattershot analyses. Readers need to be aware of the potential for under-reporting of analyses.
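To make the scale of the multiplicity problem concrete, the Python sketch below computes the chance of at least one false-positive finding when several comparisons are each tested at the same nominal level. It assumes independent tests of true null hypotheses, and it uses the Bonferroni rule as one common example of an adjustment; neither assumption is taken from the abstract above.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """P(at least one false positive) over n independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-comparison significance level under the Bonferroni adjustment."""
    return alpha / n_tests


# Unadjusted error inflation and adjusted thresholds for a few test counts.
for k in (1, 5, 10, 100):
    print(k, round(familywise_error_rate(k), 3), bonferroni_threshold(k))
```

With ten unadjusted comparisons the chance of at least one spurious "significant" result is already about 40%, which is why selective reporting of significant comparisons is so misleading.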
Affiliation(s)
- Kenneth F Schulz
- Family Health International, PO Box 13950, Research Triangle Park, NC 27709 USA
|
26
|
Abstract
Confidence intervals represent a routinely used standard method to document the uncertainty of estimated effects. In most cases, for the calculation of confidence intervals the conventional fixed 95% confidence level is used. Confidence curves represent a graphical illustration of confidence intervals for confidence levels varying between 0 and 100%. Although such graphs have been repeatedly proposed under different names during the last 40 years, confidence curves are rarely used in medical research. In this paper, we introduce confidence curves and present a short historical review. We draw attention to the different interpretation of one- and two-sided statistical inference. It is shown that these two options also have influence on the plotting of appropriate confidence curves. We illustrate the use of one- and two-sided confidence curves and explain their correct interpretation. In medical research more emphasis on the choice between the one- and two-sided approaches should be given. One- and two-sided confidence curves are useful complements to the conventional methods of presenting study results.
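The idea of a confidence curve, as described in the abstract above, can be sketched by computing the interval limits over a grid of confidence levels and plotting the limits against the level. The Python example below covers only the two-sided case and assumes a normally distributed estimator with known standard error; the function names and input values are hypothetical.

```python
from statistics import NormalDist


def two_sided_interval(estimate, std_error, level):
    """Normal-approximation two-sided interval for a confidence level in [0, 1)."""
    if not 0.0 <= level < 1.0:
        raise ValueError("level must be in [0, 1)")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error


def confidence_curve(estimate, std_error, levels):
    """List of (level, lower, upper) triples that trace out the curve."""
    return [(level, *two_sided_interval(estimate, std_error, level))
            for level in levels]


# Hypothetical estimate and standard error; levels 0%, 5%, ..., 95%.
levels = [i / 100 for i in range(0, 100, 5)]
curve = confidence_curve(1.0, 0.5, levels)
```

At level 0 the interval collapses to the point estimate, and it widens without bound as the level approaches 100%, which gives the curve its characteristic funnel shape.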
Affiliation(s)
- Ralf Bender
- Institute for Quality and Efficiency in Health Care, Cologne, Germany.
|
27
|
Moyé LA, Deswal A. Perils of the random experiment. Am J Ther 2003; 10:112-21. [PMID: 12629589 DOI: 10.1097/00045391-200303000-00006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
Most medical research is executed on samples selected from large populations. Nevertheless, health care researchers often blur the difference between interpreting sample-based research and evaluating research that included the entire population of interest. This is an implication-critical distinction; in population research, every result applies to the population (because the entire population was included in the analysis), although only a few results from sample-based research can be extended to the population at large. Treating every result from sample-based research as if that result applies to the population is misleading. Using nonmathematic terminology, this article develops the reason for the differences in the implications of these two research perspectives. In sample-based research, the best indicators of which results should be extended from the sample to the population are the presence of (1) a prospective plan for that experiment; and (2) the execution of the experiment according to that plan (concordant execution). The absence of these two features produces execution and analysis decisions based on the incoming data stream, the hallmark of the random experiment. In this latter paradigm, allowing the data to influence the execution and analysis decisions renders the usual estimates of effect size, standard errors, confidence intervals, and P values untrustworthy. Readers of clinical trial results must be vigilant for nonprotocol-driven research and understand that the results from these programs are at best exploratory and cannot be used to answer scientific questions.
Affiliation(s)
- Lemuel A Moyé
- University of Texas School of Public Health, RAS Building E815, 1200 Herman Pressler, Houston, TX 77030, USA.
|
28
|
Affiliation(s)
- Lemuel A Moyé
- University of Texas Houston Health Science Center, School of Public Health, Houston, Tex, USA.
|
29
|
Moyé LA, Deswal A. Trials within trials: confirmatory subgroup analyses in controlled clinical experiments. CONTROLLED CLINICAL TRIALS 2001; 22:605-19. [PMID: 11738119 DOI: 10.1016/s0197-2456(01)00180-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
Subgroup analyses remain a popular and necessary component of controlled clinical trials. However, lack of prospective specification, inadequate sample size, inability to maintain power, and the cumulative effect of sampling error can complicate their interpretation. This article demonstrates that clinical trial design tools that would allow the medical community to draw confirmatory and not just exploratory conclusions from specific subgroup evaluations are available to methodologists. Distinct from the use of a treatment by subgroup interaction term, this methodology provides an evaluation of the effect of an intervention within a particular subgroup stratum prospectively declared to be of interest to the investigators. The necessary prespecification of stratum-specific type I error rates, when combined with (1) a stratum-specific event rate in the subgroup, (2) a stratum-specific primary endpoint, (3) a stratum-specific endpoint precision, and/or (4) a stratum-specific efficacy, satisfies the requirements for a subgroup stratum's "stand-alone" interpretation at the trial's conclusion.
Affiliation(s)
- L A Moyé
- University of Texas School of Public Health, RAS Building E815, 1200 Herman Pressler, Houston, TX 77030, USA.
|
30
|
Abstract
Multiplicity of data, hypotheses, and analyses is a common problem in biomedical and epidemiological research. Multiple testing theory provides a framework for defining and controlling appropriate error rates in order to protect against wrong conclusions. However, the corresponding multiple test procedures are underutilized in biomedical and epidemiological research. In this article, the existing multiple test procedures are summarized for the most important multiplicity situations. It is emphasized that adjustments for multiple testing are required in confirmatory studies whenever results from multiple tests have to be combined in one final conclusion and decision. In case of multiple significance tests a note on the error rate that will be controlled for is desirable.
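As one concrete instance of a multiple test procedure that controls the familywise error rate, the Python sketch below implements the Holm step-down method. Holm's procedure is chosen here for illustration only; the abstract above does not single it out, and the p-values are invented.

```python
def holm_rejections(pvalues, alpha=0.05):
    """Holm step-down: compare the i-th smallest p-value (i = 0, 1, ...)
    with alpha / (m - i) and stop at the first non-rejection."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    rejected = [False] * m
    for step, idx in enumerate(order):
        if pvalues[idx] <= alpha / (m - step):
            rejected[idx] = True
        else:
            break  # every larger p-value is retained as well
    return rejected


# Invented p-values for three endpoints.
decisions = holm_rejections([0.01, 0.04, 0.03], alpha=0.05)
```

In this example only the first hypothesis is rejected: 0.01 passes the 0.05/3 threshold, but 0.03 fails the 0.05/2 threshold, so testing stops there.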
Affiliation(s)
- R Bender
- Institute of Epidemiology and Medical Statistics, School of Public Health, University of Bielefeld, Germany.
|
31
|
Abstract
I review some areas of medical statistics that have gained prominence over the last 5-10 years: meta-analysis, evidence-based medicine, and cluster randomized trials. I then consider several issues relating to data analysis and interpretation, many relating to the use and misuse of hypothesis testing, drawing on recent reviews of the use of statistics in medical journals. I also consider developments in the reporting of research in medical journals.
Affiliation(s)
- D G Altman
- ICRF Medical Statistics Group, Centre for Statistics in Medicine, Institute of Health Sciences, Old Road, Oxford OX3 7LF, UK.
|
32
|
|
33
|
Abstract
Regardless of whether a statistician believes in letting a data set speak for itself through nominal p-values or believes in strict alpha conservation, the interpretation of experiments which are negative for the primary endpoint but positive for secondary endpoints is the source of some angst. The purpose of this paper is to apply the notion of prospective alpha allocation in clinical trials to this difficult circumstance. An argument is presented for differentiating between the alpha for the experiment ('experimental alpha' or alpha(E)) and the alpha for the primary endpoint (primary alpha, or alpha(P)) and notation is presented which succinctly describes the findings of a clinical trial in terms of its conclusions. Capping alpha(E) at 0.10 and alpha(P) at 0.05 conserves sample size and preserves consistency with the strength of evidence for the primary endpoint of clinical trials. In addition, a case is presented for the well defined circumstances in which a trial which did not reject the null hypothesis for the primary endpoint but does reject the null hypothesis for at least one of the secondary endpoints may be considered positive in a manner consistent with conservative alpha management.
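The caps quoted in the abstract above (alpha(E) at 0.10 and alpha(P) at 0.05) suggest a simple bookkeeping exercise, sketched here in Python. The additive split of alpha(E) into primary and secondary parts and the equal division among secondary endpoints are assumptions made for this illustration; the paper's actual allocation rule may differ.

```python
def allocate_alpha(alpha_experiment=0.10, alpha_primary=0.05, n_secondary=1):
    """Assumed additive split: alpha(E) = alpha(P) + alpha left for secondaries,
    with the remainder divided equally among the secondary endpoints."""
    if not 0 < alpha_primary <= alpha_experiment < 1:
        raise ValueError("require 0 < alpha_primary <= alpha_experiment < 1")
    if n_secondary < 1:
        raise ValueError("need at least one secondary endpoint")
    remaining = alpha_experiment - alpha_primary
    return {
        "primary": alpha_primary,
        "each_secondary": remaining / n_secondary,
    }


# Hypothetical trial with five secondary endpoints.
plan = allocate_alpha(n_secondary=5)
```

Under these assumptions each of the five secondary endpoints would be tested at 0.01, which leaves the primary endpoint at the conventional 0.05.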
Affiliation(s)
- L A Moyé
- University of Texas School of Public Health, Ruell A. Stallones Building, 1200 Herman Pressler, Houston, Texas 77030, USA.
|
34
|
Moyé LA. End-point interpretation in clinical trials: the case for discipline. CONTROLLED CLINICAL TRIALS 1999; 20:40-9; discussion 50-1. [PMID: 10027499 DOI: 10.1016/s0197-2456(98)00051-8] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The recent submission of a new drug application to the federal Food and Drug Administration (FDA) has led to vigorous discussion concerning the rules of clinical trial conduct. The regulatory importance of prospectively defined end points, long held as a fundamental tenet of well-designed research efforts, and the proper role of alpha-spending functions in clinical trials are current foci of attention. Sound public policy requires that the highest research standards govern statistical analyses to support a new drug application. These research standards should be rooted in the fundamentals of epidemiology and biostatistics, long accepted by clinical trial workers. Also, because the physician-scientists whose research results are disseminated to the community bear considerable responsibility for both that community's protection and the protection of the individual patient, these investigators must provide unambiguous interpretations of type I and type II error rates. This obligation includes clear prospective statements of analysis plans and complete reporting of findings for the prospectively defined endpoints in presentations and in manuscripts. The public health is best served if regulatory agencies continue to join clinical trial workers in repudiating the philosophy of "sound methodology or a small p-value" in judging research efforts. Reviewers of new drug applications might best take the tack "judge first what the investigators set out to do, then judge what else was discovered in an 'exploratory light.'"
Affiliation(s)
- L A Moyé
- University of Texas School of Public Health, Houston USA
|