1. Shehadeh F, Felix L, Kalligeros M, Shehadeh A, Fuchs BB, Ausubel FM, Sotiriadis PP, Mylonakis E. Machine Learning-Assisted High-Throughput Screening for Anti-MRSA Compounds. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:1911-1921. [PMID: 39058605] [DOI: 10.1109/tcbb.2024.3434340]
Abstract
BACKGROUND Antimicrobial resistance is a major public health threat, and new agents are needed. Computational approaches have been proposed to reduce the cost and time needed for compound screening. AIMS A machine learning (ML) model was developed for the in silico screening of low molecular weight molecules. METHODS We used the results of a high-throughput Caenorhabditis elegans methicillin-resistant Staphylococcus aureus (MRSA) liquid infection assay to develop ML models for compound prioritization and quality control. RESULTS The compound prioritization model achieved an AUC of 0.795 with a sensitivity of 81% and a specificity of 70%. When applied to a validation set of 22,768 compounds, the model identified 81% of the active compounds identified by high-throughput screening (HTS) among only 30.6% of the total 22,768 compounds, resulting in a 2.67-fold increase in hit rate. When we retrained the model on all the compounds of the HTS dataset, it further identified 45 discordant molecules classified as non-hits by the HTS, with 42/45 (93%) having known antimicrobial activity. CONCLUSION Our ML approach can be used to increase HTS efficiency by reducing the number of compounds that need to be physically screened and identifying potential missed hits, making HTS more accessible and reducing barriers to entry.
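As a rough illustration of the workflow and the enrichment arithmetic described above (this is not the authors' model; the synthetic fingerprint features, random-forest classifier, and 30% screening fraction are assumptions made only for the sketch), the snippet below ranks a compound library with a classifier and computes the fold increase in hit rate obtained by physically screening only the top-ranked fraction.

```python
# Sketch: ML-assisted prioritization of an HTS library (illustrative data only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(5000, 256))            # hypothetical fingerprint bits
w = rng.normal(size=256)
y = (X @ w + rng.normal(scale=4.0, size=5000) > np.quantile(X @ w, 0.95)).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
score = clf.predict_proba(X_te)[:, 1]
print("AUC:", round(roc_auc_score(y_te, score), 3))

# Screen only the top 30% of the ranked validation set and compare hit rates.
top = np.argsort(score)[::-1][: int(0.3 * len(score))]
enrichment = y_te[top].mean() / y_te.mean()          # fold increase in hit rate
print("recovered hits: %.0f%%, fold enrichment: %.2f"
      % (100 * y_te[top].sum() / y_te.sum(), enrichment))
```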
2. Sellars E, Savguira M, Wu J, Cancelliere S, Jen M, Krishnan R, Hakem A, Barsyte-Lovejoy D, Hakem R, Narod SA, Kotsopoulos J, Salmena L. A high-throughput approach to identify BRCA1-downregulating compounds to enhance PARP inhibitor sensitivity. iScience 2024; 27:110180. [PMID: 38993666] [PMCID: PMC11238136] [DOI: 10.1016/j.isci.2024.110180]
Abstract
PARP inhibitors (PARPi) are efficacious in BRCA1-null tumors; however, their utility is limited in tumors with functional BRCA1. We hypothesized that pharmacologically reducing BRCA1 protein levels could enhance PARPi effectiveness in BRCA1 wild-type tumors. To identify BRCA1 downregulating agents, we generated reporter cell lines using CRISPR-mediated editing to tag endogenous BRCA1 protein with HiBiT. These reporter lines enable the sensitive measurement of BRCA1 protein levels by luminescence. Validated reporter cells were used in a pilot screen of epigenetic-modifying probes and a larger screen of more than 6,000 compounds. We identified 7 compounds that could downregulate BRCA1-HiBiT expression and synergize with olaparib. Three compounds, N-acetyl-N-acetoxy chlorobenzenesulfonamide (NANAC), A-443654, and CHIR-124, were validated to reduce BRCA1 protein levels and sensitize breast cancer cells to the toxic effects of olaparib. These results suggest that BRCA1-HiBiT reporter cells hold promise in developing agents to improve the clinical utility of PARPi.
Affiliation(s)
- Erin Sellars
  - Department of Pharmacology & Toxicology, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Women's College Research Institute, Women's College Hospital, Toronto, ON M5S 1B2, Canada
- Margarita Savguira
  - Department of Pharmacology & Toxicology, University of Toronto, Toronto, ON M5S 1A8, Canada
- Jie Wu
  - Department of Pharmacology & Toxicology, University of Toronto, Toronto, ON M5S 1A8, Canada
- Sabrina Cancelliere
  - Department of Pharmacology & Toxicology, University of Toronto, Toronto, ON M5S 1A8, Canada
- Mark Jen
  - Lunenfeld-Tanenbaum Research Institute, Network Biology Collaborative Centre, High-Throughput Screening, Mt. Sinai Hospital, Sinai Health System, Toronto, ON M5G 1X5, Canada
- Rehna Krishnan
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 1L7, Canada
- Anne Hakem
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 1L7, Canada
- Dalia Barsyte-Lovejoy
  - Department of Pharmacology & Toxicology, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Structural Genomics Consortium, University of Toronto, Toronto, ON M5G 1L7, Canada
- Razqallah Hakem
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 1L7, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
- Steven A Narod
  - Women's College Research Institute, Women's College Hospital, Toronto, ON M5S 1B2, Canada
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
- Joanne Kotsopoulos
  - Department of Pharmacology & Toxicology, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Women's College Research Institute, Women's College Hospital, Toronto, ON M5S 1B2, Canada
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON M5T 3M7, Canada
- Leonardo Salmena
  - Department of Pharmacology & Toxicology, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Women's College Research Institute, Women's College Hospital, Toronto, ON M5S 1B2, Canada
3. Ajetunmobi OH, Wall G, Bonifacio BV, Montelongo-Jauregui D, Lopez-Ribot JL. A 384-Well Microtiter Plate Model for Candida Biofilm Formation and Its Application to High-Throughput Screening. Methods Mol Biol 2023; 2658:53-64. [PMID: 37024695] [DOI: 10.1007/978-1-0716-3155-3_5]
Abstract
Candidiasis, infections caused by Candida spp., represents one of the most common nosocomial infections afflicting an expanding number of compromised patients. Antifungal therapeutic options are few and show limited efficacy. Moreover, biofilm formation is frequently associated with different manifestations of candidiasis and further complicates therapy. Thus, there is an urgent need for new effective therapeutic agents, particularly those with anti-biofilm activity. Here we describe the development of a novel, simple, fast, economical, and highly reproducible 384-well microtiter plate model for the formation of both Candida albicans and Candida auris biofilms and its application in high-throughput screening (HTS) techniques.
Affiliation(s)
- Olabayo H Ajetunmobi
  - Department of Molecular Microbiology and Immunology, and South Texas Center for Emerging Infectious Diseases, The University of Texas at San Antonio, San Antonio, TX, USA
- Gina Wall
  - Department of Molecular Microbiology and Immunology, and South Texas Center for Emerging Infectious Diseases, The University of Texas at San Antonio, San Antonio, TX, USA
- Bruna V Bonifacio
  - Department of Molecular Microbiology and Immunology, and South Texas Center for Emerging Infectious Diseases, The University of Texas at San Antonio, San Antonio, TX, USA
- Daniel Montelongo-Jauregui
  - Department of Molecular Microbiology and Immunology, and South Texas Center for Emerging Infectious Diseases, The University of Texas at San Antonio, San Antonio, TX, USA
- Jose L Lopez-Ribot
  - Department of Molecular Microbiology and Immunology, and South Texas Center for Emerging Infectious Diseases, The University of Texas at San Antonio, San Antonio, TX, USA
4. Peng X, Gibbs E, Silverman JM, Cashman NR, Plotkin SS. A method for systematically ranking therapeutic drug candidates using multiple uncertain screening criteria. Stat Methods Med Res 2021; 30:1502-1522. [PMID: 33847541] [PMCID: PMC8189013] [DOI: 10.1177/09622802211002861]
Abstract
Multiple different screening tests for candidate leads in drug development may often yield conflicting or ambiguous results, sometimes making the selection of leads a nontrivial maximum-likelihood ranking problem. Here, we apply methods from the field of multiple criteria decision making (MCDM) to the problem of screening candidate antibody therapeutics. We employ the SMAA-TOPSIS method to rank a large cohort of antibodies using up to eight weighted screening criteria, in order to find lead candidate therapeutics for Alzheimer's disease and to determine their robustness to uncertainty both in the screening measurements and in the user-defined weights of importance attributed to each screening criterion. To choose lead candidates and measure the confidence in their ranking, we propose two new quantities, the Retention Probability and the Topness, as robust measures for ranking. This method may enable more systematic screening of candidate therapeutics when multivariate screening data become difficult to process intuitively, so that additional candidates may be exposed as potential leads, increasing the likelihood of success in downstream clinical trials. The method properly identifies true positives and true negatives from synthetic data, its predictions correlate well with known clinically approved antibodies versus those still in trials, and it allows for ranking analyses using antibody developability profiles in the literature. We provide a webserver where users can apply the method to their own data: http://bjork.phas.ubc.ca.
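The ranking step itself can be sketched as a plain TOPSIS calculation (the published method wraps this in SMAA-style Monte Carlo sampling over uncertain weights and measurements, which is omitted here; the equal weights and the "higher is better" treatment of all criteria are assumptions).

```python
# Minimal TOPSIS sketch: rows = candidate antibodies, columns = screening criteria.
import numpy as np

def topsis(scores: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Return closeness-to-ideal values in [0, 1]; larger means a better rank."""
    norm = scores / np.linalg.norm(scores, axis=0)   # vector-normalise each criterion
    v = norm * weights                               # weighted normalised matrix
    ideal, anti = v.max(axis=0), v.min(axis=0)       # ideal and anti-ideal points
    d_ideal = np.linalg.norm(v - ideal, axis=1)
    d_anti = np.linalg.norm(v - anti, axis=1)
    return d_anti / (d_ideal + d_anti)

rng = np.random.default_rng(1)
scores = rng.random((20, 8))                         # 20 candidates x 8 criteria (synthetic)
weights = np.full(8, 1 / 8)                          # equal importance (assumption)
closeness = topsis(scores, weights)
print("ranking (best first):", np.argsort(closeness)[::-1])
```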
Affiliation(s)
- Xubiao Peng
  - Department of Physics and Astronomy, University of British Columbia, Vancouver, BC, Canada
- Ebrima Gibbs
  - Brain Research Center, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Judith M Silverman
  - Brain Research Center, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Neil R Cashman
  - Brain Research Center, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Steven S Plotkin
  - Department of Physics and Astronomy, University of British Columbia, Vancouver, BC, Canada
  - Genome Science and Technology Program, University of British Columbia, Vancouver, BC, Canada
5. Guo Q, Pan T, Chen S, Zou X, Huang DY. A Novel Edge Effect Detection Method for Real-Time Cellular Analyzer Using Functional Principal Component Analysis. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:1563-1572. [PMID: 30843848] [DOI: 10.1109/tcbb.2019.2903094]
Abstract
The real-time cellular analyzer (RTCA) is widely applied to test the cytotoxicity of chemicals. However, several factors affect experimental quality. A non-negligible factor is the abnormal time-dependent cellular response curves (TCRCs) of wells located at the edge of the E-plate, a phenomenon defined as the edge effect. In this paper, a novel statistical analysis is proposed to detect the edge effect. First, TCRCs are considered as observations of a random variable in a functional space. Then, functional principal component analysis (FPCA) is adopted to extract the principal component (PC) functions of the TCRCs, and the first and second PC scores of these curves are used to distinguish abnormal TCRCs. The average TCRC of the inner wells with the same culture environment is set as the standard. If the distance between the scoring point of the standard curve and a designated well's scoring point exceeds the defined threshold, the corresponding TCRC is removed automatically. The experimental results demonstrate the effectiveness of the proposed algorithm, which can also serve as a standard method for general time-dependent series problems.
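A rough sketch of the idea, with ordinary PCA on densely sampled curves standing in for full FPCA and with an illustrative calibration of the distance threshold on the inner wells (both are assumptions, not the authors' exact procedure):

```python
# Flag edge wells whose PC scores sit far from the score of the averaged inner-well curve.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
t = np.linspace(0, 48, 97)                       # 48 h of readings (synthetic)
inner = np.array([np.tanh(0.1 * t) + rng.normal(scale=0.02, size=t.size)
                  for _ in range(60)])           # well-behaved inner wells
edge = np.array([0.7 * np.tanh(0.1 * t) + rng.normal(scale=0.02, size=t.size)
                 for _ in range(16)])            # evaporation-biased edge wells
curves = np.vstack([inner, edge])

pca = PCA(n_components=2).fit(curves)
scores = pca.transform(curves)                   # (wells, 2) PC scores
ref = pca.transform(inner.mean(axis=0, keepdims=True))[0]  # "standard" curve score

dist = np.linalg.norm(scores - ref, axis=1)
threshold = dist[:60].max()                      # calibrated on inner wells (assumption)
flagged = np.where(dist > threshold)[0]
print("wells flagged for edge effect:", flagged)
```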
6. Chen L, Wilson K, Goldlust I, Mott BT, Eastman R, Davis MI, Zhang X, McKnight C, Klumpp-Thomas C, Shinn P, Simmons J, Gormally M, Michael S, Thomas CJ, Ferrer M, Guha R. mQC: A Heuristic Quality-Control Metric for High-Throughput Drug Combination Screening. Sci Rep 2016; 6:37741. [PMID: 27883049] [PMCID: PMC5121902] [DOI: 10.1038/srep37741]
Abstract
Quality control (QC) metrics are critical in high-throughput screening (HTS) platforms to ensure reliability and confidence in assay data and downstream analyses. Most reported HTS QC metrics are designed for plate-level or single-well-level analysis. With the advent of high-throughput combination screening, there is a need for QC metrics that quantify the quality of combination response matrices. We introduce a predictive, interpretable, matrix-level QC metric, mQC, based on a mix of data-derived and heuristic features. mQC accurately reproduces the expert assessment of combination response quality and correctly identifies unreliable response matrices that can lead to erroneous or misleading characterization of synergy. When combined with the plate-level QC metric Z', mQC provides a more appropriate determination of the quality of a drug combination screen. Retrospective analysis of a number of completed combination screens further shows that mQC is able to identify problematic screens that plate-level QC could not. In conclusion, our data indicate that mQC is a reliable QC filter that can be used to identify problematic drug combination matrices and prevent further analysis of erroneously active combinations, as well as for troubleshooting failed screens. The R source code of mQC is available at http://matrix.ncats.nih.gov/mQC.
Affiliation(s)
- Lu Chen
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Kelli Wilson
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Ian Goldlust
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Bryan T. Mott
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Richard Eastman
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Mindy I. Davis
  - National Institute of Allergy and Infectious Diseases (NIAID), Rockville, MD 20852, USA
- Xiaohu Zhang
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Crystal McKnight
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Carleen Klumpp-Thomas
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Paul Shinn
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- John Simmons
  - Laboratory of Cancer Biology and Genetics, National Cancer Institute (NCI), Bethesda, MD 20892, USA
- Mike Gormally
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Sam Michael
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Craig J. Thomas
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Marc Ferrer
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
- Rajarshi Guha
  - Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), Rockville, MD 20850, USA
7. Azorsa DO, Turnidge MA, Arora S. Data Analysis for High-Throughput RNAi Screening. Methods Mol Biol 2016; 1470:247-60. [PMID: 27581298] [DOI: 10.1007/978-1-4939-6337-9_19]
Abstract
High-throughput RNA interference (HT-RNAi) screening is an effective technology to help identify important genes and pathways involved in a biological process. Analysis of high-throughput RNAi screening data is a critical part of this technology, and many analysis methods have been described. Here, we summarize the workflow and types of analyses commonly used in high-throughput RNAi screening.
Affiliation(s)
- David O Azorsa
  - Institute of Molecular Medicine, Phoenix Children's Hospital, Phoenix, AZ, USA
  - Department of Child Health, University of Arizona College of Medicine - Phoenix, Phoenix, AZ, USA
- Megan A Turnidge
  - Department of Child Health, University of Arizona College of Medicine - Phoenix, Phoenix, AZ, USA
  - School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Shilpi Arora
  - Constellation Pharmaceuticals, Cambridge, MA, USA
8. Gagarin A, Makarenkov V, Zentilli P. Using Clustering Techniques to Improve Hit Selection in High-Throughput Screening. J Biomol Screen 2006; 11:903-14. [PMID: 17092911] [DOI: 10.1177/1087057106293590]
Abstract
A typical modern high-throughput screening (HTS) operation consists of testing thousands of chemical compounds to select active ones for future detailed examination. The authors describe 3 clustering techniques that can be used to improve the selection of active compounds (i.e., hits). They are designed to identify quality hits in the observed HTS measurements. The considered clustering techniques were first tested on simulated data and then applied to analyze the assay inhibiting Escherichia coli dihydrofolate reductase produced at the HTS laboratory of McMaster University.
Affiliation(s)
- Andrei Gagarin
  - Laboratoire LaCIM, Université du Québec à Montréal, C.P. 8888, succursale Centre-Ville, Montréal, Québec, Canada
9. Kevorkov D, Makarenkov V. Statistical Analysis of Systematic Errors in High-Throughput Screening. J Biomol Screen 2005; 10:557-67. [PMID: 16103415] [DOI: 10.1177/1087057105276989]
Abstract
High-throughput screening (HTS) is an efficient technology for drug discovery. It allows for screening of more than 100,000 compounds a day per screen and requires effective procedures for quality control. The authors have developed a method for evaluating a background surface of an HTS assay; it can be used to correct raw HTS data. This correction is necessary to take into account systematic errors that may affect the procedure of hit selection. The described method allows one to analyze experimental HTS data and determine trends and local fluctuations of the corresponding background surfaces. For an assay with a large number of plates, the deviations of the background surface from a plane are caused by systematic errors. Their influence can be minimized by the subtraction of the systematic background from the raw data. Two experimental HTS assays from the ChemBank database are examined in this article. The systematic error present in these data was estimated and removed from them. It enabled the authors to correct the hit selection procedure for both assays.
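A minimal sketch of the background-surface idea (the z-score-free setup, plate dimensions, and polynomial fit below are assumptions, not the authors' exact procedure): average many plates well by well to expose the systematic surface, fit a smooth polynomial to it, and subtract the fit from the raw data.

```python
# Estimate and subtract a systematic background surface from HTS plate data.
import numpy as np

def fit_surface(mean_plate: np.ndarray, degree: int = 2) -> np.ndarray:
    rows, cols = mean_plate.shape
    r, c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    # Design matrix of polynomial terms in row/column position.
    terms = [r**i * c**j for i in range(degree + 1) for j in range(degree + 1 - i)]
    A = np.stack([term.ravel() for term in terms], axis=1).astype(float)
    coef, *_ = np.linalg.lstsq(A, mean_plate.ravel(), rcond=None)
    return (A @ coef).reshape(rows, cols)

rng = np.random.default_rng(3)
true_bg = np.add.outer(np.linspace(-0.5, 0.5, 8), np.linspace(0.3, -0.3, 12))
plates = [true_bg + rng.normal(scale=1.0, size=(8, 12)) for _ in range(100)]

mean_plate = np.mean(plates, axis=0)           # well-wise mean over the whole assay
background = fit_surface(mean_plate)           # smooth systematic component
corrected = [p - background for p in plates]   # subtract background from raw data
print("residual background range:", np.ptp(np.mean(corrected, axis=0)).round(3))
```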
10. Zhang XD. A New Method with Flexible and Balanced Control of False Negatives and False Positives for Hit Selection in RNA Interference High-Throughput Screening Assays. J Biomol Screen 2007; 12:645-55. [PMID: 17517904] [DOI: 10.1177/1087057107300645]
Abstract
The z-score method and its variants for testing mean difference are commonly used for hit selection in high-throughput screening (HTS) assays. Strictly standardized mean difference (SSMD) offers a way to measure and classify the short interfering RNA (siRNA) effects. In this article, based on SSMD, the authors propose a new testing method for hit selection in RNA interference (RNAi) HTS assays. This SSMD-based method allows the differentiation between siRNAs with large and small effects on the assay output and maintains flexible and balanced control of both the false-negative rate, in which the siRNAs with strong effects are not selected as hits, and the restricted false-positive rate, in which the siRNAs with weak or no effects are selected as hits. This method directly addresses the size of siRNA effects represented by the strength of difference between an siRNA and a negative reference, whereas the classic z-score method and t-test of testing no mean difference address whether the mean of an siRNA is exactly the same as the mean of a negative reference. This method can readily control the false-negative rate, whereas it is nontrivial for the classic z-score method and t-test to control the false-negative rate. Therefore, theoretically, the SSMD-based method offers better control of the sizes of siRNA effects and the associated false-positive and false-negative rates than the commonly used z-score method and t-test for hit selection in HTS assays. The SSMD-based method should generally be applicable to any assay in which the end point is a difference in signal compared to a reference sample, including those for RNAi, receptor, enzyme, and cellular function. (Journal of Biomolecular Screening 2007:645-655)
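A minimal sketch of SSMD-based scoring against a negative reference (a simple method-of-moments estimate with an assumed |SSMD| >= 3 cutoff for "strong effect"; the paper derives more refined estimators and decision thresholds that are not reproduced here):

```python
# SSMD-based hit selection with replicate measurements (illustrative data).
import numpy as np

def ssmd(sample: np.ndarray, negative_ref: np.ndarray) -> float:
    """SSMD estimate: mean difference scaled by the SD of that difference."""
    diff = sample.mean() - negative_ref.mean()
    return diff / np.sqrt(sample.var(ddof=1) + negative_ref.var(ddof=1))

rng = np.random.default_rng(4)
negative = rng.normal(loc=100, scale=10, size=24)           # negative-control wells
sirnas = {f"siRNA_{i}": rng.normal(loc=mu, scale=10, size=3)
          for i, mu in enumerate([100, 95, 70, 140, 101])}   # triplicate readings

for name, values in sirnas.items():
    beta = ssmd(values, negative)
    call = "hit (strong effect)" if abs(beta) >= 3 else "non-hit"
    print(f"{name}: SSMD = {beta:5.2f} -> {call}")
```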
Affiliation(s)
- Xiaohua Douglas Zhang
  - Biometrics Research, Merck Research Laboratories, West Point, Pennsylvania 19486, USA
11. Zhai Y, Chen K, Zhong Y, Zhou B, Ainscow E, Wu YT, Zhou Y. An Automatic Quality Control Pipeline for High-Throughput Screening Hit Identification. J Biomol Screen 2016; 21:832-41. [PMID: 27313114] [DOI: 10.1177/1087057116654274]
Abstract
The correction or removal of signal errors in high-throughput screening (HTS) data is critical to the identification of high-quality lead candidates. Although a number of strategies have been previously developed to correct systematic errors and to remove screening artifacts, they are not universally effective and still require a fair amount of human intervention. We introduce a fully automated quality control (QC) pipeline that can correct generic interplate systematic errors and remove intraplate random artifacts. The new pipeline was first applied to ~100 large-scale historical HTS assays; in silico analysis showed that auto-QC led to a noticeably stronger structure-activity relationship. The method was further tested in several independent HTS runs, where QC results were sampled for experimental validation. Significantly increased hit confirmation rates were obtained after the QC steps, confirming that the proposed method was effective in enriching true-positive hits. An implementation of the algorithm is available to the screening community.
Affiliation(s)
- Yufeng Zhai
  - Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA
- Kaisheng Chen
  - Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA
- Yang Zhong
  - Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA
- Bin Zhou
  - Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA
- Edward Ainscow
  - Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA
- Ying-Ta Wu
  - Genomics Research Center, Academia Sinica, Nankang, Taipei, Taiwan
- Yingyao Zhou
  - Genomics Institute of the Novartis Research Foundation, San Diego, CA, USA
12. Gubler H. High-Throughput Screening Data Analysis. In: Nonclinical Statistics for Pharmaceutical and Biotechnology Industries. 2016. [DOI: 10.1007/978-3-319-23558-5_5]
13. Ekins S, Clark AM, Swamidass SJ, Litterman N, Williams AJ. Bigger data, collaborative tools and the future of predictive drug discovery. J Comput Aided Mol Des 2014; 28:997-1008. [PMID: 24943138] [PMCID: PMC4198464] [DOI: 10.1007/s10822-014-9762-y]
Abstract
Over the past decade we have seen a growth in the provision of chemistry data and cheminformatics tools, either as free websites or as commercial software-as-a-service offerings. These have transformed how we find molecule-related data and use such tools in our research. There have also been efforts to improve collaboration between researchers, either openly or through secure transactions using commercial tools. A major future challenge will be how such databases and software handle the larger amounts of data that accumulate from high-throughput screening while still enabling the user to draw insights, make predictions and move projects forward. We now discuss how information from some drug discovery datasets can be made more accessible and how privacy of data should not overwhelm the desire to share it at an appropriate time with collaborators. We also discuss additional software tools that could be made available and provide our thoughts on the future of predictive drug discovery in this age of big data. We use some examples from our own research on neglected diseases, collaborations, mobile apps and algorithm development to illustrate these ideas.
Affiliation(s)
- Sean Ekins
  - Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC 27526, USA
14. Dahlin JL, Walters MA. The essential roles of chemistry in high-throughput screening triage. Future Med Chem 2014; 6:1265-90. [PMID: 25163000] [PMCID: PMC4465542] [DOI: 10.4155/fmc.14.60]
Abstract
It is increasingly clear that academic high-throughput screening (HTS) and virtual HTS triage suffer from a lack of scientists trained in the art and science of early drug discovery chemistry. Many recent publications report screening-derived compounds that are most likely artifacts or promiscuous bioactive compounds, and these results are not placed in the context of previous studies. For HTS to be most successful, it is our contention that there must be an early partnership between biologists and medicinal chemists. Their combined skill sets are necessary to design robust assays and efficient workflows that will weed out assay artifacts, false positives, promiscuous bioactive compounds and intractable screening hits, efforts that ultimately give projects a better chance at identifying truly useful chemical matter. Expertise in medicinal chemistry, cheminformatics and purification sciences (analytical chemistry) can enhance the post-HTS triage process by quickly removing these problematic chemotypes from consideration, while simultaneously prioritizing the more promising chemical matter for follow-up testing. It is only when biologists and chemists collaborate effectively that HTS can manifest its full promise.
Affiliation(s)
- Jayme L Dahlin
  - Department of Molecular Pharmacology & Experimental Therapeutics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
  - Medical Scientist Training Program, Mayo Clinic College of Medicine, Rochester, MN 55905, USA
- Michael A Walters
  - Institute for Therapeutics Discovery & Development, University of Minnesota, Minneapolis, MN 55414, USA
15. Epinat JC. A yeast-based recombination assay for homing endonuclease activity. Methods Mol Biol 2014; 1123:105-26. [PMID: 24510264] [DOI: 10.1007/978-1-62703-968-0_9]
Abstract
Homing endonucleases (HEs) are natural enzymes that cleave long DNA targets with high specificity and trigger homologous recombination at the exact site of the break. Such mechanisms can thus be used for all the applications covered today by the generic name of "genome engineering": targeted sequence insertion, removal, or editing. However, before these applications can be addressed, the engineering of HEs must be mastered so that any potential target can be efficiently and specifically recognized and cleaved. Working on the I-CreI model, we have developed a very powerful platform to generate HEs with new, tailored specificity. We have put in place the first in vivo, functional, high-throughput assay to generate I-CreI variants and measure their activity. We use semi-rational design combined with proprietary in silico predictions to design and synthesize I-CreI mutants that are tested for their capacity to induce homologous recombination in a yeast cell. The process has been standardized and robotized so that we can generate thousands of I-CreI derivatives, characterize their cleavage profiles, and deliver them for further applications in the research, therapeutic, or agribusiness fields.
16. Amberkar S, Kiani NA, Bartenschlager R, Alvisi G, Kaderali L. High-throughput RNA interference screens integrative analysis: Towards a comprehensive understanding of the virus-host interplay. World J Virol 2013; 2:18-31. [PMID: 24175227] [PMCID: PMC3785050] [DOI: 10.5501/wjv.v2.i2.18]
Abstract
Viruses are extremely heterogeneous entities; the size and the nature of their genetic information, as well as the strategies employed to amplify and propagate their genomes, are highly variable. However, as obligate intracellular parasites, all viruses rely on the host cell for replication. Having co-evolved with their hosts for millions of years, viruses have developed very sophisticated strategies to hijack cellular factors that promote virus uptake, replication, and spread. Identification of the host cell factors (HCFs) required for these processes is a major challenge for researchers, but it enables the identification of new, highly selective targets for antiviral therapeutics. To this end, the establishment of platforms enabling genome-wide high-throughput RNA interference (HT-RNAi) screens has led to the identification of several key factors involved in the viral life cycle, and a number of genome-wide HT-RNAi screens have been performed for major human pathogens. These studies enable the first inter-viral comparisons of HCF requirements. Although several cellular functions appear to be uniformly required for the life cycle of most viruses tested (such as the proteasome and the Golgi-mediated secretory pathways), some factors, like the lipid kinase phosphatidylinositol 4-kinase IIIα in the case of hepatitis C virus, are selectively required by individual viruses. However, despite the amount of data available, we are still far from a comprehensive understanding of the interplay between viruses and host factors. Major limitations towards this goal are the low sensitivity and specificity of such screens, which result in limited overlap between different screens performed with the same virus. This review focuses on how statistical and bioinformatic analysis methods applied to HT-RNAi screens can help overcome these issues, thus increasing the reliability and impact of such studies.
17. Zhang XD, Zhang Z. displayHTS: a R package for displaying data and results from high-throughput screening experiments. Bioinformatics 2013; 29:794-6. [DOI: 10.1093/bioinformatics/btt060]
18.
Abstract
Background High-throughput RNA interference (RNAi) screening has become a widely used approach to elucidating gene functions. However, analysis and annotation of the large data sets generated by these screens has been a challenge for researchers without a programming background. Over the years, numerous data analysis methods have been developed for plate quality control and hit selection and implemented in a few open-access software packages. Recently, strictly standardized mean difference (SSMD) has become a widely used method for RNAi screening analysis, mainly due to its better control of false-negative and false-positive rates and its ability to quantify RNAi effects on a statistical basis. We have developed GUItars to enable researchers without a programming background to use SSMD as both a plate-quality and a hit-selection metric to analyze large data sets. Results The software is accompanied by an intuitive graphical user interface for an easy and rapid analysis workflow. SSMD analysis methods are provided to the users along with the traditionally used z-score, normalized percent activity, and t-test methods for hit selection. GUItars is capable of analyzing large-scale data sets from screens with or without replicates. The software is designed to automatically generate and save numerous graphical outputs known to be among the most informative high-throughput data visualizations, capturing plate-wise and screen-wise performance. Graphical outputs are also written in HTML format for easy access, and a comprehensive summary of screening results is written to tab-delimited output files. Conclusion With GUItars, we demonstrated a robust SSMD-based analysis workflow on a 3840-gene small interfering RNA (siRNA) library and identified 200 siRNAs that increased and 150 siRNAs that decreased the assay activities with moderate to stronger effects. GUItars enables rapid analysis and illustration of data from large- or small-scale RNAi screens using SSMD and other traditional analysis methods. The software is freely available at http://sourceforge.net/projects/guitars/.
Affiliation(s)
- Asli N Goktug
  - Department of Chemical Biology and Therapeutics, St. Jude Children's Research Hospital, Memphis, TN, USA
19. Zhang Z, Guan N, Li T, Mais DE, Wang M. Quality control of cell-based high-throughput drug screening. Acta Pharm Sin B 2012. [DOI: 10.1016/j.apsb.2012.03.006]
20. Bushway PJ, Azimi B, Heynen-Genel S. Optimization and application of median filter corrections to relieve diverse spatial patterns in microtiter plate data. J Biomol Screen 2011; 16:1068-80. [PMID: 21900202] [DOI: 10.1177/1087057111419028]
Abstract
The standard (STD) 5 × 5 hybrid median filter (HMF) was previously described as a nonparametric local background estimator for spatially arrayed microtiter plate (MTP) data. As such, the HMF is a useful tool for mitigating global and sporadic systematic error in MTP data arrays. Presented here is the first known HMF correction of a primary screen suffering from systematic error best described as gradient vectors. Application of the STD 5 × 5 HMF to the primary screen raw data reduced background signal deviation, thereby improving the assay dynamic range and hit confirmation rate. While this HMF can correct gradient vectors, it does not properly correct periodic patterns that may appear in other screening campaigns. To address this issue, 1 × 7 median and row/column 5 × 5 hybrid median filter kernels (1 × 7 MF and RC 5 × 5 HMF) were designed ad hoc to better fit periodic error patterns. The correction data show that periodic error in simulated MTP data arrays is reduced by these alternative filter designs and that multiple corrective filters can be combined in serial operations for progressive reduction of complex error patterns in an MTP data array.
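A sketch of a 5 × 5 hybrid median filter used as a local background estimator for a plate-shaped array (re-centring the corrected values on the plate median is my assumption, not necessarily the published correction step):

```python
# Hybrid median filter: median of (cross median, diagonal median, centre value).
import numpy as np

def hybrid_median_5x5(plate: np.ndarray) -> np.ndarray:
    padded = np.pad(plate, 2, mode="edge")
    out = np.empty_like(plate, dtype=float)
    k = np.arange(-2, 3)
    for i in range(plate.shape[0]):
        for j in range(plate.shape[1]):
            pi, pj = i + 2, j + 2
            cross = np.concatenate([padded[pi + k, pj], padded[pi, pj + k]])
            diag = np.concatenate([padded[pi + k, pj + k], padded[pi + k, pj - k]])
            out[i, j] = np.median([np.median(cross), np.median(diag), padded[pi, pj]])
    return out

rng = np.random.default_rng(5)
plate = rng.normal(100, 5, size=(16, 24))
plate += np.linspace(0, 20, 24)[None, :]           # simulated left-to-right gradient

background = hybrid_median_5x5(plate)
corrected = plate - background + np.median(plate)  # remove gradient, keep overall scale
print("column drift before/after:",
      round(np.ptp(plate.mean(axis=0)), 1), round(np.ptp(corrected.mean(axis=0)), 1))
```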
Affiliation(s)
- Paul J Bushway
  - Sanford-Burnham Medical Research Institute, La Jolla, CA 92037, USA
21. Eberhard Y, Gronda M, Hurren R, Datti A, MacLean N, Ketela T, Moffat J, Wrana JL, Schimmer AD. Inhibition of SREBP1 sensitizes cells to death ligands. Oncotarget 2011; 2:186-96. [PMID: 21406729] [PMCID: PMC3260812] [DOI: 10.18632/oncotarget.239]
Abstract
Evasion of death receptor ligand-induced apoptosis contributes to cancer development and progression. To better understand mechanisms conferring resistance to death ligands, we screened an siRNA library to identify sequences that sensitize resistant cells to a Fas-activating antibody (CH-11). From this screen, we identified Sterol Regulatory Element-Binding Protein 1 (SREBP1), a transcription factor that regulates genes involved in cholesterol and fatty acid synthesis, including fatty acid synthase. Inhibition of SREBP1 sensitized PPC-1 and HeLa cells to the death receptor ligands CH-11 and TRAIL. In contrast, DU145 prostate cancer cells, which are resistant to death ligands despite expressing the receptors on their cell surface, remained resistant to CH-11 and TRAIL after knockdown of SREBP1. Consistent with the effects on cell viability, the addition of CH-11 activated caspases 3 and 8 in HeLa but not DU145 cells with silenced SREBP1. We demonstrated that knockdown of SREBP1 produced a marked decrease in fatty acid synthase expression. Furthermore, genetic or chemical inhibition of fatty acid synthase with shRNA or orlistat, respectively, recapitulated the effects of SREBP1 inhibition and sensitized HeLa but not DU145 cells to CH-11 and TRAIL. Sensitization to death receptor ligands by inhibition of fatty acid synthase was associated with activation of caspase 8 prior to caspase 9. Silencing of neither SREBP1 nor fatty acid synthase changed basal expression of the core death receptor components Fas, caspase 8, FADD, caspase 3 or FLIP. Thus, inhibition of SREBP1 or its downstream target fatty acid synthase sensitizes resistant cells to death ligands.
Affiliation(s)
- Yanina Eberhard
  - Princess Margaret Hospital, Ontario Cancer Institute, Toronto, ON, Canada
22. Dragiev P, Nadon R, Makarenkov V. Systematic error detection in experimental high-throughput screening. BMC Bioinformatics 2011; 12:25. [PMID: 21247425] [PMCID: PMC3034671] [DOI: 10.1186/1471-2105-12-25]
Abstract
Background High-throughput screening (HTS) is a key part of the drug discovery process during which thousands of chemical compounds are screened and their activity levels measured in order to identify potential drug candidates (i.e., hits). Many technical, procedural or environmental factors can cause systematic measurement error or inequalities in the conditions in which the measurements are taken. Such systematic error has the potential to critically affect the hit selection process. Several error correction methods and software packages have been developed to address this issue in the context of experimental HTS [1-7]. Despite their power to reduce the impact of systematic error when applied to error-perturbed datasets, those methods also have one disadvantage: they introduce a bias when applied to data not containing any systematic error [6]. Hence, we need first to assess the presence of systematic error in a given HTS assay and then apply a systematic error correction method if and only if the presence of systematic error has been confirmed by statistical tests. Results We tested three statistical procedures to assess the presence of systematic error in experimental HTS data: the χ2 goodness-of-fit test, Student's t-test and the Kolmogorov-Smirnov test [8] preceded by the Discrete Fourier Transform (DFT) method [9]. We applied these procedures first to raw HTS measurements and second to estimated hit distribution surfaces. The three competing tests were applied to analyse simulated datasets containing different types of systematic error, and to a real HTS dataset. Their accuracy was compared under various error conditions. Conclusions A successful assessment of the presence of systematic error in experimental HTS assays is possible when the appropriate statistical methodology is used. Namely, the t-test should be carried out by researchers to determine whether systematic error is present in their HTS data prior to applying any error correction method. This important step can significantly improve the quality of selected hits.
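One simple variant of the recommended "test before correcting" step, assuming Welch t-tests of each row and column against the rest of the plate with a Bonferroni adjustment (the paper evaluates the χ2, t- and Kolmogorov-Smirnov tests in more detail than this sketch):

```python
# Test each row/column for systematic deviation before applying any correction.
import numpy as np
from scipy import stats

def systematic_error_pvalues(plate: np.ndarray) -> dict:
    pvals = {}
    for axis, label in ((0, "row"), (1, "col")):
        for idx in range(plate.shape[axis]):
            line = np.take(plate, idx, axis=axis).ravel()
            rest = np.delete(plate, idx, axis=axis).ravel()
            pvals[f"{label} {idx}"] = stats.ttest_ind(line, rest, equal_var=False).pvalue
    return pvals

rng = np.random.default_rng(6)
plate = rng.normal(0, 1, size=(8, 12))
plate[0, :] += 1.5                                  # inject a biased edge row

pvals = systematic_error_pvalues(plate)
alpha = 0.05 / len(pvals)                           # Bonferroni correction
flagged = [k for k, p in pvals.items() if p < alpha]
print("locations showing systematic error:", flagged)
# Only if this list is non-empty would an error-correction method be applied.
```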
Affiliation(s)
- Plamen Dragiev
  - Département d'informatique, Université du Québec à Montréal, Montreal (QC) H3C 3P8, Canada
23. Shun TY, Lazo JS, Sharlow ER, Johnston PA. Identifying actives from HTS data sets: practical approaches for the selection of an appropriate HTS data-processing method and quality control review. J Biomol Screen 2010; 16:1-14. [PMID: 21160066] [DOI: 10.1177/1087057110389039]
Abstract
High-throughput screening (HTS) has achieved a dominant role in drug discovery over the past 2 decades. The goal of HTS is to identify active compounds (hits) by screening large numbers of diverse chemical compounds against selected targets and/or cellular phenotypes. The HTS process consists of multiple automated steps involving compound handling, liquid transfers, and assay signal capture, all of which unavoidably contribute to systematic variation in the screening data. The challenge is to distinguish biologically active compounds from assay variability. Traditional plate-controls-based and non-controls-based statistical methods have been widely used for HTS data processing and active identification by both the pharmaceutical industry and academic sectors. More recently, improved robust statistical methods have been introduced, reducing the impact of systematic row/column effects in HTS data. To apply such robust methods effectively and properly, we need to understand their necessity and functionality. Data from 6 HTS case histories are presented to illustrate that robust statistical methods may sometimes be misleading and can result in more, rather than fewer, false positives or false negatives. In practice, no single method is the best hit-detection method for every HTS data set. However, to aid the selection of the most appropriate HTS data-processing and active-identification methods, the authors developed a 3-step statistical decision methodology. Step 1 is to determine the most appropriate HTS data-processing method and establish criteria for quality control review and active identification from 3-day assay signal window and DMSO validation tests. Step 2 is to perform a multilevel statistical and graphical review of the screening data to exclude data that fall outside the quality control criteria. Step 3 is to apply the established active criterion to the quality-assured data to identify the active compounds.
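For orientation, the two families of data-processing methods being compared can be illustrated generically; the 50% inhibition and 3-robust-Z cutoffs below are assumptions used only for the demonstration, not the paper's criteria.

```python
# Plate-controls-based (percent inhibition) vs. non-controls-based (robust Z) processing.
import numpy as np

rng = np.random.default_rng(7)
samples = rng.normal(1000, 60, size=320)
samples[[10, 55]] = [350, 420]                 # two genuinely active wells
max_ctrl = rng.normal(1000, 60, size=16)       # uninhibited (DMSO) control wells
min_ctrl = rng.normal(100, 20, size=16)        # fully inhibited control wells

pct_inhibition = 100 * (max_ctrl.mean() - samples) / (max_ctrl.mean() - min_ctrl.mean())

mad = np.median(np.abs(samples - np.median(samples)))
robust_z = (samples - np.median(samples)) / (1.4826 * mad)

hits_controls = np.where(pct_inhibition > 50)[0]
hits_robust = np.where(robust_z < -3)[0]
print("controls-based hits:", hits_controls, "robust Z hits:", hits_robust)
```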
Affiliation(s)
- Tong Ying Shun
  - University of Pittsburgh Drug Discovery Institute, Pittsburgh, PA, USA
24. Malo N, Hanley JA, Carlile G, Liu J, Pelletier J, Thomas D, Nadon R. Experimental Design and Statistical Methods for Improved Hit Detection in High-Throughput Screening. J Biomol Screen 2010; 15:990-1000. [DOI: 10.1177/1087057110377497]
Abstract
Identification of active compounds in high-throughput screening (HTS) contexts can be substantially improved by applying classical experimental design and statistical inference principles to all phases of HTS studies. The authors present both experimental and simulated data to illustrate how true-positive rates can be maximized without increasing false-positive rates by the following analytical process. First, the use of robust data preprocessing methods reduces unwanted variation by removing row, column, and plate biases. Second, replicate measurements allow estimation of the magnitude of the remaining random error and the use of formal statistical models to benchmark putative hits relative to what is expected by chance. Receiver Operating Characteristic (ROC) analyses revealed superior power for data preprocessed by a trimmed-mean polish method combined with the RVM t-test, particularly for small- to moderate-sized biological hits.
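The row/column bias removal step can be sketched with Tukey's median polish, the basis of the B score (the paper's preferred variant substitutes a trimmed mean for the median, which is an easy swap in the function below):

```python
# Two-way polish to remove row and column biases before hit scoring.
import numpy as np

def median_polish(plate: np.ndarray, n_iter: int = 10) -> np.ndarray:
    """Return residuals after iteratively removing row and column effects."""
    resid = plate.astype(float).copy()
    for _ in range(n_iter):
        resid -= np.median(resid, axis=1, keepdims=True)   # row effects
        resid -= np.median(resid, axis=0, keepdims=True)   # column effects
    return resid

rng = np.random.default_rng(8)
plate = rng.normal(0, 1, size=(16, 24))
plate += np.linspace(0, 2, 16)[:, None]       # row bias
plate[:, 0] += 1.5                            # a biased column
plate[4, 7] += 6.0                            # one true hit

resid = median_polish(plate)
bscore = resid / (1.4826 * np.median(np.abs(resid)))   # scale residuals by their MAD
print("top well by B score:", np.unravel_index(np.argmax(bscore), bscore.shape))
```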
Affiliation(s)
- Nathalie Malo
  - McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, Canada
  - Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Quebec, Canada
- James A. Hanley
  - Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Quebec, Canada
- Graeme Carlile
  - Department of Biochemistry, McGill University, Montreal, Quebec, Canada
- Jing Liu
  - Department of Biochemistry, McGill University, Montreal, Quebec, Canada
- Jerry Pelletier
  - Department of Biochemistry, McGill University, Montreal, Quebec, Canada
- David Thomas
  - Department of Biochemistry, McGill University, Montreal, Quebec, Canada
- Robert Nadon
  - McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, Canada
  - Department of Human Genetics, McGill University, Montreal, Quebec, Canada
25. Birmingham A, Selfors LM, Forster T, Wrobel D, Kennedy CJ, Shanks E, Santoyo-Lopez J, Dunican DJ, Long A, Kelleher D, Smith Q, Beijersbergen RL, Ghazal P, Shamu CE. Statistical methods for analysis of high-throughput RNA interference screens. Nat Methods 2009; 6:569-75. [PMID: 19644458] [PMCID: PMC2789971] [DOI: 10.1038/nmeth.1351]
Abstract
RNA interference (RNAi) has become a powerful technique for reverse genetics and drug discovery, and in both of these areas large-scale high-throughput RNAi screens are commonly performed. The statistical techniques used to analyze these screens are frequently borrowed directly from small-molecule screening; however, small-molecule and RNAi data characteristics differ in meaningful ways. We examine the similarities and differences between RNAi and small-molecule screens, highlighting particular characteristics of RNAi screen data that must be addressed during analysis. Additionally, we provide guidance on selection of analysis techniques in the context of a sample workflow.
26. Chelation of intracellular iron with the antifungal agent ciclopirox olamine induces cell death in leukemia and myeloma cells. Blood 2009; 114:3064-73. [PMID: 19589922] [DOI: 10.1182/blood-2009-03-209965]
Abstract
Off-patent drugs with previously unrecognized anticancer activity could be rapidly repurposed for this new indication. To identify such compounds, we conducted 2 independent cell-based chemical screens and identified the antimicrobial ciclopirox olamine (CPX) in both screens. CPX decreased cell growth and viability of malignant leukemia, myeloma, and solid tumor cell lines as well as primary AML patient samples at low-micromolar concentrations that appear pharmacologically achievable. Furthermore, oral CPX decreased tumor weight and volume in 3 mouse models of leukemia by up to 65% compared with control without evidence of weight loss or gross organ toxicity. In addition, oral CPX prevented the engraftment of primary AML cells in nonobese diabetic/severe combined immunodeficiency mouse models, thereby establishing its ability to target leukemia stem cells. Mechanistically, CPX bound intracellular iron, and this intracellular iron chelation was functionally important for its cytotoxicity. By electron paramagnetic resonance, CPX inhibited the iron-dependent enzyme ribonucleotide reductase at concentrations associated with cell death. Thus, in summary, CPX has previously unrecognized anticancer activity at concentrations that are pharmacologically achievable. Therefore, CPX could be rapidly repurposed for the treatment of malignancies, including leukemia and myeloma.
27.
Abstract
An overview of the characteristics of classical and outlier-resistant data summaries is provided. The latter are important because outlier data can skew results and the decisions based on them. These simple data summaries are the basis for all composite assay and screening data quality measures, for example the signal-to-noise ratio, the signal-to-background ratio, the assay and screening window coefficients Z' and Z, and the strictly standardized mean difference (SSMD). In addition to the measures of assay reliability, which are based on assessing the size of the "signal window," some measures for characterizing the degree of agreement of repeated measurements are also outlined.
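The composite measures named above have textbook definitions that are easy to compute from control wells; outlier-resistant variants would substitute median/MAD for the mean/SD used in this sketch.

```python
# Common assay-quality measures computed from positive- and negative-control wells.
import numpy as np

rng = np.random.default_rng(9)
pos = rng.normal(1000, 50, size=32)   # positive-control signal
neg = rng.normal(200, 30, size=32)    # negative-control (background) signal

signal_to_background = pos.mean() / neg.mean()
signal_to_noise = (pos.mean() - neg.mean()) / neg.std(ddof=1)
z_prime = 1 - 3 * (pos.std(ddof=1) + neg.std(ddof=1)) / abs(pos.mean() - neg.mean())
ssmd = (pos.mean() - neg.mean()) / np.sqrt(pos.var(ddof=1) + neg.var(ddof=1))

print(f"S/B = {signal_to_background:.1f}, S/N = {signal_to_noise:.1f}, "
      f"Z' = {z_prime:.2f}, SSMD = {ssmd:.1f}")
```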
Affiliation(s)
- Hanspeter Gubler
  - NIBR IT and Automation Services, Novartis Institutes for BioMedical Research (NIBR), Basel, Switzerland
28.
Abstract
Screening is about making decisions on the modulating activity of a particular compound on a biological system. When a compound testing experiment is repeated under the same conditions, or as close to the same conditions as possible, the observed results are never exactly the same, and there is an apparent random and uncontrolled source of variability in the system under study. Nevertheless, randomness is not haphazard. In this context, we can see statistics as the science of decision making under uncertainty. Thus, the use of statistical tools in the analysis of screening experiments is the right approach to the interpretation of screening data, with the aim of making them meaningful and converting them into valuable information that supports sound decision making. In the HTS workflow, there are at least three key stages where key decisions have to be made based on experimental data: (1) assay development (i.e., how to assess whether our assay is good enough to be put into screening production for the identification of modulators of the target of interest), (2) the HTS campaign process (i.e., monitoring that the screening process is performing at the expected quality and assessing possible patterned signs of experimental response that may adversely bias and mislead hit identification) and (3) analysis of primary HTS data (i.e., flagging which compounds give a positive response in the assay, namely hit identification). In this chapter we focus on how statistical tools can help to cope with these three aspects. Assessment of assay quality is reviewed in other chapters, so in Section 1 we only make some further considerations. Section 2 reviews statistical process control, Section 3 covers methodologies for detecting and dealing with HTS patterns and Section 4 describes approaches for statistically guided selection of hits in HTS.
Affiliation(s)
- Isabel Coma
  - Molecular Discovery Research, GlaxoSmithKline, Tres Cantos, Madrid, Spain
29. Coma I, Clark L, Diez E, Harper G, Herranz J, Hofmann G, Lennon M, Richmond N, Valmaseda M, Macarron R. Process Validation and Screen Reproducibility in High-Throughput Screening. J Biomol Screen 2009; 14:66-76. [DOI: 10.1177/1087057108326664]
Abstract
The use of large-scale compound screening has become a key component of drug discovery projects in both the pharmaceutical and the biotechnological industries. More recently, these activities have also been embraced by the academic community as a major tool for chemical genomic activities. High-throughput screening (HTS) activities constitute a major step in the initial drug discovery efforts and involve the use of large quantities of biological reagents, hundreds of thousands to millions of compounds, and the utilization of expensive equipment. All these factors make it very important to evaluate in advance of the HTS campaign any potential issues related to reproducibility of the experimentation and the quality of the results obtained at the end of these very costly activities. In this article, the authors describe how GlaxoSmithKline (GSK) has addressed the need of a true validation of the HTS process before embarking in full HTS campaigns. They present 2 different aspects of the so-called validation process: (1) optimization of the HTS workflow and its validation as a quality process and (2) the statistical evaluation of the HTS, focusing on the reproducibility of results and the ability to distinguish active from nonactive compounds in a vast collection of samples. The authors describe a variety of reproducibility indexes that are either innovative or have been adapted from generic medical diagnostic screening strategies. In addition, they exemplify how these validation tools have been implemented in a number of case studies at GSK. ( Journal of Biomolecular Screening 2009:66-76)
Affiliation(s)
- Isabel Coma
  - GlaxoSmithKline R&D Pharmaceuticals, Screening and Compound Profiling, Tres Cantos, Spain
- Liz Clark
  - Screening and Compound Profiling, Harlow, UK
- Emilio Diez
  - GlaxoSmithKline R&D Pharmaceuticals, Screening and Compound Profiling, Tres Cantos, Spain
- Gavin Harper
  - Computational and Structural Chemistry, Stevenage, UK
- Jesus Herranz
  - Computational and Structural Chemistry, Tres Cantos, Spain
- Glenn Hofmann
  - Screening and Compound Profiling, Upper Providence, Collegeville, Pennsylvania
- Ricardo Macarron
  - Compound Management, Upper Providence, Collegeville, Pennsylvania
Collapse
|
30
|
Zhang XD. Novel analytic criteria and effective plate designs for quality control in genome-scale RNAi screens. J Biomol Screen 2008; 13:363-77. [PMID: 18567841 DOI: 10.1177/1087057108317062] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
One of the most fundamental challenges in genome-wide RNA interference (RNAi) screens is to glean biological significance from mounds of data, which relies on the development and adoption of appropriate analytic methods and designs for quality control (QC) and hit selection. Currently, a Z-factor-based QC criterion is widely used to evaluate data quality. However, this criterion cannot take into account the fact that different positive controls may have different effect sizes and leads to inconsistent QC results in experiments with 2 or more positive controls with different effect sizes. In this study, based on a recently proposed parameter, strictly standardized mean difference (SSMD), novel QC criteria are constructed for evaluating data quality in genome-wide RNAi screens. Two good features of these novel criteria are: (1) SSMD has both clear original and probability meanings for evaluating the differentiation between positive and negative controls and hence the SSMD-based QC criteria have a solid probabilistic and statistical basis, and (2) these QC criteria obtain consistent QC results for multiple positive controls with different effect sizes. In addition, I propose multiple plate designs and the guidelines for using them in genome-wide RNAi screens. Finally, I provide strategies for using the SSMD-based QC criteria and effective plate design together to improve data quality. The novel SSMD-based QC criteria, effective plate designs, and related guidelines and strategies may greatly help to obtain high quality of data in genome-wide RNAi screens.
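For reference, the sketch below is the usual moment-based estimate of SSMD between a positive- and a negative-control population; the SSMD-based QC cutoffs derived in the article are not reproduced here, and the independence assumption in the denominator is the standard simplification.

    import numpy as np

    def ssmd(positive, negative):
        """Strictly standardized mean difference: the difference of the control
        means divided by the standard deviation of that difference, assuming
        the two control populations are independent."""
        p = np.asarray(positive, dtype=float)
        n = np.asarray(negative, dtype=float)
        return (p.mean() - n.mean()) / np.sqrt(p.var(ddof=1) + n.var(ddof=1))

Because SSMD scales the mean difference by the combined variability of both controls, a strong and a moderate positive control can be judged on the same probabilistic footing, which is the consistency property emphasized in the abstract.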
Affiliation(s)
- Xiaohua Douglas Zhang
- Biometrics Research, Merck Research Laboratories, West Point, Pennsylvania 19486, USA.
|
31
|
Zhang XD, Kuan PF, Ferrer M, Shu X, Liu YC, Gates AT, Kunapuli P, Stec EM, Xu M, Marine SD, Holder DJ, Strulovici B, Heyse JF, Espeseth AS. Hit selection with false discovery rate control in genome-scale RNAi screens. Nucleic Acids Res 2008; 36:4667-79. [PMID: 18628291 PMCID: PMC2504311 DOI: 10.1093/nar/gkn435] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
RNA interference (RNAi) is a modality in which small double-stranded RNA molecules (siRNAs) designed to lead to the degradation of specific mRNAs are introduced into cells or organisms. siRNA libraries have been developed in which siRNAs targeting virtually every gene in the human genome are designed, synthesized and presented for introduction into cells by transfection in a microtiter plate array. These siRNAs can then be transfected into cells using high-throughput screening (HTS) methodologies. The goal of RNAi HTS is to identify a set of siRNAs that inhibit or activate defined cellular phenotypes. Commonly used analysis methods, including median +/- k MAD, raise issues about error rates in multiple hypothesis testing and about plate-wise versus experiment-wise analysis. We propose a methodology based on a Bayesian framework to address these issues. Our approach allows for sharing of information across plates in a plate-wise analysis, which obviates the need for choosing either a plate-wise or experiment-wise analysis. The proposed approach incorporates information from reliable controls to achieve higher power and a balance between the contributions from the sample and control wells. Our approach provides false discovery rate (FDR) control to address multiple testing issues, and it is robust to outliers.
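The Bayesian model itself is not reproduced here, but the sketch below shows what false discovery rate control means operationally for hit selection, using the standard Benjamini-Hochberg step-up procedure on per-siRNA p-values; the p-values are assumed to come from whatever plate-wise or experiment-wise test is applied upstream.

    import numpy as np

    def bh_hits(pvals, fdr=0.05):
        """Benjamini-Hochberg step-up procedure: return a boolean mask of siRNAs
        called as hits while controlling the expected false discovery rate."""
        p = np.asarray(pvals, dtype=float)
        m = p.size
        order = np.argsort(p)
        thresholds = fdr * np.arange(1, m + 1) / m
        passed = p[order] <= thresholds
        hits = np.zeros(m, dtype=bool)
        if passed.any():
            last = np.nonzero(passed)[0].max()
            hits[order[:last + 1]] = True
        return hits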
|
32
|
Zhang XD, Espeseth AS, Johnson EN, Chin J, Gates A, Mitnaul LJ, Marine SD, Tian J, Stec EM, Kunapuli P, Holder DJ, Heyse JF, Strulovici B, Ferrer M. Integrating experimental and analytic approaches to improve data quality in genome-wide RNAi screens. J Biomol Screen 2008; 13:378-89. [PMID: 18480473 DOI: 10.1177/1087057108317145] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
RNA interference (RNAi) not only plays an important role in drug discovery but can also be developed directly into drugs. RNAi high-throughput screening (HTS) biotechnology allows us to conduct genome-wide RNAi research. A central challenge in genome-wide RNAi research is to integrate both experimental and computational approaches to obtain high quality RNAi HTS assays. Based on our daily practice in RNAi HTS experiments, we propose the implementation of 3 experimental and analytic processes to improve the quality of data from RNAi HTS biotechnology: (1) select effective biological controls; (2) adopt appropriate plate designs to display and/or adjust for systematic errors of measurement; and (3) use effective analytic metrics to assess data quality. The applications in 5 real RNAi HTS experiments demonstrate the effectiveness of integrating these processes to improve data quality. Due to the effectiveness in improving data quality in RNAi HTS experiments, the methods and guidelines contained in the 3 experimental and analytic processes are likely to have broad utility in genome-wide RNAi research.
Affiliation(s)
- Xiaohua Douglas Zhang
- Biometrics Research, Merck Research Laboratories, West Point, Pennsylvania 19486, USA.
|
33
|
Chung N, Zhang XD, Kreamer A, Locco L, Kuan PF, Bartz S, Linsley PS, Ferrer M, Strulovici B. Median absolute deviation to improve hit selection for genome-scale RNAi screens. J Biomol Screen 2008; 13:149-58. [PMID: 18216396 DOI: 10.1177/1087057107312035] [Citation(s) in RCA: 141] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]
Abstract
High-throughput screening (HTS) of large-scale RNA interference (RNAi) libraries has become an increasingly popular method of functional genomics in recent years. Cell-based assays used for RNAi screening often produce small dynamic ranges and significant variability because of the combination of cellular heterogeneity, transfection efficiency, and the intrinsic nature of the genes being targeted. These properties make reliable hit selection in the RNAi screen a difficult task. The use of robust methods based on median and median absolute deviation (MAD) has been suggested to improve hit selection in such cases, but mean and standard deviation (SD)-based methods are still predominantly used in many RNAi HTS. In an experimental approach to compare these 2 methods, a genome-scale small interfering RNA (siRNA) screen was performed, in which the identification of novel targets increasing the therapeutic index of the chemotherapeutic agent mitomycin C (MMC) was sought. MAD values were resistant to the presence of outliers, and the hits selected by the MAD-based method included all the hits that would be selected by SD-based method as well as a significant number of additional hits. When retested in triplicate, a similar percentage of these siRNAs were shown to genuinely sensitize cells to MMC compared with the hits shared between SD- and MAD-based methods. Confirmed hits were enriched with the genes involved in the DNA damage response and cell cycle regulation, validating the overall hit selection strategy. Finally, computer simulations showed the superiority and generality of the MAD-based method in various RNAi HTS data models. In conclusion, the authors demonstrate that the MAD-based hit selection method rescued physiologically relevant false negatives that would have been missed in the SD-based method, and they believe it to be the desirable 1st-choice hit selection method for RNAi screen results.
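A minimal version of the median +/- k MAD rule compared in the article is sketched below; the value k = 3 and the assumption that lower signal means activity (an inhibition-type read-out) are illustrative choices, and some implementations additionally rescale the MAD by 1.4826 so that k is comparable to SD units.

    import numpy as np

    def mad_hits(values, k=3.0):
        """Flag wells whose signal lies more than k MAD units below the median;
        the median and MAD are insensitive to the outliers and true actives
        that inflate the mean and SD."""
        x = np.asarray(values, dtype=float)
        med = np.median(x)
        mad = np.median(np.abs(x - med))
        return x < med - k * mad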
Affiliation(s)
- Namjin Chung
- Department of Automated Biotechnology, Merck Research Laboratories, North Wales, PA 19454, USA.
|
34
|
|
35
|
Seiler KP, George GA, Happ MP, Bodycombe NE, Carrinski HA, Norton S, Brudz S, Sullivan JP, Muhlich J, Serrano M, Ferraiolo P, Tolliday NJ, Schreiber SL, Clemons PA. ChemBank: a small-molecule screening and cheminformatics resource database. Nucleic Acids Res 2007; 36:D351-9. [PMID: 17947324 PMCID: PMC2238881 DOI: 10.1093/nar/gkm843] [Citation(s) in RCA: 203] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
ChemBank (http://chembank.broad.harvard.edu/) is a public, web-based informatics environment developed through a collaboration between the Chemical Biology Program and Platform at the Broad Institute of Harvard and MIT. This knowledge environment includes freely available data derived from small molecules and small-molecule screens and resources for studying these data. ChemBank is unique among small-molecule databases in its dedication to the storage of raw screening data, its rigorous definition of screening experiments in terms of statistical hypothesis testing, and its metadata-based organization of screening experiments into projects involving collections of related assays. ChemBank stores an increasingly varied set of measurements derived from cells and other biological assay systems treated with small molecules. Analysis tools are available and are continuously being developed that allow the relationships between small molecules, cell measurements, and cell states to be studied. Currently, ChemBank stores information on hundreds of thousands of small molecules and hundreds of biomedically relevant assays that have been performed at the Broad Institute by collaborators from the worldwide research community. The goal of ChemBank is to provide life scientists unfettered access to biomedically relevant data and tools heretofore available primarily in the private sector.
Affiliation(s)
- Kathleen Petri Seiler
- Chemical Biology Program and Platform, Broad Institute of Harvard and MIT, 7 Cambridge Center, Cambridge, MA 02142, USA
|
36
|
Inglese J, Shamu CE, Guy RK. Reporting data from high-throughput screening of small-molecule libraries. Nat Chem Biol 2007; 3:438-41. [PMID: 17637769 DOI: 10.1038/nchembio0807-438] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Publications reporting results of small-molecule screens are becoming more common as academic researchers increasingly make use of high-throughput screening (HTS) facilities. However, no standards have been formally established for reporting small-molecule screening data, and often key information important for the evaluation and interpretation of results is omitted in published HTS protocols. Here, we propose concise guidelines for reporting small-molecule HTS data.
Affiliation(s)
- James Inglese
- US National Institutes of Health Chemical Genomics Center, National Human Genome Research Institute, National Institutes of Health, 9800 Medical Center Drive, Bethesda, Maryland 20892-3370, USA.
|
37
|
Makarenkov V, Zentilli P, Kevorkov D, Gagarin A, Malo N, Nadon R. An efficient method for the detection and elimination of systematic error in high-throughput screening. Bioinformatics 2007; 23:1648-57. [PMID: 17463024 DOI: 10.1093/bioinformatics/btm145] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION High-throughput screening (HTS) is an early-stage process in drug discovery which allows thousands of chemical compounds to be tested in a single study. We report a method for correcting HTS data prior to the hit selection process (i.e. selection of active compounds). The proposed correction minimizes the impact of systematic errors which may affect the hit selection in HTS. The introduced method, called a well correction, proceeds by correcting the distribution of measurements within wells of a given HTS assay. We use simulated and experimental data to illustrate the advantages of the new method compared to other widely-used methods of data correction and hit selection in HTS. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
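The full well-correction algorithm is described in the paper; the sketch below only illustrates the underlying idea of correcting each well position using its distribution of measurements across all plates of the assay, here by simple re-centring. It deliberately ignores the adjustment of each well's spread that the complete method also performs.

    import numpy as np

    def recentre_well_positions(plates):
        """plates: array of shape (n_plates, n_rows, n_cols) of raw signals.
        Remove positional (row, column and edge) trends by subtracting each
        well position's across-plate median, then restore the overall level."""
        x = np.asarray(plates, dtype=float)
        per_well_median = np.median(x, axis=0, keepdims=True)
        return x - per_well_median + np.median(x)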
Affiliation(s)
- Vladimir Makarenkov
- Département d'informatique, Université du Québec à Montréal, C.P. 8888, succ. Centre-Ville, Montréal, QC, Canada.
|
38
|
Zhang XD, Ferrer M, Espeseth AS, Marine SD, Stec EM, Crackower MA, Holder DJ, Heyse JF, Strulovici B. The use of strictly standardized mean difference for hit selection in primary RNA interference high-throughput screening experiments. J Biomol Screen 2007; 12:497-509. [PMID: 17435171 DOI: 10.1177/1087057107300646] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
RNA interference (RNAi) high-throughput screening (HTS) has been hailed as the 2nd genomics wave following the 1st genomics wave of gene expression microarrays and single-nucleotide polymorphism discovery platforms. Following an RNAi HTS, the authors are interested in identifying short interfering RNA (siRNA) hits with large inhibition/activation effects. For hit selection, the z-score method and its variants are commonly used in primary RNAi HTS experiments. Recently, strictly standardized mean difference (SSMD) has been proposed to measure the siRNA effect represented by the magnitude of difference between an siRNA and a negative reference group. The links between SSMD and d+-probability offer a clear interpretation of siRNA effects from a probability perspective. Hence, SSMD can be used as a ranking metric for hit selection. In this article, the authors investigated both the SSMD-based testing process and the use of SSMD as a ranking metric for hit selection in 2 primary siRNA HTS experiments. The analysis results showed that, as a ranking metric, SSMD was more stable and reliable than percentage inhibition and led to more robust hit selection results. Using the SSMD-based testing method, the false-negative rate can more readily be obtained. More important, the use of the SSMD-based method can result in a reduction in both the false-negative and false-positive rates. The applications presented in this article demonstrate that the SSMD method addresses scientific questions and fills scientific needs better than both percentage inhibition and the commonly used z-score method for hit selection.
Affiliation(s)
- Xiaohua Douglas Zhang
- Biometrics Research, Merck Research Laboratories, West Point, Pennsylvania 19486, USA. xiaohua_zhang@merck.com
|
39
|
Zhang XD. A pair of new statistical parameters for quality control in RNA interference high-throughput screening assays. Genomics 2007; 89:552-61. [PMID: 17276655 DOI: 10.1016/j.ygeno.2006.12.014] [Citation(s) in RCA: 132] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2006] [Revised: 12/20/2006] [Accepted: 12/20/2006] [Indexed: 02/02/2023]
Abstract
RNA interference (RNAi) high-throughput screening (HTS) enables massively parallel gene silencing and is increasingly being used to reveal novel connections between genes and disease-relevant phenotypes. The application of genome-scale RNAi relies on the development of high-quality RNAi HTS assays. To obtain high-quality HTS assays, there is a strong need for an easily interpretable and theoretically based quality control (QC) metric. Signal-to-noise ratio (S/N), signal-to-background ratio (S/B), and Z-factor have been adopted as QC metrics in HTS assays. In this paper, I propose a pair of new parameters, strictly standardized mean difference (SSMD) and coefficient of variability in difference (CVD), as QC metrics in RNAi HTS assays. Compared to S/B and S/N, SSMD and CVD capture the variabilities in both compared populations. Compared to Z-factor, SSMD and CVD have a clear probability interpretation and a solid statistical basis. Accordingly, the cutoff criteria for using SSMD or CVD as a QC metric in HTS assays are fully theoretically based. In addition, I discuss the relationship between the SSMD-based criterion and the popular Z-factor-based criterion and elucidate why the p-value from a t-test of the mean difference fails to serve as a QC metric.
|
40
|
Buxser S, Chapman DL. Use of mixture distributions to deconvolute the behavior of “hits” and controls in high-throughput screening data. Anal Biochem 2007; 361:197-209. [PMID: 17214952 DOI: 10.1016/j.ab.2006.11.036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2006] [Revised: 10/26/2006] [Accepted: 11/21/2006] [Indexed: 10/23/2022]
Abstract
The stochastic nature of high-throughput screening (HTS) data indicates that information may be gleaned by applying statistical methods to HTS data. A foundation of parametric statistics is the study and elucidation of population distributions, which can be modeled using modern spreadsheet software. The methods and results described here use fundamental concepts of statistical population distributions analyzed using a spreadsheet to provide tools in a developing armamentarium for extracting information from HTS data. Specific examples using two HTS kinase assays are analyzed. The analyses use normal and gamma distributions, which combine to form mixture distributions. HTS data were found to be described well using such mixture distributions, and deconvolution of the mixtures to the constituent gamma and normal parts provided insight into how the assays performed. In particular, the proportion of hits confirmed was predicted from the original HTS data and used to assess screening assay performance. The analyses also provide a method for determining how hit thresholds--values used to separate active from inactive compounds--affect the proportion of compounds verified as active and how the threshold can be chosen to optimize the selection process.
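The article fits mixtures of gamma and normal components in a spreadsheet; as a rough stand-in, the sketch below fits a two-component Gaussian mixture with scikit-learn to simulated percent-inhibition data and reads off the estimated active fraction and the posterior probability of activity at a candidate hit threshold. The component families, the simulated effect sizes and the 40% threshold are all assumptions made for the illustration.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(1)
    # Simulated percent inhibition: a large inactive population near 0
    # and a small active population near 80 (illustrative values only).
    inhibition = np.concatenate([rng.normal(0, 10, 9500),
                                 rng.normal(80, 12, 500)]).reshape(-1, 1)

    gm = GaussianMixture(n_components=2, random_state=0).fit(inhibition)
    active = int(np.argmax(gm.means_.ravel()))
    print("estimated active fraction:", round(float(gm.weights_[active]), 3))
    print("P(active | 40% inhibition):",
          round(float(gm.predict_proba([[40.0]])[0, active]), 3))

Moving the hit threshold and re-reading the posterior probability is one way to see, as the abstract describes, how the threshold trades off the number of compounds selected against the proportion likely to confirm.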
|
41
|
Sui Y, Wu Z. Alternative statistical parameter for high-throughput screening assay quality assessment. J Biomol Screen 2007; 12:229-34. [PMID: 17218666 DOI: 10.1177/1087057106296498] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
High-throughput screening is an essential process in drug discovery. The ability to identify true active compounds depends on the high quality of assays and proper analysis of data. The Z factor, presented by Zhang et al. in 1999, provides an easy and useful summary of assay quality and has been a widely accepted standard. However, as data analysis has undergone much improvement recently, the assessment of assay quality has not evolved in parallel. In this article, the authors study the implications of Z factor values under different conditions and link the Z factor with the power of discovering true active compounds. They discuss the different interpretations of Z factor depending on error distributions and advocate direct analysis of power as assay quality assessment. They also propose that in estimating assay quality parameters, adjustments in data analysis should be taken into account. Studying the power of identifying true "hits" gives a more direct interpretation of assay quality and may provide guidance in assay optimization on some occasions.
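For context, the sketch below computes the conventional Z (or Z') factor of Zhang et al. (1999) that the article re-examines; the authors' power-based alternative is not reproduced here.

    import numpy as np

    def z_factor(positive, negative):
        """Z (or Z') factor: 1 - 3*(sd_p + sd_n) / |mean_p - mean_n|.
        Values above roughly 0.5 are conventionally read as an excellent assay,
        an interpretation the article argues should be qualified by the error
        distribution and the power to detect true actives."""
        p = np.asarray(positive, dtype=float)
        n = np.asarray(negative, dtype=float)
        return 1.0 - 3.0 * (p.std(ddof=1) + n.std(ddof=1)) / abs(p.mean() - n.mean())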
Affiliation(s)
- Yunxia Sui
- Department of Community Health, Brown University, Providence, RI 02903, USA
|
42
|
Fay N. The role of the informatics framework in early lead discovery. Drug Discov Today 2006; 11:1075-84. [PMID: 17129826 DOI: 10.1016/j.drudis.2006.10.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2005] [Revised: 09/26/2006] [Accepted: 10/19/2006] [Indexed: 10/24/2022]
Abstract
Recent developments in screening technologies and data analysis have been driven by the promise that the number of new lead compounds will increase. Although many of these promises have become reality, the success of this strategy also depends on the information framework that ties the individual components together. In particular, high-content technologies represent a new force in challenging established informatics frameworks, largely because of their data volume, variety of assay parameters and increased scientific complexity. A successful informatics framework design can be regarded as crucial for new technologies, both in terms of scientific content and information and in terms of process integration across large corporate networks.
Affiliation(s)
- Nicolas Fay
- Evotec AG, Schnackenburgallee 114, D-22525 Hamburg, Germany.
|
43
|
Zhang XD, Yang XC, Chung N, Gates A, Stec E, Kunapuli P, Holder DJ, Ferrer M, Espeseth AS. Robust statistical methods for hit selection in RNA interference high-throughput screening experiments. Pharmacogenomics 2006; 7:299-309. [PMID: 16610941 DOI: 10.2217/14622416.7.3.299] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023] Open
Abstract
RNA interference (RNAi) high-throughput screening (HTS) experiments carried out using large (>5000 short interfering [si]RNA) libraries generate a huge amount of data. In order to use these data to identify the most effective siRNAs tested, it is critical to adopt and develop appropriate statistical methods. To address the questions in hit selection of RNAi HTS, we proposed a quartile-based method which is robust to outliers, true hits and nonsymmetrical data. We compared it with the more traditional tests, mean +/- k standard deviation (SD) and median +/- 3 median absolute deviation (MAD). The results suggested that the quartile-based method selected more hits than mean +/- k SD under the same preset error rate. The number of hits selected by median +/- k MAD was close to that by the quartile-based method. Further analysis suggested that the quartile-based method had the greatest power in detecting true hits, especially weak or moderate true hits. Our investigation also suggested that platewise analysis (determining effective siRNAs on a plate-by-plate basis) can adjust for systematic errors in different plates, while an experimentwise analysis, in which effective siRNAs are identified in an analysis of the entire experiment, cannot. However, experimentwise analysis may detect a cluster of true positive hits placed together in one or several plates, while platewise analysis may not. To display hit selection results, we designed a specific figure called a plate-well series plot. We thus suggest the following strategy for hit selection in RNAi HTS experiments. First, choose the quartile-based method, or median +/- k MAD, for identifying effective siRNAs. Second, perform the chosen method experimentwise on transformed/normalized data, such as percentage inhibition, to check the possibility of hit clusters. If a cluster of selected hits is observed, repeat the analysis based on untransformed data to determine whether the cluster is due to an artifact in the data. If no clusters of hits are observed, select hits by performing platewise analysis on transformed data. Third, adopt the plate-well series plot to visualize both the data and the hit selection results, as well as to check for artifacts.
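A simplified reading of the quartile-based rule is sketched below: for an inhibition-type read-out, hits are wells falling below Q1 - c*IQR. The constant c = 1.5 used here is just the familiar Tukey boxplot fence; in the paper the constant is derived from a preset error rate, so treat the value, and the direction of the comparison, as assumptions of this sketch.

    import numpy as np

    def quartile_hits(values, c=1.5):
        """Flag wells below Q1 - c*(Q3 - Q1); like the MAD rule, the quartiles
        are robust to outliers, and fences built from Q1 and Q3 also tolerate
        nonsymmetrical data."""
        x = np.asarray(values, dtype=float)
        q1, q3 = np.percentile(x, [25, 75])
        return x < q1 - c * (q3 - q1)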
|
44
|
Makarenkov V, Kevorkov D, Zentilli P, Gagarin A, Malo N, Nadon R. HTS-Corrector: software for the statistical analysis and correction of experimental high-throughput screening data. Bioinformatics 2006; 22:1408-9. [PMID: 16595559 DOI: 10.1093/bioinformatics/btl126] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION High-throughput screening (HTS) plays a central role in modern drug discovery, allowing for testing of >100,000 compounds per screen. The aim of our work was to develop and implement methods for minimizing the impact of systematic error in the analysis of HTS data. To the best of our knowledge, two new data correction methods included in HTS-Corrector are not available in any existing commercial software or freeware. RESULTS This paper describes HTS-Corrector, a software application for the analysis of HTS data, detection and visualization of systematic error, and corresponding correction of HTS signals. Three new methods for the statistical analysis and correction of raw HTS data are included in HTS-Corrector: background evaluation, well correction and hit-sigma distribution procedures intended to minimize the impact of systematic errors. We discuss the main features of HTS-Corrector and demonstrate the benefits of the algorithms.
Affiliation(s)
- Vladimir Makarenkov
- Département d'informatique, Université du Québec à Montréal, C.P. 8888, succ. Centre-Ville, Montréal, QC, H3C 3P8, Canada.
|
45
|
Malo N, Hanley JA, Cerquozzi S, Pelletier J, Nadon R. Statistical practice in high-throughput screening data analysis. Nat Biotechnol 2006; 24:167-75. [PMID: 16465162 DOI: 10.1038/nbt1186] [Citation(s) in RCA: 505] [Impact Index Per Article: 26.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
Abstract
High-throughput screening is an early critical step in drug discovery. Its aim is to screen a large number of diverse chemical compounds to identify candidate 'hits' rapidly and accurately. Few statistical tools are currently available, however, to detect quality hits with a high degree of confidence. We examine statistical aspects of data preprocessing and hit identification for primary screens. We focus on concerns related to positional effects of wells within plates, choice of hit threshold and the importance of minimizing false-positive and false-negative rates. We argue that replicate measurements are needed to verify assumptions of current methods and to suggest data analysis strategies when assumptions are not met. The integration of replicates with robust statistical methods in primary screens will facilitate the discovery of reliable hits, ultimately improving the sensitivity and specificity of the screening process.
Affiliation(s)
- Nathalie Malo
- McGill University and Genome Quebec Innovation Centre, 740 avenue du Docteur Penfield, Montreal, Quebec, Canada, H3A 1A4
|
46
|
Gribbon P, Lyons R, Laflin P, Bradley J, Chambers C, Williams BS, Keighley W, Sewing A. Evaluating real-life high-throughput screening data. J Biomol Screen 2005; 10:99-107. [PMID: 15799953 DOI: 10.1177/1087057104271957] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
High-throughput screening (HTS) is the result of a concerted effort of chemistry, biology, information technology, and engineering. Many factors beyond the biology of the assay influence the quality and outcome of the screening process, yet data analysis and quality control are often focused on the analysis of a limited set of control wells and the calculated values derived from these wells. Taking into account the large number of variables and the amount of data generated, multiple views of the screening data are necessary to guarantee quality and validity of HTS results. This article does not aim to give an exhaustive outlook on HTS data analysis but tries to illustrate the shortfalls of a reductionist approach focused on control wells and give examples for further analysis.
Affiliation(s)
- Philip Gribbon
- Automated Screening Technologies, Pfizer Global Research and Development, Sandwich, UK
|
47
|
Brideau C, Gunter B, Pikounis B, Liaw A. Improved statistical methods for hit selection in high-throughput screening. J Biomol Screen 2004; 8:634-47. [PMID: 14711389 DOI: 10.1177/1087057103258285] [Citation(s) in RCA: 252] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Abstract
High-throughput screening (HTS) plays a central role in modern drug discovery, allowing the rapid screening of large compound collections against a variety of putative drug targets. HTS is an industrial-scale process, relying on sophisticated automation, control, and state-of-the-art detection technologies to organize, test, and measure hundreds of thousands to millions of compounds in nano- to microliter volumes. Despite this high technology, hit selection for HTS is still typically done using simple data analysis and basic statistical methods. The authors discuss in this article some shortcomings of these methods and present alternatives based on modern methods of statistical data analysis. Most important, they describe and show numerous real examples from the biologist-friendly Stat Server HTS application (SHS), a custom-developed software tool built on the commercially available S-PLUS and StatServer statistical analysis and server software. This system remotely processes HTS data using powerful and sophisticated statistical methodology but insulates users from the technical details by outputting results in a variety of readily interpretable graphs and tables.
Affiliation(s)
- Christine Brideau
- Department of Biochemistry and Molecular Biology, Merck Frosst Centre for Therapeutic Research, Kirkland, Quebec, Canada.
|