INTRODUCTION
Whilst they are amongst the latest additions to the RNA family, circular RNAs (circRNAs) are not new discoveries[1]. Circular transcripts were originally found to exist naturally in plant viroids in 1976[2] and in the hepatitis delta virus in 1986[3]. They were noted as endogenous molecules in eukaryotes by a study investigating splicing in the DCC gene[4]. In this study, splicing was observed to occur in a non-sequential fashion by means of “exon scrambling”; upstream exons moved downstream to bind exons and yielded circular transcripts[4]. Because their exons are inverted compared to the exonic arrangement on the genomic open reading frame, circRNAs were initially labeled as by-products of splicing error[5]. This narrative began to change upon the discovery that the Sry gene in adult mice was expressed only as 1.23-kb circular transcripts[6]. Given the importance of this gene in sex determination during embryogenesis, this finding implied a possible pre-determined biological function of circRNAs, although they were still grouped as non-coding RNAs (ncRNAs) at the time[7]. Interest in circRNAs was renewed when Salzman et al[8] identified a myriad of circRNAs in a variety of normal and malignant cell types. Additionally, the functional exploration of CDR1as revealed its ability to sponge miR-7 in neuronal tissue, suggesting that miRNA sponging may be a function of other circRNAs as well[9]. Consequently, interest in the mechanistic machinery that drives the genesis of circRNAs, as well as in their function, has intensified over the last few years.
CIRCULAR RNA BIOGENESIS
The combinatorial model best explains the alternative splicing (AS) mechanism that facilitates exon skipping. In this model, splicing regulatory factors coordinate the splicing order to determine which exons are included in the final mRNA transcript[10]. The outcome is multiple isoforms of a protein with different functions[11]. AS not only coordinates diversity amongst the linear transcriptome, it also facilitates a diverse group of circRNAs formed via backsplicing[12]. In the backsplicing process, circular transcripts are generated by covalently fusing the 5′ site of an upstream exon (acceptor) with the 3′ end of the same, or a downstream, exon (donor)[5,13,14] (Figure 1A). The diversity amongst circRNAs was evidenced across multiple genes in a recent study; a salient example was the BIRC6 gene, which was shown to generate over 500 circular isoforms[15]. Unsurprisingly, the study also highlighted that diversity amongst circular isoforms was directly proportional to the number of exons in the gene[15].
Figure 1 Biogenesis of circRNAs.
A: In backsplicing, circRNAs are usually flanked by the canonical splicing motifs, AG-GT, and covalently fuse the 5′ site of an upstream exon (acceptor) with the 3′ end of a downstream exon (donor); B: In the exon skipping model, an unstable intermediate lariat consisting of introns and skipped exons is generated after splicing. The intermediate lariat is then spliced to produce circRNA; C: Flanking introns containing complementary sequences (Alu repeats) bind and increase the possibility of backsplicing; D: RNA-binding proteins, such as Quaking, can bind to flanking introns and dimerize to create a closed RNA loop which facilitates backsplicing. QKI: Quaking.
Interestingly, backsplice sites are flanked by the canonical splicing motif, AG-GT[15], and circRNAs share canonical splice sites with their linear counterparts, suggesting that both are generated by the same spliceosome machinery[16]. One study demonstrated that introducing mutations into the canonical splice sites significantly decreased circRNA production[16]. This study, as well as others[17], has also proposed that circular and linear RNAs are competitively generated by the same spliceosome.
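The junction described above can be illustrated with a minimal sketch. It assumes DNA-sense sequences, a circRNA made of one or more consecutive exons, and hypothetical function names (`splice_motifs_ok`, `backsplice_junction`) and toy sequences that do not come from any of the cited studies. It checks for the AG-GT flanking motif and builds the sequence that spans the back-splice junction, where the 3′ end of the last circularized exon is joined to the 5′ end of the first.

```python
from typing import Sequence


def splice_motifs_ok(upstream_intron: str, downstream_intron: str) -> bool:
    """Return True if the circularized exon block is flanked by AG ... GT.

    The upstream intron must end in AG (acceptor side) and the
    downstream intron must start with GT (donor side).
    """
    return (upstream_intron.upper().endswith("AG")
            and downstream_intron.upper().startswith("GT"))


def backsplice_junction(exons: Sequence[str], flank: int = 10) -> str:
    """Return the sequence spanning the back-splice junction.

    `exons` are the circularized exons in genomic order. The last `flank`
    bases of the final exon (donor side) are followed by the first `flank`
    bases of the first exon (acceptor side), an order that never occurs
    in the linear transcript.
    """
    circ = "".join(exons).upper()
    if flank < 1 or flank > len(circ):
        raise ValueError("flank must be between 1 and the circRNA length")
    return circ[-flank:] + circ[:flank]


# Toy sequences (hypothetical, for illustration only).
toy_exons = ["ATGGCC", "TTTAAACCC"]
print(splice_motifs_ok("ccttag", "gtaagt"))
print(backsplice_junction(toy_exons, flank=4))
```

Reads that cover such a junction are what distinguish a circular isoform from its linear counterpart in sequencing data; dedicated circRNA detection tools handle alignment and filtering, which this sketch does not attempt.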
Liang et al[18] indicate that circRNAs are seldom formed from the first or last exons as these exons lack the necessary splice sites. Moreover, a single circRNA usually contains between one and five exons, with several sources reporting that circRNAs with two to three exons are most prevalent[4,5,8,12]. Nonetheless, exons are not exclusive components of circRNAs; circularization of introns, long non-coding RNAs (lncRNAs), antisense transcripts, and intergenic regions is also possible[8,19]. There are also multiple pieces of evidence of circRNAs consisting of both exonic and intronic regions[5,8,20,21], but exonic circRNAs are still the most prevalent and the most studied[12,20]. Interestingly, Vo et al[15] described a new subset of circRNAs, called read-through circRNAs (rt-circRNAs), that are generated from exons of adjacent genes on the same strand. The specific mechanisms of backsplicing are intricate and are still being investigated as the bioinformatics of circRNA mapping improves. However, the following models are recurrently proposed to facilitate backsplicing: the exon skipping (lariat) model, the intron-pairing model, and the RNA-binding protein (RBP) model.
Exon skipping model (Lariat model)
In the exon skipping model, canonical splicing occurs first, producing the mRNA transcript, and an intermediate lariat consisting of introns and skipped exons[1,5] (Figure 1B). The intermediate lariat is unstable and undergoes further splicing (intra-lariat splicing) in which circRNA(s) are produced via backsplicing, and the intron lariat forms a separate RNA strand[1,5,20]. However, backsplicing via exon skipping can also occur independent of lariat formation by means of direct backsplicing[5].
Intron-pairing
A common feature amongst circularized exons is the presence of long flanking introns containing complementary sequences (Alu repeats)[20] (Figure 1C). This characteristic makes it possible to predict the backsplicing sites of circularization using bioinformatics. Hybridization of these complementary sequences brings the exonic backsplicing sites into closer proximity and facilitates backsplicing at these sites[18,20]. In this model, circRNA generation is prioritized over linear transcripts, unlike in the exon skipping model[5,20]. This further suggests that circRNAs are purposely produced and, according to Eger et al[5], explains why certain circRNAs are expressed at higher levels than the linear transcripts of the same gene. Interestingly, multiple studies propose that the flanking intronic sequences represented in this model can be considered modulators of circularization efficiency[16,20,22]. Zhang et al[21] call this model of backsplicing “alternative circularization” and add that alternative circularization, in concert with alternative splicing, also enhances the diversity of exonic circRNAs produced from a single gene.
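As a rough illustration of how intron pairing might be screened for computationally, the sketch below looks for short stretches in the upstream intron whose reverse complement occurs in the downstream intron. The function names, the exact-match k-mer approach, and the toy sequences are assumptions made for this example; published analyses typically rely on sequence alignment of full-length repeats such as Alu elements rather than exact k-mer matching.

```python
from typing import List, Tuple

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_DNA_COMPLEMENT)[::-1]


def shared_inverted_kmers(upstream_intron: str,
                          downstream_intron: str,
                          k: int = 8) -> List[Tuple[int, str]]:
    """List k-mers in the upstream intron that could base-pair downstream.

    A k-mer is reported, with its start position in the upstream intron,
    when its reverse complement occurs anywhere in the downstream intron.
    """
    up = upstream_intron.upper()
    down = downstream_intron.upper()
    down_kmers = {down[i:i + k] for i in range(len(down) - k + 1)}
    hits = []
    for i in range(len(up) - k + 1):
        kmer = up[i:i + k]
        if reverse_complement(kmer) in down_kmers:
            hits.append((i, kmer))
    return hits


# Toy introns (hypothetical): GGCTCA upstream pairs with TGAGCC downstream.
print(shared_inverted_kmers("AAAAGGCTCAAAAA", "CCCCTGAGCCCCCC", k=6))
```

A pair of flanking introns with many such hits would be a candidate for the hybridization described above, whereas introns with no inverted matches would not be expected to bring the backsplice sites together by this route.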
RBP-mediated backsplicing
Multiple studies have demonstrated RBP-mediated exon circularization with RBPs such as Quaking (QKI) and Muscleblind protein (MBL)[16,23]. In this model, RBPs bind to flanking introns (near the splice sites) and dimerize to create a closed RNA loop that facilitates backsplicing[23,24] (Figure 1D). Conn et al[23] showed that inserting synthetic QKI binding sites into introns significantly induced circRNA formation, confirming QKI-directed biosynthesis of circRNA. Similarly, in a prior study, circMbl formation was significantly increased after cells were transfected with MBL variants. This finding was accompanied by a reduction in linear Mbl generation[16]. Altogether, these results demonstrated not only RBP-regulated circRNA generation but also the role of RBPs in competitive splicing to generate circular versus linear RNAs.
CIRCULAR RNA FUNCTIONS
Though there are several pieces of evidence supporting functions such as miRNA sponging in molecules like CDR1as, substantial evidence of general functionality has only been demonstrated for a handful of circRNAs. Herein, we highlight three proposed functions of circRNAs that have been investigated: miRNA sponging, protein binding, and cap-independent translation. However, whether these functions are generally exhibited by all or most circRNAs is not known.
CircRNAs are miRNA sponges and intermediate miRNA reservoirs
Perhaps the most examined function of circRNAs is their ability to sponge miRNAs. Some circRNAs harbor microRNA response elements (MREs) which facilitate the competitive binding of miRNAs[25,26]. The sequestration of miRNAs by circRNAs modifies the activity of these miRNAs with regard to the expression of their mRNA targets[1,25]. In essence, circRNAs are indirectly involved in mRNA gene expression through miRNA sponging. For example, CDR1as contains over 70 conserved binding sites for miR-7[9,25,27], and its binding capacity is 10 times higher than that of any other transcript or mRNA target[27]. Hansen and colleagues further add that the competition between miR-7 targets and CDR1as creates a buffer effect that prevents transient fluctuations in miR-7 expression[28]. Furthermore, cleavage of CDR1as-miR-7 by argonaute 2 (AGO2) results in the release of miR-7 and the subsequent inhibition of miR-7 targets[25,28,29]. As such, CDR1as functions not only as a miRNA sponge but also as an intermediate reservoir for miR-7[29].
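One simple way to see what "harboring binding sites" means in practice is to count seed matches in a transcript. The sketch below assumes that a binding site is a perfect match to miRNA nucleotides 2-8 (a common simplification, not a method stated in the cited studies), and it uses a made-up miRNA and transcript rather than the real miR-7 or CDR1as sequences. The function names are hypothetical.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_match_site(mirna: str) -> str:
    """Return the target-site sequence (5'->3') that pairs with the seed.

    The seed is taken as miRNA nucleotides 2-8; the site is its
    reverse complement in RNA letters.
    """
    rna = mirna.upper().replace("T", "U")
    if len(rna) < 8:
        raise ValueError("miRNA must be at least 8 nucleotides long")
    seed = rna[1:8]
    return seed.translate(_RNA_COMPLEMENT)[::-1]


def count_seed_sites(transcript: str, mirna: str) -> int:
    """Count (possibly overlapping) seed-match sites in a transcript."""
    site = seed_match_site(mirna)
    rna = transcript.upper().replace("T", "U")
    count = 0
    start = rna.find(site)
    while start != -1:
        count += 1
        start = rna.find(site, start + 1)
    return count


# Hypothetical miRNA and transcript, for illustration only.
toy_mirna = "UACGGAUCCAGUAGCUUGACU"
toy_circrna = "AAGAUCCGUAAAGAUCCGUCC"
print(seed_match_site(toy_mirna), count_seed_sites(toy_circrna, toy_mirna))
```

Because a circRNA has no ends, a site could in principle span the back-splice junction; a fuller treatment would append the first few bases of the sequence to its end before scanning, which is omitted here for brevity.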
Protein binding
Some circRNAs can competitively bind RBPs as well as store, sort, and sequester proteins in the cytoplasm to limit nuclear entry, regulate their function, and act as scaffolds for protein-protein interactions[30,31]. For example, circFOXO3 binds and prevents the interaction of p21 and CDK2 to suppress cell cycle progression at the G1 stage in a non-tumor cell line[32], and scaffolds p53 and Mdm2 in breast cancer cell lines to promote Mdm2-induced p53 degradation[33]. The interaction between circMbl and MBL is interesting because MBL can prioritize the generation of circMbl over linear forms, and circMbl in turn regulates MBL levels by sponging[16].
CircRNA-mediated protein translation in a cap-independent manner
The predominant opinion on circRNAs is that they are ncRNAs that are not translated into protein. However, the advent of engineered circRNAs that are translated into protein[20] raised the question of whether protein-coding endogenous eukaryotic circRNAs exist. Whilst the predominant stance still aligns with the former view, it has since come to light that a minute proportion (< 1%) of circRNAs contain the AUG start codon and are able to associate with ribosomes. Amongst them is circ-ZNF609, which contains a start and a stop codon similar to those in the linear transcript. In their study, Legnini et al[34] identified circ-ZNF609 as a eukaryotic circRNA that associates with polysomes and is protein-coding. In circular transcripts like circ-ZNF609, the 5′ untranslated region (5′ UTR) is included in the circular sequence during circularization. The 5′ UTR folds to form an internal ribosomal entry site (IRES), which facilitates ribosomal association[34]. Some circRNAs, such as circ-FBXW7, can also be translated by other mechanisms, such as N6-adenosine methylation[12,29]. Considering that most circRNAs are less abundant than their linear counterparts, it is unsurprising that the aforementioned protein-coding circRNAs are translated less efficiently than linear transcripts. Accumulating evidence also suggests that cap-independent translation is a cellular stress response to generate immediate and selective changes in protein levels[34].
THE POTENTIAL OF CIRCULAR RNA AS BIOMARKERS
Abundance
CircRNAs represent approximately 10% of the total RNA content in cells[35], with some being more abundantly expressed than their linear isoforms[8,36]. Their global expression and abundance can be stage- or age-dependent[37], as evidenced by several studies demonstrating variation in circRNA expression at different developmental stages. Two studies reported the induction of circRNA expression during embryonic development in humans and flies across a range of tissues[38,39]. For example, the circular RNA generated from the NCX1 gene (primarily expressed in cardiomyocytes) was most highly expressed during fetal development according to Szabo et al[38]. In the mouse brain, one study demonstrated that certain circRNAs were more highly expressed in aged mice than in mice half their age[40], suggesting a function in neuronal maturity; another study described circRNA abundance at different stages of hippocampus development[41]. Interestingly, circRNA abundance can be independent of linear RNA expression[42], indicating a splicing preference for generating certain circRNAs at different biological stages and suggesting an overall function in development.
Tissue- and cell lineage-specificity
The expression of some circRNAs is cell- and tissue-dependent[17,42,43], which suggests they can be used as molecular markers for different diseases. For example, the expression levels of circular isoforms of the DCC gene varied across human tissues and did not correlate with their linear counterparts[4]. Similarly, certain circRNAs are concentrated in different parts of mammalian brains, with varying ratios of circular to linear RNAs[17]. In mice, the circular forms of Rmst and Khl12 were more highly expressed in the brain than in the liver and lungs[41]. These studies suggest that circRNA generation and subsequent expression are regulated processes. Furthermore, this regulation appears to be evolutionarily conserved, with several studies documenting conservation between mice, pigs, flies, and humans in brain tissues[1,17,20,42].
Stability
Unlike linear transcripts, circRNAs are covalently closed loops that lack polyadenylated tails[8,20]. Hence, circRNAs are relatively more stable and are better protected from exonuclease degradation[8,20]. Considering that exonucleases, and not endonucleases, are the predominant RNA-degrading nucleases in host cells[44], it is inferred that the accumulation and detection of circRNAs are favored over those of linear transcripts. Though RNA circularization generally increases the stability of RNA molecules, hepatitis delta virus (HDV) circular RNAs become more susceptible to degradation by nucleases as they increase in molecular size. However, there is evidence suggesting that these larger HDV circles can be stabilized by their interactions with RBPs such as Ag-S[45].
Unsurprisingly, most circRNAs also have a half-life that is approximately 2.5 times longer than that of their linear counterparts in mammalian cells[20,25]. Due to their relative stability, circRNAs can also be detected at higher levels (approximately 6.3-fold higher) in exosomes than in cells[46]. This is an important property which contributes to their detection in body fluids.
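To give a feel for what a 2.5-fold longer half-life implies, the following sketch compares the fraction of a circular and a linear transcript remaining over time. It assumes simple first-order (exponential) decay with no new synthesis, and the 10-hour linear half-life is a hypothetical value chosen only for illustration; neither assumption comes from the cited studies.

```python
def fraction_remaining(hours: float, half_life_hours: float) -> float:
    """Fraction of an RNA pool left after `hours`, assuming exponential decay."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours / half_life_hours)


# Hypothetical linear half-life; the circular one is 2.5 times longer,
# following the ratio quoted in the text.
LINEAR_HALF_LIFE_H = 10.0
CIRC_HALF_LIFE_H = 2.5 * LINEAR_HALF_LIFE_H

for t in (10.0, 25.0, 50.0):
    linear = fraction_remaining(t, LINEAR_HALF_LIFE_H)
    circular = fraction_remaining(t, CIRC_HALF_LIFE_H)
    print(f"t={t:>4.0f} h  linear={linear:.3f}  circular={circular:.3f}")
```

Under these assumptions the gap widens with time: by one circular half-life, half of the circular pool remains while the linear pool has passed through 2.5 half-lives, which is consistent with circRNAs accumulating relative to linear transcripts in long-lived samples.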
Exosome enrichment and detection in body fluids
CircRNAs are more enriched in exosomes compared to intracellular levels[30,46]. Exosomes are vesicles that facilitate cell-to-cell communication between parent and recipient cells[27]. CircRNAs are sorted into exosomes potentially as a response to stimuli or physiological needs[27]. Though the precise mechanism is largely unclear, the sorting of circRNAs into exosomes is considered to be a regulated and selective process that can be guided by factors such as RBPs and miRNA abundance[30,46]. Because of their enrichment and stability in exosomes, circRNAs are detectable in a range of body fluids, including saliva[47], plasma[48], urine[49], and gastric fluid[50], which supports their consideration as minimally invasive biomarkers. One study showed that a group of exosomal circRNAs (exo-circRNAs) in serum could distinguish between colon cancer patients and healthy controls[46]. Another study demonstrated that circRNA-IARS in exosomes could be a potential early diagnostic and prognostic predictor of pancreatic ductal adenocarcinoma (PDAC)[51]. These two studies demonstrate the translational potential of exo-circRNAs as circulating clinical biomarkers.
Genomic information
Unlike protein biomarkers, circRNAs are transcriptomic molecules that comprise nucleic acid sequences. These sequences could potentially convey genomic information pertaining to germline mutations, as well as therapy-related somatic mutations, which may inform disease prognosis and facilitate therapy decisions[52]. Although cell-free tumor DNA can also provide similar information, it reflects the tumor cell genome and is passively released from dead tumor cells. In contrast, circRNAs are gene transcripts and can be both passively and selectively released from tumor cells in exosomes. Therefore, circRNAs could be more effective early indicators of disease.
CIRCULAR RNA IN PROSTATE CANCER
Current biomarkers in prostate cancer
Prostate cancer (PCa) is one of the most common cancers amongst men worldwide[53,54]. Like many other cancers, PCa management is plagued by the possibility of metastasis, therapy resistance, and poor diagnostic and prognostic biomarkers for screening[54]. Despite the emergence of a plethora of potential prostate cancer biomarkers, prostate-specific antigen (PSA) remains the best tool for general screening and post-treatment monitoring[54]. Still, PSA testing is not without its shortcomings and controversies. Whilst it is prostate-specific, PSA is not PCa-specific, and its level in the blood can be affected by other factors such as age, trauma, inflammation, and benign prostatic hypertrophy (BPH)[55]. Moreover, the established normal range of PSA (< 4.0 ng/mL) insufficiently captures PCa cases and often leads to under-diagnoses and false positives[56,57]. Reports show that only 25%-30% of cases with elevated PSA within the grey zone (4.0-9.9 ng/mL) are confirmed as PCa when biopsied[57,58]. Thompson et al[57] showed that a normal PSA is also possible in men with PCa and a high Gleason grade; this was observed in 15% of their study participants with normal PSA.
The limitation of PSA also lies in deciding which cases move forward to biopsy for pathological diagnosis of PCa, and the test has been blamed for hundreds of thousands of unnecessary prostate biopsies in the United States yearly[59]. Serum levels of other PSA isoforms (e.g. p2PSA) show improved specificity over the PSA blood test[55]. Other potential biomarkers, such as the prostate cancer antigen 3 (PCA3) score, have shown utility in PCa diagnosis and monitoring[60]. PCA3 is a long non-coding RNA that is highly expressed in PCa (primary and metastatic cases)[60]. Whilst possessing a higher specificity than serum PSA, the PCA3 score has variable sensitivity and requires a digital rectal examination to collect the specimen, which limits its clinical usage[61]. As evidenced by one study, using PCa-specific circRNAs (circ_0057558 and circ_0062019) from tissues together with PSA levels could offer a diagnostic advantage over the PSA test alone[62]. In this study, the combination increased the AUC, specificity, and sensitivity for distinguishing between BPH and PCa[62]. However, reliable and minimally invasive clinical biomarkers for PCa that can provide diagnostic and prognostic information alone, or as a supplement to the PSA test, are still lacking.
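For readers less familiar with the metrics mentioned above, the sketch below shows how sensitivity, specificity, and AUC can be computed for any marker score. The scores and labels are invented toy values (1 = PCa, 0 = BPH) and do not reproduce data from the cited study; the function names are likewise hypothetical. The AUC is computed as the fraction of case-control pairs in which the case has the higher score, with ties counted as one half.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(scores: Sequence[float],
                            labels: Sequence[int],
                            threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        if label == 1:
            if score >= threshold:
                tp += 1
            else:
                fn += 1
        else:
            if score >= threshold:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise case-control comparison."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


# Invented toy data: three PCa cases followed by three BPH controls.
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.5, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
print(sensitivity_specificity(toy_scores, toy_labels, threshold=0.45))
print(auc(toy_scores, toy_labels))
```

A combined marker would simply supply a different score per patient (for example, the output of a model fitted on PSA and circRNA levels), and the same functions could then be used to compare it with PSA alone.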
CircRNAs as potential biomarkers of prostate cancer
The advancement of transcriptomic profiling has revealed a plethora of circRNAs worthy of further investigation for PCa biomarker development[15,36,63,64]. Chen and colleagues identified a group of circRNAs that are able to distinguish between localized PCa and normal prostate[36]. This study also proposed that circRNA abundance may depend not only on tissue type but also on functional roles in the tumor, such as cell proliferation[36]. The functional analyses conducted in this study have strengthened the consideration of circRNAs as PCa biomarkers.
Along with establishing the MiOncoCirc catalog of circRNAs, Vo and colleagues identified a subset of circRNAs able to distinguish between PCa subtypes using tissue biopsies[36]. From this subset, circAMACR was upregulated and associated with androgen receptor (AR) amplification in castration-resistant prostate cancer. Additionally, upregulation of circAURKA was suggestive of neuroendocrine prostate cancer (NEPC)[36]. These are promising markers for therapy-resistant PCa progression and warrant further investigation in clinical settings and in different patient cohorts.
In collaboration with Yan Dong’s Lab, we reported and validated that multiple circRNAs are encoded by the AR gene and are widespread in PCa cells and xenograft models[65]. We have further demonstrated that one of the AR circRNAs, namely circAR3, is abundantly expressed in prostate tissues and detectable in patient plasma in prostate- and prostate cancer-specific manners[52]. Notably, the levels of intratumoral circAR3 were reduced in high-Gleason tumors, whereas plasma circAR3 was positively associated with high Gleason scores and positive lymph node metastasis, making it suitable for biomarker development in PCa[52]. This disproportionate expression of circRNAs in tissue and blood may be explained by the rate at which circRNAs are released from tissue into the bloodstream, which can be affected by multiple factors (Figure 2): (1) CircRNAs can be selectively packaged into exosomes and actively released from the tumor into the circulatory system, where they are detectable in plasma; (2) With PCa development, the prostate architecture is disrupted, leading to faster release of circRNAs from the tissues into the stromal space. They can circumvent the endothelial cells of the blood vessels and enter the bloodstream. Similar to PSA, the plasma concentration of PCa-specific circRNAs can be increased in this way; (3) Cell death induced by stresses such as hypoxia, inflammation, and anti-tumor therapies can increase the release of circRNAs into the bloodstream; and (4) As tumor invasion and metastasis occur, microparticles containing circRNAs are shed from tumor cells, subsequently increasing the circRNA concentration in plasma. As indicated with circAR3, plasma levels were higher in patients with lymph node metastasis than in those without[52]. Altogether, these factors form a complex network that underlies the disproportion between circRNA levels in tumors versus plasma.
Figure 2 The disproportion of circRNAs between tumor and plasma.
A: CircRNAs can be selectively enriched in exosomes and actively released into plasma as exosomes. During PCa progression, the integrity of normal prostatic tissues is disrupted; this facilitates the release of circRNAs into the bloodstream; B: Stresses such as hypoxia, inflammation, and anti-tumor therapies cause cell death and increase the release of circRNAs. Microparticles containing circRNAs shed from the metastasizing tumor subsequently increase the circRNA concentration in plasma.
The functional characterization of circRNAs in PCa cells further supports the idea that certain circRNAs could be developed into PCa biomarkers. CircRNA-miRNA mapping has revealed that studying the interaction between circRNAs and miRNAs may further help to characterize the role of certain circRNAs in PCa development. In vitro investigations of interactions such as CDR1as-miR-7[66], circRNA-MYLK-miR-29a[64], and circBAGE2-miR-103a[66] have implicated tumor-suppressive and oncogenic roles of circRNAs, which could imply their utility as biomarkers as well as therapeutic targets[64]. Other studies have shown that some circRNAs may contribute to therapy-resistant PCa. For example, downregulation of circFOXO3 promotes PCa progression and resistance to docetaxel[67], while hsa_circ_0004870 downregulation is correlated with enzalutamide resistance[11].
CONCLUSION
The mounting evidence linking circRNA expression to the development of PCa is promising. Their presence and stability in body fluids such as plasma and urine allow their expression to be analyzed with regard to a range of urologic diseases. Moreover, their detectability in these body fluids is a key advantage because it permits convenient, minimally invasive sample collection, which is an important feature of an ideal biomarker. Most exciting is the validation of a circRNA that is prostate- and prostate cancer-specific and detectable in the plasma of patients. Overall, further investigations are needed before circRNAs can truly be labeled as biomarkers. Firstly, it might be useful to focus on functionally characterizing specific circRNAs in pathogenesis and/or tumorigenesis.
Molecular pathological epidemiology (MPE) research focuses on the etiology and pathogenesis of diseases. The inclusion of MPE studies in the future could provide clearer correlations between circRNAs, tumor characteristics/molecular changes, risk factors (environmental, lifestyle, microbiome, genetic mutations, etc.), and disease outcome (including tumor subtypes) in PCa patients. It would also be interesting to see whether the findings of such studies could expand the potential clinical applications of circRNAs in cancer management, specifically in constructing predictive models that could improve screening and personalized medicine. However, the success of MPE research is hindered by challenges such as the need for trans-disciplinary experts and poorer success rates with funding applications[68]. Nonetheless, MPE research generally has a strong impact[68]; thus, it is a promising direction for elevating prostate cancer research with circRNAs.
Furthermore, considering the wide expression of circRNAs, closer attention should perhaps be paid to defining disease-specific circRNA panels that could be used in addition to traditional diagnostic markers. Additionally, for clinical validation, sample processing, detection methods, and interpretation (cut-off) values need to be standardized across studies before the clinical capacity of circRNAs as biomarkers can truly be established. Nonetheless, with the growing capacity of next-generation sequencing and bioinformatics, the knowledge of circRNAs and their biomarker potential will undoubtedly continue to expand.