Functional pseudogenes inhibit the superoxide production

Department of Cardiovascular Medicine, Key Laboratory of Molecular Cardiology, Cardiovascular Research Center, and Key Laboratory of Environment and Genes Related to Diseases, First Affiliated Hospital of Medical School, Xi'an Jiaotong University, Xi'an, 710061, Shaanxi, China
Cardiovascular Research Institute, Morehouse School of Medicine, Atlanta, Georgia 30310, USA
Department of Medicine, Morehouse School of Medicine, Atlanta, Georgia 30310, USA
Department of Obstetrics and Gynecology, First Affiliated Hospital, Xi'an Jiaotong University, Xi'an, 710061, Shaanxi, China
4DGENOME Inc, Atlanta, Georgia 30092, USA


Introduction
The Neutrophil Cytosolic Factor 1 (NCF1) gene encodes p47phox, the regulatory component of NADPH oxidase, the enzyme that converts oxygen to superoxide [1][2][3]. NADPH oxidase is primarily active in immune cells and plays an essential role in the immune system. When foreign invaders trigger the phosphorylation of p47phox, it binds the other cytosolic components, migrates to the membrane and assembles the active enzyme [4,5]. Mutations in the NCF1 gene cause chronic granulomatous disease (CGD), a disorder characterized by recurrent episodes of infection and inflammation due to a weakened immune system [6].

RESEARCH ARTICLE
NCF1 has two pseudogene partners (NCF1B and NCF1C) co-localized at the same genomic locus in the human genome [7]. We recently discovered a copy number variation (CNV) of these two NCF1 pseudogenes in human populations, with copy numbers differing significantly between populations [8]. The two pseudogenes and their true gene partner are nearly identical (>99.5%), but the pseudogenes contain a 2-bp deletion (∆GT) in their second exon, which causes a frameshift and a premature stop codon. They therefore cannot produce the full-length p47phox protein that the true gene does [9][10][11], and were thought to be non-functional. However, we found that only about 25% of NCF1 pseudogene transcripts use the mutant ∆GT exon; the other transcripts do not use it [8]. This result led us to hypothesize that the two NCF1 pseudogenes mainly produce transcripts that skip the ∆GT mutation through alternative splicing and thus are not necessarily non-functional. Moreover, the two pseudogenes appeared to be under tight transcriptional control, and their expression level is related to cell differentiation [8]. These observations prompted us to examine the functional impact of these two NCF1 pseudogenes.

Samples and reagents
Total RNAs of human tissues were purchased from Clontech (Mountain View, CA, USA). Human umbilical vein endothelial cells (HUVECs) were purchased from the American Type Culture Collection (ATCC, Manassas, VA, USA). Phorbol myristate acetate (PMA) was purchased from Sigma (Saint Louis, MO, USA). The fluorescent sensor hydrocyanine 3 (H-cy3) was synthesized in the laboratory of Dr. Niren Murthy.

Cloning of the NCF1 splicing isoforms
PCR products were visualized by agarose gel electrophoresis; bands were then excised, collected and purified with the Qiagen Gel Purification Kit (Valencia, CA, USA). The purified DNAs were cloned into the pGEM-T vector (Promega, Madison, WI, USA) and transformed into E. coli competent cells. About 200 colonies were subjected to DNA extraction with the Qiagen Miniprep Kit, followed by Sanger sequencing. For the validated novel splicing isoforms, the full-length cDNAs were then amplified from the human tissue RNAs and subcloned in frame into the mammalian expression vector pCMV-HA (Clontech, Mountain View, CA) at its EcoRI/NotI restriction sites. This vector has a hemagglutinin A (HA) tag at the C-terminus.

Cell culture and transfection
Human umbilical vein endothelial cells (HUVECs) were cultured and maintained in DMEM medium (Invitrogen). When cells reached 80% confluence, they were transfected with 0.4 µg of plasmid DNA using the Invitrogen Lipofectamine LTX kit. Three controls were included throughout these experiments: 1) a control without PMA challenge, 2) a control without transfection but with PMA challenge, and 3) a control with both vector transfection and PMA challenge. In all transient transfection assays, a plasmid encoding the enhanced green fluorescent protein (pEGFP-N1, Clontech Catalog #6085-1) was co-transfected with the vectors encoding the NCF1 isoforms to monitor the transfection efficiency.

ROS (reactive oxygen species) measurement
We used two methods to measure the levels of superoxide and ROS in HUVECs. In the first method, cells were transfected with NCF1 splicing isoforms for 48 hours, then incubated with hydrocyanine 3 (H-cy3) (provided by Dr. Niren Murthy's lab) for 15 minutes. H-cy3 is a fluorescent sensor that reacts with ROS in live cells, which enables direct observation and measurement under a fluorescent microscope [12]. The cells were then washed and challenged with 100 nM phorbol myristate acetate (PMA) for 1.5 hours. The fluorescence intensity, which indicates the superoxide level, was quantified with the ImageJ software. In the second method, cells were transfected with NCF1 splicing isoforms for 48 hours, then washed and incubated with the ROS/superoxide detection mix (Total ROS/Superoxide Detection Kit, Enzo Life Sciences, Farmingdale, NY, USA) for 1 hour, and treated with 50 µM pyocyanin (from the same kit) in growth medium for 1 hour. The fluorescent signals were imaged with a fluorescent microscope and quantified with the ImageJ software. This method measures the levels of superoxide and general ROS simultaneously.

Subcellular localization of NCF1 splicing isoforms in HUVECs
Transfected HUVECs were cultured in chamber slides for 48 hours. The cells were fixed with 0.01% formaldehyde for 10 minutes at room temperature, permeabilized with 0.5% Tween 20 in PBS for 10 minutes, and blocked with 5% BSA for 1 hour. The cells were subsequently incubated with an anti-HA monoclonal antibody (Sigma) (2 µg/ml in PBS containing 1% BSA and 0.5% Tween 20) for 1 hour, and then with Alexa Fluor 488-labeled goat anti-mouse IgG (Invitrogen) for 1 hour. Nuclei were stained with DAPI (4',6-diamidino-2-phenylindole, diluted 1:1000 in PBS). Images were visualized under a fluorescent microscope.

Bioinformatics analysis
The UCSC Genome Browser was used to analyze the exon structure of NCF1 splicing isoforms following the ag-gt splicing rule. NCBI ORF Finder was used to predict the protein sequences translated from the spliced mRNA isoforms. Domain structures were analyzed with the NCBI CDD tool (Conserved Domain Database). Alignment of protein sequences and mRNA sequences was carried out with the EMBL-EBI ClustalW2 multiple sequence alignment software.
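The idea behind the ORF prediction step, and why a 2-bp deletion such as ∆GT truncates the protein, can be illustrated with a minimal sketch. This is not the NCBI ORF Finder algorithm: it scans only the forward strand, and the function names `longest_orf` and `delete_bases` and the toy sequence are hypothetical and do not come from NCF1.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG...stop ORF on the forward strand.

    `end` is exclusive and includes the stop codon; (0, 0) means no ORF found.
    """
    seq = seq.upper()
    best = (0, 0)
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Read codon by codon from this ATG until the first in-frame stop.
        for pos in range(start, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if pos + 3 - start > best[1] - best[0]:
                    best = (start, pos + 3)
                break
    return best


def delete_bases(seq, index, length):
    """Return seq with `length` bases removed starting at `index`."""
    return seq[:index] + seq[index + length:]


# Hypothetical toy sequence (not NCF1): ATG GCT GTT GAC GAT CCA AAG TAA
wild_type = "ATGGCTGTTGACGATCCAAAGTAA"
# Remove the "GT" at positions 6-7 to mimic a 2-bp deletion.
mutant = delete_bases(wild_type, 6, 2)

wt_start, wt_end = longest_orf(wild_type)
mut_start, mut_end = longest_orf(mutant)
print("wild-type ORF length (nt):", wt_end - wt_start)
print("mutant ORF length (nt):", mut_end - mut_start)
```

In the toy sequence the deletion shifts the reading frame so that a TGA stop codon appears immediately downstream, which shortens the open reading frame.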

Statistical analysis
All experiments were performed in duplicate and repeated at least three times. The ROS levels were normalized to the co-transfected GFP levels before comparison among groups. Statistical significance was examined by Student's t-test.
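The normalization and comparison described above can be sketched as follows. This is a minimal illustration, assuming an unpaired two-sample Student's t-test with pooled variance (the text does not say whether the test was paired or unpaired). The function names and all intensity values are hypothetical, not measured data, and the p-value would still have to be looked up from the t distribution for the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def normalize(ros, gfp):
    """Divide each ROS intensity by the GFP intensity of the same well."""
    return [r / g for r, g in zip(ros, gfp)]


def student_t(a, b):
    """Unpaired Student's t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical fluorescence intensities (arbitrary units), three repeats per group.
ros_wild_type = [820.0, 790.0, 845.0]
gfp_wild_type = [1.00, 0.95, 1.05]
ros_isoform = [430.0, 455.0, 410.0]
gfp_isoform = [0.98, 1.02, 0.96]

wild_type_norm = normalize(ros_wild_type, gfp_wild_type)
isoform_norm = normalize(ros_isoform, gfp_isoform)
t_stat, dof = student_t(wild_type_norm, isoform_norm)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof}")
```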

Discovery of novel splicing isoforms transcribed from the NCF1 pseudogenes
Agarose gel electrophoresis of the PCR products amplified from the pooled human tissue cDNAs displayed several bands between 0.75 and 1.5 kb. After cloning and sequencing, we identified a number of novel splicing variants. Among these variants, the isoform S201 was derived from the NCF1 gene, whereas S21, S22 and S85 were derived from the NCF1 pseudogenes, according to their sequences at the position of the GT deletion. The splicing variant S21 was composed of 12 exons: it contained a newly discovered exon, located in the previous intron-1, as the new exon-2, and the previous GT-deleted defective "exon-2" was used as exon-3 (Fig. 1A). The other three splicing variants had 11 exons, the same as the wild-type transcripts from the true gene. Compared with the wild-type transcripts, S22 used an alternative exon-8, S85 skipped exon-3 and exon-4, and S201 used an alternative exon-6 (Fig. 1A).
The amino acid sequences were aligned with the wild-type sequence (S8) translated from the NCF1 true gene transcripts (Fig. 2). Compared with the wild-type p47phox protein (Fig. 1B), S21 lacked the N-terminal 26 amino acids (1-26) and had a truncated PX (Phagocyte oxidase) domain together with intact SH3 domains, the AIR domain and the PRR domain; S22 lacked the first 24 amino acids (1-24) and some amino acids at the C-terminus, so it had only a truncated PX domain and an SH3 domain; S85 lacked the first 131 amino acids (1-131), so it had all domains except the PX domain; S201 had one SH3 domain, the AIR domain and the PRR domain, without the PX domain and the other SH3 domain.

Transient transfection of pseudogene products affected the superoxide production
We examined the functional impact of the alternative splicing isoforms (S8, S21, S22, S85, and S201) on superoxide production. We used two methods to measure the levels of superoxide and ROS: the fluorescent probe hydrocyanine-3 (H-cy3) [12] that we synthesized and the commercial kit for total ROS/superoxide detection (Enzo Life Sciences). Experiments were performed in a double-blinded design, and the observations from the two measuring approaches were consistent. Compared with the cells transfected with the wild-type construct (S8), the cells transfected with the splicing variant S21 had a slightly (120%) but statistically significantly higher level of superoxide upon PMA stimulation (Fig. 3 and Fig. 4). The cells transfected with any of the other three splicing variants, S22, S85, and S201, showed significantly lower levels of superoxide, close to half of the level in the cells transfected with the S8 wild-type construct (S22, 62.2%; S85, 55.6%; S201, 50.7%) (Fig. 3 and Fig. 4). Furthermore, the levels of superoxide generation in the cells transfected with these splicing isoforms (S22, S85, S201) were even lower than that in the cells transfected with the empty vector (S22, 77.5%; S85, 69.3%; S201, 63.1%), indicating that these three splicing variants might inhibit the endogenous NADPH oxidase activity in HUVECs.
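The two sets of percentages reported above can be cross-checked against each other. As a rough sketch, assuming both sets describe the same measurement, dividing an isoform's level relative to S8 by its level relative to the empty vector gives the empty-vector level as a percentage of S8. The function name `implied_vector_level` is hypothetical; the input numbers are the ones quoted in the text.

```python
# Superoxide level of each isoform as a percentage of the wild-type (S8) level.
relative_to_wild_type = {"S22": 62.2, "S85": 55.6, "S201": 50.7}
# Superoxide level of each isoform as a percentage of the empty-vector level.
relative_to_vector = {"S22": 77.5, "S85": 69.3, "S201": 63.1}


def implied_vector_level(pct_of_wild_type, pct_of_vector):
    """Empty-vector level as a percentage of the wild-type level."""
    return 100.0 * pct_of_wild_type / pct_of_vector


implied = {
    name: implied_vector_level(relative_to_wild_type[name], relative_to_vector[name])
    for name in relative_to_wild_type
}
for name, value in implied.items():
    print(f"{name}: empty vector is about {value:.1f}% of S8")
```

All three isoforms imply an empty-vector level of roughly 80% of the S8 level, so the two sets of figures are mutually consistent.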

Subcellular localization of NCF1 splicing isoforms in HUVEC cells
To validate translation of the novel mRNA splicing isoforms and to explore their subcellular localization, we used an anti-HA tag antibody to stain the cells transfected with those NCF1 splicing isoforms. Observation under a fluorescent microscope showed that the proteins translated from these splicing isoforms were localized in the cytoplasm in the perinuclear area, whereas no staining was observed in the nucleus or at the plasma membrane (Fig. 5). This localization matched the area where superoxide was produced, as characterized by hydrocyanine-3 staining (Fig. 3A).

Expression of NCF1 splicing variants in human normal tissues and infarcted heart
To determine the tissue distribution of the NCF1 splicing variants, isoform-specific primers were designed based on their unique exons. First, quantitative real-time PCR showed that these isoforms were expressed predominantly in cells of the immune system. In addition, blood vessels, small intestine, pancreas and brain also showed significant expression (Fig. 6). Second, the isoforms showed clearly distinct patterns of tissue distribution. For example, the expression of isoform S21 was extremely low in all heart samples, whereas isoform S201 was abundant in these heart samples (Fig. 6). Third, the expression of the inhibitory isoforms (S22, S85 and S201) in infarcted heart was significantly lower than in normal heart (Fig. 7); among them, S201 showed the largest reduction (3-fold decrease, P=0.00063). In contrast, S21 showed a 7.5-fold higher expression level in the infarcted heart than in normal heart (P=0.000021).
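The text does not state how fold changes were calculated from the qPCR data beyond normalization to GAPDH. As an illustration only, the sketch below assumes the widely used 2^(-ΔΔCt) method with GAPDH as the reference gene. The function name `fold_change` and every Ct value are hypothetical and are not data from this study.

```python
def fold_change(ct_target_case, ct_ref_case, ct_target_control, ct_ref_control):
    """Relative expression (case vs. control) by the 2^(-ddCt) method.

    Assumes roughly 100% amplification efficiency for target and reference.
    """
    delta_case = ct_target_case - ct_ref_case           # normalize case to reference
    delta_control = ct_target_control - ct_ref_control  # normalize control to reference
    return 2.0 ** -(delta_case - delta_control)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier in the
# case sample, while the reference gene (e.g. GAPDH) is unchanged.
up = fold_change(24.0, 18.0, 27.0, 18.0)
# Hypothetical Ct values: the target crosses threshold 1 cycle later.
down = fold_change(26.0, 18.0, 25.0, 18.0)
print("fold change (up):", up)
print("fold change (down):", down)
```

A value above 1 indicates higher expression in the case sample than in the control, and a value below 1 indicates lower expression.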

Discussion
In our study, we observed that the NCF1 pseudogenes can generate products that markedly reduce superoxide generation. P47phox has a strictly conserved domain architecture from fishes to mammals (Fig. 8) [4]. These domains include a PX domain for membrane translocation [13]; two tandem SH3 domains for binding to p22phox on the membrane [14,15]; an auto-inhibitory region (AIR) that locks the p47phox activity through an intra-molecular interaction in the resting state [4]; and a proline-rich region (PRR) for recruiting p67phox [16][17][18]. Upon stimulation, p47phox is phosphorylated, which unlocks the AIR domain and releases the other p47phox domains to recruit the p67phox and p40phox subunits and transport them to the membrane, where they activate the membrane subunits (gp91phox and p22phox) [4,19]. The protein domain analysis showed that the isoforms produced from the pseudogenes via alternative splicing lack some domains, especially the PX domain (responsible for translocation to the membrane), the PRR (recruiting p67phox) and the AIR domain (turn-on switch).
Copy number variations at the NCF1 locus have been suggested to be associated with hypertension, Williams-Beuren syndrome (WBS), rheumatoid arthritis and autoimmune diseases [20][21][22][23]. In particular, it is well known that mutations in the true NCF1 gene cause chronic granulomatous disease (CGD), which is transmitted in an autosomal recessive fashion [6]. When both alleles of the true NCF1 gene are mutant, the individual develops CGD; the NCF1 pseudogenes cannot compensate for the loss of function of the mutant true NCF1 gene. Therefore, the pseudogenes are unlikely to have a function similar to that of the true gene product. Consistent with this hypothesis, we found that the products of these pseudogenes act as inhibitors of the true gene product. These pseudogene products cannot make up for the loss caused by NCF1 mutations in CGD patients, but they may serve as modulators of superoxide production in non-immune cells.
Quantitative real-time PCR analysis with isoform-specific primers showed a tissue-specific distribution of the NCF1 pseudogene splicing isoforms (Fig. 6). For example, S21 expression was extremely low in the atria and ventricles of the heart, whereas S201 was expressed abundantly in the heart (Fig. 6). The expression of the isoforms with inhibitory function (S22, S85 and S201) was significantly reduced in the infarcted heart compared with the healthy heart (Fig. 7); in contrast, the isoform S21 showed a marked increase in expression in the infarcted heart (P=0.000021). Together with our previous finding that NCF1 pseudogene splicing is related to monocyte differentiation [8], this suggests that these pseudogenes may be of physiological relevance. Superoxide is a double-edged sword [24]. On one hand, it is needed by phagocytic cells to kill bacteria and parasites; on the other hand, it may damage the host's non-immune cells when an excess amount is generated [25][26][27][28]. It is critically important to maintain the balance between a sufficient amount of superoxide in the immune cells and the essential amount of superoxide in other cell types. It is therefore reasonable for an organism to develop an endogenous control system for fine-tuning superoxide production; the NCF1 pseudogenes may be one member of this system.
It is estimated that the human genome has ~20,000 pseudogenes [29], which is close to the total number of protein-coding genes. The regulation of pseudogene expression differs among cell lineages and among physiological and pathological processes [30,31]. Pseudogenes are often partners of genes in the cell differentiation system. This study challenges the concept that pseudogenes in the human genome are merely junk without biological function. If this phenomenon is broadly present among other pseudogenes in the human genome, this large number of pseudogenes should receive sufficient attention when investigating the pathogenesis of complex diseases, because genetic variants in these pseudogenes may also contribute to the diseases; moreover, these pseudogenes should be regarded as a natural resource for novel drug discovery.

Figure 1. Novel splicing isoforms from the human NCF1 true gene or pseudogenes. A. Exons of the newly identified splicing transcripts (S21, S22, S85, S201) and the wild-type transcript from the true NCF1 gene (S8). Each box represents an exon; the arrows on the gene represent the translation start sites (ATG). Mutant exons are indicated by red boxes. B. The protein domain architecture of the newly identified splicing variants. Three phosphorylation sites on the AIR domain are indicated by red circles.

Figure 2. Alignment of protein sequences of splicing isoforms.Blue, the PX domain; green, the SH3 domains; red, the Auto-Inhibitory Region (AIR); purple, the Proline-Rich Region (PRR).

Figure 3. Effect of NCF1 splicing variants on superoxide production as measured by the hydrocyanine-3 method. A. Fluorescent images (20x) of HUVECs transfected with various plasmids and stained with hydrocyanine-3. B. Superoxide production quantified by the intensities of fluorescent signals. The graph represents the mean ± SD (*p<0.01).

Figure 4. Effect of NCF1 splicing variants on ROS as measured by the pyocyanin method. A. Fluorescent images (20x) of HUVECs transfected with various plasmids and stained with ROS (green)/superoxide (red) detection mix and pyocyanin. B. ROS (green)/superoxide (red) production quantified by the intensities of fluorescent signals. The graph represents the mean ± SD (*p<0.01).

Figure 5. Subcellular localization of NCF1 splicing variants. HA-tagged NCF1 splicing isoforms (green) expressed in HUVECs were detected by an anti-HA mouse monoclonal antibody (Sigma) probed with Alexa Fluor 488-labeled goat anti-mouse IgG (Invitrogen). Nuclei (blue) were stained with DAPI. Images (20x) were visualized with a fluorescent microscope.

Figure 6. Tissue-specific expression of splicing isoforms from the human NCF1 pseudogenes. Quantitative real-time PCR was carried out to measure the expression levels of different splicing isoforms in human tissues. The data were normalized to the GAPDH level and transformed to a logarithmic scale.

Figure 7. Differential expression of splicing isoforms in normal heart and infarcted heart as measured by quantitative real-time PCR. The data were normalized to the GAPDH level.