1From the Departments of Ophthalmology, 2Cell Biology, and 4Medicine and Center for Human Genetics, Duke University Medical Center, Durham, North Carolina; the 5Human Genetics Center, University of Texas Health Sciences Center, Houston, Texas; the 6National Eye Institute, Bethesda, Maryland; and the 7National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina.
| Abstract |
|---|
METHODS. Two human retina and two retinal pigment epithelium (RPE)/choroid SAGE libraries made from matched macula or midperipheral retina and adjacent RPE/choroid of morphologically normal 28- to 66-year-old donors and a human central retina longSAGE library made from 41- to 66-year-old donors were generated. Their transcription profiles were entered into a relational database, EyeSAGE, including microarray expression profiles of retina and publicly available normal human tissue SAGE libraries. EyeSAGE was used to identify retina- and RPE-specific and -associated genes, and candidate genes for retina and RPE disease loci. Differential and/or cell-type specific expression was validated by quantitative and single-cell RT-PCR.
RESULTS. Cone photoreceptor-associated gene expression was elevated in the macula transcription profiles. Analysis of the longSAGE retina tags enhanced tag-to-gene mapping and revealed alternatively spliced genes. Analysis of candidate gene expression tables for the identified Bardet-Biedl syndrome disease gene (BBS5) in the BBS5 disease region table yielded BBS5 as the top candidate. Compelling candidates for inherited retina diseases were identified.
CONCLUSIONS. The EyeSAGE database, combining three different gene-profiling platforms including the authors' multidonor-derived retina/RPE SAGE libraries and existing single-donor retina/RPE libraries, is a powerful resource for definition of the retina and RPE transcriptomes. It can be used to identify retina-specific genes, including alternatively spliced transcripts, and to prioritize candidate genes within mapped retinal disease regions.
Gene expression profiling in the human macula is complicated by the technical difficulties inherent in isolating RNA from human donor tissue and is further confounded by the presence of pigmented RPE cells (for an excellent review of these technical hurdles, please see Chowers et al.8 ). Recent technical advances in large-scale, RNA-based technologies and analysis strategies (e.g., microarray technologies and serial analysis of gene expression, or SAGE9 ), have vastly increased the extent of the transcriptome that can be uncovered from small amounts of starting material.
SAGE is a powerful technique that provides quantitative and comprehensive gene expression profiling.9 Conventional or shortSAGE uses short 14-bp tags of internal transcript signatures9 to identify and quantify individual gene transcripts. We describe the construction of four new shortSAGE libraries representing topographic regions of human retina and RPE/choroid, as well as the first retinal longSAGE library. LongSAGE produces 21-bp transcript tags that, owing to their increased length, can be used for direct assignment to genome sequences and for identification of novel genes and alternative transcripts.10 11 Analysis of the human retina and RPE transcriptomes has been approached using a wide array of large-scale expression profiling methodologies (please see Refs. 12, 13 for reviews), including SAGE14 and microarrays8 15 16 17 used here and suppression subtraction hybridization.12 These studies have been instrumental in developing knowledge of the retina and RPE transcriptomes. To enhance the utility of these data we created EyeSAGE, a relational database for the analysis and presentation of human retina and RPE/choroid SAGE and microarray expression profiles, compared with transcript expression in the human body or brain. The unique strength of the EyeSAGE database is that by examining the tissue-specific patterns of gene expression, we are able to identify sets of transcripts that are characteristic of subpopulations of cells within the retina, including cell populations that would be difficult or impossible to isolate physically. By combining these cell-type-specific expression profiles with the results of genomic linkage studies, we have identified candidate genes for a variety of ocular genetic disorders. The EyeSAGE database is posted at the National Eye Institute's NEIBank Web site (http://neibank.nei.nih.gov/index.shtml/ provided in the public domain by the National Eye Institute, Bethesda, MD) and the candidate retinal disease gene expression tables are also available at our Web site (http://www.duke.edu/bowes007/EyeSAGE.htm/ provided in the public domain by Duke University, Durham, NC), as well as through RetNet (http://www.sph.uth.tmc.edu/RetNet/ provided in the public domain by the University of Texas Houston Health Science Center, Houston, TX).
| Materials and Methods |
|---|
Retina-derived RNA was isolated in Trizol (Invitrogen, Carlsbad, CA) plus glycogen and quantified (RiboGreen; Invitrogen, Eugene, OR).19 RPE/choroid-derived RNA was isolated using the same methods and further treated to remove visible melanin contamination as previously described.20 RPE-enriched RNA was prepared from RPE cells carefully brushed off the posterior cup and processed as previously described.20 RNA quality was verified on a 0.8% agarose gel and by real-time quantitative RT-PCR (qRT-PCR) analysis of known tissue-specific genes.19 21
Synthesis and Analysis of SAGE Libraries
ShortSAGE libraries were constructed from 10 µg RNA, using NlaIII as the anchoring enzyme and standard methodologies.22 SAGE libraries were sequenced at Agencourt Bioscience (Beverly, MA) to a depth of approximately 100,000 tags per library (Table 2) . The 4cRET library was constructed with a longSAGE kit (I-SAGE; Invitrogen) and sequenced at Agencourt to a depth of 98,408 tags.
cDNA Microarray-Based Expression Profiling
Total RNA was isolated from 4-mm trephine punches of pooled human maculas and 4-mm trephine punches of pooled midperipheral retina from the same donors used for the retina SAGE libraries (Table 1) and used to probe a human UniGene 1 LifeArray (UniGene: 8466 unique genes; Incyte Genomics, Palo Alto, CA). RNA (5 µg) from each retina region was submitted to Incyte for T7 amplification and array hybridization. To correct for variations in data, the average signal from all elements in the Cy3 channel (the macula-derived probe) was divided by the average signal from all elements in the Cy5 channel (the peripheral retina-derived probe), resulting in the balance coefficient. The Cy5 signal for each element was then multiplied by the balance coefficient before calculating the balanced differential expression ratio. The balanced differential expression ratio was calculated as Cy3/Cy5 if the Cy3 signal was greater, reported as a positive number, or Cy5/Cy3 if the Cy5 signal was greater, reported as a negative number. According to Incyte, a balanced differential expression ratio greater than 1.7 (or less than −1.7) can be considered differentially expressed with 99% confidence. cDNA from the macula was labeled with Cy3 and cDNA from the peripheral retina with Cy5, so that negative values indicate preferential expression in the peripheral retina and positive values in the macula. Array controls included sensitivity controls (ranging from 2 to 2000 pg); variable ratios of labeled cDNA to control for preferential labeling with dye; housekeeping genes (ribosomal S9, tubulin, and 23-kDa HBP), for which there were sufficient signal levels and no differential expression; and buffer-only array spots to control for background hybridization, all performed in quadruplicate.
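The balancing and signed-ratio calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic as stated in the text, not Incyte's software; the function names are hypothetical, and the sketch assumes all signals are strictly positive.

```python
from statistics import mean


def balance_coefficient(cy3_signals, cy5_signals):
    """Average Cy3 signal divided by average Cy5 signal over all array elements."""
    return mean(cy3_signals) / mean(cy5_signals)


def balanced_ratio(cy3, cy5, coeff):
    """Signed ratio for one element.

    Positive: higher in the Cy3 (macula) channel.
    Negative: higher in the Cy5 (peripheral retina) channel.
    Assumes both signals are > 0 (hypothetical simplification).
    """
    cy5_balanced = cy5 * coeff  # scale Cy5 before comparing the channels
    if cy3 >= cy5_balanced:
        return cy3 / cy5_balanced
    return -(cy5_balanced / cy3)


def is_differential(ratio, threshold=1.7):
    """Apply the 1.7 / -1.7 cutoff quoted in the text."""
    return abs(ratio) > threshold


# Made-up signals for three array elements.
cy3 = [200.0, 400.0, 600.0]
cy5 = [100.0, 200.0, 300.0]
coeff = balance_coefficient(cy3, cy5)
ratios = [balanced_ratio(a, b, coeff) for a, b in zip(cy3, cy5)]
```

In this toy example every element has the same Cy3:Cy5 proportion, so balancing removes the global channel bias and no element is called differential.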
Real-Time qRT-PCR
Total RNA extraction, cDNA synthesis, and qRT-PCR (using intron-spanning primers) were performed as previously described.19 21 Real-time quantitation of candidate mRNAs normalized to one or more endogenous references (i.e., β-actin [ACTB] in retina or glyceraldehyde-3-phosphate dehydrogenase [GAPDH], β-2-microglobulin [B2M], and ubiquitin C [UBC] in RPE samples25) was performed on a sequence-detection system (iCycler iQ; Bio-Rad) using SYBR-Green. Expression of each candidate gene was normalized to a single endogenous control gene or to several that were then geometrically averaged,25 and the x-fold difference was calculated by the comparative threshold cycle (CT) method (2^(−ΔΔCt)).26 PCR primer sequences for each gene analyzed are available on request.
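As a worked illustration of the comparative CT calculation, the sketch below computes a fold difference from threshold cycles. It assumes 100% amplification efficiency (a doubling per cycle), under which geometrically averaging the reference-gene quantities is equivalent to taking the arithmetic mean of their Ct values. The function name and the Ct values are hypothetical.

```python
from statistics import mean


def fold_difference(ct_target_sample, ct_refs_sample,
                    ct_target_calibrator, ct_refs_calibrator):
    """Relative expression by the comparative Ct method, 2^(-ddCt).

    ct_refs_* are lists of Ct values for one or more reference genes.
    With a doubling per cycle, the geometric mean of the reference
    quantities corresponds to the arithmetic mean of their Ct values.
    """
    d_ct_sample = ct_target_sample - mean(ct_refs_sample)
    d_ct_calibrator = ct_target_calibrator - mean(ct_refs_calibrator)
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2 ** (-dd_ct)


# Made-up example: the target crosses threshold 3 cycles earlier in the
# sample than in the calibrator, relative to the same reference genes.
fold = fold_difference(22.0, [18.0, 20.0], 25.0, [18.0, 20.0])
```

Here the sample expresses the target at eight times the calibrator level, because a 3-cycle difference corresponds to 2^3.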
Photoreceptor Enrichment
The photoreceptor layer of human donor eyes was isolated using the sandwich method, as described by Nishizawa et al.27 with the following modifications. A 7.5-mm diameter punch from over the macula or central peripheral retina through the sclera was removed and placed retina-side-down onto a PBS-soaked piece of Whatman filter paper. Starting from the edge, the sclera was grasped and pulled gently. The retina remained attached, ganglion cell-side-down, to the filter paper. A piece of 0.2 µm nitrocellulose membrane was then placed directly on top of the retina. The resulting sandwich was then inverted, and the filter paper was carefully peeled away, leaving the retina intact on the nitrocellulose paper. To split the retina, a piece of dry filter paper was placed directly onto the ganglion cell retina and firmly pressed, and then the two sides were pulled apart.
Using a sterile 6- or 4-mm trephine punch, a central punch was collected from both the nitrocellulose membrane (photoreceptor layer attached) and the filter paper (inner retina attached), yielding four separate in situ cell samples (two per punch): macular photoreceptor layer and macular inner retina, or peripheral photoreceptor layer and peripheral inner retina. For each pair of donor eyes, one eye was prepared using the sandwich method, and the other was prepared in the same manner as the tissue used for preparing the SAGE libraries, in which whole 6- or 4-mm trephine punches of retina from over the macula and periphery were collected. Total RNA was isolated from the resultant six tissue samples and DNased, and cDNAs were synthesized with a cDNA synthesis kit (iScript; Bio-Rad, Hercules, CA). The efficacy of these mechanical cell separations was analyzed by qRT-PCR, which was used to compare quantitatively the expression levels of cone-, rod-, and inner retina-localized genes between the different tissue preparations.
Detection of Cone Transcripts with Single Photoreceptor Cell RT-PCR
Two-millimeter diameter punches of macular retina were isolated from human donor eyes stored at 4°C (RNAlater; Ambion) and placed in PBS. Each punch was gently triturated with a wide-bore pipette to float off individual cells. Single cone or rod photoreceptor cells were isolated on a micromanipulator (TransferMan NK2; Eppendorf, Westbury, NY) mounted on a microscope (Diaphot 200; Nikon, Tokyo, Japan) based on their visual rodlike phenotype.28 Five microliters per well of cDNA mix described by McHeyzer-Williams et al.29 was placed in a 72-well low-profile plate (Robbins Scientific, Sunnyvale, CA). Captured single cells were ejected into the cDNA mix, one cell per well, with one in six wells receiving no cells and processed as negative controls. After cDNA synthesis at 37°C in an incubator for 90 minutes, the plates were stored at −80°C. Nested primer sets were designed to amplify mRNA transcripts without genomic amplification by spanning an intron. The cDNA reaction from each cell was split in half and placed in parallel first-round RT-PCR reactions in which primers for the cone-specific gene (PDE6C) were mixed with primers for a candidate cone gene in one tube, and primers that amplify the rod-specific PDE6A were combined with the candidate primers in the other. A second round of RT-PCR was performed with 1 µL of the first-round product as the template for the reaction with nested primer pairs for the genes amplified in the first round. Products were visualized on a 3.5% acryl agarose gel.
| Results |
|---|
Building the EyeSAGE Database
One goal of this study was to generate a comprehensive picture of gene expression in the human macula that is accurate, readily accessible, and can be used as a resource to identify and quantitate cell-type-specific or -associated genes. To this end we integrated large-scale expression data obtained from this tissue with different technologies (SAGE, longSAGE, and cDNA microarrays) into a database that we named EyeSAGE. Starting with the short tag retina and RPE/choroid SAGE libraries summarized in Table 2, 160,723 unique tags were used as the first building block (column) for the EyeSAGE database. Each tag was analyzed and assigned a best gene match23 and UniGene cluster assignment (based on NCBI Build 182) if available. Genomic map positions (as nucleotide numbers along the chromosome) were assigned as previously described.24 Columns of tag counts normalized to 200,000 for each tag in each posterior eye library were added. The 4cRET longSAGE library was incorporated by matching the longSAGE tags with their reliable best gene matches, based on CGAP's SAGE Genie assignments, to the short tag, the sequence of which is the first 10 bases of the 17-bp tag. Next the tag counts for each tag in 39 additional normal tissue SAGE libraries (available at SAGE Genie) were added to incorporate expression information from a variety of tissue and cell types. Incyte cDNA microarray expression data of peripheral and macular retina were imported and linked to the SAGE data after using BLAT homology searches to assign a UniGene cluster number (Build 182) to each microarray probe. Using the convention at CGAP's SAGE Genie (http://cgap.nci.nih.gov/SAGE/AnatomicViewer), the SAGE libraries were normalized to 200,000 tags for pair-wise comparisons.
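Two of the steps above, scaling each library to 200,000 tags and relating longSAGE tags to short tags through their first 10 bases, can be sketched as follows. This is a simplified illustration, not the EyeSAGE implementation: it omits the gene-match step entirely, and the function names and tag sequences are hypothetical.

```python
from collections import Counter


def normalize_counts(raw_counts, target_total=200_000):
    """Scale raw tag counts so the library total equals target_total."""
    total = sum(raw_counts.values())
    return {tag: n * target_total / total for tag, n in raw_counts.items()}


def collapse_long_tags(long_counts, short_len=10):
    """Sum longSAGE tag counts under the short tag formed by their first bases.

    Tags are written here without the CATG anchoring site, so a long tag
    has 17 bases and the corresponding short tag has 10.
    """
    short = Counter()
    for tag, n in long_counts.items():
        short[tag[:short_len]] += n
    return dict(short)


# Made-up library of 200 raw tags.
normalized = normalize_counts({"AAAAAAAAAA": 50, "CCCCCCCCCC": 150})

# Two made-up long tags share the same 10-base prefix and are merged.
collapsed = collapse_long_tags({
    "AAAAAAAAAACCCCCCC": 3,
    "AAAAAAAAAAGGGGGGG": 2,
    "TTTTTTTTTTAAAAAAA": 1,
})
```

The second example also shows why long tags add resolution: two distinct 17-base tags are indistinguishable once truncated to 10 bases.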
The entire EyeSAGE database in Access was sorted by tag number and genes with expression (tags) totaling five or more (normalized to 200,000 therefore totaling approximately two or more raw tag counts/library), in the eight posterior-eye shortSAGE libraries combined, were exported into spreadsheet software (Excel; Microsoft; the entire Microsoft Access version of EyeSAGE is available on request). This step removes unique tags that occur as singletons in only one retina or RPE/choroid library. This version of the EyeSAGE database was used for subsequent data mining (available at NEIBank, http://neibank.nei.nih.gov/index.shtml). The EyeSAGE database is an easily searchable, comprehensive expression dataset representing the posterior eye transcriptome. In its current form, EyeSAGE can be used to analyze tissue and cell-type expression of single genes or classes of genes, or to display ocular expression over user-defined genomic regions. It can also be mined to generate large-scale views of cell-type expression. Examples of specific queries follow.
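The export filter described above (keeping tags whose normalized counts total five or more across the posterior-eye shortSAGE libraries) might look like the sketch below. The table layout, a dictionary of per-library normalized counts keyed by tag, is an assumption made for illustration; only the library names 4Mac and 4Peri are taken from the text, and the counts are made up.

```python
def filter_tags(table, eye_libraries, min_total=5):
    """Keep tags whose summed normalized count across eye libraries >= min_total.

    table: {tag: {library_name: normalized_count}} (hypothetical layout).
    """
    kept = {}
    for tag, counts in table.items():
        total = sum(counts.get(lib, 0) for lib in eye_libraries)
        if total >= min_total:
            kept[tag] = counts
    return kept


# Made-up counts for two of the eight libraries.
table = {
    "AAAAAAAAAA": {"4Mac": 4.0, "4Peri": 2.0},  # total 6, kept
    "CCCCCCCCCC": {"4Mac": 2.0},                # singleton-level, dropped
}
kept = filter_tags(table, ["4Mac", "4Peri"])
```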
Generating Cell-Type-Associated Gene Lists
We were particularly interested in examining cone-photoreceptor-associated gene expression. To derive a cone-associated profile using EyeSAGE we took advantage of the fact that cone photoreceptors are concentrated in the macula and that cone-specific transcripts should be present at higher levels in the retina than in other neural tissues. In contrast, inner retina neuron and glial cell-associated genes elevated in macula might also be found in the brain but not in non-neural tissues. These observations were translated into the following set of queries of the EyeSAGE database to generate a list of putative cone-enriched transcripts (EyeSAGE column heading is given in quotes and described in Table 3, legend):
3).
1.1).
3) the number of transcripts is reduced to 38 (see Table 3 for the top 20). Similar kinds of selection criteria can be used to generate lists of transcripts that are specific to or enriched in other cell types. In each case, the presence of known cell-specific genes was used as a reference to gauge the success of given parameters to return the desired cell-associated expression. A list of rod photoreceptor cell-associated transcripts was identified by selecting for genes with higher expression in peripheral retina than the macula and higher expression in the retina than the rest of the body (see Supplementary Table S2 online). RPE-enriched genes were identified by selecting for higher expression in the three RPE-derived SAGE libraries compared with retina and higher expression in the RPE than in the rest of the body (Supplementary Table S3 online). Finally, a list of putative ganglion cell- and inner retina-associated transcripts was identified based on a query for tags with higher expression in the macula than in the peripheral retina and a higher macula-to-periphery tag count ratio in our 4-mm punch-derived libraries (4Mac/4Peri) than in the 6-mm punch-derived retina libraries (HMac2/PeriB2). This parameter was imposed because second-order neurons and ganglion cells are concentrated in the primate macula.32 In addition, a requirement for higher tag counts in the other neural tissues but not in the rest of the body was used, because it is expected that inner retina-associated gene expression overlaps significantly with brain-expression profiles (see Supplementary Table S4 online). A search for tags with counts totaling more than 15 combined in the retina libraries and average expression greater in retina than in all other represented tissues returned a list of more than 1000 tags for genes with highest expression in the retina (see Supplementary Table S5 online).
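To make this kind of query concrete, the following sketch applies the rod-associated criteria stated above (higher in peripheral retina than macula, and higher in retina than in the rest of the body) to a small in-memory table. The field names, gene labels, and counts are hypothetical and do not correspond to actual EyeSAGE column headings or data.

```python
def rod_associated(rows):
    """Select rows with peripheral > macular and retina > body expression.

    Each row is a dict with hypothetical keys:
    'gene', 'mac', 'peri', 'retina_avg', 'body_avg'.
    """
    return [
        r for r in rows
        if r["peri"] > r["mac"] and r["retina_avg"] > r["body_avg"]
    ]


# Made-up rows illustrating each outcome.
rows = [
    {"gene": "geneA", "mac": 40, "peri": 90, "retina_avg": 65, "body_avg": 1},
    {"gene": "geneB", "mac": 80, "peri": 30, "retina_avg": 55, "body_avg": 2},
    {"gene": "geneC", "mac": 10, "peri": 20, "retina_avg": 15, "body_avg": 60},
]
hits = [r["gene"] for r in rod_associated(rows)]
```

Only geneA passes both conditions; geneB is macula-enriched and geneC is more abundant outside the retina. As the text notes, known cell-specific genes can then be checked against such a list to judge whether the chosen parameters behave as intended.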
Validation of the 4Mac and 4Peri Retina-Derived SAGE Expression Profiles
The differential gene expression patterns revealed by SAGE analysis were validated with several approaches. Digital comparison of gene expression, as tag counts, of our shortSAGE retina libraries (4Mac and 4Peri) to the longSAGE 4cRET library and to the retina and RPE shortSAGE libraries generated by Sharon et al.14 provided a qualitative assessment of differential expression of any genes found in all three of these groups. cDNA microarrays were probed with the same RNA used to generate the 4Mac and 4Peri retina SAGE libraries (see Table 1 for donors). The resultant microarray expression profiles (5836 genes out of a total of 8466 on the array) were related by UniGene cluster number to the SAGE profiles in EyeSAGE and found to match fairly consistently the gene expression given by SAGE analysis. For example, 33 of the top 100 rod-associated genes were present on the array and, of these, 27 (82%) showed the expected higher expression in the peripheral retina (see Supplementary Table S2 online).
To validate more rigorously the differential expression detected by SAGE and (when available) microarray analysis, expression of several known retina-specific genes and candidate cell-associated genes in the macula and midperipheral retina was determined by qRT-PCR. Expression of rod-specific PDE6A (phosphodiesterase 6A, cGMP-specific, rod, alpha; GeneID: 5145), cone-specific PDE6C (phosphodiesterase 6C, cGMP-specific, cone, alpha prime; GeneID: 5146) and GNAT2 (guanine nucleotide binding protein [G protein], α-transducing activity polypeptide 2; GeneID: 2780), and ganglion cell-associated THY1 (Thy-1 cell surface antigen; GeneID: 7070) was compared with expression of selected candidate cone-associated genes as well as one candidate inner retina-associated gene, UCHL1 (ubiquitin carboxyl-terminal esterase L1; GeneID: 7345; Fig. 2). In each case, this independent analysis verified the differential gene expression seen in the SAGE libraries. To further localize expression of selected cone-associated genes, qRT-PCR was performed on RNA isolated from the photoreceptor layer and inner retina of macula and peripheral retina (Fig. 3). The highest expression for the known cone photoreceptor gene, GNAT2, and candidate cone-associated genes, HR (hairless homologue [mouse]; GeneID: 55806) and CPLX4 (complexin 4; GeneID: 225644), was detected in the macula photoreceptor layer-derived RNA containing the highest concentration of cone-derived transcripts. This provides strong evidence that the HR and CPLX4 genes are transcribed in cones.
The RPEB1 library generated by Sharon et al.14 was derived from a freshly enucleated eye obtained from an 88-year-old patient in which the RPE was gently scraped off large fragments of the posterior eyecup after removal of the retina. This library may have a much lower number of retina-derived tags because the donor was older; because no punches were taken, so no mechanical compression of the retina and RPE occurred; and/or because this was fresh tissue and RNAlater was not used.14 We used the tags present in the RPEB1 library as one means to analyze the cell-source of the tags in the 4MacRPE and 4PeriRPE libraries. However, our libraries surely contain legitimate RPE tags that are absent from the RPEB1 library, because it was made from RNA from a single donor of advanced age and was sequenced to only half to two thirds the depth of the rest of the posterior eye libraries (about 54,000 versus over 100,000 tags; Table 2). Additional comparisons were therefore undertaken. Comparison to the retina libraries from both laboratories is helpful, but these are all contaminated with RPE tags for the same reasons that the RPE libraries contain retina-derived tags described herein. Retinal contamination was even detected in RPE RNA derived from pools of RPE cells isolated by laser-capture microdissection.33 Therefore, to test how well the parameters used to query the EyeSAGE database worked to generate a list of genes expressed in the RPE, qRT-PCR was performed on RNAs isolated from retina and RPE/choroid punches and compared to RNA isolated from pools of RPE cells mechanically purified from human donor eyes.20 We determined that in the RPE/choroid the housekeeping gene ACTB fluctuated in parallel reactions but not in retina. We therefore tested other housekeeping genes, GAPDH, B2M, and UBC, for normalization of gene expression levels in qRT-PCR assays using RPE-derived RNAs.25 Expression of EMP3 (epithelial membrane protein 3; GeneID: 2014) and MMP25 (matrix metallopeptidase 25; GeneID: 64386) was highest in the RPE-enriched samples, like the known RPE-associated gene RDH5 (retinol dehydrogenase 5 [11-cis and 9-cis]; GeneID: 5959; Fig. 5).
Detection of Splice Variants
The 4cRET longSAGE provides a valuable means for detecting transcript variants of known retina genes. Alternative transcripts arising from tissue-specific splice forms or from alternative polyadenylation signals are known to occur at a high rate within the retina relative to the rest of the body.34 One example of a transcript variant detected by longSAGE analysis is a short form of the transcript for PDE6G (phosphodiesterase 6G, cGMP-specific, rod, gamma; GeneID: 5148). The GenBank RefSeq accession number for PDE6G corresponds to a 1223-bp long mRNA (NM_002602), but this full-length transcript is not detected by short or longSAGE in any of the retina libraries. Instead, the most abundant tag in retina libraries maps to the NlaIII site at position 847 corresponding to a shorter PDE6G transcript. This prediction from the SAGE data was tested by qRT-PCR using one forward primer at position 828 and two reverse primers at positions 949 and 1018. The shorter product amplified at a much higher rate (>9000 times higher relative expression than the longer transcript, data not shown), verifying the transcript length predicted by SAGE.
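The reasoning here depends on how a SAGE tag is positioned: with NlaIII (recognition site CATG) as the anchoring enzyme, the tag is taken immediately 3' of the 3'-most CATG in the transcript, so a transcript that ends earlier yields a tag from an upstream site. The sketch below illustrates this on a made-up sequence; the function names are hypothetical and the sequence is not PDE6G.

```python
def nlaiii_sites(mrna):
    """Return 0-based start positions of every CATG in the sequence."""
    sites = []
    pos = mrna.find("CATG")
    while pos != -1:
        sites.append(pos)
        pos = mrna.find("CATG", pos + 1)
    return sites


def sage_tag(mrna, tag_len=10):
    """Tag immediately 3' of the 3'-most CATG, or None if unavailable.

    tag_len is 10 for shortSAGE and 17 for longSAGE when the CATG
    itself is not counted as part of the tag.
    """
    sites = nlaiii_sites(mrna)
    if not sites:
        return None
    start = sites[-1] + 4
    tag = mrna[start:start + tag_len]
    return tag if len(tag) == tag_len else None


# Made-up transcript with CATG sites at positions 2 and 20.
full_length = "GGCATGAAAACCCCGGGGTTCATGACGTACGTACGTACGTAAAA"
# Hypothetical shorter variant that ends before the second site.
short_form = full_length[:20]

tag_full = sage_tag(full_length)
tag_short = sage_tag(short_form)
```

The two forms of the same made-up gene give different tags, which is why an abundant tag at an internal NlaIII site points to a shorter transcript rather than to the full-length reference sequence.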
SAGE, particularly longSAGE, is also an excellent method for detecting antisense transcription, because the method itself is inherently directional. Detection of antisense transcription to date has primarily relied on the use of expressed sequence tag (EST) databases, where the directionality of the sequence has to be verified by the presence of canonical intron-exon splice junctions and/or poly(A) signals. A large number of ESTs lack these sequences and can therefore not be used for the analysis, leading to an underestimate of the presence of antisense transcription in the genome.35 In contrast, a 21-bp longSAGE tag that maps to only one location in the genome, but is found on the opposite strand of a normally transcribed locus, provides enough information to implicate antisense transcription of that locus. We detected antisense transcripts of rhodopsin and PDE6G by the presence of high-count longSAGE tags matching the antisense or opposite transcript strand. In addition, the cone-associated candidate gene ZNF593 was identified by its longSAGE expression, but the longSAGE tag did not match the RefSeq mRNA (accession number NM_015871). A BLAST search revealed that the tag matched to a location on 1p36 of the genomic DNA within the ZNF593 locus, but not within the mRNA sequence. An EST database search revealed the presence of the tag in several sequences, all derived from the eye. The UCSC Genome browser showed all these ESTs to be antisense transcripts of the ZNF593 locus, and the localization of the antisense transcripts is only in the eye whereas the localization of the sense transcription is ubiquitous (http://genome.ucsc.edu/cgi-bin/hgTracks?position=chr1:26179935-26183592&hgsid=59876575&intronEst=pack). Primers were designed to the antisense transcript in a region that did not overlap with the sequence for the sense transcript. qRT-PCR verified a profile consistent with cone-associated expression as predicted by the SAGE profile (Fig. 2).
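The strand logic behind calling a tag antisense can be sketched with a plain string search. This is a toy stand-in for a genome-wide alignment such as BLAST: it checks only one locus and does not verify that the tag is unique in the genome. The function names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def tag_strand(tag, locus_plus):
    """Return '+', '-', 'both', or None for where the tag matches the locus.

    locus_plus is the plus-strand genomic sequence of the locus.
    """
    on_plus = tag in locus_plus
    on_minus = reverse_complement(tag) in locus_plus
    if on_plus and on_minus:
        return "both"
    if on_plus:
        return "+"
    if on_minus:
        return "-"
    return None


def is_antisense(tag, locus_plus, gene_strand):
    """True if the tag maps unambiguously to the strand opposite the gene."""
    strand = tag_strand(tag, locus_plus)
    return strand in ("+", "-") and strand != gene_strand


# Made-up locus; the annotated gene is assumed to lie on the plus strand.
locus = "GGGGAAAACCCCTTGAGG"
sense_tag = "AAAACCCCTT"
antisense_tag = reverse_complement(sense_tag)
```

Because SAGE tags are read in the direction of the transcript, a tag that matches only the strand opposite the annotated gene implicates a transcript running the other way across the locus.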
EyeSAGE Candidate Retinal Disease Gene Expression Tables
We found retinal diseases that have been mapped, but for which the disease gene has not yet been identified, using the RetNet table Summaries of Genes Causing Retinal Diseases, Table B (by Disease, http://www.sph.uth.tmc.edu/Retnet/sum-dis.htm#B-diseases). The EyeSAGE database was queried for genes that map within these regions of linkage and that are also expressed in the retina and RPE. These lists of candidate disease genes for 50 mapped but not yet identified retina disease genes can be accessed through RetNet, listed by disease symbol, and also through NEIBank with links to the UCSC genome browser and Single-Nucleotide Polymorphism (SNP) database.
In the resultant tables, selecting candidates by eye expression levels served to reduce the number of candidate genes greatly for each locus. The EyeSAGE database correctly predicted the BBS5 gene for the mapped-and-cloned retina disease region for the autosomal recessive Bardet-Biedl syndrome, BBS5.36 37 The chromosomal markers flanking the disease locus were mapped by Beales et al.36 to a 14-Mb region of chromosome 2 (bases 160851387 to 174882568). The query of our EyeSAGE database returned 131 genes with expression in the eye in this region (at least one tag in at least one eye library). The table was sorted from highest to lowest total tag counts in the eye ("Eye Sum" in all eight shortSAGE libraries combined). Thirty-seven candidate genes had a sum of at least 15 tags in the eye libraries (representing significantly high expression) and among these, the top candidate, with the highest sum of tag counts in the eye plus fairly ubiquitous expression (tags in the nonocular libraries), was the BBS5 gene (see Supplementary Table S6 online at http://www.iovs.org/cgi/content/full/47/6/2305/DC1).37 Of course, many disease-causing genes will not be expressed at such high levels. Thus, all expressed genes within linkage intervals should be considered candidates.
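The region query described above amounts to selecting genes by map position, summing their tag counts across the eye libraries, and sorting by that sum. The sketch below illustrates the idea; the record layout, gene labels, and counts are hypothetical, while the chromosome 2 coordinates are those quoted in the text.

```python
def rank_candidates(genes, chrom, start, end, eye_libraries, min_eye_sum=1):
    """Return (symbol, eye_sum) pairs in a region, highest eye_sum first.

    genes: list of dicts with hypothetical keys
    'symbol', 'chrom', 'pos', and 'tags' ({library_name: count}).
    """
    hits = []
    for g in genes:
        if g["chrom"] != chrom or not (start <= g["pos"] <= end):
            continue
        eye_sum = sum(g["tags"].get(lib, 0) for lib in eye_libraries)
        if eye_sum >= min_eye_sum:
            hits.append((g["symbol"], eye_sum))
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Made-up genes; only GENE_A and GENE_B fall inside the interval.
genes = [
    {"symbol": "GENE_A", "chrom": "2", "pos": 161_000_000,
     "tags": {"4Mac": 10, "4Peri": 8}},
    {"symbol": "GENE_B", "chrom": "2", "pos": 170_000_000,
     "tags": {"4Mac": 1}},
    {"symbol": "GENE_C", "chrom": "2", "pos": 180_000_000,
     "tags": {"4Mac": 50}},
    {"symbol": "GENE_D", "chrom": "18", "pos": 165_000_000,
     "tags": {"4Mac": 50}},
]
ranked = rank_candidates(genes, "2", 160_851_387, 174_882_568,
                         ["4Mac", "4Peri"])
```

Raising `min_eye_sum` to 15 mirrors the stricter cutoff mentioned in the text and, in this toy data, leaves only GENE_A.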
As another example, a table of candidates for the as yet unidentified CORD1 gene on chromosome 18 was generated (see Supplementary Table S7 online). The 18-Mb CORD1 disease region contains 223 genes or UniGene clusters, of which only 79 show a sum of 10 or more tag counts in the retina and RPE libraries. Of these, the top two retina-associated genes (based on little or no expression in any of the other libraries) are the CPLX4 and RAX (retina and anterior neural fold homeobox, GeneID: 30062) genes.
The EyeSAGE database and cell-associated tables can also be used for prioritization of candidate genes for primary open-angle glaucoma (POAG), a neurodegenerative disease characterized by death of the retinal ganglion cell (RGC). The EyeSAGE database was queried ("Mac/Peri" > 1; "6 mm/4 mm" < 1; "Ret/Neural" < 1; sorted highest to lowest by "RetAve/Body and Neural Ave"; see Supplementary Table S4 online) to produce a set of 2393 genes that are enriched in the inner retina; these genes would also be expected to be enriched for transcripts found in the RGC. When these transcripts are mapped back to the genomic assembly and compared with previously published regions of POAG linkage,38 39 40 41 42 43 44 a set of 128 prioritized candidate genes emerges. This includes genes in pathways known or hypothesized to be involved in glaucomatous neurodegeneration, including apoptosis (DAD, BBC3, BCL2L2), axonal growth and regeneration (RTN4, NAB1), and calcium flux (JPH4). Moreover, this list of prioritized genes constitutes <6% of the almost 2400 UniGene clusters that map within regions of linkage. In this way, use of the cell-associated tables greatly reduces the number of genes that must be evaluated.
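A minimal sketch of this two-stage prioritization follows: first the three ratio criteria quoted above, then membership in any of several linkage intervals. The ratio keys reuse the column headings quoted in the text; the remaining field names, the intervals, and all values are made up for illustration and are not the published POAG regions.

```python
def inner_retina_query(rows):
    """Apply the three ratio criteria quoted in the text."""
    return [
        r for r in rows
        if r["Mac/Peri"] > 1 and r["6 mm/4 mm"] < 1 and r["Ret/Neural"] < 1
    ]


def in_any_interval(row, intervals):
    """True if the gene position falls inside any (chrom, start, end) interval."""
    return any(
        chrom == row["chrom"] and start <= row["pos"] <= end
        for chrom, start, end in intervals
    )


def prioritize(rows, intervals):
    """Symbols of genes passing the expression query that map within linkage."""
    return [r["symbol"] for r in inner_retina_query(rows)
            if in_any_interval(r, intervals)]


# Hypothetical linkage intervals and gene rows.
intervals = [("1", 1_000, 5_000), ("3", 20_000, 30_000)]
rows = [
    {"symbol": "G1", "chrom": "1", "pos": 2_000,
     "Mac/Peri": 1.5, "6 mm/4 mm": 0.8, "Ret/Neural": 0.5},
    {"symbol": "G2", "chrom": "1", "pos": 9_000,
     "Mac/Peri": 2.0, "6 mm/4 mm": 0.7, "Ret/Neural": 0.4},
    {"symbol": "G3", "chrom": "3", "pos": 25_000,
     "Mac/Peri": 0.9, "6 mm/4 mm": 0.6, "Ret/Neural": 0.3},
]
candidates = prioritize(rows, intervals)
```

In this toy data G2 passes the expression query but lies outside every interval, and G3 lies within an interval but fails the "Mac/Peri" criterion, so only G1 is retained.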
| Discussion |
|---|
Previously, Sharon et al. found that the cone photoreceptor contribution in their two retina SAGE libraries was similar in the macula and the peripheral retina whereas the rod contribution was higher in the periphery.14 This may seem counterintuitive because a 6-mm punch from the macula is enriched for cones (8:1 rod:cone) relative to a 6-mm peripheral retina punch (20:1 rod:cone). However, the macula also contains the highest concentration of ganglion cells (60%) and associated interneurons.53 In addition, when comparing SAGE libraries generated from single donors, individual-to-individual variation can complicate the identification of real tissue or cell-associated gene expression differences.30 Thus, we hypothesized that generating additional SAGE libraries from matched pooled donor sets could reduce the impact of individual variability and that additional human retina/RPE transcriptome profiles would validate gene expression differences identified previously if these were upheld in this pooled donor set. In fact, analysis of the resulting expression profiles in EyeSAGE bore this out (see Table 3 or supplementary cell-associated tables online for examples).
The EyeSAGE database, and particularly the first retina longSAGE library (4cRET), provides a unique opportunity to use eye transcriptome data in novel ways. The 21-bp longSAGE tags can, in most cases, be mapped to a single physical location in the human genome. This specificity overcomes much of the redundancy and uncertainty that can occur with the 14-bp tag assignments, and allows for the use of NCBI BLAST searches to assign tags that are not assigned to a gene by current SAGE Genie and SAGEmap resources. These advantages allowed us to identify new retina and photoreceptor-specific candidate genes, and detect transcript variants and antisense transcripts of known genes. For example, unlike standard SAGE, longSAGE is able to uniquely identify the cone-associated gene KIAA1345 (KIAA1345 protein; GeneID: 57545), which falls within the narrowest current mapping of the disease locus for MCDR2, an autosomal dominant inherited macular degeneration for which the disease gene has not yet been identified (RetNet, http://www.sph.uth.tmc.edu/RetNet/).54
LongSAGE analysis also facilitated identification of transcript variants and antisense transcription in the retina. For example, we identified a shorter transcript variant of PDE6G expressed at a much higher level than the reference sequence. LongSAGE also provided strong evidence for antisense transcription of PDE6G and rhodopsin, and allowed for identification of a retina-specific antisense transcript of the ubiquitously expressed gene ZNF593 (zinc finger protein 593; GeneID: 51042). These findings are particularly intriguing in the context of the recent study by Alfano et al. 55 establishing the importance of natural antisense transcription in eye development.
Analysis of the longSAGE tags also enhanced annotation of known genes and of their corresponding short tags. On SAGE Genie, a short tag is often designated as the best tag for more than one gene. It can then be difficult to determine which gene the tag counts represent, because they may reflect the expression of either gene or the sum of both. The algorithms used by SAGE Genie to assign the best gene match to a tag favor genes that are more ubiquitously expressed, so retina-specific genes, which often have fewer archived cDNA sequences, are underrepresented by these methods.23 Comparison to the longSAGE tag counts for the genes in the retina provides a means to estimate what percentage of the short tag counts correspond to expression of a given gene in the retina and thus which is the true tissue- or cell-associated gene. For example, the shortSAGE tag, CTGTTGATTT, emerged in the cone-associated expression profile, but it was assigned to two different genes. LongSAGE analysis identified the correct cone-associated gene to be GUCA1C (Table 3), as has been previously reported.56
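One way such an estimate could be computed is sketched below in Python. It is a minimal illustration, not the procedure used to build EyeSAGE. Tags are written without the CATG anchor (10 bases for a short tag, 17 for a long tag), and the tag sequences, counts, gene labels (`GENE_A`, `GENE_B`, `GENE_C`), and the function name `apportion_short_tag` are all hypothetical.

```python
from collections import defaultdict

SHORT_LEN = 10  # bases after the CATG anchor in a short SAGE tag

def apportion_short_tag(short_tag, long_tag_counts, long_tag_gene):
    """Estimate the fraction of a short tag's counts attributable to each gene.

    long_tag_counts maps each long tag to its count in the longSAGE library;
    long_tag_gene maps each long tag to its assigned gene. Every long tag
    whose first SHORT_LEN bases equal short_tag contributes its count to
    the gene it is assigned to.
    """
    per_gene = defaultdict(int)
    for long_tag, count in long_tag_counts.items():
        if long_tag[:SHORT_LEN] == short_tag:
            per_gene[long_tag_gene.get(long_tag, "unassigned")] += count
    total = sum(per_gene.values())
    if total == 0:
        return {}
    return {gene: n / total for gene, n in per_gene.items()}

# Hypothetical long-tag counts: two long tags share one 10-base prefix.
long_tag_counts = {
    "GATTACAGGTACGTACG": 9,
    "GATTACAGGTGGCCTTA": 1,
    "TTTTTTTTTTCCCCCCC": 5,
}
long_tag_gene = {
    "GATTACAGGTACGTACG": "GENE_A",
    "GATTACAGGTGGCCTTA": "GENE_B",
    "TTTTTTTTTTCCCCCCC": "GENE_C",
}

shares = apportion_short_tag("GATTACAGGT", long_tag_counts, long_tag_gene)
for gene, share in sorted(shares.items()):
    print(f"{gene}: {share:.0%} of the short-tag counts")
```

With these invented counts, most of the ambiguous short tag's signal would be attributed to `GENE_A`, mirroring how the longSAGE counts point to a single gene as the likely source of a shared short tag.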
We also used the EyeSAGE database to produce candidate retina genes for mapped-but-not-yet-identified retina disease loci. We tested the validity of this approach by applying the data-mining paradigm for candidate retina disease genes to a retina disease, BBS5, for which the gene has been identified, BBS5,37 and returned BBS5 as the top candidate. Evaluation of the candidate disease genes for the autosomal dominant cone-rod dystrophy 1 (CORD1)57 yielded the gene for complexin IV, CPLX4, among the top candidates (see Supplementary Table S7 online). Complexin IV has very recently been localized to murine photoreceptor ribbon synapses where it modulates transmitter release.58 This is intriguing because CPLX4 is similar to another gene, RIMS1, responsible for the cone-rod dystrophy, CORD7.59 RIMS1 codes for a presynaptic protein expressed in brain and photoreceptors that also localizes to ribbon synapses where it functions in glutamate neurotransmission. Based on its expression profile in EyeSAGE and on qRT-PCR, in which it is elevated in the macula photoreceptor layer relative to the peripheral retina photoreceptor layer, CPLX4 behaves like a cone-associated gene (Table 3, Fig. 3). Future studies will determine whether CPLX4 mutations are associated with CORD1.
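The general idea of prioritizing genes within a mapped interval can be sketched in Python as follows. The scoring rule (ratio of retina expression to mean expression in other tissues, with a pseudocount) is an assumption made for illustration and is not the set of criteria used to generate the EyeSAGE candidate tables. The gene labels, coordinates, and expression values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class GeneProfile:
    symbol: str
    chrom: str
    start: int
    end: int
    retina_tpm: float  # tags per million in retina libraries
    other_tpm: float   # mean tags per million in non-eye libraries

def rank_candidates(genes, chrom, lo, hi, pseudocount=1.0):
    """Return genes overlapping chrom:lo-hi, most retina-enriched first.

    Enrichment is (retina + pseudocount) / (other + pseudocount); this
    scoring rule is an illustrative assumption.
    """
    in_locus = [
        g for g in genes
        if g.chrom == chrom and g.end >= lo and g.start <= hi
    ]

    def enrichment(g):
        return (g.retina_tpm + pseudocount) / (g.other_tpm + pseudocount)

    return sorted(in_locus, key=enrichment, reverse=True)

# Hypothetical genes; only GENE_A and GENE_B lie in the queried interval.
genes = [
    GeneProfile("GENE_A", "chr1", 1_000, 2_000, retina_tpm=120.0, other_tpm=3.0),
    GeneProfile("GENE_B", "chr1", 3_000, 4_000, retina_tpm=40.0, other_tpm=60.0),
    GeneProfile("GENE_C", "chr1", 9_000, 9_500, retina_tpm=500.0, other_tpm=1.0),
    GeneProfile("GENE_D", "chr2", 1_500, 2_500, retina_tpm=300.0, other_tpm=2.0),
]

for gene in rank_candidates(genes, "chr1", 500, 5_000):
    print(gene.symbol)
```

A validation in the spirit of the BBS5 test would query an interval whose disease gene is already known and check whether that gene ranks at or near the top.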
In summary, these new SAGE libraries of the human retina and RPE/choroid and the relational transcriptome database EyeSAGE can be used to identify tissue and regional specificity of retinal gene expression and to assess globally the alternative transcription occurring in the human retina. These data can also be used to identify candidate retinal disease genes, either by using the candidate tables generated and now available on the RetNet and NEIBank Web sites, or by allowing researchers to query the database with new loci as they are identified. The new transcriptome information added by the work presented in this article and, more important, the database, which greatly expands the usefulness of existing and new expression profiles, will be an excellent resource for the vision research community as they explore questions related to gene expression in normal function and ocular disease.
| Footnotes |
|---|
Supported by National Eye Institute (NEI) Grants R01 EY11286 (CBR), R01 EY12012 (MAH), and R01 EY13315 (MAH); NEI Core Grant P30EY0054722; a Research to Prevent Blindness (RPB) Career Development Award (CBR); and an RPB Core Grant to Duke Eye Center.
Submitted for publication November 7, 2005; revised January 5 and February 8, 2006; accepted March 27, 2006.
Disclosure: C. Bowes Rickman, None; J.N. Ebright, None; Z.J. Zavodni, None; L. Yu, None; T. Wang, None; S.P. Daiger, None; G. Wistow, None; K. Boon, None; M.A. Hauser, None
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
Corresponding author: Catherine Bowes Rickman, Departments of Ophthalmology and Cell Biology, Duke University Medical Center, Box 3802 Erwin Road, Durham, NC 27710; bowes007{at}duke.edu.
| References |
|---|