We analyzed the composition of the opsin gene array using the MassArray assay for two subject pools: a group of male subjects unselected for color vision status who participated in the AREDS study and a group of subjects for whom we performed standard color vision tests in our laboratory. The results for both subject pools are shown in
Figure 2, where the percentage of genes in the array that are in the first position (the inverse of this proportion is opsin gene copy number) is compared to the percentage of the total genes that encode an L opsin. Subjects clustered into groups based on the characteristics of their arrays, and this provided a first pass diagnosis of the best possible color vision each subject can have.
Males with normal color vision typically have one L gene followed by one or more M genes,
1 and they fall along the unity line below approximately [60%, 60%] in the MassArray assay. Rarely, males with dichromatic color vision (protanopic or deuteranopic) also will fall in this area because they have inactivating mutations in one of the first two genes in the array that are not detected by the MassArray assay.
2,16 Subjects who lack L genes fall along the
x-axis and all of these have, at best, protan color vision deficiencies, with single-gene protanopes clustering at [100%, 0%]. Subjects who lack M genes fall along the
y = 100% line and all of these can have no better than deutan color vision deficiencies, with single-gene deuteranopes clustering at [100%, 100%]. Individuals whose results plot above the unity line possess at least one extra L opsin gene and may either be color normal or have a deutan-type color vision deficiency, depending on whether the first two genes in the array encode an L and an M pigment, or whether both encode an L pigment. Females who are likely to be carriers of red–green color vision deficiency also are distinguishable from color anomalous individuals and noncarrier normals: carriers of protan defects tend to have a paucity of L opsin genes compared to normal and, thus, should fall between the normal (one L gene) line and the
x-axis, while carriers of deutan defects tend to have extra L genes compared to normal and, thus, should fall between the normal line and the deutan range.
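The positions of the clusters described above follow from simple proportions. As a minimal sketch, assuming an idealized, error-free measurement of a single male array, the expected plot position can be computed from the number of L and M genes. The helper name `expected_coordinates` is hypothetical and is not part of the assay software.

```python
def expected_coordinates(n_l, n_m):
    """Return (% of genes in first position, % of genes that are L).

    Idealized values for a single array with n_l L genes and n_m M genes;
    real MassArray measurements scatter around these points.
    """
    total = n_l + n_m
    if total < 1:
        raise ValueError("an array must contain at least one gene")
    pct_first = 100.0 / total      # exactly one first-position gene per array
    pct_l = 100.0 * n_l / total    # share of genes that encode an L opsin
    return pct_first, pct_l


# Illustrative arrays: two normal configurations, one with an extra L gene,
# and one single-gene M array.
for label, n_l, n_m in [("LM", 1, 1), ("LMM", 1, 2), ("LLM", 2, 1), ("M only", 0, 1)]:
    x, y = expected_coordinates(n_l, n_m)
    print(f"{label}: [{x:.0f}%, {y:.0f}%]")
```

In this idealization, arrays with exactly one L gene have equal coordinates and so lie on the unity line, while each additional L gene moves the point above it.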
The MassArray measurements are subject to a certain amount of error, so it is necessary to set mathematical criteria for distinguishing genotypes and to determine the likelihood of classification errors. To provide information that is useful in diagnosing color vision defects, the MassArray assay must make the following distinctions: (1) Discriminate people with one X-chromosome photopigment gene from those with more than one gene. Everyone with only one photopigment gene on the X-chromosome is color blind and at best a dichromat. (2) Distinguish people who have both L and M photopigment genes from those who have only L or only M genes. People with only L genes have deutan color vision deficiencies and those with only M genes have protan color vision deficiencies. (3) Approximately 90% of men with normal color vision have either two, three, or four X-linked photopigment genes. Men with two genes that are not both L cannot have a deutan color vision defect; conversely, men with three or four genes of which two are of the L pigment class have a high probability of having a deutan-type color vision defect and, thus, can be classified as deutans or “deutan suspects.” The precision of the MassArray is sufficient to easily separate all the above classifications. For men with three or four genes, separating normals from deutan suspects is straightforward because the proportion of genes that are L is greatly different in these cases. For three genes, normals have 33% L and deutan suspects have 67% L genes. For four genes, normals have 25% L and deutans have 50%. However, the precision of the MassArray is near its limits of predictive accuracy for people with more than four genes on the X-chromosome, who constitute approximately 10% of the population. 
For these rarer cases, distinguishing between a person who has five genes, four M and one L, from a person with four M and two L, that is, between 20% and 33% L becomes statistical and there is theoretically a probability that a four M and two L deutan suspect could be misclassified as four M and one L color normal person. Thus, the following procedure was used to determine a mathematical criterion for distinguishing arrays with one L gene and multiple M genes, from arrays with more than one L gene indicating that the subject is a deutan or deutan suspect. The polar angle of each data point in the scatterplot (
Fig. 2) was calculated. The polar angle indicates the number of L genes in the array for men and the average number of L genes per array for women (women have one array on each of their X chromosomes). As shown in
Supplemental Figure S2, the polar angle distribution was fit to a Gaussian function representing arrays with one L gene and a sum of Gaussians representing individuals with two and three L genes (normals with extra downstream L genes and deutans). The angle at which the one L gene Gaussian intersected the sum of the multiple L gene Gaussians was taken as the criterion dividing point for segregating subjects into classes. We calculated the percentage of arrays expected to be misidentified from the overlap of the one L gene function and the sum of the two and three L gene Gaussian functions. The accuracy of separating normal individuals with one L gene and multiple M genes from protans, deutans, and deutan suspects estimated in this way is 99.6%.
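The criterion described above can be sketched as follows. For an idealized male array, the ratio of %L to %first-position genes equals the number of L genes, so the polar angle is about 45° for one L gene, 63.4° for two, and 71.6° for three. The Gaussian amplitudes, means, and widths below are hypothetical placeholders, not the fitted values from Supplemental Figure S2, and the bisection search is one simple way to locate the crossover, not necessarily the procedure used in the study.

```python
import math


def polar_angle_deg(pct_first, pct_l):
    """Polar angle (degrees) of a point in the %first vs. %L scatterplot."""
    return math.degrees(math.atan2(pct_l, pct_first))


def gaussian(x, amplitude, mean, sd):
    """Unnormalized Gaussian evaluated at x."""
    return amplitude * math.exp(-0.5 * ((x - mean) / sd) ** 2)


def crossover(one_l, multi_l, lo, hi, tol=1e-9):
    """Bisection for the angle in [lo, hi] where the one-L Gaussian equals
    the sum of the multi-L Gaussians. Each component is (amplitude, mean, sd)."""
    def diff(x):
        return gaussian(x, *one_l) - sum(gaussian(x, *g) for g in multi_l)

    if diff(lo) * diff(hi) > 0:
        raise ValueError("no sign change in the bracket [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Hypothetical parameters, NOT the fitted values from the study.
one_l = (1.0, 45.0, 2.0)
multi_l = [(0.1, math.degrees(math.atan(2)), 2.5),    # two L genes
           (0.02, math.degrees(math.atan(3)), 2.5)]   # three L genes
cutoff = crossover(one_l, multi_l, lo=45.0, hi=63.0)
print(f"cutoff angle: {cutoff:.1f} degrees")
```

With real data, the component parameters would come from the fit to the observed polar angle histogram, and the expected misclassification rate would be obtained from the area of each function lying on the wrong side of the cutoff.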
To examine the possibility of misclassification another way, we looked specifically at the 1° bins on each side of the cutoff between normal and deutan suspects (1 vs. >1 L genes) for subjects whose color vision was tested. Nine males fell into this category. Six of the nine fell in the normal bin adjacent to the cutoff and not one was diagnosed as deutan in the color vision testing. Of the three that fell on the deutan side of the cutoff, one was deutan by color vision tests and the other two had normal color vision in the tests, and were classified as normal with extra L genes. Thus, our results did not indicate that anyone was misclassified in our sample, although we cannot be sure that the normal men classified as having an extra L gene actually had the extra L. However, it is clear that there is a statistical probability that people with a large number of genes, in particular, could be misclassified. If each of the nine people on either side of the cutoff has approximately a 50% chance of being misclassified, then the estimated error rate is approximately 0.4%, very close to what was estimated from the Gaussian overlap.
We noted that the problem of avoiding misclassification errors could be easily solved. A histogram of polar angle for just the approximately 10% of subjects with the highest estimated gene number shows that 7 of the 9 people falling in the uncertain zone were estimated to have more than 4 genes. A remedy that should almost eliminate any possibility of misclassification is to put people who are estimated to have a high gene number and who fall near the gap in a separate category. This would constitute <1% of all subjects. A report such as the following could be appropriate: “the genetic assay indicates that the patient has more than four pigment genes; the assay places the subject near the cutoff between normal and deutan subjects.” Being absolutely clear about when the classification could be ambiguous is a way of avoiding misclassifying patients with the MassArray assay.
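The proposed separate category can be expressed as a simple rule. The sketch below is illustrative only: the function name `report_category`, the cutoff angle passed in the examples, and the 1° margin are assumptions for demonstration and are not values reported by the study.

```python
import math


def report_category(pct_first, pct_l, cutoff_deg, margin_deg=1.0, max_genes=4):
    """Label a data point, setting aside long arrays that fall near the cutoff.

    cutoff_deg and margin_deg are illustrative parameters supplied by the
    caller; neither value is taken from the study.
    """
    if pct_first <= 0:
        raise ValueError("pct_first must be positive")
    gene_number = 100.0 / pct_first   # inverse of the first-gene proportion
    angle = math.degrees(math.atan2(pct_l, pct_first))
    if gene_number > max_genes and abs(angle - cutoff_deg) <= margin_deg:
        return "ambiguous: more than four genes, near normal/deutan cutoff"
    return "one L gene" if angle < cutoff_deg else "more than one L gene"


# Hypothetical cutoff of 53 degrees, used only to exercise the rule.
print(report_category(50.0, 50.0, cutoff_deg=53.0))   # two-gene array on the unity line
print(report_category(18.0, 24.0, cutoff_deg=53.0))   # long array close to the cutoff
```

Only points that satisfy both conditions (high estimated gene number and proximity to the cutoff) are set aside, which keeps the ambiguous category small.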
Data from the pool of male AREDS subjects is shown in Figure 2A. Of 798 men, 708 fell along the unity line and are predicted to have normal color vision. The color coding for diagnoses in Figure 2A is the same as that indicated in Figure 2B. In
Figure 2A, the diagnosis of each subject indicated by the color code was determined entirely from genetic data, including estimated %L and %downstream genes indicated in the plot, plus the estimate from the MassArray tuning site SNP data of spectral separation for deutan subjects with multiple L genes and protans with multiple M genes. The final piece of data used for the diagnosis indicated by the color code was the identity of the last gene in the array as L or M where applicable. Thus, in
Figure 2A, eight males were identified as single-gene deuteranopes based on clustering at [100%, 100%] and two multigene deuteranopes were identified based on clustering along the
y = 100% line along with absence of differences in SNPs in codons that influence spectral tuning. One person whose results fell along the
y = 100% line was determined to be (at best) deuteranomalous with multiple L genes that differed in spectral tuning. Nine males were identified as protan: one single-gene protanope at [100%, 0%], two multigene protanopes with no differences in their M genes at SNPs that influence spectral tuning, and three protan-type individuals were identified whose M genes did encode spectral differences. The other three protan males had M genes that differed only in exon 2, which encodes amino acid differences that may affect optical density and give rise to protanomalous behavior in the anomaloscope test.
8 The final 70 of the 798 males fell above the unity line but below the
y = 100% line, indicative of arrays with at least two L opsin genes and at least one M opsin gene. These individuals may either be color normal or deutan, and distinguishing the two requires identification of the last gene in the array to deduce the order of the genes. For 20 of these subjects, there was a single extra L gene and it was at the 3′ end of the array; they are predicted to have normal color vision. For 21 of the subjects, the extra L gene was deduced to be in the second position; their arrays had only one M gene and it was found in the last position. Their L genes were different at spectral tuning sites so they are predicted to be, at best, deuteranomalous. The remaining 29 subjects had too many genes to determine whether the first and second genes both encoded L pigments and their diagnosis remains ambiguous. They are labeled as “Mild DA\Normal extra L” and they can be considered to be “deutan suspects.”
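The deduction of gene order from the identity of the last gene can be sketched as below. This is a simplified illustration, not the authors' procedure: it assumes that the first gene in the array is L, that the gene counts are exact, and that the last gene has been identified. The function name `second_gene` is hypothetical.

```python
def second_gene(n_l, n_m, last_gene):
    """Deduce the second gene of an array from gene counts and the last gene.

    Assumes, for illustration, that the first gene is L and the counts are
    exact. Returns "L", "M", or None when the order cannot be deduced.
    """
    if last_gene not in ("L", "M"):
        raise ValueError("last_gene must be 'L' or 'M'")
    if n_l < 1 or n_l + n_m < 2:
        raise ValueError("need a first L gene and at least two genes in total")
    # Genes left over once the first (L) and last genes are accounted for.
    middle_l = n_l - 1 - (1 if last_gene == "L" else 0)
    middle_m = n_m - (1 if last_gene == "M" else 0)
    if middle_l < 0 or middle_m < 0:
        raise ValueError("gene counts are inconsistent with the last gene")
    if middle_l + middle_m == 0:
        return last_gene          # two-gene array: the second gene is the last gene
    if middle_m == 0:
        return "L"                # only L genes remain for the middle positions
    if middle_l == 0:
        return "M"                # only M genes remain for the middle positions
    return None                   # mixed middle genes: order is ambiguous


print(second_gene(2, 1, "L"))   # LML: extra L at the 3' end
print(second_gene(2, 1, "M"))   # LLM: extra L in the second position
print(second_gene(2, 3, "M"))   # five genes, order of the middle genes unknown
```

When the middle of the array contains both L and M genes, the rule returns no answer, which corresponds to the subjects whose diagnosis remains ambiguous.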
We also performed some direct gene sequencing for these subjects and we found two previously unreported missense mutations in expressed L opsin genes, L232V and L55V, and one silent mutation, GTC>GTT in codon 97. Additionally, one subject was homozygous and four were heterozygous for the deleterious C203R mutation in M genes. One of the heterozygous subjects had two M genes and the C203R mutation was found in the gene in the last gene position, meaning the expressed M did not have the mutation. The other three subjects had more than two M genes and so it could not be determined if the C203R mutation was in an expressed gene. Therefore, point mutations, all of which except the C203R are not detected by the MassArray assay, could render the subject dichromatic depending on the position of the gene encoding C203R and on the effects of the L232V and L55V mutations (which could affect protein folding or function) and the synonymous mutation (which could affect splicing). The L232V substitution is predicted by Mutation Taster
17 to be disease-causing, while the L55V is not, and none of the three is predicted by Human Splicing Finder
18 to affect splicing.
In the second subject pool, we used MassArray data, gene order, gene sequencing, and psychophysical tests to evaluate color vision status. The MassArray data for these subjects is shown in
Figure 2B. All subjects are summarized in
Table 1. Our subject pool included 31 females, 29 of whom passed the HRR and D15, one of whom made protan errors on those tests, and one of whom made deutan errors. The MassArray test classified all 29 women who were color normal in psychophysical assessments as having genes required for normal color vision. Additionally, the genetic test identified four of the women as probable deutan carriers and two as probable protan carriers based on established characteristics of opsin genes in carriers.
14,19 The protan female had five spectrally identical M opsin genes in her array. The deuteranomalous female had an extra L opsin on one chromosome, displacing the M opsin gene from an expressed position, and a missense mutation (V63M) in the expressed M gene of the other chromosome, leaving her with three L opsins and a presumably nonfunctional M opsin. The V63M substitution is predicted by Mutation Taster to be disease-causing.
Among the male subjects (n = 1043), the MassArray identified 22 as having arrays comprised of only a single L gene ([100%, 100%]), making them deuteranopes; another 5 had arrays comprised of only a single M gene ([100%, 0%]), making them protanopes. All of these obligate dichromats failed the psychophysical tests.
The MassArray identified 9 males who had L opsin genes with distinct spectral tuning and no M opsin gene, making them, at best, deuteranomalous. However, only four made deutan errors on the psychophysical tests; the other five, whom we call “minimal deutans,” made no errors at all on either test. Examination of their L opsin sequences revealed that these subjects all had L photopigments with spectral separations of 6.5 to 10 nm. Three of these five agreed to return for testing with the Nagel anomaloscope, which confirmed that they were very mild deutans.
There were 18 males who had no L opsin gene and multiple M opsin genes. Five of them had arrays comprised solely of multiple identical M genes, making them obligate protanopes. Seven subjects had M opsin genes with differences only in exon 2, which does not affect spectral tuning but may create optical density differences; another 5 had M opsins with differences in both spectral tuning sites and in exon 2, making them, at best, protanomalous. The 18th protan male had three M opsin genes, but direct sequencing showed that one of them encoded a tyrosine at 309 (normally found in L pigments) making him appear normal in the MassArray assay. Seventeen of these 18 protans made protan errors on the color vision tests, but one (who had differences in spectral tuning and optical density controlling sites) was misdiagnosed by the HRR and D15 as a deutan.
Of the psychophysically tested male subjects, 887 had one L gene followed by a variable number of M opsin genes, a configuration normally associated with normal color vision. In 7 of those subjects, direct sequencing revealed the presence of mutations in the opsin genes that led to red–green color vision defects. One had a Cys203Arg mutation in his M gene, one had a Cys203Opal mutation in his M gene, one had a single base insertion in his M gene introducing a frame shift in exon 3, one had a gene rearrangement where his L was displaced to the third position leaving two spectrally identical M genes in expressed positions, and three had toxic opsin LIAVA or LVAVA variants.
20 These seven subjects all failed the psychophysical color vision tests.
Sequencing found that 5 of the 888 subjects with normal arrays had mutations in the SNPs that the MassArray assay uses to characterize the array. One of those subjects was outside all diagnosis clusters at [100%, 50%] and appeared to have an impossible array: 50% of his genes were of the L class, but he seemed to have only a single gene in his array. Sequencing of the promoter region revealed that he had in all genes the “first gene version” of the SNP which the MassArray assay uses to determine gene number (i.e., a G at nucleotide +1) but was heterozygous at other positions, meaning he had an LM array sufficient for normal color vision. The other four subjects had mutations in the codon 309 SNP that the MassArray uses to distinguish L and M opsin genes. One had an LM array in which the L had a phenylalanine at 309 (typically associated with M genes), making him appear protan to the MassArray. Similarly, three subjects who appeared to have deutan LLM arrays actually had normal LMM arrays because one M opsin gene encoded a tyrosine at 309, normally found in L pigments. All five of these subjects passed the color vision tests.
Direct sequencing of the opsin genes for the other 875 of the 888 subjects with normal numbers of L and M genes showed no deleterious mutations, together indicating the genetic basis for normal color vision. Of these 875 subjects, 869 passed the HRR and D15 color vision tests. The remaining six showed mild defects in protan, deutan, or, in two cases, tritan stimuli on psychophysical color vision tests. No genetic cause for color vision deficiency was ultimately found for these subjects.
The final group of psychophysically-tested subjects were 102 men with at least two L opsin genes and at least one M opsin gene. In 26 of those subjects, the extra L opsin gene was in the last position of the array, not displacing the M gene. These arrays (LML or LMML) are consistent with normal color vision, and, indeed, 25 of these men passed the psychophysical color vision tests. The last man had an LMML array but made mild errors on the HRR and D15, had a Rayleigh match range of 44 to 59, and was of East Asian descent. We sequenced his promoter region and found he was heterozygous for an A-71C mutation, a common cause of red–green defects in Asian populations.
21 In another 28 men, two L opsin genes were in expressed positions by deduction from last gene sequencing (LLM or LLLM arrays). L opsin sequences indicated that there was no spectral difference between the L opsins for two of these 28, making them multigene deuteranopes, and there was a spectral difference for the other 26, making them, at best, deuteranomalous. Of the 26 deuteranomals, 22 were classified as mild or medium deutans by the HRR and D15. However, two were minimal deutans who made no errors on either test. The last two deuteranomals had a 2.5 nm spectral separation but behaved more poorly on color vision tests than we would predict from having two pigments with that separation. One had very poor acuity and amblyopia, which likely led him to perform worse on the psychophysical tests than his color vision defect would have allowed; the other was not tested for visual acuity and reported no other eye disorders.
The remaining 48 of the 102 subjects with extra L opsin genes had long arrays containing four or more opsin genes. While we can specifically analyze the first and last genes in the array, we cannot access the second gene. For longer arrays, we cannot always deduce the identity of the second expressed gene. Of the subjects with extra L opsin genes and a long array, 11 had spectrally identical L genes. If their extra L was in an expressed position, these men would be deuteranopes, otherwise, they would have normal color vision. All of them passed the HRR and D15, consistent with normal color vision. Another 19 failed the psychophysical tests so we assume the extra L gene is in the second position. The final 18 subjects with long arrays and extra L genes passed the HRR and D15 but had a spectral separation between their Ls of at least 6.5 nm; these men may be normal with the extra L genes downstream of the expressed positions or be minimal deutans able to pass the psychophysical tests. Four of these agreed to return for testing with the Nagel anomaloscope, which revealed that three were normal trichromats and one was a minimal deutan.