Abstract
Purpose:
To evaluate the use of smartphone-based virtual reality to objectively assess activity limitation in glaucoma.
Methods:
Cross-sectional study of 93 patients (54 mild, 22 moderate, 17 severe glaucoma). Sociodemographics, visual parameters, Glaucoma Activity Limitation-9 and Visual Function Questionnaire – Utility Index (VFQ-UI) were collected. Mean age was 67.4 ± 13.2 years; 52.7% were male; 65.6% were driving. A smartphone placed inside virtual reality goggles was used to administer the Virtual Reality Glaucoma Visual Function Test (VR-GVFT) to participants, consisting of three parts: stationary, moving ball, driving. Rasch analysis and classical validity tests were conducted to assess performance of VR-GVFT.
Results:
Twenty-four of 28 stationary test items showed acceptable fit to the Rasch model (person separation 3.02, targeting 0). Eleven of 12 moving ball test items showed acceptable fit (person separation 3.05, targeting 0). No driving test items showed acceptable fit. Stationary test person scores showed good criterion validity, differentiating between glaucoma severity groups (P = 0.014); modest convergence validity, with mild to moderate correlation with VFQ-UI, better eye (BE) mean deviation, BE pattern deviation, BE central scotoma, worse eye (WE) visual acuity, and contrast sensitivity (CS) in both eyes (R = 0.243–0.381); and suboptimal divergent validity. Multivariate analysis showed that lower WE CS (P = 0.044) and greater age (P = 0.009) were associated with worse stationary test person scores.
Conclusions:
Smartphone-based virtual reality may be a portable objective simulation test of activity limitation related to glaucomatous visual loss.
Translational Relevance:
The use of simulated virtual environments could help better understand the activity limitations that affect patients with glaucoma.
We set out to develop a test that would reflect real-world challenges described by patients with glaucoma, such as searching for objects, motion detection, and driving.3,8 The VR-GVFT was thus based on objective tests well-validated in patients with glaucoma: the CGVFT and ADREV (Supplementary Material S1).7 Similar to the CGVFT, we utilized indoor and outdoor scenes to recreate the real-life experience of patients with glaucoma (Fig. 1).
Two major iterative processes were used to develop the study measures. Firstly, we based the design of the CGVFT on the ADREV, and then further refined the CGVFT to the VR-GVFT. Secondly, we used Rasch analysis, itself an iterative process, to evaluate the VR-GVFT. Rasch analysis involves evaluating test items to identify those that do not fit the Rasch model. Misfitting items are removed and the analysis is repeated until all remaining test items pass preset Rasch metrics.
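The iterative removal of misfitting items described above can be sketched as a simple loop. This is a minimal illustration, not the procedure used in the study: the `fit_items` callback, the mean-square fit range of 0.5 to 1.5, and the rule of dropping the single worst item per pass are all assumptions, since the text does not state the preset Rasch metrics. The toy fit values are made up for demonstration.

```python
from typing import Callable, Dict, List


def prune_misfitting_items(
    items: List[str],
    fit_items: Callable[[List[str]], Dict[str, float]],
    lower: float = 0.5,  # assumed lower bound for an acceptable fit statistic
    upper: float = 1.5,  # assumed upper bound for an acceptable fit statistic
) -> List[str]:
    """Refit and drop misfitting items until every remaining item fits."""
    remaining = list(items)
    while remaining:
        # Refit on the current item set; fit_items is a hypothetical callback
        # standing in for whatever Rasch software produces the fit statistics.
        stats = fit_items(remaining)
        misfits = [i for i in remaining if not lower <= stats[i] <= upper]
        if not misfits:
            break
        # Remove only the worst item, then refit, because removing one item
        # can change the fit of the others.
        worst = max(misfits, key=lambda i: abs(stats[i] - 1.0))
        remaining.remove(worst)
    return remaining


def toy_fit(current: List[str]) -> Dict[str, float]:
    # Made-up fit statistics, for illustration only.
    base = {"STN 01": 1.0, "STN 02": 1.9, "STN 03": 0.9}
    return {item: base[item] for item in current}


kept = prune_ministic = prune_misfitting_items(["STN 01", "STN 02", "STN 03"], toy_fit)
print(kept)
```

In practice the fit statistics would come from dedicated Rasch software, and the acceptance thresholds would be those fixed in advance by the investigators.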
The VR-GVFT consists of 38 tests that are related to glaucoma and reflective of daily life. The initial 14 tests are stationary tests, where the time required to identify stationary objects in a VR 180° Photo Sphere (Samsung, Seoul, South Korea) image environment is recorded. The subsequent 24 tests are video tests, where reaction time to key events is recorded; they consist of two parts: 12 motion ball tests and 12 driving tests. High resolution (8 megapixel) 180° Photo Sphere images and videos were recorded using the native camera application on the Samsung Galaxy Note 3 smartphone (Samsung). Driving videos of 5 to 12 seconds' duration were taken with the smartphone on a dashboard stabilization attachment.
The 14 stationary tests simulated 10 indoor and four outdoor scenes, with objects varying in position within the field of view (and therefore in the amount of eye and head movement required to see them):
- Identifying a fast food store sign at an outdoor intersection (STN 01)
- Identifying a road work construction sign at an outdoor intersection (STN 02)
- Identifying a street sign at an outdoor intersection (STN 03)
- Identifying a general store sign at an outdoor intersection (STN 04)
- Identifying a microwave in a tea room (STN 05)
- Identifying a sink in a tea room (STN 06)
- Identifying a sandwich maker in a tea room (STN 07)
- Identifying a clock in a tea room (STN 08)
- Identifying a refrigerator in a tea room (STN 09)
- Identifying a poster on the wall in a cluttered study (STN 10)
- Identifying a guitar in a cluttered study (STN 11)
- Identifying a laptop computer in a cluttered study (STN 12)
- Identifying a printer in a cluttered study (STN 13)
- Identifying a row of books on the shelf in a cluttered study (STN 14)
Each stationary test outcome was a binary item of whether the object was seen in the allocated time of 60 seconds (Yes or No). If the object was seen, we also recorded a timing item of the seconds taken to identify the object. The stationary test thus has 28 items prior to Rasch analysis.
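As an illustration of how each stationary test yields two items, the following sketch encodes one response into a binary "seen" item and a timing item. The class and function names are hypothetical and the response times are made up; only the 60-second limit and the two-items-per-test structure come from the text.

```python
from dataclasses import dataclass
from typing import Optional

TIME_LIMIT_S = 60.0  # allocated time per stationary test, as stated in the text


@dataclass
class StationaryItems:
    test_id: str
    seen: bool                # binary item: object identified within the time limit
    seconds: Optional[float]  # timing item: recorded only when the object was seen


def encode_stationary(test_id: str, response_time_s: Optional[float]) -> StationaryItems:
    """Encode one stationary test; response_time_s is None if never identified."""
    seen = response_time_s is not None and response_time_s <= TIME_LIMIT_S
    return StationaryItems(test_id, seen, response_time_s if seen else None)


# Made-up responses, for illustration only.
found = encode_stationary("STN 05", 12.4)
missed = encode_stationary("STN 11", 75.0)
never = encode_stationary("STN 14", None)

# 14 stationary tests, each contributing a binary and a timing item.
n_items = 14 * 2
print(found, missed, never, n_items)
```

Keeping the timing item empty for a miss, rather than storing the time limit, avoids treating a miss as a slow success.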
The 12 motion ball tests simulated a binocular confrontational visual field, with a single white ball moving from various positions on the periphery of the screen toward the screen center against a grass background. Eight peripheral positions were tested, with four additional balls being repeats. Participants were asked to indicate immediately when they saw the white ball. They were encouraged to use their peripheral vision and combined head/eye movement to locate the moving ball.
The 12 driving tests simulated road hazard perception under various driving conditions. The first four asked the participant to verbally identify when it was safe to start driving, and the latter eight asked the participant to verbally indicate when they would brake to avoid a potential hazard. The scenes varied in time of day, to reflect real-life driving.
Scenes that required participants to verbally identify when it was safe to start driving:
- At an intersection with traffic light change from red to green (MOV1-1)
- Behind stationary car (MOV1-2)
- Behind stationary car at night (MOV1-3)
- Behind stationary car and cyclist at an intersection without traffic lights (MOV1-4)
Scenes that required participants to verbally identify when to brake to avoid a potential hazard:
- Identify that a pickup truck ahead slows down and makes a U-turn (MOV2-1)
- Identify an intersection with a stop sign (MOV2-2)
- Identify that a car ahead is stationary and has its hazard lights on (MOV2-3)
- Identify that a roundabout ahead has two cyclists in it (MOV2-4)
- Identify a zebra crossing ahead in a parking lot in the day time, with pedestrians crossing (MOV2-5)
- Identify a zebra crossing ahead at a parking lot in the evening, with pedestrians crossing (MOV2-6)
- Identify a cyclist on the side of the road (MOV2-7)
- Identify a cyclist on the side of the road after passing through a traffic intersection (MOV2-8)
The VR-GVFT test was preceded by two “practice” stationary scenes (one indoor lounge room scene and one outdoor street scene), which allowed participants to get used to the VR environment and headset. In each scene, participants were asked to describe what they could see and encouraged to move their head and eyes to best simulate real life. For the driving tests, patients were given a four-second countdown prior to each video commencing in order to orient themselves to the task.
For each timed item, timing was recorded with an electronic timer, commencing from the moment the administrator finished reading the task instructions to the time when the participant verbally indicated the correct identification of the object or completed the task. The view seen by each participant was monitored and recorded throughout by the administrator on a laptop computer. For stationary test items, the item was recorded as a miss if more than 60 seconds were required to identify an object. For driving test items, a miss was recorded if the participant failed to respond within the duration of the video. For moving ball test items, failure to identify a ball within the duration that the ball appeared was recorded as a miss.
The VR-GVFT was administered to participants by one of the test administrators (XYGK, RLZG, or JL). All administrators conferred before, during, and after testing to ensure strict and consistent adherence to testing protocols as maintained by the principal investigator (SES). Images and videos were delivered using a Samsung Note 3 smartphone (Samsung) inserted into a low-cost, head-mounted Google Cardboard Project Virtual Reality Adaptor (Google Inc., Mountain View, CA). A commercially available VR display software application, VR Player (version 1.8.2, VimersiV, Montreal, Quebec, Canada), was used to display images and videos on the smartphone. The smartphone was linked to a laptop computer to enable the test administrator to control the flow of images, and for live recording of patient performance using Mobizen screen synchronization software (Mobizen Inc., Seoul, South Korea). If a participant had a refractive error requiring distance correction spectacles, they wore their spectacles under the goggles throughout the test. Optical properties of Google Cardboard have been previously documented,39 with a total field of view of 80° and a nominal virtual image distance of −667 mm. Validation of field of view was performed by taking a 180° Photo Sphere image of calibrated markers placed at 5° intervals from a central location and showing the image to normal volunteers (n = 3) (nasal field 38.3° ± 2.9°, temporal field 36.7° ± 2.9°).
Statistical Package for Social Sciences (SPSS Inc., Chicago, IL) for Windows (version 16.0, Microsoft Corp., Redmond, WA) was used for statistical analyses. The sample size calculation was based on the modelled standard error of the item calibration. The modelled standard errors lie in the range 2/√(sample size) < standard error < 3/√(sample size). Rearranging this, one can calculate the sample size based on the standard error: 4/(standard error)² < sample size < 9/(standard error)². A sample size was calculated to achieve 95% confidence that any item calibration was within 0.5 logits from its modelled standard error, which equates to a required sample size in the range of 64 to 144.50 Hence, the minimum acceptable sample size set for this study was 64. To account for subject drop-out, we aimed to recruit slightly greater numbers.
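The arithmetic behind the 64 to 144 range can be checked with a short sketch. The function name is hypothetical. The step from "0.5 logits at 95% confidence" to a standard error of 0.25 logits assumes the 95% interval is taken as roughly two standard errors; this is our assumption, chosen because it reproduces the range quoted above, and is not stated in the text.

```python
def sample_size_range(standard_error: float) -> tuple:
    """Return (lower, upper) bounds from 4/SE^2 < sample size < 9/SE^2."""
    lower = 4 / standard_error ** 2
    upper = 9 / standard_error ** 2
    return lower, upper


# Assumption: 95% confidence within 0.5 logits ~ 2 standard errors, so SE = 0.25.
target_precision_logits = 0.5
standard_error = target_precision_logits / 2

low, high = sample_size_range(standard_error)
print(low, high)  # 64.0 144.0
```

With 93 participants enrolled, the study sits inside this range and above the stated minimum of 64.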
Demographic variables, BE VA and CS, BE visual field parameters (VFI, MD, and PSD), and VR-GVFT and GAL-9 (logit) scores were compared among mild, moderate, and severe glaucoma patient groups. Intergroup significance was assessed using analysis of variance (ANOVA) for parametric data and the χ² test for dichotomous variables.
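For readers unfamiliar with the intergroup comparison, the one-way ANOVA F statistic can be sketched in a few lines. This is a generic textbook computation, not the SPSS procedure used in the study, and the three groups of scores below are made-up values standing in for the mild, moderate, and severe groups. Converting F to a P value requires the F distribution, which is omitted here.

```python
from statistics import fmean
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: Sequence[List[float]]) -> Tuple[float, int, int]:
    """Return (F statistic, between-group df, within-group df)."""
    all_values = [x for g in groups for x in g]
    grand_mean = fmean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up scores for three severity groups, for illustration only.
mild = [1.0, 2.0, 3.0]
moderate = [2.0, 3.0, 4.0]
severe = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([mild, moderate, severe])
print(f_stat, df_b, df_w)
```

A large F relative to its degrees of freedom indicates that the group means differ more than within-group scatter would suggest.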
A univariable regression analysis was performed to examine the association between VR-GVFT person scores (logit) and the following variables: age, gender, demographic data, driving status, employment status, BE MD, PSD, VFI, VA, CS, glaucoma severity, and GAL-9 (logit) scores. To evaluate for multicollinearity, the correlation between all variables was assessed using Spearman and Kendall tau-b correlation tests for nonparametric data and Pearson's for parametric data. All variables with P-value <0.05 were then included in a multivariate analysis. All variables were assessed for normality and linearity and transformed as required, and analysis of residuals was performed. Independence of residuals was assessed using the Durbin-Watson statistic. All tests were two-tailed with P-value <0.05 considered significant.
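The Durbin-Watson statistic mentioned above has a simple closed form: the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below computes it directly; the function name is ours and the residual sequences are made up to show the two extremes. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.

```python
from typing import Sequence


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson statistic for a sequence of regression residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up residuals, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips every step (negative autocorrelation)
clustered = [1.0, 1.0, -1.0, -1.0]    # runs of the same sign (positive autocorrelation)

dw_alternating = durbin_watson(alternating)
dw_clustered = durbin_watson(clustered)
print(dw_alternating, dw_clustered)  # 3.0 1.0
```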
For the 28 stationary items, the initial person separation was 1.8, with four misfitting items. After removing misfitting items, person separation improved to 3.02 with no misfits, and targeting was 0.
For the 12 moving ball items, the initial person separation was 3.23, with one misfitting item. After removing the misfitting item, the person separation deteriorated to 3.05 with no misfitting items, and targeting was 0.
For the 12 driving items, the initial person separation was 1.41 and one item was found to misfit. After removing the misfitting item, the person separation deteriorated to a suboptimal 1.39.
Hence, only the stationary and moving ball tests passed Rasch analysis. A Rasch-scaled scoring algorithm for the person scores of these two scales is available on request and provides a linear transformation of ordinal VR-GVFT data.
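To help interpret the person separation values reported above, the sketch below applies two standard Rasch conversions: reliability = G²/(1 + G²) and strata = (4G + 1)/3, where G is the person separation index. These conversions are general Rasch conventions rather than results reported in this study, and the function names are ours.

```python
def separation_to_reliability(separation: float) -> float:
    """Person reliability implied by a separation index G: G^2 / (1 + G^2)."""
    return separation ** 2 / (1 + separation ** 2)


def separation_to_strata(separation: float) -> float:
    """Number of statistically distinct person strata: (4G + 1) / 3."""
    return (4 * separation + 1) / 3


# Final person separation values reported in the text.
reported = {"stationary": 3.02, "moving ball": 3.05, "driving": 1.39}

summary = {
    name: (separation_to_reliability(g), separation_to_strata(g))
    for name, g in reported.items()
}
for name, (reliability, strata) in summary.items():
    print(f"{name}: reliability {reliability:.3f}, strata {strata:.2f}")
```

Under these conversions the stationary and moving ball scales correspond to a reliability of about 0.90, whereas the driving scale corresponds to about 0.66, which is consistent with its description as suboptimal.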
On multivariate regression modelling, higher stationary test person scores were associated with having worse WE CS (regression coefficient [b], −0.360; 95% confidence interval [CI], −0.710 to −0.010; P = 0.044) and older age (b, 0.007; 95% CI, 0.002 to 0.013; P = 0.009). The variance (R²) explained by the multivariate model was 0.306.
As none of the univariate correlations between moving ball test person scores and covariates had P-value <0.05, multivariate regression modelling was not performed for the moving ball test person scores.
To our knowledge, this proof-of-concept study presents the first VR test designed to simulate visual function limitation related to glaucoma in both stationary and driving tasks. Of the 52 starting items, only those in the stationary and moving ball tests showed reasonable measurement properties.
Rasch-analyzed person scores for the stationary test showed good criterion validity; that is, the ability to differentiate between glaucoma severity groups. They also demonstrated reasonable convergence validity, with mild to moderate correlation with VFQ-UI, BE MD, BE PSD, the presence of a central scotoma in the BE, poorer VA in the WE, and reduced CS in either eye. Multivariate analysis showed that poorer CS in the WE is an independent factor for worse stationary test person scores. Divergence validity, however, was suboptimal, with worse stationary test person scores associated with increasing age; this finding is consistent with our previous study of the CGVFT.20 Slower response and reduced comprehension in older patients may explain their worse performance.
Rasch-analyzed person scores for the moving ball test showed that it is a valid measuring scale. However, there was poor criterion validity. A weak correlation with having a peripheral scotoma in the WE also did not reach statistical significance. The driving test failed Rasch analysis and hence is unsuitable in its current form as a measure for patient ability.
The lack of measurement validity for the moving ball and driving tests may be because it is unclear what the tests actually measure; we hypothesize that they may have been related to factors that are not measured, such as patterns of head and eye movements, or attributes unrelated to vision, such as familiarity with technology. Unfamiliarity with technology may have been compounded by the short length of videos (5–12 seconds).
The driving test may have involved tasks and commands that were too difficult for the patient to comprehend or perform to allow reliable and consistent measurements. For example, it required that the participant verbally indicate when they would stop/start driving. This may not correlate precisely with reflex-type motor behaviors, such as braking when a driver sees a cyclist. Future testing may address these issues by using a pedal instead of verbal cues, varying the time for each task, and the use of preceding “practice” videos. As this is a pilot study, the results of these tests form the basis of an ongoing iterative process to optimize study measures for future research design.
Compared to the GAL-9 and VFQ-UI, stationary test person scores had a lower correlation with BE MD (correlation coefficients: −0.546, −0.400, and −0.244, respectively). The strength of the GAL-9 and VFQ-UI results in part from serial refinement. Both were derived from extensively validated questionnaires whose psychometric qualities were refined through Rasch analysis.7,33 In comparison, this is the first proof-of-concept version of the VR-GVFT, which may be improved with future modifications. The VR-GVFT also has advantages over PROs. First, unlike traditional clinical parameters, it visually simulates real-life scenarios. This is further enhanced by its integration of the eye and head movement that most activities require as part of the visual search to complete a task. The very realistic nature of the test may allow patients, clinicians, and policy makers to gain greater insights into the potential impact of glaucoma on daily visual function. Second, the VR-GVFT utilizes widely available smartphone technology and low-cost VR head-mounted goggles (current estimated average cost of US$20). Both are easily portable, allowing for potential use in low-resource, rural, and remote clinical environments. These qualities also give it an advantage over physical simulations of visual function; unlike the ADREV test, the VR-GVFT is useable for patients with manoeuvrability issues, and requires less effort, equipment, and space to set up. Third, it is timed, allowing fine gradations of visual function to be detected. Lastly, the stationary test component correlates well with subjective measures of QoL, such as the VFQ-UI.
By simulating real-world tasks, VR is more likely to reflect the difficulties with day-to-day tasks experienced by each patient. Far more than a visual field test printout, this can heighten a patient's understanding of their own visual disability and be used by caregivers, clinicians, and policymakers to identify personalized means of optimizing QoL, such as modifications to the home environment catered to a patient's own vision-related challenges. It may also aid understanding in the doctor–patient relationship, which can influence treatment adherence, one of the key problems in glaucoma management today.51
This study has potential limitations. First, as this was a prototype test designed specifically to evaluate visual dysfunction related to glaucoma, patients with coexisting ocular disease were excluded from this study. While this ensured we were measuring mainly glaucomatous visual dysfunction, extrapolation to patients with other ocular and nonocular co-morbidities that can affect vision is limited. Further studies evaluating the influence of these on VR-GVFT performance would be worthwhile.52,53
Second, because this was a pilot study, normal individuals without glaucoma were not included. Including such individuals as controls would benefit future evaluation of this test.
Third, we had a larger proportion of mild glaucomatous cases. This is unfortunately a limitation when patients are recruited from clinical patient encounters, in that very advanced cases are generally rarer than moderate or mild cases of glaucoma. However, we feel this may better reflect the proportion of patients with glaucoma in real life.
Fourth, 24-2 Humphrey visual fields were primarily used in this study, and these could miss field defects in the far periphery that may have influenced patients' performance on the VR-GVFT. However, routine clinical testing was prioritized during recruitment; the 24-2 is a commonly used standard visual field assessment in the clinical setting and is sufficient for grading mild, moderate, and severe glaucoma in our study.54
Fifth, smartphone-based goggles, while attempting to simulate real life, do not capture the precise daily visual challenges experienced by individual patients with glaucoma. Such tasks are impossible to precisely recreate and measure under the conditions of scientific study; like all clinical tests and PROs, the VR-GVFT is at best a potential sample of visual difficulties that might be experienced by patients with glaucoma. For instance, patients with glaucoma may develop compensatory responses in terms of eye or head movements to cope with limitations imposed by their scotoma; such compensation may mean their visual limitation may not be detected by visual challenges.55
Lastly, gaze-tracking is not possible with Google Cardboard VR goggles. To mitigate this, we monitored the visual experience of the patient from the administrator's laptop. Gaze-tracking capability on future smartphone technology may lead to improvements in testing,56 both to monitor gaze fixation and to measure the ability to navigate a simulated three-dimensional environment.
We have demonstrated that using readily accessible VR goggles and a structured objective test can provide near real-world assessment of how glaucoma affects activities of daily living. Such testing may also help both clinicians and patients communicate how glaucoma may interfere in the practical, day-to-day world of patients. It could thus be an important intervention in patient and community education and a source of information for health policy. Further development is required to improve the precision of this test in glaucoma.
Centre for Eye Research Australia receives Operational Infrastructure Support from the Victorian Government (Australia).
Disclosure: R.L.Z. Goh, None; Y.X.G. Kong, None; C. McAlinden, None; J. Liu, None; J.G. Crowston, None; S.E. Skalicky, None