How to Measure Visual Clarity in VR Yourself: The Present and Future of Visual Clarity for VR.
Hello everyone, I'll try to explain everything about visual clarity in VR. I hope to help more people understand the present and future of VR visual clarity.
Just like computer monitors, VR headsets also use pixelated displays. The display resolution of a monitor determines the maximum image resolution it can display.
When the rendering resolution surpasses the display resolution, we refer to this as downsampling. Conversely, when the display resolution is greater than the rendering resolution, this is called upsampling.
When you play a 4K video on a 4K monitor, every pixel of the image can correspond directly to a physical pixel on the screen, resulting in the best possible display quality.
Therefore, playing an 8K video on a 4K monitor doesn't make much sense, as once the rendering resolution exceeds the display resolution, there will be no further improvement in visual clarity. This is why the rendering resolution of operating systems usually matches the display resolution.
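The relationship between rendering and display resolution can be summarized in a small helper. This is just an illustrative sketch; the function name and classification are mine, not any standard API:

```python
def sampling_mode(render_px, display_px):
    """Classify the relationship between rendering and display resolution."""
    ratio = render_px / display_px
    if ratio > 1:
        return "downsampling"  # rendered image is shrunk to fit the display
    if ratio < 1:
        return "upsampling"    # rendered image is stretched to fit the display
    return "native"            # one rendered pixel per physical pixel

# Playing an 8K video on a 4K monitor: downsampling, no clarity gain past native.
sampling_mode(7680, 3840)  # "downsampling"
```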
Understanding these concepts is not difficult when it comes to computer monitors. However, in the context of VR headsets, the image's pixels do not directly correspond to the physical pixels on the screen.
In order to enhance immersion by using a larger field of view, the rendering pipeline of VR headsets employs distortion algorithms to warp high-resolution raw images. This allows the compression of a larger field of view into a limited image size. Then, the optical system performs reverse distortion on the image, decompressing it and ultimately delivering it to the human eye. Therefore, the visual clarity in VR is also affected by the optical system.
The lenses of VR headsets cannot deliver 100% image quality. The sharpness at the lens edges is usually lower than in the central area, leading to a decrease in image quality in non-central areas.
In VR, there isn't a simple case of upsampling and downsampling - both coexist. The distortion algorithm inherently introduces downsampling during compression in non-central areas. Therefore, the rendering resolution is often higher than the display resolution. This point is often overlooked and misunderstood because it's counterintuitive.
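To see why distortion itself downsamples the periphery, consider a minimal sketch of a one-coefficient radial compression model. The coefficient `k` and the formula are illustrative only, not any real headset's lens profile:

```python
def compress_radius(r, k=0.5):
    """Map an undistorted radius r (0 = center, 1 = edge of the field)
    to a distorted radius. Illustrative one-coefficient radial model,
    normalized so that r = 1 still maps to 1."""
    return r / (1 + k * r * r) * (1 + k)

def local_scale(r, k=0.5, eps=1e-5):
    """Numerical derivative of the mapping: how many rendered pixels land
    on one unit of visual field at radius r. A smaller value at the edge
    means fewer pixels per degree there, i.e. inherent downsampling."""
    return (compress_radius(r + eps, k) - compress_radius(r - eps, k)) / (2 * eps)
```

With this model, `local_scale` near the center is well above 1 while near the edge it falls well below 1: the center gets extra pixels and the periphery loses them, which is exactly the coexistence of downsampling and upsampling described above.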
To better illustrate this, I've designed optotype-based standard images, OBSI.
The biggest difference between VR and traditional displays lies in the fact that in VR, what's displayed on a 4K screen could actually be a distorted and compressed 8K image. This image distortion and its correction by the optical system can lead to a loss of visual clarity in non-central areas.
Moreover, if the image quality is compressed during VR streaming, it can result in additional loss of visual clarity. This is similar to the image quality loss caused by normal video streaming.
Typically, monitors use the pixel density, Pixels Per Inch, PPI, as a measure of display resolution. PPI is the number of pixels along one inch of the screen, where a higher number leads to an enhanced display quality.
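For reference, a monitor's PPI can be computed from its resolution and diagonal size. The numbers below are a generic 27-inch 4K example, not any specific product:

```python
import math

def ppi(width_px, height_px, diagonal_inches):
    """Pixels per inch: pixel count along the diagonal divided by
    the diagonal length in inches."""
    return math.hypot(width_px, height_px) / diagonal_inches

ppi(3840, 2160, 27)  # ~163 for a 27-inch 4K monitor
```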
However, for VR headsets, we tend to use the angular pixel density Pixels Per Degree, PPD, instead of PPI. This is because PPI measures physical properties, while PPD measures visual properties.
In VR, PPI correlates only weakly with actual visual clarity. This is due to the close proximity of the screen to the human eye and the influence of the optical system. Therefore, PPD is a more suitable metric to measure VR resolution performance.
PPD takes into account several factors, including the resolution of the screen, its actual size, the distance from the screen to the human eye, and the performance of the optical system.
The performance of the optical system is often measured precisely by the Modulation Transfer Function, MTF. The MTF describes how much of a scene's contrast the optics preserve at each level of detail (spatial frequency), so it integrates the influence of resolution, contrast, and sharpness. It determines the level of detail that the optical system can deliver: the higher the MTF value, the more refined the details of the image.
So, when it comes to VR, it's not just about the number of pixels. It's about how those pixels are perceived by the human eye, and how effectively they are delivered by the optical system. That's why we look at metrics like PPD and MTF, to get a more accurate understanding of the VR experience.
In essence, PPD is a composite metric that encapsulates the performance of an optical system; unlike PPI, it is not a single fixed value.
For a typical display, the PPI is the same across all parts of the screen. However, PPD is often highest at the center and gradually decreases towards the edges.
At this point, you might be thinking that PPD sounds pretty great. But if you try to look up PPD, you'll find that things are not as straightforward.
While we can roughly estimate PPD using display resolution and the size of the field of view, the size of the field of view is usually just an estimate itself. As you can imagine, the accuracy of using an estimate to calculate another estimate leaves much to be desired.
Moreover, this estimation method fails to reflect the performance of the optical system. It only provides an estimated value for the central PPD, without offering any information about the PPD values in non-central areas.
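The rough estimation method described above boils down to a one-line formula: spread the horizontal pixel count evenly across the horizontal field of view. A minimal sketch, with illustrative numbers rather than official specifications:

```python
def estimate_central_ppd(horizontal_pixels, horizontal_fov_deg):
    """Rough central-PPD estimate: assumes pixels are spread evenly across
    the field of view, so it ignores lens distortion and MTF entirely."""
    return horizontal_pixels / horizontal_fov_deg

# Illustrative numbers only, not official specs:
estimate_central_ppd(1832, 90)  # ~20.4
```

Note that both inputs are themselves estimates (especially the field of view), which is exactly why this method is so inaccurate.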
VR manufacturers typically do not disclose PPD values unless they believe their product has reached an industry-leading level in terms of PPD and optical system performance. In such cases, visual clarity becomes a selling point for them, and they are willing to disclose this information. Even so, they only disclose a small portion, not all of the values. For instance, Varjo VR-3 and Pimax Crystal have revealed some of their PPD performance, and Quest Pro and Quest 2 have shared some of their PPD and MTF performance.
In addition, there are companies that manufacture and sell equipment specifically designed to accurately measure PPD and MTF. However, these companies mainly target the enterprise market, and the prices of their equipment are typically very high and not publicly disclosed. Therefore, it's unlikely that average consumers would purchase these expensive professional devices just to understand the visual clarity of VR devices.
This creates a somewhat awkward situation for the practical application of PPD: VR manufacturers only share the good news, choosing to disclose only the data that benefits them, while consumers don't have a low-cost, accurate measurement method. These factors have led to the PPD metric of VR becoming somewhat of a phantom in the public consciousness.
The normal visual acuity of the human eye is roughly equivalent to 60 PPD, and the upper limit of human vision is around 128 PPD. For VR devices, reaching a three-digit PPD is still a distant goal. However, some devices, such as the Varjo VR-3, have a central area PPD greater than 60. That's why Varjo is willing to disclose partial PPD values.
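The 60 PPD figure follows from the definition of normal acuity: 20/20 vision resolves detail about one arcminute wide, and one pixel per arcminute is 60 pixels per degree. A quick sanity check (the 0.47-arcmin value below is back-derived from the 128 PPD figure above, not an independent measurement):

```python
def ppd_for_acuity(mar_arcmin):
    """PPD needed to show one pixel per minimum angle of resolution (MAR).
    There are 60 arcminutes in a degree."""
    return 60.0 / mar_arcmin

ppd_for_acuity(1.0)   # 60   -> normal (20/20) vision
ppd_for_acuity(0.47)  # ~128 -> approximate upper limit of human vision
```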
But what about devices that don't reach 60 PPD? Should VR manufacturers naturally choose not to disclose these values? This practice has led to a lack of public understanding of VR visual clarity, as they can't find a simple and effective measure of visual clarity. This has indirectly led to the popularity of inaccurate PPD estimation methods.
The absence of specific metrics also hinders scientific research related to VR visual clarity. When researchers want to mention the PPD of VR devices in their papers, they can't find any specific official values online, and they see a bunch of different third-party estimated values. In this situation, they can only choose to ignore PPD.
The discussion above mainly focuses on the visual performance of display resolution. But if you recall the comparison made at the beginning between display resolution and rendering resolution, you'll realize that PPD and MTF cannot explain the effect of rendering resolution on visual clarity.
Due to the limitations in performance and battery life, the default rendering resolution of standalone VR headsets is often much lower than in streaming mode. In PC VR and streaming, users can set the rendering resolution themselves. However, there currently isn't a simple way to measure the effect of rendering resolution on visual clarity.
Since VR manufacturers are all pursuing the goal of achieving 60 PPD, which is equivalent to normal human vision, why not apply the visual acuity charts used in optometry directly to the VR environment? This could measure the end-to-end visual clarity from the virtual environment to the human eye. Not only would this bypass the opaque PPD and MTF, but it could also explain the effect of rendering resolution on visual clarity. Plus, it's very cost-effective. Both VR manufacturers and ordinary VR users can use it to measure visual clarity.
One simple fact is that the human eye moves. Achieving 60 PPD in the center area alone doesn't satisfy normal human vision. The standard for normal vision in VR should be achieving 60 PPD in all areas, and no current device can do this. However, by using the method of visual acuity testing, we can understand how far current VR devices are from achieving normal omnidirectional virtual vision.
Nevertheless, in real life, visual acuity charts can only measure the visual acuity of the field directly in front. Therefore, in the VR environment, we need a new visual acuity chart to measure omnidirectional visual acuity.
In a recent paper, we introduced a new metric called Omnidirectional Virtual Visual Acuity, OVVA, which is designed to evaluate and quantify the visual clarity of VR headsets.
The testing method for OVVA is based on real-world visual acuity charts. It uses specific optotypes, which are symbols used in visual acuity charts, as recognition targets. The user's ability to identify these optotypes is then used to assess the visual clarity of the headset.
The OVVA measurement is divided into two stages. The first stage measures the central virtual visual acuity, CVVA, which evaluates the visual clarity in the central area. The second stage measures the degradation of CVVA in the non-central area, which is used to evaluate the overall distribution of visual clarity across the headset.
In our experiment, we tested three types of VR headsets, and set up four different rendering resolution conditions for each device. We recruited 30 participants, who were divided into three groups, to measure the OVVA under different headsets.
To better compare the results, we converted the CVVA values into a format called logMAR. logMAR is a standard way of measuring visual acuity: it is the base-10 logarithm of the minimum angle of resolution, the smallest visual angle, in arcminutes, that the eye can resolve. Essentially, it measures vision loss. In logMAR, 0 indicates normal vision (1.0 in decimal notation). A positive logMAR value, like 0.2, indicates vision below normal, while a negative value, like -0.2, indicates vision above normal.
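The conversion between decimal acuity and logMAR is a simple logarithm, since decimal acuity is the reciprocal of the minimum angle of resolution in arcminutes. A minimal sketch:

```python
import math

def decimal_to_logmar(decimal_acuity):
    """logMAR = log10(MAR in arcminutes), and decimal acuity = 1 / MAR,
    so logMAR = -log10(decimal acuity)."""
    return -math.log10(decimal_acuity)

decimal_to_logmar(1.0)  # 0.0   -> normal vision
decimal_to_logmar(0.5)  # ~0.30 -> below normal
decimal_to_logmar(2.0)  # ~-0.30 -> above normal
```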
Across all test conditions, the results of the first stage of OVVA measurement, that is the CVVA value, showed a strong correlation with the pixel size of the rendering resolution. Therefore, we performed a linear regression analysis on the logMAR results. The results showed that the R-squared value for most test conditions was about 0.9. This means our regression model can very accurately explain most of the variability in logMAR and can be used to predict CVVA values at different rendering resolutions.
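For readers who want to reproduce this kind of analysis, an ordinary least-squares fit is enough. The data points below are hypothetical and for illustration only; they are not the paper's actual measurements:

```python
# Hypothetical (angular pixel size, logMAR) pairs -- illustrative only.
# Larger rendered pixels should mean worse (higher) logMAR.
data = [(0.8, -0.05), (1.0, 0.02), (1.3, 0.12), (1.6, 0.21), (2.0, 0.35)]

def linear_fit(points):
    """Ordinary least squares: returns slope, intercept, and R-squared."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - my) ** 2 for _, y in points)
    return slope, intercept, 1 - ss_res / ss_tot

slope, intercept, r2 = linear_fit(data)
```

An R-squared near 0.9 or above, as reported for most test conditions, means the fitted line explains most of the variability in logMAR and can be used to predict CVVA at other rendering resolutions.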
The results from the second stage of the OVVA showed a direct relationship between position in the field of view and the accuracy of optotype recognition. In the non-central area, the Quest Pro outperformed the other headsets, while in the central area, its performance was similar to the Quest 2.
The Quest 2 and Quest Pro have official central PPD and some MTF values. The OVVA results are consistent with these official parameters. The ideal VR headset should ensure normal central virtual visual acuity, with no degradation in the non-central area, and a field of view close to the human eye.
OVVA is a quick two-minute test that doesn't require any specialized equipment and can quantify the gap in visual clarity between current VR devices and the ideal VR device.
Users, developers, and VR manufacturers can all use OVVA to measure end-to-end visual clarity in VR environments. Furthermore, users can easily use OVVA to measure and understand differences in visual clarity under different conditions, such as comparing VR streaming and standalone modes.
Finally, let's discuss the ideal form of future VR devices.
Firstly, a lightweight design is essential for prolonged use. Currently, most standalone VR devices weigh around 500 grams. With technological advancements, we can expect future VR devices to be as light as today's lightweight optical see-through headsets.
VR head straps are not designed like a helmet that distributes pressure evenly. Most of the weight presses on the face, putting a strain on the neck, so wearing a headset for a long period can cause discomfort. Moreover, interacting with virtual objects also increases the load on the arms, making VR more tiring than using a computer.
For scenarios such as virtual desktop displays and precision VR surgery, normal vision is indispensable.
The loss of virtual vision could be one of the main obstacles to the widespread application of VR. Insufficient virtual vision can cause more simulator sickness in VR environments. If your vision is normal, then the current VR headsets can be approximately considered as heavy glasses simulating myopia. If you wear glasses, you can imagine the troubles of living without them.
Passthrough, also known as video see-through, VST, is becoming increasingly prevalent. The visual clarity of VST is complex and closely tied to the quality of the cameras. There are online videos of people walking or driving outdoors with VR devices in passthrough mode. This is akin to a person with myopia going without glasses: dangerous and uncomfortable. Moreover, VST inherently suffers from latency, noise in low light, and overexposure, all of which degrade the actual experience. Driving under VST combines the risks of drunk driving (delayed reactions), uncorrected myopia, and night blindness, with the added burden on your head, so it is very dangerous. In fact, people with uncorrected myopia cannot even pass the vision test for a driving license.
VR is a crucial component of the metaverse. If the metaverse becomes a part of people's future lives, people certainly do not want to experience the visual impairment, delayed response time, and night blindness brought by VR. People are accustomed to normal or corrected vision in real life. So, the abnormality of virtual vision can cause many inconveniences, just like being nearsighted but not wearing glasses. When using a mobile phone, if you can't see the details clearly, you can zoom in on the content or bring the phone closer. However, in a VR environment, if you can't see the details clearly, the only choice is to lean your head closer. This is because in the VR environment, the distance from the screen to the human eye will not change, and the field of view will not change either.
If you want to understand how far we are from perfect VR visual clarity, there's no need to refer to estimated PPD values or sift through the overwhelming promotions from VR manufacturers. All you need to do is spend two minutes to test your OVVA, and you'll get the answer. Simulating normal vision in VR environments is an important milestone for the future VR visual clarity.