The latest Segment Anything AI Model released by Meta has the potential to revolutionize the AR and VR industry.
Meta recently introduced SAM (Segment Anything Model), a model trained on a massive dataset – 11 million images and over a billion semi-automated segmentation masks – to segment any object in an image.
Segmentation is the process of identifying which image pixels belong to an object. It can be used in a variety of applications, from analyzing scientific imagery to editing photos, Meta says.
“SAM has learned a general notion of what objects are, and it can generate masks for any object in any image or any video,” the company states in its article. “In the AR/VR domain, SAM could enable selecting an object based on a user’s gaze and then ‘lifting’ it into 3D.”
How can it be game-changing for the VR and AR industry?
SAM has the potential to play a significant role in enhancing the technology behind Augmented and Virtual Reality. One way these devices could benefit from Meta’s AI is by letting users select objects based on where they are currently looking, then interact with those objects or view information about them.
For example, with AR glasses, a person could look at a lamp in their living room. SAM would then recognize the object being viewed, segment it, and allow the user to turn the lamp on and off using voice commands or other controls.
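To make the lamp scenario concrete, here is a minimal, purely illustrative sketch of the selection step. It assumes a segmentation model (such as SAM) has already produced per-object pixel masks, and that an eye tracker reports a gaze point; the `select_gazed_object` function, the labels, and the toy scene are all my own invented examples, not Meta’s actual API.

```python
# Hypothetical sketch: picking the object a user is looking at from
# per-object segmentation masks. All names and data are illustrative.

def select_gazed_object(masks, gaze):
    """Return the label of the object whose mask contains the gaze pixel.

    masks: dict mapping label -> set of (x, y) pixels in that object's
           mask (e.g. a segmentation model's output, converted to sets)
    gaze:  (x, y) pixel the eye tracker reports the user is fixating on
    """
    for label, pixels in masks.items():
        if gaze in pixels:
            return label
    return None  # gaze landed on unsegmented background

# Toy scene: a "lamp" and a "table" occupying small pixel regions.
scene = {
    "lamp":  {(x, y) for x in range(10, 20) for y in range(5, 15)},
    "table": {(x, y) for x in range(0, 40) for y in range(20, 30)},
}

print(select_gazed_object(scene, (12, 8)))   # lamp
print(select_gazed_object(scene, (35, 25)))  # table
```

Once the label is known, the device could route a voice command like “turn it on” to the smart-home controller for that object.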
But in my opinion, one of the most impressive possibilities is using segmentation to carry real objects into virtual spaces. For instance, while playing VRChat with friends, you could switch to passthrough mode, use SAM to scan and segment a book, and then return to VRChat to show your real-life book to others.
That was just my example, but Meta itself has suggested a similar idea. In its article, the company describes how users with VR headsets could look at an object, and SAM could automatically recognize it and lift it into 3D space in AR.
Moreover, in AR or mixed-reality experiences, Meta’s Artificial Intelligence could enhance everyday activities such as shopping, navigation, or home improvement tasks by enabling users to easily identify and interact with objects and providing them with relevant information or assistance.
SAM has the potential to become a daily assistant when using AR or VR devices. Apart from AR, Meta’s AI could also understand both the visual and text content of a webpage or analyze scientific imagery.
This means SAM could aid content creators by boosting productivity in front-end programming or video editing. Meta also notes that it could help study natural occurrences on Earth, or even on other planets, so scientists could benefit from SAM’s abilities without ever using an AR device.
If the Segment Anything Model from Meta is as advanced as Meta states it is, we may see some of its features in the upcoming Meta Quest 3. According to rumors, the Meta Quest 3 will be a mixed-reality-focused device priced slightly higher than the current Meta Quest 2.
Artificial Intelligence from Meta could also improve developers’ work on mixed-reality games. Take the example of an indie VR developer working on an FPS mixed-reality game. In a video demonstrating the development process, the developer spends several minutes manually mapping the walls, furniture, and other elements of the environment.
This entire process could be automated by combining a depth sensor with SAM. It could even go further, mapping smaller objects – such as a clock on the wall – and making them drop to the floor when shot from a distance in the game.
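The core of that automation would be combining a segmentation mask with the depth sensor’s readings to place each detected object in 3D, where a game engine could then attach physics to it. The sketch below is my own illustration of that idea, not Meta’s pipeline: the camera intrinsics, the pixel data, and the function name are all made-up assumptions, using a standard pinhole back-projection.

```python
# Illustrative sketch: placing a segmented object in 3D by combining
# its pixel mask with a depth map. All values here are assumptions.

FX = FY = 300.0   # assumed focal lengths, in pixels
CX, CY = 64, 48   # assumed principal point for a 128x96 depth image

def object_position_3d(mask, depth):
    """Back-project the mask's pixels through the depth map and return
    the object's mean 3D position (x, y, z) in metres.

    mask:  set of (u, v) pixel coordinates belonging to the object
    depth: dict mapping (u, v) -> depth in metres from the sensor
    """
    xs = ys = zs = 0.0
    for (u, v) in mask:
        z = depth[(u, v)]
        xs += (u - CX) * z / FX   # pinhole back-projection
        ys += (v - CY) * z / FY
        zs += z
    n = len(mask)
    return (xs / n, ys / n, zs / n)

# Toy example: a small "clock" patch roughly 2 m in front of the camera.
clock_mask = {(u, v) for u in range(60, 70) for v in range(40, 50)}
clock_depth = {p: 2.0 for p in clock_mask}

x, y, z = object_position_3d(clock_mask, clock_depth)
print(round(z, 2))  # 2.0 -- the clock sits about 2 m away
```

With a position like this, the clock becomes an anchored physics object: register it with the engine, and a hit from the game’s weapon can send it tumbling to the floor.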
While the mixed reality capabilities of the Quest Pro are somewhat limited despite its mixed reality focus, the upcoming Meta Quest 3 is expected to feature two full-color passthrough cameras and a depth sensor, which the Quest Pro lacks.
The developer’s video depicts a person wearing an XR device and walking around. On closer inspection, the distances between the objects and the user are being measured, which indicates the device is equipped with a depth sensor.
With the Meta Quest 3, the possibilities are immense: two full-color passthrough cameras and a depth sensor will let users see and interact with the augmented world in ways not previously achievable, and at a reasonable price point. Even if the Meta Quest 3 costs $500 for the base version, that is still $500 less than the Quest Pro, which is less capable in terms of mixed-reality features.
Furthermore, built-in AI algorithms allow for a more immersive experience, as the device can detect and recognize objects and faces in real time, providing a more intuitive way to interact with the augmented world. We may see similar features in the upcoming Apple Reality Pro, which is reportedly set to be unveiled this year on June 5 at WWDC23.
Overall, even if Meta’s artificial intelligence doesn’t ship with the Meta Quest 3, it will surely appear in the company’s future mixed-reality products. The AI Meta is working on could certainly change how we see AR today.