Meta has announced a new AI world model designed to better understand 3D environments and the movement of physical objects.
The tech giant's new open-source AI model, V-JEPA 2, allows for visualisation and prediction in the physical world and was unveiled at the VivaTech conference in Paris.
The model predicts how the environment around it will change without relying on labelled training data. Instead, it was trained on more than one million hours of raw video to learn patterns of movement, interaction and cause and effect.
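In rough terms, this kind of self-supervised training hides part of a video clip and asks the model to predict the hidden content in an abstract feature space rather than pixel by pixel. The sketch below illustrates that general idea only; it is not Meta's code, and every module name, shape and hyperparameter is an assumption made for illustration.

```python
# Illustrative sketch of JEPA-style self-supervised learning from video
# (not Meta's implementation): a context encoder sees a partially masked clip,
# a target encoder sees the full clip, and a predictor learns to match the
# target's latent features -- no labels, no pixel reconstruction.
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Stand-in for a large video transformer encoder (hypothetical)."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(768, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, patches):            # patches: (batch, num_patches, 768)
        return self.net(patches)            # latent features per space-time patch

encoder = TinyEncoder()                     # sees only the visible patches
target_encoder = TinyEncoder()              # sees the full clip; not trained by gradients
predictor = nn.Sequential(nn.Linear(256, 256), nn.GELU(), nn.Linear(256, 256))

video_patches = torch.randn(4, 64, 768)     # fake batch: 4 clips, 64 space-time patches
mask = torch.rand(4, 64) < 0.5              # hide roughly half the patches

with torch.no_grad():
    targets = target_encoder(video_patches)             # "ground truth" in latent space

context = encoder(video_patches * (~mask).unsqueeze(-1).float())
preds = predictor(context)

# The loss is computed only on the hidden patches, in feature space.
loss = ((preds - targets) ** 2)[mask].mean()
loss.backward()
print(f"latent prediction loss: {loss.item():.4f}")
```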
“Allowing machines to understand the physical world is very different from allowing them to understand language,” Yann LeCun, Meta’s chief AI scientist, said in a video presentation at the conference on Wednesday.
“A world model is like an abstract digital twin of reality that an AI can reference to understand the world and predict consequences of its actions, and therefore it would be able to plan a course of action to accomplish a given task.”
For example, the model predicts that a ball rolling off a table will fall rather than float, and it recognises that an object hidden from view has not vanished.
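The planning step LeCun describes can be pictured as the model "imagining" the outcome of candidate actions and choosing the sequence whose predicted result is closest to the goal. The sketch below shows one simple way such a loop could work; the dynamics model, goal encoding and action space are all hypothetical stand-ins, not details from V-JEPA 2.

```python
# Illustrative random-shooting planner over a learned world model
# (not Meta's method): imagine latent trajectories for many candidate
# action sequences and keep the one that ends closest to the goal state.
import torch
import torch.nn as nn

LATENT, ACTION, HORIZON, CANDIDATES = 128, 4, 10, 256

# A learned transition model: given the current latent state and an action,
# predict the next latent state. In practice this would be trained on video.
dynamics = nn.Sequential(nn.Linear(LATENT + ACTION, 256), nn.GELU(), nn.Linear(256, LATENT))

def rollout(state, actions):
    """Imagine the latent trajectory produced by a sequence of actions."""
    for a in actions:                                  # actions: (HORIZON, ACTION)
        state = dynamics(torch.cat([state, a], dim=-1))
    return state                                       # predicted final latent state

current_state = torch.randn(LATENT)                    # encoding of the current camera view
goal_state = torch.randn(LATENT)                       # encoding of the desired outcome

with torch.no_grad():
    candidates = torch.randn(CANDIDATES, HORIZON, ACTION)
    # Score each imagined trajectory by how close it ends to the goal.
    scores = torch.stack([
        -torch.norm(rollout(current_state, seq) - goal_state) for seq in candidates
    ])
    best_plan = candidates[scores.argmax()]

print("first action of the chosen plan:", best_plan[0])
```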
Meta said the V-JEPA 2 model would benefit machines such as delivery robots and self-driving cars, which need to understand their surroundings in real time.
This comes as Meta focuses on AI to compete with players like OpenAI, Microsoft and Google. The company is set to invest US$14 billion in AI firm Scale AI and hire its CEO, Alexandr Wang, to bolster its AI strategy, according to CNBC.
The model is now available for research and commercial use.
Meta’s announcement follows leading AI researcher Fei-Fei Li raising US$230 million for a new startup, World Labs, which aims to create “large world models” that better understand the structure of the real world.