What was our score last time? Our four trends to watch in 2024 included what we called customized chatbots: interactive helper apps powered by multimodal large language models (check: we didn’t know it yet, but we were talking about what everyone now calls agents, the hottest thing in AI right now); generative video (check: few technologies have improved so quickly over the past 12 months, with OpenAI and Google DeepMind releasing their flagship video generation models, Sora and Veo, one week apart in December); and more versatile robots capable of performing a wider range of tasks (check: the benefits of large language models continue to trickle down to other sectors of the technology industry, and robotics is at the top of the list).
We also said that AI-generated election disinformation would be pervasive, but here, thankfully, we were wrong. There was plenty to worry about this year, but political deepfakes were thin on the ground.
So, what will happen in 2025? We’re going to ignore the obvious here: you can bet that agents and smaller, more efficient language models will continue to shape the industry. Instead, here are five alternative picks from our AI team.
1. Generative Virtual Playgrounds
If 2023 was the year of generative images and 2024 was the year of generative video, what comes next? If you guessed generative virtual worlds (i.e. video games), high fives all around.
We got a little taste of this technology back in February, when Google DeepMind revealed a generative model called Genie, which could take a still image and turn it into a 2D side-scrolling platformer that players could interact with. In December, the company revealed Genie 2, a model capable of transforming an initial image into a complete virtual world.
Other companies are developing similar technology. In October, AI startups Decart and Etched revealed an unofficial Minecraft hack in which every frame of the game is generated on the fly as you play. And World Labs, a startup co-founded by Fei-Fei Li, creator of ImageNet, the massive photo dataset that sparked the deep learning boom, is building what it calls large world models, or LWMs.
An obvious application is video games. There is a playful tone to these early experiments, and generative 3D simulations could be used to explore design concepts for new games, turning a sketch into a playable environment on the fly. This could lead to completely new types of games.
But they could also be used to train robots. World Labs wants to develop what is called spatial intelligence: the ability of machines to interpret and interact with the everyday world. But robotics researchers lack reliable data on real-world scenarios with which to train such technology. Spinning up countless virtual worlds and setting virtual robots loose in them to learn by trial and error could help make up for this.