February 16th, 2024
OpenAI has dipped its toes, or should I say its whole body, into the world of video generation.
Following in the footsteps of startups like RunwayML, the AI company announced Sora, a text-to-video AI model that’s capable of producing some stunning, almost concerning, results.
It was announced yesterday, out of the blue, and it quickly took social media by storm. OpenAI CEO Sam Altman generated a number of videos based on people’s suggested prompts, including dogs recording a podcast, a drone race on Mars, and a variety of sea creatures riding bikes.
Sora works like the rest of OpenAI’s offerings: enter a prompt as simple or as detailed as you like, and it will generate a 1080p video up to a minute long in whatever style you want, populated with things, people, animals, and different environments. You can also craft your blockbuster movie by dropping in a still image, which the AI will then animate, or an existing video, which Sora can extend.
According to OpenAI, Sora was trained on around 10,000 hours of “high quality video” and is built upon a transformer architecture, which apparently gives the model superior scaling performance. It also uses the same “recaptioning technique from DALL·E 3, which involves generating highly descriptive captions for the visual training data.”
Safety was a big concern for the team as well, so it’s not open to the public yet. Rather, the company is working with “red-teamers” — experts in things like misinformation, hate content, and bias — who will be testing the model thoroughly before any release to the wider public.
Sora, for all of its mind-blowing capabilities, isn’t perfect, and the team recognizes its weaknesses, particularly when it comes to physics, saying: “It may struggle with accurately simulating the physics of a complex scene, and may not understand specific instances of cause and effect.”
As mentioned, Sora isn’t currently available to the wider public, and there’s no release date yet. However, you can continue to reply to Sam Altman and maybe he will generate your prompt, or you can take a look at this curated gallery of examples made by a maker.
Donald Evans
Tech Enthusiast
I wonder what other applications this architecture could be used for.
Adaline Blanda
Mayank K Sahu
Designing experiences beyond imagination
@adalineblanda Totally agree! Sora's cool, but gotta watch out for misuse. Testing and safeguards are key before release.
Mayank K Sahu
Designing experiences beyond imagination
Wow, Sora sounds like a game-changer! The ability to generate videos from text prompts is mind-blowing. Kudos to the OpenAI team for prioritising safety. Can't wait to see Sora in action once it's released to the public. Keep up the amazing work!
Joe Armis
