Google's Omni AI: A Glimpse into the "Anything-to-Anything" Future of Video Generation

Google's new Omni AI model promises "anything-to-anything" generation, starting with video. A hands-on review found its deepfakes convincing enough to fool even close family, despite persistent consistency problems and a credit-based cost for every clip and edit.

Agent Newsroom · 2 min read

Google has unveiled Omni, a new family of generative AI models under its Gemini umbrella, promising an "anything-to-anything" capability. The ultimate goal is to transform any input (photo, video, or text) into any other form, but the initial release, Omni Flash, focuses squarely on video generation within Flow, the company's AI platform. A recent hands-on review found Omni's performance impressive yet often perplexing: it can create surprisingly realistic videos with minimal effort, part of a trend that continues to heat up across the generative AI space.

Omni Flash builds upon its predecessor, Veo, with several key enhancements. Users can now supply an uploaded video alongside a text prompt as the starting point for a generated clip. Google also claims that Omni incorporates more real-world knowledge, leading to improved character consistency throughout generated clips. Real-world testing, such as recreating the adventures of "Buddy the deer," produced mixed results. Some clips showed significant improvements in consistency and prompt adherence compared to earlier models, while others still suffered from "AI jump scares," like Buddy suddenly changing orientation mid-skydive.

The challenges extended to complex narratives and consistent editing. In one instance, Omni was prompted to create a montage of Buddy packing honey and later mistaking it for sunscreen. The concept was amusing, but the honey bottle inexplicably changed form throughout the video, from a jar to a clear squirt bottle, underscoring a persistent problem with object consistency. Text-based edits are more effective than they were with Veo, but they don't always yield the desired outcome. Attempts to emphasize Buddy's facial expressions resulted in strange distortions, and the model inconsistently added or removed antlers, showing that achieving a precise vision still requires considerable back-and-forth.

Beyond the technical quirks, using Omni comes with a tangible cost. Generating videos and applying edits consumes credits, with prices ranging from 15 to 40 credits per clip or edit, depending on complexity. The reviewer, subscribed to a $20-per-month AI Pro plan offering 1,000 credits, found that around 20 clips and a few edits quickly depleted their allowance to just 145 credits. Users with specific creative ideas may therefore face a substantial financial investment to refine AI-generated content to their exact specifications.

Perhaps the most striking aspect of Omni is its deepfake potential. The reviewer experimented with deepfaking themselves into various scenarios, such as eating spaghetti, sitting on an airplane, and standing before the Eiffel Tower. The results were astonishing. Subtle "AI tells" like manufactured sounds or repeated background characters were present, but the overall realism was "convincing as hell." The reviewer's husband, unaware that a pasta-eating clip was AI-generated, was completely fooled, noting only that the bowl looked unfamiliar. Despite minor glitches, these deepfakes were deemed "good enough to fool people on social media," which has significant implications for digital content and authenticity.
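
For readers who want to check the credit math from the pricing figures above, here is a minimal Python sketch. The plan price, credit allowance, per-generation cost range, and remaining balance are the numbers reported in the review; the function names are hypothetical, and the sketch assumes credits map linearly to dollars within a single monthly plan, which the review does not state.

```python
# Figures reported in the review. Function names are illustrative only,
# and the linear dollars-per-credit mapping is an assumption.
PLAN_PRICE_USD = 20.0        # AI Pro plan, per month
PLAN_CREDITS = 1000          # credits included in the plan
COST_RANGE_CREDITS = (15, 40)  # credits per clip or edit
CREDITS_REMAINING = 145      # balance after ~20 clips and a few edits


def usd_per_credit(price=PLAN_PRICE_USD, credits=PLAN_CREDITS):
    """Implied dollar value of one credit (assumes a linear mapping)."""
    return price / credits


def credits_spent(total=PLAN_CREDITS, remaining=CREDITS_REMAINING):
    """Credits consumed, given the starting allowance and what is left."""
    return total - remaining


def generations_affordable(credits, cost_per_generation):
    """Whole clips or edits that fit in a credit balance at a fixed cost."""
    return credits // cost_per_generation


if __name__ == "__main__":
    cheapest, priciest = COST_RANGE_CREDITS
    spent = credits_spent()
    print("Credits spent:", spent)
    print("Implied spend (USD):", round(spent * usd_per_credit(), 2))
    print("Generations per month, all at the priciest rate:",
          generations_affordable(PLAN_CREDITS, priciest))
    print("Generations per month, all at the cheapest rate:",
          generations_affordable(PLAN_CREDITS, cheapest))
    print("Generations left at the priciest rate:",
          generations_affordable(CREDITS_REMAINING, priciest))
```

Under those assumptions, the reviewer used 855 of 1,000 credits, roughly $17.10 of the $20 plan, and a full monthly allowance covers somewhere between 25 and 66 clips or edits depending on complexity.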
