Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation
Key topics
The Ovi project presents a new AI model for generating audio-video content. The discussion covers its potential applications and implications: some users are excited about its creative possibilities, while others are concerned about potential misuse.
Snapshot generated from the HN discussion
Discussion Activity
- Very active discussion
- First comment: 43m after posting
- Peak period: 92 comments in 0-6h
- Avg / period: 19
- Based on 114 loaded comments
Key moments
- Story posted: Oct 22, 2025 at 3:42 PM EDT (3 months ago)
- First comment: Oct 22, 2025 at 4:25 PM EDT, 43m after posting
- Peak activity: 92 comments in 0-6h (hottest window of the conversation)
- Latest activity: Oct 24, 2025 at 6:53 PM EDT (3 months ago)
(Of course, excluding the obvious "that guy just knocked down a building!" CGI)
Obligatory Jonas Ussing plug: https://www.youtube.com/watch?v=7ttG90raCNo&list=PLgdTaHO8FL...
Dandadan intro and its lack of FPS and sharp lines: https://www.youtube.com/watch?v=a4na2opArGY
Animation doesn’t feel fast if it has too many FPS or is too steady, anyway, ironically and counterintuitively. You can’t do everything on the ones and twos.
To your point about Dandadan’s intro, it’s jam packed with references, which is another kind of skill in and of itself:
https://www.youtube.com/watch?v=5sUaK0xahBU
Chainsaw Man is in that same vein, and is another Science Saru production. I’m looking forward to seeing what they will do with the Ghost in the Shell franchise next year.
https://en.wikipedia.org/wiki/Science_Saru
I get what you mean though regarding Dandadan’s animation style; it has a very hand drawn manga vibe, and the detail is minimal yet finely balanced against the overwhelming amount of noise and visuals. It’s like a slapdash superflat.
https://en.wikipedia.org/wiki/Superflat
On a side note, as an anime fan, MBS is doing great work lately. I liked Witch Watch much more than I expected to, and that’s a much better show than the genres involved would lead one to expect.
https://en.wikipedia.org/wiki/Mainichi_Broadcasting_System
that and the guitar player behind the singer in the concert example has three arms :)
https://news.ycombinator.com/item?id=45603435
https://news.ycombinator.com/item?id=45652726
Easier than ever now, as AI-assisted coding tools will build you that generic landing page and basic UI.
But I also suspect that most of these are indeed SEO scammers, that there's no actual service, and that all payments are pocketed. It might take a few days for the scam to be reported and the site taken down, but it's likely enough to get a few hundred bucks out of it. They'll never be pursued because of where they live, and they can have many of these up in no time, thanks to AI, as you say.
What a sad state of affairs that no "AI" company or government is taking seriously.
Lots of activity around Wan lately. It’s nice to see flexible open models make a strong showing against the massively funded closed competitors like OpenAI and Runway.
Kling still has the best proprietary video model, but Sora 2 is so smart that you don't need to edit anything if your target is social.
I don't see how Runway, Pika, or the rest of the purely foundation video model startups survive against the giants and the incredible open source Chinese models. They've got to be sweating bullets right now.
Everyone's also sleeping on xAI's high quality and insanely fast video model (10 second generations) that they're giving away completely for free without watermarks.
I think the moat here will end up being value adds for convenience, tooling, IP licensing, integration into the rest of the pipeline used for content production, etc.
Wan – Open-source alternative to VEO 3 - https://news.ycombinator.com/item?id=44928997 - Aug 2025 (38 comments)
The only catch is that I'd need to get 32 people who want VMs like this since I would have to do it for the entire box of compute.
Wan2.2 runs just fine on AMD.
Though only a shared A40/A100 are in that price range.
Vultr is a box of 8 minimum and not on-demand and they don't offer VMs.
On the other hand, I offer the bare minimum (1 GPU for 1 minute) (or 2, 4, 8x), on-demand, no-contract, and an API to automate it all. We also have 100G unlimited bandwidth and free IPv4. Oh and our 8x box specs are generally better... 122TB of enterprise NVMe.
Three years ago we had a live streaming autogen-seinfeld twitch stream; some kind of coherent story telling via AI doesn't seem beyond reach today, the tools just haven't fully matured yet.
Circular reasoning. If you can't answer WHY people should come to like AI movies, then you have nothing to say.
And yes, I'm well aware of the allegory of the cave. So is everyone. What I don't understand is why it's such a popular rhetorical device with people who have no discernible point but want to sound as if they do. It's actually quite ironic.
they're not doing enough to optimize AI content generators for dopamine release with animalistic obsession. Instead they focus on scientific indistinguishability, and people aren't liking that. IMO that has been an ongoing and increasingly costly mistake.
Younger generation who grow up with AI will just think it’s normal, like we think being connected to the internet via a rectangle you keep in your pocket is normal.
AI movies are not a "scientific idea". Liking them is a matter of taste, and there are plenty of things that never catch on.
My point is that you and I will probably never accept it - but our kids will never even think it’s weird in the first place.
So far not one commenter in this thread has articulated why AI movies are inevitable.
It's inevitable because you won't be able to tell the difference.
I see you responded to this point elsewhere in this thread, but frankly your reply is a non-sequitur. I'm not sure what you mean by it.
I recall eerily similar things said about Google Glass..
Maybe AI generation will be used in popular media more often, but purely AI generated content or AI brain rot seems to only appeal to a small crowd of people right now, and I don't see that crowd growing significantly.
Maybe it's a technology problem, as Google Glass was, but I think that's inseparable from the content it actually generates at this non-AGI stage.
Regardless, it sounds very uncertain and perhaps even unlikely that what we see being created now is the future.
[1] https://en.wikipedia.org/wiki/Planck%27s_principle
If you're talking about people firing up the ol' 5090 to make a "movie" about their favorite streamer falling madly in love with them for, ahem, personal use, I have no doubt that people will do that. And I will do everything in my power to avoid associating with such brain-rotted cretins.
Most people would use these tools for personal use, if nothing else. Seeing a celebrity, themselves, their friends, etc., act out any scenario they can think of is quite an appealing proposition. And porn, of course, for better or worse.
In the long-term, this has the potential to significantly change how media is created and consumed. Feature films produced by large studios will undoubtedly continue to exist, and they will also leverage the technology, but it's not difficult to imagine a new branch of personalized media becoming popular. The tools are practically already there; they just need to become more accessible, and slightly better.
> Most people would use these tools for personal use
Not what we're talking about. Not "personalized media", not large studios "leveraging the technology", not "visual effects".
See: "blockbuster movies produced by a guy in his basement for <$1000".
If you're unable to draw a line between the points I made and "blockbuster movies produced by a guy in his basement for <$1000", that's on you.
There is no line, and you never claimed there was in your original comment, so stop moving the goalposts. Vague language like "personalized media becoming popular" is not the same thing as "blockbuster movies".
Calling my answer "short-sighted" when you couldn't be bothered to read the thread or apparently even the thing I was replying to is, in fact, on you.
As a matter of fact, all the actually normal people I talk to about AI in person also find it offputting.
also, case in point, normal people don't dig through a random stranger's post history to look for an ad hominem opportunity, and instead evaluate individual posts by their contents. lol.
Porn is still taboo. It's understood that most people use it, but it's not exactly something you bring up in polite company.
Where on earth do you live that prostitution is "widely accepted by polite society"? You can go to jail for it where I am.
And I did address the rest of your comment. As I said, in my experience "normal" people do object to AI content. I don't know where you got the bit about "background checks" and being "allowed" to like stuff. Nobody I know had to be told to have an aversion to AI "art", it's a natural reaction.
AI is but a tool; if there is an artist using them, real art can be created, as with any other tool.
So far, AI-generated videos, and arguably photos, seem to only please wishful thinkers, or untalented artists dreaming of making it.
I don't imply the tech will never get to the tipping point, but so far it provides so little value that either we have many years to go, or it just won't happen.
Let's be optimists. It will eventually get there. But I doubt that, for any of the parallels you made, billions of people were hammered daily by overblown posts about the upcoming revolution.
The reasons for critiques have a lot to do with promotion fatigue. Hyperboles eventually exhaust their impact.
https://reddit.com/r/singularity/comments/1lq299r/postscarci...
https://reddit.com/r/midjourney/comments/1o6ickx/dreaming_on...
https://reddit.com/r/midjourney/comments/1n6mzig/how_to_buil...
https://reddit.com/r/aivideo/comments/1nwdjdn/the_perfect_bo...
https://reddit.com/r/aivideo/comments/1m8a9wz/pinkington_rop...
https://reddit.com/r/aivideo/comments/1n52kut/derek_the_agin...
https://reddit.com/r/midjourney/comments/1muwyah/still_here_...
https://reddit.com/r/DefendingAIArt/comments/1mttoi4/my_not_...
I think we'll see AGI first.
Probably never. If AI is good enough to cover all the skills needed to do what would currently make a blockbuster movie for less than $1000, the demand for movies will be small enough relative to supply that there will be no such thing as a “blockbuster movie”
On the other hand, I think the quality of movies and expectations will be a lot higher.
This is obviously true, but I don't see how it relates to the question being discussed. "Short videos" and "blockbuster movies" are clearly widely separated categories, despite both being audiovisual content of some kind.
No, that would require a radically different argument, in pretty much every way.
> if everyone can upload videos that anyone can watch, nobody will really be famous because fame will become very evenly distributed, right?
No, Youtube makes distribution cheap, but it doesn't substitute for most of the other things that differentiate between videos; most of the skills that provide variation between videos are still there, and not cheaply substituted via YouTube.
Edit: perhaps 12 Angry Men was good enough at the time.
I recently watched it for the first time, and it was one of the best movies I've seen. I can't believe how invested I was, even though the plot was so simple.
When AI slop figures out that formula, we are truly cooked
https://www.energiavfx.com
https://m.youtube.com/watch?v=bS5P_LAqiVg
I'm sure more will follow.
(loud music warning)
https://www.youtube.com/watch?v=1ohaFZllmUE
Before we see this and higher level of quality accessible to enthusiasts, we'll see these tools adopted by mainstream studios first, which is starting to happen.
I'm a firm "AI" skeptic, but if this technology has revolutionized anything, it has been image generation. A few years ago it was science fiction to have the quality of upscaling we take for granted today. I reckon the same will happen with video generation as well a few years from now. Unlike "ASI" and "AGI", these improvements are achievable with better engineering, and don't necessarily require a breakthrough.
[1]: https://news.ycombinator.com/item?id=44564697
I was in a few of the early meetings on the Helsinki site where I overheard some executives expressing their intention to go after Google. These people had some balls. No clue whatsoever unfortunately. But it was the right kind of ballsy move that Nokia could have pulled off with a bit more vision.
The name was more or less a LOL WHUT?! kind of thing and it flopped horribly with consumers. But still there was some nice stuff in there that wasn't half bad. It's just that the whole branding and rudderless direction doomed it. And of course it was all tied to a failing device software strategy. So when that failed the rest failed as well. I'm not even sure when they pulled the plug on OVI exactly. It was such a non event in the grand scheme of things (mass layoffs, sale of the phone division to MS and subsequent closure, etc.) Must have been around 2013ish I would say. I was gone by then.
1. Go to a friend's place
2. Usual drinks, whatever gets you going
3. Each person writes a prompt
4. Chain them together
5. Watch the resulting movie together
That sounds hilarious and I can't wait to try
I have fond memories of laughing until I was in tears when playing with a group of friends over drinks during the lockdowns in 2020. Something about the process just naturally results in hilarity (especially if you're in a group where you can be offensive).
It's like exquisite corpse for t-shirts. Or, in your case, shorts.
Whenever one of my friend groups is gathered we always make it a point to do an exquisite corpse story on a piece of paper while we’re inebriated in some way xD Video version will be wild
I've been using Ovi for about a week and it's a blast. Like all AI gen, it's a slot machine and even putting in good inputs might lead to bad outputs, but if you run it enough you'll get something good or usable.
I've definitely made many things that look and sound real with both I2V and T2V, albeit T2V tends to look more like 90s tv quality at times, but that also makes it seem more real. If you use Flux SPRO as the image source you can get some pretty realistic looking videos.
I do have a 5090, so it takes about 4 to 5 minutes to make a 5 second clip.
What is your setup? It took 2 hours for me on a 9950X3D with a 5090. Any idea what I could be missing? Or maybe some other variable is off; I was using default .yml values.
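When a 5-second clip takes hours on an RTX 5090, a common cause is a silent CPU (or fp32) fallback rather than anything Ovi-specific. This is a hypothetical pre-flight check, not part of Ovi's codebase; the `torch.cuda` calls are standard PyTorch API:

```python
# Hypothetical diagnostic: report what compute device PyTorch can see
# before kicking off a generation. It only reports; it changes nothing.
import importlib.util


def gpu_preflight() -> str:
    """Return a short status line describing the visible compute device."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch
    if not torch.cuda.is_available():
        return "CUDA not available: generation will fall back to CPU"
    name = torch.cuda.get_device_name(0)
    bf16 = "yes" if torch.cuda.is_bf16_supported() else "no"
    return f"{name} (bf16 supported: {bf16})"


print(gpu_preflight())
```

If this reports a CPU fallback, the hours-long runtime is explained before touching any .yml values.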
and here you are, clutching pearls about AI girlfriends. lol. lmao.
https://www.youtube.com/watch?v=DME86-QucsA
Great work all around though.
Also this model seems to benefit noticeably from having both Cuda >= 12.8 and Torch >= 2.8, and separately SageAttention over Flash 2. But I have yet to see any cache threshold with Easy or Tea that doesn’t get a bit postmodern.
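A version floor like "CUDA >= 12.8 and Torch >= 2.8" can be checked up front with a small stdlib helper. This is a sketch only: the version strings would come from `torch.__version__` and `torch.version.cuda`, and stripping a `+cu128`-style local suffix is an assumption about their format:

```python
# Minimal version-floor check for the "CUDA >= 12.8, Torch >= 2.8" advice.
# Assumes dotted version strings, optionally with a local suffix like
# "2.8.0+cu128" (the format torch.__version__ typically uses).
def meets_minimum(version: str, minimum: str) -> bool:
    """True if `version` is at least `minimum`, comparing major.minor numerically."""
    def major_minor(v: str) -> tuple:
        core = v.split("+")[0]       # drop "+cu128"-style suffixes
        parts = core.split(".")[:2]  # compare only major.minor
        return tuple(int(p) for p in parts)
    return major_minor(version) >= major_minor(minimum)


print(meets_minimum("2.8.0+cu128", "2.8"))  # Torch new enough
print(meets_minimum("12.6", "12.8"))        # CUDA too old
```

A numeric comparison matters here: a naive string comparison would rank "2.10" below "2.8".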