Do Not Write Docs Manually, Let Video2docs Do It for You
Posted 3 months ago · Active 3 months ago
video2docs.com · Tech · story
Sentiment: supportive, positive
Debate: 20/100
Key topics
Documentation
Automation
Video Processing
The post promotes a tool called video2docs that automatically generates documentation from videos, with the community showing interest and some discussion around its potential applications and limitations.
Snapshot generated from the HN discussion
Discussion Activity
Light discussion
First comment: N/A
Peak period: 1 comment in 0-2h
Avg / period: 1
Key moments
01. Story posted: Oct 24, 2025 at 3:59 PM EDT (3 months ago)
02. First comment: Oct 24, 2025 at 3:59 PM EDT (0s after posting)
03. Peak activity: 1 comment in 0-2h (hottest window of the conversation)
04. Latest activity: Oct 25, 2025 at 1:22 PM EDT (3 months ago)
Story ID: 45698504 · Type: story · Last synced: 11/17/2025, 9:14:05 AM
The idea is quite simple: recently I've had to write more and more docs (most often how-tos and guides for company systems) and got a bit tired of it. I decided it would be cool if I could just record a video of myself clicking through an app (or multiple apps, it doesn't matter) and then analyze the video content, even without audio narration. That's how video2docs was born! I plan to add audio analysis too, for even better documentation, but for now I'm happy with how it works without it.
You can choose from 10 LLM models for video analysis, pick a documentation style (tutorial, how-to, quickstart...), and, of course, choose whether to include screenshots in the generated Markdown docs. Yay, no need to take screenshots manually! :)
I hope someone else might find this useful. I will continue working on this project!
Is there anything you can share about the architecture or pipeline you used for it? A high-level overview would be enough.
I’m guessing you’re doing video-to-image, image-to-text, and then text-to-docs, right? Since not all of the models you mentioned are multimodal.
More or less. I have a Python worker that does the video processing: splitting the video into frames, deduplicating frames, running LLM analysis on the frames, and then generating docs from that information. Audio narration analysis will be added soon too!
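As a rough illustration of the deduplication step mentioned above, here's a minimal sketch that keeps a frame only when it differs enough from the last kept frame, using mean absolute pixel difference. This is an assumption about how such a step could work, not the project's actual implementation; `dedupe_frames` and the threshold value are hypothetical, and a real worker would likely decode frames with a library such as OpenCV or ffmpeg rather than use plain lists.

```python
def dedupe_frames(frames, threshold=0.02):
    """Drop frames that are nearly identical to the previously kept frame.

    `frames` is a list of grayscale frames, each a flat list of pixel
    intensities (0-255). A frame is kept when the mean absolute pixel
    difference from the last kept frame exceeds `threshold`, expressed
    as a fraction of the full 0-255 range.
    """
    kept = []
    last = None
    for frame in frames:
        if last is None:
            kept.append(frame)
            last = frame
            continue
        diff = sum(abs(a - b) for a, b in zip(frame, last)) / (255 * len(frame))
        if diff > threshold:
            kept.append(frame)
            last = frame
    return kept

# Toy example: three 4-pixel "frames"; the middle one is a
# near-duplicate of the first and gets dropped.
frames = [
    [10, 10, 10, 10],       # scene A
    [10, 10, 11, 10],       # scene A again (tiny change)
    [200, 200, 200, 200],   # scene B (big change)
]
print(len(dedupe_frames(frames)))  # → 2
```

Only the surviving frames would then be sent to the LLM for analysis, which keeps API costs down for long recordings where the screen barely changes between frames.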