Show HN: Dayflow – A git log for your day
github.com
The feature isn't the problem.
i.e. apple watch for sleep, running, activity levels? it could really give a 360 view of your life
> Gemini leverages native video understanding for direct analysis, while Local models reconstruct understanding from individual frame descriptions - resulting in dramatically different processing complexity.
For people like me who haven't dabbled much with AI video processing and have no intuition for it, could you clarify the drawbacks of such a local-only approach vs what Gemini offers? I don't mean the performance or power/battery impact (that part is clear), just in terms of end-result and quality what the practical differences are.
I'm in the only-100%-offline camp here but would like to know what I'm missing out on since I won't even try Gemini here.
Some things I would like to be able to do with software like this:
- Identify the 'spark' of a distraction. For example, opening my email inbox to read a specific email also shows me many unrelated emails. These can easily be the cause of a 5-15 minute distraction. This information is often actionable. I installed browser plugins to hide my youtube suggested videos and my distractions went down. I made sure to close all unused windows to avoid catching a glimpse of unrelated work.
- Identify repeated tasks, and the cadence of those tasks. Do I manually make an invoice once a week for a particular edge case? Is the process basically identical every time? Could this be automated?
- How was I feeling before, during and after a task? (This is a very broad and intentionally not well-defined question, but I think it has the most promise for improving procrastination and task initiation).
FWIW it’s “poring over” when reading carefully.
From Merriam-Webster
“As a verb, pore means "to gaze intently" or "to reflect or meditate steadily." The verb pour has meanings referring to the falling or streaming of liquid (or things that move like liquid).”
Knowledge work is knowledge work, no point belittling colleagues in a different profession.
In particular, it was a document management system built as a plugin for MS Outlook. (ew)
Most users had no issue. However, one user in particular, a lawyer, would open and close a bunch of documents (using the built-in PDF viewer) and then the application would crash, taking Outlook with it, often requiring a restart.
I went over to view the behavior, and she was some kind of robot. Unlike her peers, she had 12 documents open at once, and she could update and bill (in minimum 6 or 7 minute increments) 12 customers' cases in 15 minutes. It was like meeting the Usain Bolt of law practitioners. My back-of-the-napkin math is that she billed like 3-4 hours for every hour she was online.
Open Email
Load Attachment
Review Attachment
Reply to email
Assign Email thread to case number
Close attachment.
12 times in 15 minutes.
The bug was that, after loading ~6 PDFs, the application would back off and wait to deallocate the memory. Later, when another PDF was loaded, it would randomly decide to write to that memory and go kaput.
Just to replicate the issue, I had to close and reopen pdfs so quickly my hands hurt.
It took 3 revisions of the bug report to get the software company to accept it and resolve it. And even then I think the pdf limit just increased, before we submitted another report and had it resolved permanently.
On that note, the principal of another law firm I supported would require us to cleanse his personal laptop of porn themed golf games he had downloaded on a regular basis.
The impression I get is that lawyers work, but the work is just unevenly distributed.
Sounds like a game Epstein and Trump (and some "enigmas") could've played at Mar-A-Lago. With cheating, though.
Not sure if the pun was intended, but I'm here for it
Anybody have a different experience?
She had been complaining the day before about having to reconstruct a huge bunch of little 0.1 entries involving e-mails to various individuals in cases. If it could be done automatically, through a local LLM? chef's kiss
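For what it's worth, the rounding part of those 0.1 entries is simple once the activities are captured. A minimal sketch, assuming each tracked activity arrives as a (case, description, seconds) tuple; that tuple shape and the `to_billing_entries` name are made up for illustration, and the per-activity minimum of one increment is an assumption about how the firm bills:

```python
def to_billing_entries(activities, increment_seconds=360):
    """Round each activity up to the next 0.1 hour (6 minute) increment.

    `activities` is a list of (case, description, seconds) tuples, a
    hypothetical format. Every activity is billed at least one increment.
    """
    entries = []
    for case, description, seconds in activities:
        # Ceiling division without floats, with a floor of one increment.
        increments = max(1, -(-seconds // increment_seconds))
        hours = round(increments * increment_seconds / 3600, 1)
        entries.append((case, description, hours))
    return entries


tracked = [
    ("Case A", "Email to J. Doe", 90),
    ("Case A", "Email to opposing counsel", 400),
    ("Case B", "Review attachment", 360),
]
for entry in to_billing_entries(tracked):
    print(entry)
# ('Case A', 'Email to J. Doe', 0.1)
# ('Case A', 'Email to opposing counsel', 0.2)
# ('Case B', 'Review attachment', 0.1)
```

The difficult piece is still the one the screen recorder would have to solve: deciding which case each e-mail belongs to.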
Trust me, law is definitely where you want to land this thing.
In all honesty, I have absolutely no negotiating power or decision-making authority for my firm, but it's a big one -- if that's a direction you want to go, can't guarantee I can swing enough weight, but I probably could find you the right people to talk to, give you an introduction.
This could help battle procrastination, organize your time in the long run, bill your clients more efficiently, etc. 20 years younger, hyper productive me would kill for such a product.
But then I recall when I accidentally suggested TimeRescue to my boss at one time, and suddenly he was skimming through everyone's daily logs to see if they were spending 100% of their time in business-facing apps.
When I first heard about "covid mouse mover devices" that faked activity for remote workers I thought it was a joke. Seriously.
But I'm afraid this is the dystopian future. Employers constantly looking at your screen and getting spreadsheets with your daily effort.
Overall, very disturbing product.
Another approach is to run OCR on 1FPS screenshots. Everything runs locally without draining the battery like an LLM would.
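A minimal sketch of what that loop could look like. The `capture` and `ocr` callables are placeholders you would back with a real screenshot API and a real OCR engine (both are assumptions here, nothing below calls an actual library), and skipping unchanged frames by hash is my own addition to keep the work down, not something claimed above:

```python
import hashlib
import time


def ocr_changed_frames(capture, ocr, frames, interval=1.0, sleep=time.sleep):
    """Grab `frames` screenshots, one per `interval` seconds, and OCR only
    the frames whose bytes differ from the previous capture.

    `capture()` must return the screenshot as bytes and `ocr(image_bytes)`
    must return text. Both are hypothetical hooks supplied by the caller.
    """
    texts = []
    last_digest = None
    for _ in range(frames):
        image_bytes = capture()
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest != last_digest:
            # Screen content changed since the last frame, so OCR it.
            texts.append(ocr(image_bytes))
            last_digest = digest
        sleep(interval)
    return texts


# Demo with fake frames instead of real screenshots.
fake_frames = iter([b"inbox", b"inbox", b"editor"])
result = ocr_changed_frames(
    capture=lambda: next(fake_frames),
    ocr=lambda image_bytes: image_bytes.decode().upper(),
    frames=3,
    sleep=lambda seconds: None,  # do not actually wait in the demo
)
print(result)  # ['INBOX', 'EDITOR']
```

An exact byte hash only catches pixel-identical frames; a blinking cursor or a clock would defeat it, so a real version would likely want a perceptual hash or a region diff.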
I'm reading through papers that suggest it should be possible to get SOTA performance on local models via distillation, and that's what I'll experiment with next.
Kudos particularly for the efforts you've gone to on explaining privacy implications.
If it’s recording 15 seconds, how often are you doing that? Once every 15m as the analysis interval is 15m?
So I'm not sure I buy the lightweight/low-impact claim.
Edit: Nvm, it seems it always records the display that is currently in focus. That is probably the better way to handle it, since it automatically solves the "ignore what's shown but not interacted with on secondary displays" problem.
github.com/mediar-ai/screenpipe
kind of sad it's macos only, i'm mostly windows user now :)
This is the error I got: reason: No valid observations generated from frame fallback
Replace PostgreSQL with Git for your next project for git data storing. https://news.ycombinator.com/item?id=4535144 https://devcenter.upsun.com/posts/why-you-should-replace-pos...
Consumed.today day-logging single-user microsite. https://consumed.today/ https://news.ycombinator.com/item?id=45351446
Cute serendipity, rule of three. Neat project too; conceptually it sounds like an amazing ability to be able to better watch ourselves. Doing it via screenshots & AI feels like a fun sense-making adventure that actually makes a lot of sense, that can maybe try to pick through & discern what the screen is doing in a lot of different scenarios.
1. "Create a reminder for reading this email at 5:00 pm" and this could infer what to do from the screenshot's description (plus a local MCP tool for calendar)
2. "Can you fetch that file from that project in that workspace and implement the pattern in the code on my vscode terminal?" It can lower the cognitive fatigue of typing and clicking in a bunch of places.
3. Take notes as I describe something on the screen. It could be for prompt composition e.g. get the link from my browser and the file on vscode and write code that does XYZ.
just wow
2025 is getting surreal online
Going a step further, "real time" (given processing delay) to help stay on task when the focus has shifted to something unrelated (maybe allow the individual to define this or say yes/no to train the prompts as it goes).
Anyways, it looks great. I also liked the _idea_ of Windows Recall, so to see something like this that can be privacy first is really nice.
Maybe patching https://github.com/JerryZLiu/Dayflow/blob/main/Dayflow/Dayfl... to say "Describe what you see on this computer screen in the style of Werner Herzog" would do it...
Funny enough, I had a similar idea a few weeks back; I jotted it down in my idea sketchpad. It felt a bit ambitious for an open-source side project, and I wasn't sure if it could even work with a local LLM. I was genuinely excited about it, nonetheless.
Now that I know it's totally viable, I've got even more reasons to build a Linux version myself.
Right?
Compliments for the Wizard - that one works perfectly, at least with Gemini. One little detail: you have a GitHub Star button in it that really was in a non-logical place and made me think.
As already seen in the comments there are lots of desires to add more data compared to just screen input.
Could be things like:
- Apple HealthKit / watch
- custom apps
- Phone logs
Also, as you rightly stated, improving your core feature needs a lot of focus.
It might be interesting to allow some kind of API / plugin area. So that people can expand on your core feature and add the desired parts. Might in the future expand to some kind of AppStore like feature with plugins.
That would keep your work focused and allow others to complete it according to their own vision, for themselves and for others.
Feel like something of this shape should have existed for a while, but this is very well executed!