Ask HN: Is it likely AI training models could start training on personal files?
There are allegations that Gemini is already trained on this data.
>Does anybody believe Google (and other companies) might soon start scanning personal files we hold on their storage facilities? Is that a legal possibility for them?
Free-account users have already agreed to their data being used.
>It seems to me that it's a huge pool of fresh training data that they would inevitably want to get their hands on. And given how much they have already trained on, it seems the next logical step from a business standpoint.
I'm actually not so sure they have, or ever will. The problem isn't quantity, it's quality. Sure, it could train on a bunch of trash in people's drives, but then at inference time it'll produce trash.
>Clearly they would need to change their privacy policies and terms of agreements and inform users of these changes. Is it possible they could slip this sort of change in without much notice?
You've been agreeing to them reading the content of your files for antivirus and antispam purposes for a very long time. Starting to do it for AI requires no change.
>I was also wondering if anybody might have pointers for the best strategy to securely backup offline. I don't want to just shift my family photos from one company to another where business execs are training their own model. Anybody else handled this recently?
One of the useful apps I found was FolderSync, which makes backups to CIFS/SMB shares possible.