wikimedia:

UNLIMITED WIKIPEDIA SIPHONING STARTING AT 100K USD PER YEAR. AND BRO, TRAIN WHATEVER YOU WANT WITH IT, IT'S FREE CULTURE! LOL. CHECK OUR COLLAB WITH HUGGING FACE MY DUDES. AI4LYFE!

https://enterprise.wikimedia.com

also wikimedia:

oh noooes... nasty ai generated content is littering our commons, who has time on the weekend to volunteer and clean up wikipedia pages? *sniffle*

https://en.wikipedia.org/wiki/Wikipedia:WikiProject_AI_Cleanup

what. a. circus.

It's worth pointing out the role of #commoncrawl in all of this. Their aim was "beneficial": instead of every research group scraping the web separately (hammering all our servers), they decided to do it once as a public pool of data for research. But:
(a) they do nothing to help respect authors' licensing (e.g. "no-derivatives"/"share-alike" #creativecommons choices);
(b) they hide behind US "fair use" law, but they do nothing to ensure the data will only be used for fair-use purposes.