
Yeah, also this means the death of archival efforts like the Internet Archive.



Welcome scrapers (IA, maybe Google and Bing) can publish their IP addresses and get whitelisted. Websites that want to prevent being on the Internet Archive can pretty much just ask for their website to be excluded (even retroactively).

[Cloudflare](https://developers.cloudflare.com/cache/troubleshooting/alwa...) lists the Internet Archive as operating from 207.241.224.0/20 and 208.70.24.0/21, so disabling the bot-prevention framework for connections from those ranges should be enough.
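For anyone who wants to do this outside of Cloudflare, here is a rough sketch of the check in Python, using only the standard-library `ipaddress` module. The two CIDR ranges are the ones quoted above; the names `ALLOWED_NETWORKS` and `is_allowlisted` are made up for illustration, and how you hook this into your actual bot-prevention layer depends entirely on your setup.

```python
import ipaddress

# CIDR ranges quoted in the comment above (attributed to the Internet Archive).
# Verify them against a current source before relying on them.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("207.241.224.0/20"),
    ipaddress.ip_network("208.70.24.0/21"),
]


def is_allowlisted(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allowlisted network.

    Hypothetical helper: unparseable input is treated as not allowlisted.
    """
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False
    # An IPv6 address is never "in" an IPv4 network, so this is simply False.
    return any(addr in net for net in ALLOWED_NETWORKS)
```

You would call `is_allowlisted(remote_addr)` before running the bot checks and skip them when it returns `True`. Note this only trusts the source address, so it assumes the address you see is the real peer and not a spoofable forwarded header.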


That's basically asking to close the market in favor of the current actors.

New actors have the right to emerge.


They have the right to try to convince me to let them scrape me. Most of the time they're thinly veiled data traders. I haven't seen any new company try to scrape my stuff since maybe Kagi.

Kagi is welcome to scrape from their IP addresses. Other bots that behave are fine too (Huawei and various other Chinese bots don't, so I've had to put an IP block on those).


No they don't.

There's no rule that you have to let anyone in who claims to be a web crawler.


So who decides that you can be one? Right now it's Cloudflare, a literal monopoly...

The truth is that I sympathize with the people trying to use mobile connections to bypass such a cartel.

What Cloudflare is doing now is worse than the web crawlers themselves and the legality of blocking crawlers with a monopoly is dubious at best.


which is why they will stop claiming to be one.


so what happened to competition fostering a better outcome for all then?


A large chunk of the Internet Archive's snapshots are from ArchiveTeam, where "warriors" bring their own IPs (and they crawl respectfully!). Save Page Now is important too, but you don't realise what is useful until you lose it.


This sounds like it would be a good idea. Create a whitelist of IPs and block the rest.



