
[flagged]



> No, put up a loginwall or paywall, authenticate users, and go private.

We know for a fact that AI companies don't respect that. If they want data that's behind a paywall, they'll jump through hoops to take it anyway.

https://www.theguardian.com/technology/2025/jan/10/mark-zuck...

If they don't have to abide by "norms" then we don't have to for their sake. Fuck 'em.


[flagged]


> the law explicitly allows scraping and crawling.

Nepenthes also allows scraping and crawling, for as long as you like.


this is a very US-ian view of the world

my site is not in the US, I am not a US citizen. US law does not apply to me.

under UK law: robots.txt is an access control mechanism (weak or otherwise)

knowingly bypassing it is likely a criminal offence under the Computer Misuse Act

good luck suing me because you got stuck when you smashed my window and climbed through it



