pg_flo looks like a very interesting project we might end up using. I like the archival to S3, and having a cheaper version of this kind of tooling that works at a lower scale, GBs instead of TBs worth of data (TBs being what I think Debezium targets), which is also something I can easily test locally.
The work/effort I need to put into Kafka etc. for Debezium is only a short-term effort, but I'm still weighing the hassle.
Better yet, see which clients access /robots.txt and find the bots from there. No human looks at robots.txt :)
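A rough sketch of that idea using Go's standard net/http, assuming you serve robots.txt yourself: log every client that fetches it so those IPs can be reviewed as likely crawlers. The log format and the permissive robots body are illustrative, not anything prescribed above.

```go
package main

import (
	"log"
	"net"
	"net/http"
)

func main() {
	http.HandleFunc("/robots.txt", func(w http.ResponseWriter, r *http.Request) {
		// Anything fetching robots.txt is almost certainly automated; record it.
		ip, _, _ := net.SplitHostPort(r.RemoteAddr)
		log.Printf("robots.txt fetched by %s (UA: %q) - likely a bot", ip, r.UserAgent())

		w.Header().Set("Content-Type", "text/plain")
		w.Write([]byte("User-agent: *\nDisallow:\n"))
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```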
Add a CAPTCHA, or rate limit by IP and return 429 once a client exceeds the limit. Using popular solutions like Cloudflare could help reduce the load, and you can restrict by country. Alternatively, put up a login page that only asks the visitor to solve a CAPTCHA and then issues a session.
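A minimal sketch of the rate-limit-by-IP part in Go, assuming a plain net/http server: each IP gets a fixed request budget per minute and gets a 429 once it's exhausted. The 60 req/min budget, the in-memory counters, and the fixed-window reset are all illustrative choices; a real deployment would more likely lean on Cloudflare or the reverse proxy in front.

```go
package main

import (
	"log"
	"net"
	"net/http"
	"sync"
	"time"
)

type ipLimiter struct {
	mu     sync.Mutex
	counts map[string]int
	limit  int
}

func newIPLimiter(limit int, window time.Duration) *ipLimiter {
	l := &ipLimiter{counts: make(map[string]int), limit: limit}
	// Reset all counters at the end of each window (fixed-window limiting).
	go func() {
		for range time.Tick(window) {
			l.mu.Lock()
			l.counts = make(map[string]int)
			l.mu.Unlock()
		}
	}()
	return l
}

func (l *ipLimiter) allow(ip string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.counts[ip]++
	return l.counts[ip] <= l.limit
}

func (l *ipLimiter) middleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ip, _, _ := net.SplitHostPort(r.RemoteAddr)
		if !l.allow(ip) {
			// Over budget: tell the client to back off.
			http.Error(w, "too many requests", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	limiter := newIPLimiter(60, time.Minute) // 60 requests per IP per minute
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	log.Fatal(http.ListenAndServe(":8080", limiter.middleware(mux)))
}
```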
I... I do... sometimes. Mostly out of curiosity, when the thought randomly pops into my head. I mean, I know I might be flagged by the website as someone weird/unusual/suspicious, but sometimes I do it anyway.
Btw, do you know if there's an easter egg in Hacker News' own robots.txt? Because there might be.
Env variables whose names contain PASSWORD or SECRET should be ignored (redacted) by logging and monitoring systems. Most of the web has been built on the trust that conventions like this are followed.
Common secrets used on the server side: `JWT_SECRET`, `DATABASE_PASSWORD`, `PGPASSWORD`, `AWS_SECRET_TOKEN`, etc.
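For illustration, a hedged sketch of that convention in Go: before printing the environment for debugging, mask any variable whose name contains one of a few sensitive markers. The marker list and the `[REDACTED]` placeholder are assumptions, not an established spec.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Substrings that mark a variable name as sensitive (assumed list).
var sensitive = []string{"PASSWORD", "SECRET", "TOKEN", "KEY"}

func redactEnv(entry string) string {
	name, value, ok := strings.Cut(entry, "=")
	if !ok || value == "" {
		return entry
	}
	upper := strings.ToUpper(name)
	for _, marker := range sensitive {
		if strings.Contains(upper, marker) {
			return name + "=[REDACTED]"
		}
	}
	return entry
}

func main() {
	// e.g. JWT_SECRET, DATABASE_PASSWORD, PGPASSWORD would all be masked here.
	for _, entry := range os.Environ() {
		fmt.Println(redactEnv(entry))
	}
}
```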
As a long-time developer, I find this breaks the standard for backend apps, which mostly follow the 12 Factor App[1]. This approach introduces a new dependency just for fetching secrets. I see all the new open-source projects using "paid" or "hosted" solutions. It is no longer easy/simple to host a fully open-source app without external dependencies. (I understand -- things are getting complicated, with S3 for storage etc.)
AWS shut down their own service; if AWS can "easily" integrate with GitLab, I see a lot of potential on the deployment side to increase AWS revenue.