
My current architecture is built around batch ingestion[1], and doesn’t (yet) have a way to do incremental updates. This is great for getting coverage—there are a lot of long-tail results in my search engine but not in Google!—but it does mean there’s more lag and the results aren’t instantly up-to-date.

[1]: e.g. for StackOverflow, I download an XML dump of the entire site once a quarter: https://search.feep.dev/blog/post/2021-09-04-stackexchange
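
For anyone wondering what a batch rebuild like this could look like, here is a rough sketch, not my actual pipeline. It assumes the dump is a single XML file of `<row>` elements with `Id`, `Title`, and `Body` attributes; `iter_posts`, `rebuild_index`, and the in-memory dict standing in for the index are all made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_posts(source):
    """Stream (id, title, body) tuples from a dump without loading it whole.

    Assumes each post is a <row> element carrying Id/Title/Body attributes.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            yield elem.get("Id"), elem.get("Title", ""), elem.get("Body", "")
            elem.clear()  # release the element's contents once it has been read


def rebuild_index(source):
    """Build a fresh index from scratch; nothing from the previous run is reused."""
    index = {}  # hypothetical stand-in for a real search index
    for post_id, title, body in iter_posts(source):
        index[post_id] = (title, body)
    return index


# Tiny inline sample in place of a real multi-gigabyte dump file.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="1" PostTypeId="1" Title="How do I parse XML?" Body="&lt;p&gt;Question&lt;/p&gt;" />
  <row Id="2" PostTypeId="2" Body="&lt;p&gt;Answer&lt;/p&gt;" />
</posts>"""

index = rebuild_index(io.BytesIO(SAMPLE))
```

Because every run starts from an empty index, anything posted after the dump was taken stays invisible until the next full rebuild, which is where the lag comes from.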



