My current architecture is built around batch ingestion[1] and doesn’t (yet) have a way to do incremental updates. This is great for coverage (there are a lot of long-tail results in my search engine that aren’t in Google!), but it does mean the index lags behind the source sites and the results aren’t instantly up to date.
[1]: e.g. for StackOverflow, I download an XML dump of the entire site once a quarter: https://search.feep.dev/blog/post/2021-09-04-stackexchange
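
To make "batch ingestion" concrete, here is a minimal Python sketch of rebuilding a document list from one XML dump. It is an illustration only, not the actual code behind the search engine. It assumes a Stack Exchange-style dump made of `<row>` elements with `Id`, `Title` and `Body` attributes; `ingest_dump` and `SAMPLE` are hypothetical names made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def ingest_dump(source):
    """Rebuild the document list from scratch by streaming one XML dump."""
    docs = []
    # iterparse reads the file incrementally instead of loading it all at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            docs.append({
                "id": elem.get("Id"),
                "title": elem.get("Title", ""),  # assumed: some rows have no title
                "body": elem.get("Body", ""),
            })
            elem.clear()  # drop the element's data once it has been copied


    return docs


# Tiny stand-in for a real dump file (made-up sample data).
SAMPLE = (
    b'<posts>'
    b'<row Id="1" Title="First question" Body="Hello" />'
    b'<row Id="2" Body="An answer" />'
    b'</posts>'
)

index = ingest_dump(io.BytesIO(SAMPLE))
```

Every run starts from an empty list and replaces the previous result wholesale, so anything posted after the dump was taken only shows up at the next run. That is the lag described above; an incremental design would instead add or update individual documents as they change.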