I'm not a data guy. Can you explain this part a bit more, if you have time?
> you have realtime data ingestion, you will also have to build tooling to re-ingest older data that has changed or needed to be amended. This will end up looking like a 'lambda architecture'
Lambda architecture for data processing, as popularized by Nathan Marz et al [0], has two processing paths: the Batch layer and the Stream (speed) layer. At a high level, Batch trades freshness for quality (results are accurate but stale), whilst Stream optimises for freshness at the expense of quality [1].
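To make the two paths concrete, here's a minimal sketch in plain Python (toy in-memory structures and a made-up event shape, not a real Hadoop/Storm deployment): a batch job periodically recomputes an accurate view from the master dataset, a stream handler keeps an approximate view fresh, and queries merge the two.

```python
from collections import defaultdict

master_log = []                   # immutable log of raw events (toy schema)
batch_view = {}                   # accurate counts, recomputed periodically
realtime_view = defaultdict(int)  # approximate counts since the last batch run

def on_event(event):
    """Speed layer: incremental update -- fresh, but approximate under failure."""
    master_log.append(event)
    realtime_view[event["user"]] += 1

def batch_recompute():
    """Batch layer: full recompute from the master dataset -- slow but
    authoritative, and the natural place amended/late data gets picked up."""
    global batch_view
    view = defaultdict(int)
    for event in master_log:
        view[event["user"]] += 1
    batch_view = dict(view)
    realtime_view.clear()  # realtime view now only covers post-batch events

def query(user):
    """Serving layer: merge stale-but-correct with fresh-but-approximate."""
    return batch_view.get(user, 0) + realtime_view.get(user, 0)

on_event({"user": "alice"})
batch_recompute()
on_event({"user": "alice"})
print(query("alice"))  # 2: one from the batch view, one from the realtime view
```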
I believe what GP means by Lambda is that you'd need a system that batch-processes the data to be amended or changed (reprocessing older data) but stream-processes whatever is required in real time [2].
An alternative is the Kappa architecture, initially proposed by Jay Kreps [3][4], co-creator of Apache Kafka.
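Kreps' observation was that if your log retains enough history, you don't need a separate batch codebase: "reprocessing" is just replaying the log through a new version of the same stream job into a fresh output table, then switching reads over. A rough sketch of that idea (hypothetical function names, a plain list standing in for a Kafka topic):

```python
def run_stream_job(log, process, output_table, from_offset=0):
    """One stream-processing codepath serves both live and replayed data:
    reprocessing is just re-running the job from offset 0 into a new table."""
    for offset in range(from_offset, len(log)):
        process(log[offset], output_table)
    return len(log)  # next offset to resume from

def count_v2(event, table):  # fixed/updated processing logic
    table[event["user"]] = table.get(event["user"], 0) + 1

log = [{"user": "alice"}, {"user": "bob"}, {"user": "alice"}]
output_v2 = {}
run_stream_job(log, count_v2, output_v2)  # replay everything through v2
print(output_v2)                          # {'alice': 2, 'bob': 1}
# ...then atomically switch reads from the v1 table to output_v2.
```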
In theory this sounds great, but you have to account for processing capacity.
While compute is getting cheaper, one of the key reasons the streaming layer in lambda sacrifices quality for throughput is compute capacity (as well as timing). If you have to feed already-stored data through the same streaming pipe, you either have to maintain a lot of excess capacity, be willing to pay for that additional burst, or accept latency in your results (assuming you can keep up with your incoming workload and not lose data). There is no free lunch.
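A back-of-envelope example of that trade-off (all numbers invented for illustration): with live traffic consuming most of your pipeline's capacity, replaying even a modest backlog through the leftover headroom takes a long time.

```python
live_rate = 50_000     # events/sec arriving in real time (assumed)
capacity = 65_000      # events/sec the pipeline can sustain (assumed)
headroom = capacity - live_rate        # 15,000 events/sec left for replay

backlog_days = 30
backlog = live_rate * 86_400 * backlog_days   # ~130 billion stored events

replay_days = backlog / headroom / 86_400
print(f"replaying {backlog_days} days of history takes ~{replay_days:.0f} days")
# -> ~100 days; replaying quickly instead requires burst capacity at a
#    multiple of the live rate, which is exactly the cost being described
```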