Yep, absolutely. But the side effect I'm referring to (and probably wasn't clear enough about) is that the oplog is globally shared across the replica set. So even if your queue collection tops out at maybe 10k documents, if another collection in the same deployment is getting 10mm docs/min, your queue's oplog window is going to be artificially limited too.
Putting the queue in its own deployment is good insulation against this (assuming you don't need to use aggregate() across the queue and other collections, obviously).
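If you want to see how much headroom you actually have, here's a rough sketch (assuming pymongo; the connection string and replica set name are placeholders) that reports the current oplog window, roughly what rs.printReplicationInfo() shows in the shell:

```python
# Rough sketch: measure the oplog window on a replica set member.
# Assumes pymongo and read access to the "local" database; connection
# details are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
oplog = client["local"]["oplog.rs"]

first = oplog.find().sort("$natural", 1).limit(1).next()
last = oplog.find().sort("$natural", -1).limit(1).next()

# "ts" is a BSON timestamp; .time is seconds since the epoch.
window_seconds = last["ts"].time - first["ts"].time
print(f"oplog window: {window_seconds / 3600:.1f} hours")
```

If heavy ingress elsewhere shrinks that window below however long your queue consumers might be down, a change stream or tailable cursor on the queue can lose its resume point.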
I do agree, but listen... this is supposed to be a handy solution. You know, my app already uses MongoDB, so why do I need another component if I can run my notifications with a collection?
Also, I'm a firm believer that you should not push actual data through notifications. Notifications are meant to wake other systems up, not to carry gigabytes of data. You can put your data in another store and notify: "Hey, here is data for 10k new clients that needs to be processed. Cheers!"
The message is meant to ensure a correct processing flow (the message has been received, processed, and if it fails somebody else will pick it up, etc.), but it does not have to carry all the data.
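To make that concrete, the notification itself can stay tiny: just a pointer plus enough metadata for the consumer to act on it. A rough sketch with pymongo (the collection and field names here are made up for illustration):

```python
# Rough sketch: a notification is a small pointer document, not the payload.
# Assumes pymongo; collection and field names are illustrative only.
from datetime import datetime, timezone
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["app"]

db.notifications.insert_one({
    "type": "clients_batch_ready",
    "batch_ref": "imports/2024-05-01/clients-0001",  # where the real data lives
    "count": 10_000,
    "created_at": datetime.now(timezone.utc),
})
```

The consumer marks the notification as processed (or lets another worker pick it up on failure) and fetches the 10k client records from wherever batch_ref points.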
I have fixed at least one platform that "reached limits of Kafka" (their words, not mine) and "was looking for expert help" to manage the problem.
My solution? I had the publishing component upload the data as compressed JSON to S3 and post a notification with some metadata and a link to the JSON, and the client fetch and parse the JSON. Bam, suddenly everything works fine, no bottlenecks anymore. All for the cost of maybe three pages of code.
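In code it was not much more than this sketch (assuming boto3 and kafka-python; the bucket, topic, and message shape here are illustrative, not the actual system):

```python
# Rough sketch of the claim-check fix: ship the bulk data to S3 as compressed
# JSON and publish only a small pointer message. Assumes boto3 and
# kafka-python; bucket, topic, and field names are placeholders.
import gzip
import json
import uuid

import boto3
from kafka import KafkaProducer

s3 = boto3.client("s3")
producer = KafkaProducer(bootstrap_servers="localhost:9092")

BUCKET = "example-batches"      # placeholder
TOPIC = "clients-batch-ready"   # placeholder

def publish_batch(records):
    key = f"batches/{uuid.uuid4()}.json.gz"
    s3.put_object(Bucket=BUCKET, Key=key,
                  Body=gzip.compress(json.dumps(records).encode("utf-8")))

    # The message carries metadata and a link, never the data itself.
    notice = {"count": len(records), "s3_bucket": BUCKET, "s3_key": key}
    producer.send(TOPIC, json.dumps(notice).encode("utf-8"))
    producer.flush()
```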
There are few situations where you absolutely need to track so many individual objects that you have to start worrying whether hard drives are large enough. And I have managed some pretty large systems.
> I do agree, but listen... this is supposed to be a handy solution. You know, my app already uses MongoDB, so why do I need another component if I can run my notifications with a collection?
We're in agreement; I think we may just be talking past each other. I use mongo for exactly the use case you're describing (messages as signals, not as payloads of data).
I'm just sharing, for anyone else reading, a footgun that bit me fairly recently in a 13TB replica set dealing with 40mm docs/min of ingress.
(It's a high-resolution RF telemetry service, but the queue mechanism is only a minor part of it and never grows beyond maybe 50-100 MB. Its oplog window still got starved by the unrelated ingress.)
You have a single mongo cluster that's writing 40M docs a minute? Can you explain how? I don't think I've ever seen a benchmark for any DB that's gotten above 30k writes/sec.
Sorry for the late reply here, just noticed this. You're correct, that figure was wrong; the metric was supposed to be per day, not per minute. It's actually closer to 47mm per day now, which works out to roughly 33k docs/min (47mm / 1,440 minutes).
> I don't think I've ever seen a benchmark for any DB that's gotten above 30k writes/sec
Mongo's own published benchmarks note that a balanced YCSB workload of 50/50 read/write can hit 160k ops/sec on dual 12-core Xeon-Westmere w/ 96GB RAM [1].
Notably, that figure was optimized for throughput, and the journal was not flushed to disk regularly (all data since the last WiredTiger checkpoint would be lost in the event of a failure). Even in the durability-optimized scenario, though, mongo still hit 31k ops/sec.
Moving beyond just MongoDB, Cockroach has seen 118k inserts/sec on an OLTP workload [2].