Hacker News

> If you have large messages and use keepalives (and you'll need keepalives), you need to write your own message fragmentation.

I'm confused by what you mean by that. Do you mean "large" as in "take a long time to process in the consumer"? If so, and if your consumer is not issuing heartbeats concurrently with message processing, then that is true.
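For what it's worth, here's a rough sketch of what I mean by heartbeating concurrently with processing. It assumes a hypothetical `send_heartbeat` callable that stands in for whatever your client library uses to service the connection (in pika's `BlockingConnection` I believe `process_data_events()` plays that role, but check your library). The slow work goes to a worker thread so the thread that owns the connection is free to keep it alive:

```python
import threading
import time
from typing import Callable


def process_with_heartbeats(
    work: Callable[[], object],
    send_heartbeat: Callable[[], None],
    interval: float = 1.0,
) -> object:
    """Run `work` in a worker thread while the calling thread heartbeats."""
    result = {}

    def runner() -> None:
        try:
            result["value"] = work()
        except BaseException as exc:  # hand the failure back to the caller
            result["error"] = exc

    worker = threading.Thread(target=runner, daemon=True)
    worker.start()
    while worker.is_alive():
        send_heartbeat()               # hypothetical: services the connection
        worker.join(timeout=interval)  # wait, but wake up to heartbeat again
    if "error" in result:
        raise result["error"]
    return result["value"]
```

The ack for the message would still be sent from the connection-owning thread after this returns; that part is left out here.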

> There are no python libs that just work.

Completely agree. Having hacked on and patched the code inside Celery, it's really quite a bummer. I think this is because the Python libs try to abstract over things that ... just straight up can't be abstracted away given the semantics of AMQP: specifically connection-drop-detection, "resumption" of a consume (not really possible; this isn't Kafka), and the specific error code classes (connection-closed vs channel-closed vs information).

> If you have a single open connection, it will get stuck from time-to-time.

Are you talking about publishing connections? Consuming connections? One used for both? What does "stuck" mean? I'd be interested in hearing more about this.

> exactly-once delivery isn't a thing

Kinda pedantic, but exactly once delivery is possible in some very restricted situations (see Kafka's implementation of this guarantee: https://www.confluent.io/blog/exactly-once-semantics-are-pos...). Exactly once processing is what's tough-née-impossible. So yeah, idempotence is great.
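A minimal sketch of the idempotence idea, assuming every message carries a stable unique ID (the `message_id` argument here is hypothetical, as is the class name). The seen-set is in memory only to keep the example short; a real consumer would need a durable store, ideally updated atomically with the side effect:

```python
from typing import Callable, Set


class IdempotentHandler:
    """Wrap a handler so redelivered messages are processed at most once."""

    def __init__(self, handler: Callable[[bytes], None]) -> None:
        self._handler = handler
        self._seen: Set[str] = set()

    def handle(self, message_id: str, body: bytes) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False              # duplicate delivery: skip it
        self._handler(body)
        self._seen.add(message_id)    # record only after the handler succeeds
        return True
```

With this in place, at-least-once delivery plus a duplicate check gets you effectively-once processing, as long as the ID store itself survives crashes.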




> I'm confused by what you mean by that.

By large, I mean 10+ MByte.

> Completely agree. Having hacked on and patched the code inside Celery, it's really quite a bummer.

I don't understand what the point of celery is. Literally everything I do requires /some/ persistent state in the workers, and there's no way to do that with celery.

> Are you talking about publishing connections? Consuming connections? One used for both? What does "stuck" mean? I'd be interested in hearing more about this.

TCP connections. As in, a connection to the server from a consumer. High-latency connections seem to exacerbate the issue.

I think the server-side and client-side state machines get out of sync, and things just stop until the keepalives/heartbeats cause the connection to reset. That's a long time to wait with no messages.

I also ran into the issue that basically every python library had at least one or two locations where `read()` was called without a timeout, but that was at least easier to fix.
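The fix was roughly of this shape, sketched here against a plain stdlib socket rather than any particular AMQP library (the helper name `recv_with_timeout` is mine, not from any of those libs):

```python
import socket


def recv_with_timeout(sock: socket.socket, nbytes: int, timeout: float) -> bytes:
    """Read up to `nbytes`, raising TimeoutError instead of blocking forever."""
    previous = sock.gettimeout()
    sock.settimeout(timeout)
    try:
        return sock.recv(nbytes)
    except socket.timeout as exc:
        raise TimeoutError(f"no data within {timeout}s") from exc
    finally:
        sock.settimeout(previous)  # restore whatever the caller had set
```

Once a read can time out, the caller gets a chance to notice a dead connection and reconnect instead of hanging.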

> Kinda pedantic, but exactly once delivery is possible in some very restricted situations (see Kafka's implementation of this guarantee: https://www.confluent.io/blog/exactly-once-semantics-are-pos...). Exactly once processing is what's tough-née-impossible. So yeah, idempotence is great.

Well, it isn't really a thing, so you at least shouldn't depend on it being a thing for your architecture if possible.


> By large, I mean 10+ MByte.

OK. Did Rabbit or your client libraries bug out when sending single giant messages? And what does message fragmentation have to do with keepalives? By fragmentation I assume you mean splitting one logical message up over multiple AMQP messages (or something else?). And by keepalives, do you mean connection heartbeats or TCP keepalives?
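To make sure we're talking about the same thing, this is roughly what I'd picture by application-level fragmentation. It's only a sketch under my own assumptions: the `(message_id, index, total, chunk)` layout and the names `fragment` and `Reassembler` are made up for illustration, and each tuple would be published as its own AMQP message:

```python
import uuid
from typing import Dict, Iterator, Optional, Tuple

Fragment = Tuple[str, int, int, bytes]  # (message_id, index, total, chunk)


def fragment(payload: bytes, chunk_size: int) -> Iterator[Fragment]:
    """Split one logical message into numbered chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    message_id = uuid.uuid4().hex
    total = max(1, -(-len(payload) // chunk_size))  # ceiling division
    for index in range(total):
        start = index * chunk_size
        yield message_id, index, total, payload[start:start + chunk_size]


class Reassembler:
    """Collect chunks (in any order) and return the payload once complete."""

    def __init__(self) -> None:
        self._parts: Dict[str, Dict[int, bytes]] = {}

    def add(self, frag: Fragment) -> Optional[bytes]:
        message_id, index, total, chunk = frag
        parts = self._parts.setdefault(message_id, {})
        parts[index] = chunk
        if len(parts) < total:
            return None               # still waiting for more chunks
        del self._parts[message_id]
        return b"".join(parts[i] for i in range(total))
```

Note this ignores the hard parts: chunks that never arrive, chunks landing on different consumers, and acking only once the whole message is assembled.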

> Literally everything I do requires /some/ persistent state in the workers, and there's no way to do that with celery.

Sure there is. In-memory caches persist between requests. And there's always sqlite and friends. Celery's more intended for the "RPC/fire-and-forget" case than stateful workloads, but it's not too painful to use those with it. And you get the benefits of its (reasonably) hardened connection/heartbeat management, which may help with some of your other issues.
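As a sketch of the in-memory case, assuming the worker process is long-lived: module-level state is shared by every task that runs in that process. The names `_cache` and `lookup` are hypothetical, and this is plain Python rather than anything Celery-specific:

```python
from typing import Callable, Dict

# Module-level state: lives as long as the worker process does.
_cache: Dict[str, str] = {}


def lookup(key: str, load: Callable[[str], str]) -> str:
    """Return a cached value, calling `load` only on the first request."""
    if key not in _cache:
        _cache[key] = load(key)
    return _cache[key]
```

The catch is that this state is per-process and disappears on restart, so it only works for things you can rebuild.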

Basically every time I've seen code that rolled its own bespoke consumer loop for RabbitMQ, it was wrong in some fundamental ways; the state machine on the consumer side did indeed get out of whack, and badly. Best to outsource the "keep the connection alive, establish subscription, detect failures" work to a higher-level library (like Celery) that provides a long-lived consumer so your code can just be occupied with data processing.



