Hacker News
Scaling to 1M active GraphQL subscriptions on Postgres (github.com/hasura)
353 points by tango12 on May 31, 2019 | hide | past | favorite | 80 comments



Just another little shout out for Hasura here. It’s a really lovely system to use and the team behind it are awesome to deal with. Super friendly and helpful.

We evaluated a lot of stuff recently before settling on Hasura: Prisma, AWS AppSync, Postgraphile, rolling our own Socket.IO backend, etc. I had a few hesitations about Hasura. In particular, I was hoping to lean on Postgres RLS for the authorisation, and to be able to customise the GraphQL endpoints a bit more. After hearing the rationale from the guys about how their system is better than RLS, and seeing the patterns they have for implementing the logic, I’m really happy we made the leap.

We’re still in early days, but it’s been really solid and a real pleasure to use.

(No affiliation, but happily paying for their support licence)


Out of curiosity, what is the rationale for abandoning RLS?

In a previous project, I used Postgraphile and enjoyed it. Sure, RLS and PLPGSQL were a bit awkward, but I knew they were well-tested and had lots of eyes on them, so I felt comfortable that if an RLS policy blocked access to a row, then that particular role wasn't getting access. I also enjoyed working with a text-based format that more clearly fit within my other workflows of, well, working with other text-based formats. :)

Another part of it is that I'm blind, and Hasura seems to make doing things outside of its web interface somewhat painful. Sure I can write a YAML migration by hand, but the YAML migration format seems more machine-friendly than user-friendly. Yesterday I needed to create a function, and the SQL web interface wasn't showing me the line number on which my function had an error. Ultimately I dropped to the psql command line, cut-and-pasted the function in manually, and right away got back the line number and fixed the issue.

Please don't get me wrong. I'm not trying to hate on Hasura. But the fact that I can't just drop to a non-YAML text-based format and throw down a few checks to secure my tables has been an endless source of frustration for me. So if there is a non-NIH reason for abandoning RLS in favor of a separate security system, then maybe knowing that might help me be a bit less annoyed. :)


(I work at Hasura) Thank you for the note! This can definitely be made easier, and we’re working on this.

You can also add sql migrations as a sql file and that’s something we should document better! https://docs.hasura.io/1.0/graphql/manual/migrations/referen...

Coming back to RLS: doing what we’re doing to fetch multiple results for different clients is not easy. Essentially, it’s the difference between passing session data as a connection variable versus having it inline in the same SQL statement.

https://github.com/hasura/graphql-engine/blob/master/archite...

Further, owning RLS helps us target more hosted Postgres vendors, and other Postgres flavours that don’t support RLS well. RLS on Heroku doesn’t work: https://devcenter.heroku.com/articles/heroku-postgresql#conn...

Owning RLS does have a few other advantages. Having a unified experience in bringing that authz experience to “remote schemas”, for example.

(Typing on my phone, apologies for typos etc)


Thanks for these explanations!

As a blind person who works more productively in a text editor than a web interface[0], here are a few things that might help:

* Give me the ability to enter JSON directly in the permissions screen. I'm sure the query-builder is visually nice, but I can accessibly edit JSON strings better than you can accessibly render them visually, so let me just give you a string and have your interface validate it. Maybe just parse it, then transpose the values directly onto the form inputs in the visual interface.

* Give me a command to validate YAML migrations without applying them. This could even be a --dry-run flag to `hasura migrate` telling me what SQL would run and the JSON for my new/changed permissions.

* Some basic documentation on hand-coding migrations would help. I can sort of reverse-engineer them by poring over the file formats, but the time lost to working more slowly in the web interface hasn't quite inspired me to spend more time learning the undocumented format. :)

I'll see about filing these as issues soon if I can remember to do so. :) Either way, thanks for a cool product! Even if it frustrates me to use, tools like Hasura really hit a pain point for me in building apps.

0. Obviously that isn't universally applicable, but it is easier for me to solve logic problems than "why is my CSS all weird?" problems. Tools like Postgraphile and Hasura make it possible to test more of an app's functionality by writing just SQL rather than lining up my ORM with my language with my SQL schema. I guess just as Node supposedly reduces context-switching by placing more logic in a single language for front-end developers, tools like Hasura and Postgraphile let backend developers focus mostly on SQL when building out the app logic. And while I can't make something look good, damn can I think through how it should work and how someone might break it. :)


No worries, and thanks for these detailed thoughts again.

I've created an issue here for now: https://github.com/hasura/graphql-engine/issues/2310

Feel free to add more thoughts/suggestions!


To be totally honest, I haven’t implemented anything using RLS, it just seemed the way to go since it’s native Postgres. The Hasura team gave me the same explanation as they give in this article - when you use RLS, you have to rerun every query for every client to get your results. By doing it in the application layer, you can run a single query instead.
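The single-query idea can be sketched roughly like this (a toy Python sketch, not Hasura's actual implementation; table and column names are made up): fold every subscriber's session variables into a VALUES list and CROSS JOIN LATERAL the base query against it, so Postgres evaluates the query once per session row inside one statement instead of once per client connection.

```python
# Toy illustration of query multiplexing: instead of running the same
# query once per client (as RLS would force), fold every client's
# session variables into one statement via VALUES + LATERAL.
# The SQL shape and names here are illustrative only.

def multiplexed_query(base_query: str, sessions: list) -> tuple:
    """Return one SQL statement that evaluates base_query once per
    session row, plus the flattened parameter list."""
    rows = ", ".join("(%s, %s)" for _ in sessions)
    sql = (
        f"SELECT s.client_id, q.* "
        f"FROM (VALUES {rows}) AS s(client_id, user_id) "
        f"CROSS JOIN LATERAL ({base_query}) AS q"
    )
    params = []
    for s in sessions:
        params.extend([s["client_id"], s["user_id"]])
    return sql, params

sql, params = multiplexed_query(
    "SELECT * FROM orders WHERE orders.user_id = s.user_id",
    [{"client_id": 1, "user_id": 42}, {"client_id": 2, "user_id": 7}],
)
```

With RLS, the planner only ever sees one user's session at a time, so there is no equivalent way to batch clients together.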

In terms of the setup and migrations, I also have the same reservation about it being harder to do with text. So far I’ve left it to one of the other guys to implement so it hasn’t been painful for me, yet!

Edit to add: I believe they built their version of RLS before Postgres had it. So it was more a case of Not Invented Yet


Good points, thanks. I keep forgetting that RLS is a fairly recent Postgres feature--2016 or so if I'm not mistaken.


I can't speak for the author, but having used RLS for authorization I would not recommend it. It complicates nearly every EXPLAIN plan, and it can cause performance headaches in even the simplest of queries.

We had to mark many internal functions as LEAKPROOF in order to get them to perform. It was not possible to move our database into RDS, so in the end we had to abandon the RLS authorization layer.


Thanks for this. Lots of developers fall into the trap of discarding concepts because they aren't Facebook/Google scale solutions. The app on which I used Postgraphile/RLS was likely to have only a handful of users, so I thought more about solving the problem effectively than about whether the solution would scale, so RLS seemed like an airtight system. Knowing that it kills performance puts things into perspective. It at least suggests which tool might be best for which job. :)


I suspect many of these concerns were with PostgreSQL 9.5 and 9.6 where RLS did indeed have many performance issues. PG 10 fixed a large number of these issues, and PG11 even more. I would recommend re-evaluating RLS performance on these latest PostgreSQL versions.


In Hasura’s use case it’s a little different. They’re working on the assumption that you have loads of identical queries that you want to run that differ only in the session variables (and hence the rows they see). Because they’re outside Postgres, they can collate it all and execute a single query for multiple users. Within Postgres you don’t have that luxury. It’s a trade off, but not using RLS suits some use cases better.


Oh good, glad RLS performance got a boost. Unfortunately my Postgraphile implementation for that project was replaced with Drupal, but I'm glad RLS performance is less of a concern under version 10 and up.


I'm yet to observe high perf hits on account of my RLS impl. I captured data at some point for a 1M-record ___domain with a couple of through tables, and query perf with and without RLS was indistinguishable. Perhaps your check clauses were expensive? I made sure to keep mine as light as possible.


Hasura sure looks interesting, but two things make me hesitant:

1. The overly complicated schema it generates. For example, given an integer id, one can (and must) use various operators to query that id, e.g. `user(where: { id: { _eq: 1 } })` rather than just `user(id: 1)`, where other operators are `_gt`, `_lt`, etc. To make this possible, Hasura defines a comparison object, which results in a very long schema. However, often the only comparison that makes sense is equality, e.g. for IDs, so this should be optional.

2. More importantly, using Hasura establishes coupling between the DB and the GraphQL schemas. What if at a later stage one decides to remove information from the DB and move it to an external source? GraphQL was designed to hide such implementation details. With Hasura this is no longer possible.


Regarding your second point, I believe it is still possible to query an external source. Read the docs on “Remote Schemas” - https://docs.hasura.io/1.0/graphql/manual/remote-schemas/ind...


And anyway, if the database schema changes, unless for some reason the turnover is very fast, you can still manually offer views for clients that require the legacy schema.


Re 1: You can do `users_by_pk(id: 1)` for a while now.


Yep yep.

We’re also making everything in the Hasura schema aliasable via metadata so this becomes a more elegant name if you’d like!


Really? I can't seem to find that anywhere in the docs.


Does AWS AppSync even play the same game as Hasura?


Not entirely, but it crosses over. AppSync (from what I could tell) has a hacked version of Apollo for the front end, and then it’s a configuration-based system to provide GraphQL on top of DynamoDB (someone can correct me on all of the above). But if you think of the backend of AppSync as a GraphQL engine with subscriptions and a permissions system, then there are a lot of similarities.


AppSync is a managed service and you use whatever GraphQL client you like.


I was curious about the code, so I went to check it out. It's written in Haskell, which I'd never seen before, and it's really hard to follow for someone not used to it: https://github.com/hasura/graphql-engine/blob/master/server/...

Not sure how people enjoy FP.


Haskell and ML languages look like symbol soup when you're used to mainstream/C style languages. Once you understand some of the basic syntax they become quite readable and easy to reason about as a series of transformations.

Step one, which applies to most functional languages, is knowing how to read a Hindley-Milner type signature [1].

Armed with that you can often gain a surface level understanding of what some function does - and what the program does in aggregate - just by reading type signatures and seeing what calls what.

Hoogle [2] is a good resource for figuring out Haskell syntax, like what's up with all of those $ signs.

Converting OCaml to ReasonML syntax with sketch or reason-tools [3] makes OCaml less foreign.

With basic knowledge of Haskell and ML syntax you'll find many FP languages readable.

[1] https://drboolean.gitbooks.io/mostly-adequate-guide-old/cont...

[2] https://hoogle.haskell.org/

[3] https://reasonml.github.io/docs/en/extra-goodies


“Quite readable” is a stretch... I know of no other language that needs a hoogle equivalent.

That being said, my experience with Haskell is limited to working on some meta-meta-programming research in college, where every file used at minimum 3 different GHC extensions. So I may be biased.


I find Haskell less of a “needs” a hoogle, and more of a “nothing else has this, this is so useful”.

The number of times I have wished for something like Hoogle when writing Python...


Scala could definitely use a Hoogle, as could C++/Boost. My lesson from all this is to never ever overload an operator except, maybe, the comparison operators.

Exception: apl.


You don't need hoogle -.- It is a very cool tool tho


>at minimum 3 different GHC extensions. So I may be biased.

Are you kidding? Most files I see have all of these:

http://dev.stephendiehl.com/hask/#the-benign


Again, I don’t have a clue what I’m talking about. But I seem to remember seeing a lot of those “dangerous” ones. If that’s just the state of Haskell, my bad.


Thank you so much for the reference to “mostly adequate guide to FP” book. It looks incredibly easy to read. As someone who is fluent in JS and dabbled in FP this is right down my alley, and the style of text itself is actually quite funny and conversational - I love that.

I’ve noticed that _a lot of_ stuff written about FP tries to emulate this fun and easy-going style - I wonder why that is. I still refer to Clojure’s “re-frame” docs as the epitome of technical literature - a text so good you can read it in its own right just for the fun of it.

I don’t want to diminish other languages’ communities of course, as this is totally anecdotal. I’ve found awesome prose in other places as well (mostly ruby/python) but I just encounter those funny gems more often reading about FP.


I've said it a few times on HN in the past but Haskell is by far and away the hardest language I've tried to learn. I have no formal grounding in maths/CS and I think a lot of the people who enjoy it really struggle to communicate with those of us who don't come from that sort of background.

In contrast I picked up Rust and Elixir in a couple of weeks apiece, so I don't feel like I'm particularly dense; I just suspect that it needs a certain way of thinking to get the best from Haskell :)


I’m of the same opinion: if there were a language as complex as Haskell that gave more consideration to beginners over terseness [naming concepts better], I’d be able to accept more of it. The guy next to me is determined to hatchet all of the functional paradigms into TypeScript, and he gets to decide this. I can’t help but think your average JS dev is going to look at Maybe.may2 and be completely bemused...


>it's really hard to follow for someone not used to it

So like anything else?

>Not sure how people enjoy FP.

By virtue of actually having tried it?


Not really. Some other functional languages (eg. F#) don't seem as arcane as Haskell.


It's worth writing something in an ML or Haskell sometime. Functional programming is great, and it makes my code in things like JavaScript much more robust.


Scala is probably as far as I’ll get when it comes to FP.


Even within Scala, you can dabble in quite a few functional programming concepts. Have you used Typelevel's Cats or Scalaz?

~Earl


> Not sure how people enjoy FP.

See The Blub Paradox [0]

0 : http://www.paulgraham.com/avg.html


Anything you are unfamiliar with will obviously look alien and difficult.

I actually find their code quite readable, but that’s because I have experience writing Haskell.


And FP developers in general tend to push conciseness to an extreme, especially with this kind of shenanigan (from the very source you linked):

    import qualified Data.ByteString.Lazy as BL
It's ironic that after all this obfuscation, the author of the source still finds it useful to do ASCII art and align everything neatly. Talk about misplaced effort.


Pretty sure that’s done automatically, like gofmt or other similar tools. It’s zero effort and makes it easy to read. Also, lots of languages have that import syntax (Python’s `import pandas as pd`).


> we’ve currently fallen back to interval based polling to refetch queries.

This fancy "subscribe to events when the results of this query change" system boils down to just polling the database...
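In pseudocode, interval-based refetching can look something like the following (a minimal Python sketch with stand-in `fetch`/`notify` callbacks, not Hasura's actual code): hash each result set and push to subscribers only when the hash changes, so clients still see an event stream even though the database is merely polled.

```python
import hashlib
import json

def poll_once(fetch, last_hash, notify):
    """Refetch the query; notify subscribers only if the result changed."""
    result = fetch()
    digest = hashlib.sha256(
        json.dumps(result, sort_keys=True).encode()
    ).hexdigest()
    if digest != last_hash:
        notify(result)  # push the fresh result over the websocket
    return digest

# Simulated run: two polls of identical data, then one changed row.
events = []
h = poll_once(lambda: [{"id": 1, "total": 5}], None, events.append)
h = poll_once(lambda: [{"id": 1, "total": 5}], h, events.append)  # no change
h = poll_once(lambda: [{"id": 1, "total": 9}], h, events.append)  # change
```

In a real deployment `poll_once` would run on a timer (e.g. once a second) per multiplexed query batch.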


Sometimes the simple option is the best. Having said that, it’s not that simple, for them. I believe it relies on being able to fetch and dissect different results for different clients in the same query (all while applying their own form of RLS).


Yep. I’d love to see an implementation of this in the wild built on top of logical decoding:

https://www.postgresql.org/docs/9.6/logicaldecoding-explanat...


Me too. Here are my thoughts on the current state of "realtime databases": https://hackernoon.com/why-eve-will-be-perfect-for-realtime-...


Interesting idea. Logical decoding would give you the updated row, but how would you check it against each query to see if the query needed to be rerun? Sounds tricky.
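A naive version of that mapping might look like the following Python sketch (the registered predicates are purely illustrative; a real system would have to derive them from the SQL of each live query, which is the hard part):

```python
# Toy "event to live query" mapper: each live query registers a table
# name plus a Python predicate over a decoded row. On a row change we
# return the queries that might need re-running. All names are made up.

live_queries = []  # (query_id, table, predicate)

def register(query_id, table, predicate):
    live_queries.append((query_id, table, predicate))

def affected_by(change):
    """change: {'table': ..., 'row': {...}} from logical decoding."""
    return [
        qid for qid, table, pred in live_queries
        if table == change["table"] and pred(change["row"])
    ]

register("q1", "orders", lambda r: r["user_id"] == 42)
register("q2", "orders", lambda r: r["total"] > 100)
hits = affected_by({"table": "orders", "row": {"user_id": 42, "total": 50}})
# hits -> ["q1"]
```

Even this toy version hints at why it's tricky: joins, aggregates, and subqueries mean a changed row can affect a query whose predicate never mentions it directly.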


    Future work:
    Reduce load on Postgres by:
    - Mapping events to active live queries
    - Incremental computation of result set
Fake it till you make it.

Meanwhile: https://opensource.janestreet.com/incremental/


I’m not entirely sure how the library you posted is related. Because it does incremental computation? It doesn’t seem to have anything to do with Postgres. As far as I can tell it looks more like mobx.


They built other libraries on top of this [1] [2], and when compared to mobx [3] it indeed looks very similar, however I think the target ideal of having a small core on top of which one would build more complex computation is essential.

Since Clojure is pervasively built on laziness, there have been some discussions in the community about doing the same but for incremental computation. Doing it pervasively, most importantly, guarantees it will work at any granularity without any manual fiddling with observables and reactions.

A recent contribution to the solution space in Clojure is micro_adapton [4][5], which looks like it is to incremental computing what miniKanren is to logic programming, i.e. a tiny core for a challenging problem.

Also, considering the literature around it it seems to be pretty tricky to do properly, especially when it comes to databases [6].

[1] https://github.com/janestreet/incr_map

[2] https://github.com/janestreet/incr_select

[3] https://github.com/mobxjs/mobx

[4] https://github.com/aibrahim/micro_adapton

[5] https://arxiv.org/abs/1609.05337

[6] https://scholar.google.fr/scholar?hl=fr&as_sdt=0%2C5&q=incre...


> Since Clojure is pervasively built on laziness

Only some of Clojure’s data structures are built on laziness; everything else in the language is strict. If you want something actually built on laziness, you need Haskell.


I’ve been using Hasura for around 3 months now. It has changed the velocity and way I approach our projects. I have a small team that works as a startup inside of a larger organization and this was crucial to delivering some key projects... plus very little code. The event trigger really help reduce our costs as well, using the Serverless.com framework with Lambda Go has been a huge success in our organization!

Thank you to the Hasura team for all you do and in turn allow me and my team to do!


When I first saw Hasura, it was AGPL licensed which makes it unusable in a lot of use cases so we passed on it. Looks like they converted it to Apache license two months ago. I suppose it is worth another look, I'm glad they made the change.


Hasura is such a promising application platform. I tried it a few months ago and was really pleased. In a way it kind of felt like RethinkDB, except GraphQL and Postgres. Was very easy to turn on with Docker Compose and play around. Looking forward to 1.0!


I guessed it would use the pub/sub capabilities of Postgres, but it seems from the readme that it uses polling, although with the same query for all clients, which allows cache optimizations. The naive implementation can often be surprisingly effective.


Yep. We experimented around a lot with pub/sub (listen/notify, wal-lr). But it was getting hard to scale and heavy/spiky write loads needed careful thought. This seemed like the right way to get started esp. given that Hasura can get the entire sql query including authz, before moving to tackling that problem more effectively[1].

For folks that want to play around with wal and listen/notify:

https://github.com/hasura/pgdeltastream https://github.com/hasura/skor

[1] https://github.com/hasura/graphql-engine/blob/master/archite...


As an Indian engineer, Hasura is one of the few Indian startups that I am really proud of. They are innovative, distinct, product-focused, and super young. The only other company that made me feel like this was Notion Ink, but those guys seem to have faded away.

I wish Hasura the best and would be cheering for them from the sidelines


I'm impressed by the numbers, but I'm having some doubts about the general idea of exposing your database via GraphQL.

It seems to expose your internal data structures, and any change to those will immediately impact that public GraphQL API. It also seems that this approach would only work well for applications without any business logic, or with business logic implemented solely in the database. At least the applications I've been working on wouldn't fall into this category.

But in any case, impressive work and demos!


You aren't limited to a single database though, so you could stream your main database to a public facing one with a more stable structure.

But you'd have to handle changes through another API, or at least route them differently, and then worry about consistency so probably not ideal for all business cases.


True!

We’ve added event triggers to Hasura to kind of support this pattern via Hasura itself. So you can create “action” tables that basically have a log of request data (a mutation inserts an action). Hasura will then call an event handler which can run with the action data, user session information, related data in case there are any relationships etc. This handler can then go update the tables that can be queried from.
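The action-table pattern described above might be sketched like this (illustrative Python only; the "table", handler, and transfer logic are all made up, not a Hasura API): a mutation merely appends to an action log, and a separate event handler applies the business logic to the queryable state.

```python
# Sketch of the "action table" pattern: mutations only log a request;
# an event-trigger handler validates it and updates the real tables.
# All names and the transfer example are hypothetical.

action_log = []                        # stands in for the action table
accounts = {"alice": 100, "bob": 0}    # stands in for queryable tables

def submit_transfer(src, dst, amount):
    # the GraphQL mutation would just do this insert
    action_log.append({"type": "transfer", "src": src, "dst": dst,
                       "amount": amount, "status": "pending"})

def handle_event(action):
    # the event-trigger webhook: validate, then update the real tables
    if accounts.get(action["src"], 0) >= action["amount"]:
        accounts[action["src"]] -= action["amount"]
        accounts[action["dst"]] = accounts.get(action["dst"], 0) + action["amount"]
        action["status"] = "done"
    else:
        action["status"] = "rejected"

submit_transfer("alice", "bob", 30)
handle_event(action_log[-1])
```

The appeal of the pattern is that clients still just issue a mutation and subscribe to the result; the validation lives in the handler rather than in the exposed schema.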

Of course, if the eventing system is in-order / exactly-once, quite a few use cases become feasible ;)


Hasura is fantastic. I am using it for several projects atm. Coming from Prisma it's refreshingly easy to use and well thought out. The security/ACL model is esp. nice and easy to grok.


We evaluated Hasura and the only problem we found is the integration with our own authorization system. Otherwise Hasura is quite frankly amazing.

Our auth system has many roles per user, then each role has its permissions, pretty common setup. The problem is that Hasura expects a single default role per request to then evaluate against its permissions. The Hasura team has been looking into accepting an array of roles and merging permissions and whatnot, but AFAIK this hasn't been solved.
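For illustration, a merge along those lines could union the column sets and OR the row filters of every role the user holds (a hypothetical Python sketch; as noted above, this is not Hasura's current behaviour):

```python
# Hypothetical multi-role permission merge: a row is visible if ANY
# role allows it, and the selectable columns are the union across roles.
# The filter shape loosely mimics a GraphQL boolean expression.

def merge_permissions(perms: list) -> dict:
    columns = set()
    filters = []
    for p in perms:
        columns |= set(p["columns"])
        filters.append(p["filter"])       # each filter is a boolean expr
    return {"columns": sorted(columns),
            "filter": {"_or": filters}}   # OR of all role filters

merged = merge_permissions([
    {"columns": ["id", "title"],
     "filter": {"author_id": {"_eq": "X-Hasura-User-Id"}}},
    {"columns": ["id", "status"],
     "filter": {"is_public": {"_eq": True}}},
])
```

The subtlety is that columns and row filters interact: a naive union can leak a column one role allowed through rows only the other role could see, which is presumably why this is taking care to get right.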

If you start with Hasura from scratch this is not really a problem, it happened to us because we had to figure out how to integrate Hasura with our current permissions.


> the only problem we found is the integration with our own authorization system

But integration with external authorization systems is an extremely basic requirement for every piece of software, certainly in the top 5 on the requirements list for every serious solution. I cannot understand how you could overlook this.


I haven't found this ("integration with external authorization systems is an extremely basic requirement") to be the case at all; most solutions ship with tightly-coupled access control mechanisms, and it is an exercise for the user to figure out how to wield them effectively.

In Hasura's case, it looks like the flow of data to the authorization bits shows a simple/naive approach, although it also appears this is something Hasura are working to correct. Even as it stands, moderately complex authz logic could be implemented via x-hasura headers or in-database associations.


I'm curious about the underlying push tech. Is Hasura using firebase to handle that? Then firebase relies on different push services depending on the client, like Apple's push service for iOS and http/2's push for web (which is supported by what percentage of browsers)?


Firebase? No, not at all. It's all Hasura and Postgres.


What I'm getting at is: what is the underlying tech handling push to clients? Reading further, I see that the GraphQL libraries handle subscriptions via websockets. But elsewhere in this GitHub I was seeing Firebase, so it's a little confusing. Websockets are pretty ubiquitous now, but from what I understand they maintain a TCP connection to the server. That has scalability implications.


I think you were probably looking at docs/sample-code for a firebase auth integration.

Hasura GraphQL subscriptions are over websockets. This post is a benchmark that should address those scalability concerns. It’s about getting to 1M concurrent active websocket connections.


Nice scaling numbers. Can you use Hasura without Kubernetes or Docker?


You can compile the binary yourself, but otherwise docker is the way to go.

What’re you looking at?


I know there's an issue about this but it doesn't really indicate Hasura's intentions regarding it:

Is supporting CockroachDB something Hasura is actively working on?


It’s on the cards!

Don’t want to spread ourselves too thin too early. Also working with the yugabyte folks closely to have Hasura supported with their 2.0 release.

https://twitter.com/karthikr/status/1128003465370693636?s=21


> Don’t want to spread ourselves too thin too early.

I certainly understand that, but I wonder since CockroachDB is essentially Postgres, if it wouldn't be easier to ensure compatibility from the start. It seems like it'd be easier to just avoid using special Postgres features that aren't in CockroachDB than to go back later and re-write all your code that relies on those features.


Definitely. That’s why CockroachDB and Yugabyte, which speak Postgres but are black magic underneath, are a great target. However, it’s not fully compatible quite yet: https://github.com/cockroachdb/cockroach/issues?q=is%3Aopen+...

Citus / timescale / pipelinedb are much easier and we already support timescale and pipelinedb well. Citus with native support for their distribution columns is going to happen soon too! These are packaged as Postgres extensions which is definitely first priority for us before moving to databases that speak Postgres and eventually to other databases as well. :)


> it’s not fully compatible quite yet https://github.com/cockroachdb/cockroach/issues?q=is%3Aopen+....

So you're waiting for CockroachDB to implement features before you support it?


Write Ahead Logs would have been amazing - hate that AWS RDS does not support WAL - could have saved so much time for some of the applications we build.


WDYM by "RDS does not support WAL"? I wouldn't know how Postgres works without them, and they are accessible using logical decoding. wal2json is supported on RDS and used by tools such as Debezium for streaming data changes into Apache Kafka (Disclaimer: I work on Debezium).


Thank you, had missed this announcement.. I did check Debezium a while back, found it appealing


How does it handle transactions and locking? It seem to support multiple databases, does it handle distributed transactions?


I can’t wait to try this out!



