Hacker News new | past | comments | ask | show | jobs | submit | WinDoctor's comments login

Pull requests originate from a low-trust environment (open source) and bring very costly context switching when forced into high-trust environments (PRs reviewed by team members).


> why the SQL API even allows injection vulnerability

How would one implement this?

"SQL APIs" use prepared statements. Meaning you have a string for SQL and some dynamic variables that inject into that string via $1, $2 etc.

BUT if the developer builds that string itself dynamically from a variable, then you have SQL injection again.
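To make the difference concrete, here is a hedged sketch using Python's stdlib sqlite3 (where the placeholder syntax is `?` rather than `$1`; the table and data are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, username TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

user_input = "alice' OR '1'='1"  # hostile input

# Safe: the SQL string is constant; the value travels separately.
rows = conn.execute(
    "SELECT id FROM users WHERE username = ?", (user_input,)
).fetchall()
print(rows)  # [] -- the hostile input matches nothing

# Unsafe: the developer makes the string itself dynamic.
rows = conn.execute(
    "SELECT id FROM users WHERE username = '" + user_input + "'"
).fetchall()
print(rows)  # [(1,)] -- injection succeeded
```

The prepared form never lets the value touch the SQL text, which is exactly what the string-concatenation form gives up.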


> How would one implement this?

The low-level API could simply not allow SQL statements as strings, and instead provide separate functions to build the queries and statements.

It would provide entry points which could be used to ensure proper escaping and such, and would still allow for easily generating queries dynamically in the cases where that is needed.

Of course, it doesn't completely guard against Bobby Tables[1]; one could imagine someone including a run-time code generator and feeding it unprotected SQL as input.

But it should make it a lot more difficult, as injecting unprotected user data would be much more "unnatural", requiring going against the grain. Also, the "query_execute" function could raise an error if there's more than one statement, requiring a different function for batch execution.

Pseudo-codish example off the top of my head, for the sake of illustration:

   is_active = str_to_bool(args['active']); // from user
   qry = new_query(ctx);
   users_alias = new_table_alias(qry, 't');
   query_select_column(users_alias, 'id');
   query_select_column(users_alias, 'username');
   query_from_table(users_alias, 'users');
   filter_active = query_column_eq_clause(users_alias, 'active', is_active);
   where = query_where(qry);
   query_where_append(where, filter_active);   
   cursor = query_execute(qry);
[1]: https://xkcd.com/327/
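A minimal runnable take on the same idea, in Python over sqlite3 (the builder names are hypothetical and only loosely mirror the pseudo-code above): the builder validates identifiers and only ever emits placeholders, so user data never enters the SQL string.

```python
import sqlite3

class Query:
    """Toy query builder: column/table names must be valid identifiers,
    and values only ever travel as bound parameters."""

    def __init__(self, table):
        self.table = self._ident(table)
        self.columns = []
        self.filters = []  # "col = ?" fragments
        self.params = []   # values bound to the placeholders

    @staticmethod
    def _ident(name):
        if not name.isidentifier():
            raise ValueError(f"invalid identifier: {name!r}")
        return name

    def select(self, *cols):
        self.columns += [self._ident(c) for c in cols]
        return self

    def where_eq(self, col, value):
        self.filters.append(f"{self._ident(col)} = ?")
        self.params.append(value)
        return self

    def execute(self, conn):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        return conn.execute(sql, self.params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, username TEXT, active INTEGER)")
conn.execute("INSERT INTO users VALUES (1, 'alice', 1), (2, 'bob', 0)")

rows = Query("users").select("id", "username").where_eq("active", 1).execute(conn)
print(rows)  # [(1, 'alice')]
```

Note that a hostile value passed to `where_eq` is harmless (it's bound, not spliced), and a hostile column name raises rather than being interpolated.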


"Gee, this new programming language / API makes it hard to copy my SQL queries across. Better use something else."


If that's all the database drivers supported...


Easy. Don’t write queries in a language (SQL) which interpolates content without escaping it for the enclosing structure.

Go one level up.

For example, prepared statements should not allow literal values in the SQL string, only placeholders, which are then bound to values, like PDO does.


It would be a bit annoying to have to prepare outside and pass in every SQL literal you need to use in your query.

I'd rather have the SQL API take not strings but a special type that a string can't be directly converted into without escaping (by default).

In C++, user-defined literals could be used to create this special type easily. Similar constructs exist in some other languages.
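A sketch of that special type in Python (the class name and API are hypothetical): gluing a plain string onto a query escapes it as a SQL string literal by default, while trusted SQL text must already be wrapped in the type.

```python
class SqlFragment:
    """SQL text that is trusted by construction."""

    def __init__(self, sql=""):
        self.sql = sql

    @staticmethod
    def _escape(value):
        # Escape an arbitrary string into a SQL string literal
        # by doubling embedded single quotes.
        return "'" + str(value).replace("'", "''") + "'"

    def __add__(self, other):
        if isinstance(other, SqlFragment):
            # trusted + trusted: concatenate as-is
            return SqlFragment(self.sql + other.sql)
        # trusted + raw string: the string gets escaped
        return SqlFragment(self.sql + self._escape(other))

q = SqlFragment("SELECT id FROM users WHERE name = ") + "bobby'; DROP TABLE users;--"
print(q.sql)
# SELECT id FROM users WHERE name = 'bobby''; DROP TABLE users;--'
```

The injection attempt ends up inert inside a quoted literal; getting raw SQL in requires the deliberate act of wrapping it in `SqlFragment`, which is the "going against the grain" mentioned upthread.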


Literally a library can generate SQL statements and compile them

JS and PHP have tagged literals

But they have to be “escaped” properly before being interpolated!


That's the whole point of having a separate type for queries. Whenever you try to glue a string to a query the string gets escaped.


Literally what PostgreSQL does.

E.g. you can do manual selects from the internal catalog tables to see the same content as the `\dt` command shows.


Oh, cool. Where's the schema?


Everything in PostgreSQL is a role.

It's just named such that when a role is allowed to `LOGIN`, it's considered a user.


Someone was feeling very clever when they came up with this idea.


It makes a lot of sense, especially since in the SQL standard, you can grant privileges to "users" or "roles."

Might as well simplify the mental model and make them the same.


This is part of what many people find so confusing. In most systems, a "role" is a group (or something closely resembling a group), not a user. The weird terminology confuses beginners.


It's a bit confusing and legacy.

All roles function like you would expect groups to function.

A role that is not allowed to login is a `group`.

While the CREATE USER and CREATE GROUP commands still exist, they are simply aliases for CREATE ROLE.
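Sketched in SQL (illustrative names; this is an untested fragment, since it needs a live PostgreSQL server):

```sql
-- A "user" is just a role that may log in:
CREATE ROLE app_user LOGIN PASSWORD 'secret';

-- A "group" is just a role that may not:
CREATE ROLE readers NOLOGIN;

-- Membership works the same way for both:
GRANT readers TO app_user;
```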


To me, the comment you are replying to is saying that you should differentiate roles by service, not "end user".



Oh it's really worrying that you post a link to a SO page as the recommended 'cookbook' for something that's referred to as a powerful feature. And both of the answers only have a single up vote each. Seems like there's a serious gap here.


You can help out by posting an answer or upvoting the good ones.


I don't think Stack Overflow is a good place for good documentation. I think the Postgres documentation should be better.


I agree with the sentiment that IAM in PostgreSQL is complex.

What makes it complex is that there are 3 layers of objects (Database, Schema, Tables) and also implicit grants given to DB object owners.

To be able to select from a table you need:

* CONNECT on the Database

* USAGE on the Schema (Given implicitly to schema owner)

* SELECT on the Table (Given implicitly to table owner)

To see these privileges we need to understand acl entries of this format

`grantee=privilege-abbreviation[*].../grantor`

* Use \l+ to see privileges of Database

* Use \dn+ to see privileges of Schemas

* Use \dp+ to see privileges of Tables

Privileges are seen [here](https://www.postgresql.org/docs/current/ddl-priv.html)

e.g. in the following example user has been given all permissions by postgres role

`user=arwdDxt/postgres`

If the whole ACL column is empty for a given object, the object has default owner privileges (all privileges to the owner). If only the grantee part of an entry is empty, it means privileges granted to the PUBLIC role (every role that exists), as in:

`=r/postgres`

Also, it's confusing when the public schema is used: you have CREATE permission on the schema, so when tables are created by the same user you later select data with, you have owner permissions out of the box.
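The three layers above, as an untested SQL fragment (database, schema, and table names are made up):

```sql
CREATE ROLE reporter LOGIN PASSWORD 'secret';

GRANT CONNECT ON DATABASE appdb TO reporter;  -- layer 1: Database
GRANT USAGE ON SCHEMA app TO reporter;        -- layer 2: Schema
GRANT SELECT ON app.orders TO reporter;       -- layer 3: Table
```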


> To be able to select from a table you need

> * CONNECT

> * USAGE

> * SELECT

Isn't LOGIN (https://www.postgresql.org/docs/16/role-attributes.html) also needed?

  Only roles that have the LOGIN attribute can be used as the initial role name for a database connection


You can log in and then change your role (if you have permissions) with SET ROLE NEW_USER;

We managed to kludge our way to defaulting to read-only, then using SET ROLE to do writes when needed.
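That read-only-by-default kludge might look roughly like this (illustrative role names; an untested fragment, and not necessarily the setup described above):

```sql
-- Log in as a role with no write grants...
CREATE ROLE app_ro LOGIN PASSWORD 'secret';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_ro;

-- ...and escalate only when a write is actually needed:
CREATE ROLE app_rw NOLOGIN;
GRANT INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_rw;
GRANT app_rw TO app_ro;

-- In the session:
SET ROLE app_rw;
-- ...do writes...
RESET ROLE;
```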


Good point. Connecting to the cluster is indeed a 4th layer.

There you do need a user with LOGIN, a valid password, and possibly SSL.


> What makes it complex is that there are 3 layers of objects (Database, Schema, Tables) and also implicit grants given to DB object owners

The main privilege system also includes Columns, as well as Databases/Schemas/Tables. You can SELECT from a table if you have been granted SELECT on the table, or if you have been granted it on the specific columns used in your query. ("A user may perform SELECT, INSERT, etc. on a column if they hold that privilege for either the specific column or its whole table. Granting the privilege at the table level and then revoking it for one column will not do what one might wish: the table-level grant is unaffected by a column-level operation." [1])

There's also a system of Row Security Policies [2].

[1]: https://www.postgresql.org/docs/current/sql-grant.html

[2]: https://www.postgresql.org/docs/current/ddl-rowsecurity.html


I've seen funny things happen when developers think their GRANT is doing what ALTER DEFAULT PRIVILEGES does, and a day later break production because their user doesn't have access to the new tables.


What do you mean exactly?

If roles have INHERIT, then doing the following works, no?

* Role A creates a table

* GRANT A TO B;

* Role B can read from the table just like A can.

Also if Role A creates new table, Role B can read that too no?


Remove any "if" from your comment and it seems like you understood what would fail! There are more ways that misunderstanding GRANT vs default privileges will create issues as well.


Then I would argue for trying to centrally ensure that all roles have INHERIT in your DBs. :thinking:

Doing ALTER DEFAULT PRIVILEGES could be another future footgun of its own.
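The contrast, as an untested SQL fragment (role name made up):

```sql
-- Affects only tables that exist right now:
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_user;

-- Affects tables created in the future -- and only those created
-- by the role that ran this statement (a footgun of its own):
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO app_user;
```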


I think you probably don’t want the application user to inherit from the superuser you may use for migrations since some migrations require a superuser.


IMO, the permission management of MS SQL Server is the best one around. It gets all the basics right, without any big issue. Other DBMSs should copy it, or use it as a starting point if they want to go further (and there is plenty of stuff to improve on it).

TLDR, the container objects and the contained ones all share the same kind of permissions. Permissions of the container are applied to the contained unless explicitly changed.

So, if you grant select on the schema dbo to a, a will get select on all tables there. If you want to remove some table, you revoke the select on that specific table. And there is both metadata to discover where a specific privilege comes from and specific commands that edit the privileges on a specific level.


> data is effectively modelled in a nosql style.

What do you mean by this? Any references I could read?


You can probably search around for references on distributed data management. But basically, when your data is split across different instances, you can no longer do joins or transactions or rely on the database for ACID guarantees. All these things must instead be handled in application code, which is exactly how you would work with data in a NoSQL database as well.
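For instance, a join that SQL would do in one statement becomes application code once the tables live in different services. A hedged Python sketch, where the in-memory dicts stand in for two separate databases:

```python
# Two "services", each owning its own data store.
users_db = {1: {"name": "alice"}, 2: {"name": "bob"}}
orders_db = [
    {"order_id": 10, "user_id": 1, "total": 25},
    {"order_id": 11, "user_id": 2, "total": 40},
]

# The "join" now happens in application code, with no
# transactional guarantees spanning the two stores.
report = [
    {"user": users_db[o["user_id"]]["name"], "total": o["total"]}
    for o in orders_db
]
print(report)
# [{'user': 'alice', 'total': 25}, {'user': 'bob', 'total': 40}]
```

In a single SQL database this would be one `JOIN`, inside one transaction; here, consistency between the two stores is the application's problem.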


This might help: https://microservices.io/patterns/data/database-per-service....

Varies by actual DB but: with this architecture, if the DBs are NoSQL then you didn't lose as much by splitting up the databases, but if they were SQL then among other things you lose transactions across those DBs (or now need costly distributed transactions), effectively making the collective SQL system behave more like a NoSQL DB.


I think he means the data is not relational anymore and it is not normalized.


Probably means that the data isn't modelled at all, just a dump of the application architecture.


I would assume Putin hopes that a war unites the Russian people and increases his own power more than it loses trust.

It's a gamble for sure.



Incredible! :)

What do you mean by hand-tuned Postgres & hand-tuned ZFS. What did you change and why?


If I remember correctly, for ZFS it was some stuff about cache size and RAM usage, since we were in a VM and couldn't be too greedy. Something about a NVME drive needing slightly different settings. I didn't spend much time on this. For Postgres it was so so so many things, I played with it on and off for months, benchmarking certain key queries and such. And I read a LOT of blog posts about how to make Postgres and ZFS work together ideally, one thing I remember in particular was waffling back and forth about logbias=throughput vs logbias=latency. Just google "postgres zfs logbias" and you'll get MANY conflicting opinions. (edit: when I google it now in incognito, my reddit post is at the top lol)

But for what I actually changed, off the top of my head, I set the recordsizes in ZFS large enough that Postgres could safely (because of ZFS CoW) have full_page_writes off, and combined with synchronous_commit off, that really sped up the overall system and made the WAL logs much smaller. After looking just now at postgresql.conf, various other things were tweaked, such as seq_page_cost, random_page_cost, effective_cache_size, effective_io_concurrency, max_worker_processes, default_statistics_target, dynamic_shared_memory_type, work_mem, maintenance_work_mem, shared_buffers, but those were not quite as important. (plus some uninteresting tweaks to WAL behavior, since we had replicas that got WAL logs shipped every few minutes with rsync)
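For flavor, the settings mentioned might look roughly like this (values are placeholders for illustration, not the actual configuration described above):

```
# ZFS dataset holding the Postgres data directory
zfs set recordsize=8K tank/pgdata       # at least Postgres' 8 kB page size
zfs set logbias=throughput tank/pgdata  # the contested setting

# postgresql.conf
full_page_writes = off       # safe only because ZFS CoW writes records atomically
synchronous_commit = off
```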


Thanks for the in-depth response!

