But at least if I write a transform in pandas it's straightforward to unit test it: create a DataFrame with some dummy data, send it through the function which wraps the transform, test that what comes out is what's expected.
Validating a transform done in SQL is not nearly as straightforward. For starters it needs to be an integration test not a unit test (because you need a database). And that's assuming there's even a way to hook unit tests into your framework.
I'm not a huge fan of Pandas — it's way too prone to silent failure. I've written wrappers around e.g. read_csv which are there to defeat all the magic. But at least I can do that with Python code instead of being stuck with the big-ball-of-SQL (e.g. some complicated view SELECT statement that does a bunch of JOINs, CASE tests and casts).
But at least if I write a transform in pandas it's straightforward to unit test it: create a DataFrame with some dummy data, send it through the function which wraps the transform, test that what comes out is what's expected.
Validating a transform done in SQL is not nearly as straightforward. For starters it needs to be an integration test not a unit test (because you need a database). And that's assuming there's even a way to hook unit tests into your framework.
I'm not a huge fan of Pandas — it's way too prone to silent failure. I've written wrappers around e.g. read_csv which are there to defeat all the magic. But at least I can do that with Python code instead of being stuck with the big-ball-of-SQL (e.g. some complicated view SELECT statement that does a bunch of JOINs, CASE tests and casts).