joelwilsson's comments

joelwilsson · on Oct 10, 2024

The Forbes article is from April, while The Local article you're quoting is from October. Back in April, the proposal was different, as The Local explains.

And the biggest problem for startup founders remains: on leaving the country, you're taxed on unrealized gains. Being taxed on 5 million of profit sounds fair. Being taxed on 5 million (or 30 million) of valuation used for raising capital, in a startup that then fails and is worth nothing after a few years, maybe not so much. Neighboring countries do not have this kind of taxation.

joelwilsson · on April 30, 2024

Looks like a competitor/alternative to Smithy, https://smithy.io/2.0/index.html. Since at least one person from the TypeSpec team is here, do you have any thoughts on how they compare?

jen20 · on April 30, 2024

This was my thought too - since Smithy is already out there and used in a similar domain, it would be useful to have a comparison. “Doesn’t have Kotlin and Gradle all over the show” seems like a significant advantage in favour of TypeSpec.

joelwilsson · on Dec 26, 2023

"Design in Practice" by Rich Hickey: https://www.youtube.com/watch?v=c5QF2HjHLSE

vasergen · on Dec 26, 2023

Thank you, this is a great talk! Even though I am not using Clojure, Rich Hickey is one of my favorite speakers.

joelwilsson · on July 27, 2020

Just for Apache Arrow itself, https://arrow.apache.org/docs/java/ compared to https://arrow.apache.org/docs/js/ or https://arrow.apache.org/docs/cpp/ doesn't look promising in terms of the documentation being usable.

That could be improved without fixing the whole JVM data ecosystem, but that's mostly up to JVM developers. It's unfortunate if the Spark developers using Arrow aren't contributing in this area (especially since many of them are being paid), but it's all open source and undoubtedly pull requests are welcome.

Congratulations on the 1.0 release, it's only going to keep getting better! Really exciting to be able to share data in-memory across languages.

joelwilsson · on March 3, 2020

Sure - but the comment you're replying to made no mention of NoSQL. It just said ClickHouse lacks OLTP by design. That doesn't mean it won't be widely used, just that it will perhaps be limited to analytics workloads.

If you need deletes and transactions, look elsewhere, but ClickHouse seems to be great for what it's been designed for.

joelwilsson · on Nov 10, 2017

BigQuery uses a lot of tricks to get efficiency, but this post emphasizes Apache Arrow and similar open data formats as the way forward (in particular the last point, "Be open, or else…"), and BigQuery does not currently support them.

If Apache Arrow takes off I hope BigQuery will support it as a data interchange format in the future. Zero-copy is pretty awesome, as are open standards in general. This feature does not exist in BigQuery today (as far as I know - definitely not as discussed in the source).

xoogler_thr · on Nov 11, 2017

One thing people commonly miss is that all of this is meaningless if you don’t have the corresponding runtime integration. These techniques very much imply co-design, and therefore tight coupling, between the format and the runtime. To give a concrete example: to make any of this efficient and fast, you must have predicate pushdown directly into the decoder, so that filters can skip the data they don’t need to decode. Some aggregations could be handled the same way.

So it’s a little incorrect to think of this as a “file format” in the first place. If you end up designing it like that, you won’t get many of the gains that the Abadi paper (and others like it) allude to. My suggestion would be to go whole hog and push down as much filtering and aggregation in there as is feasible, exposing a higher level interface with _at least_ filtering predicate support, and do it in C++.
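
To make the pushdown idea concrete, here is a rough sketch in Python (for brevity, not the C++ suggested above). Everything in it is an assumption made for illustration: `Chunk`, `encode_chunk`, `ColumnReader` and the packed-int64 layout are hypothetical and are not how Arrow, BigQuery, or any real format does it. The only point is that the reader takes the predicate and consults per-chunk min/max statistics, so chunks that cannot match are never decoded, and a simple aggregate can be answered from the statistics alone.

```python
import struct
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Chunk:
    """Hypothetical encoded column chunk: min/max stats plus packed int64 values."""
    min_value: int
    max_value: int
    payload: bytes  # little-endian signed 64-bit integers


def encode_chunk(values: List[int]) -> Chunk:
    """Pack values and record the statistics the reader will use for skipping."""
    return Chunk(min(values), max(values), struct.pack(f"<{len(values)}q", *values))


def decode_chunk(chunk: Chunk) -> List[int]:
    """Decode a whole chunk; this is the expensive step we want to avoid."""
    count = len(chunk.payload) // 8
    return list(struct.unpack(f"<{count}q", chunk.payload))


class ColumnReader:
    """Reader that accepts the predicate instead of handing back raw data."""

    def __init__(self, chunks: List[Chunk]) -> None:
        self.chunks = chunks
        self.chunks_decoded = 0  # counter to show how much work was skipped

    def scan_greater_than(self, threshold: int) -> Iterator[int]:
        """Yield values > threshold, decoding only chunks that might match."""
        for chunk in self.chunks:
            if chunk.max_value <= threshold:
                continue  # no value in this chunk can pass the filter
            self.chunks_decoded += 1
            for value in decode_chunk(chunk):
                if value > threshold:
                    yield value

    def max(self) -> int:
        """Aggregate answered from chunk statistics alone, with no decoding."""
        return max(chunk.max_value for chunk in self.chunks)


reader = ColumnReader([
    encode_chunk([1, 2, 3]),
    encode_chunk([10, 20, 30]),
    encode_chunk([4, 5, 6]),
])
matches = list(reader.scan_greater_than(5))
print(matches, reader.chunks_decoded, reader.max())
```

With a threshold of 5 the first chunk is skipped purely on its statistics, so only two of the three chunks get decoded. A reader that exposed only "give me the bytes" would have to decode all three before any filter could run, which is the coupling between format and runtime described above.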