Hacker News | jeffbee's comments

Yeah, just an example of a QoL issue with DuckDB: even though it can glob files in other cases, the way it passes parameters to GDAL means that globs are taken literally instead of expanded. So I can't query a directory with thirty million geojson files. This is not a problem in geopandas because ipython, being a full interactive development environment, allows me to produce the glob any way I choose.

I think this is a fundamental problem with the SQL pattern. You can try to make things just work, but when they fail, then what?


I think this is just because it hasn't been implemented in spatial yet. DuckDB is currently going through a pretty big refactor of the way we glob/scan/union multiple files, with all the recent focus on data lake formats, but my plan is to get to it in spatial after the next release, when that part of the code has stabilized a bit.

> fundamental problem with the SQL pattern.

SQL is a DSL and yes, all Domain Specific Languages will only enable what the engine parsing the DSL supports.

But all SQL databases I'm aware of let you write custom extensions, which are exactly that: they extend the base functionality of the database with new paradigms. E.g. PostGIS enabling geospatial in Postgres, or the extensions that enable fuzzy matching/searching.

And as SQL is pretty much a Turing-complete DSL, there is very little you can't do with it, even if the syntax might not agree with everyone.


You can use DuckDB in ipython to solve the globbing issue. Then you don't have to worry about OOMs with geopandas.
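A minimal sketch of that approach, assuming the workaround is to expand the glob in Python and hand DuckDB explicit paths in batches. `ST_Read` is the spatial extension's GDAL-backed reader; the helper names (`sql_quote`, `batched_paths`, `build_batch_query`, `run_batches`), the batch size, and the use of `UNION ALL BY NAME` to combine files are illustrative assumptions, not a documented recipe.

```python
import glob
from itertools import islice
from typing import Iterator, List


def sql_quote(path: str) -> str:
    """Render a path as a SQL string literal, doubling embedded single quotes."""
    return "'" + path.replace("'", "''") + "'"


def batched_paths(pattern: str, batch_size: int = 1000) -> Iterator[List[str]]:
    """Expand the glob lazily in Python and yield the matches in fixed-size batches."""
    matches = glob.iglob(pattern, recursive=True)
    while True:
        batch = list(islice(matches, batch_size))
        if not batch:
            return
        yield batch


def build_batch_query(paths: List[str]) -> str:
    """Build one query that reads every file in the batch via ST_Read."""
    if not paths:
        raise ValueError("empty batch")
    selects = [f"SELECT * FROM ST_Read({sql_quote(p)})" for p in paths]
    return "\nUNION ALL BY NAME\n".join(selects)


def run_batches(pattern: str, batch_size: int = 1000):
    """Hypothetical driver: needs the third-party duckdb package and its spatial extension."""
    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL spatial")
    con.execute("LOAD spatial")
    for batch in batched_paths(pattern, batch_size):
        # Each batch is materialized separately so memory use stays bounded.
        yield con.execute(build_batch_query(batch)).fetchall()
```

In ipython, something like `for rows in run_batches("data/**/*.geojson"): ...` would then process one batch at a time; the path here is made up.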

Ehh I tried to do some spatial stuff but there just wasn't enough there, or I could not figure out how to use it. Loading spatial information into ipython and fiddling with it is well-traveled and it doesn't seem to me that SQL is an inherently lower hurdle for the user.

Recently some socialist lackwit on Twitter was railing that YIMBYs only lowered prices from $600k to $500k. This person apparently did not understand that the difference - $20k up front and $700/mo for 30 years - is the margin between affordable and not affordable for tens of millions of ordinary people.
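The arithmetic behind those figures can be sketched as follows. The comment does not state a down payment fraction, an interest rate, or whether taxes and insurance are included, so all of these are assumptions: 20% down (which matches $20k on a $100k difference), a fixed-rate 30-year loan, and principal and interest only. The function names are made up for illustration.

```python
def monthly_payment(principal: float, annual_rate: float, years: int = 30) -> float:
    """Standard fixed-rate amortization payment (principal and interest only)."""
    n = years * 12
    if annual_rate == 0:
        return principal / n
    r = annual_rate / 12
    growth = (1 + r) ** n
    return principal * r * growth / (growth - 1)


def price_cut_savings(old_price: float, new_price: float, annual_rate: float,
                      down_fraction: float = 0.20, years: int = 30):
    """Return (down payment saved, monthly payment saved) for a price drop.

    The payment is linear in the principal, so the monthly saving equals the
    payment on the financed part of the price difference.
    """
    delta = old_price - new_price
    down_saved = delta * down_fraction
    monthly_saved = monthly_payment(delta - down_saved, annual_rate, years)
    return down_saved, monthly_saved


if __name__ == "__main__":
    # The rate is an assumed input; the monthly figure moves a lot with it.
    for rate in (0.05, 0.07, 0.09):
        down, monthly = price_cut_savings(600_000, 500_000, rate)
        print(f"rate={rate:.0%} down saved=${down:,.0f} monthly saved=${monthly:,.0f}")
```

The monthly number depends heavily on the assumed rate and on whether property tax and insurance are counted, which this sketch leaves out.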

In case anyone reading thinks otherwise: you can absolutely be a socialist and a YIMBY. Maybe your goal is Vienna style housing. The only thing you need to acknowledge is that cities need 'enough' housing.

Yeah but there's this entire genre of hammer-and-sickle-in-bio online guy who thinks socialism is about ideals, not the material experience of the proletariat. Those guys think we should outlaw home building until after the revolution.

Yeah, I've seen those people. They remind me of this scene

https://youtu.be/tx02tY8ABfA?si=JbSiDMTdkHs5LrOk&t=62

They don't want to take that next small step that will help somebody, they want to change everything all at once.


> when baby boomers sparked a construction surge as they moved out of their childhood homes.

Man, boomers had smarter parents than boomers' children had. When the much more numerous millennials wanted houses, boomers told them to go jump in a lake.


Boomers = The Selfish Generation

Poor stewards.


It does help and is helping. Here's an article published this week in Berkeley about how a modest building boom has stabilized and lowered rents in existing buildings, after rents doubled in the 2010s.

https://www.berkeleyside.org/2025/05/01/berkeley-housing-ren...

Note: I developed the data used in the article.


If you look at the trend for all Bay Area cities over a similar period [1], you'll see Berkeley is pretty much in line with the overall trend.

Now you can argue that's because of greatly increased development, but I don't think that applies to SF (or SJ) and it doesn't match the Berkeley timeline anyway.

[1]: https://www.sfchronicle.com/projects/sf-bay-area-rent-prices...


Yeah, the specific contribution here is tracking a stable cohort of existing housing, not the entire market, to show the effect of new construction on the price of older housing.

If you use k8s QoS levels, "guaranteed" CPU resources will be distinct (via cpusets) from the ones used by the riff-raff. This is a good way to segregate latency-sensitive apps from throughput-oriented stuff where you don't care about latency.
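As a rough sketch of which pods end up on their own CPUs: Kubernetes classes a pod as Guaranteed only when every container sets CPU and memory limits with requests equal to them, and exclusive cores are handed out only for whole-number CPU requests. This assumes the kubelet runs with the static CPU manager policy (`--cpu-manager-policy=static`), which the comment does not state. The helpers below are hypothetical and work on plain dicts shaped like a pod spec, not on a Kubernetes client API; init containers are ignored and memory quantities are compared as strings for simplicity.

```python
def cpu_millis(quantity) -> int:
    """Parse a CPU quantity such as "2", "0.5" or "500m" into millicores."""
    q = str(quantity)
    if q.endswith("m"):
        return int(q[:-1])
    return round(float(q) * 1000)


def qos_class(containers: list) -> str:
    """Classify a pod as Guaranteed, Burstable or BestEffort from its containers."""
    any_set = False
    guaranteed = True
    for c in containers:
        res = c.get("resources", {})
        limits = res.get("limits", {})
        # Requests default to the limits when they are omitted.
        requests = {**limits, **res.get("requests", {})}
        if any(name in requests for name in ("cpu", "memory")):
            any_set = True
        for name in ("cpu", "memory"):
            if name not in limits:
                guaranteed = False
            elif name == "cpu":
                if cpu_millis(requests[name]) != cpu_millis(limits[name]):
                    guaranteed = False
            elif requests[name] != limits[name]:  # simplification: string comparison
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


def exclusive_cpu_containers(containers: list) -> list:
    """Names of containers that would get exclusive cores under the static policy."""
    if qos_class(containers) != "Guaranteed":
        return []
    names = []
    for c in containers:
        millis = cpu_millis(c["resources"]["limits"]["cpu"])
        if millis >= 1000 and millis % 1000 == 0:
            names.append(c["name"])
    return names
```

A Guaranteed pod with a fractional CPU request (say `500m`) still runs in the shared pool under this model, so the integer request is what buys the separation.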

Guaranteed QoS isn’t perfect:

1. Neighbours can be noisy to the other hyperthread on the same CPU. For example, heavy usage of avx-512 and other vectorized instructions can affect a tenant running on the same core but different hyperthread. You can disable hyperthreading, but now you are making the same tradeoff where you are sacrificing efficiency for low tail latencies.

2. There are certain locks in the kernel which can be exhausted by certain behaviour of a single tenant. For example, on kernel 5.15 there was one global kernel lock for cgroup resource accounting. If you have a tenant which is constantly hitting cgroup limits it increases lock contention in the kernel which slows down other tenants on the system which also use the same locks. This particular issue with cgroups accounting has been improved in later kernels.

3. If your latency-sensitive service runs on the same cores which service IRQs, the tail latency can greatly increase when there is heavy IRQ load, for example from high-speed NIC IRQs. You can isolate those CPUs from the pool of CPUs offered to pods, but now you are dedicating 4-8 CPUs to just process interrupts. Ideally you could run the non-guaranteed pods on the CPUs which service IRQs, but that is not supported by Kubernetes.

4. During full node memory pressure, the kernel does not respect memory.min and will reclaim pages of guaranteed QoS workloads.

5. The current implementation of memory QoS does not adjust memory.max of the burstable pod slice, so burstable pods can take up the entire free memory of the kubepods slice, which starves new memory allocations from guaranteed pods.

Don't even get me started on NUMA issues.
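For point 1, a quick way to see whether a pinned workload shares physical cores with anyone else is to read the hyperthread sibling lists the kernel exposes in sysfs. This is a minimal sketch: the `thread_siblings_list` files are standard Linux, but the helper names are made up, and the `pinned` list is assumed to come from wherever you read the container's cpuset.

```python
from pathlib import Path
from typing import Dict, List, Set


def parse_cpu_list(text: str) -> List[int]:
    """Parse a kernel CPU list such as "0-3,8,10-11" into sorted CPU ids."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def sibling_map(sysfs_root: str = "/sys/devices/system/cpu") -> Dict[int, List[int]]:
    """Map each CPU id to the hardware threads that share its physical core."""
    result: Dict[int, List[int]] = {}
    for f in Path(sysfs_root).glob("cpu[0-9]*/topology/thread_siblings_list"):
        cpu = int(f.parent.parent.name[3:])  # directory name looks like "cpu12"
        result[cpu] = parse_cpu_list(f.read_text())
    return result


def shared_cores(pinned: List[int], siblings: Dict[int, List[int]]) -> Dict[int, List[int]]:
    """For each pinned CPU, list sibling threads that are NOT in the pinned set."""
    pinned_set = set(pinned)
    exposed: Dict[int, List[int]] = {}
    for cpu in pinned:
        others = [s for s in siblings.get(cpu, []) if s not in pinned_set]
        if others:
            exposed[cpu] = others
    return exposed
```

An empty result from `shared_cores` means every sibling thread of the pinned CPUs is also pinned to the same workload, so a neighbour cannot land on the other hyperthread.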


There isn't any way on Linux to deal with processes that create dirty pages. It is folly to try. The only way to deal is to put I/O stuff on a whole box/node by itself, and outlaw block I/O on all other nodes.

10000000%

The people who are out here saying "totalitarianism was the inevitable consequence of federal standards for feces in bologna" are exactly the problem in America today.


Funny because PBS and NPR bend over backwards to coddle Trump and his circle and to not seem oppositional. NPR national daily shows in particular cover Trump's shenanigans like they are reviewing the new BTS singles.

They too get to learn that appeasing bullies does not work. At best you're sending them away for today, but they'll be back for your lunch money tomorrow.

Yep, I stopped listening in the run up to the 2016 election as they made clear they were going to make space for both siding misinformation, lies, and the emotions of those who were fully bought into the propaganda machine.

I find it a shell of its former self.


The thing that made me want to smash my car radio was Nina Totenberg reporting on Supreme Court cases. Just like so much of the rest of the American political center, Nina Totenberg is twenty years past her expiration date in ability to rise to the moment.

> stability control doesn't control the engine

Stability control is tied to power in all modern systems.


It is also the case that Waymo will be dramatically better than all humans on ice because it is going to take the aviation approach and stay in the depot, rather than fooling itself into believing it is competent at driving on ice.
