
>LINQ is an extremely common way to write code in C#

It is extremely uncommon in performance contexts. It is actively discouraged and removed when writing performant C# code.

It is incredibly common in your "run of the mill" enterprise apps, or contexts where performance can slow down a bit for the sake of programmer happiness.




It isn’t uncommon when you know what it is doing. Wholesale removal or discouragement of LINQ is a sign of fake cargo-cult performance “optimization”.

It’s perfectly fine to use if you learn about how it works and how to use it properly.
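A concrete example of "how it works": LINQ queries are lazily evaluated, and one of the most common mistakes is enumerating the same query repeatedly without realizing each enumeration re-runs the whole pipeline. A minimal sketch:

```csharp
using System;
using System.Linq;

class DeferredExecutionDemo
{
    static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5 };

        // 'evens' is a lazy query; nothing executes until it is enumerated.
        var evens = numbers.Where(n => n % 2 == 0);

        Console.WriteLine(evens.Count()); // runs the filter once
        Console.WriteLine(evens.Sum());   // runs the filter again

        // Materializing once avoids paying the cost on every use.
        int[] cached = evens.ToArray();
        Console.WriteLine(cached.Sum());
    }
}
```

Knowing when to materialize with ToArray()/ToList() versus staying lazy is most of what "using it properly" means in practice.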


LINQ is going to add overhead, regardless of "properly" using it or "cargo-culting" things; save the platitudes for the Monday zoom meeting.

LINQ adding overhead is a _technical reality_; it's how it works, and that is fine. It's a fine tool in many different contexts, but when we talk about performant code the context is obviously one in which every cycle matters.

And those of us with enough experience know that LINQ performance and implementation details vary over time in the runtime, and those shifts aren't always positive.

So when writing code where performance is fundamental to the success of the application, avoid LINQ since it WILL add overhead and it will remove implementation control from your team. It is a risk without much benefit when you're in the performance arena. That doesn't mean it's not useful in many other contexts.
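To make the overhead concrete, here is a rough sketch of the same operation both ways. The LINQ version typically pays for an enumerator allocation and a delegate call per element; the loop version does neither (exact costs vary by runtime version, which is part of the point about implementation control):

```csharp
using System.Linq;

static class SumEvens
{
    // LINQ version: allocates an iterator and invokes the
    // predicate delegate once per element.
    public static int WithLinq(int[] numbers) =>
        numbers.Where(n => n % 2 == 0).Sum();

    // Hand-written version: no allocations, no delegate calls,
    // and the JIT can see the whole loop body.
    public static int WithLoop(int[] numbers)
    {
        int total = 0;
        for (int i = 0; i < numbers.Length; i++)
            if (numbers[i] % 2 == 0)
                total += numbers[i];
        return total;
    }
}
```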


There are cases where the straightforward LINQ code would be a lot faster than a lower-level alternative. For example, when the code can be vectorized to use AVX instructions, which is the case for quite a few LINQ methods. A straightforward non-LINQ version of the code would likely be slower, as most developers would not, or could not, write the low-level AVX version.
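As an illustration (assuming .NET 7 or later, where several of these overloads were vectorized): the parameterless Sum() over an int[] can take the SIMD path, while inserting even a trivial Select forces the scalar, enumerator-based path.

```csharp
using System;
using System.Linq;

class VectorizedSumDemo
{
    static void Main()
    {
        int[] data = Enumerable.Range(1, 10_000).ToArray();

        // Recognized as an array internally; eligible for the SIMD path.
        int fast = data.Sum();

        // The Select hides the array behind IEnumerable<int>,
        // so this falls back to the scalar implementation.
        int slow = data.Select(x => x).Sum();

        Console.WriteLine(fast == slow); // same result, different codepath
    }
}
```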

I'd certainly be careful about LINQ in certain performance-sensitive code, e.g. about creating unnecessary copies of the data and allocating too much. But I would not trust myself without measuring to really know whether it actually makes a difference or if my "optimized" code might be even slower.


A lot of LINQ was also optimized and improved with the move to .NET Core (now just .NET). It's definitely important to actually profile the code rather than just assume LINQ is slow or less performant. Most of the time, unless the developer has added unneeded code (such as calling .AsEnumerable, or anything that prematurely evaluates the entire collection), the difference between LINQ and standard iterator-based code will be nearly non-existent, and in some of the cases you mentioned LINQ has been optimized beyond what a developer would do by hand.
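For actually profiling this, BenchmarkDotNet (a third-party package, not part of the runtime) is the usual tool. A minimal sketch of a comparison, assuming the package is installed:

```csharp
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class LinqVsLoop
{
    private readonly int[] _data = Enumerable.Range(0, 10_000).ToArray();

    [Benchmark]
    public int Linq() => _data.Where(n => n % 2 == 0).Sum();

    [Benchmark]
    public int Loop()
    {
        int total = 0;
        foreach (int n in _data)
            if (n % 2 == 0) total += n;
        return total;
    }
}

class Program
{
    static void Main() => BenchmarkRunner.Run<LinqVsLoop>();
}
```

Measuring on your actual data sizes and runtime version beats any general rule here.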


> AVX instructions, which is implemented for quite a few LINQ methods

Are you sure? Any examples of such methods? And does AVX actually help?

I don’t think that’s possible because IMO AVX and other SIMD can only help for dense inputs. The natural C# type for those is ReadOnlySpan<T>, but ReadOnlySpan<T> doesn’t implement IEnumerable<T> and is therefore incompatible with LINQ.

There’s even an alternative LINQ implementation to work around this, https://github.com/NetFabric/NetFabric.Hyperlinq, but that’s a third-party library most people aren’t using.



Interesting, I had never heard about that. It was merged to master in February 2022; I wonder which release version includes the changes?

Still, the support seems very limited. They simply probe the argument type for arrays and lists. Any other IEnumerable<T> will return false from TryGetSpan, which falls back to the legacy scalar implementation.
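For readers unfamiliar with it, the probe looks roughly like this (a simplified sketch; the real implementation lives inside System.Linq and differs in detail):

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

static class SpanProbe
{
    // Only T[] and List<T> expose contiguous memory, so only those
    // two types can feed the SIMD codepath; everything else reports
    // failure and gets the scalar enumerator-based treatment.
    public static bool TryGetSpan<T>(IEnumerable<T> source, out ReadOnlySpan<T> span)
    {
        if (source is T[] array)
        {
            span = array;
            return true;
        }
        if (source is List<T> list)
        {
            span = CollectionsMarshal.AsSpan(list);
            return true;
        }
        span = default;
        return false; // any other IEnumerable<T> → legacy scalar path
    }
}
```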


Pretty sure that's .NET 7-only stuff. I vaguely remember a blog post about it: https://devblogs.microsoft.com/dotnet/performance_improvemen... (I think this is the one?)

It's faster for bigger arrays/lists, but for smaller ones it barely makes a difference; as far as I remember, even the LINQ vs. non-LINQ gap is basically just noise there.


Algorithms that work on non-contiguous or synthesized data are not subject to vectorization in any language, aside from select cases where LLVM is able to inline and auto-vectorize loops in Rust for certain simple cases of map->collect.

What is your argument then?


My point is the support is very limited, only works for a few trivial use cases, like sum/min/max/average of arrays of a few selected types.

I believe it’s technically possible to vectorize more complicated stuff in C#; it’s just that the runtime library is not doing that. For an example, look at how the Eigen C++ library https://eigen.tuxfamily.org/index.php?title=Main_Page does its math. Under the hood, it wraps inputs into classes which supply SIMD registers, then does math on those registers. Eigen does that at compile time with template metaprogramming. A hypothetical C# implementation could do similar things using generics and/or runtime code generation. LINQ from the standard library was never designed for high-performance compute, but I think it might be possible to design a similar API for that.


How is that relevant to the vast majority of the code targeted by LINQ?

The niche scenario you have outlined is partially covered by a recent System.Numerics.Tensors package update (even though I believe it would have been best if there was a community-maintained package with comparable quality for a variety of reasons).

The goal of LINQ itself is to offer optimal codepaths when it can within the constraints of the current design (naturally, you could improve it significantly if not for backwards compatibility with the previous 15 or so years of .NET codebases). The argument that it's not good because it's not the tool to do BLAS is just nonsensical.

There is, however, an IL optimizer that can further vectorize certain common patterns and rewrite LINQ calls into open-coded loops: https://github.com/dubiousconst282/DistIL


> How is that relevant to the vast majority of the code targeted by LINQ?

The people I responded to were discussing applicability of LINQ. I think very fast sum of List<int> collections doesn’t compensate for suboptimal performance of pretty much everything else.

For 80% of problems that “suboptimal” is still fast enough for the job, but for the other 20% it’s important. Using the same C# language, it’s often possible to outperform LINQ by a large factor, using loops, SIMD intrinsics, and minimizing GC allocations.
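As a sketch of what that looks like without dropping to raw intrinsics: System.Numerics.Vector<T> gives portable SIMD that works across ISA extensions (Vector.Sum requires .NET 6 or later):

```csharp
using System;
using System.Numerics;

static class SimdSum
{
    // Processes Vector<int>.Count lanes per iteration (e.g. 8 ints
    // with AVX2), then handles the leftover scalar tail.
    public static int Sum(ReadOnlySpan<int> values)
    {
        var acc = Vector<int>.Zero;
        int i = 0;
        for (; i <= values.Length - Vector<int>.Count; i += Vector<int>.Count)
            acc += new Vector<int>(values.Slice(i));

        int total = Vector.Sum(acc); // horizontal add of the lanes
        for (; i < values.Length; i++)
            total += values[i];
        return total;
    }
}
```

Combined with spans over pooled or stack-allocated buffers to avoid GC pressure, this is the usual shape of the "outperform LINQ by a large factor" code.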

> partially covered by a recent System.Numerics.Tensors package

They don’t generate code at runtime; they treat C# as a slower and safer C. I’m pretty sure the higher-level parts of the runtime allow more advanced stuff, similar to expression templates in Eigen, but better, because runtime codegen could account for different ISA extensions and even different L1/L2 cache sizes.


I see, my bad, should not have responded in the first place.



