>>> A top U.S. Justice Department official, Makan Delrahim, praised the deal as being carefully sculpted to avoid antitrust concerns, signaling federal regulators were unlikely to block it. That stance is all the more notable given his continued efforts to kill on antitrust grounds the AT&T acquisition of Time Warner, a combination of a telecoms giant and an entertainment powerhouse that have little overlapping commerce. The Justice Department is appealing a judge's ruling that swept aside all objections to the AT&T/Time Warner merger.
I'm curious as to how he's using vim + ensime. I've been trying off and on for the last two years to figure out a good workflow using ensime, but I typically end up just turning it off after a few days. EnType only works for me when the type is already obvious. EnImport seems kind of useless. After these two, I just give up, tbh.
Still, I won't give up my text editor :) Any emacs folk like to comment on its usefulness? It seems more fully implemented there.
Unfortunately it doesn't. Even though Scala and Java output the same kind of bytecode, these languages have different language specifications. We currently only support what's in the JLS.
This is really an overly simplistic way to code. I urge people to think more deeply up front about performance.
Know your performance goals going in and code accordingly. If you require 100-microsecond average latency and you coded it in Node.js, step 3 will be a rewrite.
For every single line of code I write, I can tell you my performance goals. If it is a simple crud screen used by one person, the goal may be "meh, document.ready fires within one second when viewed over 100ms of browser lag". Backend trading code would have different goals.
Yeah, when starting a new project I tend to write a POC, throw it out, make a ton of notes on the problem and possible solutions, write a second POC, throw it out again, refine my notes, make sure I really understand the problem and that my solution is actually working, and if everything looks good, write it "for real".
The management question, therefore, is not whether to build a pilot system and throw it away. You will do that. […] Hence plan to throw one away; you will, anyhow.
Then the problem is an academic endeavor and hopefully not something to be sold to a customer. If it is open source, then it is likely to fall into disuse like 99.9% of open-source endeavors. If it is in the 0.1% of open-source projects, then the community will put in the effort for a rewrite, but it could be painful, like Python 2/3 or Perl 6/7.
I usually iterate. Building upon what was already built.
All too often a 'prototype' becomes the finished product without the intervening iterations. If that was the plan from the start, it works more smoothly.
Exactly. Completely ignoring performance considerations until after you got everything working right potentially means a substantial or complete rewrite of much of the code.
My posit was that 80% of the software an engineer will write will not require optimization, and based on the responses I've been getting I should have used "rule of thumb" rather than "mantra". My intention was to warn the people who read this and think, "oh, I've got to do all of this for every piece of software I write," which will undoubtedly lead to overly complex code when a simple solution would have worked just as well.
"Know your performance goals going in and code accordingly."
I think this is the key sentence here, and it's worth repeating. Know your performance goals before your fingers touch the keyboard.
> My intention was to warn the people who read this and think, "oh, I've got to do all of this for every piece of software I write," which will undoubtedly lead to overly complex code when a simple solution would have worked just as well.
No, that is not true at all.
I am writing a crud app to be used by one person, some manager of a widget factory. I say "my perf goal is to have page loads in 10 seconds or less". How will that make my code more complex? If anything it will make my code simpler, as I can relax all kinds of constraints like "making 72 database queries per page is generally bad".
I believe the GP's point was that if you know that your performance goals are very relaxed, then you can make the code appropriately simple from the beginning. Conversely, if your performance goals are stringent, then you can take an appropriate approach from the beginning.
I disagree completely with your 80% number. I have never worked on a project that had zero performance work.
Performance is a first-class design constraint, just like development time, budget, and functionality; if you don't treat it as such from the beginning, you are asking for trouble.
But different parts of a project usually have different performance goals. Typically just the important/frequently used parts need to be optimized. Infrequently used things (maybe setup, admin interface, options dialog) can usually just be "good enough."
So one could say that within a project, 80% of the code isn't performance critical.
No kidding. Who ever said anything about not knowing what the requirements are?
The point you were disagreeing with here was "80% of the software an engineer will write will not require optimization", i.e., 80% of any given system will hit its performance requirements naively. So... why are you still arguing?
Because blind adherence to the "Make it work then make it fast" advice has been the bane of my career.
Generally speaking, I have not found it to be true that you can make something fast if you didn't think about performance first. If you thought about it, and came to the decision "it will be fast enough no matter what we do" bully for you, but for me that happens way less than 80% of the time.
That's all well and good, but still "make it work" should always come before "make it fast". For example, the code I'm writing right now in my other window is terribly inefficient, quite naive, and may very well have to be rewritten, but I don't care at the moment. All I care about right now is: is what I'm trying to do actually practically possible in the general case? If so, what approach will give the best (or good enough) results? And finally, if I manage to do what I'm trying to do, will this particular approach offer a better solution to the higher-level problem I'm dealing with than my other approach?
Once I've answered "yes" to all those questions then I can think about heading back and trying to make it fast. Writing really high performance code that doesn't solve the problem you have or give you the results that you need is, of course, a waste of time.
You misunderstand: "make it work" includes performance goals. The difference is that if you consider them up front, you include them explicitly rather than implicitly.
You'd never say that you've made the code "work" if you knew it would take 3x your budget to get there. Similarly with performance: it doesn't "work" if it doesn't meet the perf goals of the project, no matter how relaxed they might be.
I guess we're arguing semantics at this point. For me "make it work" means that I'm able to write a piece of code that takes the input I want and returns the output I want, eventually; it doesn't matter if it takes a week instead of a minute to run. Before I get to that point, any optimization I do is probably not the best use of my time. Also, if I can't get to that point, it means I really don't understand the underlying problem and I'm probably not in a position to start reasoning about making it fast.
Generally speaking I find it much easier to take slow, working code and make it fast, as opposed to taking fast, broken code and making it work.
Generally speaking I don't. The only way I've ever been able to be successful on the performance front is to conclude that slow code is broken code, not working code that just hasn't been made fast yet.
Backing into acceptable performance after the fact just doesn't work for the problem domains I work in (which on first blush have not been exclusively performance based).
Even better is: Know your performance goals going in and choose your tools and approaches accordingly.
You still mostly want to follow the rough priority above. You absolutely may prototype to convince yourself a performance goal can be met early on, but if the goal is high performance code you are still far better off making the first pass for correctness. Skipping this step often leads to highly performant code that is wrong, and is a pain in the ass to debug.
The "make it work, then make it right, then make it fast" mantra is both overly simplistic and deeply true.
How do you know the speed expectations? Do you always talk concrete numbers with the stakeholders before each change you make, or just use reasonable rules of thumb you determine?
Not before each change, but before any major new initiative or refactor. Having those numbers up front is the only way to make appropriate trade-offs.
Having this conversation with stakeholders often educates them on the costs of performance as well. Getting 100% of responses sub 200ms is frequently orders of magnitude more expensive than getting 99% of them there, and stakeholders usually get that fast when you show them budget info.
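To make the 99%-vs-100% conversation concrete, here's a toy sketch (mine, with synthetic data, not anything from the parent) that pulls p50/p99/max out of a batch of latency samples; a handful of outliers barely move p99 but completely own the max, which is exactly where the extra expense hides:

    import java.util.Arrays;
    import java.util.Random;

    // Toy percentile calculation over synthetic latency samples.
    public class LatencyPercentiles {
        public static void main(String[] args) {
            Random rng = new Random(42);
            long[] samplesMs = new long[10_000];
            for (int i = 0; i < samplesMs.length; i++) {
                // Mostly 20-119 ms responses, with roughly 1-in-1000 slow outliers.
                samplesMs[i] = 20 + rng.nextInt(100) + (rng.nextInt(1000) == 0 ? 2_000 : 0);
            }
            Arrays.sort(samplesMs);
            System.out.println("p50 = " + percentile(samplesMs, 50) + " ms");
            System.out.println("p99 = " + percentile(samplesMs, 99) + " ms");
            System.out.println("max = " + samplesMs[samplesMs.length - 1] + " ms");
        }

        // Nearest-rank percentile over a sorted array.
        static long percentile(long[] sorted, int p) {
            int idx = (int) Math.ceil(p / 100.0 * sorted.length) - 1;
            return sorted[Math.max(idx, 0)];
        }
    }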
Replying to the second paragraph: there is often real value in maintaining a strong upper bound on latency, especially in distributed real-time systems (which most real systems are, anyway).
E.g. (99% sub-200ms and 1% _unbounded_) vs (80% sub-200ms and _always_ sub-500ms) means 1% of potentially unanticipated crashes (a hell to debug and explain to customers!) vs a highly reliable system and happy customers.
For sure, that's the definition of a real-time system, after all. But the conversation about what the long tails do to the "normal" path, and what the costs are (both in money and in performance on the "normal" path), is quite simply something you can't back into.
Maybe I didn't understand you correctly, but you can at least have a "return error on timeout" and process that with predictable logic. Or maybe you do have an architecture where any individual tardy request absolutely cannot impact others. After all, I come from stream-processing systems where there are only a few "users" with constant streams of requests, and these users are interdependent (think control modules in a self-driving car).
What I'm suggesting is that the decision about what you do in the case of long-tail performance problems is not something you can back into.
If you are going to have timeouts with logic, that has downstream implications. If you are going to have truly independent event loops, that is a fundamental architectural decision.
None of those things fit the "make it work, then make it fast" model. You literally have to design that into the system from jump street, as it is part of the definition of "works".
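For what it's worth, here's a minimal Java sketch of the "return an error on timeout" idea being discussed, using CompletableFuture.completeOnTimeout from Java 9+ (the names and numbers are made up, and real code would also need to decide what to do with the still-running call):

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.TimeUnit;

    public class BoundedLatencyDemo {
        // Stand-in for a remote call whose latency has a long tail.
        static String slowLookup(String key) {
            try {
                Thread.sleep((long) (Math.random() * 1000)); // 0-1000 ms
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "value-for-" + key;
        }

        public static void main(String[] args) {
            // Bound the caller's wait at 200 ms: past that, hand back a sentinel
            // the downstream logic can handle predictably instead of blocking.
            String result = CompletableFuture
                    .supplyAsync(() -> slowLookup("order-42"))
                    .completeOnTimeout("TIMED_OUT", 200, TimeUnit.MILLISECONDS)
                    .join();
            System.out.println(result);
        }
    }

And this is exactly the point about designing it in: every consumer of the result now has to handle the TIMED_OUT case.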
Except I also see crud apps that take 8 seconds to load, and you can't just fix it with a cache. EVERY app has performance considerations. Literally every app. Some may be very loose, but then spell them out and use that to think about it.
If you're following basic REST guidelines regarding idempotency then, yes, you can "just fix it with a cache." That's the whole point of the guideline. Updates can still take a while, but you can fix that if you need to with a backend jobs server.
Performance should always be in your mind somewhere, but it doesn't always need to be at the forefront.
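As a rough illustration of the "just fix it with a cache" point (my own sketch, not the parent's code; the endpoint, TTL, and ETag are made up, and it uses the classic javax.servlet API, jakarta.servlet on newer containers), an idempotent GET can simply advertise its cacheability:

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical read-only endpoint. Because GET is idempotent, browsers and
    // intermediaries may cache the response, so the slow query behind it runs
    // once per TTL instead of on every page load.
    public class WidgetReportServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            resp.setHeader("Cache-Control", "public, max-age=60"); // cache for 60 s
            resp.setHeader("ETag", "\"report-v1\"");               // allow revalidation
            resp.setContentType("application/json");
            resp.getWriter().write(loadReportJson());              // the slow part
        }

        private String loadReportJson() {
            return "{\"widgets\": 42}"; // stand-in for the expensive query
        }
    }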
Unless it's taking 8 seconds to load because your user is on 2G. Then the only way you can avoid that hit is by not making the requests in the first place.
What you'll often find, however, is that in the last step ("make it fast") you have limited room to maneuver, because of the stuff you did in the first 2 steps. If you really need high performance, you need to design for high performance, not just leave it as an afterthought.
Usually it's almost impossible to predict where your design flaws will be until you actually use the thing in production, because you make a lot of assumptions and some of them will inevitably be false. So "make it fast" is mostly about changing the design.
Because they stopped at step one (make it work) or two (make it right). Or at the first two of my three (borrowed from Joe Armstrong): make it work, make it beautiful.
It's not simply that they failed to plan for performance; it's that after making it work, they never measured and removed obvious bottlenecks.
Most pieces of software don't need high performance. Drawing some widgets on screen just isn't that demanding. But it does require you to go back afterwards and remove the places where you introduced inefficiencies. You don't need to code in C and optimize against cache misses for that; you just need to take time after things work to make them not suck, and that's something a lot of software development doesn't take the time to do.
Most client software doesn't need performance. Drawing widgets is a client-side thing. Scaling is really the hot point that uncovers bottlenecks. Servers have to scale. And nobody has time to plan/code very far for scaling when they haven't succeeded yet.
I think it's perfectly normal to write slow, non-scalable code as a proof of concept. Then continue to attack bottlenecks as you grow (if you grow). It's a lucky startup that has to deal with performance; they can afford to dedicate a couple of smart developers just to that issue.
Michael Dell said every time your company doubles in size, you have to reinvent your processes. True for software too.
Yeah, I just gave a random example of something that is very common in a generic software application; at some point, you draw some basic GUIs. Those can lead to performance bottlenecks as well (loading and displaying 10 data points in your test bed: easy. Loading and displaying 1 million data points in a production situation: a little harder). There's also basic network operations, some DB/disk writes, actual CPU usage for whatever processing has to occur, etc. Any of those may need to be super performant for some apps, some will be front end, some backend, but for many applications and uses it's not worth worrying about at first.
That said, I'd draw a distinction between vertical scaling and horizontal scaling. The former you should address as needed, as the gains are comparatively limited, and the bottlenecks are unknown (you think you're CPU bound; whoops, nope, I/O. Or whatever); the latter should be designed for if there's a chance you'll need it. Because oftentimes, things that are merely decisions early on (no difference in amount of work) can lead to savings of months of effort and churn down the line if you go for something that scales. Decisions like deciding what data needs to have strong consistency, versus what data can be eventually consistent (and choosing data stores based on that), trying to avoid shared state, thinking about "what happens if there are more than one of these?" and designing/implementing with that in mind (even if some aspects are super hard and you punt on them, there's plenty of low hanging fruit that you can address with minimal effort early, rather than massive later on).
Because there is no user story for "make it performant", and there is little thought put into dynamic design vs static design. If money comes in with crappy performance and it is hard to predict how much more money will come if the developer improved performance, then it won't be a business priority.
I've seen more performance problems caused by people not having clean code than I ever have from people not thinking about performance from the get go.
I've also seen plenty of performance issues ironically caused by performance hacks wedged in early on.
You will often find clean code and fast code converge on very similar places. It is often a false dichotomy to think code needs to be either clean OR fast.
Now, this does break down: if you need to get to the point where you are bit-twiddling, it is not going to be as clean as using something higher level. But you can often put the nasty parts in a static method somewhere and still have the code be very easy to read.
This reminds me of the post about the JVM code cache a few days ago: if they had left the JVM to manage the code cache by itself, they would not have ended up in the situation where they had to spend time figuring out that their own optimization was the cause of the problem.
Sure; but a working version is still really really useful to compare against (e.g. build up a suite of test cases) even if you have to redesign large parts for performance later.
My personal approach is to do some order-of-magnitude performance testing as early as possible, to validate that the approach chosen to solve a particular problem is at least tenable.
If you ignore performance completely until late in the project, you can paint yourself into a corner. This includes cases like knowing that performance is 50x slower than will be acceptable throughout development, but saying "we can add X later for an easy performance win". If you don't actually test that X gets you within reach of your performance target at an early stage, you can end up with a fully built system that is unusable.
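Something like the following is usually enough for that early check. It's a throwaway sketch (made-up names, deliberately not a proper microbenchmark) that just answers "are we within an order of magnitude of the target on production-sized input?":

    import java.util.ArrayList;
    import java.util.List;

    public class OrderOfMagnitudeCheck {
        public static void main(String[] args) {
            int n = 1_000_000;                        // expected production size
            List<Integer> input = new ArrayList<>(n);
            for (int i = 0; i < n; i++) input.add(i);

            long start = System.nanoTime();
            long result = candidateApproach(input);   // the approach under test
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("n=%d result=%d took %d ms%n", n, result, elapsedMs);
            // If the target is 50 ms and this takes 5,000 ms, no "easy win" later
            // is going to close a 100x gap; rethink the approach now.
        }

        // Stand-in for the real algorithm being evaluated.
        static long candidateApproach(List<Integer> xs) {
            long total = 0;
            for (int x : xs) total += x;
            return total;
        }
    }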
One should put performance considerations under "Make it Work", but that is probably obvious. The definition of "it works" (or eventually "it's correct") should include certain latency or throughput requirements, which sometimes can't be left for the "Make it Fast" phase.
On the one hand, I understand that most programs don't need to be especially fast. But on the other, it leads to such a waste of time for the users. Where it really matters we usually see some kind of rewrite, or a new program designed to be fast, and it can take a significant slice of the pie. At the same time there is also a case for ease of use: ease of use can make even slow programs not only feel fast, but also shorten the time from the decision to install/run the program to achieving the user's goal.
There is no silver bullet, but there are a few (or many?) rules of thumb ;)
As with any engineering problem, there is usually a trade-off involved. Premature optimization is bad but software does need to be fast enough to start with and flexible enough to be improved later.
You don't want to code yourself into a corner by accident. It's fine to knowingly take on technical debt, as long as you have a plan to fix it in the future. Even if this is never required.
I always try to take a pragmatic approach, as things are usually not as binary as these simple rules of thumb assume. The real world is typically very nuanced, which is why engineering can sometimes seem like more of an art than a science, and why experience is so valuable.
Shameless plug - I hope this attitude comes across in my book that focuses on web application performance issues: "ASP.NET Core 1.0 High Performance" (https://unop.uk/book/).
Yes and no; we all know the Knuth quote, etc., but there are a lot of design decisions you have to make up front (language, database, server, framework, etc.) that you can't change later and which will set the floor and the ceiling for what your performance looks like.
For instance, if you're working in a resource-constrained environment, garbage-collected vs. not garbage-collected is a big decision. Or if you're working on a web app, how you lay out your database tables or their NoSQL equivalents is going to have a huge effect, and is much harder to change later.
There's a huge difference between premature optimization vs making decisions that will have performance consequences down the road. If you wait till the end of a project/release cycle to think about performance, the amount you can do about it will usually be disappointing.
Edge cases. When something "works" it handles the requirements of your customers in the normal case, but may misbehave in edge cases. When it is right it won't.
If this is your mantra, you aren't really working on code that requires performance. If you were, "make it fast" and "make it right" would be the same thing.
* If you have transactions, keep your transactions to a single machine/instance/db as much as possible. Multi-machine or software transactions are the LAST solution you should try.
* Pay attention to payload sizes. Make sure you don't come close to saturating the network, which leads to weird "app is slow" problems.
* Design for testability and diagnosability of production systems. If this is Java, use JMX extensions EVERYWHERE for EVERYTHING (see the sketch further down).
* Time (and timing) is your enemy. Make it your friend.
EDIT: Side note: IMO, JMX extensions are one of the most under-appreciated things that Java and JVM devs have but keep forgetting about, and they're so powerful.
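For anyone who hasn't touched them, here is a bare-bones sketch of exposing an application counter over JMX with the platform MBean server (my own illustration with made-up names; two files shown back to back, because a standard MBean interface must be public and follow the FooMBean naming convention):

    // RequestStatsMBean.java
    public interface RequestStatsMBean {
        long getRequestCount();
    }

    // RequestStats.java
    import java.lang.management.ManagementFactory;
    import java.util.concurrent.atomic.AtomicLong;
    import javax.management.ObjectName;

    public class RequestStats implements RequestStatsMBean {
        private final AtomicLong requests = new AtomicLong();

        public void markRequest() { requests.incrementAndGet(); }

        @Override
        public long getRequestCount() { return requests.get(); }

        public static void main(String[] args) throws Exception {
            RequestStats stats = new RequestStats();
            // Visible in jconsole / VisualVM under this ObjectName.
            ManagementFactory.getPlatformMBeanServer()
                    .registerMBean(stats, new ObjectName("com.example.app:type=RequestStats"));
            for (int i = 0; i < 1000; i++) stats.markRequest(); // simulate traffic
            Thread.sleep(Long.MAX_VALUE);                       // keep the JVM up to attach
        }
    }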
> IMO, JMX extensions are one of the most under-appreciated things that Java and JVM devs have but keep forgetting about, and they're so powerful.
This is an excellent point which I cannot agree with more. When doing distributed systems on JVMs, I almost always reach for the excellent Metrics[0] library. It provides substantial functionality "out of the box" (gauges, timers, histograms, etc.) as well as exposing each metric to JMX. It also integrates with external measuring servers, such as Graphite[1], though that's not germane to this post.
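For reference, basic Metrics usage looks roughly like this. This is a from-memory sketch rather than anyone's production code, and note that in Metrics 4.x the JmxReporter lives in the separate metrics-jmx artifact:

    import com.codahale.metrics.MetricRegistry;
    import com.codahale.metrics.Timer;
    import com.codahale.metrics.jmx.JmxReporter; // metrics-jmx in 4.x

    public class MetricsDemo {
        private static final MetricRegistry registry = new MetricRegistry();
        private static final Timer requestTimer = registry.timer("requests");

        public static void main(String[] args) throws Exception {
            // Expose every metric in the registry as a JMX MBean.
            JmxReporter.forRegistry(registry).build().start();

            for (int i = 0; i < 100; i++) {
                try (Timer.Context ignored = requestTimer.time()) {
                    handleRequest(); // the timed work
                }
            }
            Thread.sleep(Long.MAX_VALUE); // keep the JVM up so you can attach jconsole
        }

        static void handleRequest() throws InterruptedException {
            Thread.sleep(10); // stand-in for real work
        }
    }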
I'd hesitate to call Metrics 'excellent'. It does quite a bit of processing on numbers before it emits them - averaging over timespans and so on - so if you're feeding them into something like Graphite which does further processing, you're averaging twice, and getting the wrong numbers.
If you're just going to print the numbers out and look at them, it's very handy. But if you want to handle them further, which you really do, I'd avoid it. Put individual measurements in your logs, then scrape them out with Logstash or something and send them to Graphite for aggregation.
Use of Metrics functionality is going to vary based on need. So if a system is using something such as Graphite, then the measurements sent should reflect that. This is independent of a particular library IMHO.
So when you say:
> ... if you're feeding them into something like Graphite which does further processing, you're averaging twice, and getting the wrong numbers.
This would be an issue with sending Histogram, Meter, and/or Timer values, but I'd categorize that as a design defect. Using Metrics to send Gauge and Counter data to a Graphite-esque system shouldn't be a problem at all. For systems which do not have a metrics aggregator or when some need to be exposed to external agents (such as client systems), using the types provided by Metrics can be very useful.
As for using log files to "scrape out" system metrics, I realize this is a common practice, yet it is one I am fundamentally against. What I've observed when taking the "put it in the logs and we'll parse it out later" tactic is that a system ends up with sustained performance degradation due to high log output, as well as ops having to incorporate yet another system just to deal with profuse logging.
Treating system metrics as a first-class concern has shown itself to be very beneficial in my experience. Persisting them is a desirable thing, agreed, and a feature point fairly easily met when metrics are addressed separately from logging.
I agree. I'm a fan of the pull model for metrics, where the app logs very granular numbers and whoever wants metrics at whatever granularity pulls them from those logs into Splunk, a TSDB, etc. Let the app be simple in emitting metrics.
While I agree that deeply nested calls are bad, I've also had the pleasure of inheriting an application with several multi-thousand-line functions, and those are not fun either.
Couldn't agree with you more. I get the impression that a lot of people in this thread have never worked on behemoth, million-line enterprise code bases. Cognitively speaking, 4 nested thousand-line function calls are just as bad as 15 nested 10-line function calls.
> In code as in life, all in moderation.
Beautifully said. Both methods (procedural vs. object oriented) give you different ways to shoot yourself in the foot. Instead of picking one or the other, it's more important to manage the scope at which your project grows. You have to avoid falling into the traps of each style.
The worst example of this I ever saw was an old Mac OS helpdesk application (circa 1995) that had a 14,000-line main event loop, and the entire application was a single 29,000-line C file. Just typing at the bottom of the file was agonizing. We tore the whole thing apart and put it back together as a real app with include files and everything.
Sometimes that is just inherent in the problem. If a few thousand lines are actually needed to solve the problem, it will be somewhat hard to deal with whether they are all in one big function or broken up into many functions. One option may be marginally less annoying than the other, depending on what you are doing, or on preference perhaps.
But sometimes it is just a "Such is life" situation.
I think it depends on whether the problem lends itself to abstraction or not. If I have a problem that requires 1000 lines to solve several large systems of linear equations, I'm absolutely going to be separating that out. On the other hand, if I have 1000 lines of "if in country A, apply this tax rate", then I'm not going to separate it, because there is no underlying abstraction to use.
That's actually one of the more reasonable uses for OO: have an interface that defines 'apply_tax_rate', and a bunch of implementations for each country, so you only have a single call in your function: instance.apply_tax_rate(amount). Of course, that leads to your actual implementing code being scattered across different classes, and that can be a pain, too.
In a functional language, I'd prefer each being its own function, with the case matching in a single, standalone file that returns the appropriate function based on whatever. That way you still get just apply_tax_rate(type, amount) in your main calculation function.
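A rough Java sketch of that dispatch idea (entirely hypothetical names and rates): one small interface, one tiny policy per country, and a lookup table, so the calling code stays a single call per sale:

    import java.math.BigDecimal;
    import java.util.Map;

    interface TaxPolicy {
        BigDecimal applyTaxRate(BigDecimal amount);
    }

    class TaxCalculator {
        // Each country's rule lives in one place instead of a giant if/else chain.
        private static final TaxPolicy COUNTRY_A = amount -> amount.multiply(new BigDecimal("1.20"));  // 20% VAT
        private static final TaxPolicy COUNTRY_B = amount -> amount.multiply(new BigDecimal("1.075")); // 7.5% sales tax
        private static final TaxPolicy NO_TAX    = amount -> amount;

        private static final Map<String, TaxPolicy> POLICIES =
                Map.of("A", COUNTRY_A, "B", COUNTRY_B);

        static BigDecimal totalWithTax(String countryCode, BigDecimal amount) {
            return POLICIES.getOrDefault(countryCode, NO_TAX).applyTaxRate(amount);
        }
    }

Whether the cases live in classes, lambdas, or a pattern match, the point is the same: the dispatch sits in one place and the main calculation never sees it.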
You are saying something I highly doubt you want to be saying.
My "problem" is an online store. It is about 75,000 lines of code. It needs all 75,000 to solve the problem. What you are saying is that it would be "somewhat" hard to deal with whether it is all in one big function or many functions.
But in reality, one big function would be incomprehensible. It is vastly easier to deal with when broken out into many functions.
A problem that genuinely takes a few thousand lines to solve doesn't sound "hard to deal with". It sounds just the right size for a module or tool that solves a problem big enough to be non-trivial, but small enough to be solved well.
However you end up solving it, it is way too big for a single function. On the other hand, a moderate bunch of functions can do it nicely -- and at that scale you don't need big complicated object-oriented abstractions other than the external interface itself.
I suspect bugs like this come about because of a patch, not because of the original development. Devs and dev teams tend to get sloppy after the first push, and budgets tend to shrink dramatically.
One time I found a bug that had been live for a few years, and the result was that the company was under-reporting by millions of dollars per quarter (the running total was near $100mm, and I'm sure it crossed that after my contract ended).
This was VERY well tested software in the beginning (one of the best test suites I've seen, actually) and audited to high heaven. The problem started when the patches rolled in, and those are not tested anywhere near as much.
https://en.wikipedia.org/wiki/Second-system_effect
2.0 is the hardest version to write.