When I read stuff like this it strikes me that probably, by far, their largest operational expense is the staffing cost to orchestrate all of this. I come from a background of running small startups on a shoestring budget. I have to make tough choices when it comes to this stuff: I can either develop features or start spending double-digit percentages of my development budget on devops. So I aim to minimize cost and time (same thing) for all of that. At the same time, I appreciate things like observable software, rapid CI/CD cycles, and generally not having a lot of snowflakes in my deployment architecture. I've worked with a lot of really competent people over the past two decades and I like to think I'm not a complete noob on this front. In other words, I'm not a naive idiot but actually capable of making some informed choices here.
That has led me down a path of making very consistent choices over the years:
1) No Kubernetes and no microservices. Microservices are Conway's Law mapped onto your deployment architecture. You don't need that if you build monoliths. And if you have a monolith, Kubernetes is a waste of CPU, memory, and development time. Complete overkill with zero added value.
2) The optimal size of a monolith deployment is two cheap VMs and a load balancer. You can run that for tens of dollars per month in most popular cloud environments. Good enough for zero-downtime deployments and failover across availability zones. And you can scale it easily if needed (add more VMs, bigger VMs, etc.).
3) Those two VMs must not be snowflakes and must be replaceable without fanfare, ceremony, or any manual intervention. So use Docker and docker-compose on a generic Linux host, preferably of the managed variety. Most developers can write a simple Dockerfile and wing it with docker-compose. It's not that hard. And it makes CI/CD really straightforward: put the thing in the container registry, run the thing. Use something like GitHub Actions to automate it. Cheap and easy.
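To make that concrete, here is a minimal sketch of what such a compose file might look like. The image name, registry, and port are hypothetical placeholders, not anything from the original comment:

```yaml
# docker-compose.yml -- hypothetical monolith running behind the cloud load balancer
services:
  app:
    image: registry.example.com/myapp:latest  # pushed by CI on each release
    restart: unless-stopped                   # comes back up after host reboots
    ports:
      - "8080:8080"                           # load balancer health-checks and routes here
    env_file: .env                            # config/secrets kept out of the image
```

Deployment on each VM then reduces to something like `docker compose pull && docker compose up -d`, done one host at a time so the load balancer keeps serving from the other.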
4) Use hosted/managed middleware (databases, search clusters, queues, etc.). Running that stuff in some DIY setup is rarely worth the development time and operational overhead (devops, monitoring, backups, upgrades, etc.). All that overhead rapidly adds up to costing more than years of paying for a managed solution, if you think in hours and market rates for people even capable of doing this stuff. Provision the thing, use the thing, and pay tens of dollars per month for it. Absolute no-brainer. When you hit thousands per month, you might dedicate some human resources to figuring out something cheaper.
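"Provision the thing" really can be a one-liner. As an illustration (not the commenter's actual setup), a small managed Postgres on Google Cloud SQL, where the instance name, tier, and region are placeholders:

```shell
# Hypothetical example: provision a small managed Postgres on Cloud SQL.
# Instance name, tier, and region are placeholders -- pick what fits your app.
gcloud sql instances create myapp-db \
  --database-version=POSTGRES_15 \
  --tier=db-g1-small \
  --region=europe-west1
```

Backups, patching, and failover then become checkboxes in the provider's console rather than ops work on your team.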
5) Automate things that you do often. Don't automate things that you only do once (like creating a production environment). Congratulations, you just removed the need for having people do anything with Terraform, CloudFormation, Chef, Puppet, Ansible, etc. Hiring people who can do those things is really expensive. And even though I can do all of those, it's literally not worth my time. Document it, but don't automate it unless you really need to, and spend your money on feature development.
But when I need to choose between hiring one extra developer or paying a similarly expensive hosting bill, I'll take the extra developer on my team. Every time. Hosting bills can be an order of magnitude cheaper per month than a single developer if you do it properly. For reference, we pay around 400/month for our production environment. That's in Google Cloud, with an Elastic Cloud search cluster included.
Other companies make other choices, of course, for all sorts of valid reasons. But these work fine for me and I feel good about the trade-offs.
Agree entirely. I think system design interviews are partly to blame, because they select for people who think the only way to design a system is the cargo-cult method that interview prep books and courses preach, which is:
- break everything into microservices
- have a separate horizontally scalable layer for load balancing, caching, stateless application server, database servers, monitoring/metrics, for each microservice.
- use at least two different types of databases, because it's haram to store key-value data in an RDBMS
- sprinkle in message-passing queues and dead-letter queues between every layer because every time you break one system into two, there can be a scenario where one part is down but the other is up
- replicate that in 10 different datacenters because I'll be damned if a user in New York needs to talk to a server in Dallas
And all this for a service that will see at most 10k transactions per second. In other words, something that a single high-end laptop can handle.
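The "single laptop" claim is easy to sanity-check with a back-of-envelope calculation. The per-transaction CPU cost below is my assumption, not a figure from the comment:

```python
# Back-of-envelope: can one machine handle 10k transactions per second?
# Assumption (mine): ~1 ms of CPU time per transaction, a generous budget
# for a typical CRUD request once the database round-trip is excluded.
tps = 10_000
cpu_seconds_per_txn = 0.001

cores_needed = tps * cpu_seconds_per_txn  # CPU-seconds consumed per wall-clock second
print(cores_needed)  # 10.0 -- ten busy cores, within reach of one high-end machine
```

Even if the per-transaction cost estimate is off by 2-3x, the workload still fits on a single beefy server, which is the commenter's point.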
99.9% of the time your architecture does NOT need to look like Facebook's or Google's. 99% of tech startups (including some unicorns) could run their entire product on a couple of good bare-metal servers. Stop selecting for people who have no grounding in what normal complexity looks like at a given scale.
I can't agree more with this. Most products out there with medium-to-low traffic can be handled just fine like this. The cost of automation is often not worth the financial effort.
There's a dangerous trend of putting microservices everywhere. Reaching the same level of quality as a monolith then requires an enormous amount of extra work and specialized people. Your product must be very successful to justify such expenses!
My rule of thumb: monolith and PaaS for as long as your business can afford it.
I mean, it all makes sense if you know nothing of k8s or Ansible.
Most companies these days have moved to k8s, so there's a sizable portion of tech workers with prior knowledge of the k8s model and deployment.
Whether you go monolith or not doesn't matter, because you need to replicate the process across at least two environments: dev and prod. Not to mention it's good to be prepared in case your prod environment gets compromised or nuked.
Where, oh god, where are more sensibly thinking people like you? This is pragmatic and straightforward. There is very little room for technical make-work nonsense in the strategies you describe. Most places, and many devs I meet, cannot imagine how to do their jobs without a cornucopia of oddly named utilities they know only a single way to use.
This is actually a really interesting post to me. I'm currently working at the opposite of a startup with a shoestring budget: a medium-sized company with 100-150 techies. As a unique problem, we're dealing with a bunch of rather sensitive data: financial data, HR data, forecast and planning data. Our customers are big companies, and they are careful with this data. As such, we're forced to self-host a large amount of our infrastructure, because in that space this turns from a stupid decision into a unique selling point.
Of those, about 7-12 techies work either in my team, SaaS operations, our hardware ops team, or a general support team for CI/images/deployment/build-server things. Pretty much 5-10% of the manpower goes there.
The interesting thing is: Your perspective is our dream vision for teams running on our infrastructure.
Like, for 1, 2, and 3: ideally, you as the development team shouldn't have to care about the infrastructure that much. Grab the container image build templates and guidelines for your language, put them into the template Nomad job for your stuff, add the template pipeline to your repository, and you end up with CD to the test environment. Add 2-3 more pipelines and production deployment works.
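For readers unfamiliar with Nomad, a skeleton of such a template job might look like the following. The job name, image, datacenter, and port are invented placeholders, not the commenter's actual templates:

```hcl
# Hypothetical template Nomad job -- names, image, and port are placeholders.
job "myapp" {
  datacenters = ["dc1"]

  group "web" {
    count = 2  # two instances so rolling updates keep the service up

    network {
      port "http" { to = 8080 }  # container port exposed to the cluster
    }

    task "app" {
      driver = "docker"
      config {
        image = "registry.internal/myapp:latest"  # built from the shared image templates
        ports = ["http"]
      }
    }
  }
}
```

The point of the template approach is that a product team only swaps in their image name and port; scheduling, networking, and rollout policy stay with the platform team.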
These default setups do have a half-life. They will fail eventually with enough load and complexity coming in from a product. But that's a "succeeding too hard" kind of issue: "Oh no, my deployment isn't smooth enough for all the customer queries making me money. What a bother." And honestly, for the 90% of products not blazing trails, we have seen most problems before, so we can help teams fix their stuff with little effort on their part.
4: We very much want to standardize and normalize things onto simple shared services, both to simplify the stuff teams have to worry about and to strengthen teams against pushy customers. A maintained, tuned, highly available Postgres is just a ticket, some documented integrations, and a few days of waiting away. And if customers get pushy about the non-functional requirements, give them our guarantees and then send them to us.
The only point I disagree with is Terraform. It is brilliant for this exact scenario because it's self-documenting. When you do need to update those SPF records in two years' time, having them committed as a Terraform file is much better than digging through (potentially stale) markdown files. It's zero maintenance and really simple. Plus its ability to weave together different services (like configuring Fastly and Route53 from the same place) is handy, too.
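The SPF case is small enough to show in full. Something like the following would live in the repo; the zone ID and domain are placeholders, and the SPF value is just a common example:

```hcl
# Hypothetical Route53 SPF record -- zone_id, domain, and SPF policy are placeholders.
resource "aws_route53_record" "spf" {
  zone_id = "Z0000000EXAMPLE"          # your hosted zone ID
  name    = "example.com"
  type    = "TXT"                      # SPF is published as a TXT record
  ttl     = 300
  records = ["v=spf1 include:_spf.google.com ~all"]
}
```

Two years later, `git log` on this file tells you exactly what changed and why, which a wiki page rarely does.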
What if I did this with Terraform on AWS serverless, staying in the free tier for the workload you're referencing, instead of VMs and a load balancer?
I just don't see why people prefer the VM-based approach over serverless.