Windows + Ubuntu + CentOS ... I'm curious to know how the OS selection was made and specifically why the HAProxy servers are Ubuntu and the Redis servers are CentOS. Do your admins work on all three OSs?
It's pretty simple: we started off on Ubuntu for our Linux boxes, then decided to move to CentOS for its stability and vendor package compatibility for our core services. However, we are going to stick with Ubuntu on some of our back-end servers - management and utility boxes - because of its larger package repositories.
Our core Q&A engine is written in C# with MS SQL databases, so obviously we need to run Windows for that.
There are only two sysadmins, so yes, we work on all of the OSes.
I have also run .NET-backed stuff that was larger than this... it's not that uncommon. In my experience, the guys running .NET-backed stuff just aren't usually as vocal about it. ;) hi5 (the social network) also runs a lot of .NET and SQL Server nowadays. They're also running Windows Server: http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?...
A (large) previous company of mine used .NET for an internal, high-volume, database-oriented application. Assuming it's still in use, I'm reasonably certain it did quite a lot more volume than that (orders of magnitude more), but it wasn't a website.
I found it curious that so much of the hardware sits at around 20% utilization. Wouldn't it be better to have less hardware at higher utilization, and buy cheaper/better hardware later if warranted?
Absolutely not. Servers and disk space are inexpensive. The closer you are to capacity, the less burst room you have. Pushing your hardware to near max capacity is a recipe for disaster.
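To put rough numbers on why (purely illustrative figures, not theirs):

    # How big a traffic spike a box can absorb before saturating,
    # at various steady-state utilization levels. Illustrative only.
    for utilization in (0.2, 0.5, 0.8):
        burst_multiple = 1.0 / utilization  # total capacity / current load
        print(f"{utilization:.0%} utilized -> can absorb a {burst_multiple:.1f}x spike")

At 20% utilization you can eat a 5x spike; at 80% you fall over at 1.25x.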
R610s are going to have dual PSUs, and they have redundant storage arrangements, so they could potentially experience a hardware failure with no downtime.
Chance of hardware failure in a modern server in a colo, assuming it's been stress tested to find DOA hardware before being put in production? Low.
A quick spec on Dell's site, with a rough guess for the SSDs, puts the web servers at around $4,000 each, or $4,500 with the better warranty. Cutting six servers would save more than $20,000.
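The back-of-envelope arithmetic, using the per-server figures above:

    # Sanity check on the savings figure.
    servers_cut = 6
    for cost_each in (4_000, 4_500):  # base spec vs. better-warranty spec
        print(f"{servers_cut} servers x ${cost_each:,} = ${servers_cut * cost_each:,}")
    # -> $24,000 and $27,000, both comfortably over $20,000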
Whether that's a useful saving depends entirely on their funding, growth plans, and risk tolerance, but if they were a ramen-profitable-is-goal-one startup, it would be months of runway.
They apparently have two developer/sysadmins, so the $20K savings is less than a month of their combined salaries. For the redundancy and growth headroom it buys, it's a no-brainer.
Probably, but given how much bragging (not meant pejoratively) they've done about how cheap their hardware is relative to their traffic, it's probably not a bank-breaking issue. Also, they could almost certainly be significantly more energy-efficient with hardware running near 100% utilization.
Well, when you're putting your own hardware in your own data center, it can take hours (or days) to scale capacity. If you get a huge traffic spike, that can really hurt the user experience. It's much better to have the capacity on standby. We learned this the hard way at Hive7 with our first social game...
We back up our databases nightly and restore them to two different locations: one local to our NY data center for our devs to work against, and one remote in our OR data center.
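For the curious, a minimal sketch of what a nightly cycle like that can look like - server names, paths, and the database name here are made up, and this isn't our actual script; the BACKUP/RESTORE statements are stock SQL Server T-SQL driven through sqlcmd:

    import subprocess

    # Hypothetical nightly backup-and-restore cycle. All names and paths
    # are illustrative; BACKUP/RESTORE is standard SQL Server syntax.
    DB = "StackOverflow"
    BACKUP_FILE = r"\\backup01\sql\StackOverflow.bak"

    def run_sql(server, sql):
        # -E = Windows integrated auth, -b = exit nonzero if the batch fails
        subprocess.run(["sqlcmd", "-S", server, "-E", "-b", "-Q", sql], check=True)

    # 1. Take a full backup on the production server.
    run_sql("NY-SQL01", f"BACKUP DATABASE [{DB}] TO DISK = N'{BACKUP_FILE}' WITH INIT")

    # 2. Restore onto the local dev copy and the remote copy,
    #    replacing the previous night's restore.
    for target in ("NY-DEVSQL01", "OR-SQL01"):
        run_sql(target, f"RESTORE DATABASE [{DB}] FROM DISK = N'{BACKUP_FILE}' WITH REPLACE")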
This is a nice overview.
The article mentions database pairs for the MS SQL servers. What high-availability mode is used: database mirroring or clustering?
Could someone clarify this: the post says the servers are in _our_ data centers in NY and OR. I had always thought DCs host anywhere from hundreds to hundreds of thousands of machines! Is this a colo?
EDIT: What is the typical network utilization for web services like this?
One may comment about where they live by calling it "my apartment complex", for example, even though they merely rent a single unit rather than own the entire complex. I think that's the same verbal construction as here.
Does 9.5-10 million hits a day make them the largest .NET-backed site?