The Stack Exchange Architecture (serverfault.com)
110 points by javery on Sept 30, 2011 | 29 comments



Interesting to see one of these architecture blog posts for a site running on the .NET framework.

Does 9.5-10 million hits a day make them the largest .NET-backed site?


Windows + Ubuntu + CentOS ... I'm curious how the OS selection was made, and specifically why the HAProxy servers run Ubuntu while the Redis servers run CentOS. Do your admins work on all three OSes?


It's pretty simple: we started off on Ubuntu for our Linux boxes, then made a decision to move to CentOS for the stability and vendor package compatibility of our core services. However, we are going to stick with Ubuntu on some of our back-end servers - the management and utility servers - for the larger package repositories. Our core Q&A engine is written in C# with MS-SQL databases, so obviously we need to run Windows for that.

There are only 2 sysadmins, so yes, we work on all of the OSes.


Any good Linux admin should be able to work on any distribution... and will... as long as you allow them to complain about it.


Other than Microsoft themselves: MySpace is still fairly huge, but probably a mixed stack. PlentyOfFish is up there too - http://highscalability.com/plentyoffish-architecture


I have also run .NET-backed stuff that was larger than this... It's not that uncommon. In my experience, the guys running .NET-backed stuff just aren't usually as vocal about it. ;) hi5 (the social network) also runs a lot of .NET and SQL Server nowadays. They're also running Windows Server: http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?...


A (large) previous company of mine used .NET for an internal operation: a high-volume, database-oriented application. Assuming it's still in use, I'm reasonably certain it did quite a lot more volume than that (orders of magnitude more), but it wasn't a website.


We do many more hits a day with IIS and .NET on Windows, and we're far from the biggest.


Mind if I ask who 'we' is?


I found it curious that there is so much hardware sitting at around 20% utilization. Wouldn't it be better to have less hardware at higher utilization and buy cheaper/better hardware later if warranted?


Absolutely not. Servers and disk space are inexpensive. The closer you are to capacity, the less burst room you have. Pushing your hardware to near max capacity is a recipe for disaster.


Time for the sysadmin math quiz:

10 servers running at 20% utilization. Two servers experience hardware failure. What's the resulting utilization per server?

4 servers running at 50% utilization. Two servers experience hardware failure. What's the resulting utilization per server?

Which approach affords greater redundancy and room for growth?
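
(A quick sketch of the arithmetic, assuming the total load is fixed and simply redistributes evenly across the surviving servers - the numbers are just the ones from the quiz:)

    // Back-of-the-envelope only: total load is assumed constant and evenly
    // redistributed across whatever servers are still up.
    using System;

    class UtilizationQuiz
    {
        // Per-server utilization after `failed` servers drop out.
        static double AfterFailure(int servers, double utilization, int failed)
            => servers * utilization / (servers - failed);

        static void Main()
        {
            Console.WriteLine(AfterFailure(10, 0.20, 2)); // 0.25 -> 25% on each of the 8 survivors
            Console.WriteLine(AfterFailure(4, 0.50, 2));  // 1.00 -> 100% on each of the 2 survivors
        }
    }

Either way the total load is two servers' worth; the difference is whether a double failure leaves the survivors at 25% or pinned at 100%.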


R610s come with dual PSUs, and they have redundant storage arrangements, so they could potentially experience a hardware failure with no downtime.

Chance of hardware failure in a modern server in a colo, assuming it's been stress tested to find DOA hardware before being put in production? Low.

A quick spec on Dell's site, with a rough guess for the SSDs, puts the web servers at around $4,000 each, or $4,500 with the better warranty. Dropping six of them saves >$20,000.

Whether that's a useful saving is entirely down to their funding, growth plans, and risk tolerance, but if they were a ramen-profitable-is-goal-one startup, it would be months of runway.
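
(Rough math behind that figure, using the guessed per-server prices above - not real quotes:)

    // Rough savings sketch; the prices are the guesses quoted above.
    using System;

    class ServerSavings
    {
        static void Main()
        {
            const int serversDropped = 6;       // e.g. 10 boxes at 20% down to 4 boxes at 50%
            const decimal perServer = 4000m;    // ~$4,500 each with the better warranty
            Console.WriteLine(serversDropped * perServer); // 24000 -> the ">$20,000" above
        }
    }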


They apparently have two developer/sysadmins, so the $20K savings is less than a month of their salaries. For the redundancy and growth room that affords, it's a no-brainer.


Probably, but given the amount of bragging (not meant pejoratively) they've done about how cheap their hardware is relative to their traffic, it's probably not a bank-breaking issue. Also, they could almost certainly be significantly more energy-efficient with hardware running near 100% utilization.


We wouldn't leave our cars idling 80% of the time, but servers, well, why not? Every little bit helps, guys.


Well, when you're putting your own hardware in your own datacenter, it can take hours (or days) to scale capacity. If you have a huge traffic spike, that can really hurt the user experience. It's much better to have capacity on standby. We learned this the hard way at Hive7 with our first social game...


RAM is pretty cheap these days...


and less RAM is even cheaper. You don't address the point at all.


The point is: traffic is 50x as expensive as RAM.


I don't understand what you are saying.

The original comment is based on the assumption that less hardware would still easily support the same traffic level.


But not the same burst-over-average traffic, nor failure tolerance.


Do you have additional datacenter environments for staging and development?


From the end of the post:

We backup our databases nightly and restore them to two different locations. One local to our NY data center for our devs to work against, and one remote in our OR data center.


This is a nice overview. The article mentions database pairs for the MSSQL servers. Which high-availability mode is used: database mirroring or clustering?


Could someone clarify this: the post says that the servers are in _our_ data center in NY and OR. I have always thought that DCs host hundreds to 100,000 machines! Is this a colo?

EDIT: What is the typical network utilization for such web services?


Yes, we are in two colos: one in Oregon and one in NYC.


People may refer to where they live as "my apartment complex", for example, even though they merely rent a single unit rather than owning the entire complex. I think that's the same verbal construction here.


Everything (DB, web servers, etc.): approximately 600GB of RAM.



