In my opinion the correct answer is 244 GB (i.e. AWS r3.8xlarge memory-optimized instances).

While one can purchase servers with even more memory, you will most likely run into limitations on the number of cores. Also note that there is at least some overhead in processing data, so you would need at least 2x the size of the raw data.
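
As a back-of-the-envelope sketch of that 2x rule (the overhead factor, dataset size, and 244 GB node size here are illustrative assumptions, not measurements):

    # Back-of-the-envelope RAM sizing: raw data plus processing overhead.
    # The 2x overhead factor and the 244 GB node size are illustrative assumptions.
    def required_ram_gb(raw_data_gb, overhead_factor=2.0):
        """Estimate the RAM needed to process a dataset entirely in memory."""
        return raw_data_gb * overhead_factor

    raw_gb = 100                        # hypothetical raw dataset size in GB
    need_gb = required_ram_gb(raw_gb)   # 200 GB with a 2x overhead factor
    print(f"Need ~{need_gb:.0f} GB; fits on one 244 GB node: {need_gb <= 244}")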

Finally, while it's a good thing to tweet, joke about, and make fun of buzzwords while trying to appear smart, the reality is that purchasing such servers (> 244 GB RAM) is a costly process. Further, you would ideally need two of them to remove the single point of failure. It is also likely that the job is a batch job that might take a terabyte of RAM but only needs to run once a week. In all these cases you are much better off relying on a distributed system where each node has very large memory and the task can easily be split; just because you have a cluster does not mean that each node has to be a small instance (4 processors, ~16 GB RAM).
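
For example, a once-a-week 1 TB batch job could be split across a handful of large-memory nodes rather than dozens of small ones. A minimal sketch of that arithmetic (the dataset size, overhead factor, and per-node RAM figures are assumptions for illustration):

    import math

    # Compare node counts for the same in-memory job on large vs. small instances.
    # Dataset size, overhead factor, and per-node RAM are illustrative assumptions.
    dataset_gb = 1024              # ~1 TB of raw data
    overhead_factor = 2.0          # working-set overhead, as above
    working_set_gb = dataset_gb * overhead_factor

    for name, node_ram_gb in [("large-memory node (244 GB)", 244), ("small node (16 GB)", 16)]:
        nodes = math.ceil(working_set_gb / node_ram_gb)
        print(f"{name}: {nodes} nodes to hold ~{working_set_gb:.0f} GB in memory")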




> Further, you would ideally need two of them to remove the single point of failure.

That's assuming that everything needs to be 'high availability' and that buying two of everything is a must. This is definitely not always the case. In plenty of situations, buying a single item and simply repairing it when it breaks is a perfectly good strategy.


It's not about having two of everything at all times, but rather about having the capacity whenever you need it. At 244 GB you hit a sweet spot where you have access to large capacity at a flexible price (spot market / on-demand / on-premise). This is what separates engineers with business acumen from run-of-the-mill "consultants" with a search engine.
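
If the job really is a weekly batch, one way to get that capacity only when needed is a one-time spot request for a large-memory instance. A minimal sketch with boto3, assuming AWS credentials are configured; the AMI ID, region, and bid price are placeholders:

    import boto3

    # Sketch: bid for a single large-memory spot instance for a weekly batch job.
    # Assumes AWS credentials are configured; AMI ID and bid price are placeholders.
    ec2 = boto3.client("ec2", region_name="us-east-1")

    response = ec2.request_spot_instances(
        SpotPrice="0.90",                  # maximum hourly bid (placeholder)
        InstanceCount=1,
        Type="one-time",                   # run the job once, then release the capacity
        LaunchSpecification={
            "ImageId": "ami-00000000",     # placeholder AMI
            "InstanceType": "r3.8xlarge",  # 244 GB memory-optimized instance
        },
    )
    print(response["SpotInstanceRequests"][0]["SpotInstanceRequestId"])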


You mentioned 'single point of failure'.



