
I simply do not see how competing cloud vendors can keep up with this. Most of them are still struggling to provide anything beyond a simple API to start/stop machines.



Well, Azure has been offering a similar NoSQL service for a few years now.

http://sigops.org/sosp/sosp11/current/2011-Cascais/printable...


I've been using Azure Table Storage since the beginning and this doesn't seem to be the same. As others have mentioned, TS is more similar to SimpleDB. Now, I would love for someone to give me a tl;dr on the feature set of DynamoDB so I can make an accurate comparison.

Table Storage does not allow any indexes other than the primary ones (Row Key and Partition Key). You also cannot store complex objects within fields and use them in a query. You basically just serialize the data and stuff it into the field.

The dynamic schema is very nice if you can leverage it but the actual query support is TERRIBLE. (Sorry Microsoft, I'm a fanboy but you blew it here). There is no Order By or even Count support, which makes a lot of things very difficult. Want to know how many "color=green" records there are? Guess what, you're going to retrieve all those rows and then count them yourself. They're starting to listen to the community and have just recently introduced upserts and projection (select). I would love to see them adopt something like MongoDB instead :)

For more issues check out: http://www.mygreatwindowsazureidea.com/forums/34192-windows-...

Edit - For what it's worth, we've moved more things to SQL Azure now that it has Federation support. Scalability with the power of SQL. http://blogs.msdn.com/b/windowsazure/archive/2011/12/13/buil...


No. They offer the equivalent of Amazon S3, SimpleDB and SQS, but nothing comparable to this


Can you elaborate how this is different and not comparable? Azure's table service offers the same automatic partition management, unlimited per-table scalability, composite keys, range queries, and availability guarantees. The linked paper goes into more details.


Besides the points I already complained about before... How about 200ms response times even when performing a query using the Row & Partition Keys? I'm not sure if by composite keys you were referring to something other than the RK & PK because those are the only indexes you get.


ATS response times within the Azure data center are pretty impressive in my experience.

Your partition keys can be composite, have a look here:

http://blogs.msdn.com/b/windowsazurestorage/archive/2010/11/...

I agree with your other pain points - in terms of not being able to get counts, secondary indices etc. However, you can easily simulate some of those - maintain your own summary tables, indices and so on. These ought to emerge as platform features pretty soon though. It's not perfect, but its feature set is close to Dynamo.

As for Mongo DB, I guess this service has been built from ground-up to provide the availability guarantees and automatic partition management features. I don't know if Mongo provides those. You could run Mongo yourself on Azure if you wanted to; there's even a supported solution done recently.
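To make the "maintain your own summary tables" workaround concrete, here is a minimal sketch. It uses an in-memory dict as a stand-in for the table, and the class and method names are made up for illustration; none of this is the Azure client API. The idea is simply that every write also updates a counter, so a count becomes one lookup instead of a full scan.

```python
from collections import defaultdict


class TableWithCounts:
    """In-memory stand-in for a key-value table plus a summary table.

    Hypothetical helper for illustration only, not a real storage client.
    """

    def __init__(self, counted_field):
        self.counted_field = counted_field
        self.rows = {}                   # (partition_key, row_key) -> entity
        self.summary = defaultdict(int)  # field value -> number of rows

    def upsert(self, partition_key, row_key, entity):
        key = (partition_key, row_key)
        old = self.rows.get(key)
        if old is not None:
            # Overwriting a row: undo its contribution to the old count.
            self.summary[old[self.counted_field]] -= 1
        self.rows[key] = entity
        self.summary[entity[self.counted_field]] += 1

    def count(self, value):
        # One lookup instead of retrieving and counting every matching row.
        return self.summary.get(value, 0)


if __name__ == "__main__":
    table = TableWithCounts("color")
    table.upsert("p1", "r1", {"color": "green"})
    table.upsert("p1", "r2", {"color": "green"})
    table.upsert("p2", "r1", {"color": "red"})
    print(table.count("green"))
```

In a real store the row write and the counter update would be two separate operations unless they can share a transaction, so the counter can drift if one of them fails. That is the price of simulating the feature yourself.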


Hmm, I guess when I think about composite keys I think of ways to indicate a specific field/column as being part of the key. Data duplication along with string concatenation aren't really an elegant way to do it. If I remember right you also can't update the key values once the record has been saved. This is coming from a big SQL guy though :)


Balakk is correct. There are a lot of similarities between Windows Azure Tables and DynamoDB, and the release of DynamoDB validates the Data Model we have provided for a few years now with Azure Tables

• They both are NoSQL schema-less table stores, where a table can store entities with completely different properties

• They have a two-attribute (property) composite primary key. One property is used for partitioning and the other property is for optimizing range-based operations within a partition

• Both of them have just a single index based on their composite primary key

• Both are built for effectively unlimited table size, seamlessly auto scale out with hands off management

• Similar CRUD operations

How Windows Azure Tables is implemented can be found in this SOSP paper and talk: http://blogs.msdn.com/b/windowsazurestorage/archive/2011/11/...

As mentioned by someone else, one difference is that DynamoDB stores its data completely in SSDs, whereas in Azure Storage our writes are committed via journaling (to either SSD or a dedicated journal drive) and reads are served from disks or memory if the data page is cached in memory. Therefore, the latency for single entity small writes is typically below 10ms due to our journaling approach (described in the above SOSP paper). Single entity read times for small entities are typically under 40ms, which is shown in the results here: http://blogs.msdn.com/b/windowsazurestorage/archive/2010/11/...

Once in a while we see someone saying that they see 100+ms latencies for small single entity reads, and that is usually because they need to turn Nagle off, as described here: http://blogs.msdn.com/b/windowsazurestorage/archive/2010/06/...
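For anyone unfamiliar with what "turning Nagle off" means at the socket level: it amounts to setting TCP_NODELAY on the connection so small requests are sent immediately rather than held back for coalescing. The sketch below is a generic illustration using Python's standard socket module; it is not the Azure client setting from the linked post, and the function names are made up for this example.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 tells the stack to send small writes right away
    # instead of buffering them while earlier data is unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = make_low_latency_socket()
    print("Nagle disabled:", nagle_disabled(s))
    s.close()
```

Small request/response traffic like single entity reads is the typical case where this setting matters, since each request is one tiny write followed by a wait for the reply.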


This is running on SSDs and it makes a HUGE difference.


How is this vastly different to Azure Tables?


The cost per transaction, performance & ease.

Reads per $0.01 = (50 x 60) x 60 = 180000

Writes per $0.01 = (10 x 60) x 60 = 36000

Assuming that your usage is at 100% capacity, then from a read perspective DynamoDB is half the price. Writes are much more expensive but many applications are heavily read oriented.

DynamoDB claims single digit millisecond reads; Azure Tables does not (in my experience).

Azure tables have a maximum performance over a given partition table of 500 requests per second and over the whole account of 5,000 requests per second. DynamoDB does not state this.

http://blogs.msdn.com/b/windowsazurestorage/archive/2010/05/...

To put this into context:

Assume a system with 5000 writes per second and 50000 reads per second. Here are the costs:

AWS Reads: $240, AWS Writes: $120, AWS Total: $360

Azure Reads: $4320, Azure Writes: $432, Azure Total: $4752

Seems like quite a difference for a decent sized read heavy application.


Can you please explain your math? AFAIK Azure txns are not paid by the hour - they are a flat cost of $.01 per 10000 storage txns. If you do batched GETs and PUTs you make only 550 txns (55000/100 entities/batch).

http://www.windowsazure.com/en-us/home/tour/storage/

I agree that Dynamo's provisioned throughput capacity is a very useful feature though. Azure does not provide any such performance guarantee; the throughput limit is also a guideline as far as I know, not an absolute barrier.


I should have explained that my costs were calculated on a "per day" assumption. Thus the costs are for:

5000 x 60 x 60 x 24 = 432000000 Writes

50000 x 60 x 60 x 24 = 4320000000 Reads

(432000000/10000) x 0.01 = $432

(4320000000/10000) x 0.01 = $4320

Azure Total Cost For One Day's Use: $4752

((5000/10) x 0.01) x 24 = $120

((50000/50) x 0.01) x 24 = $240

AWS Total Cost For One Day's Use: $360

You are right that I don't take into account the bulk feature of Azure reads & writes, but this is down to bulk requests only being possible on a single partition at a time, which in my personal experience (not exhaustive) is non-trivial to take advantage of.
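The per-day arithmetic above can be sketched in a few lines of Python. The prices are the ones quoted in this thread ($0.01 per 10000 Azure storage transactions; $0.01 per hour per 10 DynamoDB write units or 50 read units), 1KB items are assumed, and the function names are made up for illustration. The batch_size parameter shows what happens if every request could be batched, which is the best case rather than a typical one.

```python
SECONDS_PER_DAY = 60 * 60 * 24
HOURS_PER_DAY = 24


def azure_daily_cost(ops_per_sec, batch_size=1, price_per_10k_txns=0.01):
    """Azure bills per storage transaction; a batch counts as one transaction."""
    txns = ops_per_sec * SECONDS_PER_DAY / batch_size
    return txns / 10_000 * price_per_10k_txns


def dynamo_daily_cost(ops_per_sec, units_per_cent_hour, price=0.01):
    """DynamoDB bills per hour of provisioned capacity (1KB items assumed)."""
    return ops_per_sec / units_per_cent_hour * price * HOURS_PER_DAY


if __name__ == "__main__":
    writes, reads = 5_000, 50_000  # operations per second
    azure = azure_daily_cost(writes) + azure_daily_cost(reads)
    aws = dynamo_daily_cost(writes, 10) + dynamo_daily_cost(reads, 50)
    batched = azure_daily_cost(writes, 100) + azure_daily_cost(reads, 100)
    print(f"Azure, unbatched:        ${azure:.2f} per day")
    print(f"DynamoDB:                ${aws:.2f} per day")
    print(f"Azure, batches of 100:   ${batched:.2f} per day")
```

With full batches of 100 the Azure figure drops by a factor of 100, so the comparison hinges on how much of the workload can actually be batched within a single partition.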


Your math is right, except you missed a factor for Dynamo - Unit size.

http://aws.amazon.com/dynamodb/#pricing

If your txns are all within 1KB, your math holds good; otherwise, you pay more. Interesting model, but I suspect it'll average out to similar costs.


The cost difference between Windows Azure Tables and DynamoDB really depends upon the size of the entities being operated over and the amount of data stored. If an application can benefit from batch transactions or query operations, the savings can be a lot per entity using Windows Azure Tables.

For the cost of storage, the base price for Windows Azure Tables is $0.14/GB/month, and the base price for DynamoDB is $1.00/GB/month.

For transactions, there is the following tradeoff

• DynamoDB is cheaper if the application performs operations mainly on small items (couple KBs in size), and the application can’t benefit from batch or query operations that Windows Azure Tables provide

• Windows Azure Tables is cheaper for larger sized entities, when batch transactions are used, or when range queries are used

The following shows the cost per hour of writing or reading 1 million entities per hour (277.78 per second) for different sized entities (1KB vs 64KB). It also includes the cost difference between strong and eventually consistent reads for DynamoDB. Note that Windows Azure Tables allows batch operations and queries for many entities at once, at a discounted price.

• 1KB single entity writes -- Azure=$1 and DynamoDB=$0.28

• 64KB single entity writes -- Azure=$1 and DynamoDB=$17.78

• 1KB batch writes (with batch size of 100 entities) -- Azure=$0.01 and DynamoDB=$0.28

• 64KB batch writes (with batch size of 100 entities) -- Azure=$0.01 and DynamoDB=$17.78

• 1KB strong consistency reads -- Azure=$1 and DynamoDB=$0.05

• 64KB strong consistency reads -- Azure=$1 and DynamoDB=$3.54

• 1KB strong consistency reads via query/scan (assuming 50 entities returned on each request) – Azure=$0.02, DynamoDB=$0.05

• 64KB strong consistency reads via query/scan (assuming 50 entities returned on each request) – Azure=$0.02, DynamoDB=$3.54

• 1KB eventual consistency reads – DynamoDB=$0.028

• 64KB eventual consistency reads – DynamoDB=$1.77
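The figures in the list above can be roughly reproduced with a short Python sketch. It assumes the prices quoted elsewhere in this thread ($0.01 per 10000 Azure transactions; $0.01 per hour per 10 DynamoDB write units or 50 read units), one DynamoDB capacity unit per KB of item size, and eventually consistent reads costing half of strongly consistent ones. The function names are made up for illustration.

```python
import math

ENTITIES_PER_HOUR = 1_000_000


def azure_hourly_cost(entities_per_request=1):
    """One storage transaction per request, however many entities it carries."""
    txns = ENTITIES_PER_HOUR / entities_per_request
    return txns / 10_000 * 0.01


def dynamo_hourly_cost(item_kb, units_per_cent, eventual=False):
    """Provisioned capacity needed to sustain the rate for one hour."""
    per_second = ENTITIES_PER_HOUR / 3600
    # Assumption: each item consumes one capacity unit per KB, rounded up.
    capacity_units = per_second * math.ceil(item_kb)
    cost = capacity_units / units_per_cent * 0.01
    return cost / 2 if eventual else cost


if __name__ == "__main__":
    for kb in (1, 64):
        print(f"{kb}KB writes:          Azure=${azure_hourly_cost():.2f} "
              f"DynamoDB=${dynamo_hourly_cost(kb, 10):.2f}")
        print(f"{kb}KB batch writes:    Azure=${azure_hourly_cost(100):.2f} "
              f"DynamoDB=${dynamo_hourly_cost(kb, 10):.2f}")
        print(f"{kb}KB strong reads:    Azure=${azure_hourly_cost():.2f} "
              f"DynamoDB=${dynamo_hourly_cost(kb, 50):.3f}")
        print(f"{kb}KB query reads:     Azure=${azure_hourly_cost(50):.2f} "
              f"DynamoDB=${dynamo_hourly_cost(kb, 50):.3f}")
        print(f"{kb}KB eventual reads:  "
              f"DynamoDB=${dynamo_hourly_cost(kb, 50, eventual=True):.3f}")
```

Under these assumptions the write figures match the list, while the DynamoDB read figures come out slightly higher (about $0.056 and $3.56 for strong reads), which looks like a rounding difference in the quoted numbers.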


Open source really helps here. Amazon are innovative, but they are not the only place innovation is happening. In fact, here's a pretty good writeup (if a wee biased) on how the new offering compares to the open source Cassandra project: http://www.datastax.com/dev/blog/amazon-dynamodb


Cassandra is a great project, as are Hadoop, MySQL, etc. The issue I am raising is not so much which project is better on a feature basis, but the fact that Amazon is able to offer it as a service, in a scalable way that no other vendor is able to do (with the exception of Google and, on a good day, Microsoft). Most other "traditional" cloud vendors, such as Rackspace, do not have anything remotely comparable to this, EBS, SQS, RDS, etc.


I also found it interesting that the storage media is specified and it's SSDs. Solid state will be hugely disruptive for hosted services. I've been hoping for an instance-by-the-hour service backed by SSDs and I'll surmise from this announcement that it won't be long before that shows up on the EC2 menu. Gimme :)



