We have a database (SQL Server) which has a table filled with stats. The table currently has around 10 million entries in it. The table is organized "horizontally" so that one user has one row, and we have about 50 stats in the table.
We have a leaderboard for every single one of these stats, and we want to have a "live" leaderboard - that is, the ranks are fresh on each request.
Right now we use an indexed RANK() query which can rank the top 100 in like 15ms. Top 10K is 100ms. Top 1M (or, offset 1M and get the next 100) is around 20 seconds.
At the very least, we'd like to rank 1M within a couple seconds (10x improvement) but ideally we'd be able to serve any number (up to say.. 10 million) of ranks in under a second. This is probably unreasonable, but it's our goal.
Is there any advantage to using a "vertical" stable structure where each stat is a row? That's where the >100M figure comes from - if we convert these stats into rows it's (number of users) x (number of stats).
Is there any specialized way of accomplishing this? Is the database slowing us down? Maybe there's a better way to store all of this stuff than in a DB? I was thinking perhaps storing it in the DB and an in-memory store at the same time, and repopulating the in-memory store when the server starts. But that wouldn't be much use if the in-memory solution doesn't give us a massive performance boost.
Right now we use dedicated servers for our application and database needs. We'd like to switch to cloud architecture but we want to both accomplish what is mentioned above while also reducing costs (we pay quite a bit for our current stack.)
If so, I think the problem is not clearly described. What does it mean to display a result row? Are you displaying the stat value and a primary key? The entire row? What exactly does it mean to display 10M rows? Are they all presented on one web page? Or is it N at a time with prev/next links? Or is there random access within those 10M (e.g. find me in the ranking for stat #27). Do you overlap retrieval with the display process, (showing the first rows when available)? When someone clicks prev or next does it recompute the entire result?