Before my VPS folded it was mostly bot traffic, which I hadn't been expecting. Evidently many of them follow the HN RSS feeds and hit new posts (repeatedly) when they show up.
I've also noticed something in my logs that is stupidly requesting these 'apple-touch.png' icons over and over thousands of times. It seems to discover links from HN. Even if you create a file with that name, it will just keep requesting it over and over.
I saw something like this once where a shitty app was using some of the iPhone icons, but it was proxying them all without any caching.
So if 10 people loaded the app, their backend would make 10 requests for the icon on my server and then relay those images to each user.
I could see something like this happening with a "push" system: when something new is posted on HN, some overly clever dev decided to push the icon and title to each device to be read later, but forgot to cache the icon, so they end up slamming the server.
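For what it's worth, the fix on the relaying side is small. A rough sketch of a relay that hits the origin once per URL instead of once per user. Everything here is an assumption for illustration: `fetch_icon` is a hypothetical stand-in for the real HTTP request, `IconCache` is a made-up name, and the one-hour TTL is arbitrary.

```python
import time


class IconCache:
    """Tiny in-memory cache so a relay fetches each icon from the origin once per TTL."""

    def __init__(self, fetch, ttl_seconds=3600.0, clock=time.monotonic):
        self._fetch = fetch        # hypothetical callable: url -> bytes
        self._ttl = ttl_seconds    # assumed TTL, not taken from any real app
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # url -> (expires_at, body)

    def get(self, url):
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]        # served from cache, origin untouched
        body = self._fetch(url)    # cache miss or expired: one origin request
        self._entries[url] = (now + self._ttl, body)
        return body


origin_hits = []


def fetch_icon(url):
    # Stand-in for the real HTTP request; it only records that the origin was hit.
    origin_hits.append(url)
    return b"fake-png-bytes"


cache = IconCache(fetch_icon)
for _ in range(10):                # ten users load the app
    cache.get("https://example.com/apple-touch.png")

print(len(origin_hits))            # prints 1
```

Without the cache, those ten loads would be ten requests to my server, which is exactly the pattern in the logs.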
I see the same effect on my machine. My guess is that since the L3 cache is shared between cores, space used by processes running on other cores is causing a bit of 'blurring' of the performance at the boundary, depending on how much data will fit.
I think it's this too. The L3 cache seems to be consistently harder to measure than the L1 or L2 caches. I suspect it's just noise from other cores, but I have no strong evidence to support that.
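For anyone who wants to poke at this themselves, here is a rough sketch of the general pointer-chasing idea: walk a random cycle through a working set of a given size and time each dependent access. This is not necessarily how cache_size does it, and the names `build_cycle` and `ns_per_access` are made up for the example. In Python the interpreter overhead and boxed integers smear the steps badly, so treat it as an illustration of the technique rather than a usable detector.

```python
import random
import time


def build_cycle(n, seed=0):
    """Return a list where nxt[i] is the next index, forming one cycle through all n slots."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    nxt = [0] * n
    # Link each slot to the one that follows it in the shuffled order, wrapping at the end.
    for a, b in zip(order, order[1:] + order[:1]):
        nxt[a] = b
    return nxt


def ns_per_access(n, steps=200_000):
    """Time a chain of dependent lookups over a working set of n slots."""
    nxt = build_cycle(n)
    i = 0
    start = time.perf_counter()
    for _ in range(steps):
        i = nxt[i]             # each lookup depends on the previous one
    elapsed = time.perf_counter() - start
    return elapsed / steps * 1e9


if __name__ == "__main__":
    # Sizes are in slots, not bytes; the chosen sizes are arbitrary.
    for n in (1 << 10, 1 << 14, 1 << 18):
        print(n, round(ns_per_access(n), 1))
```

The random order is there to defeat the hardware prefetcher. Anything else running on the machine can evict part of the working set between lookups, which is one way the boundary could end up blurred.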
I can see some pretty strong effects (and false positives) from what seem to be TLB misses. On my system, on Firefox, the L2 size is not detected correctly if transparent huge pages are enabled, but on Chromium it is. I'd like to take a closer look at what's going on, but the tooling for performance-related debugging in browsers seems to be more or less missing. I've opened a pull request for some fixes to the C implementation though.
It's certainly possible that there are some TLB issues with transparent huge pages that mess up the results. I've had a terrible time debugging the performance across different machines, browsers and OSes, and there seem to be an uncountable number of things that can go wrong, especially with the JITs in play. I'll try to track this one down later today, but I can't make any promises.
And here are my own results: [1]
EDIT: I've also mirrored the repo, and turned it into a site using GitLab CI: [2]
[0] https://github.com/allanlw/cache_size
[1] http://i.imgur.com/LKIvyCt.png
[2] http://shakna-israel.gitlab.io/cache_size/cache_size.html