I disagree with this. By that logic, 404 pages should also return 200, because the error page stating that the content couldn't be found was indeed rendered successfully.
The difference between your status.heroku.com example and this one is that in the former case you are asking for a page that tells you about their status, whereas in this case you asked for HN's index and got a page about status instead, because a problem prevented you from getting what you wanted.
So by that logic a wide variety of Cloudflare's serving modes ought to return a 503, including the one where it preserves cached content because the underlying server is unavailable.
Suppose HN consisted of two content panes, one pane was unavailable, and the unavailable pane was replaced by a message indicating a partial outage. Should HN return a non-200 response code then?
200 means "this response is being served as intended without the server serving it having an error". With Cloudflare acting as a hybrid caching proxy and static site, it is appropriate for it to return a 200.
If HN were not using Cloudflare, the underlying server would probably just show an automatically generated error message of some kind and return a 503 status.
See, if I ask for, say, `https://news.ycombinator.com/item?id=7015129`, and I _get_ the page that URL names (this thread), then that's a 200. Whether it was delivered from a Cloudflare cache or not, I got that page.
If I ask for `https://news.ycombinator.com/item?id=7015129`, and I get a "Sorry, this service is temporarily unavailable" message instead, I did not get what I asked for (200 "OK"), I got something else -- because the thing I asked for was unavailable, which is a 503 Service Unavailable.
It's all about what the URL identifies, and whether it was in fact successfully delivered or not.
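A minimal sketch of that position on the server side, written as a plain WSGI callable. `make_app` and `backend_available` are hypothetical names invented for the example, and the 120-second `Retry-After` value is arbitrary; nothing here reflects how HN or Cloudflare is actually configured.

```python
def make_app(backend_available):
    """Build a WSGI app. backend_available is a hypothetical zero-argument health check."""

    def app(environ, start_response):
        if backend_available():
            # The client gets the resource the URL names: that is a 200.
            body = b"<html><body>the requested page</body></html>"
            status = "200 OK"
            headers = [("Content-Type", "text/html")]
        else:
            # The client gets a human-readable apology, not the resource: 503.
            body = b"<html><body>Sorry, this service is temporarily unavailable.</body></html>"
            status = "503 Service Unavailable"
            headers = [
                ("Content-Type", "text/html"),
                ("Retry-After", "120"),  # arbitrary hint, in seconds
            ]
        headers.append(("Content-Length", str(len(body))))
        start_response(status, headers)
        return [body]

    return app
```

A browser still renders the friendly maintenance page either way; only the status line differs, and that is the part scripts and crawlers read.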
True, and the HN page that was up during the outage was actually the intended content at the time, given the backend problems. It was a customized static page, not an error page.
When we say API, we usually think of something like the Twitter API, where we can send JSON or XML and get back something easily parsable.
Sites like HN don't have that kind of service. Instead, they return HTML. That's fine. No one said an API can't return HTML.
When some part of the Twitter API becomes unavailable, the API server should return 503 when we ask those endpoints for a response. If those broken endpoints also power some part of the frontend, then the frontend will not work properly. For example, viewing thumbnails from the dashboard may be down while the rest of the page is functioning. In that case, you can't send 503 from the frontend. It doesn't make sense. So a crawler reading Twitter.com shouldn't see 503 when it hits the home page.
When HN is down, it should return 503. There are two reasons. First, HN is not functional anymore. The page you see may just be a maintenance page configured in Nginx, much like a 404 error page. Second, HN is itself an API service. When it is broken, it is broken. It just doesn't work anymore. When you try to access it through a Python script, it doesn't return the format you expect. When you debug, you realize it is not returning anything like what you were expecting because most of the HTML structure is gone. This is not an API format change. The backend is simply not functional anymore.
And semantically, when your site is in maintenance mode, 503 makes sense. 200 is not that evil. It just doesn't give anyone a useful clue.
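To illustrate the scripting case, here is a small sketch of a scraper step that checks the status code before touching the HTML. The function name, the exception, and the `class="story"` marker are all made up for the example; they are not HN's real markup.

```python
class ServiceUnavailable(Exception):
    """The server said it cannot serve the request right now."""


def count_stories(status: int, body: str) -> int:
    """Count story rows in a front-page response, refusing to parse error pages."""
    if status == 503:
        # With an honest 503 the script can stop here and retry later.
        raise ServiceUnavailable("got 503, not parsing the body")
    if status != 200:
        raise RuntimeError(f"unexpected status {status}")
    # Hypothetical marker for a story row; real markup will differ.
    return body.count('class="story"')


front_page = '<tr class="story"></tr><tr class="story"></tr>'
maintenance = "<html><body>Sorry, we are down for maintenance.</body></html>"

print(count_stories(200, front_page))   # 2
print(count_stories(200, maintenance))  # 0
```

The second call is the failure mode: a maintenance page served with 200 is indistinguishable, to the script, from a front page that happens to have no stories.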
That is ridiculous. You think the status intended to be conveyed for all HN URLs was "it is up"? It was intended that any automated tools trying to figure out if HN was up or not would decide "Yes, it's up."?
If that was intended, then it was a poor, misguided, unhelpful intention.
But someone who works on HN has already said it was not intended. There was not much intention involved at all: they just had other things to worry about (getting the site back up) and weren't thinking about it, because nobody had planned in advance how to handle an outage. https://news.ycombinator.com/item?id=7016141
At any rate, your ideology of HTTP status codes does not seem to match that of the actual HTTP designers, or of anyone trying to actually use HTTP status codes for anything. If you aren't going to use HTTP status codes for anything, then it hardly matters what they are, so there's no point in arguing about it. But as soon as you try writing software that uses HTTP status codes for anything, you will start hating sites that give you error messages with 200 OK response codes.
Precisely, I think people try to put too much semantic meaning into HTTP status codes. If you read the spec, most of the exotic semantics are actually part of the WebDAV spec and not actually the HTTP spec.
For something like a pure REST API, basic status codes are fine. It's when you start breaking the REST abstraction (which a human-readable landing page surely does) that it gets tempting to start misusing status codes.
For any kind of non-RESTful or procedural API, HTTP status codes are simply not adequate and application-specific error handling is necessary (HTTP response 200, app-specific-error 999, etc.).
Well, to illustrate my point just look at the WebDAV spec vs the HTTP spec. WebDAV shows how you can add more semantics to HTTP response codes.
But when you think about it, for a pure REST API you really only need classic HTTP response codes.
Response codes only make sense in the context of a resource. Once you introduce query strings or start to stretch the meaning of HTTP verbs the abstraction starts to leak.
So when designing an API it's smart to just handle errors at the application level rather than at the protocol level.
Protocol-level errors are for things that are outside of the application. An error response with data validation errors should return a 200, for example.
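A rough sketch of what that approach looks like in practice, for readers weighing it against the other view in this thread: the HTTP status stays 200 and the failure is reported inside the body. The handler, the field names, and the validation rules are invented for illustration; only the `999` code echoes the number mentioned above.

```python
import json

APP_VALIDATION_ERROR = 999  # application-specific code, as in the example above


def validate_signup(form: dict) -> dict:
    """Return a field -> message mapping; an empty dict means the form is valid."""
    errors = {}
    if "@" not in form.get("email", ""):
        errors["email"] = "must contain @"
    if len(form.get("password", "")) < 8:
        errors["password"] = "must be at least 8 characters"
    return errors


def handle_signup(form: dict) -> tuple[int, str]:
    """Return (http_status, json_body); validation failures still use HTTP 200."""
    errors = validate_signup(form)
    if errors:
        payload = {"ok": False, "app_error": APP_VALIDATION_ERROR, "errors": errors}
    else:
        payload = {"ok": True}
    return 200, json.dumps(payload)
```

The cost of this design is that every client has to parse the body to learn whether the call worked, which is exactly the objection raised earlier in the thread.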
At my work, we have had arguments about what searches in a REST API should return when no results (literally, in our parlance, "no documents") are found. Is that a 404 or a 200?
> The request has succeeded. The information returned with the response is dependent on the method used in the request, for example:
Is a query that turns up no documents when there are indeed no documents a success? I would argue yes, as the search service executed the query accurately...in fact, if the query turned up documents when it shouldn't have, that would probably be a problem...
But the way the 404 is worded, it would also fulfill the meaning of "no results".
> The server has not found anything matching the Request-URI. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
Does your search page display a list of results, or does it redirect to the best result? If the former, I'd say that a list of zero is still a valid search results 'document', and should therefore return a 200. But the latter could certainly return a 404, although I doubt that's how your search actually works :)
Either would arguably be a reasonable choice, and therefore you should use whichever is going to be more convenient for the actual use cases you know about.
Which is almost always 200, partly because that's what everyone will expect, since that's what everyone else does.
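The list-versus-single-resource distinction can be sketched in a few lines. The in-memory `DOCS` store and both function names are hypothetical; the point is only that an empty result list is still a valid search result, while a lookup of one specific missing document is not.

```python
# Hypothetical document store, just for the example.
DOCS = {1: "http status codes", 2: "webdav extensions"}


def search(query: str) -> tuple[int, dict]:
    """The search ran correctly, so it is a 200 even when nothing matches."""
    hits = [doc_id for doc_id, text in DOCS.items() if query in text]
    return 200, {"results": hits}


def get_document(doc_id: int) -> tuple[int, dict]:
    """A URL naming one specific document that does not exist is a 404."""
    if doc_id not in DOCS:
        return 404, {"error": "no such document"}
    return 200, {"document": DOCS[doc_id]}


print(search("gopher"))   # (200, {'results': []})
print(get_document(3))    # (404, {'error': 'no such document'})
```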