Top validation errors (w3clove.com)
39 points by cientifico on June 16, 2012 | 49 comments



Pie charts are intended to show how one constituent value compares to the whole. Since these errors are not mutually exclusive (a single page can fail validation for multiple reasons), and since the top 10 reasons constitute an arbitrary whole, they should really be using a bar chart.
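
To make the arithmetic behind this point concrete, here is a small Python sketch. The page names and error sets are made up for illustration and are not w3clove data. Because one page can trigger several errors, the per-error shares add up to more than 100%, so they cannot be slices of a single pie.

```python
from collections import Counter

# Hypothetical data: each page maps to the set of validation errors it triggers.
pages = {
    "a.html": {"missing alt", "bare ampersand"},
    "b.html": {"missing alt"},
    "c.html": {"missing alt", "bare ampersand", "unclosed tag"},
    "d.html": {"bare ampersand"},
}

def error_shares(pages):
    """Return, for each error, the fraction of pages that trigger it."""
    counts = Counter(err for errs in pages.values() for err in errs)
    return {err: n / len(pages) for err, n in counts.items()}

shares = error_shares(pages)
total = sum(shares.values())  # exceeds 1.0 because the errors overlap
print(shares, total)
```

Each share is meaningful on its own axis, which is what a bar chart shows.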

Um, like this: http://bl.ocks.org/2941604


Thanks Mike, I've updated the charts. It really looks better as a bar chart:

http://w3clove.com/charts/errors/bar


Sounds good, I'll make an alternative version using bars.


The two most popular are a crock:

* Required attribute alt missing: Most images are decorative, having alt is pointless in most cases. Should be optional.

* & didn't start a character reference: proving this was a bad choice for an escape character to begin with


> Required attribute alt missing: Most images are decorative, having alt is pointless in most cases. Should be optional.

NO. The ALT attribute must not be optional. Just put an empty alt in for decorative images, and learn how to write sensible alts or longdesc for other images.

And many of the supposedly decorative images are used functionally, in which case they need proper alts.


You indicate that an image is purely decorative by providing the attribute value `alt=""`.


Which provides no more information than no alt attribute at all.


No, it shows the author has thought about alt attributes and has included them because doing so is standards compliant.

For images that are purely decorative, including an empty alt attrib is valid. People using screen readers (and people with images turned off[1]) don't need to hear (or read) "SMALL RED BULLET" for all 7 items in the list.

[1] I often have images turned off. I'm using a mobile broadband dongle. My ISP uses horrible proxies to compress images. I'm in the UK. The IP of the proxies is 1.2.3.9 through 1.2.3.12 - this is sub-optimal for many obvious reasons. Bypassing this proxy is trivial, but means I download a lot more data.


> For images that are purely decorative, including an empty alt attrib is valid. People using screen readers (and people with images turned off[1]) don't need to hear (or read) "SMALL RED BULLET" for all 7 items in the list.

I typically use "-" or "*" as an alt for bullet images in text, and similar decorative markers. It seems to make more sense to me to emphasize that there's a list.


I'm guessing a 3G provider other than Three or Vodafone. Said proxy is why I left T-Mobile. Last I tried, you could instruct the proxy not to alter your web traffic using HTTP cache control headers, e.g. http://www.lewiz.org/2007/01/hacking-t-mobile-web-proxy.html


Yes! They re-write webpages to insert a bunch of script, which re-writes all alt tags to instructions for fetching higher quality images for this or all images on the page. It is annoying. They also have a pseudo-random interstitial telling me that I'm over the "fair use" allowance.

The only reason I'm still using them is that they don't charge for MB or GB; I pay for three months and get "unlimited surfing". They block some stuff after you hit 1GB, but that's trivially easy to get around. Which I do, since they've said I get unlimited web use.


No, it allows screen readers to distinguish between potentially important images with missing alt (author didn't care) and images that are known to be unimportant (author explicitly specified it).


And requiring alt means you lose that signal, since content providers will continue to be lazy and leave the alt blank, so now all images look like they're decorative, instead of some being "we don't know".

You can't solve a social problem (content providers who don't care) with a technical limitation (required alt tags).


If a content provider doesn't care about accessibility, why would they care about validation? They can leave out the alt attribute whether it is required by the spec or not.


Because they're all using wordpress/blogger/etc which put it in for them.


If the developers let these tools insert `alt=""` by default for all images, then they are not in compliance with the spec.


But if they leave out alt="", then they won't validate. What are they supposed to do, force the user to enter text? Then you'll end up with alt="sldfjsjf".


No, it is correct according to the HTML5 spec to leave out the alt-attribute if there is no valid alternative text available.


Yes it does. It provides the information that the image is purely decorative and can be ignored in a text-only representation.


The alt field specifies the text to be shown in the container when the resource is missing. If you don't use it, you'll get an eyesore in the form of a broken image icon within the space reserved for the resource. The alt="" instructs the browser to minimize the space reserved for a missing resource. This is indeed useful for layout elements that are only a few pixels on each side, because it keeps the layout from being messed up. It's used to keep the UI useful when things go missing.


AFAIK, using an ampersand character to represent an ampersand, rather than starting a character reference, is no longer a parse error in HTML 5 if it occurs in an attribute. See: http://dev.w3.org/html5/spec/tokenization.html#consume-a-cha...


I agree, the specification seems to set authors up for failure. I never understood why the language needed & followed by WHITESPACE to be escaped. It seems like it would be pretty easy to ignore this sequence in a parser?
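
For what it's worth, HTML5-style decoders already ignore that sequence. A minimal sketch using Python's standard `html.unescape`, which is documented as following the HTML5 rules for character references; the sample strings are made up, and this only illustrates decoding, not what the validator accepts:

```python
import html

# An ampersand followed by whitespace cannot start a character reference,
# so an HTML5-style decoder leaves it untouched.
bare = html.unescape("fish & chips")

# The escaped form decodes to the same text.
escaped = html.unescape("fish &amp; chips")

print(bare == escaped)
```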


This shows how well browsers handle bad HTML.

It always surprises me when I make a mistake in my code and I don't notice it until I next view the source, because the browser has understood what I meant to do.

I guess although this is good for the end user it isn't always a great thing for developers learning HTML if they're able to write bad HTML without ever knowing they're doing something wrong.


> This shows how well browsers handle bad HTML

It's not that hard to handle something like an <img> tag with no alt attribute.


I'd be glad if they changed the format of entities to something like `&non-whitespace;`. Suddenly all uses of & on its own would become legal, and the change would be backwards compatible.


This is one of those improvements that could have gone into HTML5. Maybe we'll see it in HTML6.

It's nice how they actually made writing HTML much nicer in HTML5, even without this. The simplification of the doctype and charset (<!doctype html><meta charset=utf-8>) was really nice.


Why use character entities at all when serving documents as UTF-8?


You need them for < > and &.
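
A short sketch of that point, using Python's standard `html.escape`; the sample strings are made up. Only the markup-significant characters get replaced, while non-ASCII text passes through unchanged and can be served as UTF-8:

```python
import html

text = "if a < b && b > c"

# quote=False escapes only <, > and &, which is enough for body text.
escaped = html.escape(text, quote=False)
print(escaped)

# Non-ASCII characters need no entity when the document is served as UTF-8.
accented = html.escape("café", quote=False)
print(accented)
```

Inside attribute values, the quote character used as the delimiter needs escaping too, which is what the default `quote=True` handles.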


It's interesting to see the validation for Hacker News:

http://w3clove.com/sitemaps/check?url=http%3A%2F%2Fnews.ycom...

You wouldn't guess there were so many images, would you? Should they have an "alt" attribute?


They are functional. They need an alt tag. Having [grayarrow.gif] and [graydown.gif] is sub-optimal.


While we're at it, we should stop using tables, too.


It is probably the up/down arrows.


Sure, it's the arrows. Should this be done with characters instead? Maybe CSS? This would indeed be a good case for using "alt"... alt="vote up", alt="vote down"


I think most of those are up/down arrows.


Anybody else think it's a little absurd that `alt` on img tags is still required for valid HTML?


In HTML5, the `alt` attribute can be left out if there is no way to provide an appropriate alternative text. The spec provides the example of "a blind photographer sharing an image on his blog".

In general though, the `alt` attribute should be present. If the image is purely decorative, the value should be an empty string (although CSS rather than `img` tags is recommended for purely decorative images anyway).

The purpose of the alt text is to replace the image if you are browsing without images (e.g. using lynx or a screen reader). An empty string means no replacement text, while a missing alt attribute could be replaced with text saying e.g. "[unknown image]".

It makes sense.
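
The three cases described above can be sketched as a tiny text-only renderer. This is an illustration built on Python's standard `html.parser`, not how lynx or any screen reader actually works; the class name, function name, and file names are hypothetical.

```python
from html.parser import HTMLParser

class TextOnlyRenderer(HTMLParser):
    """Collect page text, replacing each <img> according to its alt attribute."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if "alt" not in attributes:
            # Missing alt: we cannot tell whether the image matters.
            self.parts.append("[unknown image]")
        elif attributes["alt"]:
            # Non-empty alt: use the author's replacement text.
            self.parts.append(attributes["alt"])
        # Empty alt (alt=""): decorative image, emit nothing.

    def handle_data(self, data):
        self.parts.append(data)

def render_text(markup):
    """Return the text-only rendering of an HTML fragment."""
    renderer = TextOnlyRenderer()
    renderer.feed(markup)
    renderer.close()
    return "".join(renderer.parts)

sample = '<img src="b.gif" alt="">Item <img src="up.gif" alt="vote up"> <img src="x.gif">'
print(render_text(sample))
```

The decorative bullet disappears, the functional arrow is read out as its alt text, and the image with no alt at all is flagged as unknown.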


No. Why is it absurd?


It's absolutely ridiculous, especially since you can get away with alt="".


No it's not. An empty alt specifies that it's a presentational image. On the other hand, no alt is unhelpful, as you can't tell if the omission was intentional, or the developer is just lazy.


An empty alt can also just mean "I used a tool to generate this and didn't fill out the alt field" or "I just want this page to validate and still don't care".

Making the alt tag required won't make lazy developers/content providers any less lazy, so requiring it actually weakens the signal you're looking for.


And then if we make it optional, now developers and software don't have to even consider putting any alt content.

At least requiring one makes them think about it. And tools can put in "image".


Is it better that the lazy developer is forced to insert empty alt="" to get the site to validate? Your argument is actually in favor of an optional alt tag. Allowing for a missing alt tag means you CAN tell the difference between an intentional empty alt, or a lazy developer missing the tag.


Allowing for a missing alt tag means developers don't even have to consider providing alt info, which is a terrible idea. At least this way they have to put something there, even if it's blank.


How is it better that every single image on a page has alt="image" or alt="asdfasdf", instead of a missing alt-tag?

If the alt tag was recommended - but optional - then a missing alt-tag tells you the developer didn't think about it (and who knows if the image is significant), while a present but empty alt-tag indicates the image really is decorative/insignificant.

As long as the alt-tag is required, a valid site will contain alt="" all over the place, and you can't know whether that's because the site developer was lazy (but was using validating tools) or if the site developer decided to flag those images insignificant.

Am I feeding a troll here?

(edit: It's like the difference between an SQL NULL and an empty string.)


No, I'm not trolling.

I just disagree.


what on earth is "mibenumid" (last entry)? googling for it shows that it's a popular name for a url query param, but i'm not sure how that's relevant (perhaps the error comes from the "&" separator in a URL not being escaped?). is it used by some popular middleware? the pages google is throwing up seem to be asp...


http://www.iana.org/assignments/character-sets

>The MIBenum value is a unique value for use in MIBs to identify coded character sets.

>The value space for MIBenum values has been divided into three regions. The first region (3-999) consists of coded character sets that have been standardized by some standard setting organization. This region is intended for standards that do not have subset implementations. The second region (1000-1999) is for the Unicode and ISO/IEC 10646 coded character sets together with a specification of a (set of) sub-repertoires that may occur. The third region (>1999) is intended for vendor specific coded character sets.

However that specific error (reference to entity "x" for which no system identifier could be generated) has this message:

>This is usually a cascading error caused by a an undefined entity reference or use of an unencoded ampersand (&) in an URL or body text. See the previous message for further details.

I notice this on one website that is using MIBEnumID:

<span
  class="st_facebook_large"
  displaytext="Facebook"
  st_url='http://www.hardrock.com/locations/cafes3/events.aspx?locationid=108&MIBenumID
  
Could it be that it is just throwing this error because of the unencoded ampersand? And that since it is part of some MS/ASP.net setup it is everywhere?
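
A rough way to check for this kind of thing is to scan attribute values for ampersands that don't begin a complete character reference. The sketch below is an assumption-laden approximation, not the validator's logic: the regular expression, the function names, and the example URL are all made up, and it treats any `&name;` as a reference without checking that the name is a defined entity.

```python
import re

# An "&" counts as bare unless it starts a complete reference:
# a named one (&amp;), a decimal one (&#38;), or a hex one (&#x26;).
BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)

def find_bare_ampersands(attribute_value):
    """Return the offsets of ampersands that do not start a character reference."""
    return [m.start() for m in BARE_AMPERSAND.finditer(attribute_value)]

def escape_bare_ampersands(attribute_value):
    """Rewrite each bare ampersand as &amp;, leaving existing references alone."""
    return BARE_AMPERSAND.sub("&amp;", attribute_value)

# Hypothetical URL, not the one from the page quoted above.
url = "http://example.com/events.aspx?id=108&page=2"
fixed = escape_bare_ampersands(url)
print(fixed)
```

Running the escaping step a second time leaves the URL unchanged, since `&amp;` is already a complete reference.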


Yeah, I agree too that a missing "alt" attribute should not be considered an error. Maybe just a warning.

There's a separate chart for the top 100 warnings:

http://w3clove.com/charts/warnings


Uh, forget the alt; should the missing closing img tag be on the list? Or the missing closing br? I've never used those, ever. They make no sense.



