
You might want to look at doing this in several passes:

- Build a site-map, like parsers build a syntax tree.

- Follow that to validate one page at a time.
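As a rough illustration of the two passes above, here is a minimal Python sketch. It is not the code from either project in this thread: `build_sitemap`, `validate_site`, `fetch`, and `validate` are hypothetical names, and the sketch assumes the caller supplies `fetch` (URL in, HTML text out) and `validate` (HTML text in, result out).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def build_sitemap(start_url, fetch):
    """Pass 1: breadth-first crawl of same-host pages.

    Returns {url: [same-host urls linked from that page]}.
    `fetch` is a caller-supplied function (hypothetical) returning HTML text.
    """
    host = urlparse(start_url).netloc
    sitemap = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in sitemap:
            continue  # already crawled
        parser = LinkCollector()
        parser.feed(fetch(url))
        children = []
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in children:
                children.append(absolute)
        sitemap[url] = children
        queue.extend(c for c in children if c not in sitemap)
    return sitemap


def validate_site(sitemap, fetch, validate):
    """Pass 2: walk the finished site-map and validate one page at a time."""
    return {url: validate(fetch(url)) for url in sitemap}
```

Keeping the passes separate means the validator never has to discover links itself; it just follows the map that the first pass produced.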




That is certainly a valid way of doing it, and it is the approach I have used in several search-engine projects in the past (like this: http://intrasitesearchsupport.com/).

However, in this case I am offloading the work of building the site-map to the WDG Validation service, as I didn't want to have to obtain new servers to provide a free service. This means that I don't get the site-map until the WDG results come back...



