- From: Dan Brickley <danbri@danbri.org>
- Date: Mon, 14 Jul 2008 15:48:17 -0400
- To: Tom Heath <Tom.Heath@talis.com>
- Cc: Richard Cyganiak <richard@cyganiak.de>, Mark Birbeck <mark.birbeck@webbackplane.com>, public-lod@w3.org, semantic-web@w3.org
Tom Heath wrote:
>
> As always it's a case of the right tool for the right job. Regarding
> your other (admittedly unfounded) claim, there may be many more people
> who end up publishing RDF as RDFa, but collectively they may end up
> publishing far fewer triples in total than a small number of publishers
> with very large data sets who choose to use RDF/XML to expose data from
> backend DBs.

Hey, size isn't everything :)

Generating a massive RDF dataset is as easy as piping one's HTTP logs through sed (a sketch follows below).

There are many measures of data utility. Is the data fresh? Accurate? Useful? Maintained? *Used*? Does it exploit well-known vocabularies? Does it use identifiers that other people use, or identification strategies that allow cross-reference with other data anyway? Are the associated HTTP servers kept patched and secure? Is it available over SSL? Is there at least 5 years paid up on each associated DNS hostname used? Do we know who owns and takes care of those ___domain names? Does it link out? Do people link in? Does the data have a clear license, and respect users' privacy wishes where appropriate? Is it I18N-cool?

On the size question: I'm wary of encouraging a 'bigger is better' attitude to triple count. In data as in prose, brevity is valuable. Extra triples add cost at the aggregation and querying level; e.g. sometimes a workplaceHomepage triple is better than having a 'workplace' one and a 'homepage' one.

cheers,

Dan

--
http://danbri.org/
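A minimal sketch of the log-to-triples point above, in Python rather than sed; the Apache-style log layout and the example.org predicate names are illustrative assumptions, not anything from the original mail:

```python
#!/usr/bin/env python3
"""Turn Apache-style access-log lines into N-Triples on stdout.

Illustrates how cheaply a 'massive' RDF dataset can be produced:
three triples per request, with no guarantee that any of them are
useful. The log layout and the example.org predicates are
assumptions made for this sketch only.
"""
import sys
import uuid

NS = "http://example.org/log#"

def line_to_ntriples(line):
    parts = line.split()
    if len(parts) < 9:
        return []  # skip lines that don't look like common log format
    client, path, status = parts[0], parts[6], parts[8]
    req = f"_:req{uuid.uuid4().hex}"  # one blank node per request
    return [
        f'{req} <{NS}client> "{client}" .',
        f'{req} <{NS}path> "{path}" .',
        f'{req} <{NS}status> "{status}" .',
    ]

if __name__ == "__main__":
    for line in sys.stdin:
        for triple in line_to_ntriples(line):
            print(triple)
```

Run it as, e.g., `python3 log2nt.py < access.log > access.nt`: a week of server logs yields millions of syntactically valid triples without answering a single one of the quality questions listed above.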
Received on Monday, 14 July 2008 19:49:02 UTC