- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Mon, 2 Feb 2009 00:44:15 +0000
- To: Jiri Prochazka <ojirio@gmail.com>
- Cc: public-lod@w3.org
Jiri,

Thanks for the feedback.

On 1 Feb 2009, at 21:03, Jiri Prochazka wrote:
> In the article I haven't found a solid definition of what is a dataset
> and when to use another dataset/subset. I think this has to be clearly
> defined.

“A dataset in voiD (void:Dataset) is a collection of data, which is:
- published and maintained by a single provider, and
- available as RDF, and
- accessible, for example, through dereferenceable HTTP URIs or a SPARQL endpoint.”

I think this is as clear as is possible without becoming overly constraining.

> From what I understood, the publisher which is the "primary key" of
> datasets.

It's three points, see above.

> I think that it should be emphasized that categorizing datasets should
> only be used, if the data in it are somewhat homogeneous - the
> categorization applies to all of it.

Categorization is an art that is way older than voiD, and we don't want to tell people how to do it properly! And I definitely don't agree with you when you say that “a categorization must apply to all of the dataset”. For example, I think it would be absolutely adequate to say that DBpedia is about people and geography, because it is a sizable and valuable resource for both those areas, even though it also contains data about lots of other things.

> I guess the categorization it is fairly unusable in use cases like
> personal website, because the information are various...

Well, http://dbpedia.org/resource/Personal_web_page might be a nice subject here. (Assuming that you do have some interesting RDF on your site!)

(I note with regret that the Wikipedia article on “Random stuff” has been deleted; it would make for another nice DBpedia resource...)

> Another thing - dataset partitioning. Combination of dataset
> categorization and partitioning led me to great confusion - I have
> thought voiD also wanted to categorize the data in the dataset.
> Better to put a notice that partitioning should be used carefully and
> that it was designed for mirroring of datasets.

I don't understand.

“I have thought voiD also wanted to categorize the data in the dataset” -- yes, that IS what we want.

“partitioning was designed for mirroring of datasets” -- no, it was designed for cases where voiD authors want to say something about just a part of the dataset, and not about the entire dataset, for whatever reason.

Best,
Richard

>
> Best regards,
> Jiri Prochazka
>
> PS: Please send the replies also directly to me, as I am not subscribed
> to this list.
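As a rough illustration of the two ideas discussed in this message, the sketch below builds a small voiD description as Turtle text: a dataset categorized by subject, plus a subset that carries a statement applying only to part of the data. It assumes dcterms:subject is used for categorization and void:subset for partitioning; the example.org URIs, the two DBpedia subject URIs, and the `describe_dataset` helper are hypothetical and chosen only for illustration.

```python
# Minimal sketch: emit a voiD description as Turtle using plain strings.
# All example.org URIs and the helper name are hypothetical.

PREFIXES = (
    "@prefix void: <http://rdfs.org/ns/void#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)


def describe_dataset(uri, subjects=(), subsets=(), sparql_endpoint=None):
    """Return a Turtle block describing one void:Dataset."""
    statements = ["a void:Dataset"]
    if sparql_endpoint is not None:
        statements.append(f"void:sparqlEndpoint <{sparql_endpoint}>")
    # Categorization: what the dataset (or part of it) is about.
    statements.extend(f"dcterms:subject <{s}>" for s in subjects)
    # Partitioning: point at parts that get their own description.
    statements.extend(f"void:subset <{s}>" for s in subsets)
    return f"<{uri}>\n    " + " ;\n    ".join(statements) + " .\n"


# The whole dataset is "about" two topics, even if it holds other data too.
whole = describe_dataset(
    "http://example.org/void#MyDataset",
    subjects=[
        "http://dbpedia.org/resource/Person",
        "http://dbpedia.org/resource/Geography",
    ],
    subsets=["http://example.org/void#MyDataset-places"],
    sparql_endpoint="http://example.org/sparql",
)

# A statement made about just one part of the dataset.
part = describe_dataset(
    "http://example.org/void#MyDataset-places",
    subjects=["http://dbpedia.org/resource/Geography"],
)

turtle = PREFIXES + "\n" + whole + "\n" + part
print(turtle)
```

Here the subjects on the whole dataset do not claim that every triple is about people or geography; the subset is simply a way to say something narrower about one part.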
Received on Monday, 2 February 2009 00:44:57 UTC