- From: Gaurav Vaidya <gaurav@ggvaidya.com>
- Date: Tue, 29 Jul 2014 20:32:29 -0600
- To: dbpedia-discussion@lists.sourceforge.net, dbpedia-developers@lists.sourceforge.net, dbpedia-dutch@lists.sourceforge.net, wikidata-l@lists.wikimedia.org, public-lod@w3.org, Wikimedia Commons Discussion List <commons-l@lists.wikimedia.org>
Hi everybody,

We are happy to announce an experimental RDF dump of the Wikimedia Commons. A complete first draft is now available online at http://nl.dbpedia.org/downloads/commonswiki/20140705/, and will eventually be accessible from http://commons.dbpedia.org. A small sample dataset, which may be easier to browse, is available on GitHub at https://github.com/gaurav/commons-extraction/tree/master/commonswiki/20140101

The following datasets showcase some of the improvements that we've been working on over the last two months:

- File information (*-file-information.*) is a completely new dataset that contains information on the files in the Commons, including file and thumbnail URLs, file extensions, file type classes and MIME types.
- DBpedia's Mappings Extractor (*-mappingbased-properties.*) uses templates stored on the Mapping server (http://mappings.dbpedia.org/) to create RDF for information-rich templates. This system still has some important limitations, such as not being able to process embedded templates (e.g. license templates inside {{Information}}), but top-level templates are completely configurable. The existing mappings are available at http://mappings.dbpedia.org/index.php/Mapping_commons
  - This includes 363 license templates that indicate licensing for Commons files under public domain, Creative Commons and other open access licenses. These were created by bots and still require verification before use. They are listed at http://mappings.dbpedia.org/index.php/Category:Commons_media_license
- The DBpedia Geoextractor (*-geo-coordinates.*) now extracts geographical coordinates from Commons files using the {{Location}} template.
- The DBpedia SKOS Extractor (*-skos-categories.*) now identifies relationships between Commons categories, building a SKOS-based description of the entire Commons category tree.

Please have a look and let us know what you think.
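
For a quick look at one of these dumps without loading it into a triple store, something like the following Python sketch may help. It assumes the file is in N-Triples format (one triple per line), which is an assumption and not stated above. The sample lines and the example.org IRIs are made up for illustration and are not taken from the actual data; count_predicates is a hypothetical helper, not part of the extraction framework.

```python
import re
from collections import Counter

# Matches one simple N-Triples statement whose subject and predicate are IRIs.
# Blank-node subjects are deliberately ignored in this sketch.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def count_predicates(lines):
    """Count how often each predicate IRI occurs in an iterable of N-Triples lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith('#'):
            continue
        match = TRIPLE.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Hypothetical sample lines, made up for illustration only.
SAMPLE = [
    '<http://example.org/File:A.jpg> <http://example.org/fileExtension> "jpg" .',
    '<http://example.org/File:A.jpg> <http://example.org/mimeType> "image/jpeg" .',
    '<http://example.org/File:B.png> <http://example.org/fileExtension> "png" .',
    '# a comment line',
]

counts = count_predicates(SAMPLE)
for predicate, n in counts.most_common():
    print(n, predicate)
```

To run this on a downloaded file, pass an open text-mode file handle to count_predicates instead of SAMPLE. If the file is bz2-compressed (again an assumption), Python's bz2.open(path, "rt", encoding="utf-8") returns such a handle.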
We'll be working on a number of open tasks over the next three weeks, listed at https://github.com/gaurav/extraction-framework/issues?state=open -- if you see something wrong with what we've done above, or have an issue you'd particularly like us to tackle, please report it there or drop me an e-mail!

This work is sponsored by the Google Summer of Code program (https://www.google-melange.com/gsoc/project/details/google/gsoc2014/gaurav/5676830073815040).

Thanks!

cheers,
The DBpedia Commons extraction team:
Gaurav Vaidya
Dimitris Kontokostas
Andrea Di Menna
Jimmy O'Regan
Received on Thursday, 31 July 2014 07:26:40 UTC