Commons:Machine-readable data/pt

Shortcut: COM:MRD

On Wikimedia Commons, much of the metadata (including the license and the author) is not machine-readable. There is an API module, iiprop=extmetadata, that can be used to retrieve some values (example), but because the information is entered as free text on the file description page itself, the results are imperfect. There are plans to move the metadata into the database, but this will not happen soon.

To compensate, and to ease the transition to more structured data in the future, Wikimedia Commons uses a set of standard templates that have been made machine-readable in several ways through HTML elements. Some scripts already make use of them. Notably, this data is available to any wiki that uses Wikimedia Commons, where it can be read from the HTML of the File: page, just like other local data.

Machine readable data

Machine readable data set by infobox templates

There are several standard infobox templates that tag different elements of the template with different tags to allow the information to be parsed. Several different styles of tags are used:

  • Microformat tags follow industry standards and can be parsed by already existing tools.
  • <td> id attributes (identifiers) are custom markers that allow more complete tagging but have to be read by custom tools. Most universal infoboxes have a two-column structure: column 1 holds the name of the field and column 2 holds the value.
    • Traditionally, <td> id attributes were used to tag the name cell in the first column of a row. To get the data, you need to read the contents of the following <td> cell in the second column.
    • The {{Creator}} and {{Institution}} templates have a more complicated structure, so the cells holding the actual data are tagged directly; these attributes are marked with a magenta background in the table below.

| Template | Template parameter name | Description | <td> id attribute | Microformat | Comment |
|---|---|---|---|---|---|
| {{Information}} | description | description of the file | fileinfotpl_desc | hProduct.description | Often contains multiple languages annotated with {{Lang}}. |
| {{Information}} | date | original creation date of the work | fileinfotpl_date | hCalendar vevent.dtstart | microformat added by the {{Date}} template |
| {{Information}} | source | source of the file | fileinfotpl_src | | Often contains entire tables. We have no good way to deal with these source templates yet. Source templates often have references to catalogue IDs, but these are also not machine readable. |
| {{Information}} | author | author of the file | fileinfotpl_aut | | This can be the author, creator and/or copyright holder, and the usage is mixed. Often contains the {{Creator}} template, which is described below. |
| {{Information}} | permission | license/permission of the file | fileinfotpl_perm | | |
| {{Information}} | other versions | other versions of the file | fileinfotpl_ver | | |
| {{Artwork}} | description | description of the artwork | fileinfotpl_desc | hProduct.description | |
| {{Artwork}} | date | original creation date of the artwork | fileinfotpl_date | hCalendar vevent.dtstart | microformat added by the {{Date}} template |
| {{Artwork}} | source | source of the file | fileinfotpl_src | | |
| {{Artwork}} | artist | creator of the artwork | fileinfotpl_aut | "hProduct.fn value" | |
| {{Artwork}} | author | author of the artwork | fileinfotpl_aut | "hProduct.fn value" | |
| {{Artwork}} | permission | license/permission of the file and artwork | fileinfotpl_perm | | |
| {{Artwork}} | other versions | other versions of the file | fileinfotpl_ver | | |
| {{Artwork}} | title | title of the artwork | fileinfotpl_art_title | hProduct.fn | |
| {{Artwork}} | object type | object type of the artwork | fileinfotpl_art_object_type | | |
| {{Artwork}} | medium | technique and materials of the artwork | fileinfotpl_art_medium | | |
| {{Artwork}} | dimensions | dimensions of the artwork | fileinfotpl_art_dimensions | | |
| {{Artwork}} | gallery | institution that owns the artwork | fileinfotpl_art_gallery | | |
| {{Artwork}} | location | location of the artwork within the institution | fileinfotpl_art_location | hProduct.locality | |
| {{Artwork}} | accession number | accession number of the artwork | fileinfotpl_art_id | hProduct.identifier | |
| {{Artwork}} | object history | object history of the artwork | fileinfotpl_art_object_history | | |
| {{Artwork}} | exhibition history | exhibition history of the artwork | fileinfotpl_art_exhibition_history | | |
| {{Artwork}} | credit line | credit line of the artwork | fileinfotpl_art_credit_line | | |
| {{Artwork}} | inscriptions | inscriptions on the artwork | fileinfotpl_art_inscriptions | | |
| {{Artwork}} | notes | notes about the artwork | fileinfotpl_art_notes | | |
| {{Artwork}} | references | references related to the artwork | fileinfotpl_art_references | | |
| {{Book}} | Author | author of the book | fileinfotpl_author | | |
| {{Book}} | Editor | editor of the book | fileinfotpl_book_editor | | |
| {{Book}} | Translator | translator of the book | fileinfotpl_book_translator | | |
| {{Book}} | Illustrator | illustrator of the book | fileinfotpl_book_illustrator | | |
| {{Book}} | Title | title of the book | fileinfotpl_book_title | | |
| {{Book}} | Subtitle | subtitle of the book | fileinfotpl_book_subtitle | | |
| {{Book}} | Series title | title of the book series | fileinfotpl_book_series-title | | |
| {{Book}} | Authority file | authority control data | fileinfotpl_book_authority | | |
| {{Book}} | Publisher | publisher of the book | fileinfotpl_book_publisher | | |
| {{Book}} | Printer | printer of the book | fileinfotpl_book_printer | | |
| {{Book}} | Year of publication | date or year of publication of the book | fileinfotpl_date | | |
| {{Book}} | Place of publication | place or city of publication of the book | fileinfotpl_book_place-of-publication | | |
| {{Book}} | Language | language of the book | fileinfotpl_book_language | | |
| {{Book}} | Description | description of the book | fileinfotpl_desc | | |
| {{Creator}} | Name | name of the creator | creator | vCard.fn | |
| {{Creator}} | Alternative names | alternative names of the creator | fileinfotpl_creator_alt-name_value | vCard.nickname | |
| {{Creator}} | Description | nationality and occupation(s) of the creator | fileinfotpl_creator_desc_value | vCard.note | |
| {{Creator}} | Date of death | date of death of the creator | fileinfotpl_creator_deathdate_value | | |
| {{Creator}} | Date of birth | date of birth of the creator | fileinfotpl_creator_birthdate_value | vCard.bday | |
| {{Creator}} | Location of birth/death | place of death of the creator | fileinfotpl_creator_deathloc_value | | |
| {{Creator}} | Location of birth | place of birth of the creator | fileinfotpl_creator_birthloc_value | | |
| {{Creator}} | Work period | period of activity of the creator | fileinfotpl_creator_work-period_value | | |
| {{Creator}} | Work location | work location of the creator | fileinfotpl_creator_work-location_value | | |
| {{Creator}} | Image | portrait or photo showing the creator | fileinfotpl_creator_image | | |
| {{Creator}} | Authority file | authority control related to the creator | fileinfotpl_creator_authority_value | | |
| {{FileContentsByBot}} | (various) | depends, please see {{FileContentsByBot}} | (various) | hproduct-by-bot | large and still growing data set, please see {{FileContentsByBot}} |
| {{Photograph}} | title | title of the photograph | fileinfotpl_art_title | hProduct.fn | |
| {{Photograph}} | description | description of the photograph | fileinfotpl_desc | hProduct.description | |
| {{Photograph}} | original description | original archival description of the photograph | fileinfotpl_desc | hProduct.description | |
| {{Photograph}} | date | creation date of the original artwork | fileinfotpl_date | hCalendar vevent.dtstart | microformat added by the {{Date}} template |
| {{Photograph}} | medium | technique and materials of the photograph | fileinfotpl_art_medium | | |
| {{Photograph}} | dimensions | dimensions of the photograph | fileinfotpl_art_dimensions | | |
| {{Photograph}} | artist | creator of the photograph | fileinfotpl_aut | "hProduct.fn value" | |
| {{Photograph}} | institution | institution that owns the photograph | fileinfotpl_art_gallery | | |
| {{Photograph}} | location | location of the photograph within the institution | fileinfotpl_art_location | hProduct.locality | |
| {{Photograph}} | source | source of the file | fileinfotpl_src | | |
| {{Photograph}} | permission | license/permission of the file and artwork | fileinfotpl_perm | | |
| {{Photograph}} | other versions | other versions of the file | fileinfotpl_ver | | |
| {{Photograph}} | accession number | accession number of the photograph | | hProduct.identifier | |
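
The traditional id-based lookup described above (find the <td> that carries a fileinfotpl_* id, then read the next <td>) could be sketched roughly as follows. This is only an illustration using Python's standard html.parser module; the class FileInfoIdParser, the helper parse_fileinfo_ids and the sample HTML are hypothetical and not part of any Commons tool. It assumes the simple two-column layout and does not cover the {{Creator}} style, where the value cell itself is tagged.

```python
from html.parser import HTMLParser


class FileInfoIdParser(HTMLParser):
    """Map each fileinfotpl_* id to the text of the <td> that follows it."""

    def __init__(self):
        super().__init__()
        self.fields = {}       # id attribute -> text of the value cell
        self._pending = None   # id seen on the most recent label cell
        self._current = None   # id whose value cell is being read
        self._depth = 0        # <td> nesting depth inside the value cell
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag != "td":
            return
        if self._current is not None:
            self._depth += 1   # a nested table cell inside the value
            return
        cell_id = dict(attrs).get("id") or ""
        if cell_id.startswith("fileinfotpl_"):
            self._pending = cell_id   # label cell; the value is in the next <td>
        elif self._pending is not None:
            self._current, self._pending = self._pending, None
            self._depth, self._buf = 1, []

    def handle_endtag(self, tag):
        if tag == "td" and self._current is not None:
            self._depth -= 1
            if self._depth == 0:
                # Collapse whitespace; formatting is not part of the data.
                self.fields[self._current] = " ".join("".join(self._buf).split())
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._buf.append(data)


def parse_fileinfo_ids(html):
    parser = FileInfoIdParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hand-written sample markup, not copied from a real file page.
SAMPLE = """
<table>
<tr><td id="fileinfotpl_desc">Description</td><td><span lang="en">A red   bridge</span></td></tr>
<tr><td id="fileinfotpl_aut">Author</td><td>Example Author</td></tr>
</table>
"""

print(parse_fileinfo_ids(SAMPLE))
```

Markup inside the value cell is flattened to plain text here; a real consumer would probably want to keep links and language annotations.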

Alternative format for CommonsMetadata

Because the table + id based format proved very hard to add to templates which were not formatted similarly to the Commons information template, CommonsMetadata allows an alternative format, similar to license templates: the whole information template has to be enclosed in a fileinfotpl class and the tag containing the specific information needs to have a fileinfotpl_* class (same names as above, but class, not id).
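
As a rough illustration of the class-based variant, the sketch below collects the text of every element that carries a fileinfotpl_* class, but only inside an element with the fileinfotpl class. It uses Python's standard html.parser module; the names FileInfoClassParser and parse_fileinfo_classes and the sample HTML are assumptions made for this example, and well-formed, properly nested markup is assumed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag and must not affect nesting.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class FileInfoClassParser(HTMLParser):
    """Collect text of fileinfotpl_* elements found inside a fileinfotpl wrapper."""

    def __init__(self):
        super().__init__()
        self.fields = {}     # class name -> list of values (classes may repeat)
        self._stack = []     # per open element: ("wrap", None), ("field", name) or None
        self._wrappers = 0   # number of open fileinfotpl wrappers
        self._buffers = []   # text buffers of the currently open field elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        names = [c for c in classes if c.startswith("fileinfotpl_")]
        if "fileinfotpl" in classes:
            self._wrappers += 1
            self._stack.append(("wrap", None))
        elif names and self._wrappers:
            self._buffers.append([])
            self._stack.append(("field", names[0]))
        else:
            self._stack.append(None)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        entry = self._stack.pop()
        if entry is None:
            return
        kind, name = entry
        if kind == "wrap":
            self._wrappers -= 1
        else:
            text = " ".join("".join(self._buffers.pop()).split())
            self.fields.setdefault(name, []).append(text)

    def handle_data(self, data):
        for buf in self._buffers:
            buf.append(data)


def parse_fileinfo_classes(html):
    parser = FileInfoClassParser()
    parser.feed(html)
    parser.close()
    return parser.fields


# Hand-written sample markup, not copied from a real template.
SAMPLE = """
<div class="fileinfotpl">
  <p>Photo: <span class="fileinfotpl_aut">Example Author</span><br>
  <span class="fileinfotpl_desc">Harbour at <b>dusk</b></span></p>
</div>
<span class="fileinfotpl_desc">outside any wrapper, ignored</span>
"""

print(parse_fileinfo_classes(SAMPLE))
```

Values are stored in lists because, unlike an id, a class may legitimately occur several times on one page.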

Machine readable data set by license templates

Introduced in October 2010, using classes <span class="licensetpl_XXX">

licensetpl
An element identifying a license. Wraps the entire license code and should be a SINGLE license, not a multi license.
licensetpl_short
Short name of the license: “Public domain”, “CC BY-SA 3.0”, “CC by 2.0 fr”, etc.
licensetpl_long
Long name of the license: “Public domain”, “Creative Commons Attribution-Share Alike 3.0”, etc.
licensetpl_attr_req
Whether attribution is required. “true” or “false”.
licensetpl_attr
The requested attribution: Free text.
licensetpl_link_req
Whether a link to the license is required for this license. “true” or “false”.
licensetpl_link
The link to the license deed. “www.creativecommons.org/licenses/by-sa/XXX/YYY”
licensetpl_nonfree
“true” if this is a non-free license (not used on Commons, only on wikis with an EDP).

Multiple licensetpl blocks for the same work might be wrapped in a block using the class licensetpl_wrapper.
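
To illustrate how these classes might be consumed, the following sketch turns each licensetpl block into a dictionary keyed by the part of the class name after licensetpl_, converting the “true”/“false” fields to booleans. It relies only on Python's standard html.parser module. The names LicenseParser and parse_licenses and the sample markup are hypothetical, and the sketch assumes that licensetpl blocks are not nested inside each other.

```python
from html.parser import HTMLParser

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
BOOLEAN_KEYS = {"attr_req", "link_req", "nonfree"}
PREFIX = "licensetpl_"


class LicenseParser(HTMLParser):
    """Build one dict per licensetpl block from its licensetpl_* children."""

    def __init__(self):
        super().__init__()
        self.licenses = []   # one dict per licensetpl block, in page order
        self._stack = []     # per open element: ("block", None), ("field", key) or None
        self._in_block = 0
        self._buf = None     # text buffer of the field being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        keys = [c[len(PREFIX):] for c in classes
                if c.startswith(PREFIX) and c != "licensetpl_wrapper"]
        if "licensetpl" in classes:
            self.licenses.append({})
            self._in_block += 1
            self._stack.append(("block", None))
        elif keys and self._in_block and self._buf is None:
            self._buf = []
            self._stack.append(("field", keys[0]))
        else:
            self._stack.append(None)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        entry = self._stack.pop()
        if entry is None:
            return
        kind, key = entry
        if kind == "block":
            self._in_block -= 1
        else:
            text = " ".join("".join(self._buf).split())
            self._buf = None
            self.licenses[-1][key] = (text == "true") if key in BOOLEAN_KEYS else text

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)


def parse_licenses(html):
    parser = LicenseParser()
    parser.feed(html)
    parser.close()
    return parser.licenses


# Hand-written sample markup for a dual-licensed file.
SAMPLE = """
<div class="licensetpl_wrapper">
  <div class="licensetpl">
    <span class="licensetpl_short">CC BY-SA 3.0</span>
    <span class="licensetpl_long">Creative Commons Attribution-Share Alike 3.0</span>
    <span class="licensetpl_attr_req">true</span>
    <span class="licensetpl_link_req">true</span>
  </div>
  <div class="licensetpl">
    <span class="licensetpl_short">Public domain</span>
    <span class="licensetpl_attr_req">false</span>
  </div>
</div>
"""

for lic in parse_licenses(SAMPLE):
    print(lic)
```

Because each licensetpl element should describe a single license, a multi-licensed file simply yields several dictionaries.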

Templates setting this information

  • Templates setting licensetpl include:

{{PD-Layout}}, {{Cc-by-sa-3.0-migrated}}, {{Cc-by-layout}}, {{Cc-by-sa-layout}}, {{Cc-zero}}, {{FAL}}, {{GFDL}}, {{GFDL-1.2}}, {{GPL}} and {{LGPL}}.

Machine readable data set by style formatting templates

Style formatting templates, meant to provide uniform styles to different families of non-license templates, carry machine readable data identifying these families.

| Template | Purpose | Class name |
|---|---|---|
| {{Restriction-Layout}} | used by Restriction tags | restrictiontemplate |
| {{FoP-Layout}} | used by freedom of panorama tags | foptemplate |
| {{Partnership-Layout}} | used by Partnership templates | partnershiptemplate |
| {{Source-Layout}} | used by generic Source templates | sourcetemplate |
| {{Created with}} | used by Created with ... templates | createdwithtemplate |

Templates regarding non-copyright legal restrictions carry these classes to identify specific types of restrictions.

| Template(s) | Purpose | Class name |
|---|---|---|
| {{Trademarked}} | Trademarked images | restriction-trademarked |
| {{Copydesign}} | Copyrighted designs | restriction-design |
| {{Communist symbol}} | Communist symbols | restriction-communist |
| {{Italy-MiBAC-disclaimer}}, {{Soprintendenza}} | Italian cultural goods | restriction-ita-mibac |
| {{Australian Commonwealth reserve}} | Australian reserves | restriction-aus-reserve |
| {{Personality rights}}, {{Romania personality rights}} | Personality rights | restriction-personality |
| {{2257}} | Child Protection and Obscenity Enforcement Act warning (United States) | restriction-2257 |
| {{Costume}} | Costuming | restriction-costume |
| {{Fan art}} | Fan art | restriction-fan-art |
| {{Currency}} | Currency | restriction-currency |
| {{IHL Symbol}} | Symbols restricted by International Humanitarian Law | restriction-ihl |
| {{Nazi symbol}} | Nazi and fascist symbols | restriction-nazi |
| {{Insignia}} | Official insignia | restriction-insignia |
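
A reuser might use these classes to flag files that carry non-copyright restrictions. The sketch below scans the class attributes of a page and reports which restriction types from the table above are present. It is only an illustration built on Python's standard html.parser module; the names RestrictionParser and find_restrictions and the sample markup (including which element carries which class) are assumptions.

```python
from html.parser import HTMLParser

# Class name -> purpose, as listed in the table above.
RESTRICTION_CLASSES = {
    "restriction-trademarked": "Trademarked images",
    "restriction-design": "Copyrighted designs",
    "restriction-communist": "Communist symbols",
    "restriction-ita-mibac": "Italian cultural goods",
    "restriction-aus-reserve": "Australian reserves",
    "restriction-personality": "Personality rights",
    "restriction-2257": "Child Protection and Obscenity Enforcement Act warning (United States)",
    "restriction-costume": "Costuming",
    "restriction-fan-art": "Fan art",
    "restriction-currency": "Currency",
    "restriction-ihl": "Symbols restricted by International Humanitarian Law",
    "restriction-nazi": "Nazi and fascist symbols",
    "restriction-insignia": "Official insignia",
}


class RestrictionParser(HTMLParser):
    """Record every known restriction-* class seen on any element."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        for name in (dict(attrs).get("class") or "").split():
            if name in RESTRICTION_CLASSES:
                self.found.add(name)


def find_restrictions(html):
    """Return the sorted purposes of all restriction classes found in html."""
    parser = RestrictionParser()
    parser.feed(html)
    parser.close()
    return sorted(RESTRICTION_CLASSES[name] for name in parser.found)


# Hand-written sample markup; the real templates may place the classes differently.
SAMPLE = """
<table class="restrictiontemplate restriction-trademarked"><tr><td>notice</td></tr></table>
<div class="restriction-personality">notice</div>
"""

print(find_restrictions(SAMPLE))
```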

Machine readable data set by specific templates

Individual templates set further machine-readable data. Here is a non-exhaustive list:

{{Personality rights}}
<span class="commons-template-name" style="display:none" id="commons-template-personality-rights">Personality rights</span>
{{Credit line}}
<td id="fileinfotpl_credit" class="fileinfo-paramfield fileinfotpl_credit" style=""></td>

Machine-readable data set by location templates

{{Location}} and similar templates add machine-readable geocodes in the following format: <span class="geo">12.34;24.68</span> (latitude and longitude as floating-point numbers, separated by a semicolon). The coordinates use the WGS84 system (the same as GPS and most online maps). See Commons:Geocoding for more details.
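
Reading such a geocode back is straightforward. The following sketch, using Python's standard html.parser module, extracts every <span class="geo"> value as a (latitude, longitude) pair of floats and skips values that do not parse or fall outside the valid range. The names GeoParser and parse_geo and the sample HTML are illustrative only, and the geo span is assumed to contain plain text.

```python
from html.parser import HTMLParser


class GeoParser(HTMLParser):
    """Extract (latitude, longitude) pairs from <span class="geo"> elements."""

    def __init__(self):
        super().__init__()
        self.coordinates = []
        self._buf = None   # text of the geo span being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "geo" in classes:
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != "span" or self._buf is None:
            return
        text, self._buf = "".join(self._buf), None
        try:
            lat, lon = (float(part) for part in text.split(";"))
        except ValueError:
            return   # not exactly two numbers; skip this span
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            self.coordinates.append((lat, lon))


def parse_geo(html):
    parser = GeoParser()
    parser.feed(html)
    parser.close()
    return parser.coordinates


SAMPLE = (
    '<p>Camera location <span class="geo">12.34;24.68</span></p>'
    '<p>Object location <span class="geo">-33.8568;151.2153</span></p>'
    '<p>Broken <span class="geo">not a coordinate</span></p>'
)

print(parse_geo(SAMPLE))
```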

Usage

MediaWiki API

The MediaWiki API now serves a limited amount of metadata. A query using iiprop=extmetadata returns some useful parameters such as Credit, Artist, LicenseUrl and Copyrighted; it is used by Media Viewer, for example.
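
As a hedged illustration, the sketch below builds an imageinfo query with iiprop=extmetadata for a single file and picks the parameters named above out of a JSON response. It uses only Python's standard library and makes no network request. The file title, the helper names build_extmetadata_url and extract_extmetadata, and the sample response are assumptions for this example; the sample follows the default JSON layout (query, pages, imageinfo, extmetadata, with each entry holding a value field), but its contents are made up.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def build_extmetadata_url(title):
    """Return the API URL that requests extmetadata for one file title."""
    params = {
        "action": "query",
        "prop": "imageinfo",
        "iiprop": "extmetadata",
        "titles": title,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_extmetadata(response, keys=("Credit", "Artist", "LicenseUrl", "Copyrighted")):
    """Pull the requested extmetadata values out of a decoded JSON response."""
    result = {}
    for page in response.get("query", {}).get("pages", {}).values():
        for info in page.get("imageinfo", []):
            meta = info.get("extmetadata", {})
            for key in keys:
                if key in meta:
                    result[key] = meta[key].get("value")
    return result


# Made-up response in the shape described above; not real data.
SAMPLE_RESPONSE = {
    "query": {"pages": {"12345": {
        "title": "File:Example.jpg",
        "imageinfo": [{"extmetadata": {
            "Artist": {"value": "Example Author"},
            "LicenseUrl": {"value": "https://creativecommons.org/licenses/by-sa/3.0"},
            "Copyrighted": {"value": "True"},
        }}],
    }}}
}

print(build_extmetadata_url("File:Example.jpg"))
print(extract_extmetadata(SAMPLE_RESPONSE))
```

Keys that are missing from the response, such as Credit in the sample, are simply left out of the result, which matches the fact that the API can only return what the templates on the file page expose.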

Scripts using machine readable data

External tools

See also

Defining new machine readable data

  • Do NOT use HTML ids; use classes. An id can only be used once per page, and most of these fields can occur multiple times per page. Consider, for instance, descriptions of derivative works, which can include information about both the original and the derivative.
  • When possible, wrap the actual data, not some field header. This last method is historically used for all our Information templates, but much harder to support in the long run.
  • Wrap data, not the way the data is formatted.
  • Expect that formatting is lost when converting to data. Visual dress up is not part of the information.
  • Don't wrap multiple units of information inside one field. There is a difference between a publication date and a creation date: both are dates, but they are different 'data fields'. Likewise, CC BY-SA-4.0-3.0-2.5 is not a license name; it stands for 3 licenses, each named CC BY-SA-##.
  • Make sure that the data value has one unit, or outputs one consistent unit.

Problems

A few things currently cannot be recognized, or only poorly. These include:

  • Derivative works
  • Works included in works. See also Category:FoP_templates
  • Licenses of derivatives, or of works included in works, are a mess.
  • Author vs. copyright holder
  • Usernames vs. 'real names'
  • Catalogue IDs, etc.
  • VRTS permissions
  • Publication date vs creation date
Category:Commons help/pt