TimedText:Wikidata Editing with OpenRefine - Part 3.webm.en.srt
1 00:00:00,550 --> 00:00:02,800 Welcome back to this tutorial
2 00:00:02,850 --> 00:00:04,225 on using OpenRefine
3 00:00:04,225 --> 00:00:06,050 to import data into Wikidata.
4 00:00:07,700 --> 00:00:09,775 In the previous videos,
5 00:00:09,775 --> 00:00:10,988 we have matched the films
6 00:00:10,988 --> 00:00:12,525 and locations in our table
7 00:00:12,525 --> 00:00:15,375 to items.
8 00:00:13,883 --> 00:00:16,883 We now want to transform our table into statements
9 00:00:16,882 --> 00:00:19,882 and upload them to Wikidata.
10 00:00:19,988 --> 00:00:22,169 Let's first look at how this information
11 00:00:22,169 --> 00:00:24,350 is typically modelled in Wikidata.
12 00:00:24,986 --> 00:00:26,743 Pick a well-known movie
13 00:00:26,743 --> 00:00:28,500 where we expect to find this information.
14 00:00:29,114 --> 00:00:32,113 We can see that there is a "filming location" property for that.
15 00:00:33,850 --> 00:00:35,575 We review the page of the property
16 00:00:35,575 --> 00:00:37,300 and make sure it fits our needs.
17 00:00:37,695 --> 00:00:40,695 In this case it looks like a perfect fit!
18 00:00:44,867 --> 00:00:46,367 Click the Wikidata button
19 00:00:46,367 --> 00:00:47,383 in the top right corner
20 00:00:47,383 --> 00:00:50,770 and choose "Edit Wikidata schema".
21 00:00:52,010 --> 00:00:54,600 A schema is a template of Wikidata edits
22 00:00:54,600 --> 00:00:56,650 that describes how your tabular data
23 00:00:56,650 --> 00:00:59,610 will be transformed into Wikidata edits.
24 00:01:00,980 --> 00:01:03,415 It works pretty much like the Wikidata interface,
25 00:01:03,415 --> 00:01:06,000 except that you can drag and drop column names
26 00:01:05,950 --> 00:01:07,650 in place of values.
27 00:01:08,947 --> 00:01:11,947 Click "Add item".
28 00:01:12,099 --> 00:01:14,750 The items we want to modify
29 00:01:14,650 --> 00:01:15,725 are the films
30 00:01:15,725 --> 00:01:16,963 which we have reconciled
31 00:01:16,963 --> 00:01:18,200 in the "Title" column.
32 00:01:18,690 --> 00:01:21,690 So drag and drop that column onto the item.
33 00:01:23,850 --> 00:01:25,450 You can see that this column
34 00:01:25,400 --> 00:01:27,625 is underlined in green:
35 00:01:27,625 --> 00:01:29,800 that is because we have reconciled it
36 00:01:29,000 --> 00:01:30,550 to Wikidata.
37 00:01:30,789 --> 00:01:33,019 You can only use reconciled columns
38 00:01:33,019 --> 00:01:35,550 in the inputs where an item is expected.
39 00:01:38,249 --> 00:01:40,249 On each of these items,
40 00:01:40,249 --> 00:01:42,250 we want to add the filming locations.
41 00:01:42,968 --> 00:01:45,009 Drag and drop the street column
42 00:01:45,009 --> 00:01:47,050 into the filming location.
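As a rough illustration of what the schema does at this point, here is a minimal Python sketch. It is not OpenRefine's actual implementation: the row layout, the item IDs, and the `apply_schema` helper are all hypothetical, and skipping cells that are not reconciled is an assumption of the sketch.

```python
# Hypothetical reconciled rows: each cell pairs the original text with the
# item ID it was matched to (None when unreconciled). All IDs are made up.
rows = [
    {"Title": {"text": "Film A", "qid": "Q_FILM_A"},
     "Street": {"text": "Main Street", "qid": "Q_STREET_1"}},
    {"Title": {"text": "Film B", "qid": "Q_FILM_B"},
     "Street": {"text": "Unknown Alley", "qid": None}},
]

# The schema: which column gives the item to edit, and which column
# fills in the value of each property.
schema = {"item": "Title", "statements": {"filming location": "Street"}}


def apply_schema(schema, rows):
    """Yield one (subject, property, value) edit per row and statement."""
    for row in rows:
        subject = row[schema["item"]]["qid"]
        if subject is None:
            continue  # no reconciled item to edit in this row
        for prop, column in schema["statements"].items():
            value = row[column]["qid"]
            if value is not None:  # assumption: unreconciled values are skipped
                yield (subject, prop, value)


edits = list(apply_schema(schema, rows))
print(edits)
```

The same template is applied to every row, which is why one schema is enough to describe all the edits in the table.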
43 00:01:48,559 --> 00:01:50,605 You can get a preview of the edits
44 00:01:50,605 --> 00:01:52,950 generated by the schema in the "Preview" tab.
45 00:01:54,367 --> 00:01:55,570 In the "Issues" tab,
46 00:01:55,570 --> 00:01:56,750 you get some feedback
47 00:01:56,750 --> 00:01:58,580 about the quality of your edits
48 00:01:58,580 --> 00:02:01,100 before they are made.
49 00:02:01,100 --> 00:02:02,940 OpenRefine complains about the fact
50 00:02:02,940 --> 00:02:06,099 that we haven't added any reference
51 00:02:04,200 --> 00:02:05,830 to our statements.
52 00:02:05,830 --> 00:02:08,050 So let's do that.
53 00:02:08,525 --> 00:02:10,288 I'm going to use the URL for the dataset
54 00:02:10,288 --> 00:02:12,050 with a retrieved date.
55 00:02:12,820 --> 00:02:15,540 You can also create an item for the dataset
56 00:02:15,540 --> 00:02:18,794 and use it in the reference if you prefer.
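A reference of the first kind can be pictured as a small set of property and value pairs. The sketch below is only an illustration: the `make_reference` helper, the URL, and the date are placeholders, not the dataset or values used in the video.

```python
from datetime import date


def make_reference(url, retrieved):
    """Return a reference as property-label/value pairs."""
    return {"reference URL": url, "retrieved": retrieved.isoformat()}


# Placeholder URL and date, chosen only for illustration.
reference = make_reference("https://example.org/dataset", date(2020, 1, 1))
print(reference)
```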
57 00:02:23,501 --> 00:02:25,701 While we are here, why not
58 00:02:25,701 --> 00:02:28,621 add a few qualifiers to the statements?
59 00:02:28,625 --> 00:02:32,625 We have the start and end dates of the shooting
60 00:02:32,641 --> 00:02:36,641 as well as the geographical coordinates.
61 00:03:07,144 --> 00:03:09,524 It is also useful to check whether our additions
62 00:03:09,529 --> 00:03:11,779 will conflict with any existing data
63 00:03:11,779 --> 00:03:12,819 on the items.
64 00:03:13,380 --> 00:03:16,410 We fetch the existing values for the filming locations.
65 00:03:35,208 --> 00:03:38,018 Once fetching has completed, we use a text facet
66 00:03:38,018 --> 00:03:41,041 to inspect what kind of values they are.
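Conceptually, a text facet groups identical cell values and counts them. A minimal Python sketch of that idea follows; the values and the `text_facet` helper are made up for illustration and are not the data shown in the video.

```python
from collections import Counter

# Hypothetical fetched values for the existing filming locations;
# None stands for a film with no filming location yet.
existing = [None, "Country X", None, None, "City Y", "Country X"]


def text_facet(values):
    """Count occurrences of each distinct value, labelling blanks."""
    return Counter("(blank)" if v is None else v for v in values)


facet = text_facet(existing)
for value, count in facet.most_common():
    print(f"{value}: {count}")
```

A large "(blank)" count in such a summary corresponds to films that have no filming location yet.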
67 00:03:42,057 --> 00:03:44,317 Most of these films do not have any
68 00:03:44,317 --> 00:03:45,857 filming location yet,
69 00:03:45,865 --> 00:03:47,345 and when they have one,
70 00:03:47,355 --> 00:03:50,705 it is a much less precise location.
71 00:03:50,874 --> 00:03:53,974 It should not be hard to remove the less precise locations
72 00:03:53,975 --> 00:03:56,335 which are redundant with our additions
73 00:03:56,335 --> 00:03:58,844 once the dataset is uploaded.
74 00:03:58,844 --> 00:04:00,964 So, we're happy with our edits.
75 00:04:01,283 --> 00:04:03,763 Because this is a rather small dataset,
76 00:04:03,763 --> 00:04:06,021 we will upload it directly.
77 00:04:06,021 --> 00:04:07,601 For larger imports,
78 00:04:07,614 --> 00:04:08,954 complicated schemas,
79 00:04:08,965 --> 00:04:11,065 or large-scale creation of new items,
80 00:04:11,073 --> 00:04:13,023 it is good to request feedback
81 00:04:13,023 --> 00:04:15,113 about the import in Wikidata first.
82 00:04:15,250 --> 00:04:19,020 Click "Wikidata" – "Upload edits to Wikidata".
83 00:04:19,927 --> 00:04:21,787 You will need to log in with your
84 00:04:21,796 --> 00:04:22,936 Wikidata account;
85 00:04:22,936 --> 00:04:26,166 this account will be used to make the edits.
86 00:04:26,166 --> 00:04:28,056 Add a meaningful edit summary
87 00:04:28,056 --> 00:04:29,376 to describe your edits.
88 00:04:31,484 --> 00:04:34,073 This is important because it helps other editors
89 00:04:34,073 --> 00:04:35,973 understand what your edits do
90 00:04:35,976 --> 00:04:38,376 when they look at the history of an item.
91 00:04:39,677 --> 00:04:43,977 We can now upload the dataset to Wikidata.
92 00:04:43,998 --> 00:04:46,188 You can check how the upload is going
93 00:04:46,188 --> 00:04:50,428 by looking at your own contributions.
94 00:04:53,710 --> 00:04:55,834 If you notice any issue with an edit,
95 00:04:55,834 --> 00:04:58,604 you can cancel the upload in OpenRefine.
96 00:04:58,614 --> 00:05:01,154 This will stop making any further edits
97 00:05:01,159 --> 00:05:05,156 but will not remove the edits already made.
98 00:05:05,156 --> 00:05:06,616 To remove these edits,
99 00:05:06,616 --> 00:05:08,276 click on the "details" link
100 00:05:08,276 --> 00:05:10,206 of any edit in the group.
101 00:05:10,214 --> 00:05:13,334 This will lead you to the edit groups tool
102 00:05:13,354 --> 00:05:17,354 where you can undo the entire edit group easily.
103 00:05:27,178 --> 00:05:29,038 This is the end of the tutorial.
104 00:05:29,040 --> 00:05:33,040 I hope you enjoyed it. Thanks for watching!