TimedText:Introduction to Mix'n'Match (Wikidata Leveling Up Days 2024).webm.en.srt

1 00:00:03,881 --> 00:00:05,017 (Léa) Hello, everyone.

2 00:00:05,017 --> 00:00:09,380 I'm here with Epìdosis, and we're going to talk about Mix'n'match.

3 00:00:10,922 --> 00:00:13,596 Tell us, how did you start editing Wikidata?

4 00:00:13,596 --> 00:00:14,938 What got you into it?

5 00:00:16,386 --> 00:00:20,564 (Epìdosis) I started in, I think, the first days of 2013,

6 00:00:20,564 --> 00:00:23,520 when Wikidata was just a few months old.

7 00:00:24,430 --> 00:00:27,662 It was just adding a site link to some article

8 00:00:27,662 --> 00:00:29,530 that I created in Italian Wikipedia.

9 00:00:30,095 --> 00:00:33,220 (Léa) Nice. A lot of people started like this, I think.

10 00:00:33,990 --> 00:00:35,064 Cool. Okay.

11 00:00:35,064 --> 00:00:36,222 Without further ado,

12 00:00:36,222 --> 00:00:39,367 can we explain a little bit what Mix'n'match is?

13 00:00:40,591 --> 00:00:44,169 (Epìdosis) Mix'n'match is an external tool which is hosted on Toolforge,

14 00:00:44,365 --> 00:00:50,787 and it's one of the most used to reconcile data

15 00:00:50,787 --> 00:00:54,043 from external databases with Wikidata.

16 00:00:55,964 --> 00:01:00,455 Mix'n'match, of course, requires you to log in first.

17 00:01:01,085 --> 00:01:03,235 Here you have the Log in button,

18 00:01:03,395 --> 00:01:09,644 which will require you to allow Mix'n'match to edit Wikidata.

19 00:01:11,077 --> 00:01:12,499 You should just allow it,

20 00:01:12,499 --> 00:01:17,375 and then it will refresh the main page,

21 00:01:17,375 --> 00:01:20,375 and you will see your username here.

22 00:01:21,705 --> 00:01:23,792 Mix'n'match is divided into many catalogs.

23 00:01:24,370 --> 00:01:27,030 Here you can just see the most recent ones,

24 00:01:27,330 --> 00:01:31,849 and you can also see them grouped according to many different criteria

25 00:01:31,849 --> 00:01:34,474 here on the left, here on the right, and so on.

26 00:01:34,722 --> 00:01:38,525 Or you can just search for a catalog here,

27 00:01:38,764 --> 00:01:40,330 typing the name,

28 00:01:40,330 --> 00:01:45,480 and the different catalogs which have that name will appear.

29 00:01:46,766 --> 00:01:48,590 Looking at a catalog,

30 00:01:49,542 --> 00:01:51,637 you see the name of the catalog,

31 00:01:51,865 --> 00:01:52,884 the uploader.

32 00:01:53,622 --> 00:01:57,255 The entries are divided into four different categories.

33 00:01:57,740 --> 00:02:01,022 Fully matched entries have already been matched with Wikidata,

34 00:02:01,022 --> 00:02:03,620 and don't require further work.

35 00:02:05,365 --> 00:02:09,920 "Not applicable to Wikidata" entries have been marked as not matchable

36 00:02:10,326 --> 00:02:12,035 to Wikidata for some reason.

37 00:02:12,035 --> 00:02:14,162 Either they don't exist anymore

38 00:02:14,469 --> 00:02:18,395 or they are just not notable for Wikidata.

39 00:02:18,977 --> 00:02:23,605 The entries on which you can work are the Preliminary matched and the Unmatched.

40 00:02:24,008 --> 00:02:29,636 The Preliminary matched are the entries for which Mix'n'match proposes a match,

41 00:02:29,846 --> 00:02:34,036 and you can either accept it or refuse it.

42 00:02:34,415 --> 00:02:37,864 Let's just have a look at the interface, which is very simple.

43 00:02:37,864 --> 00:02:41,140 You can just confirm the match or remove the match.

44 00:02:41,399 --> 00:02:43,960 Sometimes, there is one main match proposed,

45 00:02:44,260 --> 00:02:48,200 but there are also others, which you can open like this.

46 00:02:48,820 --> 00:02:50,517 And for the Unmatched,

47 00:02:50,900 --> 00:02:54,085 Mix'n'match wasn't able to propose a match.

48 00:02:54,787 --> 00:03:01,273 You can just find the match yourself and set the QID,

49 00:03:01,620 --> 00:03:06,680 or you can create a new item or mark the item as "Not Applicable"

50 00:03:06,980 --> 00:03:10,056 and in that case, it will end up here.

51 00:03:10,608 --> 00:03:14,275 And then, you can also see some statistics about the matches over time

52 00:03:14,465 --> 00:03:15,865 and the main users,

53 00:03:15,865 --> 00:03:18,405 who match the entries from this catalog.

54 00:03:19,044 --> 00:03:22,392 These are the basics of Mix'n'match.

55 00:03:23,208 --> 00:03:24,492 (Léa) Alright. Awesome.

56 00:03:24,700 --> 00:03:28,157 Is there anything we should be careful about

57 00:03:28,157 --> 00:03:29,875 when we start using Mix'n'match?

58 00:03:29,875 --> 00:03:32,330 Anything we should be particularly attentive to?

59 00:03:33,904 --> 00:03:36,242 (Epìdosis) Of course, the first thing is just logging in,

60 00:03:36,242 --> 00:03:39,324 because otherwise, your matches won't be saved.

61 00:03:40,249 --> 00:03:42,322 And secondly, of course,

62 00:03:42,645 --> 00:03:46,704 when you are using Mix'n'match, you are, to all intents and purposes, editing Wikidata.

63 00:03:47,735 --> 00:03:50,989 You should be very careful when making a match,

64 00:03:50,989 --> 00:03:57,607 because undoing a wrong match is, for many reasons, not so easy.

65 00:03:57,607 --> 00:03:59,093 Of course, it can be done.

66 00:03:59,093 --> 00:04:03,686 No damage on Wikidata is irremediable.

67 00:04:03,686 --> 00:04:06,373 But, anyway, you should be particularly careful.

68 00:04:06,373 --> 00:04:09,120 If you are not sure about the match,

69 00:04:09,463 --> 00:04:10,940 just leave it there.

70 00:04:11,514 --> 00:04:15,866 If you know other users who are experts on the topic,

71 00:04:15,866 --> 00:04:19,675 you can maybe ask them if they think that this match is correct or not.

72 00:04:19,675 --> 00:04:23,657 But make a match only if you are reasonably sure that it is correct.

73 00:04:24,429 --> 00:04:26,969 (Léa) Right. If I'm not sure about a match,

74 00:04:26,969 --> 00:04:28,135 what should I do?

75 00:04:28,960 --> 00:04:30,502 (Epìdosis) Just go on

76 00:04:30,502 --> 00:04:35,585 and find other matches about which you are more sure.

77 00:04:36,205 --> 00:04:37,677 (Léa) Alright. Makes sense.

78 00:04:38,000 --> 00:04:41,313 And can I also edit directly from Wikidata's interface?

79 00:04:42,110 --> 00:04:48,207 (Epìdosis) Yes, because Mix'n'match is also connected with a gadget,

80 00:04:48,750 --> 00:04:51,115 the so-called Mix'n'match gadget,

81 00:04:51,115 --> 00:04:55,051 which you can turn on in your common.js.

82 00:04:55,355 --> 00:04:59,609 So if you look at my common.js here,

83 00:05:00,020 --> 00:05:04,340 you see that, among the gadgets,

84 00:05:04,340 --> 00:05:07,215 there is the Mix'n'match gadget, this one.

85 00:05:08,035 --> 00:05:10,368 The Mix'n'match gadget, once activated,

86 00:05:10,795 --> 00:05:13,370 shows you in the items,

87 00:05:13,667 --> 00:05:17,665 this part of--

88 00:05:18,680 --> 00:05:20,134 these lines of text

89 00:05:20,134 --> 00:05:26,405 which are between the part about Labels,

90 00:05:26,405 --> 00:05:29,893 Descriptions, and [Aliases], and the Statements.

91 00:05:30,656 --> 00:05:32,854 Here you see these lines.

92 00:05:34,180 --> 00:05:36,920 The lines which start with this checkmark

93 00:05:37,229 --> 00:05:40,163 are the matches which have already been done.

94 00:05:41,472 --> 00:05:46,561 You basically don't need to touch them unless you see that they are incorrect.

95 00:05:47,614 --> 00:05:50,637 For the matches on which you can work,

96 00:05:50,847 --> 00:05:52,878 you see these two buttons:

97 00:05:52,878 --> 00:05:56,054 the plus, which means that you can confirm the match,

98 00:05:56,054 --> 00:06:00,246 and the minus, which means that you remove the match.

99 00:06:01,583 --> 00:06:03,110 Let's look at an example.

100 00:06:03,110 --> 00:06:07,741 This Macroeconomics is correct.

101 00:06:08,614 --> 00:06:12,250 It matches with the concept of macroeconomics,

102 00:06:12,250 --> 00:06:14,467 so you can just confirm it.

103 00:06:15,710 --> 00:06:18,313 When you press one of the two buttons,

104 00:06:18,585 --> 00:06:21,953 you will briefly see a tab open.

105 00:06:21,953 --> 00:06:24,227 You should allow the tab to open.

106 00:06:24,227 --> 00:06:27,976 Otherwise, the edit will not be effective.

107 00:06:28,770 --> 00:06:30,135 Now we have confirmed the match.

108 00:06:30,945 --> 00:06:33,645 You see that there is this checkmark now,

109 00:06:33,945 --> 00:06:37,714 and the item has probably been edited.

110 00:06:38,585 --> 00:06:40,925 Let's have a look at the page history.

111 00:06:41,065 --> 00:06:46,351 Yes, you see that the item has been edited through Mix'n'match.

112 00:06:46,696 --> 00:06:49,812 Of course, you can also remove a wrong match,

113 00:06:50,090 --> 00:06:52,277 which is easy as well.

114 00:06:52,978 --> 00:06:54,513 You just press the minus.

115 00:06:56,375 --> 00:07:00,963 Afterwards, you see that the minus has been replaced by this N/A,

116 00:07:01,245 --> 00:07:05,277 which is useful if you want to mark the entries "Not Applicable."

117 00:07:05,550 --> 00:07:07,970 But otherwise, if you reload the item,

118 00:07:08,270 --> 00:07:13,380 you just see that the entry which you marked with the minus

119 00:07:13,651 --> 00:07:15,656 has just disappeared.

120 00:07:16,313 --> 00:07:18,074 This entry is no longer there

121 00:07:18,074 --> 00:07:22,980 because the preliminary match has been removed.

122 00:07:24,090 --> 00:07:26,575 In this way, you can do a lot of matches

123 00:07:26,575 --> 00:07:30,395 or remove a lot of wrong matches,

124 00:07:31,375 --> 00:07:33,870 just from Wikidata without opening Mix'n'match.

125 00:07:33,870 --> 00:07:36,377 Of course, you also need to be logged in on Mix'n'match,

126 00:07:36,377 --> 00:07:39,100 otherwise, the edits would not be saved.

127 00:07:39,591 --> 00:07:42,273 (Léa) Alright. That's fairly easy to use.

128 00:07:42,273 --> 00:07:44,695 It looks almost like a little game.

129 00:07:45,055 --> 00:07:46,764 Alright. Going back to--

130 00:07:46,764 --> 00:07:49,036 (Epìdosis) Yes. Mix'n'match was developed as a game, yes.

131 00:07:49,234 --> 00:07:50,584 (Léa) Yes. Great.

132 00:07:51,023 --> 00:07:52,860 Going back to the catalog,

133 00:07:52,860 --> 00:07:56,899 we see that we have different topics and types of catalogs.

134 00:07:56,899 --> 00:07:58,305 I'm curious,

135 00:07:58,305 --> 00:08:00,945 how is a catalog added to Mix'n'match?

136 00:08:00,945 --> 00:08:04,139 And can I just call my favorite national library and tell them,

137 00:08:04,139 --> 00:08:06,620 "Hey, please put your catalog on Mix'n'match?"

138 00:08:09,212 --> 00:08:11,960 (Epìdosis) You have two ways to upload a catalog on Mix'n'match,

139 00:08:11,960 --> 00:08:16,096 and they are indicated here on the main page of Mix'n'match.

140 00:08:16,305 --> 00:08:18,387 You can scrape new catalogs

141 00:08:18,739 --> 00:08:21,802 through the interface that Mix'n'match offers.

142 00:08:23,328 --> 00:08:26,476 You can just program a scraper through Mix'n'match,

143 00:08:26,476 --> 00:08:31,849 which, of course, can sometimes be not so easy, but it's possible.

144 00:08:32,265 --> 00:08:36,848 The other option is uploading the catalog through the import page,

145 00:08:37,203 --> 00:08:43,443 which will require you to upload the catalog as a spreadsheet, basically.

146 00:08:45,310 --> 00:08:49,547 A comma- or tab-separated file: a CSV or TSV file.

147 00:08:50,841 --> 00:08:56,526 The instructions about how to construct this spreadsheet are here.

148 00:08:56,906 --> 00:08:59,050 Here you can find the instructions

149 00:08:59,050 --> 00:09:04,725 about the data columns that should be included in the CSV or TSV.

150 00:09:05,275 --> 00:09:09,789 And so, yes, you can basically ask an institution

151 00:09:09,789 --> 00:09:12,116 that manages a database

152 00:09:12,955 --> 00:09:17,275 to send you the data in a spreadsheet,

153 00:09:17,430 --> 00:09:22,641 and then, you can rearrange them according to the instructions

154 00:09:23,623 --> 00:09:28,859 so that the data columns are the ones accepted by Mix'n'match,

155 00:09:28,859 --> 00:09:33,100 and then, you can upload the catalog through this very easy form.
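
[Editor's illustration, not part of the spoken audio] The rearranging step described above could be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions: the target column names (id, name, description) and the institution's headers (Record ID, Full name, Notes) are hypothetical placeholders, and the columns actually accepted are the ones listed in the instructions on the Mix'n'match import page.

```python
import csv
import io

# Hypothetical target columns; the real list is in the import page instructions.
TARGET_COLUMNS = ["id", "name", "description"]

# Hypothetical mapping from the institution's headers to the target columns.
COLUMN_MAP = {"Record ID": "id", "Full name": "name", "Notes": "description"}


def rearrange_to_tsv(source_csv: str) -> str:
    """Return tab-separated text with only the mapped columns, in target order."""
    reader = csv.DictReader(io.StringIO(source_csv))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(TARGET_COLUMNS)
    for row in reader:
        # Keep only the mapped columns, renamed and trimmed.
        renamed = {
            COLUMN_MAP[key]: (value or "").strip()
            for key, value in row.items()
            if key in COLUMN_MAP
        }
        # Skip rows without an identifier or a name; they could not be matched.
        if not renamed.get("id") or not renamed.get("name"):
            continue
        writer.writerow([renamed.get(col, "") for col in TARGET_COLUMNS])
    return out.getvalue()


if __name__ == "__main__":
    sample = "Record ID,Full name,Notes,Internal\n42,Ada Lovelace,mathematician,x\n"
    print(rearrange_to_tsv(sample), end="")
```

Dropping unmapped columns and rows that lack an identifier or name is a design choice of this sketch, not a documented requirement of the tool.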

156 00:09:34,870 --> 00:09:36,006 (Léa) Awesome.

157 00:09:36,302 --> 00:09:40,245 And then, to wrap it up, let's say I'm someone who's new to Wikidata

158 00:09:40,245 --> 00:09:42,965 and I want to start editing using Mix'n'match.

159 00:09:42,965 --> 00:09:45,796 Can you show me again an easy task

160 00:09:45,796 --> 00:09:48,910 that I can quickly get started with on Mix'n'match?

161 00:09:49,507 --> 00:09:51,828 (Epìdosis) Yeah. The easiest task, of course,

162 00:09:51,828 --> 00:09:56,691 is just matching with existing items.

163 00:09:57,171 --> 00:10:00,193 You can just choose a catalog,

164 00:10:00,193 --> 00:10:04,302 which could be, for example, a catalog of a library,

165 00:10:04,664 --> 00:10:07,642 and then, go to the Preliminary matched,

166 00:10:09,550 --> 00:10:13,356 just confirming or removing the matches.

167 00:10:13,356 --> 00:10:17,706 Of course, in many cases, you can just be unsure.

168 00:10:18,055 --> 00:10:24,180 But, for example, here you see the date, which is not the same.

169 00:10:24,180 --> 00:10:26,588 This is surely a wrong match,

170 00:10:26,855 --> 00:10:31,317 as well as this, which has just different names.

171 00:10:31,675 --> 00:10:35,719 But when you are sure, you can confirm the matches as well.

172 00:10:36,090 --> 00:10:40,909 Another thing which you can do if you are a bit more advanced

173 00:10:41,129 --> 00:10:44,145 is using Mix'n'match to create new items

174 00:10:44,525 --> 00:10:48,545 on the basis of many catalogs, which is also fairly interesting.

175 00:10:48,685 --> 00:10:53,027 You can use this function, Names in other catalogs.

176 00:10:54,639 --> 00:10:57,353 For example, open one of these,

177 00:10:58,404 --> 00:11:02,655 clicking on the blue here, for example,

178 00:11:03,045 --> 00:11:05,546 and then, you have these different catalogs,

179 00:11:05,675 --> 00:11:09,180 which all have the same name,

180 00:11:09,320 --> 00:11:13,640 which is probably, in this case, the same person, but in other cases,

181 00:11:13,640 --> 00:11:15,698 let's say, in general, the same concept.

182 00:11:16,015 --> 00:11:18,915 You can select all the pertinent catalogs

183 00:11:19,135 --> 00:11:21,377 and just create a new item,

184 00:11:21,547 --> 00:11:24,700 which you will then, of course, have to refine manually.

185 00:11:24,773 --> 00:11:27,222 But you just create the item,

186 00:11:27,222 --> 00:11:33,808 and the item is created with a lot of data added from all of these entries

187 00:11:33,808 --> 00:11:38,498 and also the identifiers of these entries, which are already aggregated to the item.

188 00:11:38,738 --> 00:11:39,770 (Léa) Awesome.

189 00:11:39,770 --> 00:11:42,710 That's a great way to start creating new items

190 00:11:42,930 --> 00:11:45,700 always taking the data from external sources.

191 00:11:46,256 --> 00:11:47,401 (Epìdosis) Yeah, exactly.

192 00:11:47,401 --> 00:11:52,480 Taking them from many external sources at the same time is very powerful, yes.

193 00:11:52,480 --> 00:11:53,524 (Léa) Makes sense.

194 00:11:53,924 --> 00:11:57,696 Thank you so much, Epìdosis, and have fun with Mix'n'match.

195 00:11:58,351 --> 00:11:59,369 (Epìdosis) Thank you.