Commons:Bots/Requests/HandleCommonsOnOSMBot

HandleCommonsOnOSMBot (talk · contribs)


Operator: Fl.schmitt (talk · contributions)

Bot's tasks for which permission is being sought: HandleCommonsOnOSMBot tries to add {{Object location}} and {{On OSM}} templates to Commons files that are used on OpenStreetMap (via the wikimedia_commons or image attributes). To do so, HandleCommonsOnOSMBot relies on the work of Usage Bot and goes through the media listed on the Files used on OpenStreetMap pages (see also this bot request).

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): Continuous

Maximum edit rate (e.g. edits per minute): 12-15

Bot flag requested: (Y/N): Y

Programming language(s): Python (pywikibot)

Fl.schmitt (talk) 20:38, 28 May 2024 (UTC)

Discussion
Thanks for doing this. Maybe it's worth tracking images for which the bot tried to retrieve coordinates, but for some reason can't. This avoids re-trying them if the bot runs again or restarts. The "Usage Bot" that maintains the lists does update them though. Enhancing999 (talk) 22:12, 28 May 2024 (UTC)
Good idea - this would require some sort of blacklist, i think. Maybe evaluating the bot logs regularly would be sufficient? Fl.schmitt (talk) 05:30, 29 May 2024 (UTC)
Maybe it's a non-issue. You could just run it for a while and see how it goes. If needed, add some logic later. Enhancing999 (talk) 07:20, 29 May 2024 (UTC)
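If such tracking turns out to be needed later, one possible shape is a small persistent skip list. The following is only a sketch under stated assumptions, not the bot's actual code: the class name `SkipList`, its methods and the JSON file layout are hypothetical, and it uses only the Python standard library rather than any Pywikibot feature.

```python
import json
from pathlib import Path


class SkipList:
    """Hypothetical helper: remembers file titles for which no coordinates were found."""

    def __init__(self, path):
        self.path = Path(path)
        # Load failures recorded by an earlier run, if any; otherwise start empty.
        if self.path.exists():
            self.failed = set(json.loads(self.path.read_text(encoding="utf-8")))
        else:
            self.failed = set()

    def should_skip(self, title):
        """True if a previous run already failed on this title."""
        return title in self.failed

    def record_failure(self, title):
        """Add the title and write the list back so it survives a restart."""
        self.failed.add(title)
        self.path.write_text(
            json.dumps(sorted(self.failed), ensure_ascii=False),
            encoding="utf-8",
        )
```

A run would call `should_skip(title)` before querying OSM and `record_failure(title)` when no coordinates come back. Since Usage Bot updates the lists, entries would probably need to expire after a while; that part is left out here.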

Please make a small test run. --Krd 08:06, 29 May 2024 (UTC)

@Krd: Test run finished with five edits. Captchas were a little bit annoying :-) Fl.schmitt (talk) 17:26, 29 May 2024 (UTC)
Looks good to me. Krd 17:31, 29 May 2024 (UTC)
I take it the coordinates match. I'm a bit hesitant about {{On OSM}}. Apparently others use it too, but the layout and wording don't seem ideal for files. Maybe a new template could be made for filenamespace (thus the exact wording for this use can be changed easily). At File:1347 Matterhorn.jpg, I add some stuff with "other fields", which blends it into the template. Enhancing999 (talk) 21:54, 29 May 2024 (UTC)
@Enhancing999 - using the other fields parameter for {{On OSM}} sounds interesting. I've just modified "Altes Kirchle" voller Kunstschätze. 01.jpg manually, moving the {{On OSM}} inside a {{InFi}} and adding it to the other fields parameter. For me, this looks ok. Anyway, since the bot's main effort is adding the coordinates, maybe the best option is to "shelve" the {{On OSM}} question and restrict the bot to the location template. Fl.schmitt (talk) 07:23, 30 May 2024 (UTC)
Somehow it still looks overly highlighted. I'd still keep the ids around somewhere. Enhancing999 (talk) 11:15, 30 May 2024 (UTC)
@Enhancing999 ah, ok - now i think i got it. What about "Altes Kirchle" voller Kunstschätze. 01.jpg now? I've created a new template for textual OSM links ({{OSMLink}}) without any other fancy stuff which should fit nicely into the `other_fields` parameter using {{Information field}}. Thus we have a link, a category and the OSM id. Fl.schmitt (talk) 12:08, 30 May 2024 (UTC)
Looks ok. I'd include {{Information field}} directly in {{OSMLink}} Enhancing999 (talk) 12:10, 30 May 2024 (UTC)
Oh, good idea! Done :-) Maybe i'll find a way to add the OSM icon next to the link, so it's marked as "external" link. Fl.schmitt (talk) 12:28, 30 May 2024 (UTC)
Better no icon, this is misleading because it behaves differently from the location link icon. Fl.schmitt (talk) 12:49, 30 May 2024 (UTC)
  • File:'La Brabançonne' (15607113426).jpg has a depicts statement, so coordinates and OpenStreetMap identifier should be taken from Wikidata. --EugeneZelenko (talk) 14:24, 31 May 2024 (UTC)
    Interesting point that shortly puzzled me, but it's due to Schlurcherbot re-copying the coordinates since the test run. Enhancing999 (talk) 14:55, 31 May 2024 (UTC)
    @EugeneZelenko - thank you for your feedback, but - to be honest - I would generally expect OpenStreetMap to provide "better" (more reliable and more precise) coordinates than Wikidata, especially in an urban area. OpenStreetMap doesn't have coordinates assigned to media, but vice versa, media assigned to quite precise geographical information. In the case of La Brabançonne, Wikidata has 50°51'4"N, 4°22'5"E while OSM offers 50° 50′ 56.18″ N, 4° 22′ 05.97″ E. Please compare for yourself on Google Maps or OpenStreetMap (just c&p both values in the respective search boxes): in the real world, those values amount to a difference of ca. 240 meters ("beeline") according to Google Maps! For me, that example shows why it may be quite useful for Commons and Wikidata if we could make use of OSM data. Fl.schmitt (talk) 15:24, 31 May 2024 (UTC)
    Obviously, fixing coordinates on Wikidata would benefit not only Commons, but other projects. --EugeneZelenko (talk) 15:29, 31 May 2024 (UTC)
    Exactly! But to fix them, we need the OSM data. It seems you're asking for a different bot than mine: HandleWikidataOnOsm, evaluating the wikidata attribute of OSM nodes/ways/relations. Would be very nice but doesn't help for Commons media lacking a depicts attribute / a wikidata object. That's the case for many of the Files used on OpenStreetMap: Those hiking fingerposts and wayside shrines usually don't have wikidata, they don't have "depicts", and precise location would be useful anyway. Fl.schmitt (talk) 15:50, 31 May 2024 (UTC)
    So fix mismatches in the source (Wikidata), not individual re-users of shared data. --EugeneZelenko (talk) 13:34, 1 June 2024 (UTC)
    File depicts statements seem to have undergone quite a lot of deletions recently. I'm not sure if the link "file depicts - Wikidata item location coordinates" to "File image depicted coordinates (per OSM)" is 1:1.
    The proposal here is merely for geocoding photos. Enhancing999 (talk) 13:53, 1 June 2024 (UTC)
    The whole point of this thread is to define a proper process for geocoding photos when a depicts statement is available. --EugeneZelenko (talk) 13:59, 2 June 2024 (UTC)
    That's easy: the proper process is to ignore it. There are over 2.2 million OSM nodes with a unique wikidata link (that's the set of wikidata items where precise coordinates are available), while there are ca. 220,000 wikimedia_commons references on OSM. Of those 220,000, only a small subset has a "depicts" statement. Thus, handling that case (which is quite difficult) would have a very limited benefit, while most of the task (handling the remaining 2.2 million wikidata entries) is still to be done. This is why I said that we clearly need OSM data, but it's a bot task on its own to reconcile Wikidata with OSM. Fl.schmitt (talk) 06:27, 4 June 2024 (UTC)
    Postscriptum: "ignore it" refers to the "depicts" statement, not to the photo... Fl.schmitt (talk) 07:58, 4 June 2024 (UTC)
    Is it really a huge task for a bot? --EugeneZelenko (talk) 14:33, 4 June 2024 (UTC)
    It doesn't matter if it's a huge task for a bot. What matters is that it's a huge task for me to create such a bot. You're requesting a completely new feature which wasn't part of the initial bot work request. I'm not some sort of AI that delivers nice code on keypress. I'm neither a Python professional nor did i have any experience using Pywikibot before starting the project "HandleCommonsOnOSMBot". I simply picked a bot work request and tried to offer a working solution to improve Wikimedia content, which took me hours of work. Others may do such things faster, but i'm no professional developer. So, yes, it's a huge task, at least for me. Fl.schmitt (talk) 16:47, 4 June 2024 (UTC)
    One more point: The OSM wiki explicitly prohibits copying data from OSM into Wikidata. There's no similar restriction regarding other Wikimedia wikis, as far as i see. So, you're requesting a different, new feature not only in technical terms but also in legal terms - maybe it's even simply prohibited. Fl.schmitt (talk) 17:13, 4 June 2024 (UTC)
    It's reasonable to make a bot as good as possible, even if it'll take some time to implement. If there is a legal problem, the bot could create a list of Wikidata items that require coordinate corrections and later you could fix them manually. --EugeneZelenko (talk) 14:30, 5 June 2024 (UTC)
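For reference, the La Brabançonne comparison in the thread above can be reproduced offline. This is a minimal sketch, assuming a spherical Earth with a mean radius of 6,371 km (the haversine formula); the helper names `dms_to_decimal` and `haversine_m` are made up for illustration and are not part of the bot.

```python
import math


def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees/minutes/seconds (northern or eastern hemisphere) to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle ("beeline") distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


# Values quoted in the discussion for La Brabançonne.
wikidata = (dms_to_decimal(50, 51, 4), dms_to_decimal(4, 22, 5))
osm = (dms_to_decimal(50, 50, 56.18), dms_to_decimal(4, 22, 5.97))

distance = haversine_m(*wikidata, *osm)
print(f"{distance:.0f} m")
```

With the two coordinate pairs quoted above, the result lands in the region of 240 meters, in line with the figure read off Google Maps. Almost all of it comes from the latitude difference of about 7.8 arc seconds.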
I wonder how long the bot should wait for new additions to the list?
Maybe Commons:Files_used_on_OpenStreetMap/177 should only be processed once there is Commons:Files_used_on_OpenStreetMap/178. Enhancing999 (talk) 17:08, 5 June 2024 (UTC)
I'm not sure how Usage Bot works in detail, esp. if it modifies existing galleries of externally-used media or just adds new galleries. But I think handling that case isn't necessary. The bot has to check every single file in those galleries for the actions required anyway (adding Location, adding OSMLink or both). Since this requires read operations but no write/update, it doesn't affect the bot's edit rate, so I think it isn't a real problem if the bot visits a certain gallery multiple times. Fl.schmitt (talk) 09:45, 12 June 2024 (UTC)
I was rather thinking of OSM being sufficiently stable ... but then I have no idea of their quality control mechanisms. After the initial run, it might want to process new entries with some delay. Enhancing999 (talk) 10:00, 12 June 2024 (UTC)
I doubt that there are any formal / technical quality control mechanisms, at least regarding the wikimedia_commons attribute. Just take a look at the way the references to commons are set: The OSM wiki advises to just set the File:... or Category:... name as attribute value, but there are many entries with full URLs. Additionally, AFAIK there's no mechanism to update the attribute value after the File or Category was renamed. So, I think it's no mistake to check those galleries multiple times, maybe also logging renamed media that requires an update on OSM. Fl.schmitt (talk) 13:04, 15 June 2024 (UTC)
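To illustrate the mixed attribute values mentioned above, here is a rough sketch of how a wikimedia_commons value could be reduced to a plain page title. The function name `normalize_commons_value` is hypothetical, and the sketch assumes that full URLs always point at commons.wikimedia.org with a /wiki/ path; anything else is returned as None so it can be logged instead of guessed at.

```python
from urllib.parse import unquote, urlparse


def normalize_commons_value(value):
    """Return a 'File:...' or 'Category:...' title, or None if the value is not recognised."""
    value = value.strip()
    if value.startswith(("http://", "https://")):
        parsed = urlparse(value)
        # Assumption: only plain Commons page URLs are accepted.
        if parsed.netloc.lower() != "commons.wikimedia.org":
            return None
        if not parsed.path.startswith("/wiki/"):
            return None
        # Strip the /wiki/ prefix and decode percent-escapes such as %27.
        value = unquote(parsed.path[len("/wiki/"):])
    # Underscores in URLs and titles stand for spaces.
    value = value.replace("_", " ")
    prefix, sep, name = value.partition(":")
    name = name.strip()
    if not sep or not name:
        return None
    prefix = prefix.strip().capitalize()
    if prefix not in ("File", "Category"):
        return None
    return f"{prefix}:{name}"
```

For example, both `File:1347 Matterhorn.jpg` and `https://commons.wikimedia.org/wiki/File:1347_Matterhorn.jpg` come out as the same title. Renamed files are a separate problem that this does not address; detecting them would need a lookup on Commons itself.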

Approved. Please start slowly just in case anything shows up during the live run. --Krd 14:40, 2 July 2024 (UTC)