Commons:Picture of the Year/2015/R2/Results/verification/Python code

Python code

For reproducibility, the Python code used to generate the results is below. This is (bad and hacky) Python 2 code. The dependencies are clear from the imports; the only external dependency is BeautifulSoup4 for XML parsing (its "xml" parser in turn relies on lxml being installed).

The XML files from Special:Export should be put in XMLPATH, with no other *.xml files.
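
As a rough illustration of the parsing step, the script treats each pair of consecutive revisions as one potential vote: it diffs the last "=="-delimited section and expects exactly one added or removed line that matches the `validvote` pattern. The sketch below isolates that idea. It is written for Python 3 (unlike the script itself), the helper name `detect_vote` is hypothetical, and the sample page text is an invented stand-in rather than real export data.

```python
import re
from difflib import Differ

# Same pattern as `validvote` in the script: a diff sign, " # ", then a user link.
validvote = re.compile(r"([+-]) # \[\[User:([^|]+)[^]]+\]\]")

def detect_vote(before_text, after_text):
    """Return (delta, username) if exactly one vote line changed, else None.

    Hypothetical helper mirroring the script's diff logic; not part of it.
    """
    # Keep only the last "=="-delimited section and drop its final line,
    # as the script does.
    before = before_text.split("==\n")[-1].splitlines()[:-1]
    after = after_text.split("==\n")[-1].splitlines()[:-1]
    # Differ prefixes unchanged lines with two spaces; keep only the others.
    changed = [line for line in Differ().compare(before, after)
               if not line.startswith(" ")]
    if len(changed) != 1:
        return None  # the script records such cases as anomalies
    match = validvote.match(changed[0])
    if match is None:
        return None
    sign, user = match.groups()
    return int(sign + "1"), user

# Invented sample revisions (assumed layout: heading, vote lines, trailing line).
before = ("== Votes ==\n"
          "# [[User:Alice|Alice]] 10:00, 16 May 2016 (UTC)\n"
          "trailing line")
after = ("== Votes ==\n"
         "# [[User:Alice|Alice]] 10:00, 16 May 2016 (UTC)\n"
         "# [[User:Bob|Bob]] 11:00, 16 May 2016 (UTC)\n"
         "trailing line")

added = detect_vote(before, after)    # Bob's line was added
removed = detect_vote(after, before)  # Bob's line was removed
```

The script additionally checks that the user named in the changed line is the editor of the revision and that the timestamp falls inside the voting window; both checks are left out here for brevity.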

from glob import glob
from bs4 import BeautifulSoup
from difflib import Differ
from itertools import tee
from collections import defaultdict
import json
import re

XMLPATH="./"

def pairwise(i):
    a, b = tee(i)
    next(b, None)
    return zip(a, b)

d=Differ()
voters=defaultdict(dict)
anomalies=defaultdict(dict)
validvote=re.compile(r"([+-]) # \[\[User:([^|]+)[^]]+\]\]")
POTYapp=re.compile(r"(?:\+1 POTY vote - eligible on \w+ with 76\+ edits|-1 POTY removing vote) - Vote through \[\[[^]]+\]\] - \[\[Help:EnhancedPOTY.js\|POTY App]]")
checkeligible=set()

for filename in glob(XMLPATH+"*.xml"):
    with open(filename) as f:
        print "doing ",filename
        XML=BeautifulSoup(f,"xml")
        pages=XML.find_all("page")
        for page in pages:
            revisions=page.find_all("revision")
            candidate=page.title.text
            for before, after in pairwise(revisions):
                user_name=after.contributor.username.text
                diff_id=after.id.text
                timestamp=after.timestamp.text
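                # Diff only the last "=="-delimited section of each revision's text content
                # (minus its final line) and keep just the added or removed lines.
                old_lines=before.text.split("==\n")[-1].splitlines()[:-1]
                new_lines=after.text.split("==\n")[-1].splitlines()[:-1]
                diff=[_ for _ in d.compare(old_lines,new_lines) if not _.startswith(" ")]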
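                if len(diff)!=1:
                    # Anything other than exactly one changed line goes to manual review.
                    anomalies[user_name][diff_id]=timestamp,diff
                else:
                    # Guard against a changed line that is not a vote line at all.
                    m=validvote.match(diff[0])
                    v=m.groups() if m else None
                    # Accept the vote only if the signed user is the editor and the edit falls inside the voting window.
                    if v and v[-1]==user_name and "2016-05-15"<timestamp<"2016-05-29":
                        voters[user_name][diff_id]=timestamp,int(v[0]+"1"),candidate
                    else:
                        anomalies[user_name][diff_id]=timestamp,candidate,diff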
                    try:
                        if not POTYapp.match(after.comment.text):
                            checkeligible.add(user_name)
                        else:
                            checkeligible.discard(user_name)
                    except AttributeError:  # the revision has no edit summary
                        checkeligible.add(user_name)


def split_valid(votes):
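    # Enforces the limit of three net votes per voter; returns (stricken, counted).
    # Votes are walked in descending diff-id order; once the running total exceeds 3,
    # that vote and every remaining one are struck.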
    if sum(vote[1] for vote in votes.values())>3:
        sumvotes=0
        stricken={}
        counted={}
        strike=False
        for v in sorted(votes.keys(),reverse=True):
            sumvotes+=votes[v][1]
            if sumvotes>3:
                strike=True
            if strike:
                stricken[v]=votes[v]
            else:
                counted[v]=votes[v]
        return stricken,counted
    else:
        return {},votes

    
candidates=defaultdict(set)
stricken={}

for voter,votes in voters.items():
    to_strike,to_count=split_valid(votes)
    if len(to_strike)>0:
        stricken[voter]=to_strike
    for valid_vote in sorted(to_count.values()):
        if valid_vote[1]==1:
            candidates[valid_vote[-1]].add(voter)
        elif valid_vote[1]==-1:
            candidates[valid_vote[-1]].remove(voter)
        else:
            print voter,"had a parsing error on", valid_vote
            

# to_add comes from the manually checked "anomalous" table on-wiki: these votes are valid but looked potentially invalid at parse time.
to_add=[_.decode("utf-8").split(" voted for ") for _ in """Metrophil voted for Commons:Picture of the Year/2015/R2/v/Slussen Stan May 2015.jpg
Metrophil voted for Commons:Picture of the Year/2015/R2/v/Dülmen, Börnste, Eisenbahnlinie Dortmund-Enschede -- 2015 -- 9918.jpg
Metrophil voted for Commons:Picture of the Year/2015/R2/v/Pillars of creation 2014 HST WFC3-UVIS full-res denoised.jpg
Heneral voted for Commons:Picture of the Year/2015/R2/v/Slussen Stan May 2015.jpg
Heneral voted for Commons:Picture of the Year/2015/R2/v/Air to air image of a Spitfire, taken over RAF Coningsby. MOD 45147974.jpg
Heneral voted for Commons:Picture of the Year/2015/R2/v/Koettmannsdorf Unterschlossberg Stausee und Strassenbruecke 03032015 0234.jpg
VolaciousEditor voted for Commons:Picture of the Year/2015/R2/v/Sigmaringen Schloss BW 2015-04-28 17-37-14.jpg
Salavat voted for Commons:Picture of the Year/2015/R2/v/Santuario de Las Lajas, Ipiales, Colombia, 2015-07-21, DD 21-23 HDR-Edit.JPG
CreativeC38 voted for Commons:Picture of the Year/2015/R2/v/Port and lighthouse overnight storm with lightning in Port-la-Nouvelle.jpg
CreativeC38 voted for Commons:Picture of the Year/2015/R2/v/Nasir-al molk -1.jpg
CreativeC38 voted for Commons:Picture of the Year/2015/R2/v/Santuario de Las Lajas, Ipiales, Colombia, 2015-07-21, DD 21-23 HDR-Edit.JPG
Artoria2e5 voted for Commons:Picture of the Year/2015/R2/v/Pluto-01 Stern 03 Pluto Color TXT.jpg
Artoria2e5 voted for Commons:Picture of the Year/2015/R2/v/LibellulaCroceipennis 6561PMax.jpg
Artoria2e5 voted for Commons:Picture of the Year/2015/R2/v/Leccinum variicolor LC0365.jpg
*feridiák voted for Commons:Picture of the Year/2015/R2/v/Macaca fuscata juvenile yawning.jpg
*feridiák voted for Commons:Picture of the Year/2015/R2/v/Vincent van Gogh - Starry Night - Google Art Project.jpg
*feridiák voted for Commons:Picture of the Year/2015/R2/v/Iglesia de San Francisco, Quito, Ecuador, 2015-07-22, DD 162-164 HDR.JPG
Qian.Nivan.Out.Of.Service voted for Commons:Picture of the Year/2015/R2/v/Leccinum variicolor LC0365.jpg
Qian.Nivan.Out.Of.Service voted for Commons:Picture of the Year/2015/R2/v/Nasir-al molk -1.jpg
Qian.Nivan.Out.Of.Service voted for Commons:Picture of the Year/2015/R2/v/Macaca fuscata juvenile yawning.jpg
Labordeta voted for Commons:Picture of the Year/2015/R2/v/Esquisse d'une carte géologique d'Italie.jpg
Labordeta voted for Commons:Picture of the Year/2015/R2/v/Dülmen, Börnste, Eisenbahnlinie Dortmund-Enschede -- 2015 -- 9918.jpg
Labordeta voted for Commons:Picture of the Year/2015/R2/v/Lion d'Afrique.jpg""".split("\n")]

for voter,vote in to_add:
    candidates[vote].add(voter)

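After the script runs, the results are in three variables: stricken (per voter, the votes struck for exceeding the three-vote limit), checkeligible (users whose last processed vote edit did not carry the POTY app edit summary, so their eligibility needs checking), and candidates (for each candidate page, the set of users whose vote was counted).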
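
Turning `candidates` into a ranking is then a matter of counting the voters in each set. The following is a minimal sketch in Python 3, not part of the original script: `rank_candidates` is a hypothetical helper, the sample data is invented, and breaking ties alphabetically by title is an assumption made only to keep the output deterministic.

```python
from collections import defaultdict

def rank_candidates(candidates):
    """Return (title, vote_count) pairs, most votes first.

    Ties are broken alphabetically by title (an assumption of this sketch).
    """
    counts = ((title, len(voters)) for title, voters in candidates.items())
    return sorted(counts, key=lambda item: (-item[1], item[0]))

# Invented stand-in for the `candidates` mapping built by the script.
candidates = defaultdict(set)
candidates["Example A.jpg"].update({"Alice", "Bob", "Carol"})
candidates["Example B.jpg"].update({"Alice"})
candidates["Example C.jpg"].update({"Bob", "Carol"})

ranking = rank_candidates(candidates)
for place, (title, count) in enumerate(ranking, start=1):
    print(place, count, title)
```

Because each candidate maps to a set of user names, a user can contribute at most one counted vote per candidate, so `len(voters)` is the vote total.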

Public domain: This work has been released into the public domain by its author, Storkk. This applies worldwide.

In some countries this may not be legally possible; if so:
Storkk grants anyone the right to use this work for any purpose, without any conditions, unless such conditions are required by law.

Category:POTY 2015 Category:Python (programming language)