Extension talk:CanonURL


Patch: Redirects support

        if (isset($wgArticle)) {
            $out->addHeadItem( 'canonical',
                '<link rel="canonical" href="'.$wgArticle->getTitle()->getFullURL().'" />'."\n");
        } else {
            $out->addHeadItem( 'canonical',
                '<link rel="canonical" href="'.$CanonBaseURL.$pg_title.'" />'."\n");
        }

Vlsergey (talk) 01:49, 25 January 2013 (UTC)

Don't forget to run the attribute value through htmlspecialchars(), and use Title::getCanonicalURL() instead of Title::getFullURL() so that sites that support both HTTP and HTTPS output the canonical URL here instead of the one for the current request. Krinkle (talk) 09:01, 14 March 2013 (UTC)
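
To illustrate the escaping half of that advice, here is a minimal sketch rather than the extension's actual code. The helper name buildCanonicalLink is hypothetical, and it takes a plain string so that it runs outside MediaWiki; inside the extension the argument would presumably be the result of Title::getCanonicalURL().

```php
<?php
// Hypothetical helper, not part of CanonURL: builds the <link> element
// from an already-resolved canonical URL.
function buildCanonicalLink(string $canonicalUrl): string {
    // Escape &, <, > and both quote styles so the value cannot break out
    // of the href attribute.
    $href = htmlspecialchars($canonicalUrl, ENT_QUOTES, 'UTF-8');
    return '<link rel="canonical" href="' . $href . '" />' . "\n";
}

// Assumption: a literal string stands in for $title->getCanonicalURL().
echo buildCanonicalLink('http://example.org/wiki/A"><script>');
```

With this in place, a title containing quotes or angle brackets ends up as inert entities inside the attribute instead of as markup.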

Patch: Escape output

As demonstrated in this example, the complete lack of escaping in this extension is causing two problems:

  • Titles that contain special characters are not escaped, so the URL output in rel="canonical" leads not to the real article but to a "Not found" page or a "Bad title" error.
  • Titles that contain HTML markup characters create an arbitrary HTML injection vector.

I've filed https://github.com/Abhi-M/CanonURL/issues/1 to track this issue. I'll provide a patch shortly. Krinkle (talk) 09:59, 14 March 2013 (UTC)

I have released a new version immediately to fix the security issue. Thanks a lot for pointing it out.
I have also seen your other suggestions on the GitHub page (https://github.com/Abhi-M/CanonURL/issues) and will be more than glad to implement them. ~ Abhi M 13:20, 14 March 2013 (UTC)
It looks like characters are now being URL-escaped where they should not be, such as the colon (namespace separator) and the slash (subpage separator).
For example, http://en.uncyclopedia.co/wiki/User:Kaizer_the_Bjorn/Caffeine is mirrored at http://mirror.uncyc.org/wiki/User:Kaizer_the_Bjorn/Caffeine and the mirror is flagged with extension:CanonURL to direct search bots to the main site.
The result is <link rel="canonical" href="http://en.uncyclopedia.co/wiki/User%3AKaizer_the_Bjorn%2FCaffeine" /> which is a big fat 404 depending on how the target site's Apache rewrites short URLs.
MediaWiki seems to tolerate the %3A namespace separator. It's the / encoded as %2F that breaks, and that seems to depend on how Apache rewrites URLs rather than on MediaWiki: http://en.wikipedia.org/wiki/User%3AKaizer_the_Bjorn%2FCaffeine would work if the subpage existed, while http://wikipedia.org/wiki/User%3AKaizer_the_Bjorn%2FCaffeine goes 404 and http://wikipedia.org/wiki/User%3AKaizer_the_Bjorn/Caffeine would be valid.
Admittedly, the page doesn't exist on Wikipedia, but a 404 at the Apache level indicates MediaWiki didn't even see these; perhaps the %2F isn't a / to mod_rewrite on all MW sites? Carlb (talk) 18:42, 23 March 2013 (UTC)
Your latest revision partially addresses it by URL-escaping the value; however, it does not apply any HTML escaping. In practice this is much less of an issue, but there are lots of tiny downsides and inconveniences in how the extension operates in general. I'd highly recommend considering replacing the code with https://github.com/Abhi-M/CanonURL/pull/4. Krinkle (talk) 18:09, 14 March 2013 (UTC)
Yes, I agree with you. Last time, I hadn't seen the solution you gave and went with one that came to mind.
It is now fixed and I have added your name; I will be glad to add you as a developer if that is all right with you. Abhi M 12:18, 24 March 2013 (UTC)
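
For readers hitting the %3A / %2F problem described in this thread, the following is a rough sketch of one way to percent-encode a title for a URL path while leaving the namespace and subpage separators readable. It is not the code from the pull request above, and encodeTitleForPath is a hypothetical name. MediaWiki core has its own helper for this kind of encoding (wfUrlencode), which this sketch does not try to reproduce exactly.

```php
<?php
// Hypothetical helper: percent-encode a page title for use in a URL path,
// then restore the namespace (:) and subpage (/) separators.
function encodeTitleForPath(string $title): string {
    // rawurlencode() leaves letters, digits and -_.~ untouched and
    // percent-encodes everything else, including ':' and '/'.
    $encoded = rawurlencode($title);
    // Undo the encoding for the two separators only.
    return str_replace(['%3A', '%2F'], [':', '/'], $encoded);
}

echo encodeTitleForPath('User:Kaizer_the_Bjorn/Caffeine'), "\n";
```

The value returned here still needs HTML escaping (for example with htmlspecialchars()) before it is placed in an href attribute; URL encoding and HTML escaping solve different problems.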

Stable release

I'm using this extension, and hopefully it's all up and running the way it should. Looking forward to a stable release. Thanks for the great extension! Aviationwikinet (talk) 04:54, 2 January 2015 (UTC)

Where is this extension?

The list contains no such extension: https://www.mediawiki.org/wiki/Special:ExtensionDistributor/CanonURL Where did it go? Хаджимурад (talk) 14:33, 13 April 2015 (UTC)