Talk:Universal Language Selector/2012
This page used the LiquidThreads extension to give structured discussions. It has since been converted to wikitext, so the content and history here are only an approximation of what was actually displayed at the time these comments were made.
What happens if the user changes the language?
As I said to Arun at the hackathon, the execution of the design is really nicely done and professional! It does raise quite a few questions/issues, the biggest group of which can be summed up as:
What happens if the user changes the language?
As the document notes, we're dealing with the different constructs of UI language, interlanguage links to different versions of an article, different language versions of a project, and different language versions of content inside the same multilingual project (Meta, Commons). Moreover, we have an existing user interface control, the interlanguage links, that uses the label "Language" in one specific fashion now anchored in the minds of millions of users.
Any changes we make here have the potential to be highly confusing and disorienting to the user. Let me give a simple example. Let's say you're on English Wikipedia on the article about carrots, and you pick this language widget to try to find the French article about carrots. But let's say the widget does the following instead:
→ Change the user interface language to French
Now you're operating in English Wikipedia but using a French UI. WTF?! Even if that's what you actually wanted to do, due to the way MediaWiki: messages are localized, the results can be highly unpredictable. For example, compare the AbuseFilter interface on English Wikipedia in English vs. the same interface in French.
Or let's say the widget does the following instead:
→ View the French Main Page
Eek! We were looking for the French article about carrots. Oh, should have clicked the interlanguage link in the sidebar instead. WTF #2. An example of that kind of WTF in action is the implementation of interlanguage links on Commons. Here, language links inside Commons are sometimes shown at the bottom (commons:Main Page), usually at the top (commons:Commons:Village pump), and never in the sidebar. Instead, the sidebar is confusingly used for Wikipedia language versions of the page you're on. WTF #3!
Another example of brokenness in action is the language selection hack that was implemented in Commons to switch content and UI language. While a worthwhile attempt to make Commons more understandable, it nevertheless has many confusing characteristics. For example, if I visit commons:Main Page anonymously and change the language selector to German, my expectation would be ... to see the German main page. Instead I see the English Main Page with a German UI. WTF #4!?
Here are some suggested design principles I think we should aim for:
- If I'm in a multilingual wiki (like Meta or Commons), the system should behave as much as possible like one where multiple wiki instances are used. Having a completely different language selection system in those wikis violates the principle of least surprise and is a recipe for confusion.
- One especially tricky bit is that if I'm hopping between English and German Wikipedia, the UI language changes. If I'm hopping between an English vs. German page on Commons, it doesn't. But note, for example, that on Meta, it actually appends the &uselang parameter when clicking on the language links in the Main Page, switching the UI language. Both behaviors have the potential to be confusing.
- If I'm in a language-specific wiki instance like English Wikipedia, changing the UI language is an edge case with unpredictable consequences. We may still want to surface it through a simple UI for edge cases where it might be helpful (e.g. view the history of an article in Arabic Wikipedia without speaking Arabic), but any accidental activation should be avoided by being absolutely unambiguous about what this feature is. If we can't implement this as an intuitive feature, I'd prefer to not implement it at all outside the existing Special:Preferences because the risk of confusing users is too great.
- The cases of "Read this article in language FOO" vs. "Visit this project in language BAR" need to be clearly distinguished from each other. While not as confusing as the UI language change, the difference between those also can be highly confusing.
- We have existing consensus that (some) language links should be immediately visible (expanded). These fall into the "Read this article in language FOO" category, i.e. interlanguage links. We should preserve that characteristic, even if we change the layout/appearance of interlanguage links. On the other hand, I think it's appropriate for the "Visit this project in language BAR" option to be hidden behind a selector, except for the Main Page, because ...
- The Main Pages usually function as initial language selection portals, so they most likely need to be special-cased here as well.
With all that said, I think we need to rethink the language selector with these assumptions in mind:
- It would be simplest to add a new language control and keep the interlanguage links as they are for now.
- If we're going to simply call it "Language" and render it in addition to the interlanguage links, it may be desirable to explicitly call out the two different options: "Read this article in language .." vs. "Visit (project name) in language .."
- For hopping between monolingual wikis, the UI language option, if it is implemented in the standard selector, needs to be very clearly separated and labeled apart from everything else and likely with a lot less visibility.
- For hopping between languages in multilingual wikis, we may want to consider an "[ ] Also change my user interface language" checkbox for the choices in the language selector to avoid accidental switches as users have to often work in multiple languages (e.g. their native one and English or another more widely spoken language than their own). But I think this area requires further research and user testing.
That's some initial feedback. I think the specific controls in the proposed selectors could also be further simplified:
- The top map is too small to select and seems confusing next to the list of top languages. I'd keep those two clearly distinct: 1) a list of top languages (perhaps combined with search), 2) a list of languages relevant to the user's current location.
- Given how much stuff we're likely to stash into this, it may make sense to collapse it into a few headings by default, e.g.: 1) Read (page) in .. 2) Visit (site name) in .. 3) Type in .. 4) Change user interface to .. 5) Change font to ...
Looking forward to future iterations as this is definitely an area with a lot of potential for UI/UX improvement. --Eloquence 10:10, 30 December 2011 (UTC) 04:59, 13 January 2012 (UTC)
- Can I just say, I totally second Erik's point here. Of the three designs already given, none seems to have clarified the link (or distinction) between interwikis and the intended operation of the selector at all. (though they are visually appealing :) ) Jarry1250 (talk) 21:24, 21 May 2012 (UTC)
- Thank you for having taken the time to read through the comments, Harry. Interlanguage links are being kept out of scope for now. The main reason for this is to not make this already considerable project a whale. After initial development, we will proceed by integrating content language and interlanguage in the language selector. siebrand (talk) 09:24, 22 May 2012 (UTC)
- The selector was designed to be capable of replacing the current interlanguage links. This can be achieved by providing content language selection by default. See the multi-language workflow (page 2) for an example.
- In the meantime, different alternatives are currently being considered for integrating interlanguage links and the ULS in a way that their functions are not confusing for users. Pginer (talk) 10:49, 22 May 2012 (UTC)
My comments: countries and inputbox
First of all: this looks really nice. Some comments:
- I am a bit skeptical about the country selection. While it seems nice, I think it might be subject to a lot of complaints like "but language X is also spoken in country Y", or about the number of speakers. Plus, we would need to get the data from somewhere (CLDR?) because we don't currently have that information of where a language is spoken and how many speak it where.
- The inputbox is something I have been thinking about for some time and I really want it (similar to the one on OmegaWiki). It should replace the current selector. Also, a language selector is used in various places (e.g. Special:Translate, ...) and you don't need, or can't have, such a huge selection form everywhere. And it should be able to search in language names in the content language, user language and/or autonyms. When selecting a user interface language, it makes sense to make autonyms the primary language name; in other places like Special:Translate it makes sense to have the names primarily in the user language. SPQRobin 05:01, 13 January 2012 (UTC)
Autocompleting form
Guillaume Paumier & I came up with a different design for a universal language picker widget, back in 2010:
https://www.guillaumepaumier.com/articles/universal-language-picker/
TL;DR: autocomplete field that knows about every possible way to represent a language.
i.e. Collect the list of all possible languages, all possible ways of typing them, and all translations of those languages that we have available. Hide this monster list behind an autocomplete text field that makes an API call to the server. Order autocompleted entries by number of speakers of this language. The end.
So, if you wanted to get to Chinese, you could begin by typing 'ch', or the Chinese characters, or the *Japanese* name for Chinese, maybe even the ISO 639 code 'zh', for those die hard Wikipedians who do it like that.
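A minimal sketch of the lookup described above, in Python. The small alias table, the speaker counts, and the `suggest` function are illustrative assumptions, not data or code from the ULS or from the linked design; in practice the table would live on the server behind the API call.

```python
# Hypothetical alias table: every known spelling maps to a language code.
ALIASES = {
    "chinese": "zh",    # English name
    "中文": "zh",        # autonym
    "中国語": "zh",      # Japanese name for Chinese
    "zh": "zh",         # ISO 639 code
    "cherokee": "chr",
    "chr": "chr",
    "chamorro": "ch",
    "ch": "ch",
}

# Rough placeholder speaker counts, used only to order the suggestions.
SPEAKERS = {"zh": 1_000_000_000, "chr": 2_000, "ch": 60_000}


def suggest(typed, limit=5):
    """Return language codes with an alias starting with `typed`,
    most widely spoken first, without duplicates."""
    needle = typed.casefold()
    if not needle:
        return []
    codes = {code for alias, code in ALIASES.items() if alias.startswith(needle)}
    return sorted(codes, key=lambda code: -SPEAKERS[code])[:limit]


print(suggest("ch"))    # Chinese is listed first because of the speaker ordering
print(suggest("中国"))   # the Japanese name also leads to Chinese
```

The linear scan is only for readability; with the full list of names a sorted index or database prefix query would be the more likely choice.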
This autocomplete field can be hidden behind an icon. I would not suggest using a 'world' icon as that is already overused for Wikipedian things. Perhaps a small icon with two word balloons, to indicate different ways of speaking. Like this except simplified. http://www.shutterstock.com/pic-25079242/stock-vector-speech-balloon-symbol.html
The design that you have currently -- using a combination of geography and language popularity -- is attractive, and we considered something just like that, but rejected it. I worry that it will be too complicated to use in practice, and that it depends on a knowledge of geography. Also it may not help you find your language if the current interface language is not one that you understand. When I think of switching a language, I don't think about geography and countries.
(Also, you probably know this already, but flags are rarely a good idea for any interface involving languages, that tries to address the whole world. It sort of works in Europe but in other parts of the world you may even be taking a political side to an argument just by including a flag).
Of course all of these ideas could be combined -- multiple approaches are possible. Still I would like to test the usability of the simplest possible thing, an autocomplete text field. Maybe with shortcuts to the really popular languages. NeilK 20:56, 25 January 2012 (UTC)
m x n language names
The page claims that one will only be able to search for language names in a few major languages. Why? I see no technical limitation, although it probably would require some special procedures - like loading the data in from PHP into the database for fast prefix search, or configuring the search with a different document type, and letting it find out about languages by crawling some special namespace. A little bit out of the ordinary but doable.
I'm just worried that a major-language approach would exclude important local cases like the Serbian name for Greek, or the Basque name for Spanish.
It is implicit that the list of language names in other languages is sparse. There may be no special Basque name for Navajo, but that's ok. NeilK (talk) 19:50, 21 May 2012 (UTC)
- Thanks for voicing your concerns, Neil. The document currently states "Due technical limitations, language searches can be limited to language names in their own language and few major languages." No actual code has been developed at this point in time, so there are some uncertainties. I hope you know us, in that we want to support each and every language in the best way possible. We'll make every effort to do that. However, if that would harm progress or performance unnecessarily, this could be a choice. siebrand (talk) 09:22, 22 May 2012 (UTC)
- As you commented with the Basque-Navajo example, some combinations will serve mainly to generate "false positives" (languages whose name in another language matches the search term, but which the user is not looking for), and this would make the filtering less effective.
- We want to cover as many combinations as are useful, but the meaningful combinations may be hard to anticipate completely. The proposed solution in the design is a basic starting point that will allow us to detect real needs (through user feedback or by analyzing searches where users do not find their language) and add support for them. As you suggested with the Basque-Spanish example, neighbourhood is a good criterion for deciding which language combinations to consider. Pginer (talk) 10:27, 22 May 2012 (UTC)
- I don't think you need to predict anything. Just start a list of messages whose names are the languages recognized in Wikimedia's current scheme. People translate them as necessary in the usual way.
- Then you have a sparse matrix, where there's probably an entry for Basque->Spanish but almost certainly not for Basque->Navajo. Enter these into a database structure, of some kind that allows for prefix searches, probably server-side, and make that power your autocompleting form. NeilK (talk) 18:01, 22 May 2012 (UTC)
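The sparse matrix and prefix search suggested above could be sketched as follows. This is an illustration only: the `NAMES` entries are a handful of hand-picked examples rather than an export of real translations, and `prefix_search` is a hypothetical helper, not part of any existing extension.

```python
import bisect

# Sparse matrix of language names:
# (language the name is written in, language being named) -> name.
# Pairs nobody has translated (e.g. Basque -> Navajo) are simply absent.
NAMES = {
    ("en", "es"): "Spanish",
    ("en", "el"): "Greek",
    ("eu", "es"): "gaztelania",
    ("eu", "el"): "greziera",
    ("sr", "el"): "грчки",
    ("es", "es"): "español",
}

# Flatten into a sorted list so that a prefix lookup is a binary search
# followed by a short scan.
INDEX = sorted((name.casefold(), target) for (_, target), name in NAMES.items())


def prefix_search(typed):
    """Return the codes of all languages with a known name starting with `typed`."""
    needle = typed.casefold()
    if not needle:
        return set()
    start = bisect.bisect_left(INDEX, (needle,))
    found = set()
    for name, target in INDEX[start:]:
        if not name.startswith(needle):
            break  # sorted order: no later entry can match either
        found.add(target)
    return found
```

For example, `prefix_search("gaz")` finds Spanish through its Basque name and `prefix_search("гр")` finds Greek through its Serbian name, while `prefix_search("nav")` finds nothing because no entry exists for it.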
- Just wanted to note that cross-script language name search is present in the current version of ULS. See Universal Language Selector/Technical Design#Language_Search_API Santhosh.thottingal (talk) 23:56, 25 September 2012 (UTC)
- I'm glad that you're doing this... a few comments from the peanut gallery:
- The idea of Unicode code point mod 1000 is interesting. I've never heard of anything like that. I would have assumed a Trie for this. That would give you more efficient results for the common cases which are all in code page 1 or 2. But, Tries and PHP and the current deployment process don't play nice together. So maybe your way is best.
- (You should be caching query results for 1-3 characters anyway, so it barely matters what you do on the backend.)
- It's not clear to me that Levenshtein distance is appropriate... we are talking about one or two characters typed before the best result is shown, 99% of the time. Nobody is going to get to type 'finish', they are going to type 'fi' at most and then arrow down to 'Finnish'. There is immediate feedback that you made a mistake when the auto-completions disappear, showing that you must have screwed something up. I understand you want to deal with typos, but I'm not sure what the right strategy is for autocomplete plus typo-forgiveness. NeilK (talk) 02:09, 26 September 2012 (UTC)
- Thanks NeilK for the comments :)
- I initially thought about doing it with a trie. A trie would be more effective if we had a lot of common string prefix patterns, but in our case the strings are from multiple scripts, with practically no common prefixes. An easy way of creating buckets for strings from multiple scripts is to use Unicode code points.
- And for typo correction: it gets triggered only when there is no result from the normal prefix match, i.e. it happens only when there is nothing to autocomplete. Santhosh.thottingal (talk) 16:59, 7 October 2012 (UTC)
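The approach discussed in this thread (buckets keyed by Unicode code point mod 1000, with typo correction only when the prefix match is empty) might look roughly like the sketch below. It is not the ULS implementation: bucketing on the first character, the sample names, and the distance threshold of 2 are all assumptions made for illustration.

```python
from collections import defaultdict

BUCKETS = 1000  # from the "code point mod 1000" idea discussed above

# Illustrative data only: (searchable name, language code).
NAMES = [("finnish", "fi"), ("suomi", "fi"), ("french", "fr"),
         ("français", "fr"), ("മലയാളം", "ml"), ("malayalam", "ml")]

# Assumption: each name is bucketed by the code point of its first character.
index = defaultdict(list)
for name, code in NAMES:
    index[ord(name[0]) % BUCKETS].append((name, code))


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def search(typed, max_distance=2):
    """Prefix match inside one bucket; fuzzy match only if that finds nothing."""
    needle = typed.casefold()
    if not needle:
        return []
    bucket = index.get(ord(needle[0]) % BUCKETS, [])
    hits = [code for name, code in bucket if name.startswith(needle)]
    if not hits:
        hits = [code for name, code in bucket
                if levenshtein(needle, name) <= max_distance]
    return list(dict.fromkeys(hits))  # drop duplicates, keep order
```

With this data, typing 'fi' returns Finnish through the prefix match, and the fallback is only reached for something like 'finish', which has no prefix match but is one edit away from 'finnish'. Two scripts can share a bucket, which is harmless because the prefix check still filters the candidates.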
Option C for widget placement
You might just be giving it as an example, but I wanted to point out some potential problems with this choice:
First, the large size and placement of Option C (mockup) is not super great in the context of user expectations. Selecting a chrome language that is different than the content of the site is a very advanced and perhaps even unexpected option. Generally users expect that the two will match (and that they should visit a different site to view a wiki in another language). Option C is a rather prominent placement for such an option, and it's a place that is not usually associated with language preferences.
Additionally, there are already a great many icons which appear in that space, including:
- Featured status
- Good Article status
- Protection status
- User rights (for NS2 and NS3)
- Online status (again, not NS0)
- And an infinite number of others, mostly placed there by templates
As these icons tend to use absolute positioning, a language selector placed there would likely be pushed far to the left. Steven Walling (WMF) • talk 17:12, 22 May 2012 (UTC)
- Thanks for the comments.
- The idea with the Option C was to create a general "page settings" which is not exclusively used for language settings (as opposed to Options A and B). It may also include settings from other extensions or even tools related to content such as exporting to PDF.
- This specific approach has several implications (such as the problem with icons that you mention). But we wanted to observe whether users found it intuitive to look for language settings inside some kind of page settings, or whether they need them to be more prominent. Pginer (talk) 21:49, 22 May 2012 (UTC)
Menus vs. input methods
I really like option b for changing the menu language -- I think it's appropriately easy to access. And I think overlaying the interlanguage links reduces confusion -- you can either switch a language, or change the settings for the language you're on. User tests will confirm whether this works, but intuitively it makes a lot of sense to me.
For input methods I worry that any of the options suggested are too detached from, well, inputs to be discoverable. I've not read through all the docs, so if the goal is to have an additional input method activator somewhere, that's great. If not, I'd suggest having an input method activation widget associated with each input. That would require styling it for text areas & text inputs, but I think that's worth the effort. (Having redundant activation/deactivation methods is probably not a bad idea.) Eloquence (talk) 18:52, 22 May 2012 (UTC)
- Thanks for the comments.
- We expect Option B to be natural to users that are familiar with the interlanguage links.
- For Option B we are mainly interested in observing whether users that are not familiar with interlanguage links are able to find the language settings when they need them.
- We considered also the adaptation of the ULS for some specific contexts such as editing and multiple-language selection (e.g., when translators subscribe to multiple languages). In the case of editing, we'll include the current language indicator close to the editing context and provide direct access to input settings. We made some wireframes for these ideas but they are not included in the prototypes for this initial round of tests. Pginer (talk) 22:32, 22 May 2012 (UTC)
- For Option B we are mainly interested in observing whether users that are not familiar with interlanguage links are able to find the language settings when they need them.
- That makes sense. I'm noticing in the test plan that we're also testing confusion between interlanguage links vs. language settings. That's great, and if option B fares better in this regard than options A and C, IMO that may outweigh discoverability considerations, especially for a first version of ULS. Accidental introduction of new usability tripwires into the basic "read this article in language FOO" operation would be a major problem as it's such a critical feature.
- Regarding input method activation - that's also good to hear. We have quite a few inputs beyond text areas that we'll need to test for any layout wonkiness if we introduce some kind of contextual activation, e.g. the file description fields in Upload Wizard. Narayam works beautifully with those today, BTW :-) Eloquence (talk) 23:40, 22 May 2012 (UTC)
My thoughts
- I don't think there's really a need for Font settings. Users shouldn't have to deal with changing fonts, it should just be the right font at the start.
- Likewise, I don't see the need for making UI settings very accessible. It's a very rare situation that a user would need a different language for the UI than the language of the content of the wiki they're using. For heavy editors that would need this, it can be set from the preferences menu.
- Attaching input settings to the language settings seems unnecessarily confusing, in my opinion. For languages that have need for special input tools, the tools are generally visible somehow/show some visible element, either when the box is focused or when some button near the input box/textarea is clicked, right? (Not sure about this.)
- I think it makes sense to highlight languages that are more likely to be relevant to the user by taking into account geographic location, but I don't think it makes sense to do it by dividing places into either countries or standard geographic regions. Rather, I would do it like this:
- Take all IP addresses that visit the project. Divide these IPs into IP-groups of about the same size, each large enough to get a good statistical sample of visits, with each IP in each group somewhat geographically close to all the other IPs in the group. Keep track of the visits by each IP-group to each language version. Then, for each user in an IP-group that's looking to switch to their preferred language, highlight (preferably by way of placing these on top) the 5-12 (available) languages that score highest in the following test: Visits by the IP-group to the language version ^ 2 / total visits to the language version by all IP-groups.
- Tracking which languages have been read by a particular user and using that data to determine "likely languages" is a bad idea, in my opinion. Wikimedia has a pretty good reputation about these kinds of things so far.
- Having a "remember that I often use this language" checkbox available would be useful though, in my opinion.
- Allowing the searchbox to also work with approximate transliterations of language names into different scripts would be helpful, I think.
- If multi-script languages are to be shown per-script, the page the user ends up at after clicking one of the links should be in the same script that the link was. Yair rand (talk) 01:02, 26 June 2012 (UTC)
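The IP-group scoring proposed above (visits by the group to a language version, squared, divided by total visits to that language version) can be sketched as follows. The visit counts and the group names are invented for illustration, and the default of five results is just the lower end of the suggested 5-12 range.

```python
# Hypothetical page-view counts: VISITS[ip_group][language version] = visits.
VISITS = {
    "group-a": {"en": 900, "fr": 80, "br": 20},
    "group-b": {"en": 5000, "fr": 4000, "br": 1},
}


def likely_languages(group, visits=VISITS, top=5):
    """Rank languages for one IP-group by (group visits)^2 / (visits by all groups)."""
    totals = {}
    for counts in visits.values():
        for lang, n in counts.items():
            totals[lang] = totals.get(lang, 0) + n
    scores = {lang: n * n / totals[lang] for lang, n in visits[group].items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]


print(likely_languages("group-a"))
print(likely_languages("group-b"))
```

In this made-up data, group-a sends more visits to "fr" than to "br", yet "br" scores higher for it because nearly all "br" traffic comes from that group. Dividing by the global total is what produces this effect.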
- Thanks for your comments.
- Regarding font support, our idea is to use sensible defaults and provide choice only for the languages where it makes sense (languages whose script has problems). For most users these options won't be shown. The use of "English" as an example language in the mockups was not intended to suggest that font selection will be shown for this specific language.
- Regarding the accessibility of UI settings, we have observed that the degree to which users need these tools varies depending on their specific language, location and other factors. Access to settings is designed to be hidden under a menu so that users who need these features can find the path to them easily, while users who do not need them are not distracted and can just ignore one icon.
- Thanks for pointing out another method for anticipating user languages. It seems an interesting approach, but we need to analyze the cost/precision ratio more deeply.
- It is good that you raised these privacy concerns. For registered users, their edits are being tracked and visible under "my contributions". For anonymous users, some privacy concerns may appear, so hearing from the community and providing the functionality in the most transparent way will be required, as you suggest.
- Regarding searchbox, the idea is to be as flexible as possible including the most variants in which any user can search for their language.
- Regarding multi-script languages that is our intention. Pginer (talk) 12:30, 6 July 2012 (UTC)
Cherokee and Inuktitut
Hello, I think we should also add Cherokee and Inuktitut fonts. I do not have time to search for free fonts now, so I am leaving this message here. Pamputt (talk) 08:08, 11 July 2012 (UTC)
- I doubt anyone will do it for you. Nemo 20:05, 12 August 2012 (UTC)
- Indeed. I do not know if Inuktitut and Cherokee are now included, but if that is not the case, you can try to add them with the GNU FreeFonts (https://www.gnu.org/software/freefont/). It is written that Cherokee and Unified Canadian Aboriginal Syllabics are supported. Pamputt (talk) 13:26, 16 August 2013 (UTC)
"Common languages"
Is the current version of the Universal language selector the thing in use at http://wikidata-test-repo.wikimedia.de ? It looks nice, except that I get a rather strange list of suggested "common languages": Alemannisch brezhoneg català corsu English euskara français italiano Nederlands occitan português. It may be related to my browser being in French, but even so it does not make much sense to me. --Zolo (talk) 07:26, 14 September 2012 (UTC)
- Thanks for your feedback Zolo,
- The common languages section includes languages from different sources (each one may, in some circumstances, also contribute not-so-common languages):
- Browser accepted languages. Browsers can be configured with several accepted languages and some of those may be included in the list (not only the main one, French, in your case). So if you have alternative accepted languages they may be there.
- Previous selected languages. This is helpful for users that usually change between a small set of languages. But if you explore at some point an unusual language just for curiosity it will be there the next time.
- IP-based geolocation: languages spoken in the user location. The ULS uses CLDR data and geo-location is made at country level at the moment. So if any of the "uncommon" languages you found is not due to the previous two reasons, then it is likely due to the precision level (not all languages of France are spoken in all parts of France, but we do not have zone-specific info), or to problems in the region-language info from the CLDR (the repository of information we are using to determine the languages spoken in each region).
- In case the problem is related to geo-ip, we'll take a look at the languages that are associated with France to verify whether there is some incorrect data. Pginer (talk) 18:43, 21 September 2012 (UTC)
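To make the three sources listed above concrete, here is a rough sketch of how such a list could be assembled. It is not the ULS code: the function names, the priority order (browser languages, then previous selections, then country-level data), the limit of eight entries, and the `TERRITORY_LANGUAGES` table standing in for CLDR data are all assumptions.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        weighted.append((quality, tag.strip().lower()))
    # sorted() is stable, so tags with equal weight keep their header order.
    ranked = sorted(weighted, key=lambda item: -item[0])
    return [tag for quality, tag in ranked if quality > 0]


# Hypothetical stand-in for country -> languages data of the kind CLDR provides.
TERRITORY_LANGUAGES = {"FR": ["fr", "oc", "br", "co", "ca", "eu"]}


def common_languages(accept_header, previous, country, limit=8):
    """Merge the three sources, dropping duplicates while keeping order."""
    merged = (parse_accept_language(accept_header)
              + list(previous)
              + TERRITORY_LANGUAGES.get(country, []))
    return list(dict.fromkeys(merged))[:limit]
```

Because the geolocation source works at country level, a visitor anywhere in France would get the same regional languages appended, which matches the kind of list reported in this thread.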
- Thanks for the explanation. I get the same list with Firefox and Chrome, so I guess it is due to the CLDR. The list mostly consists of regional French languages. It may sound fine, but most of them are more or less kept alive for fun and folklore, and very few people use them as their primary language. A list of languages actually spoken in France would be rather different. I could not find many relevant statistics, but here is the list of languages people use to talk to their kids :