Wikimedia Quality Services

The Quality Services team is a sub-team of the Developer Experience group within the Wikimedia Foundation. Established in November 2024 after a split of the QTE team, we improve the overall quality of the software developed by the Foundation.

Join our IRC channel: #wikimedia-qte.

Check out our blog posts at phab:phame/blog/view/21/.

Key guiding principles

  • Ensuring Quality is a part of everyone's work.
  • Ensuring Quality is in every part of the SDLC.

Things we do

Test Engineering Support
Team | Test Engineering Contacts | OKR Alignment
Abstract-Wikipedia | Elena Tonkovidova | WE2 - Vital Knowledge
Trust and Safety | Dom Walden / Derrick Jackson | WE4 - Safety and Security
Connection | Vaughn Walters | WE1.4 - Task prioritization
Community Tech | Dom Walden / George Mikesell | WE1 - Contributor Experiences, PES1.2 - Improve product prioritization via community signals
Editing | Rummana Yasmeen / Esther Akinloose | WE1 - Contributor Experiences
Contributor Growth | Elena Tonkovidova | WE1 - Contributor Experiences
Data Products | Emeka Chukwukere |
Mobile Apps | Anthony Borba | WE3 - Consumer Experiences
Reader Experience | Edward Tadros / Elena Tonkovidova | WE3 - Consumer Experiences
Reader Growth | Edward Tadros / Elena Tonkovidova | WE3 - Consumer Experiences
Language and Product Localization | George Mikesell | WE2 - Vital Knowledge

How we work

The Quality Services team currently follows an embedded model, operating as a unified team under the Developer Experience organization.

The Quality Services team supports 11 developer teams, each with its own processes, structure, unique challenges, and expertise.

The Test Engineer embedded with each team is currently responsible for ensuring the quality of that team's work. This involves both monitoring Phabricator and reporting on the overall health of the team's features.

How we test

Software testing happens within each team individually, so the QS testing strategy has evolved team by team to serve each team's needs. As a result, teams differ in numerous aspects:

  • Automation
  • Manual / Exploratory Testing
  • Tools used for testing
  • Documenting tests
  • Documenting bugs
  • Maintaining metrics

This section details how we currently test for each supported team.

Abstract-Wikipedia (Elena)

Trust and Safety (Dominic and Derrick)

  • Test Strategy doc: No single document describing the team's approach
  • Bug Tracking: Trust and Safety Phabricator
  • Automation Strategy: Trust and Safety commits code directly to other teams' repositories, so it follows those teams' strategies
  • QA Process: Tickets come into the QA column, and a QA engineer tests them.
  • Test plan documentation: Per project, User:DWalden (WMF)/Test2wiki k8s migration
  • Metrics tracked: None QA related

Connection (Vaughn)

Community Tech (Dom/George)

  • Test Strategy doc: No doc that I am aware of, and it would vary depending on the project anyway.
  • Bug Tracking: Community Tech Phabricator
  • Automation Strategy: Community Tech delivers code directly to the repo and follows that repository's testing protocols/directions from there
  • QA Process: Split into two time-zone-friendly teams (listed below). Testers pick up any tasks in the QA column. If an issue related to the task is found, it is moved back to the 'In Development' column with a ping to the engineer on the Phabricator task and in Slack as a heads-up.
  1. USA - Sea Lion Squad
  2. World - Fox Squad
  • Test plan documentation: Test Plan in general, but it will differ per project
  • Metrics tracked: None that I am aware of for QA

Editing (Rummana/Esther)

  • Test Strategy doc:
  • Bug Tracking:
  • Automation Strategy:
  • QA Process:
  • Test plan documentation:
  • Metrics tracked:

Contributor Growth (Elena)

Data Products (Emeka)

  • Test Strategy doc:
  • Bug Tracking:
  • Automation Strategy:
  • QA Process:
  • Test plan documentation:
  • Metrics tracked:

Mobile Apps (Anthony)

  • Test Strategy doc: In draft
  • Bug Tracking: Phabricator board - iOS and Android
  • Automation Strategy: Android: smoke tests (Espresso); iOS: unit/integration tests (XCTest)
  • QA Process: QA handles regression/smoke test suites per RC, records bugs and retests as necessary
  • Test plan documentation: Smoke test documented in Google Sheets, with per-ticket testing tracked in Phabricator.
  • Metrics tracked: # of bugs found weekly, # of smoke tests run, # of consumer reported bugs
  • Smoke Test Suite: Smoke Test Documentation

Structured Data (Elena) - Archived

Reader Experience (Edward / Elena)

  • Test Strategy doc: TBD
  • Bug Tracking: TBD
  • Automation Strategy: Define smoke test suite per project, automate repeatable tests.
  • QA Process: Currently only testing what makes it to the QA column. Deployment validation testing has not been implemented.
  • Test plan documentation: Individual testing is documented in Phab tasks. Plan is still evolving.
  • Metrics tracked: TBD

Reader Growth (Edward / Elena)

  • Test Strategy doc: TBD
  • Bug Tracking: TBD
  • Automation Strategy: Define smoke test suite per project, automate repeatable tests.
  • QA Process: TBD
  • Test plan documentation: TBD
  • Metrics tracked: TBD

Language and Product Localization (George)

  • Test Strategy doc: No doc that I am aware of, and it would vary depending on the project anyway
  • Bug Tracking: LPL Phabricator
  • Automation Strategy: None
  • QA Process: Test what comes into the QA column
  • Test plan documentation: Test Plan in general, but it will differ per project
  • Metrics tracked: None

Testing Concepts

Quality assurance (QA), like all engineering disciplines, has its own vocabulary and concepts. QA engineers and test engineers use terms like "regression testing", "smoke testing", and "automation", acronyms like "UAT" and "CUJ", and concepts like risk-based testing. This section clarifies what these mean in the context of the Wikimedia Foundation.

Smoke testing

Smoke testing is a small set of tests to determine that core functionality performs as expected.
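
As a loose sketch only (assuming a plain Node.js 18+ script; the endpoints below are placeholders rather than any team's real suite), a smoke check might look like this:

// Minimal smoke-check sketch in Node.js 18+ (global fetch available).
// The endpoints below are placeholders, not a real team's suite.
const coreEndpoints = [
  'https://en.wikipedia.org/wiki/Main_Page',
  'https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&format=json'
];

async function smokeCheck() {
  for ( const url of coreEndpoints ) {
    const response = await fetch( url );
    // A smoke test only asserts that core functionality responds at all;
    // deeper behavior is left to regression and integration tests.
    if ( !response.ok ) {
      throw new Error( `Smoke check failed for ${ url }: HTTP ${ response.status }` );
    }
  }
  console.log( 'Smoke check passed' );
}

smokeCheck();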

Regression testing

Regression testing is a large set of tests (typically all available) that verifies that previously developed functionality still works correctly.

Automated testing

Automated testing uses software to execute tests and check their results without manual intervention, making the checks repeatable and cheap to run on every change.

Unit and Integration testing

Unit testing is the smallest possible test. It tests assumptions about code written by an engineer. For example, a unit test validates that a function returns the correct value or that a method modifies the underlying object in an expected way. As engineers, we can use unit tests to ensure that "negative" cases (unexpected input values or added latency in the form of pauses) do not inadvertently harm the functionality.
Integration testing validates that two or more distinct pieces work together.
// Example in JavaScript (previously pseudocode)

function addNumbers( a, b ) {
  if ( typeof a !== 'number' || typeof b !== 'number' ) {
    throw new TypeError( 'addNumbers expects two numbers' );
  }
  return a + b;
}

function subNumbers( a, b ) {
  return a - b;
}

// Unit test: validates that addition works
console.assert( addNumbers( 1, 2 ) === 3 );

// Negative unit test: validates that bad input fails correctly
let threw = false;
try {
  addNumbers( 1, 'potato' );
} catch ( e ) {
  threw = true;
}
console.assert( threw === true );

// Integration test: shows two distinct pieces of code working together
let finalValue = addNumbers( 1, 2 ) + subNumbers( 1, 2 ); // 3 + (-1)
console.assert( finalValue === 2 );

Critical User Journey (CUJ)

The most important user journeys within a given scope (feature, full site).

User Acceptance Testing (UAT)

A type of testing done by users, guided by open-ended questions, to determine whether the software solves the proposed need. It typically surfaces not only bugs but also design improvement opportunities.

End to End testing (E2E)

A scope of testing that typically includes pieces outside the scope being worked on, marked by a distinct "start" and "end" point.

Risk-based testing

Risk-based testing is an approach to testing that prioritizes what to validate based on the ultimate risk presented to the organization or consumers. For example, if you do not follow risk-based testing, you may test only the "code last touched" as a way to limit the number of bugs introduced. In risk-based testing, you may instead prioritize testing primary consumer user journeys over the item last touched.
Risk-based testing requires agreements between engineering, QA, product management and other leaders to ensure that risk is fully understood, as there is typically a de-prioritization of "non-risky" changes.
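
As an illustration only (the test names, impact scores, and likelihood scores below are hypothetical), risk-based prioritization can be thought of as ordering checks by a simple risk score:

// Conceptual sketch of risk-based prioritization; tests and scores are hypothetical.
const tests = [
  { name: 'Edit and save an article', impact: 5, likelihood: 4 }, // primary user journey
  { name: 'Rarely used admin preference', impact: 2, likelihood: 1 },
  { name: 'Recently touched helper function', impact: 3, likelihood: 2 }
];

// Risk = impact x likelihood; run the riskiest checks first,
// accepting that low-risk changes may get less coverage.
const prioritized = tests
  .map( ( t ) => ( { ...t, risk: t.impact * t.likelihood } ) )
  .sort( ( a, b ) => b.risk - a.risk );

console.log( prioritized.map( ( t ) => `${ t.name } (risk ${ t.risk })` ) );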

Automated tests available

Selenium
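
Wikimedia's Selenium browser tests are generally written in Node.js with WebdriverIO; the sketch below assumes that setup (with browser, describe, and it provided as globals by the test runner), and the target page and title check are illustrative only, not an actual QS test.

// WebdriverIO-style spec; browser, describe and it are globals supplied by the runner.
// The URL and title check are illustrative, not an actual QS test.
describe( 'Main Page', () => {
  it( 'loads and shows the expected title', async () => {
    await browser.url( 'https://en.wikipedia.org/wiki/Main_Page' );
    const title = await browser.getTitle();
    if ( !title.includes( 'Wikipedia' ) ) {
      throw new Error( 'Unexpected page title: ' + title );
    }
  } );
} );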

Cypress
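
A comparable sketch in Cypress (again, the page and search-input selector are illustrative assumptions, not an actual QS test):

// Cypress spec; cy, describe and it are globals supplied by the Cypress runner.
describe( 'Search', () => {
  it( 'reaches an article from the main page', () => {
    cy.visit( 'https://en.wikipedia.org/wiki/Main_Page' );
    // The search input selector assumes the default Wikipedia skin.
    cy.get( 'input[name="search"]' ).type( 'Software testing{enter}' );
    // Cypress retries assertions until they pass or time out.
    cy.title().should( 'include', 'Software testing' );
  } );
} );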

See also

Category:WMF Projects Category:WMF Projects 2024q4 Category:WMF Projects 2025q1 Category:WMF Projects 2025q2 Category:WMF Projects 2025q3