SRE/Infrastructure naming conventions

This page documents the naming conventions for servers, routers, and data center sites.

Our servers currently fall into two broad categories:

  • Clustered servers: These use numeral sequences with a descriptive prefix (see #Networking and #Servers). For example: db1001.
  • Miscellaneous servers: These used unique hostnames (see #Miscellaneous servers). For example: helium. This naming convention is deprecated and not used for new hosts, but some older miscellaneous-named hosts still exist.

Name reuse

Historically, we did not reuse names of past servers for new servers. For example, after db1001 is decommissioned, no other server will be named db1001. Ganeti VMs sometimes reuse hostnames, but bare metal typically will not.

The notable exception is networking gear, which is named deterministically by rack. For example, the access switch in Eqiad rack A8 is named asw-a8-eqiad. If it is replaced, the new switch takes the same name.

All hardware in the datacenter space is tracked in Netbox, which can be used to check for existing hostnames for both hardware and Ganeti instances.

Data centers

Data centers were traditionally named using the vendor's initials (at the time of lease signing) followed by the IATA code of a nearby major airport. For example, Eqiad combines the vendor Equinix with IAD, the code of the large nearby airport. This convention was used from 2003 up to 2023. Because vendors go through acquisitions and the original initials stop applying after some time, starting with Magru in 2023 only the airport code is used, along with a freely chosen prefix.[1]

DC | Vendor (originally) | Airport code
codfw | CyrusOne | DFW
drmrs | Digital Realty | MRS
eqdfw | Equinix | DFW
eqiad | Equinix | IAD
eqord | Equinix | ORD
eqsin | Equinix | SIN
esams | EvoSwitch | AMS
knams | Kennisnet | AMS
magru | Ascenty | GRU
ulsfo | United Layer | SFO
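
As an illustration of the convention above, the following minimal sketch splits a site code into its prefix and airport code. It assumes that every site code is exactly two prefix letters followed by a three-letter IATA code, which holds for all entries in the table but is an observation rather than a stated rule; the function name split_site_code is hypothetical.

```python
def split_site_code(site: str) -> tuple[str, str]:
    """Split a site code such as 'eqiad' into (prefix, IATA airport code).

    Assumption: two prefix letters followed by a three-letter airport code,
    as seen in every row of the table above.
    """
    site = site.strip().lower()
    if len(site) != 5 or not site.isalpha():
        raise ValueError(f"unexpected site code: {site!r}")
    return site[:2], site[2:].upper()


if __name__ == "__main__":
    for code in ("eqiad", "codfw", "magru"):
        print(code, split_site_code(code))
```

For sites named before 2023 the prefix corresponds to the vendor initials; for newer sites such as magru it is freely chosen, so the prefix alone should not be used to infer the vendor.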

Networking

Naming for network equipment is based on role and location.

This also applies to: power distribution units, serial console servers, and other networking infrastructure.

Name prefix | Role | Example
asw | access switch | asw-a1-eqiad
cr | core router | cr1-eqiad
mr | management router | mr1-eqiad
lsw | leaf switch | lsw1-e1-eqiad
ssw | spine switch | ssw1-e1-eqiad
msw | management switch | msw1-eqiad & msw-b2-eqiad
pfw | payments firewall | pfw1-eqiad
ps1 / ps2 | power strips/distribution units | ps1-b3-eqiad
scs | serial console server | scs-a8-eqiad
fasw | Fundraising access switch | fasw-c-codfw
cloudsw | Cloud L3 switches | cloudsw1-c8-eqiad
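
The examples in the table suggest a common shape: role prefix, an optional device number, an optional rack (or row), and the site, joined by hyphens. The sketch below composes names that way. The function name network_device_name and its parameters are hypothetical, and the pattern is inferred from the examples rather than taken from a formal specification.

```python
from typing import Optional


def network_device_name(
    prefix: str,
    site: str,
    number: Optional[int] = None,
    rack: Optional[str] = None,
) -> str:
    """Compose a network device name from role prefix, rack, and site.

    Assumed pattern (inferred from the table): <prefix>[number][-rack]-<site>
    """
    first = prefix if number is None else f"{prefix}{number}"
    parts = [first]
    if rack:
        parts.append(rack.lower())
    parts.append(site.lower())
    return "-".join(parts)


if __name__ == "__main__":
    print(network_device_name("asw", "eqiad", rack="A8"))
    print(network_device_name("cr", "eqiad", number=1))
    print(network_device_name("lsw", "eqiad", number=1, rack="e1"))
```

Because the name depends only on role and location, a replacement switch in the same rack gets the same name as the one it replaces.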

OpenStack deployments

[Datacenter Site][numeric identifier](optional dev suffix to indicate non-external non-customer facing deployments) - [r (if region)][letter for AZ]

  • Current Eqiad/Codfw deployments will not fully meet these standards until rebuilt: [eqiad0 (deployment), eqiad (region), nova (AZ)]
Deployment | Region | Availability Zone
eqiad0 | eqiad0-r | eqiad0-rb
eqiad1 | eqiad1-r | eqiad1-rb
codfw0dev | codfw0dev-r | codfw0dev-rb
codfw1dev | codfw1dev-r | codfw1dev-rb
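
The pattern above can be sketched as a small helper that derives the deployment, region, and availability zone names together. This is a minimal illustration: the function name openstack_names is hypothetical, and the default availability zone letter "b" is taken only from the rows of the table, not from a documented rule.

```python
def openstack_names(
    site: str, number: int, dev: bool = False, az_letter: str = "b"
) -> tuple[str, str, str]:
    """Return (deployment, region, availability zone) names.

    Follows [site][number][dev] - [r][AZ letter] as described above.
    """
    deployment = f"{site.lower()}{number}{'dev' if dev else ''}"
    region = f"{deployment}-r"
    availability_zone = f"{region}{az_letter}"
    return deployment, region, availability_zone


if __name__ == "__main__":
    print(openstack_names("eqiad", 1))
    print(openstack_names("codfw", 1, dev=True))
```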

Disks

  • Arrays must use the Storage array device role in Netbox.
  • Naming follows two conventions:
      • Array is attached to a single host:
          • hostname_of_host_system-arrayN
          • Example: ms2001-array1, ms2001-array2
          • All arrays get a number, even if there is only a single array.
          • Example: dataset1001-array1
      • Array is attached to multiple hosts:
          • Labs uses this for labstore, where each shelf connects to two different hosts, so the older single-host naming scheme does not work.
          • servicehostgroup-arrayN-site
          • Example: labstore-array1-codfw, labstore-array2-codfw
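
The two array naming conventions can be sketched as one helper: without a site the owner is a single host, with a site the owner is a service host group. The function name array_name and its parameters are hypothetical and only illustrate the patterns listed above.

```python
from typing import Optional


def array_name(owner: str, index: int, site: Optional[str] = None) -> str:
    """Build a storage array name.

    Single host:     <hostname>-array<N>
    Multiple hosts:  <servicehostgroup>-array<N>-<site>
    """
    if index < 1:
        # Every array gets a number, starting at 1, even if it is the only one.
        raise ValueError("array index must be 1 or greater")
    name = f"{owner}-array{index}"
    return f"{name}-{site.lower()}" if site else name


if __name__ == "__main__":
    print(array_name("ms2001", 1))
    print(array_name("labstore", 2, site="codfw"))
```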

Kubernetes

Any cluster that is not the main wikikube cluster should use a consistent identifier and follow these conventions:

  • Control plane service name: <identifier>-ctrl
  • Ingress service name: <identifier>-ingress [-ro|-rw] for active/active or active/passive
  • Hostnames for control plane: <identifier>-ctrlXXXX.$site.wmnet
  • Hostnames for kubelets: <identifier>-workerXXXX.$site.wmnet
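
As a rough illustration of these conventions, the sketch below derives the service names and hostnames for a given cluster identifier. It assumes that XXXX is the four-digit, datacenter-based host number described under Servers; the function name kubernetes_names is hypothetical, and the optional -ro/-rw ingress suffix is left out.

```python
def kubernetes_names(identifier: str, site: str, number: int) -> dict[str, str]:
    """Derive Kubernetes service names and hostnames for one cluster.

    Assumption: XXXX in the hostname patterns is a four-digit host number.
    """
    if not 0 <= number <= 9999:
        raise ValueError("host number must fit in four digits")
    return {
        "control_plane_service": f"{identifier}-ctrl",
        "ingress_service": f"{identifier}-ingress",
        "control_plane_host": f"{identifier}-ctrl{number:04d}.{site}.wmnet",
        "worker_host": f"{identifier}-worker{number:04d}.{site}.wmnet",
    }


if __name__ == "__main__":
    for key, value in kubernetes_names("dse-k8s", "eqiad", 1001).items():
        print(f"{key}: {value}")
```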

Servers

Datacenter numbering

Any system that runs in a dedicated services cluster with other machines is named after its role/service task. As a rule, we attempt to name after the service, not just the software package. Servers within a group are also numbered based on the datacenter they are located in.

Data center | Numeral range | Example
pmtpa / sdtpa (decommissioned) | 1-999 | cp7
eqiad | 1000-1999 | db1001
codfw | 2000-2999 | mw2187
esams / knams | 3000-3999 | cp3031
ulsfo | 4000-4999 | bast4001
eqsin | 5000-5999 | dns5001
drmrs | 6000-6999 | cp6011
magru | 7000-7999 | cp7001
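
The numbering table can be turned into a lookup from hostname to datacenter. The following is a minimal sketch: the names SITE_RANGES and datacenter_for are hypothetical, and it assumes the last run of digits in the short hostname is the host number (so that identifiers such as dse-k8s, which contain a digit themselves, are handled).

```python
import re
from typing import Optional

# Numeral ranges from the table above (upper bounds are exclusive in range()).
SITE_RANGES = {
    "pmtpa / sdtpa": range(1, 1000),
    "eqiad": range(1000, 2000),
    "codfw": range(2000, 3000),
    "esams / knams": range(3000, 4000),
    "ulsfo": range(4000, 5000),
    "eqsin": range(5000, 6000),
    "drmrs": range(6000, 7000),
    "magru": range(7000, 8000),
}


def datacenter_for(hostname: str) -> Optional[str]:
    """Return the datacenter implied by a clustered hostname, or None."""
    short_name = hostname.split(".", 1)[0]
    digit_runs = re.findall(r"\d+", short_name)
    if not digit_runs:
        return None  # e.g. legacy miscellaneous names such as "helium"
    number = int(digit_runs[-1])
    for datacenter, numbers in SITE_RANGES.items():
        if number in numbers:
            return datacenter
    return None


if __name__ == "__main__":
    for host in ("db1001", "mw2187.codfw.wmnet", "cp7", "helium"):
        print(host, "->", datacenter_for(host))
```

This applies only to clustered server names; network gear such as ps1-b3-eqiad carries its site in the name instead and would be misread by this lookup.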

When adding a new datacenter, make sure to update operations/puppet.git's /typos file which checks hostnames.

Hostname prefixes

The full list of hostname prefixes currently in use can be gathered from a cumin host (cumin1002.eqiad.wmnet, cumin2002.codfw.wmnet) with:

sudo cumin --no-color 'A:all' 2>/dev/null | nodeset -S '\n' -e | sed 's/\..*//g' | sed 's/[0-9]\{4\}//g' | sort | uniq

Be aware that hosts with dev in their name can have the dev part either before or after the four-digit number.
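
For readers less familiar with sed, the text-processing part of the command above can be sketched in Python. This only mirrors the two sed substitutions and the sort/uniq step on a list of fully qualified hostnames; it does not query cumin, the function name hostname_prefixes is hypothetical, and the sample hostnames are illustrative.

```python
import re


def hostname_prefixes(fqdns: list[str]) -> list[str]:
    """Reduce fully qualified hostnames to their sorted, unique prefixes."""
    prefixes = set()
    for fqdn in fqdns:
        short_name = fqdn.split(".", 1)[0]  # like: sed 's/\..*//g'
        prefix = re.sub(r"[0-9]{4}", "", short_name)  # like: sed 's/[0-9]\{4\}//g'
        prefixes.add(prefix)
    return sorted(prefixes)  # like: sort | uniq


if __name__ == "__main__":
    sample = [
        "db1001.eqiad.wmnet",
        "db2001.codfw.wmnet",
        "mw2187.codfw.wmnet",
        "cloudcontrol2001-dev.codfw.wmnet",  # illustrative name with a dev suffix
    ]
    print(hostname_prefixes(sample))
```

Note that removing the four digits from a name with a trailing dev part leaves cloudcontrol-dev, which is why dev can appear on either side of the number in the resulting list. Python's sorted() orders by code point, so the order may differ slightly from the locale-aware shell sort.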

Name prefix | Description | Status | Points of contact
acmechief | ACME certificate manager | In use | Traffic
acmechief-test | ACME certificate manager staging environment | In use | Traffic
alert | Alerting host (Icinga / Alertmanager) | In use | Observability
amssq | esams caching server | No longer used (deprecated) |
amslvs | esams LVS | No longer used (deprecated) |
analytics | analytics nodes (Hadoop, Hive, Impala, and various other things) | Being replaced by an-worker | Data Platform SREs
an-conf | Analytics Hadoop cluster zookeeper quorum | In use | Data Platform SREs
an-coord | Analytics Hadoop cluster coordination node (Presto and Hive) | In use | Data Platform SREs
an-db | Data Platform Postgresql database cluster | In use | Data Platform SREs
an-druid | Druid Cluster (Analytics) | In use | Data Platform SREs
an-launcher | Analytics job scheduler node | In use | Data Platform SREs
an-master | Analytics Hadoop cluster namenode | In use | Data Platform SREs
an-mariadb | Data Platform mariadb databases (analytics_meta) | In use | Data Platform SREs
an-presto | Analytics Presto cluster workers | In use | Data Platform SREs
an-redacteddb | analytics dedicated mariadb servers with sanitized data, as per the wikireplicas | In use | Data Platform SREs
an-tool | Analytics tools node (YARN UI, Turnilo) | In use | Data Platform SREs
an-test-client | Analytics Hadoop-test client (equivalent to stat servers, but for test cluster) | In use | Data Platform SREs
an-test-coord | Analytics Hadoop-test cluster coordinator (Hive, Presto, MariaDB) | In use | Data Platform SREs
an-test-master | Analytics Hadoop-test cluster namenodes | In use | Data Platform SREs
an-test-ui | Analytics Hadoop-test YARN UI | In use | Data Platform SREs
an-test-worker | Analytics Hadoop-test cluster workers | In use | Data Platform SREs
an-test-druid | Analytics Druid-test worker | In use | Data Platform SREs
an-test-presto | Analytics Presto-test worker | In use | Data Platform SREs
an-web | Analytics webserver (wikistats, published datasets) | |
an-worker | Analytics Hadoop cluster workers | In use, replacing analyticsNNNN | Data Platform SREs
an-airflow | Airflow instances provided to client teams by Data Platform Engineering | Being migrated to dse-k8s | Data Platform SREs
aphlict | notification server for Phabricator | In use | Collaboration Services
apt | Advanced Package Tool Repository (Debian APT repo) | In use | Infrastructure Foundations
apus | Apus Cephadm cluster | In use | Data Persistence
aqs | Cassandra cluster for Analytics Query Service (+others) | In use | Data Persistence
archiva | Archiva Artifact Repository | Being decommissioned | Data Platform SREs
auth | Authentication server | In use | Infrastructure Foundations
authdns | Authoritative DNS (gdnsd) | In use | Traffic
backup | Backup hosts | In use | Data Persistence
backupmon | Backup monitoring hosts | In use | Data Persistence
bast | bastion host | In use | Infrastructure Foundations
censorship | Censorship monitoring databases and scripts | No longer used (deprecated) |
centrallog | Centralized syslog | In use | Observability
cephosd | Ceph servers for use with Data Engineering and similar storage requirements | In use | Data Platform SREs
certcentral | Central certificates service | No longer used (deprecated) |
chartmuseum | Helm Chart repository ChartMuseum | In use | Service Operations
cirrussearch | Mediawiki Cirrussearch (OpenSearch) | In use | Data Platform SREs
civi | Fundraising CiviCRM | In use | FR-Tech SREs
cloud*-dev | Any cloud role + '-dev' = internal deployment (PoC, Staging, etc) | In use | WMCS
cloudbackup | Backup storage system for WMCS | In use | WMCS
cloudcephmon | Ceph monitor and manager daemon for WMCS | In use | WMCS
cloudcephosd | Ceph object storage data nodes for WMCS | In use | WMCS
cloudceph | Converged Ceph object storage and monitor nodes for WMCS (only used for testing) | No longer used |
cloudcontrol | OpenStack deployment controller for WMCS | In use | WMCS
clouddb | Wiki replica servers for WMCS | In use | WMCS, with support from DBAs and Data Platform SREs
clouddumps | Dumps distribution servers (NFS, Web, rsync) (formerly labstore) | In use | WMCS, Data Platform SRE
cloudelastic | Replication of Cirrussearch (OpenSearch) for Data Platform SRE | In use | Data Platform SRE
cloudgw | Cloud gateway server for WMCS | In use | WMCS
cloudmetrics | Monitoring server for WMCS | In use | WMCS
cloudnet | Network gateway for tenants of WMCS (Neutron l3) | In use | WMCS
cloudservices | Misc OpenStack components (Designate) for WMCS | In use | WMCS
cloudstore | Storage system for WMCS | In use | WMCS
cloudvirt | OpenStack Hypervisor (libvirtd + KVM) for WMCS | In use | WMCS
cloudvirtan | OpenStack Hypervisor (libvirtd + KVM) for WMCS (dedicated to Analytics) | No longer used |
cloudvirt-wdqs | OpenStack Hypervisor (libvirtd + KVM) for WMCS (dedicated to WDQS) | No longer used | WMCS
cloudweb | WMCS management websites (wikitech, horizon, striker) | In use | WMCS
collab | Spare hardware for single-host-per-dc Collaboration services (Phabricator, Gerrit, Contint) | Planned | Collaboration Services
conf | Configuration system host (etcd, zookeeper...) | In use | Service Operations
config-master | host running the config-master site | In use | Infrastructure Foundations
contint | Continuous Integration | In use | Collaboration Services
cp | Cache proxy (Varnish) | In use | Traffic
cumin | Cluster management (cumin/spicerack/debdeploy/etc...) | In use | Infrastructure Foundations
datahubsearch | DataHub OpenSearch Cluster - used for DataHub | In use | Data Platform SREs
dataset | dataset dumps storage | No longer used (deprecated) |
db | Database host | In use | Data Persistence
dbmonitor | Database monitoring | In use | Data Persistence
dborch | Database orchestration (MySQL Orchestrator) | In use | Data Persistence
dbprov | Database backup generation and data provisioning | In use | Data Persistence
dbproxy | Database proxy | In use | Data Persistence
dbstore | Analytics private mediawiki database replicas | In use | Data Platform SREs & Data Persistence
debmonitor | Debian packages monitoring | In use | Infrastructure Foundations
deploy | Deployment hosts | In use | Service Operations
dns | DNS recursors | In use | Infrastructure Foundations
doc | Documentation server (CI) | In use | Collaboration Services
doh | Wikidough Anycasted | In use | Traffic
druid | Druid Cluster (Public) | In use | Data Platform SREs
dse-k8s-etcd | etcd server for the kubernetes cluster of Data Science and Engineering | In use | Data Platform SREs
dse-k8s-ctrl | control plane server for the kubernetes cluster of Data Science and Engineering | In use | Data Platform SREs
dse-k8s-worker | worker node for the kubernetes cluster of Data Science and Engineering | In use | Data Platform SREs
dumpsdata | dataset generation fileset serving to snapshot hosts | In use | Data Platform SREs
durum | Check service for Wikidough | In use | Traffic
elastic | Elasticsearch servers | In use | Data Platform SREs
es | Database host for MediaWiki external storage (wiki content, compressed) | In use | Data Persistence
etcd | Etcd server | In use | Service Operations
etherpad | Etherpad server | In use | Collaboration Services
eventlog | EventLogging host | In use | Data Platform SREs
flink-zk | Dedicated zookeeper cluster for Flink | In use | Data Platform SREs
flowspec | Network controller | In use (testing) | Infrastructure Foundations
fr* | Fundraising servers, e.g. frdb, frlog, frpm (puppetmaster) | In use | FR-Tech SREs
ganeti | Ganeti Virtualization Cluster | In use | Infrastructure Foundations
ganeti-test | Ganeti Virtualization Cluster (test setup) | In use | Infrastructure Foundations
gerrit | Gerrit servers (code review) | In use | Collaboration Services & Release Engineering
gitlab | Gitlab servers (code review, CI, CD) | In use (phab:T274459) | Service Operations
grafana | Grafana server | In use | Observability
graphite | Graphite server | In use | Observability
icinga | Icinga servers | In use | Observability
idm | Identity manager (Bitu) | In use | Infrastructure Foundations
idp | Identity provider (Apereo CAS) | In use | Infrastructure Foundations
install | Installation server | In use | Infrastructure Foundations
kafka | Kafka brokers | No longer used |
kafka-main | Kafka brokers | In use | Infrastructure Foundations
kafka-jumbo | Large general purpose Kafka cluster | In use | Data Platform SREs & Infrastructure Foundations
kafka-logging | Logging/o11y Kafka cluster | In use | Observability
kafkamon | Kafka monitoring (VMs) | In use | Data Platform SREs & Infrastructure Foundations
karapace | DataHub Schema Registry server (standalone) - Used for DataHub | In use | Data Platform SREs
knsq | knams squid | No longer used (deprecated) |
krb | Kerberos KDC/Kadmin | In use | Infrastructure Foundations & Data Platform SREs
kubernetes | Kubernetes cluster (k8s) | In use | Service Operations
kubestage | Kubernetes staging cluster | In use | Service Operations
kubestagetcd | Etcd cluster for the Kubernetes staging cluster | In use | Service Operations
kubetcd | Etcd cluster for the Kubernetes cluster | In use | Service Operations
lab | labs virtual node | No longer used (deprecated) |
labcontrol | Controller node for WMCS (aka "labs") | No longer used (deprecated) |
labnet | Networking host for WMCS | No longer used (deprecated) |
labnodepool | Dedicated WMCS host for Nodepool (CI) | No longer used (deprecated) |
labpuppetmaster | Puppetmasters for WMCS | No longer used (deprecated) |
labsdb | Replication of production databases for WMCS | No longer used (deprecated) |
labservices | Services for WMCS | No longer used (deprecated) |
labtest* | Test hosts for WMCS | No longer used (deprecated) |
labvirt | Virtualization node for WMCS | No longer used (deprecated) |
labweb | Management websites for WMCS | No longer used (deprecated) |
lists | Mailing lists running Mailman | In use | Legoktm and Ladsgroup
logging-hd | Logging Cluster - OpenSearch data node (hdd class) | In use | Observability
logging-sd | Logging Cluster - OpenSearch data node (ssd class) | Planned | Observability
logging-fe | Logging Cluster - OpenSearch/OpenSearch-Dashboards/Logstash node | Planned | Observability
logstash | opensearch/logstash/opensearch-dashboards node | In use | Observability
lvs | lvs load balancer | In use | Traffic
maps | Maps cluster | In use | Content Transform Team and hnowlan
maps-test | maps test cluster | No longer used (deprecated) |
matomo | Matomo analytics server (formerly named Piwik) | In use | Data Platform SREs
mc | memcached server for mediawiki | In use | Service Operations
mc-gp | memcached gutter pool server for mediawiki | In use | Service Operations
mc-wf | memcached servers for wikifunctions | In use | Service Operations
mc-misc | memcached servers for anything else in need of memcached | Planned | Service Operations
ml-staging | Machine learning staging env etcd and control plane machines | In use | ML team
ml-serve | Machine learning serving cluster (ml-serve-ctrl* are VMs for k8s control plane) | In use | ML team
ml-cache | Machine learning caching nodes | In use | ML team
ml-lab | Machine learning experimenting/sandbox machines for ML models (similar to statboxes, but owned by ML). | In use | ML team
mirror | public mirror, e.g. Debian mirror, Ubuntu mirror | In use | Infrastructure Foundations
miscweb | miscellaneous web server | In use | Collaboration Services
moss | Old hostnames (being phased out) for the Apus Cephadm cluster | In use (but naming deprecated) | Data Persistence
ms | media storage | No longer used (deprecated) | Data Persistence (Media Storage)
ms-backup | media storage backup generation (workers) | In use | Data Persistence (Media Storage)
ms-be | media storage backend | In use | Data Persistence (Media Storage)
ms-fe | media storage frontend | In use | Data Persistence (Media Storage)
mutual-os | Mutualized (shared) opensearch cluster | Planning | Data Platform SRE
mw | MediaWiki application server (MediaWiki PHP webservers, api, jobrunners, videoscalers) | In use | Service Operations
mwdebug | MediaWiki application server for debugging and deployment staging (Ganeti VMs) | In use | Service Operations
mwlog | MediaWiki logging host | In use | Service Operations
mwmaint | MediaWiki maintenance host (formerly "terbium") | In use | Service Operations
mx | Mail relays | In use | Infrastructure Foundations
mx-out | Outbound mail relays | In use | Infrastructure Foundations
mx-in | Inbound mail relays | In use | Infrastructure Foundations
nas | NAS boxes (NetApp) | Unused |
netflow | Network visibility | In use | Infrastructure Foundations
netmon | Network monitor (librenms, rancid, etc) | In use | Infrastructure Foundations
netbox | Netbox front-end instances | In use | Infrastructure Foundations
netbox-dev | Netbox test instances | In use | Infrastructure Foundations
netboxdb | Netbox back-end database instances | In use | Infrastructure Foundations
nfs | NFS server | Unused |
peek | Security Team workflow and project management tooling | In use | Security Team
ocg | offline content generator (PDF) | No longer used (deprecated) |
ores | ORES cluster | In use | Machine Learning SREs
orespoolcounter | ORES PoolCounter | In use | Machine Learning SREs
oresrdb | ORES Redis systems | No longer used (deprecated) |
pay* | Fundraising servers, e.g. payments, pay-lb, pay-lvs | In use | FR-Tech SREs
pc | Parser cache database | In use | SRE Data Persistence (DBAs), with support from Platform and Performance
pdf | PDF Collections | No longer used (deprecated) |
people | peopleweb (people.wikimedia.org) | In use | Collaboration Services
parse | parsoid | Soon to be no longer used (deprecated) | Service Operations
parsoidtest | parsoid | Soon to be used | Service Operations
phab | Phabricator host (currently iridium is eqiad phab host) | In use | Collaboration Services
ping | Ping offload server | In use | Infrastructure Foundations
planet | Planet server | In use (mistake) | Collaboration Services
pki | PKI Server (CFSSL) | In use | Infrastructure Foundations
pki-root | PKI Root CA Server (CFSSL) | In use | Infrastructure Foundations
poolcounter | PoolCounter cluster | In use | Service Operations
prometheus | Prometheus cluster | In use | Observability
proton | Proton cluster | No longer used (deprecated) |
puppetboard | PuppetDB Web UI | In use | Service Operations
puppetdb | PuppetDB cluster | In use | Service Operations
puppetmaster | Puppet masters | In use | Infrastructure Foundations
puppetserver | Puppet Servers | In use | Infrastructure Foundations
pybal-test | PyBal testing and development | In use | Traffic
rbf | Redis Bloom Filter server | Unused |
rcs | Obsolete:RCStream server (recent changes stream) | No longer used (deprecated) |
rdb | Redis server | In use | Service Operations
registry | Docker registries | In use | Service Operations
releases | Software Releases | In use | Service Operations
relforge | Discovery's Relevance Forge (see discovery/relevanceForge.git, T131184) | In use | Search Platform SREs
restbase | Cassandra cluster for RESTBase service (+others) | In use | Data Persistence
rpki | RPKI#Validation | In use | Infrastructure Foundations
sca | Service Cluster A - Includes various services | No longer used (deprecated) |
scb | Service Cluster B - Includes various services. It's effectively the next generation of the sca cluster above | No longer used (deprecated) |
schema | Event Schemas HTTP server | In use | Data Platform SREs & Service Operations
search-loader | Analytics to Elasticsearch model data loader | In use | Search Platform SREs
sessionstore | Cassandra cluster for sessionstore | In use | Data Persistence
snapshot | Data dump processing node | In use | Data Platform SREs
sq | squid server | No longer used (deprecated) |
srv | apache server | No longer used (deprecated) |
stat | statistics computation hosts (see Analytics/Data access) | In use | Data Platform SREs
storage | storage host | No longer used (deprecated) |
stewards | special hosts for wiki stewards (see T344164) | In use | SRE Collaboration Services
testreduce | parsoid visual diff testing | In use | Service Operations
thanos-be | Prometheus long term storage backend (swift storage) | In use | Observability / Data Persistence
thanos-fe | Prometheus long term storage frontend (swift proxy software) | In use | Observability / Data Persistence
thumbor | Thumbor | In use | Service Operations (& Performance)
titan | Thanos frontends | In use | Observability
tmh | MediaWiki videoscaler (TimedMediaHandler). See T105009 and T115950. | No longer used (deprecated) |
torrelay | Tor relay | No longer used (deprecated) |
urldownloader | url-downloader | In use (added in T224551) | Service Operations
virt | labs virtualization nodes | No longer used (deprecated) |
vrts | VRTS ticketing system | In use | Collaboration Services
wcqs | wikicommons query service | In use | Data Platform SREs
wdqs | Wikidata Query Service "full" graph - deprecated (for the rationale, see T337013) | In use (deprecated, being replaced by wdqs-main, wdqs-scholarly, and possibly wdqs-categories) | Data Platform SREs
wdqs-categories | Wikidata Query Service Deepcat search graph | In testing, see T374016 |
wdqs-main | Wikidata Query Service graph split - main. See T337013 | In use | Data Platform SREs
wdqs-scholarly | Wikidata Query Service graph split - scholarly. See T337013 | In use |
webperf | webperf metrics (performance team). See T179036. | In use | Performance & Service Operations
wikikube-ctrl | Wikikube Kubernetes cluster control plane | In use | Service Operations
wikikube-worker | Wikikube Kubernetes cluster worker nodes | In use | Service Operations
wtp | wiki-text processor node (parsoid) | No longer used (deprecated) | Service Operations
xhgui | A graphical interface for PHP debug profiles. See Performance/Runbook/XHGui service. | In use | Performance & Service Operations
dragonfly-supernode | Supernode for Dragonfly P2P network (distributing docker images) (T286054) | In use | Service Operations
zuul::main / zuul::executor / zuul::trusted_runner | CI: new zuul (v3+) ganeti VMs (T393873) | In use | Collaboration Services / Release Engineering

Miscellaneous servers

Historically, we used per-datacenter naming schemes for any one-off or single host. This included any software that wasn't load balanced across multiple machines, or general task machines that could cluster (to an extent) but required opsen work to do so.

Instead of being named for their purpose, these hosts were named according to a naming convention for their datacenter:

  • Hosts in eqiad were named for chemical elements, in order of increasing atomic number.
  • Hosts in codfw were named for stars. Stars in the Orion constellation were reserved for fundraising (Alnilam, Alnitak, Bellatrix, Betelgeuse, Heka, Meissa, Mintaka, Nair Al Saif, Rigel, Saiph, Tabit, Thabit).
  • Hosts in esams or knams were named for notable Dutch people.

These naming schemes are deprecated in favour of specialized cluster names above. Even if you're certain that the foobar service will only ever use a single host, you should name that host "foobar1001" (or 2001, 3001, etc. as appropriate to the datacenter).

One-off names were easy to come up with—especially for machines that did more than one kind of thing, where it's hard to identify a single descriptive name—but they were also opaque. Engineers had to know that the eqiad MediaWiki maintenance host was "terbium" and the codfw package-build host was "deneb." Naming these machines "mwmaint1001" and "build2001" is easier for sleepy oncallers to remember in an emergency, and friendlier to new hires who have to learn all the names at once.

Some older hosts in production still use these naming schemes, but new hosts should not use them.

Category:Operations policies
  1. P&T Weekly Status Updates: 2023-12-04, Wikimedia Foundation, Google Docs (restricted)