Extension:External Data/Configuration

External Data configuration settings consist of two parts: configuration for data sources and a few other settings.

Data sources

Most of the extension's settings configure so-called data sources. They regulate calls to {{#*_external_table:}}/{{#external_value:}} in standalone mode, to the {{#get_*_data:}} parser functions in compatibility mode, and to the corresponding Lua functions. At least two data sources are relevant to any parser function call: a more specific data source overrides a more universal one, and the last one ('*') contains global fallback settings.

Settings for the data sources are stored in the associative array $wgExternalDataSources. Its keys are names of data sources and values are arrays containing settings for the data source.
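As an illustration of this structure, the following LocalSettings.php sketch defines two data sources of different specificity. The keys 'api.example.org' and 'example.org' and all values are hypothetical; the setting names are the ones documented on this page. It assumes, as this page indicates for web data, that a host is tried before its second-level domain and that '*' is consulted last.

```php
// LocalSettings.php (sketch; keys and values are assumptions for illustration).

// More universal source: applies to any host under the second-level domain.
$wgExternalDataSources['example.org'] = [
	'min cache seconds' => 3600, // cache responses for at least one hour
	'throttle interval' => 5     // at least 5 seconds between throttled calls
];

// More specific source: the host itself. Settings found here take precedence
// over those for 'example.org'; anything missing in both falls back to '*'.
$wgExternalDataSources['api.example.org'] = [
	'min cache seconds' => 60
];
```

With these two entries, a request to a URL on api.example.org would be cached for at least 60 seconds, while the throttle interval would still come from the 'example.org' entry.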

The relevant data sources for the data retrieval parser functions (and their Lua analogues) and the relevant settings are:

For each group of functions, the list below gives the order in which the indices of $wgExternalDataSources are tried (with the parser function parameter that supplies the index in parentheses) and the settings read from the matching data sources.

Any data retrieving function
Index: first the data source ID (source, etc.), last '*'.
  • params: An array of additional parameters to the data retrieving function, used to substitute wildcards (like $param$) in configuration settings. An array member like 'param' => 'default' sets the default value 'default' for 'param', while a bare 'param' makes 'param' required.
  • param filters: An array of validators for additional parameters. A validator may be a string with a delimited regular expression that should match the parameter value, or a callable that should return true. It is important to validate parameters that are substituted for wildcards, to prevent injections.
  • hidden: If set to true, this data source can only be called with {{#get_external_data:source=id|...}} or {{#get_external_data:id|...}}, and error messages will be suppressed, as if suppress error had been passed to the parser function.

{{#get_web_data:}}, mw.ext.externalData.getWebData()
Index: first the URL (url), then the host (from url), then the second-level domain (from url), last '*'.
  • replacements: Replacements in the URLs
  • allowed urls: A whitelist of URLs
  • encodings: A list of charsets to try
  • allow ssl: Whether to allow SSL
  • options: HTTP options

{{#get_soap_data:}}, mw.ext.externalData.getSoapData()
Index: as for {{#get_web_data:}}.
  • throttle key: Throttle key
  • throttle interval: Interval between two throttled calls, in seconds
  • always use stale cache: Always allow stale cache
  • min cache seconds: Cache for at least this many seconds

{{#get_file_data:}}, mw.ext.externalData.getFileData()
Index: first the file ID (file) or directory ID (directory), last '*'.
  • path: File or directory path
  • depth: Allowed directory iteration depth

{{#get_ldap_data:}}, mw.ext.externalData.getLdapData()
Index: first the LDAP domain (domain), last '*'.
  • server: LDAP server
  • user: LDAP user
  • password: LDAP password
  • base dn: Base DN

{{#get_db_data:}}, mw.ext.externalData.getDbData()
Index: first the database connection ID (db), last '*'.
  • server: Database server
  • type: Database type
  • name: Database name
  • user: Database user
  • password: User password
  • directory: SQLite directory
  • flags: Database flags
  • prefix: Table prefix
  • prepared: Prepared statement(s). If a string, this is the only prepared statement for the connection, and the query parameter in the wikitext is not needed. If an associative array, there are several prepared statements, indexed by query. Each of them can be, in turn, a string containing a prepared statement with no parameters or only string parameters, or an array of the form [ 'query' => 'SELECT ...', 'types' => 'si' /* or other parameter types */ ]
  • types: Parameter types for the prepared statements
  • cache seconds: Cache MongoDB results for this many seconds

{{#get_program_data:}}, mw.ext.externalData.getProgramData()
Index: first the program ID (program), last '*'.
  • command: Shell command
  • input: Name of the parameter that is fed into the program's standard input
  • temp: Name of the temporary file to be used instead of standard output
  • limits: Resource limits
  • env: Environment variables
  • ignore warnings: Ignore warnings that a successfully executed program may send to stderr
  • preprocess: A callable that preprocesses the program's standard input
  • postprocess: A callable that postprocesses the program's standard output
  • name: Program name for Special:Version
  • program url: Program website for Special:Version
  • version: Program version for Special:Version
  • version command: A shell command that outputs the program version for Special:Version
  • tag: Tag for the tag emulation mode
  • throttle key: Throttle key
  • throttle interval: Interval between two throttled calls, in seconds
  • always use stale cache: Always allow stale cache
  • min cache seconds: Cache for at least this many seconds
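
The database settings listed above, in particular the two forms of prepared, might be combined as in the following sketch. The connection ID 'inventory', the server, the credentials, and the table and column names are all hypothetical. The sketch also assumes that statements use ? placeholders and that 's' and 'i' in types stand for string and integer parameters; neither is confirmed on this page.

```php
// LocalSettings.php (sketch; every value below is an assumption).
$wgExternalDataSources['inventory'] = [
	'server' => 'localhost',
	'type' => 'mysql',
	'name' => 'inventory_db',
	'user' => 'wiki_reader',
	'password' => 'change-me',
	// Several prepared statements, indexed by the value of the query parameter.
	'prepared' => [
		// No parameters, so a plain string is sufficient.
		'all items' => 'SELECT id, title FROM items',
		// One string and one integer parameter, hence the array form with 'types'.
		'by category' => [
			'query' => 'SELECT id, title FROM items WHERE category = ? AND stock > ?',
			'types' => 'si'
		]
	]
];
```

Keeping the SQL in the configuration rather than in wikitext means that editors can only choose among the statements the wiki admin has defined.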

Remember that {{#*_external_table:}}/{{#external_value:}} in standalone mode or {{#get_external_data:}} can replace any of the {{#get_*_data:}} parser functions, and that mw.ext.externalData.getExternalData() can likewise replace any of the mw.ext.externalData.get*Data() Lua functions.

Any parameter of {{#get_*_data:}} can be omitted, provided it is set in the corresponding $wgExternalDataSources['…'] array. The obvious exception is the parameter used as the key to $wgExternalDataSources: url, directory, file, domain, db, program and source (see below).

The parameter source can replace url, directory, file, domain, db and program, provided that the corresponding $wgExternalDataSources['source'] contains all the settings necessary to choose and initialise the proper connector. Furthermore, if the value of source does not contain equal signs, source = can be omitted, i.e., this parameter can be passed anonymously.

Any configuration setting can include wildcards surrounded by dollar signs, like this: 'url' => 'https://raw.githubusercontent.com/lipis/flag-icons/main/flags/4x3/$iso2$.svg'. These wildcards will be substituted from additional parameters to {{#get_*_data:}}. The additional parameters should be declared as required or receive a default value in $wgExternalDataSources['…']['params'], e.g.: $wgExternalDataSources['…']['params'] = [ 'iso2' ];. It is important that a validator is set up for these parameters: $wgExternalDataSources['…']['param filters'] = [ 'iso2' => '/^[a-z]{2}$/' ];. The same mechanism can be used to form shell commands run by server-side programs.
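
To show how wildcards, defaults and validators could fit together for a server-side program, here is a minimal sketch. The program ID 'calendar' and the use of the cal utility are hypothetical, and it is an assumption that a callable validator receives the parameter value as its only argument.

```php
// LocalSettings.php (sketch; the source ID, command and values are assumptions).
$wgExternalDataSources['calendar'] = [
	// Single quotes matter: in double quotes PHP itself would try to
	// interpolate $month and $year before the extension sees the wildcards.
	'command' => 'cal $month$ $year$',
	'params' => [
		'year',         // bare name: the parameter is required
		'month' => '1'  // name => value: the parameter defaults to 1
	],
	'param filters' => [
		// Validator given as a delimited regular expression.
		'year' => '/^\d{4}$/',
		// Validator given as a callable that returns true for 1 to 12 only.
		'month' => static function ( $value ) {
			return ctype_digit( (string)$value )
				&& (int)$value >= 1
				&& (int)$value <= 12;
		}
	],
	'format' => 'text'
];
```

Because both parameters end up in a shell command, each of them has a validator, so only digits can reach the command line.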

With 'hidden' => true, a wiki admin can define hidden data sources, whose very nature is concealed from wiki users. An example of such a source:

$wgExternalDataSources['flags'] = [
	'url' => 'https://raw.githubusercontent.com/lipis/flag-icons/main/flags/4x3/$iso2$.svg',
	'params' => [ 'iso2' ],
	'param filters' => [ 'iso2' => '/^[a-z]{2}$/' ],
	'format' => 'text',
	'hidden' => true
];

If such a source is defined, the following wikitext will show the SVG code for the Israeli flag:

{{#get_external_data: flags | iso2 = il }}
{{#external_value:__text}}

Hidden data sources can only be called with {{#*_external_table:}}/{{#external_value:}} in standalone mode, with {{#get_external_data:source=(id)|...}}, or with {{#get_external_data:(id)|...}}. They cannot be called with the other, more specific {{#get_*_data:}} functions and their identifiers such as db, domain, etc.

The default value for $wgExternalDataSources is:

$wgExternalDataSources = [
	'*' => [
		'min cache seconds' => 3600,
		'always use stale cache' => false,
		'throttle key' => '$2nd_lvl_domain$',
		'throttle interval' => 0,
		'replacements' => [],
		'allowed urls' => [],
		'options' => [ 'timeout'=> 'default' ],
		'encodings' => [ 'ASCII', 'UTF-8', 'Windows-1251', 'Windows-1252', 'Windows-1254', 'KOI8-R', 'ISO-8859-1' ],
		'params' => [],
		'param filters' => []
	]
];
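
Individual defaults shown above could be adjusted without restating the whole array, as in the following sketch. It assumes that values set in LocalSettings.php are merged with the extension's defaults rather than replacing them, which this page does not state; the domain 'example.org' and the numbers are hypothetical.

```php
// LocalSettings.php (sketch; values are assumptions).

// Raise the global minimum cache time from one hour to two.
$wgExternalDataSources['*']['min cache seconds'] = 7200;

// Throttle calls to one particular second-level domain more strictly
// than the global default of 0 seconds.
$wgExternalDataSources['example.org']['throttle interval'] = 10;
```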

Other settings

Other settings are:

  • $wgExternalDataVerbose = true; shows an error message if an internal variable is not set. Note also that {{#external_value:}} allows passing a default/fallback value as its second parameter,
  • $wgExternalDataAllowGetters = true; switches on the compatibility mode, under which the {{#get_…_data:}} data retrieval functions are still available, as well as the mw.ext.externalData.get*Data() functions other than mw.ext.externalData.getExternalData().
    Without the compatibility mode, the only way of accessing external data is the standalone mode of {{#…_external_table:}} and mw.ext.externalData.getExternalData().
    When wikipages are parsed with Parsoid, only the standalone mode or the use of Lua guarantees that the data is fetched before it is displayed,
  • $wgExternalDataIntegratedConnectors = [...]; is a set of rules regulating the choice of the class that handles the connection to a data source, depending on the parameters of {{#…_external_data:}} working in standalone mode. Injecting a new rule that calls a new class extending EDConnectorBase makes it possible to add new functionality to this extension,
  • $wgExternalDataConnectors = [...]; is an additional set of rules regulating the choice of the class that handles the connection to a data source, depending on the parser function invoked and its parameters in compatibility mode,
  • $wgExternalDataParsers = [...]; is a set of rules regulating the choice of the text parser that converts text returned by an external service to variables. Injecting a new rule that calls a new class extending EDParserBase makes it possible to add new functionality to this extension.

For the default values of the last three variables, refer to the file extension.json.
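
The two boolean switches described above are ordinary global variables, so setting them could look like this sketch (the chosen values are examples only, not recommendations):

```php
// LocalSettings.php (sketch).

// Show an error message when an internal variable is not set.
$wgExternalDataVerbose = true;

// Keep the {{#get_…_data:}} parser functions and the corresponding
// Lua functions available (compatibility mode).
$wgExternalDataAllowGetters = true;
```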