Extension talk:SphinxSearch/2020

This talk page now uses LiquidThreads. All messages that were still open before the switch have been moved to a talk subpage and can be found here.
It is recommended to use SphinxSearch 0.8.5 and a recent stable release of Sphinx 2.1. Please bear in mind that this extension only handles the communication between MediaWiki and Sphinx. Any specific issues related to the search feature (character sets, ability to search with *, search categories, minimum length on search terms etc.) are handled in Sphinx (see the sphinx.conf file), and those questions should be directed to the Sphinx forum.
Development of this extension is done on a voluntary basis, and while this forum provides a platform to share experiences and solutions, it is up to its community members to contribute suggestions.

Help

When seeking help or support, please mention your system environment (SphinxSearch extension version, Sphinx version, MW version etc.); otherwise it may be difficult for people to make appropriate recommendations.

For Windows users and related issues see here, for Linux users and related issues see here, and for advice on configuring a SQLite setup see here.


When might we expect compatibility with 1.34?

Although the download page does offer a version for 1.34, in my experience it throws errors after installation, for example: [728f57963e6ba6317d0fee1a] 2020-01-30 15:54:50: Fatal exception of type "ArgumentCountError". To be fair, the page we're discussing here only shows compatibility up through 1.33. Is an update something we should expect soon? WhitWye (talk) 15:56, 30 January 2020 (UTC)

Issues with SphinxSearch in 1.34

I upgraded MediaWiki to 1.34 and SphinxSearch to the matching 1.34 release. When I run a search I get this error:

[895083e3cfe4edcfab999700] /testwiki/index.php?search=Celia&title=Spezial%3ASuche&go=Seite ArgumentCountError from line 28 of /srv/www/htdocs/testwiki/extensions/SphinxSearch/SphinxMWSearchResult.php: Too few arguments to function SphinxMWSearchResult::getTextSnippet(), 0 passed in /srv/www/htdocs/testwiki/includes/widget/search/FullSearchResultWidget.php on line 65 and exactly 1 expected

When I took a look at SphinxMWSearchResult.php, I found that getTextSnippet expects the parameter $terms:

       public function getTextSnippet( $terms ) {


However, I found this in the 1.34 release notes:

* SearchResult::getTextSnippet( $terms ) the $terms param is being deprecated
  and should no longer be passed. Search engine implementations should be
  responsible for carrying relevant information needed for highlighting with
  their own SearchResultSet/SearchResult sub-classes.


Is the error caused by a bug or by a configuration issue? 206.130.173.59 (talk) 19:22, 2 March 2020 (UTC)

This is a bug in the SphinxSearch extension code. I submitted a pull request here with the necessary changes: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/SphinxSearch/+/587360/ Lavamind (talk) 15:45, 16 April 2020 (UTC)
Thank you, Lavamind! 2601:646:C900:1338:D1E3:28DE:1BAE:1145 (talk) 20:14, 30 April 2021 (UTC)
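For 1.35, make $terms optional and add the if block:

       public function getTextSnippet( $terms = [] ) {
               global $wgAdvancedSearchHighlighting, $wgSphinxSearchMWHighlighter, $wgSphinxSearch_index;
               // Fall back to the terms stored on the result when none are passed in
               if ( empty( $terms ) ) {
                       $terms = $this->mterms;
               }
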
195.234.58.40 (talk) 11:10, 19 July 2022 (UTC)
Sorry, I didn't read all your comments: the change from Lavamind is the solution. 195.234.58.40 (talk) 11:16, 19 July 2022 (UTC)
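
For anyone reading this thread later, here is a minimal standalone sketch of the pattern discussed above: the result object remembers the search terms itself, and getTextSnippet() accepts an optional $terms so it works whether or not the caller passes any. ExampleSearchResult, $storedTerms and the naive bold-tag highlighting are all hypothetical stand-ins for illustration, not the extension's actual code or the change in the Gerrit patch.

```php
<?php
// Hypothetical stand-in for a SearchResult subclass; not the extension's real code.
class ExampleSearchResult {
	/** @var string Plain text of the matched page */
	private $text;
	/** @var string[] Search terms remembered when the result was created */
	private $storedTerms;

	public function __construct( string $text, array $storedTerms ) {
		$this->text = $text;
		$this->storedTerms = $storedTerms;
	}

	// $terms is optional, so a caller that passes no arguments does not fail.
	public function getTextSnippet( $terms = [] ) {
		if ( empty( $terms ) ) {
			// Nothing passed in: fall back to the terms carried by the result.
			$terms = $this->storedTerms;
		}
		$snippet = htmlspecialchars( $this->text );
		if ( empty( $terms ) ) {
			return $snippet;
		}
		// Naive highlighting: wrap every occurrence of any term in <b> tags.
		$quoted = array_map( function ( $term ) {
			return preg_quote( htmlspecialchars( $term ), '/' );
		}, $terms );
		return preg_replace( '/(' . implode( '|', $quoted ) . ')/i', '<b>$1</b>', $snippet );
	}
}

$result = new ExampleSearchResult( 'Celia wrote this page', [ 'Celia' ] );
echo $result->getTextSnippet(), "\n";              // no arguments, as in the error above
echo $result->getTextSnippet( [ 'page' ] ), "\n";  // terms passed explicitly
```

Both calls work because the default value removes the "exactly 1 expected" requirement while keeping older callers valid.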

Sphinx sql_query parameter broken in MW 1.34+

If anyone finds, after upgrading to MW 1.34, that recently edited pages disappear completely from the index, fear not: there's a fix.

MediaWiki is changing the manner in which content is referenced in the database. To configure Sphinx to correctly index all pages using the new schema, change the following sphinx.conf parameter:

   sql_query = SELECT page_id, page_title, page_namespace, page_is_redirect, old_id, old_text FROM page, revision, text WHERE rev_id=page_latest AND old_id=rev_text_id

to:

   sql_query = SELECT page_id, page_title, page_namespace, page_is_redirect, old_id, old_text FROM page, slots, content, text WHERE slots.slot_revision_id=page.page_latest AND content.content_id=slots.slot_content_id AND text.old_id=REPLACE(content.content_address, 'tt:', '')

Lavamind (talk) 15:58, 16 April 2020 (UTC)
Thank you! 2601:646:C900:1338:7D7:4344:E56D:609F (talk) 19:58, 12 August 2021 (UTC)
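
To illustrate what the REPLACE(...) in the new query is doing: the content table stores an address such as tt:123, where the part after the tt: prefix is the old_id of the row in the text table. The sketch below mirrors that mapping in PHP. exampleTextIdFromAddress() is a hypothetical helper written for this post, and the assumption that any address without the tt: prefix should be skipped is mine, not something stated in this thread.

```php
<?php
// Hypothetical helper mirroring REPLACE(content.content_address, 'tt:', '').
// Returns the text.old_id for a "tt:<id>" address, or null for anything else.
function exampleTextIdFromAddress( string $address ): ?int {
	if ( preg_match( '/^tt:(\d+)$/', $address, $matches ) ) {
		return (int)$matches[1];
	}
	// Assumption: addresses in any other form do not point at the text table.
	return null;
}

var_dump( exampleTextIdFromAddress( 'tt:123' ) );
var_dump( exampleTextIdFromAddress( 'something-else' ) );
```

The SQL version simply strips the prefix, so it relies on every indexed row having a tt: address.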

How to customize advanced search namespaces?

I am using SphinxSearch 0.9.1 for MediaWiki 1.31 LTS. I am trying to customize the advanced search (searching using namespaces) that is included in my Special:Search page. I have found the following manual for altering the default namespace: Manual:$wgNamespacesToBeSearchedDefault. However, the changes I make in my LocalSettings.php do not seem to be reflected on the Special:Search page.


Specifically, I have included the following code:

   $wgNamespacesToBeSearchedDefault = [ NS_HELP => false ];

in LocalSettings.php to see if it would disable the "Help" namespace from the advanced search, but there was no change to the page.

Is there something I am doing wrong? Please help. I am new to MediaWiki and PHP in general, so I may be missing something obvious. Thank you! Shmaro38 (talk) 22:15, 10 June 2020 (UTC)
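
A minimal sketch of how a setting shaped like the one above might be read, assuming it is a map from namespace ID to a boolean where true means "searched by default". exampleDefaultSearchNamespaces() and the EXAMPLE_ constants are hypothetical and only redefine the core values (NS_MAIN is 0, NS_HELP is 12) so the snippet runs on its own; this is not MediaWiki's implementation.

```php
<?php
// Redefined here so the sketch runs standalone; in MediaWiki these are NS_MAIN and NS_HELP.
const EXAMPLE_NS_MAIN = 0;
const EXAMPLE_NS_HELP = 12;

// Hypothetical helper: list the namespace IDs whose entry is true.
function exampleDefaultSearchNamespaces( array $setting ): array {
	return array_keys( $setting, true );
}

// Assigning a fresh array replaces every existing entry, not just Help.
$onlyHelpOff = [ EXAMPLE_NS_HELP => false ];
// Listing the wanted namespaces explicitly keeps Main selected.
$mainOnHelpOff = [ EXAMPLE_NS_MAIN => true, EXAMPLE_NS_HELP => false ];

var_dump( exampleDefaultSearchNamespaces( $onlyHelpOff ) );
var_dump( exampleDefaultSearchNamespaces( $mainOnHelpOff ) );
```

If that reading is right, a false entry only changes which namespaces are preselected. It would not remove the namespace from the advanced search form, which may be why the page looks unchanged.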

1.34 (possibly newer)

Line 76 of SphinxMWSearch.php is incorrectly coded.

The SearchDatabase constructor requires an ILoadBalancer instance.

Without one you get this error:

SearchDatabase::__construct() must implement interface Wikimedia\\Rdbms\\ILoadBalancer,
        instance of Wikimedia\\Rdbms\\MaintainableDBConnRef given,
        called in /var/www/html/mediawiki/extensions/SphinxSearch/SphinxMWSearch.php on line 76

To fix, change line 76 from wfGetDB( DB_REPLICA ) to wfGetLB( DB_REPLICA ) 64.148.137.131 (talk) 12:49, 11 August 2020 (UTC)
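
To show why the error appears, here is a minimal standalone sketch of a constructor with an interface type hint that is handed a connection object instead of a load balancer. Every name in it (ExampleLoadBalancer, ExampleConnection, ExampleSearchDatabase, exampleTryConstruct) is a hypothetical stand-in; none of this is MediaWiki code.

```php
<?php
// Hypothetical stand-in for the load balancer interface.
interface ExampleLoadBalancer {
	public function getConnection( int $index );
}

// Hypothetical stand-in for a database connection handle.
class ExampleConnection {
}

class ExampleLB implements ExampleLoadBalancer {
	public function getConnection( int $index ) {
		return new ExampleConnection();
	}
}

// Stand-in for a class whose constructor insists on a load balancer.
class ExampleSearchDatabase {
	protected $lb;

	public function __construct( ExampleLoadBalancer $lb ) {
		$this->lb = $lb;
	}
}

// Report whether construction succeeds or PHP rejects the argument type.
function exampleTryConstruct( $argument ): string {
	try {
		new ExampleSearchDatabase( $argument );
		return 'ok';
	} catch ( TypeError $e ) {
		return 'TypeError';
	}
}

echo exampleTryConstruct( new ExampleLB() ), "\n";         // a load balancer is accepted
echo exampleTryConstruct( new ExampleConnection() ), "\n"; // a connection is rejected
```

As far as I know, MediaWiki core also exposes the load balancer through MediaWikiServices::getInstance()->getDBLoadBalancer(), which may be worth trying if wfGetLB() is not available in your version.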

Indexing HTML

Sphinx Search can index HTML files (according to this: Indexing html with Sphinx without complex scripts). Is this possible with this extension? Henryfunk (talk) 22:37, 27 November 2020 (UTC)