File:BareBonesSearch.webm

Summary

Description
English: Bare-Bones Basics of Full-Text Search—This is a version of a presentation I gave, re-recorded to share more widely. From the introduction: "Most of what we are going to cover is actually more basic than the stuff we do with CirrusSearch, Elasticsearch, and Lucene—which power the search on Wikipedia and other wikis—but it makes for a good mental model of the basic parts of an information retrieval system, and it provides a place to build on to discuss the more complex processing we actually do today. My goal is to start with no prerequisites and go over inverted indexes, tokenization and stemming, basic boolean and proximity retrieval operations, TF/IDF and the vector space model of similarity, field-level indexing, using multiple indexes, and then touch on some of the elements of scoring."
Date
Source Own work
Author Trey Jones (WMF)

Licensing

I, the copyright holder of this work, hereby publish it under the following license:
This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.
Attribution:
Trey Jones
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.
Category:CC-BY-SA-4.0 Category:Search algorithms Category:Self-published work Category:Uploaded with video2commons Category:Videos of 2018 Category:WebM videos