wagtail / wagtail-whoosh

Licence: other
Search backend for Wagtail CMS using Whoosh engine.

Projects that are alternatives of or similar to wagtail-whoosh

wagtailyoast
Wagtail + Yoast
Stars: ✭ 22 (-4.35%)
Mutual labels:  wagtail
wagtail-react-blog
SPA built with React, Tailwind CSS and Wagtail Rest API
Stars: ✭ 66 (+186.96%)
Mutual labels:  wagtail
madewithwagtail
A showcase of sites and apps made with Wagtail CMS, the easy to use, open source Django content management system
Stars: ✭ 69 (+200%)
Mutual labels:  wagtail
wagtail-inventory
Search Wagtail pages by the StreamField blocks they contain
Stars: ✭ 45 (+95.65%)
Mutual labels:  wagtail
wagtail-treemodeladmin
An extension for Wagtail's ModelAdmin for a page explorer-like navigation of Django model relationships
Stars: ✭ 31 (+34.78%)
Mutual labels:  wagtail
devheldev
Our development site with Wagtail
Stars: ✭ 14 (-39.13%)
Mutual labels:  wagtail
pari
Django/Wagtail based PARI webapp
Stars: ✭ 32 (+39.13%)
Mutual labels:  wagtail
flask-whooshee
Customizable Flask - SQLAlchemy - Whoosh integration
Stars: ✭ 68 (+195.65%)
Mutual labels:  whoosh
wagtail-2fa
2 Factor Authentication for Wagtail
Stars: ✭ 63 (+173.91%)
Mutual labels:  wagtail
wagtail-django-recaptcha
A simple recaptcha field for Wagtail Form Pages
Stars: ✭ 47 (+104.35%)
Mutual labels:  wagtail
pipeline
The Polytechnic's content management system
Stars: ✭ 17 (-26.09%)
Mutual labels:  wagtail
wagtail-metadata-mixin
🔍 OpenGraph, Twitter Card and Schema.org snippet tags for Wagtail CMS pages
Stars: ✭ 42 (+82.61%)
Mutual labels:  wagtail
wagtailvideos
Videos for Wagtail CMS, including transcoding
Stars: ✭ 43 (+86.96%)
Mutual labels:  wagtail
wagtail textract
Text extraction for Wagtail document search
Stars: ✭ 27 (+17.39%)
Mutual labels:  wagtail
draftjs-conductor
📝✨ Little Draft.js helpers to make rich text editors “just work”
Stars: ✭ 39 (+69.57%)
Mutual labels:  wagtail
wagtail-pg-search-backend
PostgreSQL full text search backend for Wagtail CMS
Stars: ✭ 22 (-4.35%)
Mutual labels:  wagtail
wagtail.io
Source code of https://wagtail.org/
Stars: ✭ 25 (+8.7%)
Mutual labels:  wagtail
wagtailsvg
Wagtail + SVG
Stars: ✭ 26 (+13.04%)
Mutual labels:  wagtail
wagtail-react-streamfield
Powerful field for inserting multiple blocks with nesting. (NO LONGER MAINTAINED - See Wagtail 2.13 Release Notes)
Stars: ✭ 75 (+226.09%)
Mutual labels:  wagtail
wagtailclearstream
A work-in-progress app to make Wagtail's StreamField more modular
Stars: ✭ 33 (+43.48%)
Mutual labels:  wagtail

How to use

  • 0.1.x works with wagtail>=2.0,<2.2
  • 0.2.x works with wagtail>=2.2

pip install wagtail-whoosh

After installing the package, add wagtail_whoosh to INSTALLED_APPS, then configure WAGTAILSEARCH_BACKENDS:

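import os

# Directory containing this settings file
ROOT_DIR = os.path.abspath(os.path.dirname(__file__))

WAGTAILSEARCH_BACKENDS = {
    'default': {
        'BACKEND': 'wagtail_whoosh.backend',
        # Directory where Whoosh stores the index files
        'PATH': os.path.join(ROOT_DIR, 'search_index'),
        # Optional: language used for stemming and stop words
        'LANGUAGE': 'fr',
    },
}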
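
The INSTALLED_APPS step mentioned above might look like the following minimal sketch. The placeholder comment stands for whatever Django and Wagtail apps your project already lists; only the wagtail_whoosh entry comes from this README.

```python
# settings.py (sketch): append the backend's app to your existing list
INSTALLED_APPS = [
    # ... your existing Django and Wagtail apps go here ...
    "wagtail_whoosh",
]
```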

Run ./manage.py update_index as a cron job to keep the index up to date.

Features

Support autocomplete

In previous versions, finding "hello world" required searching for a complete word such as hello. Now a partial term such as hel also returns the result.

# you need to define the search field in this way
index.SearchField('title', partial_match=True)

# or this way
index.AutocompleteField('title')

Specifying the fields to search

# Search just the title field
>>> EventPage.objects.search("Event", fields=["title"])
[<EventPage: Event 1>, <EventPage: Event 2>]

Score support

# Collect scored results from two models, then sort the combined list by score
results = Page1.objects.search(query).annotate_score("_score").results()
results += Page2.objects.search(query).annotate_score("_score").results()
return sorted(results, key=lambda r: r._score)
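
To make the merging step concrete without a running Wagtail site, here is a small sketch that uses stand-in objects in place of real search results. The merge_by_score helper and the SimpleNamespace objects are hypothetical; in a real project the lists would come from the annotate_score("_score") calls shown above. It also assumes a higher score means a better match and therefore sorts in descending order; drop reverse=True to keep the ascending order used in the snippet above.

```python
from types import SimpleNamespace


def merge_by_score(*result_lists):
    """Combine several scored result lists into one, highest score first."""
    merged = [item for results in result_lists for item in results]
    return sorted(merged, key=lambda r: r._score, reverse=True)


# Stand-ins for objects returned by a scored search (hypothetical data)
pages = [
    SimpleNamespace(title="Event 1", _score=1.2),
    SimpleNamespace(title="Event 2", _score=0.4),
]
posts = [SimpleNamespace(title="Post 1", _score=0.9)]

ranked = merge_by_score(pages, posts)
print([r.title for r in ranked])
```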

Language support

Whoosh includes pure-Python implementations of the Snowball stemmers and stop word lists for various languages adapted from NLTK.

You can enable the built-in language support with a setting such as 'LANGUAGE': 'fr'. The supported languages are listed below.

('ar', 'da', 'nl', 'en', 'fi', 'fr', 'de', 'hu', 'it', 'no', 'pt', 'ro', 'ru', 'es', 'sv', 'tr')

For more control or customization, use ANALYZER instead of LANGUAGE.

An analyzer is a function or callable class (a class with a __call__ method) that takes a unicode string and returns a generator of tokens.

You can set ANALYZER using an object reference or dotted module path.

NOTE: If ANALYZER is set, LANGUAGE is ignored.

from whoosh.analysis import LanguageAnalyzer
analyzer_swedish = LanguageAnalyzer('sv')

WAGTAILSEARCH_BACKENDS = {
    'default': {
        'BACKEND': 'wagtail_whoosh.backend',
        'PATH': os.path.join(ROOT_DIR, 'search_index'),
        'ANALYZER': analyzer_swedish,
    },
}
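
The example above passes the analyzer as an object reference. The dotted module path form could look like the sketch below. The path myproject.search_analyzers.analyzer_swedish is a hypothetical name used for illustration; it assumes you have defined analyzer_swedish in a module of your own at that location. ROOT_DIR is set to the current directory here only so the fragment is self-contained.

```python
import os

# Stand-in for your project root (assumption for this sketch)
ROOT_DIR = os.path.abspath(os.curdir)

WAGTAILSEARCH_BACKENDS = {
    "default": {
        "BACKEND": "wagtail_whoosh.backend",
        "PATH": os.path.join(ROOT_DIR, "search_index"),
        # Hypothetical dotted path to an analyzer object defined in your project
        "ANALYZER": "myproject.search_analyzers.analyzer_swedish",
    },
}
```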

Optimisations

NGRAM lengths

In most cases, you can modify NGRAM_LENGTH to make the index operation faster.

The default minimum length for NGRAM words is 2, and the maximum is 8. For indexes with lots of partial match fields, or languages other than English, this could be too large. It can be customised using the NGRAM_LENGTH option:

WAGTAILSEARCH_BACKENDS = {
    'default': {
        'BACKEND': 'wagtail_whoosh.backend',
        'PATH': os.path.join(ROOT_DIR, 'search_index'),
        'NGRAM_LENGTH': (2, 4),
    },
}
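
To see why a smaller maximum length reduces indexing work, the sketch below counts the n-grams produced for a single word. It is a simplified illustration that emits every substring within the length range; the ngrams function is hypothetical, and Whoosh's own n-gram handling may differ in detail.

```python
def ngrams(word, min_len=2, max_len=8):
    """Return every substring of `word` with a length from min_len to max_len."""
    grams = []
    for size in range(min_len, min(max_len, len(word)) + 1):
        for start in range(len(word) - size + 1):
            grams.append(word[start:start + size])
    return grams


word = "international"  # 13 characters
default_grams = ngrams(word, 2, 8)  # default range described above
short_grams = ngrams(word, 2, 4)    # range from the NGRAM_LENGTH example
print(len(default_grams), len(short_grams))
```

For this word the (2, 8) range emits 63 n-grams while (2, 4) emits 33, roughly halving what has to be written to the index for each partial match field.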

Memory & CPU

By default the Whoosh indexer uses 1 processor and at most 128MB of memory. Only change these limits if you are actually hitting memory or CPU constraints; in some cases changing them does not speed up indexing. They can be set using the PROCS and MEMORY options:

WAGTAILSEARCH_BACKENDS = {
    'default': {
        'BACKEND': 'wagtail_whoosh.backend',
        'PATH': os.path.join(ROOT_DIR, 'search_index'),
        'PROCS': 4,
        'MEMORY': 2048,
    },
}

note: memory is calculated per processor, so the above configuration can use up to 8GB of memory.
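
The arithmetic behind that note can be checked with a short sketch. The helper name max_index_memory_mb is hypothetical, and it assumes MEMORY is expressed in megabytes, as the 128MB default suggests.

```python
def max_index_memory_mb(procs, memory_mb):
    """Upper bound on indexer memory: the MEMORY limit applies to each processor."""
    return procs * memory_mb


configured = max_index_memory_mb(procs=4, memory_mb=2048)
default = max_index_memory_mb(procs=1, memory_mb=128)
print(configured / 1024, "GB with the settings above;", default, "MB by default")
```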

NOT-Supported features

  1. facet is not supported.