django_mysqlfulltextsearch/README.md

104 lines
3.0 KiB
Markdown

django-mysqlfulltextsearch
==============================
django-mysqlfulltextsearch is a simple plug-in for Django that
provides access to MySQL's FULLTEXT INDEX feature available in MySQL
5.0 and up.
Although Django supports the "search" lookup in QuerySet filters
[http://docs.djangoproject.com/en/dev/ref/models/querysets/#search],
the docs specify that you must create the fulltext index yourself.
This variant, inspired by code from a blog entry by Andrew Durdin
[http://www.mercurytide.co.uk/news/article/django-full-text-search/],
includes a return value "relevance," which is the score MySQL awards
to each row returned. This is a win for small sites that do not need
a heavyweight search solution such as Lucene or Xapian. ("relevance"
is supposed to be a configurable dynamic field name, but I haven't
provided a reliable path to change it yet.)
Along with the updated API, this code provides for index discovery.
If a table has exactly one fulltext index, you can create a
SearchManager without declaring any fields at all, and it will
auto-discover the index on its own. If you specify a tuple of search
fields for which no corresponding index exists, the returned exception
will include a list of valid indices.
Standard Usage:
---------------
Create the index. For the model "book" in the app "Books":
./manage dbshell
> CREATE FULLTEXT INDEX book_title on books_book (title, summary)
Or via South:
def forwards(self, orm):
db.execute('CREATE FULLTEXT INDEX book_text_index ON books_book (title, summary)')
Using the index:
from mysqlfulltextsearch import SearchManager
class Books:
...
objects = SearchManager()
books = Book.objects.search('The Metamorphosis', ('title', 'summary')).order_by('-relevance')
> books[0].title
"The Metamorphosis"
> books[0].author
"Franz Kafka"
> books[0].relevance
9.4
If there is only one index for the table, the fields do not need to be
specified, the SearchQuerySet object can find it automatically:
from mysqlfulltextsearch import SearchManager
class Books:
...
objects = SearchManager()
books = Book.objects.search('The Metamorphosis').order_by('-relevance')
Tips:
-----
Generating the index is a relatively heavyweight process. When you
have a few thousand documents, it might be best to load them first,
then generate the index afterward.
To Do:
-----
-- Easy
Make the "relevance" dynamic field name configurable.
-- Moderate
Provide means for matching against BOOLEAN, NATURAL LANGUAGE, and
QUERY EXPANSION modes. (Preliminary experiments with this revealed
some... interesting... problems with parameter quotation.)
-- Difficult
Provide means for using a SearchManager to access indices on joined
tables, for example:
Author.objects.search("The Metamorphosis", ("book__title",))
-- Insane
Provide for a way to have FULLTEXT search indices specified in a
model's Meta class, and have syncdb or south pick up that information
and do the right thing with it.