Next: Create a test collection Up: Modifying SMART Previous: User interface

An example - Adding a stemming function

As an example, we will add a new stemming function for indexing texts in French. One of the first things you might want to do is rewrite the action "index.stem.remove_s", which attempts to reduce a plural word to its singular form. The current version of this function, in libindexing/remove_s.c, removes a trailing "s" from words not ending in "us", "ss" or "'s", and replaces a trailing "ies" with "y". An appropriate rule for French would be to remove the trailing "s" from every word and to replace the ending "aient" with "ait".
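
To make the French rule concrete, here is a minimal sketch in C. The function name remove_s_french and its signature are hypothetical and do not follow SMART's actual action interface. The sketch also assumes that only a single trailing "s" is removed and that a one-letter word is left alone.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT SMART's real action signature.
 * Copies `word` into `out` (capacity `out_size`), applying the French rule:
 *   - a word ending in "aient" gets the ending "ait";
 *   - otherwise one trailing "s" is removed (assumption: a single "s",
 *     and a one-letter word is left unchanged).
 * Returns 0 on success, -1 if `out` is too small to hold the word.
 */
int remove_s_french(const char *word, char *out, size_t out_size)
{
    size_t len = strlen(word);

    if (len + 1 > out_size)
        return -1;
    memcpy(out, word, len + 1);   /* copy including the terminating NUL */

    if (len >= 5 && strcmp(out + len - 5, "aient") == 0) {
        /* "aient" -> "ait": keep "ai", overwrite "ent" with "t" */
        strcpy(out + len - 3, "t");
    } else if (len > 1 && out[len - 1] == 's') {
        /* drop the plural marker */
        out[len - 1] = '\0';
    }
    return 0;
}
```

Under these assumptions, "maisons" becomes "maison" and "parlaient" becomes "parlait", while a word such as "chat" is copied unchanged.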

At this point, you might want to consult the available documentation to learn more about the functions you are about to replace. One useful source is the file smart.11.0/Doc/howto/defaults, which lists the default values of the standard SMART parameters and explains some of them. Another is docsmart, which reports documentation extracted automatically from the source code; for our example, the command

donalde$ docsmart remove_s
will give you some implementation details for the function.

Since we want to add a new action rather than lose the existing remove_s action, we will leave the existing function untouched and add the new action to the hierarchy under local.*.

(If you haven't read the section of this document describing the SMART action hierarchies, now would be a good time to do so.) From the docsmart output, or from an existing spec file, we see that the remove_s action sits in the hierarchy at index.stem.remove_s. We will therefore use the file containing the remove_s function as a model for our new version. Our new function will live under local.index.stem.*, so the library to modify is the one in the smart.11.0/src/liblocal/libindexing directory.
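
The idea of adding an action under local.* without touching the original can be pictured with a small, self-contained C sketch. Everything in it is an assumption made for illustration: the table layout, the action_fn signature, the stub functions and the name local.index.stem.remove_s_french are hypothetical and are not SMART's actual procedure tables.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative action signature; SMART's real prototypes differ. */
typedef int (*action_fn)(const char *word, char *out, size_t out_size);

/* Stub standing in for the existing English action (body omitted). */
static int stub_remove_s(const char *word, char *out, size_t out_size)
{
    (void)word; (void)out; (void)out_size;
    return 0;
}

/* Stub standing in for the new French action (body omitted). */
static int stub_remove_s_french(const char *word, char *out, size_t out_size)
{
    (void)word; (void)out; (void)out_size;
    return 0;
}

struct action_entry {
    const char *name;   /* full dotted name in the hierarchy */
    action_fn   proc;   /* function implementing the action  */
};

/* The existing entry stays as it is; the new one is added under local.* */
static const struct action_entry actions[] = {
    { "index.stem.remove_s",              stub_remove_s },
    { "local.index.stem.remove_s_french", stub_remove_s_french },
};

/* Return the function registered under `name`, or NULL if there is none. */
action_fn find_action(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof actions / sizeof actions[0]; i++) {
        if (strcmp(actions[i].name, name) == 0)
            return actions[i].proc;
    }
    return NULL;
}
```

Because both names resolve independently in this sketch, a lookup of index.stem.remove_s keeps returning the original function, and only a request for the local.* name reaches the new one.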

We now have everything we need to begin the modification, so let's do it:



 
Christian Meunier
1999-05-02