As an example, we will add a new stemming function for indexing
texts in the French language. One of the first things you might want
to do is to rewrite the action "index.stem.remove_s", which
attempts to reduce a plural word to its singular form. The current
version of this function, in libindexing/remove_s.c, removes a trailing
"s" from words not ending with "us", "ss" or "
s", and replaces
a trailing "ies" with "y". An appropriate rule for French would be to remove the trailing "s"
from every word, and to replace the ending "aient" with the
ending "ait".
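The French rule described above can be sketched as a small standalone C function. This is only an illustration of the rule itself: the name french_remove_s, its signature and its return convention are hypothetical and do not follow the interface that SMART actions use. It also assumes that a word consisting of the single letter "s" should be left alone rather than reduced to an empty string.

```c
#include <string.h>

/*
 * Hypothetical helper, NOT the SMART action interface.
 * Copies `word` into `out` (capacity `out_size`, including the
 * terminating NUL) and applies the French rule:
 *   - a word ending in "aient" gets the ending "ait";
 *   - otherwise a trailing "s" is removed.
 * Returns 0 on success, -1 if `out` is too small.
 */
int french_remove_s(const char *word, char *out, size_t out_size)
{
    size_t len = strlen(word);

    if (len + 1 > out_size)
        return -1;
    memcpy(out, word, len + 1);

    if (len >= 5 && strcmp(out + len - 5, "aient") == 0) {
        /* "aient" -> "ait": overwrite the final "ent" with "t" */
        out[len - 3] = 't';
        out[len - 2] = '\0';
    } else if (len > 1 && out[len - 1] == 's') {
        /* Assumption: keep a lone "s" instead of emitting an empty word */
        out[len - 1] = '\0';
    }
    return 0;
}
```

With a buffer such as char buf[32], calling french_remove_s("parlaient", buf, sizeof buf) leaves "parlait" in buf, and french_remove_s("chats", buf, sizeof buf) leaves "chat". Unlike the existing rule, no exception is made for endings such as "us" or "ss".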
At this point, you might want to consult the relevant documentation to learn more about the functions you are about to replace. A useful source of information is the file smart.11.0/Doc/howto/defaults, which lists the default values of the standard SMART parameters and explains some of them. Another is the docsmart information, which is extracted automatically from the source code; for our example, the command
donalde$ docsmart remove_s
will give you some implementation details for the function.
Since we do not want to lose the existing remove_s action, we will leave the existing function untouched and instead add a new action to the hierarchy under local.*.
(If you haven't read the section of this document describing the SMART action hierarchies, now would be a good time to do so.) From the docsmart information we gathered, or from an existing spec file, we see that the remove_s action is located in the hierarchy at index.stem.remove_s. We will thus use the file containing the remove_s functions as a model for our new version. Our new action will go under local.index.stem.*, so the library to modify is the one in the smart.11.0/src/liblocal/libindexing directory.
We now have everything we need to begin the modification, so let's do it: