A "word" character is any letter or digit or the underscore character. The definition of letters and digits may vary, for example, in the "fr" (French) locale, some character codes greater than 128 are used for accented letters, and these are matched by \w.
Here's the problem:
I'm processing some French text files, but my PC uses the English locale.
I need the "word" patterns to match any letter in the French alphabet, including characters with diacritics and the ligatures (œ, æ).
Being British, I don't want to switch my PC to use the French locale.
That would affect the whole of Windows, which would be overkill and confusing.
It would therefore be useful to have a TextPipe filter to switch the effective locale for subsequent filters.
It should only affect how TextPipe works, without changing the whole of Windows.
Most of my activities are with UTF-8 encoded files.
For the avoidance of doubt, I don't want to change the code page for the files being processed.
It's rather strange that the \w and \W patterns in TextPipe depend on the locale.
By comparison, in Notepad++ the \w and \W patterns don't seem to be locale-dependent.
NB: French is only one example; I process text files in a wide variety of languages.
These occasionally include some written in non-Roman scripts.
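
To illustrate the behaviour I'm after, here is a minimal sketch in Python (not TextPipe, and I'm not suggesting TextPipe's regex engine works this way internally). Python's re module treats \w as Unicode-aware for text patterns regardless of the OS locale, and the re.ASCII flag restricts it to [a-zA-Z0-9_]. The find_words helper and the sample strings are my own, purely for illustration.

```python
import re
import unicodedata


def find_words(text, unicode_aware=True):
    """Return the runs of \\w characters found in text.

    Hypothetical helper for illustration only.
    unicode_aware=True  : \\w matches letters and digits from any script.
    unicode_aware=False : \\w matches only [a-zA-Z0-9_] (re.ASCII).
    """
    # Compose base letter + combining accent into a single code point,
    # because a bare combining mark is not matched by \w in Python's re.
    text = unicodedata.normalize("NFC", text)
    flags = 0 if unicode_aware else re.ASCII
    return re.findall(r"\w+", text, flags=flags)


if __name__ == "__main__":
    french = "cœur naïf été"
    # ASCII-only matching splits words at accented letters and ligatures.
    print(find_words(french, unicode_aware=False))
    # Unicode-aware matching keeps each French word whole.
    print(find_words(french))
    # The same pattern also handles non-Roman scripts, with no locale change.
    print(find_words("Привет мир"))
```

Nothing here touches the Windows locale or the code page of the files; the text is simply treated as Unicode once it has been decoded from UTF-8. Something with the same effect inside TextPipe, applied only to subsequent filters, is what I'm asking for.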