The thesaurus file format will change from OOo version 1.x to 2.x.
The engine, myThes, has been developed by Kevin Hendricks (OOo
lingucomponent project lead). A standalone version is available at
The main changes introduced are:
- data is now plain text, no longer binary
- each entry can have multiple meanings and can be morphologically
This new format is incompatible with the old one, so existing thesauruses
will not work in OOo 2.0.
I'm working on a small program that translates the old thesauruses to the
new format. It is an OOo macro that accesses the thesaurus API (mainly the
com.sun.star.linguistic2.Thesaurus service available in OOo 1.1.x, and
the old .idx file, which is plain text).
Once the data is transformed (the .dat file is created), the new index
.idx file is generated using a Perl script Kevin wrote.
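To give an idea of what the index generation step involves, here is a minimal Python sketch. It is not Kevin's Perl script. It assumes a layout the post does not spell out: the .dat file starts with a line naming the encoding, each entry is a "word|N" line followed by N meaning lines, and the .idx file lists the encoding, the number of entries, then one sorted "word|byte offset" line per entry. The function name build_index and the sample data are made up for illustration.

```python
import io


def build_index(dat_bytes: bytes) -> bytes:
    """Build index content for thesaurus data held in memory.

    Assumed layout (not confirmed by the post): line 1 names the
    encoding, then each entry is a "word|N" line followed by N
    meaning lines.
    """
    stream = io.BytesIO(dat_bytes)
    encoding = stream.readline().strip()

    entries = []
    while True:
        offset = stream.tell()          # byte position of the entry line
        header = stream.readline()
        if not header:                  # end of data
            break
        if not header.strip():          # tolerate blank lines
            continue
        word, _, count = header.strip().rpartition(b"|")
        for _ in range(int(count)):     # skip this entry's meaning lines
            stream.readline()
        entries.append((word, offset))

    entries.sort()                      # index lines sorted by word
    lines = [encoding, str(len(entries)).encode("ascii")]
    lines += [word + b"|" + str(off).encode("ascii") for word, off in entries]
    return b"\n".join(lines) + b"\n"


# Hypothetical sample data, only to exercise the function.
sample = (
    b"ISO8859-1\n"
    b"simple|2\n"
    b"(adj)|easy|plain\n"
    b"(noun)|herb\n"
    b"easy|1\n"
    b"(adj)|simple|effortless\n"
)
index = build_index(sample)
print(index.decode("ascii"))
```

Working on bytes rather than decoded text keeps the offsets valid for a reader that seeks directly into the .dat file, whatever the declared encoding is.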
It is almost finished and will be released under a free licence so that
other native-lang OOo projects can transform their own thesauruses if needed.
(Post originally written by Laurent Godard on the old Nuxeo blogs.)