This article describes how to integrate the Enchant Hunspell spelling engine in Enterprise Server for use in the Multi-Channel Text Editor of Content Station.
The integration of Enchant Hunspell consists of the following steps:
- Enchant Hunspell installation
- Enterprise Server configuration
Enchant Hunspell installation
Step 1. Installing Enchant
For Mac OS:
Step 1. Check the version of MacPorts by entering the following command in the Terminal:
sudo port version
Step 2. (Optional, only when the installed version is version 1.8 or older) Update MacPorts by entering the following commands:
sudo port -d selfupdate
sudo port upgrade outdated
Step 3. Install the Hunspell spelling engine by entering the following command:
sudo port install php5-enchant
Note: This implicitly installs Aspell, MySpell and Ispell spelling engines.
Step 4. In Enterprise Server, access the Health Check page.
Step 4a. Click Advanced in the Maintenance menu or on the Home page.
Step 4b. Click Health Check.
Step 5. In the menu on the left under Configuration Info, click PHP info.
Step 6. Search for 'enchant' to make sure that Enchant is configured and loaded into PHP.
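The optional update in Step 2 applies only to MacPorts 1.8 or older. As a minimal sketch, that decision could be scripted as shown below. The function name macports_needs_update is hypothetical, and the sketch assumes that the version is passed either as a plain number (1.8.2) or in the form "Version: 1.8.2".

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the given MacPorts
# version is 1.8.x or older, which is when the update in Step 2 applies.
# Assumption: the argument looks like "1.8.2" or "Version: 1.8.2".
macports_needs_update() (
    version=${1#Version: }   # drop an optional "Version: " prefix
    major=${version%%.*}     # text before the first dot
    rest=${version#*.}       # text after the first dot
    minor=${rest%%.*}        # second version component
    [ "$major" -lt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -le 8 ]; }
)

# Possible usage on a machine with MacPorts installed:
#   if macports_needs_update "$(port version)"; then
#       sudo port -d selfupdate
#       sudo port upgrade outdated
#   fi
```

The comparison looks at the first two version components only, so 1.8.0 and 1.8.2 are both treated as "1.8 or older".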
For Windows:
Step 1. Create an enchant library folder in the PHP directory, such as C:\php\lib\enchant.
Step 2. Copy the following library files from the PHP directory into the new folder:
- libenchant.dll
- libenchant_myspell.dll
Step 3. Check if the Enchant PHP extension file is available, such as C:\php\ext\php_enchant.dll.
Note: If not available, get the extension that is compatible with your PHP installation and place it into the ‘ext’ directory (or reinstall PHP).
Step 4. Open the php.ini file and remove the semicolon (;) at the beginning of the following line so that it looks as follows:
extension=php_enchant.dll
Step 5. Restart your web server (IIS).
Step 6. In Enterprise Server, access the Health Check page.
Step 6a. Click Advanced in the Maintenance menu or on the Home page.
Step 6b. Click Health Check.
Step 7. In the menu on the left under Configuration Info, click PHP info.
Step 8. Search for 'enchant' to make sure that Enchant is configured and loaded into PHP.
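Before restarting IIS, it can help to confirm that the extension line from Step 4 is really active in php.ini. The sketch below assumes that a POSIX shell such as Git Bash or Cygwin is available on the Windows machine. The function name enchant_extension_enabled is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads php.ini text from standard input and succeeds
# when it contains an active (not commented out) php_enchant.dll line.
# A line that still starts with a semicolon does not match.
enchant_extension_enabled() {
    grep -Eq '^[[:space:]]*extension[[:space:]]*=[[:space:]]*php_enchant\.dll'
}

# Possible usage (the php.ini path is an assumption, adjust it as needed):
#   enchant_extension_enabled < /c/php/php.ini && echo "Enchant line is active"
```

This only checks the configuration file. The PHP info page remains the place to confirm that the extension is actually loaded.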
For Linux:
Step 1. Create a temporary Enchant folder:
mkdir /EnchantWorkDirectory
Step 2. Go to the temporary folder:
cd /EnchantWorkDirectory
Step 3. Download the Enchant package:
wget http://www.abisource.com/downloads/enchant/1.6.0/enchant-1.6.0.tar.gz
Step 4. Extract the Enchant package:
tar xvzf enchant-1.6.0.tar.gz
Step 5. Go to the extracted Enchant folder:
cd enchant-1.6.0
Step 6. Configure Enchant:
./configure
Note: You could encounter something like below:
No package 'glib-2.0' found
No package 'gmodule-2.0' found
Consider adjusting the PKG_CONFIG_PATH environment variable if you have installed software in a non-standard prefix. Alternatively, you may set the environment variables ENCHANT_CFLAGS and ENCHANT_LIBS to avoid the need to call pkg-config. See the pkg-config man page for more details.
To solve this, install the 2 missing packages:
yum -y install glib2-devel
and configure Enchant (again):
./configure
Step 7. Make sure ‘Build Myspell/Hunspell backend’ is set to ‘yes’ as below:
enchant-1.6.0
prefix: /usr/local
compiler: gcc
Build Aspell backend: yes
Build Ispell backend: yes
Build Uspell backend: no
Build Hspell backend: no
Build Myspell/Hunspell backend: yes
Build Voikko backend: no
Build Zemberek backend: no
Build a relocatable library: no
Step 8. Compile Enchant:
make
Note: When you encounter something like below:
g++: command not found
Do the following:
yum -y install gcc-c++
make clean
./configure
Make sure that after this command, ‘Build Myspell/Hunspell backend’ is set to ‘yes’ (see above).
make
Step 9. Install Enchant:
make install
Note: If you encounter something like below:
libtool: link: unsupported hardcode properties
Do the following:
make clean
./configure
Make sure that after this command, ‘Build Myspell/Hunspell backend’ is set to ‘yes’ (see above).
make install
Step 10. Check the installed Enchant version:
enchant -v
Note: It should show something like:
@(#) International Ispell Version 3.1.20 (but really Enchant 1.6.0)
Step 11. Create (and go to) a temporary folder:
mkdir /workDirectory
cd /workDirectory
Step 12. Install all the dependencies:
yum -y install mysql mysql-server mysql-devel perl-DBD-MySQL perl-DBI httpd httpd-devel httpd-suexec apr apr-devel apr-util apr-util-devel gd gd-devel gd-progs libjpeg-devel libpng-devel freetype-devel freetype-utils libxml2-devel curl-devel libX11-devel gd-devel
Step 13. Download and extract the PHP package:
wget -O php-5.3.6.tar.gz http://nl2.php.net/get/php-5.3.6.tar.gz/from/this/mirror
tar xvzf php-5.3.6.tar.gz
Note: In case the URL is no longer valid, search on php.net and replace the http link above.
Step 14. Go to the PHP folder:
cd php-5.3.6
Step 15. Configure PHP for 32-bit or 64-bit (the two commands differ in the --build, --host, --target, --libdir and --with-libdir options):
For 32-bit:
./configure --build=i386-redhat-linux --host=i386-redhat-linux --target=i386-redhat-linux-gnu --program-prefix= --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib --with-libdir=lib --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/usr/com --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-config-file-path=/etc --with-config-file-scan-dir=/etc/php.d --with-iconv --enable-exif --with-xmlrpc --with-openssl --enable-soap --with-libxml-dir=/usr --enable-bcmath --enable-ctype --enable-dom --enable-fileinfo --enable-filter --enable-hash --enable-json --enable-libxml --enable-pdo --enable-phar --enable-session --enable-simplexml --enable-tokenizer --enable-xml --enable-xmlreader --enable-xmlwriter --with-bz2=/usr --with-mhash=/usr --with-zlib=/usr --without-pear --disable-cgi --with-mysql=/usr --with-mysql-sock=/var/lib/mysql/mysql.sock --with-mysqli=/usr/bin/mysql_config --with-png-dir=/usr --with-gd --with-jpeg-dir=/usr --enable-mbstring=all --enable-sockets --with-enchant --with-apxs2
For 64-bit:
./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu --program-prefix= --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --with-libdir=lib64 --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/usr/com --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-config-file-path=/etc --with-config-file-scan-dir=/etc/php.d --with-iconv --enable-exif --with-xmlrpc --with-openssl --enable-soap --with-libxml-dir=/usr --enable-bcmath --enable-ctype --enable-dom --enable-fileinfo --enable-filter --enable-hash --enable-json --enable-libxml --enable-pdo --enable-phar --enable-session --enable-simplexml --enable-tokenizer --enable-xml --enable-xmlreader --enable-xmlwriter --with-bz2=/usr --with-mhash=/usr --with-zlib=/usr --without-pear --disable-cgi --with-mysql=/usr --with-mysql-sock=/var/lib/mysql/mysql.sock --with-mysqli=/usr/bin/mysql_config --with-png-dir=/usr --with-gd --with-jpeg-dir=/usr --enable-mbstring=all --enable-sockets --with-enchant --with-apxs2
Note: If you see this error:
configure: error: Please reinstall the BZip2 distribution
You can fix it by running the following command and then running ./configure (as described above) again:
yum install bzip2-devel
Step 16. Compile and install PHP:
make
make install
Step 17. Copy the PHP configuration file and change it according to the needs of Enterprise:
cp php.ini-development /etc/php.ini
Step 18. In Enterprise Server, access the Health Check page.
Step 18a. Click Advanced in the Maintenance menu or on the Home page.
Step 18b. Click Health Check.
Step 19. In the menu on the left under Configuration Info, click PHP info.
Step 20. Search for 'enchant' to make sure that Enchant is configured and loaded into PHP.
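Two of the manual checks in the Linux steps above (the configure summary in Step 7 and the final check for 'enchant') can also be done from the command line. The sketch below is one possible way, with hypothetical function names. Both functions read their input from standard input. Note that the command-line PHP may read a different php.ini than the web server, so the PHP info page stays the authoritative check.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the ./configure summary on standard
# input reports the Myspell/Hunspell backend as enabled (see Step 7).
hunspell_backend_enabled() {
    grep -Eq 'Build Myspell/Hunspell backend:[[:space:]]*yes'
}

# Hypothetical helper: succeeds when the module list on standard input
# (as printed by "php -m") contains a line that reads exactly "enchant".
enchant_module_loaded() {
    grep -qix 'enchant'
}

# Possible usage:
#   ./configure > configure.log 2>&1
#   hunspell_backend_enabled < configure.log || echo "Hunspell backend is off"
#   php -m | enchant_module_loaded && echo "Enchant is loaded in PHP (CLI)"
```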
Step 2. Installing a dictionary
For Mac OS:
Step 1. Download your preferred Hunspell dictionary from OpenOffice Dictionaries in ZIP file format.
Step 2. Extract the downloaded language dictionary ZIP file into a temporary folder.
Step 3. Copy the Hunspell dictionary files with .aff and .dic extensions from the temporary folder to the following directory:
/usr/share/myspell/dicts
For Windows:
Step 1. Download your preferred Hunspell dictionary from OpenOffice Dictionaries in ZIP file format.
Step 2. Extract the downloaded language dictionary ZIP file into a temporary folder.
Step 3. Copy the Hunspell dictionary files with .aff and .dic extensions from the temporary folder to the following directory:
C:\php\share\myspell\dicts
For Linux:
Step 1. Download your preferred Hunspell dictionary from OpenOffice Dictionaries in ZIP file format.
Step 2. Extract the downloaded language dictionary ZIP file into a temporary folder.
Step 3. Copy the Hunspell dictionary files with .aff and .dic extensions from the temporary folder to the following directory:
/usr/share/myspell/dicts
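On Mac OS and Linux, the copy in Step 3 can be scripted. The following is a minimal sketch with a hypothetical function name, install_dictionaries. It copies only the .aff and .dic files and leaves other files from the ZIP file (such as readme files) behind.

```shell
#!/bin/sh
# Hypothetical helper: copies every .aff and .dic file from the folder in
# which the ZIP file was extracted ($1) to the dictionary folder ($2).
# Fails when no dictionary file is found or when a copy fails.
install_dictionaries() (
    src=$1
    dest=$2
    found=1
    for f in "$src"/*.aff "$src"/*.dic; do
        [ -e "$f" ] || continue   # skip a pattern that matched nothing
        cp "$f" "$dest"/ || exit 1
        found=0
    done
    exit "$found"
)

# Possible usage (the temporary folder name is an assumption); run it as a
# user who is allowed to write to the dictionary folder:
#   install_dictionaries /tmp/en_US /usr/share/myspell/dicts
```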
Enterprise Server configuration
The Enterprise Server configuration consists of the following steps:
The available dictionaries need to be added to the configserver.php file by performing the following steps:
Step 1. In the configserver.php file, locate the ENTERPRISE_SPELLING option.
The structure is as follows:
define('ENTERPRISE_SPELLING', serialize( array(
    0 => array(
        <list of dictionaries>
    ),
)));
Tip: The list of dictionaries can be as long as needed and may contain dictionaries from different spelling engines.
The structure for a dictionary looks as follows:
'American English' => array(
    'language' => 'enUS',
    'wordchars' => '/(['.WORDCHARS_LATIN.']+)/u',
    'serverplugin' => 'HunspellShellSpelling',
    'location' => '/opt/local/bin/hunspell',
    'dictionaries' => array( 'en_US' ),
    'suggestions' => 10,
),
language
The language code in llCC format (ll = language code, CC = country code; for example, enUS for American English).
- Language code: the ISO 639 standard is used (see http://en.wikipedia.org/wiki/ISO_639). Codes can be looked up here: http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes.
- Country code: the ISO 3166 standard is used (see http://en.wikipedia.org/wiki/ISO_3166). Codes can be looked up here: http://en.wikipedia.org/wiki/ISO_3166-1.
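A value for the language option can be checked against the llCC pattern before it is entered in the configserver.php file. The sketch below uses a hypothetical function name and only checks the shape of the value (two lowercase letters followed by two uppercase letters). It does not check that the codes exist in ISO 639 or ISO 3166.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when $1 has the llCC shape, such as enUS.
# LC_ALL=C keeps the a-z and A-Z ranges limited to plain ASCII letters.
is_llcc() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z]{2}[A-Z]{2}$'
}

# Possible usage:
#   is_llcc enUS  && echo "enUS has the llCC shape"
#   is_llcc en_US || echo "en_US is a dictionary name, not an llCC code"
```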
wordchars
A definition of the type of characters used by the spell checker.
Use this feature to specify which characters the spell checker should check and which to ignore. This method makes it possible to have numbers, symbols, or characters from foreign languages ignored, even when these characters are for instance added to a valid word.
What characters to include is specified through ranges of Unicode (UTF-16) index numbers. A range is specified in such a way that it can be used directly in regular expressions. For example ‘A-Z’ means all alphabetic characters in uppercase.
Note: For additional information and a list of Latin, Russian, and Japanese ranges, see the WORDCHARS_ section in the configserver.php file.
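To get a feel for how a character range splits text into words, the effect can be imitated on the command line. The sketch below is an illustration only: it uses grep instead of the PHP regular expression that Enterprise Server applies, and the range A-Za-z is a stand-in, because the actual value of WORDCHARS_LATIN is not shown in this article. The function name split_words is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints every run of characters from the range in $1
# that is found in the text on standard input, one run per line.
# Characters outside the range act as separators and are not printed.
split_words() {
    LC_ALL=C grep -oE "[$1]+"
}

# Possible usage; the digit is ignored and splits the second word in two:
#   printf 'Hello w0rld\n' | split_words 'A-Za-z'
# prints:
#   Hello
#   w
#   rld
```

When a real text is split in unexpected places like this, the range in the wordchars option is the first thing to check.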
serverplugin
The internal name of the server plug-in that integrates the spelling engine.
This name can be found in the Enterprise/server/plugins folder for standard shipped plug-ins (such as Hunspell) or in the Enterprise/config/plugins folder for custom plug-ins (such as the ones downloaded from Labs). Use the folder name of the spelling plug-in, respecting camel case:
- Hunspell: HunspellShellSpelling
- Google: GoogleWebSpelling
- Aspell: AspellShellSpelling
- Enchant: EnchantPhpSpelling
location
Full file path to the engine’s executable file, or web URL to the engine’s Web service entry point.
Always use forward slashes (/) to separate folders, even for Windows.
Check the installation path by doing the following:
- Mac OS / Linux: Enter one of the following commands:
which hunspell
which aspell
which enchant
- Windows: There is no direct way to find the location of installed spelling engines. Generally, spelling engines are installed in the Program Files folder.
Example: C:/Program Files/Hunspell/hunspell.exe
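A Windows path that was copied from Explorer or a command prompt contains backslashes and needs to be converted before it is used in the location option. A minimal sketch, with a hypothetical function name:

```shell
#!/bin/sh
# Hypothetical helper: prints the path in $1 with every backslash replaced
# by a forward slash, as required for the location option.
to_forward_slashes() {
    printf '%s\n' "$1" | tr '\\' '/'
}

# Possible usage:
#   to_forward_slashes 'C:\Program Files\Hunspell\hunspell.exe'
# prints:
#   C:/Program Files/Hunspell/hunspell.exe
```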
dictionaries
The names of dictionaries installed for the spelling engine.
Example: 'dictionaries' => array( 'en_US' ),
suggestions
The maximum number of suggestions for a misspelled word to display to end-users.
Enter a value between 1 and 10.
doclanguage
(Optional) Document’s language code. Used in InDesign/InCopy to pre-select the dictionary for a certain article text fragment for spell checking.
When the doclanguage is not specified (which is default behavior), Enterprise Server derives this value from the language option (see above).
Note: When editing an article in Content Station and subsequently editing it in InDesign/InCopy, it is important that the same dictionaries are used in InDesign/InCopy as well. If the incorrect dictionary is used in InDesign/InCopy, the doclanguage option can be used to point it to the correct dictionary.
Example: A Dutch dictionary ‘nlNL’ has been configured in the language option. Let’s assume that InDesign/InCopy subsequently uses the ‘Dutch: Old Rules’ dictionary, whereas we want it to use the ‘Dutch: 2005 Reform’ dictionary instead. To resolve this, follow these steps:
Step 1. Look up the getLanguageCodesTable() function in the configlang.php file and verify the entry for nlNL:
'nlNL' => 'Dutch',
This means that the ‘Dutch’ token stands for the ‘Dutch: Old Rules’ dictionary.
Step 2. Open a new InCopy CS5 article.
Step 3. In the Paragraph Styles panel, edit the Basic Paragraph Style as follows (optionally create a new paragraph style and edit that one):
Step 3a. Select Advanced Character Formats.
Step 3b. From the Language list, choose the required dictionary, in our example Dutch 2005 Reform.
Step 3c. Close the dialog box.
Step 4. With the paragraph style applied, enter some text.
Step 5. Save the article to the Desktop.
Step 6. Open the article in a plain-text editor or Web browser.
Step 7. Search for the language tag named AppliedLanguage. It should show the value used internally, in our example $ID/nl_NL_2005.
Step 8. Use the value following ‘$ID/’ to add the doclanguage option to the dictionary configuration structure in the configserver.php file:
'doclanguage' => 'nl_NL_2005',
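The search for AppliedLanguage in the saved article can also be done from the command line. The sketch below assumes that the value is stored in the form AppliedLanguage="$ID/nl_NL_2005" (an attribute with a quoted value); the article does not spell out the exact notation, so adjust the pattern when the file looks different. The function name applied_languages is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints each distinct AppliedLanguage value found in
# the saved article file ($1), without the leading "$ID/".
# Assumption: values are written as AppliedLanguage="$ID/<value>".
applied_languages() {
    grep -o 'AppliedLanguage="\$ID/[^"]*"' "$1" |
        sed 's/^AppliedLanguage="\$ID\///; s/"$//' |
        sort -u
}

# Possible usage (the file name is an assumption):
#   applied_languages ~/Desktop/article.icml
```

Each printed value is a candidate for the doclanguage option.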
Step 2. Enter as many dictionaries as required.
Step 3. Save the file.
The Enchant Server plug-in is shipped with Enterprise. Verify that it has been installed correctly and that it is running properly by checking its status on the Server Plug-in page in Enterprise Server.
In Enterprise Server, click Server Plug-ins in the Maintenance menu or on the Home page.
When the whole spelling feature needs to be disabled, disable all installed spell checking plug-ins on the Server Plug-ins page. When this is needed during production (for instance for maintenance reasons), make sure that all Content Station users are logged out, or ask them to log in again.
To verify if the spell checking functionality has been integrated successfully, follow these steps:
Step 1. Run the following page in a Web browser:
http://<Server URL>/server/wwtest/spelling/workbench.php
Step 2. From the Dictionary list, choose the dictionary to test.
Step 3. Enter some text in the main window.
Step 4. Click Check Spelling.
The following columns appear:
- Checked Words. Shows how text is split up into words. This is based on the regular expression specified in the wordchars option in configserver.php. When words are split up incorrectly, verify the wordchars option.
- Misspelled Words. Shows the spell checking results of the spelling engine. When incorrect suggestions are made, check the installed dictionaries for your spelling engine.
Using multiple servers
When Enterprise Server is replicated over multiple Server machines, repeat the installation steps as listed in Integrating the Enchant Hunspell spelling engine in Enterprise Server 9 for each machine.
It is recommended to keep an exact copy of the configserver.php file on all Server machines. In that case, make sure that the spelling engine installation paths are exactly the same on every machine, including the installation paths of the dictionaries.
The Enterprise Health Check does not check installation or configuration differences between the Server machines. The system administrator is responsible for keeping the machines in sync; this includes the installed engine versions and dictionaries. The same applies to words that are added to the dictionaries (which is not supported by Enterprise, but can be done manually through the shell or command prompt).
When you use the global entry point URL to index.php (as used by client applications and configured in the WWSettings.xml file), the Health Check is assigned to whichever machine is available, so only the configuration paths on that machine are checked. Even running it many times does not guarantee that all machines are hit. Instead, run the Health Check through a local URL on each machine.
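Because the Health Check does not compare machines, the administrator needs another way to spot differences between the dictionaries. One possible approach on Mac OS and Linux, sketched below with a hypothetical function name, is to print a checksum list of the dictionary folder on each machine and compare the lists.

```shell
#!/bin/sh
# Hypothetical helper: prints one "checksum size name" line (cksum output)
# for each .aff and .dic file in the folder $1, sorted by file name.
# Identical folders on two machines produce identical output.
dictionary_manifest() (
    cd "$1" || exit 1
    for f in *.aff *.dic; do
        [ -e "$f" ] || continue   # skip a pattern that matched nothing
        cksum "$f"
    done | sort -k 3
)

# Possible usage: run this on each Server machine and compare the files,
# for example with diff (the output file name is an assumption):
#   dictionary_manifest /usr/share/myspell/dicts > "dicts-$(hostname).txt"
```

The same idea can be applied to the configserver.php file itself by comparing its cksum output between machines.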