This article explains how Solr Search stores data in its index and how a search string is compared with the data in the index.
Use this information to see if the Solr settings need to be adjusted to provide the end users with better search results.
How data is stored in the index
Any object in Enterprise (such as an article or a layout) is stored with various types of metadata. This data is also indexed by Solr for quick access whenever a search is performed.
This data is stored in different ways:
- Without change (such as the Issue ID)
- Split into 'tokens' (such as a file name or the slug line) by using white spaces and punctuation characters as delimiters.
Example: The following sentence:
"Please, email email@example.com by 03-09, re: m37-xq."
is split into the following tokens:
"Please", "email", "email@example.com", "by", "03-09", "re", "m37-xq"
By default, the maximum token length varies between 4 and 15 characters, meaning that tokens containing more characters are not included in the search results.
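The splitting described above can be approximated in a few lines of Python. This is only a rough sketch and not Solr's actual tokenizer: `tokenize` is a hypothetical helper that assumes pieces are separated by white space and that punctuation is removed only at the start and end of each piece. Under those assumptions it reproduces the token list from the example.

```python
import string


def tokenize(text):
    """Rough approximation of the splitting described in this article.

    Assumption: pieces are separated by white space, and punctuation is
    only removed at the start and end of a piece. Solr's real tokenizer
    is configured in its schema and may behave differently.
    """
    tokens = []
    for piece in text.split():
        token = piece.strip(string.punctuation)
        if token:  # skip pieces that consisted of punctuation only
            tokens.append(token)
    return tokens


sentence = "Please, email email@example.com by 03-09, re: m37-xq."
print(tokenize(sentence))
# ['Please', 'email', 'email@example.com', 'by', '03-09', 're', 'm37-xq']
```

Because punctuation inside a piece is left alone, the e-mail address and the hyphenated values stay together as single tokens.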
The content in which Solr searches
Another aspect of Solr which might influence the search results is that only the following types of content are referenced:
- Plain content
When a search term is used that is not part of these types of content, the search result turns up empty.
Using the Solr Analyze tool
The Solr Analyze tool can be used for analyzing how Solr interprets and processes a search query.
Note: The Analyze tool does not search the actual Solr index; it only simulates how text is processed.
The following steps apply to the Analyze tool of Solr 4.5 or higher:
Step 1. Open the config_solr.php file.
<Enterprise Server path>/config
Step 2. Copy the URL which points to Solr.
Step 3. Paste the URL in the address field of a Web browser.
Step 4. From the Dashboard on the left of the page, choose the Core Selector.
Step 5. From the menu below the chosen core selector, choose Analysis.
The Analysis page appears.
Step 6. Verify that the Analyse Fieldname field is set to WW_CATCHALL.
Note: The WW_CATCHALL field is the field which Solr uses to perform the search. By default, this field consists of the content of the slugline, name, description, plain content, and the keywords.
Step 7. In Field value (Index), enter the name of an object or some random text, and in Field value (Query) enter a search string.
Step 8. Click Analyze Values.
The result of how the search string is broken up into tokens is displayed. If one of the tokens matches a token within the index, you have a hit. All hits appear highlighted in purple.
In the following example, the search string '1234' has returned a hit:
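The matching rule used by the Analyze tool (a hit occurs when a token of the search string equals a token of the indexed value) can be sketched in Python. This is a simplified illustration only: the `tokenize` and `find_hits` helpers are hypothetical, the lower-casing step is an assumption, and the index value is made up for the example. The real analysis chain is defined by the Solr schema.

```python
import string


def tokenize(text):
    # Assumption: split on white space, strip punctuation at both ends,
    # and lower-case each token so that matching ignores case.
    pieces = (piece.strip(string.punctuation).lower() for piece in text.split())
    return [token for token in pieces if token]


def find_hits(index_value, query):
    """Return the query tokens that also occur among the index tokens."""
    index_tokens = set(tokenize(index_value))
    return [token for token in tokenize(query) if token in index_tokens]


# Hypothetical values for "Field value (Index)" and "Field value (Query)"
index_value = "Budget report 1234, final version"
print(find_hits(index_value, "1234"))       # ['1234']
print(find_hits(index_value, "report 99"))  # ['report']
print(find_hits(index_value, "summary"))    # []
```

An empty list corresponds to a search that returns no hit; any non-empty list corresponds to the highlighted tokens on the Analysis page.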
Testing the Solr integration
To test with the actual Solr index of your Enterprise installation, perform the following steps:
Info: These steps apply to Solr 3.6 only.
Step 1. Open the admin page of Solr in a Web browser.
Step 2. In the Query String field enter your search string.
Step 3. Click Search.
The result will be in XML format, starting with the query and followed by the number of hits.
The hits refer to documents within Solr, so they should match what you see in a client such as Content Station.
Use these steps to check whether Solr itself is returning results. If Solr returns hits but the client does not show any results, this could indicate a technical issue. If so, please contact WoodWing Support.
Be aware, however, that Enterprise takes into account not only the search string but also the selected Brand, Issue, and so on. The real query is therefore probably more complex.
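The same check can be scripted. The sketch below builds a query URL for Solr's standard `/select` handler and reads the number of hits from the `numFound` attribute of an XML response. The base URL, the helper names `build_select_url` and `count_hits`, and the sample response (including its `ID` field) are assumptions for illustration; use the Solr URL from your own config_solr.php file, and expect a real response to contain more fields.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_select_url(solr_url, search_string):
    """Build a query URL for Solr's standard /select handler."""
    params = urlencode({"q": search_string, "wt": "xml"})
    return solr_url.rstrip("/") + "/select?" + params


def count_hits(response_xml):
    """Read the number of hits from a Solr XML response."""
    root = ET.fromstring(response_xml)
    result = root.find("result")  # <result name="response" numFound="...">
    return int(result.get("numFound"))


# Illustrative response written by hand; not output from a real server.
sample = """<response>
  <lst name="responseHeader">
    <lst name="params"><str name="q">1234</str></lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc><str name="ID">42</str></doc>
  </result>
</response>"""

# Hypothetical base URL; replace it with the URL of your Solr installation.
print(build_select_url("http://localhost:8983/solr/", "1234"))
print(count_hits(sample))
```

This sketch sends only the search string. It does not reproduce the additional criteria that Enterprise adds to the query, so the hit count can be higher than what a client shows.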
- 29 November 2017: Updated section 'Using the Solr Analyze tool' by removing the steps for Solr 3.6.