Tuesday, December 27, 2011

Drive with FAST search for SharePoint 2010 installation is running out of space

On one of my projects I kept seeing the “D:\FAST Search\data\ftStorage\sequences” folder quickly fill up with files such as storage_6a.data and storage_cd.meta, causing the drive to run out of space. That location stores index (“fixml”) information used for fault tolerance (ft) in a backup indexing row. We deleted these files at one point because there was also an environment health issue, hoping that the health issue was what caused the folder to fill up, but the files came back and quickly filled the drive again. Here is what I found while searching for the cause.

1. This is likely to happen if one or more servers are configured as a “secondary” indexing row. See this example from an FS4SP deployment.xml file:

<searchcluster>

                <row id="0" index="primary" search="true" />

                <row id="1" index="secondary" search="true" />

</searchcluster>

The options are to disable the secondary/backup indexer(s), or add more disk space to the backup row.

If the data in this folder was deleted, the backup index data is likely out of sync and will need to be resynchronized with this procedure:

“Synchronize the primary and the backup indexer servers (FAST Search Server 2010 for SharePoint)” - http://technet.microsoft.com/en-us/library/gg482028.aspx
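While deciding between these options, it can help to watch how fast the folder actually grows. The following is only a minimal sketch, assuming Python is available on the server; the function name folder_usage is made up for illustration, and the path is the one from this post.

```python
import os


def folder_usage(path):
    """Return (file_count, total_bytes) for all files under path."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
                count += 1
            except OSError:
                # A file may be removed or locked while we scan; skip it.
                pass
    return count, total


if __name__ == "__main__":
    # Path taken from this post; adjust it to your own FAST installation.
    files, size = folder_usage(r"D:\FAST Search\data\ftStorage\sequences")
    print(f"{files} files, {size / 1024 ** 3:.2f} GB")
```

Running it a few times an hour apart gives a rough growth rate to compare against the free space on the drive.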

2. In our case the box that had this issue was not configured as a secondary row, but was a primary row.

The deployment.xml did not use row 0 as the primary row, which is required (ref http://technet.microsoft.com/en-us/library/ff354931.aspx#element_searchcluster).

As a result, the ftStorage files were neither placed on the intended server nor rotated, and they ate up the disk space.

Here is the deployment.xml excerpt we had:

<host name="FAST01">

<admin/>

<crawler role="single"/>

<webanalyzer lookup-db="true" link-processing="true" max-targets="2" server="true"/>

<document-processor processes="4"/>

<content-distributor id="0"/>

<indexing-dispatcher/>

<query/>

<searchengine column="0" row="0"/>

</host>

<host name="FAST02">

<document-processor processes="4"/>

<content-distributor id="1"/>

<indexing-dispatcher/>

<query/>

<searchengine column="0" row="1"/>

</host>

<searchcluster>

<row id="1" search="true" index="primary"/>

<row id="0" search="true" index="secondary"/>

</searchcluster>
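A misconfiguration like the one above is easy to miss by eye. As a rough sketch only (this is not a Microsoft tool, and the helper names local_name and check_rows are made up for illustration), a few lines of Python can flag a searchcluster whose row 0 is not the primary indexing row:

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}row' down to 'row'."""
    return tag.rsplit("}", 1)[-1]


def check_rows(xml_text):
    """Return a list of problems found in the <searchcluster> row setup."""
    root = ET.fromstring(xml_text)
    problems = []
    for cluster in root.iter():
        if local_name(cluster.tag) != "searchcluster":
            continue
        # Map row id -> index role, e.g. {"0": "primary", "1": "secondary"}
        roles = {
            row.get("id"): row.get("index")
            for row in cluster
            if local_name(row.tag) == "row"
        }
        role0 = roles.get("0")
        if role0 != "primary":
            problems.append(f"row 0 has index={role0!r}, expected 'primary'")
        for row_id, role in roles.items():
            if role == "primary" and row_id != "0":
                problems.append(f"row {row_id} is marked primary, only row 0 should be")
    return problems


if __name__ == "__main__":
    # The <searchcluster> section from this post, wrapped in a root element.
    sample = """<deployment><searchcluster>
      <row id="1" search="true" index="primary"/>
      <row id="0" search="true" index="secondary"/>
    </searchcluster></deployment>"""
    for problem in check_rows(sample):
        print(problem)
```

The check matches on local element names so that it does not depend on whether the file declares an XML namespace. When reading a real deployment.xml from disk, pass the file contents as bytes if the file starts with an XML declaration that names an encoding.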

Once the row IDs were corrected and I ran Set-FASTSearchConfiguration on the admin box as well as the non-admin box, the issue went away.

Enjoy

Friday, December 23, 2011

NOINDEX Handling by FAST and SharePoint Search

I recently had to troubleshoot content being indexed by FAST and/or SharePoint Search, and wanted to share the experience and some collected wisdom.

Here is the case, a very common one: the client has a branded master page that includes navigation menu items, and these items were being indexed. A search for “Vacation form” would return every page using this master page, simply because a link to the vacation form was included in the navigation menu on the master page. After the client added “noindex”, which works for FAST as well as for SharePoint, it seemed to work. A few weeks later, when I came onsite, I noticed that the Search SSAs for both SharePoint and FAST had their content sources misconfigured, and that without knowing it the client was using just SharePoint search for everything. At that point I reconfigured the content sources so that only “people” search was served by SharePoint search and the rest of the content was searched by FAST. That is when we noticed the same problem again: menu items were being indexed without respecting the NOINDEX tag. Once we purged the index and reindexed everything, it all worked.

Here are the tags and the explanation of how they work:

1) <meta name="robots" content="noindex" /> : Supported with both crawlers, although differently. The FAST crawler will drop these items. The SP crawler will not drop them; they will be dropped by the FAST pipeline instead. The result is the same for both.

2) <noindex> This text will not be indexed </noindex> : Not supported regardless of crawler.

3) <div class="noindex"> This text will not be indexed </div> : Supported. This is transparent to both crawlers and will be filtered in the FAST pipeline.

4) <span class="noindex"> : Supported by FAST Search back-end but not SharePoint search.

5) SP crawler fix to ensure that the <meta name="xxx" content="noindex" /> was passed to the FAST pipeline (KB 2276336)

http://support.microsoft.com/kb/2276336

Part of the August 2010 CU.
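To get a feel for what items 1 and 3 mean for a given page, here is a small sketch that previews which text would remain once a page-level robots noindex tag and any <div class="noindex"> blocks are taken into account. This is only an approximation written with Python's standard html.parser module; it is not how the FAST pipeline is implemented, and the names NoIndexPreview and preview_indexable_text are made up for illustration.

```python
from html.parser import HTMLParser


class NoIndexPreview(HTMLParser):
    """Collect text outside <div class="noindex"> and note a robots noindex meta tag."""

    def __init__(self):
        super().__init__()
        self.page_excluded = False  # set when <meta name="robots" content="noindex"> is seen
        self.skip_depth = 0         # greater than 0 while inside a div.noindex block
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            if "noindex" in (attrs.get("content") or "").lower():
                self.page_excluded = True
        elif tag == "div":
            classes = (attrs.get("class") or "").split()
            # Count nested divs too, so the matching </div> is found correctly.
            if self.skip_depth or "noindex" in classes:
                self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.text_parts.append(data.strip())


def preview_indexable_text(html):
    """Return (page_excluded, remaining_text) for an HTML string."""
    parser = NoIndexPreview()
    parser.feed(html)
    parser.close()
    return parser.page_excluded, " ".join(parser.text_parts)


if __name__ == "__main__":
    page = (
        '<html><body><div class="noindex"><a href="/forms">Vacation form</a></div>'
        "<p>Quarterly report</p></body></html>"
    )
    print(preview_indexable_text(page))
```

Feeding it a page rendered from the master page shows quickly whether the navigation menu is really wrapped in a noindex div. The sketch does not skip script or style content, so treat its output as a rough preview only.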

Hope this helps others.

Enjoy Holidays!!!!
