Tuesday, May 26, 2009

FAST ESP 5.3 Installation trick

The installation is not that cumbersome, but the trick here is to install the right prerequisites, especially the Java JDK (Java SE Development Kit with JavaFX, JDK 6u13 / FX 1.1), which you can find at http://java.sun.com

If you do not have the right version of the JDK, the installation will fail on Windows (I have never tried it on another OS). Unfortunately, you can't really find this in the documentation, so you can spend some time struggling with it.

One of the great features of the FAST installer is that once the installation is done, FAST generates an installation profile XML document that can be reused later to perform the same type of installation. This is especially useful in multi-node scenarios.

Monday, May 4, 2009

FAST search engine for SharePoint PART 3 (Content Sources and Connectors)

Each “content source” is represented as a “Collection” within FAST ESP. Data is fed into document processing pipelines for refinement through “Connectors” that are defined for a specific collection.

There are three types of connectors for the FAST search engine:

  1. FAST OOTB (out-of-the-box) connectors
  2. Third-party (proprietary) connectors
  3. Custom connectors, built with the FAST API

YES! FAST lets you code against its APIs.

FAST OOTB connectors

Enterprise Crawler: used to feed content from web pages. Content sources and many other settings for this connector are easily configurable through the Admin UI, including the content request rate, start URIs, include and exclude host name filters, the content crawl interval, etc. The Enterprise Crawler lets you crawl an unlimited number of start URIs, detects deleted content and removes it from the index, and retrieves both static and dynamic content.
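To make the include and exclude host name filters a bit more concrete, here is a minimal Python sketch of how such a filter could behave. This is not FAST code: the function name `is_allowed`, the suffix-matching rule, and the "exclude wins over include" policy are all my own assumptions for illustration.

```python
from urllib.parse import urlparse


def _matches(host, patterns):
    # A pattern matches the host itself or any of its subdomains (assumed rule).
    return any(host == p or host.endswith("." + p) for p in patterns)


def is_allowed(url, include_hosts, exclude_hosts):
    """Return True if the URL's host passes the include and exclude filters."""
    host = (urlparse(url).hostname or "").lower()
    # In this sketch, exclude rules win over include rules.
    if _matches(host, exclude_hosts):
        return False
    # An empty include list means "allow everything that is not excluded".
    if not include_hosts:
        return True
    return _matches(host, include_hosts)
```

A crawler built this way would call `is_allowed` on every discovered link before queuing it, so the filters bound the crawl no matter how many start URIs you give it.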

ESP File Traverser: traverses the file system and submits files to the content pipelines in batches via the Content API. The files this connector serves can be in any binary or text format, as long as the processing pipeline can handle that format (PDF, TXT, XML, DOC, and many more).
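The "traverse and submit in batches" idea can be sketched in a few lines of Python. Again, this is only an illustration under my own assumptions: `iter_files` and `batches` are hypothetical helpers, and the actual submission to the Content API is left out because I am not reproducing FAST's API here.

```python
import os
from itertools import islice


def iter_files(root, extensions):
    """Walk the directory tree and yield paths whose extension is accepted."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if os.path.splitext(name)[1].lower() in extensions:
                yield os.path.join(dirpath, name)


def batches(items, size):
    """Group any iterable into lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch
```

Something like `for batch in batches(iter_files(root, {".pdf", ".txt"}), 100): ...` would then hand each batch to whatever feeding call you use. Batching keeps the number of round trips down without loading the whole file list into memory.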

JDBC Connector, or FAST Smart Connector for JDBC: feeds database or other structured data into the pipelines (Oracle, SQL Server, MySQL, DB2, etc.). It uses a JDBC driver that must be registered on the server before a connection to the database can be established. It extracts the data to be indexed at the column level through a SQL query that you supply. The connector is managed through the command line as well as through the web interface.
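Here is a rough Python sketch of what "column-level extraction through a SQL query" means in practice: each row the query returns becomes one document, and each selected column becomes a field. I am using Python's built-in `sqlite3` as a stand-in for a JDBC connection, and the table, the column names, and the `rows_to_documents` helper are made up for the example.

```python
import sqlite3


def rows_to_documents(conn, query, id_column):
    """Run the query and turn every row into a document keyed by id_column."""
    cur = conn.execute(query)
    # cursor.description holds one tuple per selected column; item 0 is its name.
    columns = [d[0] for d in cur.description]
    for row in cur:
        fields = dict(zip(columns, row))
        yield {"id": str(fields[id_column]), "fields": fields}


# Hypothetical sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (article_id INTEGER, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [(1, "Hello", "First body"), (2, "World", "Second body")],
)

query = "SELECT article_id, title, body FROM articles ORDER BY article_id"
docs = list(rows_to_documents(conn, query, "article_id"))
```

The point is that the query alone decides what gets indexed: leave a column out of the SELECT and it never reaches the pipeline.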

The connectors mentioned above support content modification detection through checksums as well as timestamps, which are kept either in the FAST built-in database or in some other MySQL database.
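Checksum-based change detection is simple enough to sketch in Python. This is my own minimal illustration, not how FAST implements it: the SHA-256 choice, the `detect_changes` name, and the plain dictionary standing in for the state database are all assumptions.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Return a hex digest that changes whenever the content changes."""
    return hashlib.sha256(content).hexdigest()


def detect_changes(previous, current):
    """Compare stored checksums with the current content.

    previous: dict of document id -> checksum from the last run
    current:  dict of document id -> raw bytes seen in this run
    Returns (added, modified, deleted, new_state).
    """
    new_state = {doc_id: checksum(data) for doc_id, data in current.items()}
    added = sorted(k for k in new_state if k not in previous)
    modified = sorted(
        k for k in new_state if k in previous and previous[k] != new_state[k]
    )
    deleted = sorted(k for k in previous if k not in new_state)
    return added, modified, deleted, new_state
```

On each run only the added and modified documents need to go through the pipeline again, the deleted ones get removed from the index, and `new_state` is saved for the next comparison.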

Third Party Connectors 

There is not much I can say about these connectors except to list some of them: Lotus Notes, WebSphere, Exchange, Documentum, Hummingbird, and of course SHAREPOINT. So even if you are not looking to upgrade to SharePoint 2010 when it becomes available with FAST, you can still integrate with SharePoint without reinventing the wheel :-)

Enjoy :-)