For my sample extensibility stage I decided to create a module counting the number of words in a document. To count the number of words you have to get hold of the extracted text in the file being indexed.
FAST for SharePoint has a special crawled property set that is only available inside the indexing pipeline. This property set has three fields: "body", which contains the extracted text of the crawled item; "data", which is the binary content of the source document in base64 encoding; and "url", which is the link used when displaying results. The sample stage will make use of the "body" field.
First I created a new property set to hold the crawled property I will emit from my program. I could have used one of the existing ones, but I find it easier to keep my custom properties in a separate property set for maintenance purposes. I name the property set "mAdcOW" and assign it an arbitrary GUID. You can get a GUID in PowerShell with the following command:
    [guid]::NewGuid()
The PowerShell command to create a new property set/category with my random GUID looks like this:
    New-FASTSearchMetadataCategory -Name "mAdcOW" -Propset FA585F53-2679-48d9-976D-9CE62E7E19B7
The GUID is important because it is used later in the pipeline extensibility configuration. By default, the property set adds newly discovered properties as they are seen during the crawl. This saves us the work of manually creating the crawled properties we are going to use. In a production environment you would, however, script the creation of the crawled property so that you can map it to a managed property without having to crawl first.
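Scripting the crawled property up front could look roughly like the sketch below. It is meant to run in the FS4SP PowerShell shell and reuses the property set GUID from the command above. The property name "wordcount" is a hypothetical placeholder, and VariantType 20 is assumed here to denote an integer property; check both against your own setup before using it.

```powershell
# GUID of the "mAdcOW" property set created earlier
$propset = "FA585F53-2679-48d9-976D-9CE62E7E19B7"

# "wordcount" is a hypothetical name for the property the stage will emit.
# VariantType 20 is assumed to mean an integer value.
New-FASTSearchMetadataCrawledProperty -Name "wordcount" -Propset $propset -VariantType 20
```

With the crawled property created this way, it can be mapped to a managed property before the first crawl has run.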
For maintainability I create a separate folder for my module below the FASTSearch root: C:\FASTSearch\pipelinemodules. Replace C:\FASTSearch with your actual FS4SP location.
Custom stage in C#
Now over to the actual pipeline stage. In Visual Studio create a new Console Application named "WordCount".
In Program.cs I have the following code:
    private static int Main(string[] args)
    {
    #if DEBUG
        Thread.Sleep(1000 * 90);