Indexing Windows Live Writer posts

by Matt 9. April 2008 10:44

While googling for something else, I came across a post that pointed out that Windows Live Writer's saved posts weren't being indexed. Well, the contents weren't - only the file properties were. Which is odd, because WLW comes with an IFilter - a plugin that exposes the contents of a .wpost file to Windows Search's index.

The article mentions that you can fix this by going to the Indexing Options in the control panel (and going to Advanced -> File Types), selecting the wpost extension, and changing the radio button from "Index Properties Only" to "Index Properties and File Contents".

This works, but not as you'd expect. It's not using the Windows Live Writer IFilter.

When you select "Index Properties Only", the registered filter is removed from the file type. If a file has no filter registered, the indexer will use the system provided "File Properties Filter", which extracts various properties such as filename, size, dates (and maybe the OLE DocFile structured storage properties) but doesn't touch the contents.

Selecting "Index Properties and File Contents" doesn't magically wire up the correct filter. Instead, it registers the "Plain Text Filter", which just extracts as much text out of the file as it can, and then hands it to the indexer as content. You can use it on arbitrary binary files, but it won't understand the file format, so it won't be able to output more advanced properties, such as Author, Subject or Perceived Type. If you try to use the advanced search features of Explorer to find blog posts with a certain subject, it will fail. Not too much of a hardship, perhaps, because the text will still match the full content search, but without the Perceived Type, the indexer doesn't know if it's a document, email, picture, audio, video or whatever. Bang goes your filtering.

We can fix this, but let's see why it wasn't registered in the first place. A great tool to help with this is Citeknet's IFilter Explorer.

IFilter Explorer - Citeknet

Take a look for the .wpost extension. It's not there. Now we know why the proper filter wasn't being used - it's not registered.

You might have noticed the bewildering array of tabs across the top of the list. Windows Search shares a history with a long line of search products from Microsoft, from server side search engines such as SQL Server full text search, SharePoint and Exchange search, to the desktop, with Windows Search (3.x), Windows Desktop Search (2.x - MSN Desktop Search), Indexing Service and even the aborted WinFS.

On a hunch, check out Windows Desktop Search 2.x.

There it is. The .wpost extension has the WebPostFilter class registered against it.

And that's because despite sharing ancestry and the IFilter technology, registration between the different implementations can be subtly (and not so subtly) different. For example, the SQL Server registration needs extra data in a system table.

There does appear to be a common thread amongst registrations, though, and this is partly described in the docs for the current version of Windows Search. Namely, registration hangs off the file extension in the registry, or off the document type pointed to by the file extension. Or even from the MIME content type (which I didn't know worked, but explains why so many XML files are indexed).

Windows Desktop Search 2.x simply had some overrides that were checked before the system defined places, and the Windows Live Writer developers chose to register it there:

HKLM\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters\Extension\.wpost

Now that we know what the problem is, it's pretty straightforward to fix. We just need to deal with the mind-bogglingly odd way of registering IFilters.

Hanging off the file extension, the document type or the MIME type, you need to add a key called "PersistentHandler". Its default value is a GUID that is itself registered as a key under HKEY_CLASSES_ROOT\CLSID. That GUID's key has a subkey called PersistentAddinsRegistered, which has another subkey named after the interface IID for IFilter. The default value of this is a CLSID for the IFilter COM object.

Phew.
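To make that chain a bit more concrete, here's a minimal Python sketch of the lookup. It doesn't touch the real registry: `classes` is a hypothetical dict standing in for HKEY_CLASSES_ROOT, mapping a key path to that key's default value, and `resolve_ifilter_clsid` is a name I've made up for illustration. It only follows the route that hangs off the file extension (not the document type or MIME type routes), and unlike the real registry, the dict lookups are case-sensitive.

```python
# IID of the IFilter interface (taken from the registration in this post).
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def resolve_ifilter_clsid(classes, extension):
    """Return the IFilter CLSID registered for `extension`, or None.

    `classes` is a hypothetical stand-in for HKEY_CLASSES_ROOT: a dict
    mapping a key path to that key's default value.
    """
    # Step 1: the extension's PersistentHandler key holds a GUID.
    handler_guid = classes.get("\\".join([extension, "PersistentHandler"]))
    if handler_guid is None:
        return None
    # Step 2: under that GUID's CLSID key, PersistentAddinsRegistered has a
    # subkey named after the IFilter IID; its default value is the CLSID of
    # the filter's COM object.
    filter_key = "\\".join(
        ["CLSID", handler_guid, "PersistentAddinsRegistered", IID_IFILTER]
    )
    return classes.get(filter_key)


# The values the .wpost registration in this post ends up with.
example_classes = {
    ".wpost\\PersistentHandler": "{60734E5A-7C25-479f-B101-F14DEAF5ACB6}",
    "CLSID\\{60734E5A-7C25-479f-B101-F14DEAF5ACB6}"
    "\\PersistentAddinsRegistered\\{89BCB740-6119-101A-BCB7-00DD010655AF}":
        "{4DFA66FF-1EE1-4BAF-A034-0023FB7372EB}",
}

print(resolve_ifilter_clsid(example_classes, ".wpost"))
```

An extension with no PersistentHandler entry simply comes back as None, which is roughly the situation .wpost was in before the fix.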

I have absolutely no idea why they added that bonkers level of abstraction, but it's been there for years, so who are we to argue with tradition? To make it easy, save this as a .reg file and double-click it:

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\.wpost]

[HKEY_CLASSES_ROOT\.wpost\PersistentHandler]
@="{60734E5A-7C25-479f-B101-F14DEAF5ACB6}"

[HKEY_CLASSES_ROOT\CLSID\{60734E5A-7C25-479f-B101-F14DEAF5ACB6}]
@="Windows Live Writer persistent handler"

[HKEY_CLASSES_ROOT\CLSID\{60734E5A-7C25-479f-B101-F14DEAF5ACB6}
\PersistentAddinsRegistered]

[HKEY_CLASSES_ROOT\CLSID\{60734E5A-7C25-479f-B101-F14DEAF5ACB6}
\PersistentAddinsRegistered\{89BCB740-6119-101A-BCB7-00DD010655AF}]
@="{4DFA66FF-1EE1-4BAF-A034-0023FB7372EB}"

[HKEY_CLASSES_ROOT\CLSID\{60734E5A-7C25-479f-B101-F14DEAF5ACB6}
\PersistentHandler]
@="{60734E5A-7C25-479f-B101-F14DEAF5ACB6}"

Note that I've wrapped a couple of lines for legibility, so join each key path back onto a single line before importing. Oh, and that PersistentHandler GUID? Brand new one. Never before used. ({60734...} that is. {89BCB...} is the IID for IFilter and {4DFA6...} is the CLSID of the Windows Live Writer filter).
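If you'd rather not stitch the wrapped lines back together by hand, here's a small Python sketch that generates the same registration with every key on one line. The function name `build_reg_text` and its parameters are my own invention for illustration; the GUIDs and key names are the ones from the listing above. It only builds the text and prints it, it doesn't write to the registry.

```python
# IID of the IFilter interface, as used in the listing above.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def build_reg_text(extension, handler_guid, handler_name, filter_clsid):
    """Return .reg file text registering `filter_clsid` for `extension`."""
    root = "HKEY_CLASSES_ROOT"
    clsid_key = "\\".join([root, "CLSID", handler_guid])
    addins_key = clsid_key + "\\PersistentAddinsRegistered"
    # (key path, default value); None means "create the key, set nothing".
    sections = [
        ("\\".join([root, extension]), None),
        ("\\".join([root, extension, "PersistentHandler"]), handler_guid),
        (clsid_key, handler_name),
        (addins_key, None),
        (addins_key + "\\" + IID_IFILTER, filter_clsid),
        (clsid_key + "\\PersistentHandler", handler_guid),
    ]
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, default in sections:
        lines.append("[" + key + "]")
        if default is not None:
            lines.append('@="' + default + '"')
        lines.append("")
    return "\n".join(lines)


reg_text = build_reg_text(
    ".wpost",
    "{60734E5A-7C25-479f-B101-F14DEAF5ACB6}",
    "Windows Live Writer persistent handler",
    "{4DFA66FF-1EE1-4BAF-A034-0023FB7372EB}",
)
print(reg_text)
```

Redirect the output into a file with a .reg extension and import it as before. The same function should work for any other extension whose filter ended up registered in the wrong place, provided you know the filter's CLSID and generate a fresh GUID for the persistent handler.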

Advanced Options

Now you just have to get the indexer to re-index those files, and Bob's yer uncle. I took the lazy route, and just rebuilt the whole index (Control Panel -> Indexing Options -> Advanced -> Rebuild).

Painless, eh? What I want to know now is: what does the null filter do?

Tags: Windows Desktop Search
