The Centralised Me

by Matt 30. April 2008 07:15

Yes, yes. All the cool kids are talking about Live Mesh. I'll get to that; this is related.

There was a very interesting post on TechCrunch a couple of weeks ago entitled "FriendFeed, The Centralized Me and Data Portability". It's really struck a chord with me.

It's introduced the concept (buzzword) of the "Centralised Me", which is lovely marketing, but might very well be a bit of a red herring.

The thinking goes like this: back in the day, all you had on the internet was your home page, and all your random thoughts, photos of your cats and interesting links went up there (usually edited by hand, in raw HTML. Hardcore). Nowadays, there's Flickr for your photos, YouTube for videos, Facebook for your friends, Twitter for your inane babble and so on. In other words, we've gone from having a very centralised view of "me", to a very decentralised view - "me" is spread across many sites.

The first problem I see with this is that it's not quite true. We might not have had Facebook, but we did have, for example, Usenet and mailing lists, both highly decentralised. Getting a single view of all of my interactions would have been a daunting task. And what about blog comments? Again, very decentralised.

And let's just think about that for a second.

Centralised and decentralised are just points of view.

All of my comments to blogs are very decentralised, but to the blog owners, those very same comments are completely centralised - attached to the blog posts they're in response to.

Facebook messages are centralised to all my friends on Facebook, as my Flickr photos are centralised to my Flickr friends. They're just decentralised to someone looking for "all" of my stuff.

Even if you look at the poster child of decentralised authentication that is OpenID, it's only decentralised as far as the protected web site is concerned. As far as I'm concerned, I always log in at the same point - my OpenID server. Centralised. All Yahoo IDs are now OpenIDs. Centralised.

So where does that leave us with social networks?

FriendFeed is an aggregator of your other social networks. You join up, and start broadcasting an aggregated view of every other social network you're a member of. You subscribe to other FriendFeed members, and you've now got a single port of call for all the updates you're interested in. It aims to be the Centralised Me.

But to quote the TechCrunch article, this just means it's another "data silo".

And this brings us to the Data Portability Project. Ideally, it should help to protect us against data silos. In theory, as long as each silo implements the correct Microformats and authentication (OpenID, OAuth), we should be able to access and copy/move our data out of a silo.

What I haven't seen is where we move the data to. Another silo?

The most interesting thing I have seen in this space is Google's Social Graph. It indexes the Microformat information found in web pages, and automatically builds up a social graph from this (see also Microformats' social network portability). Extrapolate a little here, and you can easily see how this technique could get a single view of my entire social network. It wouldn't matter where I put what, Google would find it for me. Google would find me. Not centralised, but aggregated. However, it's not as easy as that. Google can only access public information. It would be great for Twitter, and irrelevant for Facebook. We could give it credentials, but then it becomes another data silo.

There is no Centralised Me. There's just convenient and inconvenient.

Or rather, centralised and decentralised don't apply here. It's not a hub and spokes model. It's a (gulp) mesh.


Web App + Offline = Crappy Client App

by Matt 22. April 2008 09:02

I'm with Furrygoat:

Q: What do you get when you cross a browser application with the ability to go offline?

A: A client application without any of the goodness that the platform (be it Windows or OS X) has to offer.

Really? Do people really want this?

Don’t get me wrong, I get the convenience of having access to your data from whatever machine you're on, but wouldn’t a better model be to store the data in the cloud and provide a good abstraction on top of it so that it could be accessed from either a really well done rich client or a web application?

Point in case: I find it interesting that most of the twitter feeds that I read are created by client applications accessing the twitter API.

Perhaps there’s been so much blah blah blah about web 2.0, social networks, etc., or that folks have just gotten so lazy that they’ve forgotten how to write client applications. It’s sad really.

Another case in point: how many bloggers rave over Windows Live Writer?

I think the future is going to be all about Services + Software. Use the full resources of the desktop when it's available to you, but the data is there in the cloud when you're not. The main benefit of browser based applications is their availability on *any* desktop. The main drawback is the offline question. But these aren't opposing ends of the same spectrum, and don't have to be fixed in the same application.

And when you find people looking to share cached JavaScript between sites because the frameworks are too large, you know you're in trouble. This is just a client side install.

Update: I didn't think I'd see such agreement from such A-listers as Yahoo!'s Jeremy Zawodny or (the sadly missed) Dare Obasanjo. That pendulum keeps on swinging...


The pendulum swings. Again.

by Matt 14. April 2008 06:47

We've had the boom times of Fat Clients. That AJAX acronym got us all excited over Thin Clients. And now we're back to renting time on mainframes...

As they say: so it goes.


Remembering IFilters

by Matt 11. April 2008 05:34

OK. Quick bug fix to my previous post. I said that when you selected "Index Properties Only", it stripped the registered IFilter from the file type. Strictly speaking, it copies the existing "persistent handler" class id, saves it in a value of "OriginalPersistentHandler" and then deletes the current registration.

This way, when you reselect "Index Properties and File Contents", it can copy the original value back, and use the proper IFilter and not have to default to the Plain Text filter.
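To make that stash-and-restore behaviour concrete, here's a toy model in Python. It treats the extension's registry key as a plain dict of values. The names "PersistentHandler" and "OriginalPersistentHandler" come from the registry; the function names, the dict layout and the placeholder handler id are all made up for illustration, and this is only my reading of what the options dialog does, not its actual code.

```python
# Toy model of an extension's registry key as a dict of named values.
# Hypothetical: the function names and PLAIN_TEXT_HANDLER placeholder are
# invented for this sketch; only the two value names come from the registry.

PLAIN_TEXT_HANDLER = "{plain-text-handler}"  # placeholder, not the real GUID


def index_properties_only(ext_key: dict) -> None:
    """Copy the current handler aside, then delete the registration."""
    current = ext_key.pop("PersistentHandler", None)
    if current is not None:
        ext_key["OriginalPersistentHandler"] = current


def index_properties_and_contents(ext_key: dict) -> None:
    """Copy the saved handler back, or fall back to the plain text filter."""
    if "PersistentHandler" in ext_key:
        return  # something is already registered; leave it alone
    ext_key["PersistentHandler"] = ext_key.get(
        "OriginalPersistentHandler", PLAIN_TEXT_HANDLER
    )


# A type with a proper filter survives the round trip...
key = {"PersistentHandler": "{some-real-handler}"}
index_properties_only(key)
index_properties_and_contents(key)

# ...while a type that never had one ends up on the plain text filter.
bare = {}
index_properties_and_contents(bare)
```

The second case is exactly what happened to .wpost: there was nothing to copy back, so it got the Plain Text filter.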

Just the facts, ma'am.

Tags: Windows Desktop Search

Indexing Windows Live Writer posts

by Matt 9. April 2008 10:44

While googling for something else, I came across a post that pointed out that Windows Live Writer's saved posts aren't being indexed. Well, the contents weren't - only the file properties. Which is odd, because WLW comes with an IFilter - a plugin that exposes the contents of a .wpost file to Windows Search's index.


The article mentions that you can fix this by going to the Indexing Options in the control panel (and going to Advanced -> File Types), selecting the wpost extension, and changing the radio button from "Index Properties Only" to "Index Properties and File Contents".

This works, but not as you expect. It's not using the Windows Live Writer IFilter.

When you select "Index Properties Only", the registered filter is removed from the file type. If a file has no filter registered, the indexer will use the system provided "File Properties Filter", which extracts various properties such as filename, size, dates (and maybe the OLE DocFile structured storage properties) but doesn't touch the contents.

Selecting "Index Properties and File Contents" doesn't magically wire up the correct filter. Instead, it registers the "Plain Text Filter", which just extracts as much text out of the file as it can, and then hands it to the indexer as content. You can use it on arbitrary binary files, but it won't understand the file format, so it won't be able to output more advanced properties, such as Author, Subject or Perceived Type. If you try to use the advanced search features of Explorer to find blog posts with a certain subject, it will fail. Not too much of a hardship, perhaps, because the text will still match the full content search, but by missing the Perceived Type, the indexer doesn't know if it's a document, email, picture, audio, video or whatever. Bang goes your filtering.

We can fix this, but let's see why it wasn't registered in the first place. A great tool to help with this is Citeknet's IFilter Explorer.

[Screenshot: IFilter Explorer - Citeknet]

Take a look for the .wpost extension. It's not there. Now we know why the proper filter wasn't being used - it's not registered.

You might have noticed the bewildering array of tabs across the top of the list. Windows Search shares a history with a long line of search products from Microsoft, from server side search engines such as SQL Server full text search, SharePoint and Exchange search, to desktops, with Windows Search (3.x), Windows Desktop Search (2.x - MSN Desktop Search), Indexing Service and even the aborted WinFS.

On a hunch, check out Windows Desktop Search 2.x.

There it is. The .wpost extension has the WebPostFilter class registered against it.

And that's because despite sharing ancestry and the IFilter technology, registration between the different implementations can be subtly (and not so subtly) different. For example, the SQL Server registration needs extra data in a system table.

There does appear to be a common thread amongst registrations, though, and this is partly described in the docs for the current version of Windows Search. Namely, registration hangs off the file extension in the registry, or off the document type pointed to by the file extension. Or even from the MIME content type (which I didn't know worked, but explains why so many xml files are indexed).

Windows Desktop Search 2.x simply had some overrides that were checked before the system defined places, and the Windows Live Writer developers chose to register it there:

HKLM\SOFTWARE\Microsoft\RSSearch\ContentIndexCommon\Filters\Extension\.wpost

Now we know what the problem is, it's pretty straightforward to fix. We just need to deal with the mind-bogglingly odd way of registering IFilters.

Hanging off the file extension, the document type or the MIME type, you need to add a key called "PersistentHandler". Its default value is a GUID, which is in turn registered as a key under HKCR\CLSID. That GUID's key has a subkey called PersistentAddinsRegistered, which has another subkey named after the interface IID for IFilter. The default value of that last subkey is the CLSID of the IFilter COM object.

Phew.
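If it helps to see that chain of hops written down, here's a rough Python sketch that follows it. To keep it runnable anywhere, it models HKEY_CLASSES_ROOT as a dict mapping key paths to their default values, rather than touching the real registry. The function name and the dict layout are hypothetical, and it only covers the file extension and document type routes, not the MIME type one. The GUIDs are the ones used in this post.

```python
# Sketch: follow extension -> PersistentHandler -> PersistentAddinsRegistered
# -> IID_IFilter -> filter CLSID. The "registry" is a plain dict of
# key path -> default value (a hypothetical stand-in for HKEY_CLASSES_ROOT).
from typing import Optional

IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def resolve_filter_clsid(classes_root: dict, extension: str) -> Optional[str]:
    """Return the IFilter CLSID registered for an extension, or None."""
    # First hop: a PersistentHandler key hanging off the extension itself.
    handler = classes_root.get("\\".join([extension, "PersistentHandler"]))
    if handler is None:
        # Otherwise try the document type the extension's default value names.
        doc_type = classes_root.get(extension)
        if doc_type:
            handler = classes_root.get(
                "\\".join([doc_type, "PersistentHandler"])
            )
    if handler is None:
        return None
    # Remaining hops: CLSID\{handler}\PersistentAddinsRegistered\{IID_IFilter}
    path = "\\".join(["CLSID", handler, "PersistentAddinsRegistered", IID_IFILTER])
    return classes_root.get(path)


HANDLER = "{60734E5A-7C25-479f-B101-F14DEAF5ACB6}"
CLASSES_ROOT = {
    ".wpost\\PersistentHandler": HANDLER,
    "CLSID\\" + HANDLER + "\\PersistentAddinsRegistered\\" + IID_IFILTER:
        "{4DFA66FF-1EE1-4BAF-A034-0023FB7372EB}",
}

wlw_filter = resolve_filter_clsid(CLASSES_ROOT, ".wpost")
```

One caveat: the real registry ignores case in key names and this dict doesn't, so the GUID strings have to match exactly. On an actual Windows box you could presumably swap the dict lookups for the standard library's winreg module, but I haven't tried that here.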

I have absolutely no idea why they added that bonkers level of abstraction, but it's been there for years, so who are we to argue with tradition. To make it easy, save this as a .reg file and double click:

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\.wpost]

[HKEY_CLASSES_ROOT\.wpost\PersistentHandler]
@="{60734E5A-7C25-479f-B101-F14DEAF5ACB6}"

[HKEY_CLASSES_ROOT\CLSID\{60734E5A-7C25-479f-B101-F14DEAF5ACB6}]
@="Windows Live Writer persistent handler"

[HKEY_CLASSES_ROOT\CLSID\{60734E5A-7C25-479f-B101-F14DEAF5ACB6}\PersistentAddinsRegistered]

[HKEY_CLASSES_ROOT\CLSID\{60734E5A-7C25-479f-B101-F14DEAF5ACB6}\PersistentAddinsRegistered\{89BCB740-6119-101A-BCB7-00DD010655AF}]
@="{4DFA66FF-1EE1-4BAF-A034-0023FB7372EB}"

[HKEY_CLASSES_ROOT\CLSID\{60734E5A-7C25-479f-B101-F14DEAF5ACB6}\PersistentHandler]
@="{60734E5A-7C25-479f-B101-F14DEAF5ACB6}"

Note that the first line is the header regedit expects at the top of a .reg file, and that each key name in square brackets has to stay on a single line. Oh, and that PersistentHandler GUID? Brand new one. Never before used. ({60734...} that is. {89BCB...} is the IID for IFilter and {4DFA6...} is the CLSID of the Windows Live Writer filter).

[Screenshot: Advanced Options]

Now you just have to get the indexer to re-index those files, and Bob's yer uncle. I took the lazy route, and just rebuilt the whole index (Control Panel -> Indexing Options -> Advanced -> Rebuild).

Painless, eh? What I want to know now, is what does the null filter do?

Tags: Windows Desktop Search

What happened to my abstraction?

by Matt 8. April 2008 17:17

Silverlight 1.0 has just had a minor update. Here's what's changed, according to Dr Sneath:

The changes are minor in nature and shouldn't affect existing applications; they include an audio bug fix for nForce 4 motherboards, an update to...

Goodness. 2008 and we're still updating *applications* to fix bugs in *motherboards*.
