I've been wanting to have a play around with LINQ for quite a while now. I had a good look when VS2008 RTM'd and I was very impressed. It's a very elegant design, built up slowly but surely from type inference, anonymous types, anonymous delegates and lambdas, extension methods and of course the master stroke of deferred execution via iterators. Clearly more than the sum of its parts. Mike Taulty has a great post that details how it's all composed. (And that's just LINQ to Objects. The pluggable providers of IQueryable are a whole new ball game.)
But I hadn't got my hands dirty until today.
Too bad Ian Griffiths stole my thunder. I was looking at exactly the same problem as he's just posted about - Dare Obasanjo lamenting that anonymous types won't give him the same effect as tuples. Ian has (of course) nailed anything I wanted to say on the matter, so go read his post; anonymous types are not equivalent to tuples, and if you want to use them to achieve similar results, you need to restructure your code. And use LINQ.
In fact, just looking at the code makes you want to run for LINQ. It's a prime candidate. It's got several loops over several different collections. Each loop filters or maps a previous sequence to produce a new sequence, and the final loop is then iterated for display. That's exactly what LINQ is designed for.
And for a first attempt, I'm very pleased with how mine turned out. At the time his post dropped into my feed reader, here's what I had:
    // Get a sequence of appropriate items
    var items = from fileInfo in new DirectoryInfo(cache_location).GetFiles("*.xml")
                let doc = XElement.Load(fileInfo.FullName)
                let feedTitle = (string)doc.Element("Title") ?? string.Empty
                from rssItem in
                    (from itemNode in doc.Descendants("Items")
                     where !bool.Parse((string)itemNode.Element("IsDeleted") ?? "False")
                        && !bool.Parse((string)itemNode.Element("IsErrorMessage") ?? "False")
                     // MakeRssItem is a placeholder name: the projection from
                     // XML node to item is missing from this listing
                     select MakeRssItem(itemNode))
                where rssItem.OutgoingLinks.Count > 0
                select new
                {
                    Item = rssItem,
                    FeedTitle = feedTitle
                };

    // Map the appropriate items to a list of all outgoing links with a chain of votes
    var linksWithVotes = from item in items
                         from outgoingLink in item.Item.OutgoingLinks
                         group new
                         {
                             Item = item.Item,
                             FeedTitle = item.FeedTitle,
                             Weight = voteFunc(item.Item)
                         } by outgoingLink.Key;

    // Collapse the groups down to a list of links with scores
    var weightedLinks = (from linkWithVote in linksWithVotes
                         select new
                         {
                             Url = linkWithVote.Key,
                             Weight = linkWithVote.Sum(x => x.Weight),
                             Votes = linkWithVote
                         }).OrderByDescending(x => x.Weight);

    foreach (var item in weightedLinks.Take(10))
    {
        // ... display the link, its weight and its votes
    }
Now, I've got more than Ian does. That's because I was working from Dare's previous post about building a meme tracker for RSS Bandit in C# 3.0, which has the full program. So my first LINQ statement loads and transforms the RSS items (modified to pull the items from my SharpReader cache, not RSS Bandit). There are a couple of other points where I can learn from Ian's solution, mainly the use of the "into" clause.
I create "linksWithVotes" (Ian and Dare's "all_links") by finishing off with a group, meaning I have to deal with a group in my next query. Ian pushes the group "into" a variable, and then pulls the votes out of that in the final select, giving a much nicer final shape (a flat sequence).
Similarly, I called the OrderByDescending extension method directly. Ian gets the ordering into the natural query syntax by using another "into" and following it up with a simple select.
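To make those two uses of "into" concrete, here's a minimal sketch. It is not Ian's or Dare's code: the Vote class, the IntoSketch class and the hard-coded data are all made up for illustration. The group is continued "into" a range variable, projected to a flat shape, and then a second "into" lets the orderby live inside the query rather than in a trailing extension method call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one outgoing link found in one feed item.
class Vote
{
    public string Url;
    public string FeedTitle;
    public int Weight;
}

static class IntoSketch
{
    // Returns "url=totalWeight" strings, heaviest link first.
    public static List<string> Rank(IEnumerable<Vote> votes)
    {
        var ranked = from vote in votes
                     group vote by vote.Url into linkGroup     // continue the query over the groups
                     select new
                     {
                         Url = linkGroup.Key,
                         Weight = linkGroup.Sum(v => v.Weight)
                     } into link                               // continue again over the flat results
                     orderby link.Weight descending
                     select link;

        return ranked.Select(r => r.Url + "=" + r.Weight).ToList();
    }

    static void Main()
    {
        var votes = new[]
        {
            new Vote { Url = "http://a.example/", FeedTitle = "Feed 1", Weight = 2 },
            new Vote { Url = "http://b.example/", FeedTitle = "Feed 1", Weight = 1 },
            new Vote { Url = "http://a.example/", FeedTitle = "Feed 2", Weight = 3 },
            new Vote { Url = "http://c.example/", FeedTitle = "Feed 2", Weight = 4 }
        };

        // Prints a=5, then c=4, then b=1 (with the full example URLs).
        foreach (var line in Rank(votes))
            Console.WriteLine(line);
    }
}
```

The result is a flat, already-ordered sequence, so whatever consumes it never has to know a group was involved.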
And finally, I wasn't taking the minimum weight of the votes per feed title. (Ian's a little confused about this one. I'm not entirely sure of the reasoning either, but I think it's so that if I always link to a particular URL, only my oldest, weakest vote counts. I think it beats gaming, but I could be wrong.) That's also solved by grouping "into" a variable, selecting the minimum value from each group, and summing that sequence.
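Here's a rough sketch of that minimum-per-feed idea as I understand it, again with invented names (FeedVote, MinPerFeedSketch) and invented data rather than anything from Ian's or Dare's posts. For each URL, the votes are grouped by feed title, only the smallest weight from each feed is kept, and those minimums are summed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one outgoing link found in one feed item.
class FeedVote
{
    public string Url;
    public string FeedTitle;
    public int Weight;
}

static class MinPerFeedSketch
{
    // Maps each URL to the sum of the weakest vote from each feed that links to it.
    public static Dictionary<string, int> Score(IEnumerable<FeedVote> votes)
    {
        var scored = from vote in votes
                     group vote by vote.Url into linkGroup
                     select new
                     {
                         Url = linkGroup.Key,
                         Weight = (from v in linkGroup
                                   group v.Weight by v.FeedTitle into feedGroup
                                   select feedGroup.Min()).Sum()
                     };

        return scored.ToDictionary(s => s.Url, s => s.Weight);
    }

    static void Main()
    {
        var votes = new[]
        {
            // Feed 1 links to the same URL twice; only the weaker vote (2) counts.
            new FeedVote { Url = "http://a.example/", FeedTitle = "Feed 1", Weight = 5 },
            new FeedVote { Url = "http://a.example/", FeedTitle = "Feed 1", Weight = 2 },
            new FeedVote { Url = "http://a.example/", FeedTitle = "Feed 2", Weight = 3 },
            new FeedVote { Url = "http://b.example/", FeedTitle = "Feed 1", Weight = 1 }
        };

        // a scores 2 + 3 = 5 (a plain sum would give 10); b scores 1.
        foreach (var pair in Score(votes))
            Console.WriteLine(pair.Key + " " + pair.Value);
    }
}
```

With a plain Sum, a feed that links to the same URL over and over would keep adding to its score; taking the minimum per feed caps each feed at a single vote.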
I'm incredibly impressed that this whole set of loops can be reduced to a single foreach statement, but like Ian, I'm a bit unsure about the readability of this solution (mine or Ian's). As Ian says, this could be because of the nature of the algorithm, but I also wonder if it's partly the newness of the LINQ syntax, and having to mentally translate it into iterators and extension methods. Testing also looks tricky with this. But from a geeky-cool-new-toy perspective it's brilliant. It's so much more declarative - I've only got one loop, and that's over the 10 result items I'm interested in - I'm not even looping over the files to read them in.
To quote Ian's post:
"In summary, although I’m still finding my feet, I’m rather coming to like LINQ."