A .net framework bug

by Matt 19. December 2006 17:56

Yep. I've found an honest-to-goodness bug in the framework. And it's genuine, too. Not one of your "my code's not working, .net's broken" bugs. Nope. I've got proof.

Uri.IsHexEncoding checks for the pattern %hexhex is at a particular offset in a string. So you can write something like this to check if a string is uri encoded:

public bool IsUriEncoded(string uri)
	for(int i=0;i<uri.Length;i++)
			return true;
	return false;

For a string such as "/path/%2f/hello", it will return true. For a string such as "/path/%xx/hello", it will return false.

Fancy finding a bug? Get your mince pies round this - it's the source for Uri.IsHexEncoding (this snippet brought to you by the letter "Reflector"):

public static bool IsHexEncoding(string pattern, int index)
	if(((pattern.Length - index) >= 3)
		&& ((pattern[index] == '%') 
		&& (Uri.EscapedAscii(pattern[index + 1],
			pattern[index + 1]) != 0xffff)))
		return true;
	return false;

Spotted it yet? Try the string "%2y". It returns true.

The second parameter to Uri.EscapedAscii should be "pattern[index + 2]", not "pattern[index + 1]".

My feeling is that this method would be mostly used in a security checking context, so it's really important it works correctly.

So, a proper bug. I've reported it as a bug on Microsoft Connect (log in required). I've even added a comment to the msdn docs page (which is a really rather cool feature). I couldn't be more helpful even if I went round everyone's computers and fixed their code for them.

PS. It's dead easy to work around. You'll probably call it in a loop like the one above. Just change the line calling Uri.IsHexEncoding to:

if(uri[i]=='%' && Uri.IsHexDigit[i+1] && Uri.IsHexDigit[i+2)
	return true;



The continuing saga of StaticFileHandler - meet DefaultHttpHandler

by Matt 9. October 2006 16:32

I bloody well love Reflector. It's just magic.

After discovering that StaticFileHandler doesn't handle conditional gets and ranges, I've dug a little deeper, and, unsurprisingly, the plot thickens.

In .net 1.x, you could use StaticFileHandler to serve static files from an asp.net application - just use the IIS wildcard mapping so that all files went through the asp.net ISAPI dll, and modify your web.config.

In .net 2, things are a little better. Now, the global web.config file catches any GET, HEAD or POST requests to any file that hasn't already been handled and passes them off to DefaultHttpHandler, not StaticFileHandler. (Any other verbs get passed to HttpMethodNotAllowedHandler. That one's internal. 3 guesses what it does?)

This is quite a nice class, at least in theory. It's an asynchronous http handler that will serve static files. The bad news is that it serves static files using StaticFileHandler, with all the issues that has (no support for conditional gets or ranges). But the good news is that if you haven't modified any headers (such as caching) and you haven't already written something to the response stream, and the response stream doesn't have a filter (such as compression) and you're running in process in IIS6 then the request is passed back to IIS for processing. IIS will do the decent thing, and serve the static file with support for conditional gets and ranges and all will be good. If any of that exhaustive list is not true, then StaticFileHandler gets to serve the content (boo, hiss!). Note that you can still get compression, but you'll have to configure it at the IIS level now, not the HttpModule level. That's the good news.

This might seem of limited use - if you can't modify the headers, or add a compression filter to the output stream, what's the point? Well, you can monitor and log the request, for one thing. And more importantly, you can derive from the public DefaultHttpHandler and override the virtual OverrideExecuteUrlPath to route you off to somewhere else and IIS will serve it (sounds just like I was trying to do originally!)

Unfortunately, this class has it's problems, too. It doesn't allow you to route to your own static file handler. Even worse, though, is that you can only specify an override of the path when it decides (via conditions outside of your derived class's control) that IIS should serve the file. If it decides that StaticFileHandler should serve it, you have to have implicit knowledge of the inner workings of the DefaultHttpHandler class and know to do a Context.RewritePath in OnExecuteUrlPreconditionFailure. This is mentioned (albeit obliquely) in the docs for OverrideExecuteUrlPath. And I can't roll my own version of this as it uses internal methods to pass control back to IIS.

As long as you're strict about your global setup, this might be useful. I'll have to do a test to see if it'll fit my requirements, but this feels a little fragile - I really don't want it falling back to StaticFileHandler. I think I'll follow up on the Virtual Path Provider idea, or just buckle down and write a BetterStaticFileHandler.



Serving files from App_Data #2 - the lazy approach

by Matt 6. October 2006 17:18

I prefer "conservation of energy" myself.

Last time, we looked at why I would want to do such a crazy thing as serving files from the App_Data directory, when asp.net already forbids it for security reasons. We even looked at a rather naive approach for doing it, and saw that it was missing some rather important features - namely efficient memory usage, mime types and conditional gets.

This time, we'll get someone else to do all the heavy lifting for us. We're after a http handler that can serve static files. Asp.net has StaticFileHandler. Isn't that handy?

StaticFileHandler is how you can serve html, css, gif and jpg files when you route all files through asp.net. It pretty much does everything we want it to - it uses HttpResponse.WriteFile for small files and HttpResponse.TransmitFile for large files. It uses the undocumented and internal System.Web.MimeMapping class to map between an extension and it's mime type (hard coded, surprisingly enough) and most of all, it returns etags. It even ensures the file exists and you've got permissions and has code to handle resuming broken downloads. We've hit the jackpot. Almost.

So how can we use this? I was going to present a code sample here, but that "almost" above is stopping me. This isn't the post I set out to write - I can't advocate this method (which is a shame, 'cos I liked it - it was simple, looked like it worked, and was only slightly hacky). But i'm going to tell you about it anyway.

I created a class I called RewritingStaticFileHandler. It had a static constructor, which created an instance of StaticFileHandler and stashed it away (bizarrely, StaticFileHandler is internal, even though it's used in the default machine.config. Since it's in that file, I reckon it's fair game, so spin one up by reflection. This should have been a warning sign that things weren't going to work out too clever).

The ProcessRequest method of my new http handler called HttpContext.RewritePath, and passed in a new path in the App_Data folder with the file part coming from a query string parameter (danger Will Robinson!). It then called ProcessRequest on the stashed StaticFileHandler and voila - we're now serving files from App_Data, returning an etag, populating the mime data and I've only written 5 or 6 lines. Lovely. Almost.

Anyone spot the security hole? I'm setting the rewrite path to "~/App_Data/Images/" + Context.Request.QueryString["imageId"]. Pass in ../password.txt and it's going to serve the password.txt file from the App_Data directory. Not good.

This is quite straightforward to fix:

private void EnsureCanonicalPath(string path)
  string mappedPath = Context.Server.MapPath(path);
  if (Path.GetFullPath(mappedPath) != mappedPath ||
    throw new HttpException(404, "Not found.");

Turn the path into a physical path, call GetFullPath (which always returns a canonical path) and ensure it starts with the physical path of the images directory. This way you're ensuring that you only serve files from the images subdirectory.

So why am I not happy with this approach? Well, it's a little bit hacky using reflection, but that's not it. I fired up Reflector to check what StaticFileHandler did under the covers, and I don't like what I see. It has a number of bugs. Where do I start?

There is a method called SendEntireEntity, which returns a bool. This is the only time the incoming etag header is checked. If you're requesting a range, this method checks the etag header and if etags match, returns false - send a range. Otherwise it returns true. The return value is ignored and it always sends back the entire file.

Any conditional get headers, such as If-Unmodified-Since, Unless-Modified-Since, etc are not checked, so the etag is useless there.

The etag value itself is derived from the last modified time of the file concatenated with the current time. This is generated at the start of each request. So, if you are asking for a range, you'll pass in an etag you received when you started downloading the file. This etag will include the time of the request. The new request will get a new etag. So the etags are always different. Again, useless.

So why did it look like it was working? For small files, it sends back a cache expiry date for a day hence, so IE just wasn't bothering to request the file again. But when it will request the file, the handler will just return it in full.

I'm not very happy about this. I was expecting StaticFileHandler to be a bit better than this. Having this kind of problem has lots of implications - I'm going to have to make some changes at work based on this. Bah. Not the post I wanted to write. Still, it means there's another one to add to the queue - BetterStaticFileHandler.

And where does that leave serving files from App_Data? Well, this solution will still work, it's just not the most efficient - definitely not something I'd recommend in an enterprise, but probably not too much of a problem for a blog. It's also not the nicest solution, so let's have a look if there's anything more elegant. Let's have a look at Virtual Path Providers.



Serving files from App_Data #1 - the naive approach.

by Matt 5. October 2006 03:52

Don't worry, I've not gone crazy.

Asp.net 2 doesn't let you serve files from the App_Data folder, and for good reason - security. It's a folder your web app has write access to (a standard place on all installations without having to get your ISP involved - halellujah!) so in goes your databases and various other data files. You don't want anyone to be able to just download these, right? Right. Well, most of the time, anyway.

If you're dealing with a site that has mostly user-generated content, you're going to have to write a lot of data. The place of least friction for this data is of course App_Data. Think of a blog site (cough, SingleUserBlog, cough); all of the blog posts are going to be stored in the App_Data folder, perhaps as flat files or a database. And this is fine, because you don't serve those files directly - they're content to go in a page. But what about uploads? How can I include images in my blog? Uploading them is no problem, just chuck them in App_Data. But I can't serve them.

This is exactly the kind of solution a custom http handler is intended to solve. And the naive approach is to simply call HttpResponse.WriteFile and congratulate myself on a job well done.

So what's wrong with this?

Firstly, WriteFile reads the whole file into memory before chucking it to the client. For large files, it can fail. This isn't the best. But it can be worked around. I can use HttpResponse.TransmitFile to stream direct to the client. Or I could easily roll your own (this Microsoft KB article shows how).

Secondly, I'd have to add my own logic to setup the mime types. Boring, but still not reason enough to call it naive.

The third reason is the biggie. It's the single biggest mistake you see with custom http handler implementations, and that's forgetting caching and conditional gets. Bandwidth isn't free, and yet most http handlers will just serve the file. Again and again and again. To do things properly, I'd need to create and check etag values, and handle the various combination of the http headers (If-Unmodified-Since, Unless-Modified-Since, etc).

I could make this point into a blog post by itself. Fortunately, Kent Sharkey has already done it. Go read. Please.

Wouldn't it be nice if I could get someone else to do all this heavy lifting for me? (And I still haven't forgotten security.)




Month List


Comment RSS