URLs should express output-format, not source-format
I’m the kind of guy who always tries to optimize the way I work, especially when it comes to programming. Over the last few years I’ve worked with (and against) a multitude of CMSs, frameworks and APIs, and I often find myself thinking about how things could be better, easier and more intuitive.
One of these things is how to format a URL.
This post isn’t really focused on server technicalities; it is more a discussion of an idea. I’ll nonetheless assume that the reader has some familiarity with MIME types and basic webserver settings (like handlers and URL-rewriting techniques).
As far as I’m concerned, any URL should give a hint of what you will find there. By stripping the file extension and being left with something that looks like a directory, or by using the scripting language’s extension, I believe we work against this. File extensions give an important hint about what the content is. What do you expect when you see “.jpg”, “.gif” or “.png”? Yes, images. Then what about “.html”, “.txt” and “.odt”? It’s most likely some kind of text, perhaps formatted. So what do “.php”, “.py” and “.aspx” tell us? Not much: they are mostly a directive to the webserver, telling it to use some kind of parser to compile or process the contents before outputting them. And the output might be HTML, XML or PDF as well as anything else.
A couple of years ago such an extension had a certain coolness about it, saying “I know how to develop fancy websites”, but nowadays people mostly expect some kind of interactivity anyway.
All relevant webservers today have the possibility to map any extension to any (supported) language parser. E.g. with Apache you can use the AddHandler directive like this (assuming your PHP installation registers the php5-script handler):
AddHandler php5-script .html .xml .json
– to parse all .html, .xml and .json files in the given directory as PHP 5.
Now, let’s say you have an API which allows you (or perhaps other users) to query a service for information, and which can deliver the result as XML, JSON or HTML. This is usually solved by adding a parameter like “of” (output format) to the request to tell which format you’d like. I prefer to keep everything as logical as possible and to reduce the need to look things up in manuals as far as is reasonable, so instead of sending GET requests like this:
http://example.com/gateway.php?id=15&key=11341&of=xml
http://example.com/gateway.php?id=15&key=11341&of=json
http://example.com/gateway.php?id=15&key=11341&of=html
I’d prefer:
http://example.com/gateway.xml?id=15&key=11341
http://example.com/gateway.json?id=15&key=11341
http://example.com/gateway.html?id=15&key=11341
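To make the idea a bit more concrete, here is a minimal sketch of how a gateway could pick the output format from the extension in the URL path rather than from an “of” parameter. It is written in Python purely for illustration; the function names (output_format, render), the MIME_TYPES table and the shape of the record are my own assumptions, not part of any particular framework.

```python
import json
import posixpath
from html import escape
from urllib.parse import parse_qs, urlsplit
from xml.etree import ElementTree as ET

# Hypothetical mapping from URL extension to the MIME type of the response.
MIME_TYPES = {
    ".xml": "application/xml",
    ".json": "application/json",
    ".html": "text/html",
}


def output_format(url):
    """Return (extension, query parameters) for a request URL."""
    parts = urlsplit(url)
    _, ext = posixpath.splitext(parts.path)
    ext = ext.lower()
    if ext not in MIME_TYPES:
        raise ValueError(f"unsupported output format: {ext!r}")
    # parse_qs returns lists; keep only the first value of each parameter.
    params = {k: v[0] for k, v in parse_qs(parts.query).items()}
    return ext, params


def render(record, ext):
    """Serialize one record (a flat dict of strings) in the requested format."""
    if ext == ".json":
        body = json.dumps(record)
    elif ext == ".xml":
        root = ET.Element("record")
        for key, value in record.items():
            ET.SubElement(root, key).text = value
        body = ET.tostring(root, encoding="unicode")
    else:  # ".html"
        rows = "".join(
            f"<dt>{escape(k)}</dt><dd>{escape(v)}</dd>" for k, v in record.items()
        )
        body = f"<dl>{rows}</dl>"
    return MIME_TYPES[ext], body


ext, params = output_format("http://example.com/gateway.xml?id=15&key=11341")
content_type, body = render({"id": params["id"]}, ext)
print(content_type)
print(body)
```

The point of the sketch is that the same script serves all three URLs; only the extension decides the serialization and the MIME type, and a request for an unknown extension is rejected up front.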
There are some obvious problems with this approach:
1: What if you want different parsers (let’s say PHP and Python) for different files that serve the same kind of output? How do we tell the server which one to use when?
– Possible solutions are to either separate them into different folders (the cleanest approach), or to expand the extension to e.g. .php.html and .py.html, perhaps in combination with some mod_rewrite-style tricks.
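As a rough sketch of the double-extension variant, the last extension can name the output format and the one before it the parser. The PARSERS registry and the split_extensions function below are hypothetical names made up for this example; a real setup would more likely do this in the webserver configuration.

```python
import posixpath

# Hypothetical registry of parser extensions.
PARSERS = {".php": "php", ".py": "python"}


def split_extensions(path):
    """Split e.g. 'report.py.html' into (parser, output extension).

    The parser is None when the path carries only an output extension.
    """
    stem, output_ext = posixpath.splitext(path)
    _, parser_ext = posixpath.splitext(stem)
    parser = PARSERS.get(parser_ext.lower())
    return parser, output_ext.lower()


for path in ("/api/report.php.html", "/api/report.py.html", "/api/report.html"):
    print(path, split_extensions(path))
```

With a rewrite rule in front, the visible URL could still be the plain .html one, with the parser part only used internally.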
2: What about the overhead for static files, which are still sent through an interpreter although nothing in them changes?
– If you are running a service which is sensitive to these kinds of margins, you should use dedicated servers or CDNs to deliver the static content no matter what.
3: How far should we take it? Should we use extensions like .html5 and .xhtml, etc.?
– I do believe that these differences are mostly relevant to the software parsing the content, so in these cases it may suffice to stick with e.g. .html and use the MIME type to convey further details.
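A small sketch of that last point: the URL keeps the .html extension, while the Content-Type header tells the client which variant it actually gets. The CONTENT_TYPES table, the variant names and the content_type function are assumptions made for the example.

```python
import posixpath

# Assumed mapping: (URL extension, markup variant) -> Content-Type header value.
CONTENT_TYPES = {
    (".html", "html5"): "text/html; charset=utf-8",
    (".html", "xhtml"): "application/xhtml+xml; charset=utf-8",
}


def content_type(path, variant):
    """Pick the Content-Type for a path whose extension stays '.html'."""
    ext = posixpath.splitext(path)[1].lower()
    return CONTENT_TYPES[(ext, variant)]


print(content_type("/docs/intro.html", "html5"))
print(content_type("/docs/intro.html", "xhtml"))
```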
Although my examples might not pinpoint all the benefits, I do believe there is something to the idea of keeping the file extension and letting it describe the output, both to assist external users and to reduce some clutter.
What are the pros and cons of an approach like this? I’ve barely touched on a few in this post, but I hope I was able to mention some of the most essential ones.