Scott on Writing

Musings on technical writing...

Please Don't Forget - URLs are a Public Interface!

When designing software that can be consumed by third-party applications, one rule is very important: public interfaces should not introduce breaking changes, ever.  Never ever ever, not in a million years ever.  If you release an upgrade with breaking changes to the public interface, those third-party apps that relied on your published interface will fail, and that will piss off two groups of people:

  1. The creators of the third-party applications, and
  2. The users of the third-party applications

I put the third-party application creators ahead of the users because they will be the more upset of the two: those peeved users will direct their anger at the third-party application, even though it was just abiding by the documented interface.

This all seems like common sense, no?  In COM development back in the day, that's all developers heard - if you must change the interface, you need to reversion.  Ditto for Web services developers today.  What is a bit baffling is why people don't treat website URLs as public interfaces that may be consumed by third-party applications, because that's precisely what they are.  The third-party applications are the other Web pages that link to the URL.

There's never a justifiable reason for a URL that once existed to return a 404.  A URL that does so breaks the public interface, a contract implicitly signed by the website creator when he or she created the Web page.  I don't care if you rearchitected your entire site.  I don't care if it's a URL for a product you no longer sell; in that case, display a page kindly explaining that the product is no longer for sale, with links to similar products or categories you do sell, or -gasp- a link to a site that sells the product.  A contract is a contract is a contract.  The only excuse for a URL's death is if the company running the website goes out of business, and even then it's a weak excuse.

When you let a URL die, its death ripples to all sites around the world that link to the URL.  Users visiting those sites will now experience broken links, and blame not you, Mr. “I Don't Abide By My Public Interface,” but the site that linked to you expecting you to uphold your end of the bargain.  Do you know how difficult it is to fix broken links?  Sure, if you only have a few dozen broken links, no biggie.  But what happens when you have thousands of Web pages with links to a site like, say, MSDN, and then one day MSDN decides to rearchitect the site and all those links that used to work no longer work?  What do you do when you start getting a torrent of emails saying, “These links are broken, what's wrong with you?”  You start fixing them, naturally, one at a time, but that is painful, slow, prone to error, and totally unnecessary.  (I use MSDN as an example because this is precisely what happened about two years ago.  I wish I was kidding.)

Yes, I know there are technological solutions one could utilize to aid in finding broken links quickly, but the point remains: it's work/time/effort/energy that shouldn't need to be done in the first place!  And, sure, there are other approaches one could take when linking to others' sites, such as by adding a layer of indirection.  For example, rather than linking directly to an off-site URL, link to a redirect page on site, like /redir.aspx?ID=x, where x is some ID field in a database table that ties the link to the URL.  That way, if there are multiple links to a single now-defunct URL across many pages, all pages can be updated by updating the appropriate database record.  While indirection has some nice side benefits - link click tracking, for example - it runs counter to the notion of the World Wide Web, in my opinion.  Plus, search engines might not pick up on the link or give it a proper context.
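To make the indirection idea concrete, here is a minimal sketch in Python rather than ASP.NET. It assumes a hypothetical `links` table that maps an ID to the off-site URL, and a `resolve_redirect` helper standing in for whatever /redir.aspx would do; the table layout, function names, and status codes are illustrative choices, not anything prescribed above.

```python
import sqlite3


def create_link_table(conn):
    # Hypothetical schema: one row per off-site URL that pages link to.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links ("
        "id INTEGER PRIMARY KEY, url TEXT NOT NULL)"
    )
    conn.commit()


def add_link(conn, url):
    # Register an off-site URL; pages then link to /redir?ID=<returned id>.
    cur = conn.execute("INSERT INTO links (url) VALUES (?)", (url,))
    conn.commit()
    return cur.lastrowid


def update_link(conn, link_id, new_url):
    # One UPDATE repairs every page that links through this ID.
    conn.execute("UPDATE links SET url = ? WHERE id = ?", (new_url, link_id))
    conn.commit()


def resolve_redirect(conn, link_id):
    # Return (status, location) for the redirect page to send back.
    row = conn.execute(
        "SELECT url FROM links WHERE id = ?", (link_id,)
    ).fetchone()
    if row is None:
        return 404, None  # the ID was never issued
    return 302, row[0]
```

If the target site moves a page, a single `update_link` call fixes every page that links through that ID, which is the whole appeal of the approach. The costs mentioned above still apply: the real destination is hidden behind your own URL.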

And you know, even if you do rearchitect your site, you can still save those old URLs.  It's called URL rewriting, and it can be done in a number of ways, from ISAPI filters to 404 handlers that do an automatic redirect to the new URL.  Yes, this could add a good deal of work to the rearchitecting of a large site, but that's to be expected when you have such a large public interface.  And regardless of how long it takes, that time and effort will pale in comparison to the energy all of those linking to the site would have to spend fixing broken links.
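The 404-handler flavor of this can be sketched in a few lines of Python. This is only an illustration under assumptions: the `EXACT_MOVES` and `PREFIX_MOVES` tables and every path in them are made up, and a real site would wire `handle_missing` into whatever its Web server calls when a page isn't found.

```python
from urllib.parse import urlsplit

# Hypothetical mapping of retired URLs to their new homes.
EXACT_MOVES = {
    "/articles/urls.asp": "/blog/urls-are-a-public-interface",
}

# Hypothetical rules for whole sections that moved: (old prefix, new prefix).
PREFIX_MOVES = [
    ("/library/old/", "/docs/"),
]


def handle_missing(request_url):
    """Return (status, location) for a URL the new site no longer serves."""
    parts = urlsplit(request_url)
    new_path = EXACT_MOVES.get(parts.path)
    if new_path is None:
        for old_prefix, new_prefix in PREFIX_MOVES:
            if parts.path.startswith(old_prefix):
                # Keep the rest of the path, swap only the moved prefix.
                new_path = new_prefix + parts.path[len(old_prefix):]
                break
    if new_path is None:
        return 404, None  # a URL that never existed
    if parts.query:
        new_path += "?" + parts.query  # carry the query string along
    return 301, new_path  # permanent redirect to the new URL
```

For example, `handle_missing("/library/old/en-us/page.aspx?view=full")` yields a 301 to `/docs/en-us/page.aspx?view=full`, so every inbound link keeps working. Only URLs that were never part of the public interface fall through to a 404.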

Ok, now that I got all that out, I feel better.  :-)

posted on Tuesday, April 12, 2005 6:23 PM

Feedback

# Re: Please Don't Forget - URLs are a Public Interface! 4/12/2005 8:19 PM haacked@gmail.com (Haacked)

Preach it brother!

# re: Please Don't Forget - URLs are a Public Interface! 4/13/2005 7:44 AM Andrew Johns

it's just a shame that Microsoft don't practice what you preach. Their content on MSDN regularly changes, making it very difficult sometimes to find what you're after, like the article on accessibility and asp.net that you've written. GRR!

# re: Please Don't Forget - URLs are a Public Interface! 4/13/2005 7:59 AM Scott Mitchell

Andrew, what prompted me, in part, to write this is that I find about one article of mine a week on MSDN that leads to a broken URL. Meh. The content's still there (except in the case of the accessibility article - anyone can email me and I'll send a Word doc over), but it's just mysteriously changed its URL. I just update the broken links at http://www.4guysfromrolla.com/ScottMitchell.shtml, but the links on the right of this blog and on other articles I've authored are probably sitting there broken right now, as I type this. <grumble, grumble... />

# Comment Spam - Scripts or Brute Force? 4/14/2005 11:49 AM Scott on Writing

# re: Please Don't Forget - URLs are a Public Interface! 4/21/2005 11:37 PM Brian Bischof

Too bad Sys-Con (Dot Net Dev Journal, Java Dev Journal, etc. etc.) doesn't read your blogs. They just revamped their website and the links that I see on websites to their stories don't work anymore. Idiots.

# re: Web aesthetics: don't use DLLs in your URLs 5/1/2005 11:38 AM Jason Salas' WebLog

# re: Please Don't Forget - URLs are a Public Interface! 9/28/2005 11:49 PM paul graham

if you want to get rid of comment spams mail me the samples and i will bring up a tool to fight comment spams at rargdgak@yahoo.com

# Comment Spam - Scripts or Brute Force? 12/16/2006 3:18 AM Mirror blog entries from the industry

I've always assumed that comment spammers are using scripts to spread their evil, evil comment spam.

# March's Toolbox Column Online 3/15/2008 1:59 PM BusinessRx Reading List

After a three month hiatus, I am back to authoring the Toolbox column for MSDN Magazine. (Thanks to


I am a Microsoft MVP for ASP.NET.
I am an ASPInsider.