January 2004 - Posts

They Say Imitation is the Sincerest Form of Flattery
29 January 04 04:54 PM | Scott Mitchell

Recently both Microsoft and Yahoo! released toolbars akin to Google's Toolbar. All three companies provide pop-up blocking and quick searching, but the nice thing about Yahoo! and Microsoft is that their toolbars each have a little icon that informs you when you have new Yahoo! Mail or Hotmail - the advantage of having an email offering at your portal. Not to be outdone in the art of copying the ideas of others without really adding any benefit whatsoever, Google is planning on launching its own Web-based email system, akin to Hotmail.

Personally, I use GoogleBar, a Google Toolbar clone that is designed for Mozilla and Firebird. (The standard toolbars from Microsoft, Yahoo!, and Google work only in IE 5+.) Trillian is nice enough to let me know when I have new Hotmail, so I see no need to start using the Microsoft Toolbar.

You can get Microsoft's Toolbar at http://toolbar.msn.com/, and Yahoo's Toolbar at http://toolbar.yahoo.com/

Are Today's RSS Aggregators Too Bloated?
28 January 04 12:54 PM | Scott Mitchell

I was reading Wesner Moise's blog earlier this morning and, while reading Are Objects Cheap, I stumbled upon this observation:

I use SharpReader, a RSS reader that is written in managed code and consumes a large amount of memory--almost 200 MBs on a typical session.

This reminded me of the days I used SharpReader, which have long since passed for the very reason Wes points out - SharpReader is a memory hog. I had no better luck with RssBandit, and have since switched to using BlogLines, since it takes up only the same resources as my browser.

Note: I have not tried RSS aggregators other than RssBandit and SharpReader, so the following comments apply only to these two aggregators...

Why are RSS aggregators so bloated? Are they too bloated? I think Luke and Dare, the creators of SharpReader and RssBandit, aren't too worried about bloat because their primary customers are fellow developers, who have dev machines with oodles of RAM. Or maybe they are concerned about bloat, but instead focus their limited time on adding new features, keeping up with the specs, and improving other features of their products.

For me, I don't see myself returning to either of these two aggregators until the bloat problem is solved or I get more RAM. With multiple browsers open (IE and Firebird, with multiple windows of each), Outlook, several instances of VS.NET, SQL Enterprise Manager, Trillian, and other programs, my computer is already pushing its memory limits. In the past, my machine would hum along nicely if I didn't actually bring the aggregators to focus - their large memory requirements were happily paged to disk. But once I clicked on, say, SharpReader, I'd have the ol' five second wait as Windows brought the megs of memory that had been offloaded to disk back into main memory.

Ideally, an RSS aggregator would be able to keep pace with Outlook's memory requirements. Granted, these RSS aggregators are managed code and not fine-tuned to reduce memory consumption, but still, Outlook is holding several thousand email messages, has Calendar features, uses numerous plugins like Spambayes, and still manages to keep itself around 50 MB.

Let me close by saying that I do appreciate Luke and Dare's efforts - I'm not bashing them or their work, just providing constructive criticism.

Giving SpamBayes a Try
27 January 04 02:52 PM | Scott Mitchell

In an earlier post of mine, Building a Challenge/Response Spam Blocking System, I examined a simple challenge/response anti-spam system I built to halt the deluge of spam that clogs my inbox each day. In fact, I wrote a short article detailing its inner workings, along with observations and commentary: Stopping Spam.

While the challenge/response system was effective in reducing my spam intake from about 100 messages a day to around 1 or 2 messages a day, the approach, in my estimation, was not ideal. One big disadvantage was that fewer people took the time to respond to the challenge email than I had anticipated. The reasons for this, I deduce, were two-fold. Some people don't want to take the time to follow instructions for a challenge email - maybe their message wasn't that important after all, maybe they're busy, or maybe they just don't like being told what to do. These people's messages, I reckoned, weren't that vital. I mean, if you can't take two seconds to respond to the challenge, then just how important is that email you're sending me?

What worried me, and led me to suspend my C/R anti-spam system, is that I noticed some people weren't responding to the challenge email because they never received it! This unfortunate circumstance could happen if their own spam blocking solution halted my challenge email. (A couple folks informed me that Outlook 2003 categorized my challenge emails as spam. Others using a similar challenge/response anti-spam system would never get my challenge as my challenge would generate a challenge on their side.)

I've been using the Spambayes Outlook Plugin for the last couple of days and have been impressed with the results. Spambayes identifies spam using Bayesian techniques, which essentially means it relies on Bayes' Theorem to determine whether a message is spam. Bayes' Theorem was postulated by Reverend Thomas Bayes in the 18th century, and gives a formula for determining conditional probability. It's useful for answering questions like, "If we know that someone voted for Bush in 2000, what is the probability that he lived in Texas?" Let T be the set of people who live in Texas and let B be the set of people who voted for George Bush. We define the probability that our voter lives in Texas given that he voted for Bush, denoted P(T | B), as P(T ^ B) / P(B). Here, P(T ^ B) is the probability that a random US citizen both lives in Texas and voted for Bush, and P(B) is the probability that a random US citizen voted for Bush.
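
Written out, the definition above looks like this. Here \wedge plays the role of the ^ in the text, and the second equality is the standard statement of Bayes' Theorem, which the post itself does not spell out:

```latex
P(T \mid B) = \frac{P(T \wedge B)}{P(B)} = \frac{P(B \mid T)\, P(T)}{P(B)}
```

The right-hand form is handy because it recovers P(T | B) from P(B | T), which is often the easier of the two to estimate from data.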

Now, how can this be used to help stop spam? The way Spambayes works is by tokenizing each and every incoming email. It then looks up each token in an internal database to determine how likely it is that the token belongs to spam or ham (ham being non-spam). Spambayes parses each incoming email (and its headers) and asks, "If an email has token x, what is the probability that it is spam?" So, I might get a piece of Viagra spam and Spambayes would find tokens like 'V1agra', 'all night!', 'pleasure her', and so on, and, based on its knowledge from past spams and hams, it would deduce mathematically that there was a high probability that this was a spam message itself. Pretty cool.

Spambayes allows you to view the "score" for each email - namely, the probability that the email is spam. It shows how the email was tokenized and how often each token appears in spam vs. ham. Can you guess what words correlate highly to ham for me? Since I receive a lot of email from listservs, where people post code, an email full of code syntax would be marked as ham. For example, the token 'DataGrid' has appeared in over 50 ham messages, but no spam messages, so there's a strong correlation between that token and ham. The token 'dim' has appeared in over 180 hams, and only one spam; 'dynamic' in 99 hams, 3 spams; 'template' in 63 hams, no spams; 'dataset' in 61 hams, 0 spams; 'ddl' in 392 hams and 13 spams; 'function' in 160 hams and not a single spam.

If you use Spambayes, your hammy tokens would, of course, be different from mine, since it adapts to the emails you receive. So now spammers know how to break my filters - flood their spams with words like function, datagrid, dataset, dim, and so on! :-)
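
To make the per-token idea concrete, here is a rough Python sketch that turns the ham/spam counts quoted above into a "how spammy is this token" estimate. It is a simplified illustration, not Spambayes's actual scoring: it assumes the filter was trained on equal volumes of ham and spam, uses a made-up smoothing rule (`prior` and `strength` are hypothetical parameters), and treats "over 50" and "over 180" as exactly 50 and 180.

```python
# Token counts quoted in the post: token -> (ham_count, spam_count).
# "over 50" and "over 180" are taken as 50 and 180 for illustration.
TOKEN_COUNTS = {
    "datagrid": (50, 0),
    "dim": (180, 1),
    "dynamic": (99, 3),
    "template": (63, 0),
    "dataset": (61, 0),
    "ddl": (392, 13),
    "function": (160, 0),
}


def token_spam_probability(ham_count, spam_count, prior=0.5, strength=1.0):
    """Estimate P(spam | token) from raw counts.

    Assumption: ham and spam were seen in equal volume, so the raw
    estimate is simply spam_count / (ham_count + spam_count). The raw
    estimate is pulled toward `prior`; the fewer times the token has
    been seen, the stronger the pull.
    """
    total = ham_count + spam_count
    if total == 0:
        return prior  # never-seen token: no evidence either way
    observed = spam_count / total
    return (strength * prior + total * observed) / (strength + total)


def rank_tokens(counts):
    """Return (token, probability) pairs, hammiest token first."""
    scored = {
        token: token_spam_probability(ham, spam)
        for token, (ham, spam) in counts.items()
    }
    return sorted(scored.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    for token, probability in rank_tokens(TOKEN_COUNTS):
        print(f"{token:10s} {probability:.4f}")
```

A real filter then has to combine the per-token numbers into a single score for the whole message; that combining step is deliberately left out here.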

skmMenu 2.0 and SkmMenu.com Web Site
24 January 04 05:00 PM | Scott Mitchell

Several months ago I created skmMenu, an ASP.NET menu control, for a two-part article series I wrote for the ASP.NET Dev Center on MSDN. I decided to post the source up on a GotDotNet Workspace so readers of the article could enhance the functionality themselves and, hopefully, leave their work behind so others could benefit from it. There have been a handful of developers who have really stepped up to the plate and have helped enhance skmMenu, most notably Robert Vreeland.

With that being said, I'm proud to announce skmMenu 2.0, along with its own Web site, skmMenu.com. If you're looking for a free, open-source ASP.NET menu control, check out skmMenu. You can see it in action in the skmMenu.com Web site (the top-level menu and the left-hand side “quick menu” are both examples of skmMenu in action).

(If you find any typos/broken links/errors on the skmMenu.com Web site, or have any suggestions on how to improve its usability or functionality, please let me know. The site's still very light on examples and FAQs, although there are some examples online as well as nine downloadable examples. Also, there is online documentation along with a compiled help file. (Thank you, NDoc!))

Where's My Humility?
21 January 04 10:22 AM | Scott Mitchell

Stephen Ibaraki recently conducted an interview with yours truly, which you can read at http://www.stephenibaraki.com/cips/jan04/smit.asp. The interview talks more about writing and teaching and computing than specific technologies.

(FYI, the interview doesn't display properly in Mozilla Firebird. Stephen's Web server returns a Content-Type HTTP response header of text/plain, so Firebird displays the raw HTML source in the browser window. IE, however, ignores this and renders the HTML...)

The Process of Writing
16 January 04 12:11 PM | Scott Mitchell

The other day I was talking with noted author and speaker Michele Leroux Bustamante (with whom I teach a “Fundamentals of Web Services” class) about how one prepares and researches for an article. Michele said that she spends several days on an article, drawing on articles, newsgroups, and communications with contacts in Microsoft, and, of course, crunching out code and seeing what happens. She said her technique was to focus 100% of her time on one particular article until she had it finished. Her comments piqued my interest, since I am on the opposite end of the spectrum. I shared my style with Michele, but thought it might also be useful to other writers to share my approach here...

I find my best writing occurs when there are several weeks between initially being given the writing assignment and actually starting to write. I also find that writing sometimes goes best if spread out over several days. If I am asked to write about a topic of which I have only a cursory understanding, I usually start by spending an hour or two a day for a week reading articles, trying out code examples, and just playing around with the concepts and specifics to broaden my understanding. Following that, I find it helps to continue poking around the Web for information, but, at the same time, I start formulating an article outline mentally. I rarely find myself writing down the outline; usually it's just a bunch of 1s and 0s in my brain, but I lay out the flow of the article mentally before writing. I'll usually do this to fill time, like if I'm waiting in line or taking a shower, before falling asleep or right after waking up, or when walking the dog. For topics I have an intimate understanding of, such as the ASP.NET DataGrid, I usually forgo the initial research (since it's been done in countless hours past) and instead spend a day or two mapping out the article in the noggin'.

Once I have a mental vision of the article, and have had sufficient research time, I begin writing! Writing can vary from hours to a week, depending on my knowledge of the subject matter, the length of the article, and the complexity of the material. For example, I find myself taking close to a week to write each installment of my Extensive Examination of Data Structures article series, due in part to its length (Part 4 is over 8,000 words long with 16 figures!!) and the difficulty in describing a potentially complex topic in an easy-to-understand manner. I rarely find myself refactoring what I write. That is, I hardly ever go back and rearrange the presentation of the material, or strike a paragraph, or add extra content. If I do make edits to previously written content, it's usually because at some point later in the same article I realize I need to talk about something that requires an explanation beforehand.

Also, ever since my first book I've found that very little changes between my first draft and the final draft. For books, typically two grammatical editors and at least one technical editor will go through the manuscript, making suggestions, changes, and fixes, but usually the only fixes are slight grammatical ones. I humbly attribute this to good writing, not lazy editors. I've worked as a technical editor on two computer trade books before and, trust me, there can be a lot of work for an editor, especially with newer authors.

Polymorphism in ADO.NET
14 January 04 08:06 AM | Scott Mitchell

Maxim Karpov has recently written a very good blog entry titled: Polymorphic Behavior of [the] ADO.NET Object Model. Maxim begins his article with an examination of what, exactly, polymorphism is and then turns to illustrating how polymorphism is achieved in ADO.NET through interfaces (IDbConnection, IDbCommand, and so on). Maxim's article uses the Data Access Application Block 3.0 as a case study, showing how the DAAB 3.0's abstract class factory allows for the DAAB to work with any data store (not just MS SQL Server).
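
For readers who want a feel for the idea before reading Maxim's piece, here is a small Python analog of interface-based polymorphism with a factory. It is only a sketch: the class names, the provider keys, and the `create_connection` helper are all hypothetical, and none of this is ADO.NET or DAAB code.

```python
from abc import ABC, abstractmethod


class DbConnection(ABC):
    """Hypothetical stand-in for an interface such as IDbConnection."""

    @abstractmethod
    def open(self) -> str:
        """Open the connection and describe what was opened."""


class SqlServerConnection(DbConnection):
    def open(self) -> str:
        return "opened SQL Server connection"


class OleDbConnection(DbConnection):
    def open(self) -> str:
        return "opened OLE DB connection"


# Hypothetical registry the factory uses to pick a concrete class.
_PROVIDERS = {
    "sqlserver": SqlServerConnection,
    "oledb": OleDbConnection,
}


def create_connection(provider: str) -> DbConnection:
    """Factory: callers ask for a provider by name, never by class."""
    try:
        return _PROVIDERS[provider]()
    except KeyError:
        raise ValueError(f"unknown provider: {provider}") from None


def connect(connection: DbConnection) -> str:
    """Works with any DbConnection; this is the polymorphic call site."""
    return connection.open()
```

Calling `connect(create_connection("oledb"))` runs the OLE DB version of `open()` without `connect` ever naming that class, which is the property that lets one data access layer work against more than one data store.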

A good read for those new to object-oriented programming and wanting more of a background in one of the cornerstones of OOP (polymorphism), as well as a must-read for those who are using or are planning on using the latest version of the DAAB.

Ten Mistakes Writers Make
13 January 04 10:15 PM | Scott Mitchell

Pat Holt, former Book Review Critic for the San Francisco Chronicle, has a good piece on her Web site titled Ten Mistakes Writers Don't See (But Can Easily Fix When They Do). Many of the tips apply to authors writing fiction, such as tips on describing scenes and recommended use of dialog. But there are numerous good tips for non-fiction writers as well.

One of the ones I identified with was Mistake #1, Repeats:

Just about every writer unconsciously leans on a "crutch" word. Hillary Clinton's repeated word is "eager" (can you believe it? the committee that wrote "Living History" should be ashamed). Cosmopolitan magazine editor Kate White uses "quickly" over a dozen times in "A Body To Die For." Jack Kerouac's crutch word in "On the Road" is "sad," sometimes doubly so - "sad, sad." Ann Packer's in "The Dive from Clausen's Pier" is "weird."

Crutch words are usually unremarkable. That's why they slip under editorial radar - they're not even worth repeating, but there you have it, pop, pop, pop, up they come. Readers, however, notice them, get irked by them and are eventually distracted by them, and down goes your book, never to be opened again.

I have some crutch phrases I find popping into my articles and books. For example, I often follow potentially complex statements fraught with technical jargon with a sentence starting: “That is,” followed by a simpler explanation in everyday English. I like to pepper my writing with many examples, but have trouble finding verbiage to use other than, “For example,” to start an example. (See the second sentence in this paragraph for a prime example...) I also seem to be fixated on starting sentences with, “Realize that...”

HTTP Compression
12 January 04 02:26 PM | Scott Mitchell

The topic du jour for the ASP.NET developer blogging community seems to be HTTP compression. HTTP compression involves the Web server compressing its HTTP response message using any one of a number of standard compression routines. This compressed message is then transmitted to the requesting client where the client decompresses it and then does whatever it wanted to do with the data (display it in a browser, save it as a file, etc.).

When a Web browser makes a request to a Web server it can send along an Accept-Encoding header with a comma-delimited list of the compression types it accepts (the common ones being gzip and deflate). The Web server can then compress the response on the fly for these compression-aware Web browsers. The cost of compression is a bit of extra processing time at the server and client to compress and decompress, respectively. The benefit is the decreased payload leaving the Web server, meaning the Web site uses less bandwidth overall and the data gets to the client sooner, which can be noticeably sooner for broadband-challenged visitors. The savings from compressing HTML documents vary with the degree of compression used and the compressibility of the data. I've seen companies and individuals tout numbers ranging from 20% to 60%. For example, James Avery noted a 45% savings by enabling compression on IIS. (For a more in-depth discussion of HTTP compression, be sure to read: HTTP Compression Speeds Up the Web.)
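
The negotiation described above can be sketched in a few lines of Python using only the standard library. This is an illustration of the idea rather than how IIS or any particular server does it: the function names are made up, gzip is arbitrarily preferred over deflate, and q-values in the Accept-Encoding header (including q=0, which means "not acceptable") are ignored for simplicity.

```python
import gzip
import zlib


def choose_encoding(accept_encoding: str):
    """Pick a compression type the client listed, or None.

    Simplification: any ";q=..." suffix is stripped and ignored.
    """
    offered = [
        part.split(";")[0].strip().lower()
        for part in accept_encoding.split(",")
    ]
    for encoding in ("gzip", "deflate"):  # arbitrary server preference
        if encoding in offered:
            return encoding
    return None


def compress_body(body: bytes, accept_encoding: str):
    """Return (payload, extra_headers) for the response."""
    encoding = choose_encoding(accept_encoding)
    if encoding == "gzip":
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    if encoding == "deflate":
        # HTTP's "deflate" coding is the zlib format, which is what
        # zlib.compress produces.
        return zlib.compress(body), {"Content-Encoding": "deflate"}
    return body, {}  # client did not offer anything we support


if __name__ == "__main__":
    html = b"<tr><td>Some repetitive HTML</td></tr>" * 500
    payload, headers = compress_body(html, "gzip, deflate")
    print(headers, len(html), "->", len(payload), "bytes")
```

How much a given page shrinks depends on the content; repetitive markup like the sample above compresses far better than a typical real page would.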

So how does one go about enabling compression on their Web server? There are third-party products that will fit the bill, PipeBoost and XCompress being two such products. With Windows Server 2003 and IIS 6.0, IIS natively supports HTTP compression. To learn how to configure IIS 6.0 for compression, be sure to read Scott Forsyth's IIS Compression in IIS 6.0 article, as well as Brad Wilson's article IIS 6 Compression and ASP.NET.

If you do not have access to your Web server, or you are not running IIS 6.0, you can still benefit from compression by using a custom ASP.NET HTTP module. Ben Lowery has created a free HttpCompressionModule class that you can download and start using in your ASP.NET applications. For example, Jeff Julian has integrated Ben's compression module with .Text, the blogging software used to run this blog and many others. Specifically, Jeff's modification of .Text uses compression just on the RSS syndication feed (Rss.aspx), since this is typically the most requested file and is responsible for the majority of bandwidth used on a .Text Web site by far...

When visiting a Web site with compression enabled, the end user normally cannot tell whether compression is being used. That's how it should behave, after all - everything should work just as it normally would. For more savvy end users, though, there is a means by which one can tell if the page they are viewing was sent over the wire as compressed data or not. The simplest way is to inspect the HTTP response headers. In Mozilla Firebird this is as easy as downloading the Live HTTP Headers extension. Look for a Content-Encoding header that has a value like gzip or deflate. This would indicate that the content was encoded using a particular compression algorithm. Another approach is to download a packet sniffer like Ethereal and capture the packets returned from the Web site. When inspecting the payload you'll see the HTML content looks like a bunch of gobbledygook - this is the compressed data.
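
If you have the raw response headers in hand (copied out of a header viewer or a packet capture, say), the check boils down to looking for one header. A minimal Python sketch, with a made-up function name and a made-up sample response:

```python
def content_encoding(raw_response_head: str):
    """Return the Content-Encoding value from raw HTTP response
    headers, lowercased, or None if the header is absent."""
    lines = raw_response_head.splitlines()
    for line in lines[1:]:  # lines[0] is the status line
        if not line.strip():
            break  # blank line marks the end of the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-encoding":
            return value.strip().lower()
    return None


if __name__ == "__main__":
    # Hypothetical response head, for illustration only.
    sample = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html\r\n"
        "Content-Encoding: gzip\r\n"
        "\r\n"
    )
    print(content_encoding(sample))
```

Header names are case-insensitive, which is why the comparison lowercases the name before matching.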

RssFeed Version 1.2 Released
08 January 04 05:11 PM | Scott Mitchell

Version 1.2 of RssFeed, an ASP.NET custom server control I created to display RSS syndicated content, is now available and ready for download at the RssFeed GotDotNet Workspace. I finally got around to completing the XML comments and, using NDoc, quickly built both online help files and a compiled help file (.chm). (NDoc is pretty cool and easy to use. I'd not used it before, and was impressed by how quick and easy it was to generate the help files.)

Version 1.2's main improvement is adding optional template support. I have some live demos of RssFeed you can check out to see the templates in action, including responding to button Command events. There are some other miscellaneous improvements as well, cleaning up the code and whatnot, adding comments, blah blah blah.

One feature that is still missing is proxy support. Users who connect to the Internet via a proxy will get "Underlying connection was closed" error messages. The fix, I believe, is as simple as adding a WebProxy class to download the data, but I don't use a proxy so I don't know how I'd go about testing it. Hopefully someone who uses RssFeed and who has tweaked it to work on their proxy setup can send me the pertinent code...
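
The general shape of such a fix, routing the feed download through an explicitly configured proxy, looks like this in Python's standard library. To be clear, this is an analog for illustration and not RssFeed code: the function names and the proxy address are made up, and the .NET version would use WebProxy as mentioned above.

```python
import urllib.request


def build_feed_opener(proxy_url=None):
    """Build a URL opener, optionally routed through a proxy.

    With no proxy_url, urllib falls back to its default behavior,
    which picks up any proxy configured in the environment.
    """
    handlers = []
    if proxy_url:
        handlers.append(
            urllib.request.ProxyHandler(
                {"http": proxy_url, "https": proxy_url}
            )
        )
    return urllib.request.build_opener(*handlers)


def fetch_feed(url, proxy_url=None, timeout=10):
    """Download the raw bytes of an RSS feed."""
    opener = build_feed_opener(proxy_url)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

Usage would be along the lines of `fetch_feed(feed_url, proxy_url="http://proxy.example.com:8080")`, where the proxy address is a placeholder for whatever the user's network requires.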

Creating a Fully Editable DataGrid Server Control - Comments Appreciated!
08 January 04 10:47 AM | Scott Mitchell

The DataGrid provides row-by-row editing capabilities via the following players:

  • The EditCommandColumn DataGridColumn, which renders a column of Edit/Update/Cancel buttons in the DataGrid.
  • The EditItemIndex property, which indicates the index of the row that is editable.
  • The DataGrid databinding process, which enumerates the DataSource and creates a DataGridItem for each record.
  • The EditCommand/UpdateCommand/CancelCommand events, which fire when the Edit/Update/Cancel buttons are clicked. These events are useful because a page developer will create an event handler and perform the steps necessary to provide editing capabilities (such as setting the EditItemIndex property accordingly, issuing an UPDATE statement to the database, and so on).

These four actors make editing database data on a row-by-row basis a relatively simple task, much much much simpler than was possible in classic ASP. Some situations, though, require a fully editable DataGrid, one where all rows are editable at once, rather than on a row-by-row basis. In such scenarios, the end user wants to be able to make any number of changes and then click an "Update All" button.

The DataGrid does not provide this capability inherently. In my book, ASP.NET Data Web Controls Kick Start, I examine how to use a standard DataGrid to provide such functionality, namely through making each editable column in the DataGrid a TemplateColumn where the TemplateColumn's ItemTemplate has the editing interface (such as a TextBox Web control, or DropDownList, or whatever Web control(s) is/are suitable for the editing interface). Additionally, a "Save All" button is included that, when clicked, iterates through the DataGrid's Items collection and issues an UPDATE statement for each DataGrid row. While this approach clearly works, it is a departure from the DataGrid's existing functionality for doing updates.
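
The "Save All" step is the heart of that approach: walk every row and issue one UPDATE per row. Here is a rough Python sketch of just that loop using sqlite3. The `products` table, its columns, and the shape of `rows` are all hypothetical, and wrapping the loop in a single transaction is an addition of this sketch, not something the book's approach is stated to do.

```python
import sqlite3


def save_all(conn, rows):
    """Issue one UPDATE per grid row inside a single transaction.

    `rows` is a list of dicts with "id", "name", and "price" keys,
    standing in for the values read back from each row's editing
    controls. Returns the number of rows processed.
    """
    with conn:  # commits on success, rolls back if an UPDATE raises
        for row in rows:
            conn.execute(
                "UPDATE products SET name = ?, price = ? WHERE id = ?",
                (row["name"], row["price"], row["id"]),
            )
    return len(rows)
```

Note that this writes every row back whether or not the user changed it, which is simple but does more database work than strictly necessary.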

With these thoughts, I set out to create a fully editable custom control derived from the DataGrid class. My vision was to have a control that provided full editing capabilities via:

  • Some means to display "Edit All Rows"/"Update All Rows"/"Cancel Batch Update" buttons, similar to the Edit/Update/Cancel buttons in each row for the row-by-row editable DataGrid. Rather than having these buttons appear in each row, though, I'd just want them to appear once, like maybe above the DataGrid.
  • A BatchUpdate property - if true, all rows would be displayed as editable; if false, none would. Therefore, a page developer could toggle between all rows being editable or not by setting this property and rebinding the data to the grid.
  • EditBatchCommand/UpdateBatchCommand/CancelBatchCommand events, which the page developer could create event handlers for to take the necessary steps to update the database, or to toggle the BatchUpdate value.

Initially this seemed like an easy task. Created a class that derived from DataGrid, added a BatchUpdate property, overrode the DataGrid's CreateItem() method so that if an Item or AlternatingItem was being created and BatchUpdate was True, then the DataGridItem was created as an EditItem. Added the events, and then set out to add the buttons. And that's where I got stuck.
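
The CreateItem() override is easier to see in miniature. The following Python sketch mimics the idea with a toy grid; `Grid`, `data_bind`, and the dict-based items are stand-ins invented for this illustration and say nothing about how the real DataGrid is implemented.

```python
class Grid:
    """Toy stand-in for the DataGrid: builds one item per record."""

    def __init__(self):
        self.items = []

    def create_item(self, index, item_type):
        return {"index": index, "type": item_type}

    def data_bind(self, records):
        # Even rows are "Item", odd rows are "AlternatingItem".
        self.items = [
            self.create_item(i, "Item" if i % 2 == 0 else "AlternatingItem")
            for i, _ in enumerate(records)
        ]


class EditGrid(Grid):
    """Grid whose rows all become editable when batch_update is set."""

    def __init__(self):
        super().__init__()
        self.batch_update = False

    def create_item(self, index, item_type):
        # Swap the item type before the base class builds the row.
        if self.batch_update and item_type in ("Item", "AlternatingItem"):
            item_type = "EditItem"
        return super().create_item(index, item_type)
```

As in the post, flipping `batch_update` has no visible effect until the data is bound again, since the item type is decided at creation time.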

Initially I thought I'd just override the CreateChildControls() method and add buttons to the start of the control hierarchy, thereby having them appear above the DataGrid. No dice, as the DataGrid's PrepareControlHierarchy() method naively assumes that the first control in the hierarchy is the DataGrid's Table control. This approach worked, though, if I slapped the buttons at the end of the control hierarchy, but then the buttons would always appear at the bottom of the grid, and I wanted more flexibility.

An ideal approach, I reckoned, would be to create a new DataGridItem type, like the Pager type, that would add a row to the top and/or bottom of the DataGrid with the buttons. However, this would involve a major reworking of the DataGrid, requiring the overriding of many methods and classes. Way more work than I wanted to do.

My end solution - and here's where I'm looking for comments - was to separate out the functionality into two server controls. An editable grid called EditGrid, and what I call an EditBar. The EditBar displays the "Edit All Rows"/"Update All Rows"/"Cancel Batch Update" buttons and has a property EditGridID, which must reference the ID property of the EditGrid the EditBar works for. Then, when one of the EditBar's buttons is clicked, the appropriate EditGrid event is raised. Separating the EditGrid and EditBar has the advantage that the EditBar can be placed anywhere on the page and operate on the EditGrid, but it feels "messy" to me to have a control reference another control. I know the validation controls use a technique not unlike this with their ControlToValidate property, but I was wondering if anyone could think of a better way.
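
For anyone weighing in, here is the two-control arrangement boiled down to a Python sketch. Everything in it (the `Page` registry, the method names, the button-to-event mapping) is invented for illustration; only the control names, the ID-reference idea, and the three event names come from the description above.

```python
# Which grid event each bar button raises.
BUTTON_EVENTS = {
    "Edit All Rows": "EditBatchCommand",
    "Update All Rows": "UpdateBatchCommand",
    "Cancel Batch Update": "CancelBatchCommand",
}


class Page:
    """Hypothetical container that lets controls find each other by ID."""

    def __init__(self):
        self._controls = {}

    def add(self, control):
        self._controls[control.control_id] = control
        control.page = self
        return control

    def find_control(self, control_id):
        return self._controls[control_id]


class EditGrid:
    def __init__(self, control_id):
        self.control_id = control_id
        self.page = None
        self.batch_update = False
        self.handlers = {name: [] for name in BUTTON_EVENTS.values()}

    def raise_event(self, event_name):
        for handler in self.handlers[event_name]:
            handler(self)


class EditBar:
    """Holds only the grid's ID; resolves the grid at click time."""

    def __init__(self, control_id, edit_grid_id):
        self.control_id = control_id
        self.page = None
        self.edit_grid_id = edit_grid_id

    def click(self, button):
        grid = self.page.find_control(self.edit_grid_id)
        grid.raise_event(BUTTON_EVENTS[button])
```

The loose coupling is visible in `click`: the bar knows nothing about the grid beyond its ID, so it can sit anywhere on the page, but a wrong or missing ID only shows up when a button is actually clicked.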

Any comments/ideas/suggestions most appreciated. For the record, I am working on an article on this topic, so any ideas or improvements suggested that are utilized will be given full credit in the article, naturally. Thanks!

A Bounty of System.Web.Mail Information!
07 January 04 04:27 PM | Scott Mitchell

I stumbled across Dave Wanta's new FAQ Web site, http://www.SystemWebMail.com, which has to be the definitive place for information on the .NET Framework's System.Web.Mail namespace. Dave's got the info broken down into six sections, from quickstart examples and advanced examples, to tips on troubleshooting System.Web.Mail classes, to descriptions of and links to the email-related RFCs. Quite a repository of knowledge, way to go Dave!

(Dave's expertise on System.Web.Mail comes from the several products he has created, such as aspNetEmail and aspNetPOP3. I have used both aspNetEmail and aspNetPOP3 before in my own projects, and have been impressed with the ease of using them, Dave's level of support, and their performance. In fact, you can read about how I am using aspNetPOP3 to help stop spam...)


