March 2004 - Posts

Filtering Spam at the ISP Level
09 March 04 11:52 AM | Scott Mitchell | with no comments

As spam has spiraled into a pandemic problem, many folks have come up with many ways to reduce the sheer volume of spam individuals receive. Six months ago or so, I wrote about a Challenge/Response Spam Blocking System I built to help stem the deluge of spam I was receiving. Since then I have moved to a Bayesian approach, using the Spambayes Outlook Add-In, which I have found to be a wondrous (and free!) product. Spambayes works so well because it is tailored to the email you receive. Since I receive a lot of technical emails from listservs and colleagues, especially on ASP.NET, tokens like ASP.NET, ADO.NET, DataSet, loop, method, object, class, etc. are all strong indications that the email's not spam.
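To make the "tailored to the email you receive" point concrete, here is a minimal sketch of token-based Bayesian scoring in Python. It is a plain naive Bayes with add-one smoothing, not Spambayes's actual algorithm, and the function names and the tiny training set are made up for illustration.

```python
import math
from collections import Counter

def train(messages):
    """Count, per token, how many messages contain it."""
    counts = Counter()
    for text in messages:
        counts.update(set(text.lower().split()))
    return counts

def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-token evidence with naive Bayes and add-one smoothing."""
    log_odds = math.log(n_spam / n_ham)  # prior odds from the training mix
    for token in set(text.lower().split()):
        p_token_spam = (spam_counts[token] + 1) / (n_spam + 2)
        p_token_ham = (ham_counts[token] + 1) / (n_ham + 2)
        log_odds += math.log(p_token_spam / p_token_ham)
    return 1 / (1 + math.exp(-log_odds))

# Made-up training mail, standing in for one person's own inbox.
spam = ["cheap pills online", "win money now", "cheap money online now"]
ham = ["asp.net dataset loop question", "ado.net dataset method",
       "asp.net class object method"]

spam_counts, ham_counts = train(spam), train(ham)

technical = spam_probability("dataset method question",
                             spam_counts, ham_counts, len(spam), len(ham))
junk = spam_probability("cheap money now",
                        spam_counts, ham_counts, len(spam), len(ham))
print(f"technical: {technical:.3f}, junk: {junk:.3f}")
```

Because the counts come from my own mail, tokens like DataSet pull a message toward "not spam" for me, while the same tokens would carry no such weight for someone outside a technical field.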

Anywho, one approach that many ISPs are taking is to employ universal spam blocking at the ISP level. I have some reservations about this approach for a couple of reasons.

  1. It assumes that there's some global spam “signature,” but the spam (and non-spam) one receives, I contend, is personalized. That is, the probability of you getting particular types of spam likely has to do with how a spammer got hold of your email address. For example, I get spam from software companies, which I'd wager others who are not in a technical field receive less of. I'd imagine those who regularly post their email address on very high-traffic entertainment sites are more likely to get spam targeted at that audience. Also, companies you voluntarily give your email address to might sell it, or it might be acquired by another company when they go out of business.
  2. Those messages marked as spam never make it to the recipient's inbox. I know this is the intended approach, but what about false positives? That is, what if a legitimate email gets flagged as spam by the ISP? It will never reach the recipient, whereas had it been delivered and marked as spam by, say, Spambayes, the recipient would still have an opportunity to find said email in the Junk Email folder and mark it as non-spam.

Now, I have no statistics to back up these claims (namely, that spam is somewhat personalized), but one thing I know doesn't work is universal filtering at the ISP level. Yes, these filters are nice because the computer illiterate don't have to concern themselves with the details of installing and configuring an anti-spam solution, and, yes, these spam filters might weed out the vast majority of spam, but at the same time they are likely cutting out a handful of legitimate emails that happen to be ensnared by the anti-spam nets. This concern stems both from my personal experiences in issuing challenges in a C/R anti-spam system, and from emails I've received from folks noting how emails they send from their Web sites are routinely getting blocked by spam blockers at the ISP level.

I tend to view spam blocking software like the idealization of the American justice system - I'd rather have a few spams get through than have one legit email be marked as a spam. Clearly, spam blocking at the ISP level does not permit this.

Data Access Application Block - Version 3.0
08 March 04 01:46 PM | Scott Mitchell | with no comments

In an earlier blog entry I shared the PowerPoint slides I created for a local user group talk I gave back in October on the Microsoft Data Access Application Block (DAAB). Microsoft's site provides an MSI download for Versions 1.0 and 2.0 of the DAAB. Unfortunately, both of these versions were built to work only with Microsoft SQL Server 7.0 and up (because they internally use the SqlClient data provider).

Over at GotDotNet, a group of developers has been building Version 3 of the DAAB. Version 3 uses an abstract data provider pattern so that the DAAB can work with any database that has a provider created for it. Using Version 3 you can just as easily tap into an Access or Oracle database as you can tap into a SQL Server database.
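For readers who haven't seen the abstract provider idea before, here is a rough sketch of the general pattern in Python. This is not the DAAB's API (the DAAB is a .NET library); the names DataProvider, SqliteProvider, and execute_scalar are hypothetical, and SQLite stands in for whichever database you use simply because it ships with Python.

```python
import sqlite3
from abc import ABC, abstractmethod

class DataProvider(ABC):
    """Hypothetical abstraction: each database supplies its own connection."""

    @abstractmethod
    def connect(self):
        """Return a DB-API 2.0 connection for this database."""

class SqliteProvider(DataProvider):
    """One concrete provider; others would wrap a different driver."""

    def __init__(self, path):
        self.path = path

    def connect(self):
        return sqlite3.connect(self.path)

def execute_scalar(provider, sql, params=()):
    """Run a query through any provider; return the first column of the first row."""
    conn = provider.connect()
    try:
        cursor = conn.cursor()
        cursor.execute(sql, params)
        row = cursor.fetchone()
        return row[0] if row else None
    finally:
        conn.close()

# The helper never mentions SQLite; only the provider does.
print(execute_scalar(SqliteProvider(":memory:"), "SELECT 1 + 1"))
```

The helper code is written once against the abstraction, so supporting another database means adding another provider class rather than rewriting the helpers.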

I decided to update the DAAB presentation I gave earlier, since I'll be giving a similar talk to another group in a couple of weeks. Anywho, if you're interested, here's the updated talk, covering version 3.0. (The talk isn't designed to be a deep examination of the technical underpinnings of the DAAB. Rather, it's short and sweet, giving the audience an overview of the DAAB, instructions on how to start using it, and example syntax. The overall talk was created to be a bit over 30 minutes, but you could probably read through it in 5 to 10 minutes.)

Enjoy!

Great Blog Content
08 March 04 12:08 AM | Scott Mitchell | with no comments

Almost a month ago I posted a question on my blog - “How do you get your technical information?” 18 folks took the time to respond (thanks!). The reason I asked that question was because I was kicking around the idea of some sort of Web site / RSS feed that aggregated the “top notch” blog content out there. (I sort of blogged about this back in October 2003 in: A “Killer Blog” Directory.)

What is a bit frustrating to me is that finding great technical content in blogs is like searching for a needle in a haystack. Yes, there are some bloggers whose writings are technically prolific, but even those who try their best to focus strictly on their technological expertise still usually wander into political meanderings, jokes, personal anecdotes, etc. And well they should: blogs are a form of personal expression and self-publishing. But if you are looking for non-personal, technical information and know-how, the search can be tough.

The signal-to-noise ratio is unbearably low, but there are some very high quality technical posts out there that would make excellent articles on technically focused Web sites. To name a few:

Along with many others. I imagined that a Web site and RSS feed that listed just the high quality blog posts focusing on a specific technology would be invaluable. So I spent an afternoon creating a database and writing some data-entry software to allow me to quickly add these “great” blogs to a database. Once I had enough, I reasoned, I could slap on a Web front end, provide an RSS feed to the data, and, voila, create something thousands of developers would find useful.

My collection of blog entries stopped about a week ago - I had reached slightly more than 50 blog entries. I stopped for a number of reasons:

  • It was taking too long to weed out the “good” entries. I had to wade through a lot of entries that were personal, too short, light on detail, or just not interesting enough.
  • Many of the blogs are on alpha/beta technology. Microsofties are blogging like crazy about products like Longhorn and Whidbey which are still a stretch away from even being considered beta products, let alone ones that are widely used.
  • It was hard to find entries that focused tightly on a specific technology. Hence, I started taking in any blogs about .NET, from ASP.NET to VC++.NET tips and tricks. The result? 50+ recommended blogs on a wide spectrum of loosely-related topics.
  • I got bored / frustrated with the success ratio. It was not fun having to wade through the many non-technical entries to find the few diamonds in the rough.

In conclusion, while initially it seemed like such a compilation of resources would be invaluable, I think the upkeep and collection would be overwhelming. Now, one option might be to distribute the work, let people recommend particularly interesting blog entries, and then have a team of moderators either approve or reject the suggestions.

I guess what it boils down to is it would be cool if there were a better way to have the blogosphere (I hate that word) keep track of interesting posts. The way this is currently done is that others post a link to the entry of interest, but these recommendations, like the interesting entry itself, fade from the front pages over time. What is needed is some centralized data store that maintains “recommended” entries, and ranks the entries by the number of people who recommend them. Ideally this centralized data store would allow full text searching of the recommended content and, again, weight the search by how many folks have taken the time to recommend it.
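Here is a toy, in-memory sketch in Python of the kind of store I have in mind. Everything in it is an assumption for illustration: the class and method names, the sample URLs, and the scoring rule (term matches multiplied by the number of recommenders). A real version would need persistence, a proper full-text index, and some defense against ballot stuffing.

```python
from collections import defaultdict

class RecommendationStore:
    """Hypothetical central store of recommended blog entries."""

    def __init__(self):
        self.text = {}                         # url -> entry text
        self.recommenders = defaultdict(set)   # url -> people recommending it

    def recommend(self, url, text, person):
        self.text[url] = text
        self.recommenders[url].add(person)     # a set, so one vote per person

    def top(self, n=10):
        """Entries ranked by how many people recommended them."""
        ranked = sorted(self.text,
                        key=lambda url: len(self.recommenders[url]),
                        reverse=True)
        return ranked[:n]

    def search(self, query):
        """Naive full-text search, weighted by the recommendation count."""
        terms = query.lower().split()
        scored = []
        for url, text in self.text.items():
            words = text.lower().split()
            matches = sum(words.count(term) for term in terms)
            if matches:
                scored.append((matches * len(self.recommenders[url]), url))
        return [url for score, url in sorted(scored, reverse=True)]

store = RecommendationStore()
for person in ("ann", "bob", "cy"):
    store.recommend("blog/a", "caching tips for the asp.net datagrid", person)
store.recommend("blog/b", "asp.net datagrid paging and datagrid sorting", "dee")

print(store.top())
print(store.search("datagrid"))
```

In this sample the entry with three recommenders outranks the one that mentions the search term more often, which is the weighting described above.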

<idealism posture=“bravely looking off into the sunset”>
Blogs have the capability to be the biggest means of idea transference to date. (I just made up that term right now.) They can open a means for truly democratic conversation and publishing. There are still some vital pieces missing - a unified persona across all blogs, an improved commenting system, a tamper-proof way readers can rank/rate blog entries, etc. - but once these pieces have been added it would seem that blogs could provide an unparalleled means for democratic communication, publication, and knowledge sharing.
</idealism>

<realism>
Of course, anytime I find myself romanticizing the possibility of self-publishing and what it affords, I force myself to read Why Your Movable Type Blog Must Die.
</realism>

My Books

  • Teach Yourself ASP.NET 4 in 24 Hours
  • Teach Yourself ASP.NET 3.5 in 24 Hours
  • Teach Yourself ASP.NET 2.0 in 24 Hours
  • ASP.NET Data Web Controls Kick Start
  • ASP.NET: Tips, Tutorials, and Code
  • Designing Active Server Pages
  • Teach Yourself Active Server Pages 3.0 in 21 Days

I am a Microsoft MVP for ASP.NET.

I am an ASPInsider.