May 26, 2009

Spam for breakfast

Had nineteen attempts last night. One with the multiple hijacked forums. I am predicting that this will die off fairly soon as this one had 23 different listings and none of them were new. One was the conversational type with the single URL. The other seventeen were the random six characters followed by a couple of gibberish URLs. That little snippet of regex I posted yesterday nails these to the wall. Of the nineteen attempts, none of them were successful. Zip. Zero. Nada. For a fun post/rant on regex, check out Sleestacks -- The Rant:
The Trouble With Regular Expressions
I have this problem with regular expressions. They're too handy for their own good. You say you don't know what a regular expression is? Well, let me just tell you, you don't know what you're missing out on. Think of it like a really super-duper complicated way of searching for something real specific when it's slapped in the middle of a whole back of other crap you don't want to even mess with. But it's not even just searching for something, you can search and replace really complicated text that's just too big to do by hand.

More than that, it's not like you just want to find every reference to the name "Meg" on your computer and change it to "Poop" -- you can use regular expressions to transform Huge Text File (A) into just the little sub-sections of important stuff -- or you can convert (A) from one textual format to another, like from HTML to Whiskey, or whatever the hell new-fangled nonsense is going on out there in web junky land.

There are plenty of programs out there that make use of regular expressions and you don't even know about it. I first started playing around with them way back in 1995 or so, not long after Rich Siegel started selling BBEdit for Macintosh, and I've been using them since. Here's an example -- say you have a text file that looks like this:
Jimmy Jack Johnson sells 43 seashells to Yo Momma. She paid about three-fiddy.
You can do a regular expression find/replace on that text, which would look something like this:
's/^(J.*)\ J.*\ (J.*)\ (s)...s\ ([0-9]+).*\r/\1\ sucks\ \4\ \2\3\.\r/g'
Which would now make that first text look something like this:
Jimmy sucks 43 Johnsons.
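For anyone who wants to try that at home, here is a rough Python sketch of the same find/replace. It is a port, not the original: it assumes Python's re module rather than BBEdit, it drops the backslashes in front of the spaces, and it swaps the trailing \r for $ since the sample is a single line with no line ending. The function name rewrite_line is made up for illustration.

```python
import re

def rewrite_line(text):
    """Apply a Python port of the find/replace from the rant (an approximation)."""
    # Group 1: first J-word, group 2: third J-word, group 3: the leading "s"
    # of the five-letter verb, group 4: the first run of digits after it.
    pattern = r"^(J.*) J.* (J.*) (s)...s ([0-9]+).*$"
    # Rebuild the line from the captured pieces; \2\3 glues "Johnson" and "s".
    replacement = r"\1 sucks \4 \2\3."
    return re.sub(pattern, replacement, text)

sample = ("Jimmy Jack Johnson sells 43 seashells to Yo Momma. "
          "She paid about three-fiddy.")
print(rewrite_line(sample))
```

The greedy .* pieces only settle on one split because the pattern demands three space-separated words starting with J before the verb, which is a big part of why these things are so hard to read back later.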
Heh... If regex looks obscure, it's because it is. It has one of the steepest learning curves I have ever experienced, but it is one of the most useful things to know.

Posted by DaveH at May 26, 2009 10:38 AM