Wednesday, April 14, 2010

More on regular expressions

Regular expressions are a powerful tool, but they demand a precise understanding of what you expect them to accomplish. Regexps express complex rules compactly, but they follow strict rules of their own.  It's those rules that you have to take the time to learn, and it WILL take time.  Should you use regular expressions for all string search functions?  No. In general, the native AutoIt string functions are faster than the regular expression method.
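For example, you wouldn't use $aRtn = StringRegExp("This is some string.", "(.{7})", 1) when you can use
$sStr = StringLeft("This is some string.", 7) to get the same seven characters directly. (With flag 1, StringRegExp returns them as the first element of an array rather than as a plain string.)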
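To make the comparison concrete, here is a minimal sketch that runs both approaches side by side. The variable names and the ^ anchor in the pattern are my own choices for illustration; the anchor only makes it explicit that the match starts at the beginning of the string.

```autoit
; Compare the native function with the regular expression approach.
Local $sText = "This is some string."

; Native: StringLeft returns the string directly.
Local $sLeft = StringLeft($sText, 7)

; Regexp: flag 1 returns an array of captured groups, so read element 0.
Local $aMatch = StringRegExp($sText, "^(.{7})", 1)
If @error Then
    ; @error is set when nothing matches or the pattern is invalid.
    ConsoleWrite("No match" & @CRLF)
Else
    ConsoleWrite("StringLeft:   " & $sLeft & @CRLF)     ; This is
    ConsoleWrite("StringRegExp: " & $aMatch[0] & @CRLF) ; This is
EndIf
```

Both lines print the same seven characters, but the regexp version needs an error check and an array lookup to get there.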
Now, if the string contained digits at various locations, for example, the RegExp is the better choice. (Flags 1 and 3 both return an array, hence $aRtn.)
$sStr = "This string 9085 contains digits at 7693 locations."
$aRtn = StringRegExp($sStr, "\d+", 3) will return both sets of digits
$aRtn = StringRegExp($sStr, "\d{3}", 3) will return the first 3 digits of each set
$aRtn = StringRegExp($sStr, ".*?(\d+).*", 1) returns the first set of digits
$aRtn = StringRegExp($sStr, ".*\D(\d+).*", 1) returns the last set of digits.
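If you want to try the four patterns yourself, something like the following should work as a test script. ShowMatches is a hypothetical helper I made up for this post, not a built-in function; _ArrayToString comes from the standard Array.au3 UDF.

```autoit
#include <Array.au3> ; for _ArrayToString

Local $sStr = "This string 9085 contains digits at 7693 locations."

ShowMatches($sStr, "\d+", 3)          ; 9085, 7693
ShowMatches($sStr, "\d{3}", 3)        ; 908, 769
ShowMatches($sStr, ".*?(\d+).*", 1)   ; 9085
ShowMatches($sStr, ".*\D(\d+).*", 1)  ; 7693

; Hypothetical helper: run one pattern and print whatever comes back.
Func ShowMatches($sSubject, $sPattern, $iFlag)
    Local $aRtn = StringRegExp($sSubject, $sPattern, $iFlag)
    If @error Then
        ; No match, or the pattern itself is invalid.
        ConsoleWrite($sPattern & " -> no match" & @CRLF)
        Return
    EndIf
    ; Flags 1 and 3 return an array, so join it for display.
    ConsoleWrite($sPattern & " -> " & _ArrayToString($aRtn, ", ") & @CRLF)
EndFunc
```

The last pattern works because the leading .* is greedy: it grabs as much as it can, then backs up only far enough for \D(\d+) to match, which leaves the final run of digits in the group.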

From these examples you can see the power of SREs compared to using native string functions.  A couple of them would likely have to be run through several native functions, or would be impossible using anything except a regular expression.

In general, if you can do it in one or two simple native functions then do it that way.  If it's more complex, use the SRE.
I should also mention that when it comes to the speed differences, we are talking about microseconds in either case, so it makes no noticeable difference unless you are doing hundreds or even thousands of iterations.
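If you're curious about the difference on your own machine, a rough timing sketch like this one will show it. The loop count of 10000 is an arbitrary assumption, and the numbers will vary from system to system, so treat the output as a ballpark figure only.

```autoit
Local Const $iLoops = 10000 ; assumed iteration count, adjust to taste
Local $sText = "This is some string."
Local $sOut, $aOut

; Time the native function.
Local $hTimer = TimerInit()
For $i = 1 To $iLoops
    $sOut = StringLeft($sText, 7)
Next
Local $fNative = TimerDiff($hTimer) ; elapsed milliseconds

; Time the regular expression.
$hTimer = TimerInit()
For $i = 1 To $iLoops
    $aOut = StringRegExp($sText, "^(.{7})", 1)
Next
Local $fRegExp = TimerDiff($hTimer) ; elapsed milliseconds

; Convert total milliseconds to microseconds per call.
ConsoleWrite("StringLeft:   " & Round($fNative / $iLoops * 1000, 3) & " us per call" & @CRLF)
ConsoleWrite("StringRegExp: " & Round($fRegExp / $iLoops * 1000, 3) & " us per call" & @CRLF)
```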

By the way, my new AutoIt regular Expression ToolKit is now ready for Beta testing. If you want to help out, contact me through the AutoIt forums. The application is available here as a win32 installer in a zip file.
