Tuesday, April 20, 2010

How do I ask a question about regular expressions?

We are getting more and more regular expression questions, and people tend not to give us enough detail to work with. The best method is to post an example of the string you are working with, followed by a list of the expected matches. If you are referring to web page HTML code, then either give us a link to the page or post enough of the page to include all of the functional bits that you want to parse. That means including some of the code on both sides of the element that you want to work with.

That gives us a starting point, and you will usually get a working answer or one that is very close and only needs a bit of touch-up to be what you want.
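As a rough illustration of the kind of post that is easy to answer, here is a minimal sketch. The HTML snippet, the variable names and the pattern are all made up for the example; the point is only the shape: the exact input, the expected matches, and what has been tried so far.

```autoit
; Sample input, copied exactly from the data being parsed (hypothetical HTML snippet)
Local $sInput = '<div class="item"><a href="page1.html">First</a></div>' & @CRLF & _
        '<div class="item"><a href="page2.html">Second</a></div>'

; Expected matches:
;   page1.html
;   page2.html

; What I have tried so far (flag 3 returns an array of all matches)
Local $aLinks = StringRegExp($sInput, 'href="([^"]+)"', 3)
If @error Then
    ConsoleWrite("No matches found" & @CRLF)
Else
    For $i = 0 To UBound($aLinks) - 1
        ConsoleWrite($aLinks[$i] & @CRLF)
    Next
EndIf
```

With the input and the expected result both visible, anyone replying can run the snippet and compare the output against the list.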

Wednesday, April 14, 2010

More on regular expressions

Regular expressions are a very powerful tool, but they require a very precise understanding of what they are expected to accomplish. Regexps express complex rules in a compact way, but there must be rules to direct their behavior. It's those rules that you have to take the time to learn, and it WILL take time. Should you use regular expressions for all string search functions? No. In general, the native AutoIt string functions are faster than the regular expression method.
For example, you wouldn't use $aRtn = StringRegExp("This is some string.", "(.{7})", 1) when you can use
$sStr = StringLeft("This is some string.", 7) to get the same seven characters (the regular expression returns them in an array rather than as a string).
Now, if the string contains digits that appear at various locations, the RegExp is the better choice:
$sStr = "This string 9085 contains digits at 7693 locations."
$aRtn = StringRegExp($sStr, "\d+", 3) ; returns both sets of digits
$aRtn = StringRegExp($sStr, "\d{3}", 3) ; returns the first 3 digits of each set
$aRtn = StringRegExp($sStr, ".*?(\d+).*", 1) ; returns the first set of digits
$aRtn = StringRegExp($sStr, ".*\D(\d+).*", 1) ; returns the last set of digits
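For anyone who wants to run the digit examples above, here is a small self-contained sketch. The helper function _ShowMatches() is hypothetical and exists only to print the results; _ArrayToString() comes from the standard Array.au3 include.

```autoit
#include <Array.au3>

Local $sStr = "This string 9085 contains digits at 7693 locations."

_ShowMatches("All digit sets", StringRegExp($sStr, "\d+", 3))             ; 9085, 7693
_ShowMatches("First 3 of each", StringRegExp($sStr, "\d{3}", 3))          ; 908, 769
_ShowMatches("First digit set", StringRegExp($sStr, ".*?(\d+).*", 1))     ; 9085
_ShowMatches("Last digit set", StringRegExp($sStr, ".*\D(\d+).*", 1))     ; 7693

; Hypothetical helper: prints a label and the matches, or a note when nothing matched.
Func _ShowMatches($sLabel, $aMatches)
    If IsArray($aMatches) Then
        ConsoleWrite($sLabel & ": " & _ArrayToString($aMatches, ", ") & @CRLF)
    Else
        ConsoleWrite($sLabel & ": no match" & @CRLF)
    EndIf
EndFunc   ;==>_ShowMatches
```

Flags 1 and 3 both return an array on success, so the helper checks IsArray() instead of assuming a string came back.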

From this example you can see the power of SREs compared to native string functions. A couple of those examples would likely have to be run through several native functions, or would be impossible using anything except a regular expression.

In general, if you can do it in one or two simple native functions, then do it that way. If it's more complex, use the SRE.
I should also mention that when it comes to the speed differences, we are talking about microseconds in either case, so it makes no noticeable difference unless you are doing hundreds or even thousands of iterations.
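If you want to see the speed difference for yourself, a rough timing sketch along these lines should do. The loop count and variable names are arbitrary choices for the illustration, and the numbers will vary from machine to machine.

```autoit
Local Const $iLoops = 10000 ; assumed iteration count, adjust to taste
Local $sText = "This is some string."
Local $sLeft, $aLeft

; Time the native function
Local $hTimer = TimerInit()
For $i = 1 To $iLoops
    $sLeft = StringLeft($sText, 7)
Next
Local $fNative = TimerDiff($hTimer) ; elapsed milliseconds

; Time the regular expression that fetches the same seven characters
$hTimer = TimerInit()
For $i = 1 To $iLoops
    $aLeft = StringRegExp($sText, "^(.{7})", 1)
Next
Local $fRegExp = TimerDiff($hTimer)

ConsoleWrite("StringLeft:   " & Round($fNative, 2) & " ms for " & $iLoops & " loops" & @CRLF)
ConsoleWrite("StringRegExp: " & Round($fRegExp, 2) & " ms for " & $iLoops & " loops" & @CRLF)
```

Dividing either total by the loop count gives the per-call cost, which is what matters when deciding whether the difference is worth worrying about.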

By the way, my new AutoIt Regular Expression ToolKit is now ready for beta testing, so if you want to help out, contact me through the AutoIt forums. The application is available here as a win32 installer in a zip file.

Monday, April 5, 2010

Where are they now?

Every once in a while I get to reminiscing and today was one of those days.
I was looking through the members list and suddenly realized that a lot of my old forum friends (if I had any) have just up and disappeared on me. That shouldn't be a surprise, since I've been in the forums since 2003, but it seems that, with few exceptions, there are suddenly no posts from people you have become accustomed to seeing. Here is a short list of people I would love to touch base with again. If you are on the list or have contact information for anyone on it, then please let me know.
  • CyberSlug
  • SvenP
  • ScriptKitty
  • Helge
  • tylo
  • Holger
  • PaulIA

Saturday, April 3, 2010

Bad replies

Yes, we do get them once in a while, and I'm as guilty as anyone at times. A common error is to post a reply without testing it, as shown in this actual quoted reply.
[quote]
Well where is the problem whe you know what you want to do ? : )

[code]
$towrite = ClipPut($yourdata)

FileOpen($file)
FileWrite($file,$towrite & @Crlf)
FileClose($file)
[/code]

You can define the text you put into clip as variable and then write it to a file
[/quote]

Alright, here is what's wrong.
According to the help file, ClipPut() will return 1 for success and 0 for failure. That means that the variable $towrite will contain the value 1, NOT the clipboard data, so that line should have been written as

ClipPut($yourdata)
$towrite = ClipGet()

The FileOpen() call should have a flag set to determine the mode (although a file can be opened in read plus one of the write modes simultaneously), so that should have been
$hFile = FileOpen($file, 1) ;; here I'm using the append mode.
FileWrite($hFile, $towrite) ;; Better in this case would be FileWriteLine($hFile, $towrite)
FileClose($hFile)

Now why did I use $hFile?  Because FileOpen() will actually return a handle to the file so it's better to use that handle instead of the actual file name or variable that contains the file name.
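Putting the corrections above together, a tested version of the reply might look something like the sketch below. The data string and the file path are placeholders I made up for the example, and the error checks go a little beyond what the original reply attempted.

```autoit
Local $sData = "Text to put on the clipboard"   ; placeholder data
Local $sFile = @TempDir & "\clip_test.txt"      ; placeholder file path

; ClipPut() returns 1 on success and 0 on failure, not the clipboard text
If Not ClipPut($sData) Then
    ConsoleWrite("ClipPut failed" & @CRLF)
    Exit 1
EndIf

; Read the text back from the clipboard
Local $sToWrite = ClipGet()
If @error Then
    ConsoleWrite("ClipGet failed" & @CRLF)
    Exit 1
EndIf

; Mode 1 opens the file for writing in append mode; -1 means the open failed
Local $hFile = FileOpen($sFile, 1)
If $hFile = -1 Then
    ConsoleWrite("Unable to open " & $sFile & @CRLF)
    Exit 1
EndIf

FileWriteLine($hFile, $sToWrite) ; writes the text followed by a line break
FileClose($hFile)
```

Running it once and opening the file is the "simple quick test" that would have caught the original mistake.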

It goes without saying that a simple quick test would have shown the problem.  Like I say, we all do it on occasion but really we should take a bit more care.  The people we reply to generally are new to coding and may not see what will be obvious to the rest of us.

StringRegExp()

Are you having a hard time catching on to the use of SREs?  You are not alone.  Since their very inception, regular expressions have been difficult for developers to learn.  One of the primary reasons is the fact that there are so many flavors of RegEx engines out there and a regular expression that works in Javascript (example only) won't necessarily work in .NET (example only) or AutoIt.

There are many RegEx tools available, but the big question with each is "What engine is the tool designed for?" Since we are primarily concerned with those that work with AutoIt, I will mention a few that come close in most cases.
  • RegEx Buddy
  • RegEx Coach
  • Expresso

I have one that was originally released by a couple of forum members, and I have been constantly modifying it to suit my purposes. This one is explicitly for AutoIt and, as far as I know, has never been tested with any other language. Since the engine used by AutoIt is the PCRE engine, it should also work with Perl, but no guarantees there. AutoIt has a few idiosyncrasies that make it just a touch different from normal PCRE regexes.

Right now I've only made this tool available by request via a PM. That will change at some point and I'll make it available on one of my webs (most probably my AutoIt Central site). It will always continue to be free and open source, as a way to partially pay back the AutoIt community and in keeping with the intent of the original authors. My hope is that there will be a public release within the next month, but I must first do some more work on the snippet holder. Originally a list control was used and I'm changing it to a listview instead. I'm also hoping to include a menu that will allow you to insert common matching code like (?i) and (?s), as well as the groups like [:alpha:] etc.
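For readers who have not met (?i), (?s) or [:alpha:] yet, here is a short sketch of what each one does in StringRegExp(). The test strings are invented for the illustration.

```autoit
; (?i) makes the rest of the pattern case-insensitive
ConsoleWrite(StringRegExp("AUTOIT rocks", "(?i)autoit") & @CRLF) ; 1 (match)

; (?s) lets the dot match line breaks as well, so the pattern can span both lines
Local $sTwoLines = "line1" & @CRLF & "line2"
ConsoleWrite(StringRegExp($sTwoLines, "(?s)line1.+line2") & @CRLF) ; 1 (match)

; [:alpha:] is used inside a character class and matches letters only
Local $aLetters = StringRegExp("abc123", "[[:alpha:]]+", 3)
If Not @error Then ConsoleWrite($aLetters[0] & @CRLF) ; abc
```

With the default flag of 0, StringRegExp() simply returns 1 for a match and 0 for no match, which is handy for quick experiments like these.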

If you have specific issues with a regular expression, I suggest that you post it as a question in the AutoIt forums.  You will generally get a working version that does what you expected.

In the meantime, don't get frustrated when you first attempt SREs. The light will suddenly come on and you will be off to the races. One thing that is becoming very pronounced is the tendency of some people to overthink the regexp. Many are posting regular expressions which are far too complex for the situation at hand. Keep it simple and you will have much more success.

If you are looking for a specific RegEx try searching on one of my favorite sites, The Regular Expression Library.
If you have any questions or suggestions for content on this post, please use the comments link.

Thursday, April 1, 2010

Forum critics

Well here we are into April and I'm planning on a fairly quiet month, but before that happens I want to make a couple of quick comments about some of the forum users and I won't be providing names or links to the posts.  If they read this, they may recognize themselves.

I (and many others) have been criticized in the past for the various styles and techniques we use when replying to forum questions. One person recently was upset because he felt I was treating many posters as being inexperienced and new to development. That's very true, I do. Now let's ask a couple of simple questions.
Q - Did the poster state that they had any experience?
A - No. Therefore, until they show otherwise, I will consider them new. In the case in question, the poster was a semi-experienced developer but was quite new to AutoIt. That makes him a Noob in my books.

Q - Did the poster display that he had a deeper understanding of programming in general?
A - See the answer above.

Another one that I get hit with on occasion is my habit of pointing people to the AutoIt help file when they ask a question.  I do that under certain conditions and for what I consider to be valid reasoning.

The conditions that usually bring on this type of response are those posts where the user has asked a question and not provided any code for us to look at. In that case we have to guess which functions they may need, so I reply with "see Function() in the help file >> Section >> page." That is about all we can do when we don't have specifics, and it also emphasizes the point that most answers can be found by reading the help file. I remember one regular poster who for 2 years would not read the help file and would just ask a question in the forums instead. After repeated warnings about it, he finally annoyed us to the point where no one would reply to him. That made us look like the bad guys in the eyes of some people and led to some criticism which was totally undeserved.

Now, I'm not saying that some criticism is not deserved. I am saying that there is often a reason behind the type of replies that people are given. Think of the possibilities before shooting off your mouth in the forums.