GDB tips

gdb is quite a useful tool. Despite it’s text interface, it can do more than GUI debuggers. I am still learning it, and this post is used as a memo.

  • ptype is useful to print type, structure(class) definition. This is useful to check user defined data types(sometimes deeply defined in header files).
  • disassemble is useful to see assembly code of function/statement.
  • info registers shows registers information.
  • print a@10 shows 10 elements start from address of variable a.
  • x command examine memory content. x/8xw $eip displays 8 words since where eip register points to. x/i $eip
  • shows instruction eip register points to.

“Regular Expression” matches all non-prime numbers

I found a “regular expression” which matches all strings consists of non-prime number of 1s.
^1?$|^(111*)\1+$

This is quite surprising because it can be proven that non-prime number of 1s is not a regular language. That is to say, no regular expression can match that language.

However, this regular expression works.
Following python code will print out all prime numbers within [0,99]

[sourcecode language=”python”]
import re
filter(lambda n:not re.match(r"^1?$|^(111*)\1+$","1"*n),range(100))
[/sourcecode]

So why it works? Let’s look at the regular expression: ^1?$ matches “” and “1”, which is trivial. In ^(111*)\1+$, (111*) matches 2 or more 1s, so (111*)\1+ matches n*k 1s, k>=2, n>=2, which is exactly all composite numbers.

Then what’s wrong with the theory? Actually, nothing is wrong. The “regular expression” in most programming language is much more expressive than the formal regular expression defined in computational theory. In this case, we do not allow “\1” in formal regular language. Actually, this “\1” may potentially require unbounded amount of memory, because the program need to remember the matched string in “\1”, which can be arbitrarily long. In formal regular language, the memory cost should be bounded when regular expression is given: bounded by the size of corresponding DFA. Thus, “\1” kind of stuff gives regular expression much more power.

Next question is: is “regular expression” in programming language equivalent to Turing Machine?