I created this blog to share my thoughts and learning. It is hosted on a humble Raspberry Pi sitting under my table to save on electricity bills, so please bear with me if you find the site slow.
I imported some posts I wrote in school which I still find interesting. For some unknown reason, the source code in those posts is not pretty-printed, and I am too lazy to fix it.
gdb is quite a useful tool. Despite its text interface, it can do more than many GUI debuggers. I am still learning it, and this post serves as a memo.
I found a “regular expression” which matches exactly the strings consisting of a non-prime number of 1s.
This is quite surprising, because it can be proven (for example, via the pumping lemma) that the set of strings with a non-prime number of 1s is not a regular language. That is to say, no regular expression can match that language.
However, this regular expression works.
The following Python code prints all prime numbers within [0, 99]:

import re
# n is prime iff "1"*n does NOT match the pattern
print([n for n in range(100) if not re.match(r"^1?$|^(111*)\1+$", "1" * n)])
So why does it work? Let’s look at the regular expression:
^1?$ matches “” and “1”, i.e. zero or one 1s; neither 0 nor 1 is prime. In ^(111*)\1+$, the group (111*) matches n 1s for some n>=2, and \1+ repeats that same group at least once more. The whole pattern therefore matches n*k 1s with n>=2 and k>=2, which is exactly the composite numbers.
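To see this in action, here is a small sketch using Python’s re module (the variable names are mine): for a composite n, the captured group ends up holding a nontrivial divisor of n, found purely by the engine’s backtracking.

```python
import re

PATTERN = re.compile(r"^(111*)\1+$")

# For composite n, group(1) holds d ones, where d is a divisor of n.
for n in [4, 6, 9, 5, 7]:
    m = PATTERN.match("1" * n)
    if m:
        d = len(m.group(1))  # a nontrivial divisor found by backtracking
        print(f"{n} = {d} * {n // d}")
    else:
        print(f"{n} has a prime number of 1s")
```

Note that the divisor found depends on the engine’s backtracking order: the greedy group tries the longest prefix first, so for 9 it settles on a group of 3 ones repeated three times.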
Then what’s wrong with the theory? Actually, nothing is wrong. The “regular expression” in most programming languages is much more expressive than the formal regular expressions defined in computation theory. In particular, backreferences like “\1” are not allowed in formal regular expressions. A backreference may require an unbounded amount of memory, because the engine needs to remember the string matched by the group, which can be arbitrarily long. A formal regular language, by contrast, can be recognized with memory that is bounded once the expression is given: bounded by the size of the corresponding DFA. Backreferences like “\1” therefore give programming-language regular expressions strictly more power.
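For contrast, here is a minimal sketch (the function name is mine) of how a genuine regular language is recognized in constant memory: the language of strings of 1s whose length is a multiple of 3 corresponds to a 3-state DFA, and matching only ever stores the current state.

```python
# DFA for the regular language (111)*: strings of 1s whose length is a
# multiple of 3. Memory use is one integer (the current state), fixed
# once the DFA is built, no matter how long the input is.
def multiple_of_three(s: str) -> bool:
    state = 0                      # states 0, 1, 2 = length mod 3
    for ch in s:
        if ch != "1":              # reject anything outside the alphabet {1}
            return False
        state = (state + 1) % 3    # one table lookup per character, no backtracking
    return state == 0              # accept in state 0

print(multiple_of_three("111111"))  # True
print(multiple_of_three("1111"))    # False
```

No backreference-style regex trick can shrink primality testing down to such a fixed-state machine, which is exactly why the language above falls outside the formal regular languages.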
The next question is: are “regular expressions” in programming languages equivalent to Turing machines?