IBM Wants Patent For Regex SSN Validation
theodp writes "What do you get when you combine IBM contributors with the Dojo Foundation? A patent for Real-Time Validation of Text Input Fields Using Regular Expression Evaluation During Text Entry, assuming the newly-disclosed Big Blue patent application passes muster with the USPTO. IBM explains that the invention of four IBMers addresses a 'persistent problem that plagues Web form fields' — e.g., 'a social security number can be entered with or without dashes.' A non-legalese description of IBM's patent-pending invention can be found in The Official Dojo Documentation. While IBM has formed a Strategic Partnership With the Dojo Foundation which may protect one from a patent infringement lawsuit over validating phone numbers, concerns have been voiced over an exception clause in IBM's open source pledge."
Online Prior Art at the Regex Library from 2004:
Put that into your favorite JavaScript regular expression object and write a stupid onChange reference to it in your HTML and ... tada! Too complicated? Here's some more prior art. Or here. A little bit of Googling must be too much for the USPTO.
Are we suddenly shocked to discover one line of code can be patented when a whole mess of code can be patented?
My work here is dung.
The first claim mentions the real-time nature of the validation, whereas the example regexes are for validating a completed string. This is still silly and obvious, but you may have a harder time finding specific prior art for this case.
I am becoming gerund, destroyer of verbs.
I'd like to assert that I've personally written prior art.
If you read the patent application, you'll see they aren't patenting just validation of a text field. They are patenting the idea of validating a string, one character at a time, as it is entered by the user. As the string is entered, when the regex finds invalid characters, a "visual change" is made to the input to let the user know they made a mistake.
An example they give is that in an email input field, as soon as the user enters a comma, the comma would change colors.
It's still not groundbreaking, but it's not quite as trivial as it sounds.
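To make the comma example concrete, here is a minimal sketch of per-character checking. It is not the method from the application; the names `EMAIL_CHAR`, `invalid_positions`, and `mark_invalid` are made up for illustration, the allowed-character class is a deliberate simplification of real email syntax, and square brackets stand in for the color change.

```python
import re

# Hypothetical per-character rule: characters allowed anywhere in an email
# address. A simplification for illustration, not a full email grammar.
EMAIL_CHAR = re.compile(r"[A-Za-z0-9.@_%+-]")


def invalid_positions(text, allowed=EMAIL_CHAR):
    """Return the indexes of characters that the per-character pattern rejects."""
    return [i for i, ch in enumerate(text) if not allowed.fullmatch(ch)]


def mark_invalid(text, allowed=EMAIL_CHAR):
    """Wrap rejected characters in [ ] as a stand-in for a color change."""
    bad = set(invalid_positions(text, allowed))
    return "".join(f"[{ch}]" if i in bad else ch for i, ch in enumerate(text))


print(mark_invalid("joe,smith@example.com"))  # joe[,]smith@example.com
```

A real widget would rerun something like `invalid_positions` on every keystroke and restyle only the reported indexes.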
You didn't read the patent application, did you?
They are not patenting a regular expression to validate social-security numbers; they are patenting an entire validation system for web applications, in which there is an API for a developer to specify a regular expression, and the framework will then validate the user input in real time, while the front-end highlights the specific characters that caused the failure. The particular problem they are trying to solve is the user confusion when they submit a form which tells them that a field was rejected without telling them what's wrong with the input.
This is not to say that there isn't prior art for that, but as you can see it is much more than just a patent on a simple reg-exp pattern.
-dZ.
Carol vs. Ghost
Wow! All this steam and no one read the patent. It's been a while since the Slashdotter stereotype was so well validated.
The patent is for incremental validation as the characters come in. The text input widget is primed with the regex and validates each character as it is keyed, and reacts immediately if it gets an invalid-in-context character. The effect is that it's not possible to enter an invalid string.
Whether you think this is novel or not, it's not ordinary.
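The "can't enter an invalid string" behavior can be sketched as a prefix check on each keystroke. This is only an illustration under assumptions: the target format is taken to be NNN-NN-NNNN, the prefix pattern `SSN_PREFIX` is written by hand (Python's standard `re` module has no partial-match mode), and `accept_key` and `type_keys` are hypothetical helper names.

```python
import re

# Assumed target format: NNN-NN-NNNN.
SSN_FULL = re.compile(r"\d{3}-\d{2}-\d{4}")

# Hand-written pattern for every string that can still grow into a full match.
# The empty string counts as a valid prefix.
SSN_PREFIX = re.compile(r"\d{0,3}|\d{3}-\d{0,2}|\d{3}-\d{2}-\d{0,4}")


def accept_key(current, key):
    """Return the new field value, or the old one if the key cannot lead to a match."""
    candidate = current + key
    return candidate if SSN_PREFIX.fullmatch(candidate) else current


def type_keys(keys):
    """Feed keystrokes one at a time, silently dropping rejected ones."""
    value = ""
    for key in keys:
        value = accept_key(value, key)
    return value


print(type_keys("12a3-45-6789x"))  # 123-45-6789
```

Note that this only guarantees the field never holds something unrecoverable; an incomplete value such as `123-4` is still a valid prefix, so a final check against `SSN_FULL` is needed on submit.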
I'm a Programmer. That's one level above Software Engineer and one level below Engineer.
Strictly speaking, it does, but it might be large. As a quick and dirty test, here's the result of evaluating (regexp-opt (loop for x from 0 to 700 collect (format "%d" x )) nil) in Emacs:
"1\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|2\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|3\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|4\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|5\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|6\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|7\\(?:00\\|[0-9]\\)\\|8[0-9]\\|9[0-9]\\|[0-9]"
What regular expressions can't do is recognize languages that aren't regular. Roughly speaking, if what you're trying to match has a maximum length, the set of valid strings is finite and you can match it with a regular expression, even if that expression is just a long alternation. (For the formal tool used to show that a language is not regular, see the Pumping Lemma.)
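As a sanity check on the finite-set argument, here is a small sketch. The pattern `RANGE_0_700` is a hand-written compact equivalent of the generated alternation above (decimal integers 0 to 700 with no leading zeros), and `regex_for_finite_set` is a hypothetical helper showing that any finite list of strings can be turned into a regular expression by brute force.

```python
import re

# Hand-written equivalent of the generated alternation: canonical decimal
# integers 0..700 with no leading zeros.
RANGE_0_700 = re.compile(r"0|[1-9][0-9]?|[1-6][0-9]{2}|700")


def in_range(text):
    """True if text is the canonical decimal form of an integer in 0..700."""
    return RANGE_0_700.fullmatch(text) is not None


def regex_for_finite_set(words):
    """Brute-force construction: one escaped alternative per allowed string."""
    return re.compile("|".join(re.escape(w) for w in words))


# Exhaustive comparison against ordinary integer arithmetic.
mismatches = [n for n in range(2000) if in_range(str(n)) != (n <= 700)]
print(mismatches)  # []
```

The brute-force version grows with the size of the set, which is the "it might be large" caveat; the compact pattern is what a human (or an optimizer) factors it down to.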
I disagree philosophically with our current legal system allowing software patents. However, it never ceases to amaze me how the internets take a patent, don't read it or understand it, and then complain about things that don't even make sense with regard to the patent in question.
If you read the actual patent, it is talking about validating the text input as the characters are being typed in and highlighting the specific characters that don't match the regular expression. For example, if you type in an SSN as 1112-113-1111, then the 2 and 3 within the text field would be highlighted (e.g. in red) as not matching the regular expression for an SSN. I think the key is that the error highlighting is done inside the text field. The highlighting of the text wouldn't occur until some timer expired (e.g. 200 ms without any new typing). This makes it so that the error highlights don't show up while you are typing, only once you stop. This is definitely more novel than the comments on this article make it out to be.
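A rough sketch of the idle-timer idea, for anyone who wants to see it in code. Everything here is an assumption for illustration rather than the application's actual mechanism: the NNN-NN-NNNN layout, the rule that "extra" digits in a group are the offending ones, and the names `ssn_error_positions` and `IdleHighlighter`. Timestamps are passed in explicitly so the logic can be exercised without a real clock; a UI would feed it something like `time.monotonic()`.

```python
IDLE_SECONDS = 0.2          # the 200 ms figure from the comment
GROUP_LENGTHS = (3, 2, 4)   # assumed SSN layout NNN-NN-NNNN


def ssn_error_positions(text):
    """Indexes of characters that break the assumed NNN-NN-NNNN layout."""
    bad = []
    index = 0
    for group_no, group in enumerate(text.split("-")):
        if group_no < len(GROUP_LENGTHS):
            limit = GROUP_LENGTHS[group_no]
        else:
            limit = 0
            bad.append(index - 1)  # the surplus dash that opened this group
        for offset, ch in enumerate(group):
            if ch not in "0123456789" or offset >= limit:
                bad.append(index + offset)
        index += len(group) + 1  # skip past the dash that ended this group
    return bad


class IdleHighlighter:
    """Defers highlighting until no key has arrived for idle_seconds."""

    def __init__(self, find_errors, idle_seconds=IDLE_SECONDS):
        self.find_errors = find_errors
        self.idle_seconds = idle_seconds
        self.text = ""
        self.last_key_time = None

    def on_key(self, text, now):
        """Record the new field contents and restart the idle clock."""
        self.text = text
        self.last_key_time = now

    def poll(self, now):
        """Return positions to highlight, or None while the user is still typing."""
        if self.last_key_time is None:
            return None
        if now - self.last_key_time < self.idle_seconds:
            return None
        return self.find_errors(self.text)


print(ssn_error_positions("1112-113-1111"))  # [3, 7]
```

With the input from the comment, indexes 3 and 7 are the 2 and the 3, and `poll` only reports them once 200 ms have passed since the last `on_key` call.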
Should this or other software algorithms be patentable? No. However, companies like IBM are forced to patent because if they don't, patent trolls sue them and win because the defendants have trouble proving prior art. It is not illogical for companies like IBM to simultaneously pursue patent reform and continue to patent as much as possible under the current legislation. This is just taking advantage of the broken system while talking about how broken it is.