Slashdot Mirror


IBM Wants Patent For Regex SSN Validation

theodp writes "What do you get when you combine IBM contributors with the Dojo Foundation? A patent for Real-Time Validation of Text Input Fields Using Regular Expression Evaluation During Text Entry, assuming the newly-disclosed Big Blue patent application passes muster with the USPTO. IBM explains that the invention of four IBMers addresses a 'persistent problem that plagues Web form fields' — e.g., 'a social security number can be entered with or without dashes.' A non-legalese description of IBM's patent-pending invention can be found in The Official Dojo Documentation. While IBM has formed a Strategic Partnership With the Dojo Foundation which may protect one from a patent infringement lawsuit over validating phone numbers, concerns have been voiced over an exception clause in IBM's open source pledge."

9 of 281 comments

  1. Prior Art so Prior It Hurts by eldavojohn · · Score: 5, Informative
    Patent Application Date: November 20, 2007
    Online Prior Art at the Regex Library from 2004:

    ^(?!000)([0-6]\d{2}|7([0-6]\d|7[012]))([ -]?)(?!00)\d\d\3(?!0000)\d{4}$

    Put that into your favorite Javascript regular expression object and write a stupid onChange reference to it in your HTML and ... tada! Too complicated? Here's some more prior art. Or here. A little bit of Googling must be too much for the USPTO.
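    For anyone who wants to poke at that pattern, here is a minimal sketch. It uses Python rather than JavaScript purely for illustration (the pattern itself relies only on lookaheads and a backreference, which both engines support). The function name `looks_like_ssn` is made up for this example, and the check is format-only: it says nothing about whether a number was ever issued.

```python
import re

# The pattern quoted above, unchanged. Group 3 captures the separator
# (space, dash, or nothing) and \3 forces the second separator to match it.
SSN_PATTERN = re.compile(
    r"^(?!000)([0-6]\d{2}|7([0-6]\d|7[012]))([ -]?)(?!00)\d\d\3(?!0000)\d{4}$",
    re.ASCII,
)


def looks_like_ssn(text):
    """Return True if text has the shape the pattern accepts (format check only)."""
    return SSN_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["078-05-1120", "078051120", "078-05 1120", "000-12-3456"]:
        print(sample, looks_like_ssn(sample))
```

    Note that mixed separators are rejected because of the backreference, and areas above 772 are rejected by the first group.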

    Are we suddenly shocked to discover one line of code can be patented when a whole mess of code can be patented?

    --
    My work here is dung.
    1. Re:Prior Art so Prior It Hurts by Rei · · Score: 5, Informative

      The amazing part is that IBM is wasting this kind of money applying for a patent that has no chance of standing up in court, if the USPTO is even dumb enough to grant it in the first place. I'm in the process of applying for a software patent myself (I know, summon the chorus of boos; but having it could be the difference between being able to raise VC and not being able to raise VC for my starting business; loans, too, are often secured against your IP). These things don't come cheap -- mostly in terms of legal costs. As in a $5k retainer, $5-10k total for a single patent, more if it takes multiple patents to ensure sufficient protection, and if you want international protection, it can go up to $100k or so. Also, from discussions with my attorney, it's really hard to get away with the "bloody obvious" software patents anymore after all of the blowback from things like the Amazon 1-click patent.

      I'm surprised they'd waste the money trying. Perhaps their legal department didn't have enough work to do but they didn't want to cut staff.

      --
      Give a boy a gun and you arm him for a day. Teach him how to make a gun, and the whole metaphor breaks down.
    2. Re:Prior Art so Prior It Hurts by JWSmythe · · Score: 5, Informative

          It's not trivial, but not impossible.

          The first 3 digits are the area (state) code.

          The next 2 digits are the group.

          The last 4 digits are the serial number.

          There is no check digit, so no further math is required to validate it.

          State codes are listed here http://www.socialsecurity.gov/employer/stateweb.htm

          The highest issued group as of May 01 2009 is listed here: http://www.socialsecurity.gov/employer/ssns/highgroup.txt

          You can pull the high group file back to November 2003 from the SSA site here: http://www.socialsecurity.gov/employer/ssnvhighgroup.htm

          The group numbers are used out of order for "administrative" reasons.

          The groups are assigned as:

          ODD 01 -> 09
          EVEN 10 -> 98
          EVEN 02 -> 08
          ODD 11 -> 99

          Area 000 is never issued.
          Group 00 is never issued.
          Serial 0000 is never issued.

          The Area (state) code is based on where the card is issued, not where the person was born. If you were born in NYC, but your number was issued in California, you would have a California area (state) code.

          Now, the SSN is generally requested by the hospital, so if you have a baby born in the US, part of the stack of paperwork includes the SSN request form. In those cases, obviously the birth state and SSN state should match, unless for some odd reason the request is sent to another state.

          When I was born, there was no requirement to get a SSN issued immediately, and my family moved when I was 5, so my SSN was issued by the second state.

          The logic to test if an SSN has been issued is pretty easy with a couple of tables in a DB, or a whole lot of hard-coded crud that has to be updated monthly.
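          As a rough sketch of that logic in Python: the group order below follows the four ranges listed above, and the "has it been issued" test compares a group's position in that order against the highest group issued for the area. The `high_group` dictionary is a hypothetical stand-in for the SSA high group file; the value used in the demo is a placeholder, not real SSA data, and the function names are invented for this example.

```python
def group_issue_order():
    """Group numbers in the order listed above: odd 01-09, even 10-98, even 02-08, odd 11-99."""
    return (
        list(range(1, 10, 2))       # odd 01 -> 09
        + list(range(10, 99, 2))    # even 10 -> 98
        + list(range(2, 9, 2))      # even 02 -> 08
        + list(range(11, 100, 2))   # odd 11 -> 99
    )


ORDER = group_issue_order()
# Position of each group in the issuing sequence (0 = issued first).
RANK = {group: position for position, group in enumerate(ORDER)}


def could_have_been_issued(area, group, serial, high_group):
    """Check area/group/serial against a table of highest issued groups.

    high_group maps area number -> highest group issued so far for that area
    (hypothetical table; in practice it would be loaded from the SSA file).
    """
    if area == 0 or not 1 <= group <= 99 or not 1 <= serial <= 9999:
        return False  # area 000, group 00 and serial 0000 are never issued
    if area not in high_group:
        return False  # area not present in the table
    return RANK[group] <= RANK[high_group[area]]


if __name__ == "__main__":
    placeholder_table = {1: 4}  # made-up entry: area 001, high group 04
    print(could_have_been_issued(1, 98, 1234, placeholder_table))  # 98 comes before 04
    print(could_have_been_issued(1, 6, 1234, placeholder_table))   # 06 comes after 04
```

          The point of `RANK` is that "less than or equal to the high group" has to be judged in issuing order, not numeric order.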

      --
      Serious? Seriousness is well above my pay grade.
    3. Re:Prior Art so Prior It Hurts by Chandon Seldon · · Score: 3, Informative

      If it gets granted, how much lawyer time will it take to get overturned later?

      This is a setup for a denial of service attack on the budgets / legal resources of smaller companies in future legal engagements.

      --
      -- The act of censorship is always worse than whatever is being censored. Always.
  2. Real time is the key claim by wiredlogic · · Score: 3, Informative

    The first claim mentions the real time nature of the validation. The example regexes are for validating a completed string. This is still silly and obvious but you may have a harder time finding specific prior art for this case.

    --
    I am becoming gerund, destroyer of verbs.
    1. Re:Real time is the key claim by radtea · · Score: 3, Informative

      I assume that most Javascript validation waits until all of the text has been entered.

      Your assumption is false. It's called an OnChange event: http://www.w3schools.com/jsref/jsref_onchange.asp

      I am not a "Web programmer" but anyone with even a passing familiarity with JavaScript has seen this.

      The first claim in the patent is: "1. A system for providing real-time validation of text input fields in a Web page comprising: a validation-enhanced text input element configured to contain an attribute for a validation expression for a text field in a rendered Web page, wherein the validation-enhanced text input element is contained within a source code document corresponding to the rendered Web page; and an input text validator configured to validate a user-entered character of the text field against the validation expression in real-time and visually indicate invalid user-entered characters."

      So these losers have filed a patent application in which the first claim is exactly nothing but a completely standard bit of JavaScript code. People have been doing this kind of real-time validation and response for years and years and years. JavaScript is designed to do it.

      This is by far the most egregiously stupid patent application we have seen on /. in a long time.

      Why IBM is doing this is a complete mystery, although "never assume venality where stupidity will do" comes forcibly to mind.

      --
      Blasphemy is a human right. Blasphemophobia kills.
  3. Have you read the patent application? by dzfoo · · Score: 5, Informative

    You didn't read the patent application, did you?

    They are not patenting a regular expression to validate social-security numbers; they are patenting an entire validation system for web applications, in which there is an API for a developer to specify a regular expression, and the framework then validates the user input in real time while the front end highlights the specific characters that caused the failure. The particular problem they are trying to solve is user confusion when a submitted form reports that a field was rejected without saying what is wrong with the input.

    This is not to say that there isn't prior art for that, but as you can see it is much more than just a patent on a simple reg-exp pattern.
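    To make "highlights the specific characters that caused the failure" concrete, here is a small Python sketch. It is not the mechanism from the application or from Dojo: it swaps the general regular expression for a much simpler per-position template of the dashed form NNN-NN-NNNN, and the names `SSN_TEMPLATE`, `invalid_positions` and `underline` are invented for this example.

```python
import re

# Hypothetical per-position template for NNN-NN-NNNN:
# one tiny pattern per character slot.
SSN_TEMPLATE = [r"\d"] * 3 + ["-"] + [r"\d"] * 2 + ["-"] + [r"\d"] * 4


def invalid_positions(text, template=SSN_TEMPLATE):
    """Return the indexes of characters that do not fit their slot in the template."""
    bad = []
    for index, char in enumerate(text):
        past_end = index >= len(template)
        if past_end or re.fullmatch(template[index], char, re.ASCII) is None:
            bad.append(index)
    return bad


def underline(text, template=SSN_TEMPLATE):
    """Return a marker line with '^' under each offending character."""
    bad = set(invalid_positions(text, template))
    return "".join("^" if index in bad else " " for index in range(len(text)))


if __name__ == "__main__":
    typed = "12a-4x"
    print(typed)
    print(underline(typed))
```

    A partially typed value such as "123-4" produces no markers, since every character so far fits its slot.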

            -dZ.

    --
    Carol vs. Ghost
    ...Can you save Christmas?
  4. Slashdot true to form by thethibs · · Score: 4, Informative

    Wow! All this steam and no one read the patent. It's been a while since the Slashdotter stereotype was so well validated.

    The patent is for incremental validation as the characters come in. The text input widget is primed with the regex and validates each character as it is keyed, and reacts immediately if it gets an invalid-in-context character. The effect is that it's not possible to enter an invalid string.

    Whether you think this is novel or not, it's not ordinary.
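    A minimal Python sketch of that keystroke-by-keystroke behaviour, under a big simplifying assumption: the "is this still a valid prefix?" pattern below was written by hand for the dashed form NNN-NN-NNNN. Deriving such a prefix pattern automatically from an arbitrary regex is the harder part and is not shown; `PREFIX` and `type_keys` are names invented here.

```python
import re

# Hand-written pattern that accepts every prefix of NNN-NN-NNNN,
# including the empty string and the complete value.
PREFIX = re.compile(r"\d{0,3}|\d{3}-\d{0,2}|\d{3}-\d{2}-\d{0,4}", re.ASCII)


def type_keys(keys):
    """Feed characters one at a time, dropping any that would break the prefix.

    Returns the resulting field contents and the list of rejected characters.
    """
    field = ""
    rejected = []
    for char in keys:
        candidate = field + char
        if PREFIX.fullmatch(candidate):
            field = candidate       # still a valid prefix, accept the key
        else:
            rejected.append(char)   # invalid in this context, ignore the key
    return field, rejected


if __name__ == "__main__":
    print(type_keys("12x3-4-5-6789"))
```

    Because every accepted keystroke leaves the field as a valid prefix, the field can be incomplete but never malformed.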

    --
    I'm a Programmer. That's one level above Software Engineer and one level below Engineer.
  5. Re:More to it than that. by QuoteMstr · · Score: 3, Informative

    Strictly speaking, it does, but it might be large. As a quick and dirty test, here's the result of evaluating (regexp-opt (loop for x from 0 to 700 collect (format "%d" x )) nil) in Emacs:

    "1\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|2\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|3\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|4\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|5\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|6\\(?:0[0-9]\\|1[0-9]\\|2[0-9]\\|3[0-9]\\|4[0-9]\\|5[0-9]\\|6[0-9]\\|7[0-9]\\|8[0-9]\\|9[0-9]\\|[0-9]\\)\\|7\\(?:00\\|[0-9]\\)\\|8[0-9]\\|9[0-9]\\|[0-9]"

    What regular expressions can't do is match strings that aren't described by a regular language. Roughly speaking, if what you're trying to match has a maximum length, you can match it with a regular expression. (For a more formal description, see the Pumping Lemma.)
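    The same point can be sketched in Python without any optimization step: a finite set of strings is always a regular language, because a plain alternation of every member matches exactly that set. This is only an illustration of the idea, not a port of `regexp-opt`; the helper name `finite_set_regex` is made up.

```python
import re


def finite_set_regex(strings):
    """Compile an unoptimized alternation that matches exactly the given strings."""
    return re.compile("|".join(re.escape(s) for s in strings))


# Every decimal integer from 0 to 700, written without leading zeros.
NUMBERS_0_TO_700 = finite_set_regex(str(n) for n in range(701))


def in_range(text):
    """True if text is one of the 701 strings "0" through "700"."""
    return NUMBERS_0_TO_700.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["0", "42", "700", "701", "007"]:
        print(sample, in_range(sample))
```

    The pattern grows with the size of the set, which is the "it might be large" caveat above.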