IBM Wants Patent For Regex SSN Validation
theodp writes "What do you get when you combine IBM contributors with the Dojo Foundation? A patent for Real-Time Validation of Text Input Fields Using Regular Expression Evaluation During Text Entry, assuming the newly-disclosed Big Blue patent application passes muster with the USPTO. IBM explains that the invention of four IBMers addresses a 'persistent problem that plagues Web form fields' — e.g., 'a social security number can be entered with or without dashes.' A non-legalese description of IBM's patent-pending invention can be found in The Official Dojo Documentation. While IBM has formed a Strategic Partnership With the Dojo Foundation which may protect one from a patent infringement lawsuit over validating phone numbers, concerns have been voiced over an exception clause in IBM's open source pledge."
Online Prior Art at the Regex Library from 2004:
Put that into your favorite Javascript regular expression object and write a stupid onChange reference to it in your HTML and ... tada! Too complicated? Here's some more prior art. Or here. A little bit of Googling must be too much for the USPTO.
Are we suddenly shocked to discover one line of code can be patented when a whole mess of code can be patented?
My work here is dung.
What is this bullshit? "A persistent problem is dashes in SSNs"???
How fucking hard is it to strip non-numeric characters from a string?
I cannot believe there could be such programmer incompetence; no, it has to be some managerial cluelessness and hard-headedness.
Are you fucking kidding me? Did they just really patent the format "###-##-####"? I didn't RTFA because I didn't want my head to explode.
this is my sig
^\d{3}-\d{2}-\d{4}$
ahh thats right baby, patent infringement!
I live on the edge...
^\d{3}-\d{2}-\d{4}$
from 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
to 45 2F 6E 40 3C DF 10 71 4E 41 DF AA 25 7D 31 3F
this is bullshit.
We parse SSNs all day long. I think WE may have prior art.
/^\d{3}-?\d{2}-?\d{4}$/g
How is that a persistent problem?
~ I am logged on, therefore I am.
I see lots of comments coming up about how ridiculous this is. Maybe that's the point. Maybe the best way to bring about patent reform is to patent every simple thing there is. You have to remember that IBM is paying to patent something as simple as:
s/[^0-9]+//g
which most certainly has prior art all over the web. Why would it be worth IBM's money and time to do such a thing? The best reason I can come up with is that they want to prove a point. There's probably quite a bit an open-source firm can gain by causing a collapse of the software patent system, and this may be the best way to do it.
How is it an "issue" that SSNs can be entered with or without dashes? Just strip the dashes in post-processing, then add them back if you absolutely have to have them...
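For what it's worth, the strip-then-reformat idea is only a few lines. This is a minimal sketch, not anything taken from the patent application; the function names normalizeSsn and formatSsn are made up for illustration, and the only check is that nine digits remain.

```javascript
// Keep only the digits, whatever separators the user typed.
function normalizeSsn(input) {
  return String(input).replace(/\D/g, "");
}

// Re-insert the dashes for display.
// Returns null when the input does not contain exactly nine digits.
function formatSsn(input) {
  const digits = normalizeSsn(input);
  if (digits.length !== 9) return null;
  return `${digits.slice(0, 3)}-${digits.slice(3, 5)}-${digits.slice(5)}`;
}
```

Store the output of normalizeSsn and call formatSsn only when the number has to be shown with dashes.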
You say people put in an SSN without dashes when your website requests them?
put a damned example on your site, like this: nnn-nn-nnnn
The first claim mentions the real time nature of the validation. The example regexes are for validating a completed string. This is still silly and obvious but you may have a harder time finding specific prior art for this case.
I am becoming gerund, destroyer of verbs.
IBM deserves an Oscar and a Nobel Prize for this!!! This problem has persistently plagued me for ages! I'm glad someone finally came up with a solution to this. My only recourse up to this point has been to avoid SSN fields on any web form. If my boss wants something that requires a unique personal identifier I tell him it can't be done--not unless he wants to hire a team of interns to parse whatever voodoo people put into that SSN field!
Thank you so much for this new knowledge IBM! Now if we can do something about phone number fields I'll be in web developer heaven!
You can read more about it here
excitingthingstodo.blogspot.com
We need more overly-broad patents on embarrassingly horrible user interfaces. In fact, someone ought to patent *all* the common mistakes. That way their lawyers could run around suing everyone building crap.
Program Manager: What the hell is happening?! Why is the website down?!
Web Programmer: It's the users, sir, one of them put dashes in their SSN on the form!
Program Manager: I don't have time for this mumbo jumbo geek jargon... what are you trying to tell me? This is an emergency, accounting said our money is leaving!
Web Programmer: Well, you see the dashes are inside the string.
Program Manager: Inside? How is this possible?
Web Programmer: Well, the user must have paused to push the dash key, sir.
Program Manager: So if the dashes are inside the string, we have to get them out. Is there someone we can pay for this service?
Web Programmer: I'm afraid it's too complicated for that. But maybe if we had it write to a file and one of us kept refreshing a text editor on that file... we could remove it and then it could read back the file after waiting for a few seconds. We would have to hope that more users don't come while we are performing emergency dash extraction.
Program Manager: Goddamnit! Why didn't testing find this?!
Web Programmer: Well, they did but to fix this bug we just removed the dash keys on their keyboards.
Program Manager: Can we do that to each of the users?
*IBM employee enters with massive box labeled "Enterprise SSN Dash Extractor"*
IBM Sales Rep: Gentlemen, let IBM solve all your SSN problems for a mere $2,000 per site license!
My work here is dung.
Patent Application 973255489
"Method of enhancing sarcasm through the intentional introduction of typographical errors within multiple exclamation marks."
Within a set of not fewer than four (4) and not more than eight (8) Exclamation Marks ("!"), an Erroneous Character from the set of characters [1, 2, @, #, ~, `] is inserted after the third or fourth Exclamation Mark. The Erroneous Character is perceived by the reader as a typographical error consistent with hurried, careless typing, reinforcing any sarcasm contained in the textual comment preceding the Exclamation Marks.
The Masked Input Plugin already solves this pretty nicely.
$("#ssn").mask("999-99-9999");
is pretty easy to implement.
Yes, regular expressions are more powerful. They are also - sorry o mighty nerds of slashdot - completely confusing to the majority of more casual developers who want to be able to drop in a line of quick code and move on to making their drop shadowed corners even rounder.
Heck a lawyer patented the method for swinging on a swing
Why not IBM patenting something stupid like this! Maybe enough of these will bring the patent system into reform or its destruction...
Ref:
http://www.google.com/patents?vid=6368227
http://www.freepatentsonline.com/6368227.html
http://en.wikipedia.org/wiki/Reexamination
The Truth is a Virus!!!
I'd like to assert that I've personally written prior art.
As ridiculous as it is, Microsoft has a patent on the "elseif" statement, so every non-Microsoft programming language now has to suffer with just "else if". *slight sarcasm*
Microsoft, Apple, Google, Amazon what's the difference? All steal money from devs and control with walled gardens.
concerns have been voiced over an exception clause in IBM's open source pledge."
Oh boohoo. Why would one expect IBM to continue to give you protection against a lawsuit using these patents against you when you engage in a patent lawsuit against them? I don't see how this would worry anyone in the OSS community as they aren't known for launching patent claims against other OSS. I really feel no sympathy for any patent trolls who try to sue against OSS and then get caught in a shitstorm from IBM.
Actually, they're trying to patent "A system for providing real-time validation of text input fields in a Web page comprising: a validation-enhanced text input element configured to contain an attribute for a validation expression for a text field in a rendered Web page, wherein the validation-enhanced text input element is contained within a source code document corresponding to the rendered Web page; and an input text validator configured to validate a user-entered character of the text field against the validation expression in real-time and visually indicate invalid user-entered characters," and "A method for providing real-time validation of text input fields in a Web page comprising: receiving a user-entered character in a text field displayed in a Web page; immediately validating the user-entered character against a validation expression contained within a validation-enhanced text input element associated with the text field, wherein the validation expression defines a set of acceptable characters and character positions for the text field; and when the user-entered character is determined invalid, visually marking the user-entered character," and "An input text validator for validating a text field of a Web page in real-time comprising: a partial input expression generator configured to generate an expanded version of a validation expression, wherein the expanded version of the validation expression defines a set of acceptable characters and character positions for a text field of a Web page; and an invalid text highlighter configured to visually highlight a user-entered character in the text field when the user-entered character is determined as invalid for the expanded validation expression."
Remember, patents are all about the claims. You don't know what they're "trying to patent" until you have read and understood the claims.
Today's Sesame Street was brought to you by the number e.
as an Open/Free software hero? This action seems quite consistent with the IBM of the 1970's.
I run into this problem with entering phone numbers into web forms. Some want them as xxxxxxxxxx, some as xxx-xxx-xxxx, some as (xxx)xxx-xxxx, and even other weirdness. Some sites take whatever I put in and mold it to their desired format; others tell me my input is invalid and make me enter it again (some even tell me the desired format). Some sites actually break it up into three input fields with appropriate limits on the number of characters.
I've seen similar cases with SSNs.
It's pretty obvious that some sites have no trouble parsing the input data and making it fit what's expected. How is this a novel concept to be patented?
Edward Burr
Having a smoking section in a restaurant is like having a peeing section in a swimming pool.
If you read the patent application, they aren't patenting just validation of a text field. They are patenting the idea of validating a string, one character at a time, as it is entered by the user. As the string is entered, when invalid characters are found using regex, a "visual change" is made to the input to let the user know they made a mistake.
An example they give is that in an email input field, as soon as the user enters a comma, the comma would change colors.
It's still not groundbreaking, but it's not quite as trivial as it sounds.
While actual SSN validation is slightly more complicated than a simple regexp, awarding a patent for an obvious algorithm is lame.
Lucky for me I can prove the algorithm I wrote based on Social Security Administration guidance existed before IBM was awarded any patent.
At the time of this post I announce that it is officially released under free licensing.
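To show what "slightly more complicated than a simple regexp" can mean, here is a rough sketch of a structural check. It is not the algorithm referred to above, which isn't shown in this thread. It assumes the commonly cited Social Security Administration rules that area numbers 000, 666 and 900-999, group number 00 and serial number 0000 are never issued. The name isPlausibleSsn is made up for the example.

```javascript
// Structural SSN check. Accepts "AAA-GG-SSSS" or the same digits with
// either dash omitted. Passing only means the number is well formed,
// not that it was ever issued to anyone.
function isPlausibleSsn(input) {
  const m = /^(\d{3})-?(\d{2})-?(\d{4})$/.exec(String(input).trim());
  if (!m) return false;
  const [, area, group, serial] = m;
  // Assumed SSA rules: these area, group and serial values are never assigned.
  if (area === "000" || area === "666" || area[0] === "9") return false;
  if (group === "00") return false;
  if (serial === "0000") return false;
  return true;
}
```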
Did you ever wake up in the morning, with a Zombie Woof behind your eyes? -- FZ
A persistent problem that plagues Web form fields is the proper formatting of data into text fields. A disconnect often exists between a developer and a user as to the proper or an acceptable format for a specific text field. For example, a social security number can be entered with or without dashes.
They aren't trying to fix some nefarious social security number formatting issue.
Completely biased summary.
Morons.
https://black.cirt.vt.edu/valid_ssn/index.php
You didn't read the patent application, did you?
They are not patenting a regular expression to validate social-security numbers, they are patenting an entire validation system for web application, in which there is an API for a developer to specify a regular expression, and the framework will then validate the user input in real-time, while the front-end highlights the specific characters that caused the failure. The particular problem they are trying to solve is the user confusion when they submit a form which tells them that a field was rejected without telling them what's wrong with the input.
This is not to say that there isn't prior art for that, but as you can see it is much more than just a patent on a simple reg-exp pattern.
-dZ.
Carol vs. Ghost
Wow! All this steam and no one read the patent. It's been a while since the Slashdotter stereotype was so well validated.
The patent is for incremental validation as the characters come in. The text input widget is primed with the regex and validates each character as it is keyed, and reacts immediately if it gets an invalid-in-context character. The effect is that it's not possible to enter an invalid string.
Whether you think this is novel or not, it's not ordinary.
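One way to picture a widget that refuses invalid keystrokes is to expand the full pattern into a regex that accepts every prefix of a valid value. The sketch below does that for the strict ###-##-#### format. It is only an illustration of the idea, not the code from the application or from Dojo; SSN_TOKENS, buildPrefixRegex and acceptKeystroke are hypothetical names.

```javascript
// One regex fragment per character position of a complete "###-##-####" value.
const SSN_TOKENS = ["\\d", "\\d", "\\d", "-", "\\d", "\\d", "-", "\\d", "\\d", "\\d", "\\d"];

// Build a regex that matches every prefix of the full pattern by nesting
// optional groups: ^(?:t0(?:t1(?:t2 ...)?)?)?$
function buildPrefixRegex(tokens) {
  let body = "";
  for (let i = tokens.length - 1; i >= 0; i--) {
    body = `(?:${tokens[i]}${body})?`;
  }
  return new RegExp(`^${body}$`);
}

const SSN_PREFIX = buildPrefixRegex(SSN_TOKENS);

// Return the new field value, or the unchanged value if the keystroke
// would make the text stop being a prefix of a valid SSN.
function acceptKeystroke(current, ch) {
  const next = current + ch;
  return SSN_PREFIX.test(next) ? next : current;
}
```

With this, acceptKeystroke("123", "-") yields "123-", while acceptKeystroke("123", "4") leaves the field at "123". The sketch only handles typing at the end of the field.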
I'm a Programmer. That's one level above Software Engineer and one level below Engineer.
I was writing Unix Medical billing systems in the mid 80's on Unix (and using regex), when you gov ppl were still on mainframes. So, I SERIOUSLY DOUBT IT. And I doubt that I wrote the first regex for an SSN. Back then, the ssn WAS a single ID for everybody.
I prefer the "u" in honour as it seems to be missing these days.
A couple of corrections and notes.
1. I accidentally typed a 3-digit year in the date template example. It should be four.
2. "Very practical" should have been bold, not quoted. I inadvertently used a wiki convention out of habit.
3. The template feature of the language was probably added after the "fork" from the gov't project. However, I believe extensions to the language are still not permitted to be copyrighted (IIRC).
Table-ized A.I.
The title--surprise!--sucks. They're not patenting the use of regular expressions to validate anything. They're patenting a field that validates itself as you type, and highlights invalid characters:
You don't see this sort of thing implemented much because individual characters can't be "highlighted" within regular HTML input elements. And the overhead required for putting that sort of thing together just doesn't seem worth it, because you still need to perform traditional server-side validation.
This is also amusing:
Yeah, if you're an idiot.
--I'm so big, my sig has its own sig.
-- See?
A Google search for "social security number" regex returned 11,300 results. I guess all of those people needed a regular expression to NOT validate data.
Also, the patent uses the term validator which is not a word according to most spell checkers that I use. I know this because I type this word frequently when documenting .NET code. Validator is the term Microsoft uses for ASP.NET controls that validate input.
It's one thing to patent an existing idea, but don't steal the made up words from an existing implementation and expect to get away with it.
Damned hard, based on my testing over the last few years.
As an exercise in futility -- the next time you're buying something online, try entering your credit card number with spaces in it, so it's legible, and easier to compare to what's on your card.
It used to be that it'd occasionally work -- but I don't think I've had a single success in the last year or two. They either put in limits so I can't type enough characters, or it gets rejected with no useful message but works fine without them.
... on another note, I once had a member number for a company that on the membership card had a leading zero -- their site worked fine for years. They upgraded the site, and I couldn't log on anymore. After an hour with customer service, they finally told me to log in without the leading zero, and it worked.
Build it, and they will come^Hplain.
(\d{9}|\d{3}-\d{2}-\d{4})
/^([Ss]ame [Bb]at (time, |channel.)){2}$/
Isn't this just making a library for web forms that act like an IBM 3270 terminal? I mean, they patented having a terminal that does field validation decades ago. This is just an alternate software implementation of the same thing, in a web browser.
- "History shows again and again how nature points out the folly of men" -- Blue Oyster Cult, 'Godzilla'
Why don't we try to get USPTO's attention over to Slashdot? Then, if they think they don't understand what's going on with a patent, they can find other peoples' interpretation of it over here. They're bound to understand at least one of a hundred different wordings of that patent in Slashdot's comments.
Any ideas?
2) It takes 4 IBMers to figure this out?
I'm pretty sure this "problem" has already been solved. Perhaps it's still an issue with the Lotus application-to-web server. Since no one outside of IBM actually uses it, no one would have noticed that.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
I disagree philosophically with our current legal system allowing software patents. However it never ceases to amaze me how the internets take a patent, don't read it or understand it and then complain about things that don't even make sense in regards to the patent in question.
If you read the actual patent, it is talking about validating the text input as the characters are being typed in and highlighting the specific characters that don't match the regular expression. For example if you type in a SSN as: 1112-113-1111, then the 2 and 3 within the text field would be highlighted (e.g. highlighted red) as not matching the regular expression for a SSN. I think the key is that the error highlighting is done inside the text field. The highlighting of the text wouldn't occur until some timer expired (e.g. 200 ms without any new typing). This makes it so that the error highlights don't show up as you are typing but as soon as you stop. This is definitely more novel than the comments on this article make it out to be.
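Here is a rough sketch of how the 1112-113-1111 example could come out with the 2 and the 3 flagged. The matching rule is my assumption, not something the application spells out in this thread: a character that doesn't fit the current template slot is flagged and skipped, and the template only advances on a good character. SSN_TEMPLATE, matchesSlot and findInvalidIndices are made-up names.

```javascript
// Positional template for "###-##-####": "d" means any digit,
// anything else must match literally.
const SSN_TEMPLATE = "ddd-dd-dddd";

function matchesSlot(slot, ch) {
  return slot === "d" ? /^\d$/.test(ch) : ch === slot;
}

// Return the indices of characters that do not fit the template.
// A bad character is flagged and skipped; the template position only
// advances on a good character. Characters typed after the template
// is full are flagged as well.
function findInvalidIndices(text, template = SSN_TEMPLATE) {
  const invalid = [];
  let slot = 0;
  for (let i = 0; i < text.length; i++) {
    if (slot < template.length && matchesSlot(template[slot], text[i])) {
      slot++;
    } else {
      invalid.push(i);
    }
  }
  return invalid;
}
```

For "1112-113-1111" this returns [3, 7], the positions of the 2 and the 3. In a page you would presumably call it from a timer that restarts on every keystroke (something like the 200 ms mentioned above) and then style the flagged positions.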
Should this or other software algorithms be patentable? No. However companies like IBM are forced to patent because if they don't then other patent troll companies sue them and win because they have trouble proving prior art. It is not illogical for companies like IBM to simultaneously pursue patent reform and continue to patent as much as possible under the current legislation. This is just taking advantage of the broken system while talking about how broken it is.
I have prior art!!!1!!1!!!
Sounds an awful lot like ASP.NET RegEx Validation using AJAX .... but what do I know, I don't work for IBM or the Patent Office.
.... i better get to the patent office quick.
\d{3}(-|\s|\.)?\d{2}(-|\s|\.)?\d{4}
wow look that includes dashes, spaces, periods and no separator
I wonder if anybody has patented fire yet. I've got to get a piece of that action.
This is not for validating SSN but doing real time validation of text field input.
SSN was only an example in the patent. It could be city names, credit card number, or whatever.
It makes me laugh to see how many users do not read before they write.
I still think it is bullshit but it does have a chance of going through unless someone can find prior art of doing real time javascript/ajax text field validation that display the error as you type.
And do you parse them as they were being typed in and give feedback before the submit button is hit? Or were you too lazy to even read the title of the patent?
ASP.NET MaskedEditExtender
In ASP.NET's AJAX Control Toolkit, MaskedEditExtender masks a TextBox, while MaskedEditValidator uses regular expressions to validate it. That takes care of the web-based prior art pretty neatly.
IBM has clearly stated that it won't assert these patents against the open source community, but will only assert them against anyone who tries to use them against the open source community. 'Nuff said.
Can someone explain to me why 99% of e-commerce sites are unable to handle spaces in credit card numbers? It's a pain in the ass to enter and visually verify a 16 digit number when spaces (which are printed on the card!) are not allowed.
Hogwash: dBASE has been doing just that since the early 80's using "format templates", specifically the "Picture" clause.
True, it didn't do it over the web, but adding such to web browsers could use similar technology. Here are some examples from Microsoft Foxpro (a dBASE clone, more or less):
http://support.microsoft.com/kb/119691
Table-ized A.I.
A regular expression is just a Finite Automaton, i.e. a State Machine. In a regex, state transitions occur when an input character is read. While regexes are usually applied to a complete input string, the basic theory behind them, as well as any other FA, does not stipulate how the next input character is "read" -- all that is required is a stream of input characters.
So, in the case of the FA representing a form field validation regex, a character is "fed" into the regex FA whenever a new character is entered, upon which it is checked whether the FA has transitioned to the "not accepted" state and if so the character corresponding to that transition is flagged as invalid.
If the user presses delete or backspace during entry, one method of avoiding revalidation of the current input string would be to save the state of the FA each time a character is read and associate it with that position in the currently entered text. One would then use this information, for example, when a user moves the cursor and presses backspace or delete, to restore the state of the FA associated with the character position just before any deleted characters and then re-run the FA on the characters entered past the point of deletion. If the user is allowed to insert characters at any point in the text input, then a similar scheme would be used in which the state of the FA associated with the character position just before the newly inserted character is restored and then the FA is re-run on all characters past that point.
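A small sketch of the saved-state scheme just described, using a hand-written automaton for /^\d{3}-?\d{2}-?\d{4}$/. The state encoding and the names step and IncrementalValidator are my own for illustration; a real implementation would derive the automaton from the regex rather than hard-coding it.

```javascript
const START = "0";
const DEAD = "dead";

// Transition function for the automaton behind /^\d{3}-?\d{2}-?\d{4}$/.
// A state is the number of digits read so far, with a trailing "-" if a
// dash was just consumed (so two dashes in a row are rejected).
function step(state, ch) {
  if (state === DEAD) return DEAD;
  const digits = parseInt(state, 10);
  const afterDash = state.endsWith("-");
  if (ch >= "0" && ch <= "9") {
    return digits < 9 ? String(digits + 1) : DEAD;
  }
  if (ch === "-" && !afterDash && (digits === 3 || digits === 5)) {
    return `${digits}-`;
  }
  return DEAD;
}

class IncrementalValidator {
  constructor() {
    this.text = "";
    this.states = [START]; // states[i] = state after the first i characters
  }

  // Restore the saved state at `pos` and re-run only the characters after it.
  rerunFrom(pos) {
    this.states.length = pos + 1;
    let state = this.states[pos];
    for (let i = pos; i < this.text.length; i++) {
      state = step(state, this.text[i]);
      this.states.push(state);
    }
  }

  insert(pos, str) {
    this.text = this.text.slice(0, pos) + str + this.text.slice(pos);
    this.rerunFrom(pos);
  }

  remove(pos, count = 1) {
    this.text = this.text.slice(0, pos) + this.text.slice(pos + count);
    this.rerunFrom(pos);
  }

  // Index of the character that first drove the automaton dead, or -1.
  firstInvalidIndex() {
    const i = this.states.indexOf(DEAD);
    return i === -1 ? -1 : i - 1;
  }

  // True when the whole text is an accepted SSN (nine digits consumed).
  isComplete() {
    return this.states[this.text.length] === "9";
  }
}
```

Typing at the end of the field costs one transition per keystroke. An edit in the middle re-runs only the characters after the edit point.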
It seems IBM's programmers and/or lawyers either failed or did not take any basic Theory of Computation classes when they received their education. Sorry for them, because they've made complete idiots out of themselves now, as every Computer Science professor, researcher and grad student on the Earth is probably laughing their asses off -- "Hey! Did you hear that IBM is trying to patent eighty-plus years of mathematical theory! No, it's true!"
jdb2
As others have already pointed out, this patent is not what the /. story claims.
That's to be expected though - I can't remember the last time /. correctly read and understood the contents of a patent app.
A more reasoned explanation about what's going on here, from one of the dojo toolkit founders:
http://alex.dojotoolkit.org/2009/05/a-quick-word-on-dojo-and-patents/
Did that say from the not worthy-inventions dept?
This sig cannot be proven true.
to anyone who does perl programming
$input =~ m/([0-9]{3})-*([0-9]{2})-*([0-9]{4})/;
$ssn = $1 . $2 . $3;
I developed a whole modular on-the-fly validation library, with cross-dependency support, easy use, and both a server and a client side. I still have it around here somewhere.
I wanna see them sue me on that. I'd kick the living crap out of them. ^^
Any sufficiently advanced intelligence is indistinguishable from stupidity.
I have an entire XML/XSLT/JavaScript/AJAX (written before the term AJAX was coined) that I wrote called "ximple". In it, I had extensive use of RegEx validation of the following forms:
These three things alone provided an extensive amount of real-time, no round-trip validation and was EXTREMELY flexible and easy-to-use (if you knew RegEx).
Over-the-top Response Guy! Giving "Over-the-Top Responses" since 1970.
LOL!
Really, I can't think of anything better to respond to this nonsense, other than... LOL!!!
dont throught the extracted dashes i might need i have few SSN without ones plzzzzzzzzzz
Prior Art is ANY implementation of a patentable idea that predates the patent application, NOT just the first one. So chill, he's not claiming to have invented the thing, just that they may have code that did this before IBM claims to have invented it.
validator.w3.org
The W3C HTML validator.