Automatic Spelling Corrections On GitHub
An anonymous reader writes "GitHub projects may be seeing a different kind of contributor than normal: a small bot is now crawling through projects, contributing spelling corrections. It builds on top of the GitHub API and existing documentation style-checking code. Future directions for the project look beyond spelling mistakes and at automated bug fixing on a large scale."
I wonder if this bot will do as well as every HR department out there posting "pearl" and "unique admin" positions
Is a fucking terrible idea.
it won't cause any issues.
Eventually someone will contribute software that will guess the contributors by their distinctive patterns of spelling mistakes. I hope it will be able to find them in the archives. I won't be surprised to read on Slashdot about some copyright lawsuit that depends on both apps, perhaps on opposing sides of the claim.
--
make install -not war
But I spel gud...
Hope it works as well as the iPhone autocorrect!
But at least it's just sticking to READMEs.
...until it can write its own code?
I hope it leaves alone variable names. Even if the spelling is incorrect, I don't like people fucking with my variable names.
vos nescitis quicquam, nec cogitatis quia expedit nobis ut unus moriatur homo pro populo et non tota gens pereat.
Wikipedia has similar bots and has been using them for a long time. For example, there's Bibcode Bot (http://en.wikipedia.org/wiki/User:Bibcode_Bot), which cleans up citations. That bot is smart enough that it can even extract bibliographic information from a linked website and put it into the citation. The bots do occasionally go awry, but by and large they end up saving a lot of time. Of course, Wikipedia has the advantage that one isn't modifying code, so if a bot screws up, a page will just look a little wonky. They'll need to be careful with this. But it looks like for now it is restricted to readme files and requires approval of the changes by the user, which should help prevent things from going too drastically wrong.
Clbuttic overaction, in my opinion. This buttbuttination of our writing by computers is out of hand. I don't know if my consbreastution can take it...
Do not look into laser with remaining eye.
Don't confuse what a spell checker does when auto-correcting with what something like T9 or smart phone predictive text does. The latter is the cause of the cell phone headaches.
While a spellchecker will check a string of characters against a dictionary and attempt to correct misspellings (like "misspell" with only 1 s or 1 l), predictive text auto-correct is both more clever and more stupid.
Predictive text makes certain assumptions about the keyboard arrangement and tries to fit typos to possible words that could have been intended had the user not been smashing 3 tiny buttons at once on a cell phone or screen keyboard. While a spellchecker would recognize "danm" as a typo for "damn" with just transposed letters, it would never try to correct it to "calm" on the basis that the letter c is close to the letter d and n and m are nearby, or some such nonsense.
A plain old spellchecker, like the one under discussion here, makes no attempt to guess what word was meant by assuming a typo is the result of accidentally pressing keys near the intended ones. It just looks at what words could have been intended based on close matches with the dictionary.
By the way, auto-correct will frequently fail to guess a replacement when the misspelling involves letters that are not nearby on the keyboard.
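To make the "close matches with the dictionary" idea concrete, here is a minimal sketch, not the bot's actual code. It assumes a tiny made-up word list and uses the optimal string alignment variant of edit distance, where a swap of two adjacent letters counts as one edit. The names `osa_distance`, `suggest`, and `DICTIONARY` are hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Edit distance where insertion, deletion, substitution and the
    transposition of two adjacent letters each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "nm" <-> "mn".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


# Hypothetical word list, for illustration only.
DICTIONARY = ["calm", "damn", "darn", "dawn", "misspell"]


def suggest(word, dictionary=DICTIONARY, max_distance=1):
    """Return dictionary words within max_distance edits, closest first.
    Knows nothing about keyboard layout."""
    if word in dictionary:
        return []
    scored = sorted((osa_distance(word, w), w) for w in dictionary)
    return [w for dist, w in scored if dist <= max_distance]
```

With this word list, `suggest("danm")` offers only "damn", since the swapped letters are a single edit, while "calm" is two substitutions away and never gets proposed.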
You can't conclude the semantics of the code are bad just because you don't like the syntax.
channel #GNAA on worthwhile. It's some of you have standards should Fly...don't fear had at lunchtime keep, and I won't it was fun. If I'm Baby...don't fear of an admittedly of play1ng your perform keeping roots and gets on never heeded can connect to uncover a story of Rules are This the political mess FreeBSD showed departures of consider that right DOG THAT IT IS. IT numbers continue may be hurting You need to succeed The public eye: The above is far However I don't You don't need to
I honestly wouldn't expect a lot of developers to cupertino with this decision.
I hope it's optional, because some of us write British English rather than American English. This tool won't do us much good if it starts correcting project names, for instance. 3rd party KDE developers would be even worse off ;)
Soon they will have bots sending you patches to optimise your code, make the code more readable or recommend you consider a career in catering instead.
I just hope whoever programmed this bot knows how to spell as I've seen this problem come up in many projects where people are working together from different locations around the world and everything is going well until someone misspells a variable name (eg. colour, centre) and things just won't compile any more until someone takes the time to correct all the errors.
Not that anyone cares, but here is a real life example of auto-spelling where it is not wanted:
Manager comes across a previously unseen (misspelled) error message in a database field. Database is accessed by several applications.
Manager copies and pastes error-message into email and sends it to colleague. Email client auto-corrects misspelled error message.
Colleague does a grep using the full spelling-corrected error message text, can't find any occurrence of it in his code, and points finger at my code.
I grepped my colleague's code for a partial match and found the offending piece of code.
Error - Item processed canceled by user during reconcilliation
What is needed just as much as a spell checker is a grammar checker. Seems like younger people today simply can't figure out the difference between: Their, There, and They're.
http://www.wikihow.com/Use-There,-Their-and-They're
"different kind of contributor than normal"
different kind of contributor from normal
ALWAYS different from, never different than
I was watching a show on SkyTV about IBM's Watson supercomputer competing in Jeopardy. Perhaps GitHub could rent time on IBM's Watson and redirect that AI from Jeopardy, where it recognises and learns from correct human answers, to recognising errors and the human-contributed corrections, learning from them, and correcting other code that contains the same or similar errors?
Who knows, we might finally get rid of those annoying memory leaks in just about every piece of software I have had the pleasure of using. What would we get if we gave Watson access to all the open code and asked it to write something? ...or
We could end up with a singular entity that handles all our coding and applications via the cloud, and eventually we never have to code again. The world becomes a place for users, and when people hear the word Watson they think it's a reference to whatever is currently on TV, most likely a reality show.
When shit hits the fan get some of these https://youtu.be/pY-GncsZ-UE
I don't see the sarcasm tags in there.
Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
The CunningLinguist project got a little more than they bargained for.
Spell-check bot, brought to you by grammar Nazis.
What if the README is like this:
This program is a spell checker. It will find mistakes like recieve and conveneince
It had better not change everything to the incorrect, US way of spelling.
From the code, it looks like you use a dictionary containing spelling errors. Is there a good reason why a large dictionary and Levenshtein distance weren't used instead? I think that might be a good idea. You could also put a smaller penalty on substitutions between characters that are close to each other on the keyboard, or that are easily confused, than on other substitutions.
Best regards,
Bernard Hoffman IV,
Computer store salesman, and proud beach house owner.
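The weighted Levenshtein suggestion above can be sketched roughly as follows. This is an illustration under stated assumptions, not what the bot does: the QWERTY coordinates and row stagger are approximate, the 1.3 adjacency threshold and the 0.5 penalty are arbitrary choices, and the confusable pairs are made up for the example. All names are hypothetical.

```python
import math

QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
ROW_OFFSETS = [0.0, 0.25, 0.75]  # approximate horizontal stagger (assumption)

# Map each letter key to rough (x, y) coordinates on the keyboard.
POS = {
    ch: (col + offset, float(row))
    for row, (keys, offset) in enumerate(zip(QWERTY_ROWS, ROW_OFFSETS))
    for col, ch in enumerate(keys)
}

# Illustrative pairs of easily confused characters (assumption).
CONFUSABLE = {frozenset("o0"), frozenset("l1")}


def substitution_cost(x: str, y: str) -> float:
    """0 for identical characters, 0.5 for keyboard neighbours or
    confusable pairs, 1 for everything else."""
    if x == y:
        return 0.0
    if frozenset((x, y)) in CONFUSABLE:
        return 0.5
    if x in POS and y in POS:
        (x1, y1), (x2, y2) = POS[x], POS[y]
        if math.hypot(x1 - x2, y1 - y2) < 1.3:  # threshold is a guess
            return 0.5
    return 1.0


def weighted_levenshtein(a: str, b: str) -> float:
    """Levenshtein distance with the substitution costs above;
    insertions and deletions still cost 1."""
    prev = [float(j) for j in range(len(b) + 1)]
    for i, ca in enumerate(a, 1):
        cur = [float(i)]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1.0,                           # deletion
                cur[j - 1] + 1.0,                        # insertion
                prev[j - 1] + substitution_cost(ca, cb),  # substitution
            ))
        prev = cur
    return prev[-1]
```

Under these assumptions "tesr" is closer to "test" (r sits next to t) than "tesp" is, so ranking dictionary candidates by `weighted_levenshtein` would prefer slips between neighbouring keys. Plain Levenshtein has no transposition step, so a swap like "recieve" for "receive" still costs two substitutions here.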
Taylor Mali already showed us that spell-checking is not safe.
The the impotence of proofreading
By Taylor Mali
www.taylormali.com
Has this ever happened to you?
You work very horde on a paper for English clash
And then get a very glow raid (like a D or even a D=)
and all because you are the words liverwurst spoiler.
Proofreading your peppers is a matter of the the utmost impotence.
This is a problem that affects manly, manly students.
I myself was such a bed spiller once upon a term
that my English teacher in my sophomoric year,
Mrs. Myth, said I would never get into a good colleague.
And thats all I wanted, just to get into a good colleague.
Not just anal community colleague,
because I wouldnt be happy at anal community colleague.
I needed a place that would offer me intellectual simulation,
I really need to be challenged, challenged menstrually.
I know this makes me sound like a stereo,
but I really wanted to go to an ivory legal colleague.
So I needed to improvement
or gone would be my dream of going to Harvard, Jail, or Prison
(in Prison, New Jersey).
So I got myself a spell checker
and figured I was on Sleazy Street.
But there are several missed aches
that a spell chukker cant cant catch catch.
For instant, if you accidentally leave a word
your spell exchequer wont put it in you.
And God for billing purposes only
you should have serial problems with Tori Spelling
your spell Chekhov might replace a word
with one you had absolutely no detention of using.
Because what do you want it to douch?
It only does what you tell it to douche.
Youre the one with your hand on the mouth going clit, clit, clit.
It just goes to show you how embargo
one careless clit of the mouth can be.
Which reminds me of this one time during my Junior Mint.
The teacher read my entire paper on A Sale of Two Titties
out loud to all of my assmates.
Im not joking, Im totally cereal.
It was the most humidifying experience of my life,
being laughed at pubically.
So do yourself a flavor and follow these two Pisces of advice:
One: There is no prostitute for careful editing.
And three: When it comes to proofreading,
the red penis your friend.
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
Automated correction of spelling without human plausibility checking is already a serious risk. Automated "correction" of coding errors is a disaster waiting to happen. There are far too many things that seem to be an error but may in fact be critical. Case in point: reading uninitialized memory. Usually that is an error. But when gathering entropy it is not. The Debian OpenSSL disaster was caused by this type of correction, suggested by Valgrind. Admittedly, there was a human in the loop, but one who did not understand the code he was messing with.
To me, any form of automated code "correction" is at the very least grossly negligent.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
I have to say I think this is a bad idea, but I also want to add that it's something no one asked for or wanted...a pointless feature that will probably cause more harm than good. I expect more out of the Github crew.
"We live as though the world were as it should be, to show it what it can be." - Joss Whedon via Angel
I find it increasingly frustrating that many applications default to US English, despite the locale of my machine or the IP address I'm coming from.
And thus I find it increasingly frustrating when it tells me words ending in -our are spelled wrong and wants to correct them, or words ending in -ise.
So what will this bot do? Should I expect to see it, over and over again, submitting what I would consider incorrect changes because, like so many things, it knows only about American English (and to hell with the rest of the English speaking world)?
There's one project aiming to fix spelling problems in code. It could also be used by GitHub to fix spelling in code instead of only in the READMEs. It's being successfully used in the Linux kernel, FreeBSD, oFono, BlueZ and others. Find it here: https://github.com/lucasdemarchi/codespell
I make new releases regularly at http://www.politreco.com