New Online Dictionaries Automate Away the Linguistic Middleman
An article in The New York Times highlights two growing collections of words online that effectively bypass the traditional dictionary publishing system of slow aggregation and curation. Wordnik is a private venture that has already raised more than $12 million in capital, while the Corpus of Contemporary American English is a project started by Brigham Young professor Mark Davies. These sources differ from both conventional dictionary publishers and crowd-sourced efforts like the excellent Wiktionary in their emphasis on avoiding human intervention rather than fostering it. Says Wordnik founder Erin McKean in the linked article, 'Language changes every day, and the lexicographer should get out of the way. ... You can type in anything, and we'll show you what data we have.'
"You can type in anything and we'll show you the data we have" sounds a lot like Google search.
Wordnik doesn't detect that telivision is an incorrect spelling because there are so many authoritative examples of that spelling: http://www.wordnik.com/words/telivision
Google seems to do a good job of detecting spelling errors and automatically updating its dictionary, and of course it also shows you websites where that word is used. I don't really see what Wordnik provides.
Oh, that's purely typographical. When moving blocks of metal type around, a full stop/period or comma is more delicate than a quotation mark, since it's only x-height and not capital-letter height. Typographers got in the habit of putting them on the inside to keep them safe. That's also why certain ligatures of f and the long s were preserved from scribal writing: those letters were designed to hook over others, and if the next letter was tall then it would create a structural instability (an x-height hole). If modern punctuation had evolved before the invention of movable type, we would probably put the quotation mark directly above the other punctuation mark, and use logical punctuation for ? and !. However, it didn't, so it was all put inside to stay consistent.
To be honest, I find it visually more pleasant. After looking at code that passes strings around as arguments in C-style imperative languages all day, it's nice to see something without a big gap on the baseline (this "is," an "example", for you.) Since the quotation mark is already floating up and away from the letters, it's less jarring to see it separated from the word than a comma or period. (This is more or less the modern aesthetic justification for keeping it the traditional way. However, modern typographers don't always agree with traditionalists: watch what happens when you point out that the "single" space used to separate sentences prior to the invention of the typewriter was actually larger than a standard double space.)