Slashdot Mirror


A Spell-Checker for Scientific Terms?

deaflamb wonders: "I'm a biology major and have been writing a paper for a class. I'm using Microsoft Word on my mac. It's annoying me how often I have to click 'ignore' or 'add' on the spell checker when it comes across words only used in Science. Was wondering if there where any free scientific spell checkers out there that can be added into Word or OpenOffice (since I use that too), and how well they work?" It didn't take me long to find these guys, who look like they cover a significant portion of the terms used in the medical and science world, however, their price for a single user license for only one of their specialty packs can run into the hundreds of dollars. Might there be other options that are a bit more affordable or, as the deaflamb asks, free?

48 comments

  1. Not as far as I know by poopdeville · · Score: 3, Interesting

    But this is a great idea for a collaborative, freely available project. I'm a mathematician by trade and have run into a similar problem. Want to work together?

    --
    After all, I am strangely colored.
    1. Re:Not as far as I know by Steinfiend · · Score: 1

      I'd definitely be interested in helping out with this. Whilst not in a scientific field as such, the financial/IT field I'm in has a lot of words not usually found in a concise dictionary. What kind of format and what kind of distribution method would be best?

    2. Re:Not as far as I know by Intron · · Score: 2, Interesting

      Wouldn't the simple way to do this be to take the indexes of a few publications in a given field and process them into a list? You aren't looking for definitions or anything, just correctly spelled words.

      --
      Intron: the portion of DNA which expresses nothing useful.
    3. Re:Not as far as I know by poopdeville · · Score: 1

      You and anyone else who is interested should send me an e-mail. As it stands, I'm thinking that if a database of words is compiled and categorized, we can generate several different dictionary formats easily. I'm busy working on another project right now, but I'll start writing code for this in a few days.

      --
      After all, I am strangely colored.
  2. Try this: by Gothic_Walrus · · Score: 4, Informative
    This page has a few dictionaries up for free. I don't know if they've got quite what you're looking for, but it's worth a shot.

    Beyond that...the textbook is always a good choice. Type it, check it a few times, and then add it to the dictionary. :)

    --
    Goo goo g'joob.
    1. Re:Try this: by TERdON · · Score: 1

      Scanning it and using OCR to read the words seems like a more sensible thing to do, then post-processing it by sorting out only unique words. Of course, a backside with this solution is that there WILL be some manual tidying-up to do...

      --
      I have a really elegant proof for Fermat's last theorem. If this sig was only a bit longer...
    2. Re:Try this: by pomo+monster · · Score: 3, Funny

      "Downside." Trust me. You mean "downside."

    3. Re:Try this: by TERdON · · Score: 1

      I'm writing in Swenglish, you insensitive clod! ;-) (no hard feelings, just had to explain the error...)

      --
      I have a really elegant proof for Fermat's last theorem. If this sig was only a bit longer...
    4. Re:Try this: by deaflamb · · Score: 1

      Some of those will actually work very well for the Latin names involved with organisms. I tried rechecking with one of the dictionaries that related to my paper, but lots of the science-specific terms that are not Latin names still came up as errors. Thanks though. These will help a lot. Ray

      --
      I love the earth and the sky.
    5. Re:Try this: by pomo+monster · · Score: 1

      It was a good error--"backside" means "arse." :-)

      Now lemme see if I can butcher this phrase I learnt: "Du ar så amskralig att jag kissar i mina kalsonger nar jag ser det."

    6. Re:Try this: by TERdON · · Score: 1

      "You are so (amskralig) that i pee in my underwear (nar) I see it".

      The dots that sometimes appear above "a" are important in Swedish! (To be really exact, they aren't dots above an a: ä and a are different letters in Swedish. Mixing them up is like mixing up u and o in English, or something...)

      är = are (to be)
      ar = are (100 m^2)

      nar = not a word in Swedish
      när = when

      amskralig = not a word in Swedish, you possibly meant "anskrämlig", but that isn't really a common word...

      --
      I have a really elegant proof for Fermat's last theorem. If this sig was only a bit longer...
    7. Re:Try this: by Anonymous Coward · · Score: 0

      so can you use the method of following the letter with an 'e' to show umlauts?

      ex. ueber, aelter, for the German over/above and older.

  3. Same boat... by xiao_haozi · · Score: 3, Informative

    I am finishing up a BS in Biochemistry and Molecular Biology and have battled this throughout my collegiate years as well. I have searched long and far for a solution and have thus far not found anything. I have come across a few medical versions but even they tend to be for the lay person. One solution (somewhat) has been http://wikipedia.com/. I know this is not a dictionary but works if you need to double check a spelling, but mainly I have found it useful while writing scientific pieces to double check a few pathways or cell types. While it's not comprehensive by any means, it is coming along at a surprisingly great rate. On a quick note... The Cell (which can be found at the NCBI website) is a good book reference for such purposes, as is Voet and Voet's Biochemistry.

  4. non standard phonetics by bluelip · · Score: 2, Interesting

    The last time I played w/ spellcheckers, the 'soundex' function was tops. It basically mapped phonetic sounds to values and summed them.

    The problem w/ scientific terms is that the rules and patterns that compose a soundex value don't hold up to complex words.

    The same approach may possibly be taken, but the patterns and values will need to be refined/redefined.

    --

    Yep, I never spell check.
    More incorrect spellings can be found he
    1. Re:non standard phonetics by bluelip · · Score: 1

      wikipedia has a better description of the soundex algorithm here:

      http://en.wikipedia.org/wiki/Soundex

      Soundex is a phonetic algorithm for indexing names by their sound when pronounced in English. The basic aim is for names with the same pronunciation to be encoded to the same string so that matching can occur despite minor differences in spelling. Soundex is the most widely known of all phonetic algorithms and is often used (incorrectly) as a synonym for "phonetic algorithm".

      --

      Yep, I never spell check.
      More incorrect spellings can be found he
    2. Re:non standard phonetics by GigsVT · · Score: 1

      Soundex really sucks. Don't use it. The whole "first letter stuck on the front" negates much of its value. Korn will never match Corn.

      Metaphone is better.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
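For readers who want to see the "first letter stuck on the front" behaviour concretely, here is a minimal sketch of the classic American Soundex rules as described in the Wikipedia article linked above. The function name `soundex` and the choice to ignore non-ASCII letters are assumptions made for this illustration; it is not taken from any particular spell checker.

```python
# Digit classes used by American Soundex.
_GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
_CODES = {ch: digit for digit, letters in _GROUPS.items() for ch in letters}


def soundex(word):
    """Return the 4-character Soundex code for word (empty string if no letters)."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()          # the first letter is kept verbatim
    prev = _CODES.get(letters[0], "")    # its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "hw":
            continue                     # h and w do not separate equal codes
        code = _CODES.get(ch, "")        # vowels and y map to ""
        if code and code != prev:
            result += code
        prev = code                      # a vowel resets prev, so repeats count again
    return (result + "000")[:4]          # pad or truncate to letter + 3 digits


print(soundex("Korn"), soundex("Corn"))  # the codes differ only in the first letter
```

Because the initial letter is copied rather than encoded, two spellings that sound alike but start with different letters can never collide, which is the weakness the comment above points out.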
  5. Simple solution? by sithkhan · · Score: 1, Interesting

    Why not go to Wikipedia or MIT.edu or some other website that has text on the subject matter you are studying and simply copy and paste, and then add to dictionary? Or look on Gutenberg.org and copy and paste those. It worked for me when I was writing a paper on The Odyssey and had to use those uncommon Greek spellings. YMMV.

    --

    is it that bad seein a hot chick again? if i see a hot chick walkin down the hall i dont say "repost"
  6. Wiki-style Open Dictionaries? by grainfed · · Score: 1

    This sounds like a good case for a site with user contributions added to genre specific dictionaries - eg: Biology, medical, IT - pretty much any field with a very specific syntax could benefit...

    --
    ~/words_by_grainfed.txt
  7. Just suck it up... by Otter · · Score: 4, Insightful
    Realistically, you only use a tiny subset of scientific vocabulary. OK, as an undergrad you face more breadth than a researcher does, but still.

    Just suck it up. Add words to your dictionary as you go, and within a month you'll rarely see those squiggly red lines. For some reason, people are too intimidated to just start into it.

  8. Abiword Plugins, including Wikipedia by Noksagt · · Score: 2, Informative
    One solution (somewhat) has been http://wikipedia.com/. I know this is not a dictionary but works if you need to double check a spelling, but mainly I have found it useful while writing scientific pieces to double check a few pathways or cell types.
    AbiWord is a capable F/OSS word processor which is available for most platforms. It is lighter than OO.o Writer (though that also means it lacks SOME of OO.o's features). One of the really nice features is that it supports a number of plugins. There are plugins which allow you to search a selected word on Wikipedia, Google, and dict.org. Also, on *nix, it can use GDict.
  9. well you see.. by Anonymous Coward · · Score: 1, Funny

    Those words are irreducibly complex, clearly indicating that a conscious mind, or intelligent designer created them (yes, a human "actually" created them, but since this human was created by an intelligent designer, it comes out the same). By using a spell checker, you are indicating that, somehow, YOU are able to ascertain the correct spelling, when only the intelligent designer could possibly know this (see my study of pirate population vs. climate warming for more proof). Therefore, simply asking about the existence of such a spell checker is blasphemy. I mean, er, unscientific.

    As a student of biology, weren't you taught the basics of ID? It's amazing what this country has come to. You should get on your knees and ask the intelligent designer for forgiveness in questioning His creation. Otherwise, you're going straight to purgatory. I mean, uh, straight to purge your mind of these thoughts. Or something.

  10. Pubmed by virology-not+for+com · · Score: 4, Insightful

    I had this idea a while ago. Probably every science major does. Anyways, dictionary files are simple, just the word followed by an endline. So all you need is a good database. A good one for biology is PubMed (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&itool=toolbar). They offer a way to download all their abstracts. Most are spell checked, and taken together they should include most every biological word. So, download their abstracts (I think they are in XML), parse them (delete duplicates and words in a normal dictionary... maybe words with numbers) and put them into a txt file, one word per line. Done.
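A rough sketch of that parse-and-filter step in Python follows. It assumes the downloaded abstracts are XML with the abstract text inside `AbstractText` elements; that tag name, the function names, and the `common_words` argument (a list of ordinary dictionary words to exclude) are assumptions for illustration, so adjust them to whatever the real download looks like.

```python
import re
import xml.etree.ElementTree as ET

# A word starts with a letter and may contain letters, digits, hyphens, apostrophes.
WORD_RE = re.compile(r"[A-Za-z][A-Za-z0-9'-]*")


def extract_terms(xml_text, common_words, tag="AbstractText"):
    """Collect unique specialist words from every <tag> element in xml_text."""
    common = {w.lower() for w in common_words}
    terms = set()                                   # a set removes duplicates for free
    for node in ET.fromstring(xml_text).iter(tag):
        text = "".join(node.itertext())
        for token in WORD_RE.findall(text):
            word = token.strip("'-").lower()
            if len(word) < 2:
                continue                            # skip single letters
            if any(ch.isdigit() for ch in word):
                continue                            # skip "words with numbers"
            if word in common:
                continue                            # skip words in a normal dictionary
            terms.add(word)
    return sorted(terms)


def write_wordlist(terms, path):
    """Write one word per line, the plain-text format described above."""
    with open(path, "w", encoding="utf-8") as handle:
        for term in terms:
            handle.write(term + "\n")
```

Note that lowercasing everything loses the capitalisation of proper nouns and genus names; a real list would probably want to keep the original case for those.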

    1. Re:Pubmed by virology-not+for+com · · Score: 1

      This guy has made a dictionary for vet school that contains over 30,000 words. I had a quick look through, and it had most of the words I searched for. It's available as a zip file, and designed for MS Word. http://www.isdn.net/scottsmith/interests/medicalschool/Technology/MSWordDictionary/

  11. Google Scholar by phyy-nx · · Score: 2, Informative

    I know this isn't an automated solution, but whenever I need to know how to spell a scientific word, I use Google Scholar. I take a guess at the spelling, search for it, and Google will often prompt me with the correct spelling. If I get thousands of hits but all happen to be wrong then hey, at least I'm spelling the word the same way thousands of others have :)

  12. Backup your dictionary!!! by Lifix · · Score: 4, Insightful

    I can't tell you how few people remember to back up their dictionary when they back up a computer. I worked as an intern in the IT department at my school, and tech-savvy faculty members would regularly lose 3+ years of work on a custom dictionary because they failed to back it up. I suggest you just struggle through, and add all the words to the dictionary, but remember to keep a copy somewhere.

    The ability to customize your dictionary is something that most people in the tech world don't talk about much, but I use it every day of my life: by day I work at a computer retail store, but at night and on the weekends I'm a writer and a custom dictionary keeps me from screwing up proper nouns that I am using.

    --
    In nature, there are neither rewards or punishments, there are only consequences.
  13. Nisus by Spock+the+Baptist · · Score: 1

    At one time you could get specialty dictionaries for Nisus from the Nisus company. I wrote my thesis with Nisus, and then imported the text without formatting into Word 5 for the Mac. Had to do that as I submitted my thesis via e-mail attachment to my thesis director who had a M$ Windows box.

    Apple are you listening? The spell-checking feature built into Mac OS X should have a file or something of that nature that could be imported into the system dictionary file/folder.

    Indeed, an interesting project for us scientific Mac types would be an open source project that would be a simple 'vocabulary' file, imported into the dictionary file, of scientific terms from the various branches of the natural sciences and mathematics, as well as for engineering and computer science.

    Perhaps someone knows of such a file.

    --
    "Oh drat these computers, they're so naughty and so complex, I could pinch them." --Marvin the Martian
  14. site-wide dictionary? by Jump · · Score: 1

    Why not maintain a site-wide dictionary? Of course, if you want to save your
    own time only, you can only buy one. Actually, this is an area where users
    may help openoffice to become more successful. Create and maintain a scientific
    terms spelling database. Make it an optional extension for the spell checker.

  15. build one from existing texts by Fëanáro · · Score: 1

    you could just take some scientific texts from your profession that are available electronically and that are already proofread.

    tokenize them, sort the words, filter out duplicates (maybe only keep words that appear more than once), and voila, you have a spelling dictionary.

    Not sure about copyright but I think it should not be a problem for this use.
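The tokenize, count, and filter steps above can be sketched in a few lines of Python. The function name `build_dictionary`, the `min_count` parameter, and the word pattern are choices made for this sketch only.

```python
import re
from collections import Counter

# Runs of letters, optionally joined by single hyphens or apostrophes.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:[-'][A-Za-z]+)*")


def build_dictionary(texts, min_count=2):
    """Return sorted words that occur at least min_count times across texts."""
    counts = Counter()
    for text in texts:
        counts.update(token.lower() for token in TOKEN_RE.findall(text))
    # Requiring more than one occurrence drops most one-off typos.
    return sorted(word for word, n in counts.items() if n >= min_count)
```

With `min_count=1` this reduces to plain "sort and remove duplicates"; raising the threshold trades coverage of rare terms for fewer misspellings slipping into the list.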

    1. Re:build one from existing texts by GiMP · · Score: 1

      I was thinking the same thing. Perhaps grab some books from Project Gutenberg and just feed them to the following:

      xargs -n1 echo dictionary.txt

    2. Re:build one from existing texts by GiMP · · Score: 2, Insightful

      Slashdot hid what looked like HTML, let's try that again:

      xargs -n1 echo < TheBook.txt | sort | uniq > dictionary.txt

    3. Re:build one from existing texts by cperciva · · Score: 1

      sort | uniq

      Useless use of | detected. Did you mean: sort -u ?

    4. Re:build one from existing texts by Anonymous Coward · · Score: 0

      Useless "use of" detected. Not only is it a complete waste of 2 words, but it actually makes the sentence harder to read.

    5. Re:build one from existing texts by GiMP · · Score: 1

      For that matter, a useless use of xargs. Here is a version with only one command not a bash built-in (sort):

      for x in $(< YourFile); do echo "$x"; done | sort -u

  16. Medical Dictionary by sgent · · Score: 2, Informative

    I've used medical dictionaries in the past. Stedman's is probably the most well known, and they make their dictionaries available in digital form for import into Word, etc. They also have legal and some other terms. http://www.stedmans.com/category.cfm/210

  17. Solution for Mathematicians by students · · Score: 1
    Mathematicians and mathematical scientists (is there some other kind of scientist?) use LaTeX, which has a spell checker for technical terms. If you type your LaTeX wrong, you get this:
    error: ! Undefined control sequence

    Then you know to fix your spelling, and the line and column number.
    You can also run LaTeX through a normal spell checker, and it will ignore the technical terms. aspell does this automatically, but ispell needs the -t option. LaTeX is a pain to learn, but it is worth it.
    Unfortunately, this "fix" doesn't work for my history homework, which has far more strange words.
    1. Re:Solution for Mathematicians by twistedcubic · · Score: 1

      I think the OP was asking for a dictionary containing technical terms. I have to keep my own separate dictionary for math terms for use with aspell, but it would be nice if there were a community-maintained one.
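As a starting point for sharing such a list, here is a small sketch that renders a set of terms in the plain-text layout aspell uses for personal word lists: a header line of the form `personal_ws-1.1 <lang> <count>` followed by one word per line. The function name and the example terms are made up for illustration; check the header against your own aspell version before relying on it.

```python
def aspell_personal_wordlist(words, lang="en"):
    """Return the text of an aspell-style personal word list for words."""
    unique = sorted({w.strip() for w in words if w.strip()})  # dedupe, drop blanks
    header = f"personal_ws-1.1 {lang} {len(unique)}"
    return "\n".join([header] + unique) + "\n"


print(aspell_personal_wordlist(["homeomorphism", "functor", "functor"]), end="")
```

A community-maintained list could then just be a shared text file of terms, with each user regenerating their own personal word list from it.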

  18. While We're At It... by hzs202 · · Score: 1

    A Spell-Checker for Scientific Terms?

    Hey why stop there? It would be great if someone developed a Spell-Checker for Hip-Hop terminology. Because I have been having so much trouble when I write my rap songs. I can't figure out where to put the "izzles" and "eezies".

    1. Re:While We're At It... by Anonymous Coward · · Score: 0

      I can't figure out where to put the "izzles" and "eezies".

      It's very simple. The man puts his izzle in the woman's eezie.

      Surely you learned about the birzzles and the beezies, no?

    2. Re:While We're At It... by lilmouse · · Score: 1
  19. Why not check Version Tracker by chivo243 · · Score: 1
    --
    Sig Hansen?
  20. Plants' Latin names by skinfaxi · · Score: 2, Informative
    I've been working on a web project about vegetable seed identification and did some searching for a very similar thing (a dictionary I could import into MS Word that had the Latin names of plants). I didn't find anything appropriate - it's particularly sticky because some folks use different names to mean the same plant, and some use the same name to refer to different plants! As in many scientific fields, there are efforts to standardize the nomenclature so everyone is talking about the same thing using the same terms.

    This page has an interesting list (of plant names) http://www.bgbm.org/IAPT/Nomenclature/Code/SaintLouis/0118IndexScfNames.htm but isn't great for building a dictionary because it includes ways -not- to spell words!

    Standardized Nomenclature of Medicine, Clinical Terms (SNOMED-CT) is a medical group struggling with the same issues. They don't seem to provide any kind of dictionaries, either. http://www.snomed.org/about/index.html

  21. "Was wondering if there where any" by Anonymous Coward · · Score: 0

    And it's "were" not "where." You need a better brain as well as the dictionary.

  22. use amalgamated OOo dictionaries by pbhj · · Score: 3, Interesting

    1, How about you have a website where people upload their private dictionaries (with a language option too??). This shouldn't be too hard using the OOo files as they are presumably XML, might be quite server intensive though. Strip the words and tabulate them (add them to a (pgsql?) database) - you can throw any away that you have enough of or that have already been rejected as misspellings (sp?!?).

    If a word appears with the same spelling in 100 (or the downloader's preference) dictionaries then it is tagged for inclusion in the master dictionary.

    Uploaders could specify the general area they write in as well as the language, eg Physical Sciences, Literature, Agriculture, ... so a dictionary request could be limited by subject field too.

    Require a dictionary upload _OR_ payment of a fee to avoid freeloaders.

    2, ...
    3, Profit ??!

    [PS: I just looked and OOo uses .aff or .dic formats.]
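The voting rule in step 1 is easy to prototype before worrying about file formats or a database. The sketch below assumes each uploaded dictionary has already been reduced to a list of words; the names `merge_dictionaries`, `min_users`, and `rejected` are hypothetical.

```python
from collections import Counter


def merge_dictionaries(user_dictionaries, min_users=100, rejected=()):
    """Return words that appear in at least min_users uploaded dictionaries."""
    rejected = set(rejected)
    seen_in = Counter()
    for words in user_dictionaries:
        seen_in.update(set(words))      # each upload votes for a word at most once
    return sorted(
        word for word, votes in seen_in.items()
        if votes >= min_users and word not in rejected
    )
```

Counting uploads rather than raw occurrences matters here: one user who added the same misspelling many times should not be able to vote it into the master list alone.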

  23. exact opposite by Anonymous Coward · · Score: 0

    yes lets make a proprietary extension that is universally useful available ONLY in OO. Then people MUST use the *Best* software. Yes and while we're at it.. lets interchange source code with a " Open Source Closed Format" so that no one can steal free software unless they themselves have released their bodies to the public under the terms of GPL v4.0.34(b)rc5.

  24. Try ispell's large dictionary by one-egg · · Score: 1

    The "large" ispell dictionary has a lot of biological and medical terms. It might serve as a good starting point.

  25. Dorland's Medical Dictionary by neMoSum · · Score: 1

    I work part-time as a freelance copyeditor for medical and scientific publishing. I use Dorland's Medical Dictionary as my reference for medical/scientific terminology. Of course the dictionary is geared towards medical use, so if the scientific terms you need are, for example, from botany or astrophysics, this probably would not be of use. Dorland's does, however, come with a CD-ROM which includes a dictionary/spellchecker for use with MS-WORD, so I find it immeasurably valuable.

  26. 30,000 word medical dictionary by virology-not+for+com · · Score: 1