Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?
New submitter mni12 writes "I have been working on a Bayesian Morse decoder for a while. My goal is to have a CW decoder that adapts well to different ham radio operators' rhythm, sudden speed changes, signal fluctuations, interference, and noise, and that decodes Morse code accurately. While this problem is not as complex as speaker-independent speech recognition, there is still a lot of human variation where machine learning algorithms such as Bayesian probabilistic methods can help. I posted a first alpha release yesterday, and despite all the bugs one brave ham has already reported success. I would like to collect thousands of audio samples (WAV files) of real-world CW traffic captured by hams via some sort of online system that would allow hams not only to upload captured files but also to provide relevant details such as their callsign, date & time, frequency, radio / antenna used, software version, comments, etc. I would then use these audio files to build a test library for automated tests to improve the Bayesian decoder's performance. Since my focus is on improving the decoder and not on building a digital audio archive service, I would like to get suggestions for any open source (free) software packages, online services, or any other ideas on how to effectively collect a large number of audio files without putting much burden on alpha / beta testers to submit their audio captures. Many available services require registration and don't support metadata or aggregation of submissions. Thanks in advance for your suggestions."
They like collecting stuff
I'd recommend using e-mail. It's open to everyone to use, and they probably already have registered one. They can provide any and all metadata in the free-form text field known as "body", and it even supports multiple file attachments!
Now NSA is targeting ham radios too?
What you need is an HF receiver and a long wire dangling out of the window as high as possible.
Or use one of the receivers with web interfaces.
You are aware that there are several projects for decoding Morse, yes?
Go to any websdr during a CW contest weekend and, well, there you go. Sure most of it is computer generated but the rest will be hand-keyed.
Picasa came to mind - this service supports audio files and, last time I looked, allows you to share stuff. Although I should add that it has been a while since I looked at this service. Compliments on your clearly written post... days of /. gone by
I wouldn't be surprised at all....
So.... I guess you've never heard of skimmer, the various remote receivers out there, and the SDR's that people are using to record large swathes of shortwave spectrum? You know people have been working on the problem for a while, as in decades? Skimmer decodes multiple streams of morse at once. Wake me when your stuff outperforms skimmer.
You know, you could hardly pick a less controversial topic than amateur radio. If you want to get everyone all wound up about your favorite boogyman, at least start off on one of the more irritable subjects we tend to yammer on about. The level of angst here is likely to be too low to channel.
Faster! Faster! Faster would be better!
I'd go about it completely differently... Either set up a cheap SDR and record the CW yourself... You should be able to do this for well under $100... Or set up a receiver (or SDR) on a specific freq, ask ham operators to send Morse code to you on that freq, and then correct any mistakes via a web interface.
Either way should be easier and more realistic than having hams send you CW recordings....
Ah, the happy memories of about 100 Morse Intercept Ops using R390s and SP600s along with Underwood manual typewriters and 6ply paper, as well as a rhombic antenna with each leg 1000 feet long, listening to some of the most poorly trained (and some of the best, too) CW operators in the (undisclosed evil empire). We didn't need no stinking Bayesian filters, just lots of coffee.
You can collect a lot of morse code traffic in the wild. Just get yourself a good HF receiver with some filtering (notch filter and a DSP). Set up a dipole as your receive antenna cut to 1/4 the wavelength of the band you will be monitoring. Here is a handy band plan to guide you to where you will be able to find morse code which is normally called CW for continuous wave communications.
I recommend this over any attempt to collect samples directly from hams. I know I do morse code differently when using the radio for casual contacts than I do making exam tapes back when I was a volunteer examiner.
Another attribute that will affect morse code transmission is the type of morse key being used. I use either a straight key which is completely manual and my dots and dashes do vary depending on fatigue, or a paddle key where one paddle makes a dot and the other key makes a dash. The dots and dashes are consistent in duration but the space between them will vary depending on fatigue. I did try using a vibroplex key. The dash will vary in duration but the dots are constant in both duration and time between each dot. Most of my friends still use them (A mutual acquaintance owns the company), but I found myself constantly having to slow down because I would let that pendulum swing speed up my keying.
Happy hunting and 73.
These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
Write an article and submit to ARRL's QST and join and post to the AMSAT mailing lists as there are quite a few keys there as well. Talk to your local amateur radio club and get the word out and you might even talk to your area coordinator.
Freesound would be best place for audio, just make a unique tag and all submissions can be found. Encode all metadata as tags and it is machine parsable too.
You replied, though. The Troll achieved success.
I wrote a similar application in the late 1980's using a backpropagation neural net, and it was difficult to complete.
Asking for volunteer submissions is the easiest and obvious answer. There is a group of commercial operators at http://www.radiomarine.org/ who might have tapes for you.
Some of the CONET project recordings feature morse, but the ones that I have heard sound mechanically generated.
Finally, you can collect for yourself. HF is a desert these days; the last time I tried, only hams were active (there has not been commercial Morse for decades). This also means the equipment is dirt cheap on eBay, so if you have any outside space at all for just a long wire antenna you can collect your own samples.
Also, you might not be aware but there are several regional variations of international morse. Cyrillic operators have 5 extra characters, the Chinese have an abbreviated numeric system, etc.
SP600s? A real old timer. Of course, we were still using R390s and mills in the 80s.....at least at a U/I little operation behind a certain snack bar. The bigger place nearby had migrated to KSRs and touch typing, and phased out the R390s in 1983 or 84...
The thesis you are basing your work on is from 1977; while no doubt current when it was written, there has been a lot of work on human signal decoding since then.
I'd strongly suggest looking at Hidden Markov Models:
http://en.wikipedia.org/wiki/Hidden_Markov_model
While some recent methods have gone beyond HMMs for speech recognition, that's been the baseline "good" solution for the past decade.
Since this is a binary signal problem another approach to consider would be Markov Random Fields (MRFs) which could be used as an initial de-noising pass or even as a full decoder if you set the cost functions right.
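To make the de-noising idea concrete, here is a minimal sketch that treats the keyed signal as a two-state HMM (key up / key down) and runs Viterbi over noisy 0/1 samples. It is a chain-structured HMM rather than a full MRF, and the function name and the two probabilities (`p_stay`, `p_correct`) are made-up illustration values, not anything taken from the submitter's decoder.

```python
import math

def viterbi_denoise(obs, p_stay=0.95, p_correct=0.8):
    """Most likely key-up (0) / key-down (1) sequence for noisy 0/1 samples.

    p_stay:    assumed probability that the key state does not change
               from one sample to the next.
    p_correct: assumed probability that a sample reports the true state.
    """
    log = math.log
    trans = [[log(p_stay), log(1 - p_stay)],
             [log(1 - p_stay), log(p_stay)]]

    def emit(state, sample):
        return log(p_correct if sample == state else 1 - p_correct)

    # Uniform prior over the two states for the first sample.
    score = [log(0.5) + emit(s, obs[0]) for s in (0, 1)]
    back = []
    for sample in obs[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            candidates = [score[prev] + trans[prev][s] for prev in (0, 1)]
            best = 0 if candidates[0] >= candidates[1] else 1
            new_score.append(candidates[best] + emit(s, sample))
            pointers.append(best)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With these particular numbers a short burst of flipped samples gets smoothed away, while a long run is kept as a real key-down or key-up period. Tuning the two probabilities moves that boundary.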
Your idea of user adaptation is pretty reasonable, but my guess is the primary thing that matters would be an overall speed scaling. IOW for good decoding you probably just need to normalize the average letter rate between users.
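A rough sketch of that speed normalization, assuming you already have a list of key-down durations: estimate the dit length and express everything in dit units. The function names are hypothetical, and the split-at-the-midpoint trick is a simplification that only works when the sample contains both dits and dahs.

```python
from statistics import median

def estimate_dit(marks):
    """Estimate the dit length as the median of the shorter group of marks.

    Assumes the list contains both dits and dahs; the midpoint between the
    shortest and longest mark is used as a crude dit/dah boundary.
    """
    split = (min(marks) + max(marks)) / 2
    short = [m for m in marks if m <= split]
    return median(short)

def normalize(marks):
    """Rescale key-down durations so that one dit is about 1.0."""
    dit = estimate_dit(marks)
    return [round(m / dit, 2) for m in marks]

# Under standard PARIS timing a dit lasts 1200 / wpm milliseconds,
# so these are the same pattern sent at 10 wpm and at 25 wpm.
slow = [120, 360, 120, 120, 360]
fast = [48, 144, 48, 48, 144]
print(normalize(slow))
print(normalize(fast))
```

After scaling, both operators produce the same sequence of roughly 1s and 3s, which is what a downstream decoder would want to see regardless of sending speed.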
Good luck.
Indeed. I'd never have seen the GGP if GP hadn't taken the bait. Let the moderation system do its job.
Posting AC for obvious reasons.
Little plug for some Windows shareware that I used: pilotmorse - pilotmorse.com. There are quite a few CW titles out there, some free as in beer, but this little guy for 20 bucks taught me the letters better than any of them. This will not teach you to listen to CW from actual hams. It's much more basic than that in some sense (it's designed for pilots). But if your goal is to know the letters, it's money well spent.
a friend pointed this out to me the other day:
https://archive.org/details/SsMarineElectricWoohSos
there are 3 kinds of people:
* those who can count
* those who can't
I would guess that you could do a standard detect, to calibrate and run a second pass on each signal to improve accuracy. Unless of course you are trying to do near real time.
Obviously you realize there are differences in how people send CW. While I applaud your drive to make a smarter decoder - the reality is that you need to make sure it works on live traffic. So in that respect, you should hook it into some kind of SDR software like HRD or even make your own that can decode multiple streams of CW. If you don't have a radio, I suggest maybe a SoftRock receiver?
1. It gives you actual live conversations with all the mistakes and alterations. Not everyone uses computer generated CW. In fact, most brass pounders dislike it because it's boring to listen to and dry to copy.
2. There are sanctioned CW events all the time... QSO parties, commemorative stations and even at the beginning of next month there are straight key nights where people put the paddles away and break out the straight key.
3. I'm going to assume the end goal is to put this listening to live feeds anyways. You should work toward that goal now as then you can write code to compensate for QRM and QRN/fading.
Having people feed you 'tapes' won't accomplish your goal. You need to have it work with the real source.
boom goes the dynamite....
This is a problem that affects all kinds of machine learning. It is always very difficult to collect enough samples to teach good recognition skills, whether it is handwriting, speech or, as in this case, Morse code. I'm wondering whether an open library that people could upload to for this kind of thing already exists; if not, it might be a good idea.
No sigs in BETA. Beta SUCKS.
I know I am being an (anonymous) asshole here, but how hard can this be? Morse code was designed to be definitively decipherable based on pauses between letters. Sure, they vary, but is that variance greater than the inter-word dot/dash silence, even given two different operators?
I mean, write a normal program that runs multiple hypotheses about inter-letter pauses for a while and accumulates the inter-letter pause average. A hypothesis is rejected when parsing using that inter-letter average results in words like: ghksdjhf .
Why go all Bayesian, just because you can? This is a good target for GOFAI.
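For what it's worth, here is a minimal sketch of that multiple-hypothesis idea. It assumes the marks have already been classified as dits and dahs, and that each one comes with the silence that followed it, measured in dit units. The function names, the tiny vocabulary, and the "word gap is 2.5x the letter gap" rule are all made up for illustration. A real version would need a proper word list and would still have to cope with noise.

```python
from string import ascii_uppercase

CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
         "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..").split()
MORSE = dict(zip(CODES, ascii_uppercase))  # pattern -> letter

def decode(events, letter_gap):
    """Decode (symbol, gap_after) pairs using one letter-gap hypothesis."""
    word_gap = 2.5 * letter_gap  # assumed ratio, between the ideal 3 and 7
    words, letters, current = [], [], ""
    for symbol, gap in events:
        current += symbol
        if gap >= letter_gap:
            letters.append(MORSE.get(current, "?"))
            current = ""
        if gap >= word_gap:
            words.append("".join(letters))
            letters = []
    if current:
        letters.append(MORSE.get(current, "?"))
    if letters:
        words.append("".join(letters))
    return " ".join(words)

def best_hypothesis(events, candidates, vocabulary):
    """Keep the letter-gap hypothesis that yields the most known words."""
    def hits(gap):
        return sum(word in vocabulary for word in decode(events, gap).split())
    best = max(candidates, key=hits)
    return best, decode(events, best)

# "CQ DE" sent with sloppy spacing (gaps in dit units).
events = [("-", 1.2), (".", 0.9), ("-", 1.1), (".", 3.4),
          ("-", 1.0), ("-", 1.3), (".", 0.8), ("-", 7.5),
          ("-", 1.1), (".", 1.0), (".", 2.8), (".", 0.0)]
print(best_hypothesis(events, [0.5, 2.0, 5.0], {"CQ", "DE"}))
```

Note that simply checking for valid Morse characters is not enough to reject a hypothesis: a threshold that is too small chops everything into E's and T's, which are all valid. That is why the sketch scores against a word list, as the parent suggests.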
isn't that something that retired people use with vacuum tubes? don't think i've ever seen Morse code except in the movie Balto 1 and 3.
oh, and ham is a cut of meat. http://en.wikipedia.org/wiki/Ham
Not trying to start an argument, but why are people using Morse code when Skype and text messaging are available? Trying to relive the days of Balto by tapping dits and dashes with your finger over long wires in Alaska? 4G internet access is available in most of the United States and Canada, right? IRC and chat rooms are still popular.
enlighten me please
Couldn't you just create a computer generator for this audio, that uses a PRNG to intersperse pauses and other variations? You could create a much wider variety of conditions to put your parser through by controlling how much variation is in the length of each beep, pauses between beeps, pauses between letters. You could create a really bungling case or create a perfect case, and anything in between. Why not just do that?
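A minimal sketch of such a generator, using only the Python standard library. The function names, the Gaussian jitter model, and the 600 Hz tone are arbitrary choices for illustration. It only handles the letters A to Z, and the hard on/off keying will produce clicks that a real transmitter would shape away.

```python
import math
import random
import struct
import wave
from string import ascii_uppercase

CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
         "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..").split()
MORSE = dict(zip(ascii_uppercase, CODES))  # letter -> pattern

def keying(text, wpm=20, jitter=0.0, rng=None):
    """Return [(key_down, seconds), ...] for text, with random timing error.

    jitter is the standard deviation of a multiplicative error applied to
    every mark and gap: 0.0 is machine-perfect, 0.3 is a sloppy fist.
    """
    rng = rng or random.Random()
    unit = 1.2 / wpm  # PARIS timing: one dit lasts 1.2 / wpm seconds

    def wobble(units):
        return units * unit * max(0.2, rng.gauss(1.0, jitter))

    timeline = []
    words = text.upper().split()
    for wi, word in enumerate(words):
        for li, letter in enumerate(word):
            code = MORSE[letter]
            for si, symbol in enumerate(code):
                timeline.append((True, wobble(1 if symbol == "." else 3)))
                if si < len(code) - 1:
                    timeline.append((False, wobble(1)))  # gap inside a letter
            if li < len(word) - 1:
                timeline.append((False, wobble(3)))      # gap between letters
        if wi < len(words) - 1:
            timeline.append((False, wobble(7)))          # gap between words
    return timeline

def write_wav(path, timeline, tone_hz=600, rate=8000):
    """Render the timeline as a 16-bit mono WAV file with a plain sine tone."""
    frames = bytearray()
    n = 0
    for key_down, seconds in timeline:
        for _ in range(int(seconds * rate)):
            value = math.sin(2 * math.pi * tone_hz * n / rate) if key_down else 0.0
            frames += struct.pack("<h", int(value * 16000))
            n += 1
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(bytes(frames))
```

Something like `write_wav("test.wav", keying("CQ CQ DE TEST", wpm=18, jitter=0.25, rng=random.Random(42)))` would give a reproducible "bungling" sample, and since you know the text that went in, every generated file is automatically labeled. Noise and fading would still have to be mixed in separately.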
"Stratigraphically the origin of agriculture and thermonuclear destruction will appear essentially simultaneous" -- Lee
First off, thank you Slashdot UI, for having me retype this whole thing again.
I did this back in the early 90s with my Amiga. The hardware interface consisted of a transistor, filter capacitor, and variable resistor (I don't remember the exact design I came up with) to interface to the Amiga's joystick port (which used standard Atari controller wiring). I wrote the software decoder in Blitz Basic, and it used a scrolling window of 20-30 seconds over which it would average the pulses to determine the current dit and dah length. Any pulses deviating significantly from the current dit and dah length indicate a likely change in operator (one station finished keying and the other began their response), and the window would be positioned using that as the edge point.
The system worked extremely well, and was far more accurate than my AEA PK-232MBX when it came to decoding morse code. It decoded most anything I threw at it. Decoded output was sometimes delayed until it had received enough code to determine the current transmission rate and style, and then it would output a chunk of text at one time as it decoded the whole buffer at once. Then it would output real-time until a deviation in dit-dah lengths had been exceeded and the window repositioned so the dit and dah length could be recalculated.
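For anyone who wants to try the same approach today, here is a rough Python sketch of the scrolling-window idea. This is not the original Blitz Basic code. The class name, the two-cluster averaging, and the 50% deviation tolerance are all assumptions made for illustration.

```python
from collections import deque

class RateTracker:
    """Track dit/dah length over a sliding time window of key-down pulses."""

    def __init__(self, window_s=25.0, tolerance=0.5):
        self.window_s = window_s    # how many seconds of history to keep
        self.tolerance = tolerance  # allowed relative deviation per pulse
        self.pulses = deque()       # (timestamp, duration) pairs

    def estimate(self):
        """Return (dit, dah) averages, or None if there is too little data."""
        durations = sorted(d for _, d in self.pulses)
        if len(durations) < 2 or durations[-1] < 2 * durations[0]:
            return None  # cannot tell dits from dahs yet
        dit, dah = durations[0], durations[-1]
        for _ in range(10):  # a few rounds of two-cluster averaging
            short = [d for d in durations if abs(d - dit) <= abs(d - dah)]
            long_ = [d for d in durations if abs(d - dit) > abs(d - dah)]
            dit, dah = sum(short) / len(short), sum(long_) / len(long_)
        return dit, dah

    def add(self, timestamp, duration):
        """Record a pulse; return True if it looks like a new operator/speed."""
        changed = False
        current = self.estimate()
        if current is not None:
            dit, dah = current
            off = min(abs(duration - dit) / dit, abs(duration - dah) / dah)
            if off > self.tolerance:
                self.pulses.clear()  # restart the window at this pulse
                changed = True
        self.pulses.append((timestamp, duration))
        while timestamp - self.pulses[0][0] > self.window_s:
            self.pulses.popleft()
        return changed
```

As in the description above, nothing can be decoded until the window holds both dits and dahs, which is why output arrives in a chunk after a change of operator and then settles into real time.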
There are two discrete problems to address, and it sounds like you're lumping them together, which may not be a good way to proceed. First is the audio filtering / notch filter which tries to isolate a specific Morse code signal out of other transmissions in the adjoining frequencies and general background noise. The other is simply decoding of the Morse code message. Ideally, step 1 should be the analog portion, and step 2 should be purely digital.
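If step 1 has to be done in software instead of analog hardware, one common stand-in is the Goertzel algorithm, which measures the energy at a single audio frequency per block of samples. The sketch below shows how the two stages can stay separate: it only turns audio into key-down / key-up flags and leaves the decoding to something else. The function names, block size, and threshold are assumptions for illustration.

```python
import math

def goertzel_power(samples, rate, freq):
    """Squared magnitude of the signal component at freq (Goertzel)."""
    coeff = 2 * math.cos(2 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def keyed_blocks(samples, rate, freq, block, threshold):
    """Return one key-down flag per block of `block` samples."""
    flags = []
    for start in range(0, len(samples) - block + 1, block):
        power = goertzel_power(samples[start:start + block], rate, freq)
        flags.append(power >= threshold)
    return flags
```

A fixed threshold is the weak point here. With fading and noise it would have to adapt, and a short block trades frequency selectivity for timing resolution.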
Better known as 318230.
Looks like you could benefit from FANN neural nets if I understand the question correctly.
when I began my CS studies. Your algorithm was probably more sophisticated than mine, and my input was simply mouse clicking by the user. I wrote it in 1998 and got the idea for my CS 101 project from my interest in learning Morse code; at the time I couldn't find any software for that purpose. My algorithm took input and tried matching it to characters, and until it managed to get a match, it assumed differences in length were just lack of user skill; but once it found a match, it assumed that it knew the rate. The data structure with characters as bytes with 1 = dit and 0 = dah made so much sense that it seemed almost as if Mr. Morse had thought of binary when designing the system. My software would not have handled changes in the rate, but the most interesting discovery was how bad a mouse is as a "telegraph", since your hand perceives the press as much longer than the actual time the mouse has signaled "button down", simply because of how most mice are designed. An improvement I could have made to the mouse as an interface was to calculate an "offset" to add to each dit or dah before processing it.
will have a cheaper device out lickety–split!
It sounds like you're an old fart that finally retired and wants to tinker. Good for you. :)
I suggest looking at modern papers on speech recognition. You're essentially solving the same problem except you're using a simpler input, so don't bother trying to come up with your own algorithm. Just lift an algorithm from a paper. Yeah you'll have to train it, but it'll work 1000x better than anything you're likely to come up with unless you have a PhD in a related field.
go to WEBSDR.org it will solve your issue
CW is dead, buddy.
Dead as in "There are few people left on the planet who actively work CW on a high proficiency level without using a keyboard and a screen reader".
Today you can see ham shacks without a CW keyer as a norm, and if you see a CW keyer, the owner only in rare cases can go beyond 20wpm without breaking a sweat, making lots of errors all along the way and getting frustrated at hearing others do perfect CW, albeit with a keyboard.
To give you a sense of scale: There are no more than roughly 4-500 hams worldwide, who can use an electronic keyer in such a way that they can hold a meaningful conversation on the air at more than 40wpm at an acceptable error rate and who at the same time can follow such a conversation with their ears easily.
I know quite a few members of that minority and they are all like dinosaurs about to die out. The future lies in predictive keying by a computer and high resolution SDRs for decoding; give it another 10 years and even the most ardent pro-CW people will make way for other digital modes that can handle all the distinct advantages of CW operating (FullBK/QSK, pile ups and propagation resilience) just as well or better.
Speaking for myself, by now I am fed up with going on the air and listening to either machine CW or inept operators who never were afforded the luxury of good tutoring and coaching to make their CW better, more precise and fluent.
So, let me rephrase my initial sentence: CW may not be dead, but the true CW operator is a dying species and I can't see any merits to your project when the future is machine-only anyway.
On that website http://websdr.org/ you can find online software defined radio stations. You can use your computer to record CW without a receiver.
73's
Tell that to the lady next door who is positive that I'm the reason her 20 year old TV works like shit.
Are you truly wedded to WAV files? It seems to me that using an audio file format that supports embedded tags, such as MP3 (or AIFF, or many others), would allow you to keep the data and metadata together in a single entity.
Since you have a small collection of alpha/beta testers that will be submitting audio samples to you, is it much of a burden for you to provide them with a tool with a metadata input UI that packages it with the audio before upload to your favourite file repository?
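As a sketch of how small such a packaging tool could be: the snippet below bundles a WAV file and a JSON metadata file into one zip archive ready for upload. The function name, the required field names, and the bundle layout are invented for illustration, and the input UI and the upload step are left out.

```python
import json
import zipfile
from pathlib import Path

# Hypothetical minimum set of fields a tester must fill in.
REQUIRED = ("callsign", "utc_time", "frequency_khz")

def package_capture(wav_path, metadata, out_path):
    """Bundle a WAV capture and its metadata into a single zip file."""
    missing = [key for key in REQUIRED if not metadata.get(key)]
    if missing:
        raise ValueError(f"missing metadata fields: {missing}")
    wav_path = Path(wav_path)
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        bundle.write(wav_path, arcname=wav_path.name)
        bundle.writestr("metadata.json",
                        json.dumps(metadata, indent=2, sort_keys=True))
    return out_path
```

Keeping the metadata in a sidecar JSON file means the audio can stay as plain WAV, and the test harness can read the fields without parsing audio tags.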
Build a Neural Net to do what you wanted to do. Look at the example projects out there some include sound...