Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?
New submitter mni12 writes "I have been working on a Bayesian Morse decoder for a while. My goal is to have a CW decoder that adapts well to different ham radio operators' rhythm, sudden speed changes, signal fluctuations, interference, and noise, and that can decode Morse code accurately. While this problem is not as complex as speaker-independent speech recognition, there is still a lot of human variation where machine learning algorithms such as Bayesian probabilistic methods can help. I posted a first alpha release yesterday, and despite all the bugs one brave ham has already reported success. I would like to collect thousands of audio samples (WAV files) of real world CW traffic captured by hams via some sort of online system that would allow hams not only to upload captured files but also to provide relevant details such as their callsign, date & time, frequency, radio / antenna used, software version, comments etc. I would then use these audio files to build a test library for automated tests to improve the Bayesian decoder performance. Since my focus is on improving the decoder and not on building a digital audio archive service, I would like to get suggestions of any open source (free) software packages, online services, or any other ideas on how to effectively collect a large number of audio files without putting much burden on alpha / beta testers to submit their audio captures. Many available services require registration and don't support metadata or aggregation of submissions. Thanks in advance for your suggestions."
They like collecting stuff
Na, amateur radio transmissions are some of the most boring conversations known to man (and I am a ham radio operator). No sex, drugs and rock and roll - no eavesdropping. Besides, we're mostly harmless.
Back to the topic. Because the bands are prescribed, i.e., there are frequencies that are just CW (and phone or digital or whatever), it would seem an easy job to just record a band for a while to grab some samples. Use a software defined receiver (to allow for easy scripting), work the grey line in your area. Even if your software isn't tuned well yet, I would hazard a guess that it is smart enough to detect CW vs. radio noise. Use that to start and stop the file. You probably don't need WAV, that's sort of overkill for CW. Even cruddy ol' MP3 ought to give you more than enough headroom for further processing.
Faster! Faster! Faster would be better!
But it also means getting the metadata as free-form text, which is likely to need interpreting before processing. An HTML form, on the other hand, will provide a fairly standardised data format by comparison. It also provides an easy file upload facility.
Writing something in PHP/Python that accepts uploads and stores metadata in a database is not very much work to hack together. The main work will be deciding the fields and so on. A form can require an entry in the field for antenna type, whilst in e-mail it's easy to forget a field.
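A minimal sketch of the database side of such a form, in Python with the standard sqlite3 module. The table layout, the field names and the `store_submission` helper are all assumptions made up for illustration (loosely based on the details the submitter listed), not part of any existing package; the web framework that would call it is left out.

```python
import sqlite3

# Hypothetical field list, loosely based on the details the submitter asked for.
REQUIRED_FIELDS = ("callsign", "utc_time", "frequency_khz", "radio", "antenna")
OPTIONAL_FIELDS = ("software_version", "comments")


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) submissions table."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS submissions (
               id INTEGER PRIMARY KEY,
               filename TEXT NOT NULL,
               callsign TEXT NOT NULL,
               utc_time TEXT NOT NULL,
               frequency_khz REAL NOT NULL,
               radio TEXT NOT NULL,
               antenna TEXT NOT NULL,
               software_version TEXT,
               comments TEXT)"""
    )
    return db


def store_submission(db, filename, form):
    """Reject the submission if a required field is empty, otherwise store it."""
    missing = [f for f in REQUIRED_FIELDS if not str(form.get(f, "")).strip()]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    columns = ("filename",) + REQUIRED_FIELDS + OPTIONAL_FIELDS
    values = [filename] + [form.get(f) for f in REQUIRED_FIELDS + OPTIONAL_FIELDS]
    placeholders = ", ".join("?" for _ in columns)
    cur = db.execute(
        f"INSERT INTO submissions ({', '.join(columns)}) VALUES ({placeholders})",
        values,
    )
    db.commit()
    return cur.lastrowid
```

The point of the required-field check is the one made above: a form handler can refuse an upload with no antenna type, which an e-mail inbox cannot.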
The main challenge I guess is to get people to submit information...
Assembling etherkillers for fun and profit
Grrr. No editing.
And if you don't have a receiver handy, get a Fun Cube (not to be confused with the Time Cube) and hook it up to your computer with a random wire or dipole antenna.
Picasa came to mind - this service supports audio files and, last time I looked, allows you to share stuff. Although I should add that it has been a while since I looked at this service. Compliments on your clearly written post... days of /. gone by
So.... I guess you've never heard of skimmer, the various remote receivers out there, and the SDRs that people are using to record large swathes of shortwave spectrum? You know people have been working on the problem for a while, as in decades? Skimmer decodes multiple streams of morse at once. Wake me when your stuff outperforms skimmer.
You know, you could hardly pick a less controversial topic than amateur radio. If you want to get everyone all wound up about your favorite bogeyman, at least start off on one of the more irritable subjects we tend to yammer on about. The level of angst here is likely to be too low to channel.
I agree. Go to a WebSDR like http://websdr.ewi.utwente.nl:8901/ and automate the process. You can get a large amount of OTA signals to examine, in the correct ratios, styles and weightings. This requires you to decide whether or not the signal under test is CW, but that's part of your algorithm anyway. n6gn
It's really kind of overkill to do all that.
A simple phase-locked loop circuit is generally adequate to discriminate between tone/no-tone and you can buy them for pennies.
Once you have that done, tie your input data line to the PLL output and measure the widths of the tone pulses. A dash is going to average about 3 times as long as a dot; the inter-tone spaces are going to be about 1 dot-width, with inter-character spacing being 1 dash-length. Actual dot and dash timing can be expected to vary from about 5 WPM to 100 WPM (IIRC), and anything faster than that is likely a machine.
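The pulse-width approach above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is already a list of `(is_tone, seconds)` pairs (which is what the PLL output would give you once timed), the function names are made up, the dot estimate is deliberately crude, and the 7-dot word gap is the standard Morse convention rather than something from the comment above.

```python
# Letters only, for brevity; codes are listed in alphabetical order A..Z.
_CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
          "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..").split()
MORSE = dict(zip(_CODES, "ABCDEFGHIJKLMNOPQRSTUVWXYZ"))


def estimate_dot(durations):
    """Crude dot-length estimate from the tone pulses alone."""
    tones = [secs for is_tone, secs in durations if is_tone]
    lo, hi = min(tones), max(tones)
    if hi < 2 * lo:
        # Only one kind of element seen so far; assume they are all dots.
        return sum(tones) / len(tones)
    mid = (lo + hi) / 2
    short = [t for t in tones if t < mid]
    return sum(short) / len(short)


def decode(durations, dot):
    """Turn (is_tone, seconds) pairs into text, given the dot length."""
    text, symbol = [], ""
    for is_tone, secs in durations:
        if is_tone:
            # Dash is about 3 dots, so split tones halfway, at 2 dots.
            symbol += "." if secs < 2 * dot else "-"
        elif secs >= 2 * dot:
            # A gap of about 3 dots ends a character.
            if symbol:
                text.append(MORSE.get(symbol, "?"))
                symbol = ""
            # A gap of about 7 dots also ends a word (split at 5 dots).
            if secs >= 5 * dot and text and text[-1] != " ":
                text.append(" ")
    if symbol:
        text.append(MORSE.get(symbol, "?"))
    return "".join(text)
```

Typical use would be `decode(pulses, estimate_dot(pulses))`. Fixed thresholds like these are exactly what breaks down on sloppy hand-sent code, which is where the adaptive methods discussed elsewhere in the thread come in.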
CWG is correct in the main. Even before the Internet, typical Morse exchanges tended to consist of call sign, location, antenna type and local weather.
CW is mostly about just being able to contact other people briefly just to say you did it. And for beacons, such as the recent space experiments. For more interesting content, there's voice, RTTY and TV.
The Linux OS has a large suite of amateur radio programs, including decoders, loggers and equipment controllers. I think there are even special ham distros.
You can collect a lot of morse code traffic in the wild. Just get yourself a good HF receiver with some filtering (notch filter and a DSP). Set up a dipole as your receive antenna cut to 1/4 the wavelength of the band you will be monitoring. Here is a handy band plan to guide you to where you will be able to find morse code which is normally called CW for continuous wave communications.
I recommend this over any attempt to collect samples directly from hams. I know I send morse code differently when using the radio for casual contacts than I did when making exam tapes back when I was a volunteer examiner.
Another attribute that will affect morse code transmission is the type of morse key being used. I use either a straight key, which is completely manual and my dots and dashes do vary depending on fatigue, or a paddle key where one paddle makes a dot and the other paddle makes a dash. The dots and dashes are consistent in duration but the space between them will vary depending on fatigue. I did try using a Vibroplex key. The dash will vary in duration but the dots are constant in both duration and time between each dot. Most of my friends still use them (a mutual acquaintance owns the company), but I found myself constantly having to slow down because I would let that pendulum swing speed up my keying.
Happy hunting and 73.
These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
Write an article and submit to ARRL's QST and join and post to the AMSAT mailing lists as there are quite a few keys there as well. Talk to your local amateur radio club and get the word out and you might even talk to your area coordinator.
150 kHz to roughly 2 GHz does happen to encompass the ham bands. There is a small region around 1.27 MHz where the local oscillator won't lock, but that doesn't jibe with an amateur band (IIRC, too lazy to look it up).
It's not free. It's a low volume device, so it costs a bit (125 GBP). Life is hard.
You replied, though. The Troll achieved success.
I wrote a similar application in the late 1980's using a backpropagation neural net, and it was difficult to complete.
Asking for volunteer submissions is the easiest and most obvious answer. There is a group of commercial operators at http://www.radiomarine.org/ who might have tapes for you.
Some of the CONET project recordings feature morse, but the ones that I have heard sound mechanically generated.
Finally, you can collect for yourself. HF is a desert these days; the last time I tried, only hams were active (there has not been commercial morse for decades). This also means the equipment is dirt cheap on eBay, so if you have any outside space at all for just a long wire antenna you can collect your own samples.
Also, you might not be aware but there are several regional variations of international morse. Cyrillic operators have 5 extra characters, the Chinese have an abbreviated numeric system, etc.
I already have many samples of CW contest traffic recorded from my Flex3000. Because most of it is computer generated, the decoding challenge is mostly related to signal-to-noise ratio and interference, not so much to the personal rhythm variations that appear when people use a straight key.
The idea presented was to collect many different kinds of CW samples. I am looking more for variation than uniformity. Having an adaptive decoder algorithm that adjusts itself automatically to all kinds of CW is a challenge.
The thesis you are basing your work on is from 1977; while no doubt current when it was written, there has been a lot of work on human signal decoding since then.
I'd strongly suggest looking at Hidden Markov Models:
http://en.wikipedia.org/wiki/Hidden_Markov_model
While some recent methods have gone beyond HMMs for speech recognition, that's been the baseline "good" solution for the past decade.
Since this is a binary signal problem another approach to consider would be Markov Random Fields (MRFs) which could be used as an initial de-noising pass or even as a full decoder if you set the cost functions right.
Your idea of user adaptation is pretty reasonable, but my guess is the primary thing that matters would be an overall speed scaling. IOW for good decoding you probably just need to normalize the average letter rate between users.
Good luck.
Great suggestion - thank you! Looks like the site requires registration but it has been created exactly for this kind of audio related research. It even has APIs to access the data. I will investigate this a bit further.
a friend pointed this out to me the other day:
https://archive.org/details/SsMarineElectricWoohSos
there are 3 kinds of people:
* those who can count
* those who can't
Obviously you realize there are differences in how people send CW. While I applaud your drive to make a smarter decoder - the reality is that you need to make sure it works on live traffic. So in that respect, you should hook it into some kind of SDR software like HRD or even make your own that can decode multiple streams of CW. If you don't have a radio, I suggest maybe a SoftRock receiver?
1. It gives you actual live conversations with all the mistakes and alterations. Not everyone uses computer generated CW. In fact, most brass pounders dislike it because it's boring to listen to and dry to copy.
2. There are sanctioned CW events all the time... QSO parties, commemorative stations and even at the beginning of next month there are straight key nights where people put the paddles away and break out the straight key.
3. I'm going to assume the end goal is to have this listening to live feeds anyway. You should work toward that goal now, as then you can write code to compensate for QRM and QRN/fading.
Having people feed you 'tapes' won't accomplish your goal. You need to have it work with the real source.
boom goes the dynamite....
This is a problem that affects all kinds of machine learning. It is always very difficult to collect enough samples to teach good recognition skills, whether it is handwriting, speech or, as in this case, Morse code. I'm wondering whether some open library that could be uploaded to for this kind of thing already exists; if not, it might be a good idea.
No sigs in BETA. Beta SUCKS.
Totally agree. Ham radio has become one of the most boring things one can do in his spare time.
Years ago I was into it, and I was developing some advanced DSP stuff (sort of what is known now as software defined radio, but the algorithms I was using were different and better performing than those used by radio amateurs). As I started leaking some details about what I was developing, I suddenly realized that radio amateurs were not interested in experimenting with new technologies: they just wanted to buy high tech toys. So I gave up everything: I wanted people to learn science, not how to fill out a check and buy a pre-built kit. I heartily suggest you invest your skills and your spare time in something that is much more useful than ham radio. For example, I give science seminars in high schools and serve on the board of a nationwide science association. Just look around: there are plenty of opportunities.
Couldn't you just create a computer generator for this audio, that uses a PRNG to intersperse pauses and other variations? You could create a much wider variety of conditions to put your parser through by controlling how much variation is in the length of each beep, pauses between beeps, pauses between letters. You could create a really bungling case or create a perfect case, and anything in between. Why not just do that?
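A rough sketch of what such a generator might look like in Python, using only the standard library. Everything here is an assumption for illustration: the function names, the Gaussian jitter model, the 600 Hz tone and the 8 kHz sample rate. The one borrowed convention is the usual "PARIS" timing, where a dot lasts 1.2/WPM seconds. Only letters are supported; other characters raise KeyError.

```python
import math
import random
import struct
import wave

_CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- "
          "-. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..").split()
CODES = dict(zip("ABCDEFGHIJKLMNOPQRSTUVWXYZ", _CODES))


def keying(text, wpm=20, jitter=0.0, seed=None):
    """Return (is_tone, seconds) pairs for text.

    jitter is the relative standard deviation applied independently to every
    element and gap: 0.0 is a perfect machine, 0.3 is a fairly sloppy fist.
    """
    rng = random.Random(seed)
    dot = 1.2 / wpm  # PARIS timing convention

    def vary(units):
        # Gaussian jitter, clamped so a duration never collapses to zero.
        return max(0.2 * dot, rng.gauss(units * dot, jitter * units * dot))

    out = []
    for word in text.upper().split():
        for letter in word:
            for element in CODES[letter]:
                out.append((True, vary(1 if element == "." else 3)))
                out.append((False, vary(1)))  # gap between elements
            out[-1] = (False, vary(3))        # last gap becomes a letter gap
        out[-1] = (False, vary(7))            # last gap becomes a word gap
    return out


def write_wav(path, pairs, tone_hz=600, rate=8000, noise=0.0, seed=None):
    """Render the keying as 16-bit mono audio with optional Gaussian noise."""
    rng = random.Random(seed)
    frames = bytearray()
    n = 0
    for is_tone, secs in pairs:
        for _ in range(int(secs * rate)):
            s = 0.5 * math.sin(2 * math.pi * tone_hz * n / rate) if is_tone else 0.0
            s += rng.gauss(0.0, noise)
            s = max(-1.0, min(1.0, s))  # clip to the valid sample range
            frames += struct.pack("<h", int(s * 32767))
            n += 1
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(bytes(frames))
```

For example, `write_wav("bungling.wav", keying("CQ CQ DE TEST", wpm=15, jitter=0.3, seed=1), noise=0.2)` would give a repeatable sloppy, noisy case. Independent jitter per element is a simplification; it will not reproduce fading, interference or the correlated habits of a real operator, which is presumably why real recordings are still wanted.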
"Stratigraphically the origin of agriculture and thermonuclear destruction will appear essentially simultaneous" -- Lee
Great. You have now told our enemy, namely, the general population, that a.r.t. isn't monitored by the NSA. North America will now be awash in terrorists.
Sleep your way to a whiter smile...date a dentist!
First off, thank you Slashdot UI, for having me retype this whole thing again.
I did this back in the early 90s with my Amiga. The hardware interface consisted of a transistor, filter capacitor, and variable resistor (I don't remember the exact design I came up with) to interface to the Amiga's joystick port (which used standard Atari controller wiring). I wrote the software decoder in Blitz Basic, and it used a scrolling window of 20-30 seconds over which it would average the pulses to determine the current dit and dah length. Any pulses deviating significantly from the current dit and dah length indicate a likely change in operator (one station finished keying and the other began their response), and the window would be positioned using that as the edge point.
The system worked extremely well, and was far more accurate than my AEA PK-232MBX when it came to decoding morse code. It decoded most anything I threw at it. Decoded output was sometimes delayed until it had received enough code to determine the current transmission rate and style, and then it would output a chunk of text at one time as it decoded the whole buffer at once. Then it would output in real time until a pulse deviated too far from the current dit and dah lengths, at which point the window was repositioned so the dit and dah length could be recalculated.
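The original Blitz Basic code is not shown, so the following is only a guess at the general idea, written as a Python sketch: keep the tone pulses from the last 20-30 seconds, average them into a dit and a dah length, and restart the window when a pulse fits neither. The class name, the 50% tolerance and the midpoint split between short and long pulses are all assumptions.

```python
from collections import deque


class TimingTracker:
    """Sliding-window estimate of dit and dah length (illustrative only)."""

    def __init__(self, window_secs=25.0, tolerance=0.5):
        self.window_secs = window_secs
        self.tolerance = tolerance  # allowed relative deviation from dit/dah
        self.pulses = deque()       # (timestamp, duration) of tone pulses

    def lengths(self):
        """Return (dit, dah) averaged over the window, or None if unknown."""
        durs = sorted(d for _, d in self.pulses)
        if len(durs) < 2 or durs[-1] < 2 * durs[0]:
            return None  # have not yet seen both a dit and a dah
        mid = (durs[0] + durs[-1]) / 2
        dits = [d for d in durs if d < mid]
        dahs = [d for d in durs if d >= mid]
        return sum(dits) / len(dits), sum(dahs) / len(dahs)

    def add(self, timestamp, duration):
        """Add a tone pulse; return True if it forced the window to restart."""
        restarted = False
        estimate = self.lengths()
        if estimate is not None:
            nearest = min(estimate, key=lambda ref: abs(duration - ref))
            if abs(duration - nearest) > self.tolerance * nearest:
                # Fits neither dit nor dah: likely a new operator, so the
                # window restarts with this pulse as its leading edge.
                self.pulses.clear()
                restarted = True
        self.pulses.append((timestamp, duration))
        # Drop pulses that have scrolled out of the time window.
        while timestamp - self.pulses[0][0] > self.window_secs:
            self.pulses.popleft()
        return restarted
```

While `lengths()` returns None the decoder would have to buffer pulses, which matches the delayed chunk of output described above.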
There are two discrete problems to address, and it sounds like you're lumping them together, which may not be a good way to proceed. First is the audio filtering / notch filter which tries to isolate a specific morse code signal out of other transmissions in the adjoining frequencies and general background noise. The other is simply decoding of the morse code message. Ideally, step 1 should be the analog portion, and step 2 should be purely digital.
Better known as 318230.
I suddenly realized that radio amateurs were not interested in experimenting with new technologies
I'm not a radio amateur but I certainly am interested in experimenting with new technologies. The other day, I was thinking whether it would be possible to combine a GPS unit, a PRNG, frequency hopping, exotic modulation schemes and SDR into a low-bandwidth, virtually undetectable means of clandestine communication. But I suspect that this is not exactly what amateurs are allowed to do anyway.
Ezekiel 23:20
enlighten me please
Internet access is pretty cool, irrespective of the delivery method, but get back to me on how well that works without power (storms or other disasters). If you have a ham license and a battery-backed transceiver, you can communicate easily over long distances. Because of its narrow bandwidth, CW works very well.
During a recent contest, a ham in the northeast U.S. communicated with Wake Island using CW and four watts of power. Pretty impressive.
Not everyone foams at the mouth over the latest toys.
Circle the wagons and fire inward. Entropy increases without bounds.
I did some testing using classifiers in the WEKA package but was quite disappointed with the results. My next attempt was to leverage PNN (Probabilistic Neural Network) and I got somewhat better results. In the test runs with noisy audio files with Morse code I got up to 90% accuracy in classifying dits and dahs. I have not used the FANN package a lot, though I installed it on my development machine 1-2 years ago. What are your thoughts about FANN exactly? How would you go about using the package?
Thanks for your advice, junior. I am not retired but just happen to be interested in Machine Learning methods, and this problem seems to be difficult enough, since only a few people have created anything that comes even close to performing at a skilled human operator's level. I did investigate some speech recognition algorithms such as HMM and SOM. I have also spent some time collecting data and training software to recognize real world noisy and messy signals. In fact the current shipping version of the FLDIGI package has one of these algorithms (SOM) built in.
I don't have a PhD in a related field, but I have studied signal processing and even wrote some software for MRI image reconstruction and processing earlier in my career. The papers I have read on speech recognition over the last 20 years have certainly improved the state of the art but the methods are more incremental improvements than some ground breaking new discoveries.
BTW - How is that Siri working for you in a noisy car with windows open at highway speed? Humans can still understand each other in these kinds of conditions.
As others have posted, I think you're making the problem far more complex than it needs to be by insisting on using "machine learning" techniques. All you need to do is some basic filtering to identify the beeps from the background noise, some averaging over a time window to determine which are dashes and dots, and the rest is just simple lookup tables.
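For the "basic filtering" step, one common software option is the Goertzel algorithm, which measures the energy at a single frequency in a short block of samples. The Python sketch below is an assumption about how that step could be done, not something taken from the comment above; the function names, the 10 ms block size and the halfway threshold are made up for illustration.

```python
import math


def goertzel_power(samples, rate, tone_hz):
    """Energy of one block of samples at tone_hz (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * tone_hz / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def tone_runs(samples, rate, tone_hz, block=80, threshold=None):
    """Split audio into (is_tone, seconds) runs, one decision per block."""
    powers = [goertzel_power(samples[i:i + block], rate, tone_hz)
              for i in range(0, len(samples) - block + 1, block)]
    if not powers:
        return []
    if threshold is None:
        # Naive choice: halfway between the quietest and loudest block.
        threshold = (min(powers) + max(powers)) / 2
    runs = []
    for p in powers:
        on = p > threshold
        if runs and runs[-1][0] == on:
            runs[-1][1] += block / rate  # extend the current run
        else:
            runs.append([on, block / rate])
    return [(on, secs) for on, secs in runs]
```

The resulting runs are the pulse and gap widths that the averaging and lookup-table steps would then work from. A fixed halfway threshold is the weak point here: with fading or interference it will chop single elements into pieces.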
I made the same mistake back in university. We applied machine learning techniques to playing a game (I forget which one), but it turned out that after three rounds of playing the game, the system just built a static lookup table and won all the time. The game just wasn't as random or intuitive as it seemed at first, so our solution to it was serious overkill for the problem at hand.
I do not fail; I succeed at finding out what does not work.
Na, amateur radio transmissions are some of the most boring conversations known to man (and I am a ham radio operator). No sex, drugs and rock and [roll]
Listen to the guys on 80 meter AM and you'll get that. :-) And 14.313 MHz has people on drugs....
Tired of being "punished" by the Slashdot $rtbl since 2002. I'm now over at http://soylentnews.org/ .
They do HF, quite well, and they're fairly sensitive. $250 isn't a bad deal for a DC to daylight software defined receiver.
Huh. I mostly do satellite stuff (when I'm active at all). Guess I'll have to try the lower bands if I want some excitement....
Cool. I've been wanting to get into satellites. I'm building a homebrew az/el rotation system controlled by Arduinos. With all the cubesats that have recently launched, satellite is where it's at.
CW is dead, buddy.
Dead as in "There are few people left on the planet who actively work CW on a high proficiency level without using a keyboard and a screen reader".
Today you can see ham shacks without a CW keyer as a norm, and if you see a CW keyer, the owner only in rare cases can go beyond 20wpm without breaking a sweat, making lots of errors all along the way and getting frustrated at hearing others do perfect CW, albeit with a keyboard.
To give you a sense of scale: There are no more than roughly 400-500 hams worldwide who can use an electronic keyer in such a way that they can hold a meaningful conversation on the air at more than 40wpm at an acceptable error rate and who at the same time can easily follow such a conversation by ear.
I know quite a few members of that minority and they are all like dinosaurs about to die out. The future lies in predictive keying by a computer and high resolution SDRs for decoding; give it another 10 years and even the most ardent pro-CW people will make way for other digital modes that can handle all the distinct advantages of CW operating (FullBK/QSK, pile ups and propagation resilience) just as well or better.
Speaking for myself, by now I am fed up with going on the air and listening to either machine CW or inept operators who were never afforded the luxury of good tutoring and coaching to make their CW better, more precise and fluent.
So, let me rephrase my initial sentence: CW may not be dead, but the true CW operator is a dying species and I can't see any merits to your project when the future is machine-only anyway.
On that website http://websdr.org/ you can find online software defined radio stations. You can use your computer to record CW without a receiver.
73's