Slashdot Mirror


Ask Slashdot: How To Build a Morse Code Audio Library For Machine Learning?

New submitter mni12 writes "I have been working on a Bayesian Morse decoder for a while. My goal is a CW decoder that adapts well to different ham radio operators' rhythm, sudden speed changes, signal fluctuations, interference, and noise, and that decodes Morse code accurately. While this problem is not as complex as speaker-independent speech recognition, there is still a lot of human variation where machine learning algorithms such as Bayesian probabilistic methods can help. I posted a first alpha release yesterday, and despite all the bugs, one brave ham has already reported success. I would like to collect thousands of audio samples (WAV files) of real-world CW traffic captured by hams via some sort of online system that would allow hams not only to upload captured files but also to provide relevant details such as their callsign, date and time, frequency, radio and antenna used, software version, comments, etc. I would then use these audio files to build a test library for automated tests to improve the Bayesian decoder's performance. Since my focus is on improving the decoder and not on building a digital audio archive service, I would like suggestions for open source (free) software packages, online services, or any other ideas on how to effectively collect a large number of audio files without putting much burden on alpha and beta testers to submit their audio captures. Many available services require registration and don't support metadata or aggregation of submissions. Thanks in advance for your suggestions."

12 of 79 comments

  1. Re:Try the NSA by ColdWetDog · · Score: 5, Informative

    They like collecting stuff

    Na, amateur radio transmissions are some of the most boring conversations known to man (and I am a ham radio operator). No sex, drugs and rock and roll - no eavesdropping. Besides, we're mostly harmless.

    Back to the topic. Because the bands are prescribed, i.e., there are frequencies that are just CW (and others for phone, digital or whatever), it would seem an easy job to just record a band for a while to grab some samples. Use a software-defined receiver (to allow for easy scripting) and work the grey line in your area. Even if your software isn't tuned well yet, I would hazard a guess that it is smart enough to tell CW from radio noise. Use that to start and stop the file. You probably don't need WAV; that's sort of overkill for CW. Even cruddy ol' MP3 ought to give you more than enough headroom for further processing.
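
    A minimal sketch of that start/stop gating, assuming mono float samples and a known CW pitch. The 700 Hz default, the 0.25 threshold and the function names are illustrative choices, not taken from any particular decoder. It uses the Goertzel algorithm to measure how much of a block's energy sits at the tone frequency, then bridges the short gaps between dots and dashes so a recording isn't cut mid-message.

```python
import math

def goertzel_power(samples, sample_rate, tone_hz):
    """Squared DFT magnitude at the bin nearest tone_hz (Goertzel algorithm)."""
    n = len(samples)
    k = round(n * tone_hz / sample_rate)          # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def tone_present(samples, sample_rate, tone_hz=700.0, threshold=0.25):
    """True when the tone bin holds more than `threshold` of the block energy."""
    total = sum(x * x for x in samples)
    if total == 0.0:
        return False
    # For a pure on-bin sine this fraction is 1.0; for white noise it is about 2/n.
    fraction = 2.0 * goertzel_power(samples, sample_rate, tone_hz) / (len(samples) * total)
    return fraction > threshold

def record_segments(flags, hang_blocks=3):
    """Merge per-block tone flags into (start, end) block ranges.

    Gaps of up to hang_blocks silent blocks are bridged, since CW is
    mostly silence between elements.
    """
    segments = []
    start, last_on = None, None
    for i, on in enumerate(flags):
        if on:
            if start is None:
                start = i
            last_on = i
        elif start is not None and i - last_on > hang_blocks:
            segments.append((start, last_on + 1))
            start = None
    if start is not None:
        segments.append((start, last_on + 1))
    return segments
```

    In practice the hang time would need to be several seconds' worth of blocks, and the tone frequency would have to be searched for rather than assumed.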

    --
    Faster! Faster! Faster would be better!
  2. Re:Try the NSA by ColdWetDog · · Score: 3, Interesting

    Grrr. No editing.

    And if you don't have a receiver handy, get a Fun Cube (not to be confused with the Time Cube) and hook it up to your computer with a random wire or dipole antenna.

    --
    Faster! Faster! Faster would be better!
  3. Picasa by symes · · Score: 2

    Picasa came to mind - this service supports audio files and, last time I looked, allows you to share stuff. Although I should add that it has been a while since I looked at this service. Compliments on your clearly written post... days of /. gone by

  4. Skimmer by dbc · · Score: 2

    So.... I guess you've never heard of Skimmer, the various remote receivers out there, and the SDRs that people are using to record large swathes of shortwave spectrum? You know people have been working on the problem for a while, as in decades? Skimmer decodes multiple streams of Morse at once. Wake me when your stuff outperforms Skimmer.

    1. Re:Skimmer by mni12 · · Score: 4, Informative

      I am using CW Skimmer fairly actively; in fact I have been corresponding with Alex, VE3NEA, who wrote CW Skimmer. He gave me the idea of pursuing a Bayesian framework as I work toward a well-functioning CW decoder. The main difference here is that I am focusing on improving FLDIGI, which is open source software, while CW Skimmer is a commercial software package. I do agree with you that CW Skimmer does a great job decoding multiple streams simultaneously. Once the algorithm works, decoding multiple streams is not that difficult.

  5. Re:It's like you're not even trying. by n6gn · · Score: 2

    I agree. Go to a WebSDR like http://websdr.ewi.utwente.nl:8901/ and automate the process. You can get a large amount of OTA signals to examine, in the correct ratios, styles and weightings. This requires you to decide whether the signal under test is CW, but that's part of your algorithm anyway. n6gn

  6. Re:Try the NSA by RabidReindeer · · Score: 2

    It's really kind of overkill to do all that.

    A simple phase-locked loop circuit is generally adequate to discriminate between tone/no-tone and you can buy them for pennies.

    Once you have that done, tie your input data line to the PLL output and measure the widths of the tone pulses. A dash is going to average about 3 times as long as a dot; the inter-tone spaces are going to be about 1 dot-width, with inter-character spacing being 1 dash-length. Actual dot and dash timing can be expected to vary from about 5 WPM to 100 WPM (IIRC), and anything faster than that is likely a machine.
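
    A minimal sketch of that pulse-width approach, assuming the tone detector already delivers a list of (is_tone, duration_in_seconds) runs and that the sending speed is known. It relies on the standard "PARIS" timing, where one dot lasts 1.2 / WPM seconds; the 2-unit and 5-unit decision thresholds and the function names are illustrative choices.

```python
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
    "-----": "0", ".----": "1", "..---": "2", "...--": "3", "....-": "4",
    ".....": "5", "-....": "6", "--...": "7", "---..": "8", "----.": "9",
}

def dot_seconds(wpm):
    """Length of one dot in seconds under PARIS timing."""
    return 1.2 / wpm

def decode_runs(runs, wpm):
    """Decode (is_tone, seconds) runs into text using fixed timing thresholds."""
    unit = dot_seconds(wpm)
    text, symbol = [], ""
    for is_tone, duration in runs:
        units = duration / unit
        if is_tone:
            # Dots are 1 unit and dashes 3, so split halfway at 2 units.
            symbol += "." if units < 2.0 else "-"
        elif units >= 2.0:
            # A gap of about 3 units ends a character, about 7 ends a word.
            if symbol:
                text.append(MORSE.get(symbol, "?"))
                symbol = ""
            if units >= 5.0 and text and text[-1] != " ":
                text.append(" ")
    if symbol:
        text.append(MORSE.get(symbol, "?"))
    return "".join(text).strip()
```

    Fixed thresholds like these only hold while the operator keeps a steady speed, which is exactly the limitation an adaptive decoder is meant to remove.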

    CWG is correct in the main. Even before the Internet, typical Morse exchanges tended to consist of call sign, location, antenna type and local weather.

    CW is mostly about just being able to contact other people briefly just to say you did it. And for beacons, such as the recent space experiments. For more interesting content, there's voice, RTTY and TV.

    The Linux OS has a large suite of amateur radio programs, including decoders, loggers and equipment controllers. I think there are even special ham distros.

  7. Ask ARRL and AMSAT members by Dishwasha · · Score: 2

    Write an article and submit it to ARRL's QST, and join and post to the AMSAT mailing lists, as there are quite a few keys there as well. Talk to your local amateur radio club to get the word out; you might even talk to your area coordinator.

  8. Try HMMs by SnowZero · · Score: 4, Informative

    The thesis you are basing your work on is from 1977; while no doubt current when it was written, there has been a lot of work on human signal decoding since then.

    I'd strongly suggest looking at Hidden Markov Models:
        http://en.wikipedia.org/wiki/Hidden_Markov_model
    While some recent methods have gone beyond HMMs for speech recognition, that's been the baseline "good" solution for the past decade.
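
    To make the HMM suggestion concrete, here is a minimal sketch under simple assumptions: the hidden state is the key position (0 = up, 1 = down), the observation is a noisy per-sample tone/no-tone decision, and the two probabilities p_stay and p_flip are made-up illustrative values rather than fitted ones. The Viterbi algorithm then recovers the most likely key sequence.

```python
import math

def viterbi_denoise(obs, p_stay=0.9, p_flip=0.2):
    """Most likely hidden 0/1 key sequence for noisy 0/1 observations.

    p_stay: probability the key stays in its state from one sample to the next.
    p_flip: probability the detector reports the wrong state.
    """
    if not obs:
        return []
    log_trans = [[math.log(p_stay), math.log(1.0 - p_stay)],
                 [math.log(1.0 - p_stay), math.log(p_stay)]]

    def log_emit(state, o):
        return math.log(1.0 - p_flip) if o == state else math.log(p_flip)

    # Uniform prior over the initial state.
    score = [math.log(0.5) + log_emit(s, obs[0]) for s in (0, 1)]
    back = []
    for o in obs[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            best_prev = max((0, 1), key=lambda p: score[p] + log_trans[p][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s] + log_emit(s, o))
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

    A full decoder would use richer states (dot, dash, element gap, character gap, word gap) with duration models, but the dynamic programming is the same.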

    Since this is a binary signal problem, another approach to consider would be Markov Random Fields (MRFs), which could be used as an initial de-noising pass or even as a full decoder if you set the cost functions right.

    Your idea of user adaptation is pretty reasonable, but my guess is the primary thing that matters would be an overall speed scaling. IOW for good decoding you probably just need to normalize the average letter rate between users.
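
    A minimal sketch of such a speed normalization, assuming the tone (mark) durations have already been measured and that both dots and dashes are present in the sample. The geometric-mean split and the function names are illustrative; the WPM conversion uses the standard PARIS timing of 1.2 / WPM seconds per dot.

```python
import math
from statistics import median

def estimate_dot_seconds(mark_durations):
    """Estimate the dot length from a mix of dot and dash durations.

    Splits at the geometric mean of the shortest and longest mark, then
    takes the median of the short group. Assumes both dots and dashes occur.
    """
    split = math.sqrt(min(mark_durations) * max(mark_durations))
    shorts = [d for d in mark_durations if d <= split]
    return median(shorts)

def normalize(durations, dot):
    """Express durations in dot units so different speeds become comparable."""
    return [d / dot for d in durations]

def wpm_from_dot(dot):
    """Sending speed implied by a dot length, under PARIS timing."""
    return 1.2 / dot
```

    After normalization a 30 WPM and a 12 WPM transmission of the same text yield nearly the same sequence of values, so the rest of the decoder only has to cope with each operator's rhythm.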

    Good luck.

    1. Re:Try HMMs by mni12 · · Score: 2

      Thanks @SnowZero. I have looked at HMMs and in fact I wrote a simplistic decoder version using RubyHMM just to learn more about how HMMs really work. You would be surprised by the mathematical rigor of the original thesis. Many of the ideas are very relevant today, just much easier to implement with the current generation of computers.

      The current decoder actually uses a Markov model: the software calculates conditional probabilities based on a 2nd order Markov symbol transition matrix. The framework itself allows additional components to be added. The de-noising is done by a set of Kalman filters that are used in the first pass, before all possible paths are labeled and control is passed to the trellis calculation and eventual letter translation.
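
      As a generic illustration of what a 2nd order symbol transition table looks like (not the decoder's actual matrix or symbol set, which are not described here), the sketch below estimates P(next | previous two symbols) from a training sequence. The add-alpha smoothing and the function name are assumptions made for the example.

```python
from collections import Counter, defaultdict

def fit_second_order(sequence, alphabet, alpha=1.0):
    """Return prob(prev2, prev1, nxt) estimated from `sequence`.

    Counts every run of three consecutive symbols and applies add-alpha
    smoothing so unseen contexts still get a non-zero probability.
    """
    counts = defaultdict(Counter)
    for a, b, c in zip(sequence, sequence[1:], sequence[2:]):
        counts[(a, b)][c] += 1

    def prob(prev2, prev1, nxt):
        context = counts.get((prev2, prev1), Counter())
        total = sum(context.values())
        return (context[nxt] + alpha) / (total + alpha * len(alphabet))

    return prob
```

      For example, after training on "..-..-..-" with the alphabet ".-", the context ("." , ".") has been followed by "-" three times and by "." never, so the smoothed estimate for "-" is (3 + 1) / (3 + 2) = 0.8.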

      I am not yet at the stage for overall speed scaling. The algorithm itself needs to work well before I want to pursue scaling this up.

         

  9. Did this a couple decades ago by Dan+East · · Score: 2

    First off, thank you Slashdot UI, for having me retype this whole thing again.

    I did this back in the early 90s with my Amiga. The hardware interface consisted of a transistor, filter capacitor, and variable resistor (I don't remember the exact design I came up with) to interface to the Amiga's joystick port (which used standard Atari controller wiring). I wrote the software decoder in Blitz Basic, and it used a scrolling window of 20-30 seconds over which it would average the pulses to determine the current dit and dah length. Any pulses deviating significantly from the current dit and dah length indicated a likely change in operator (one station finished keying and the other began its response), and the window would be repositioned using that pulse as the edge point.

    The system worked extremely well, and was far more accurate than my AEA PK-232MBX when it came to decoding Morse code. It decoded almost anything I threw at it. Decoded output was sometimes delayed until it had received enough code to determine the current transmission rate and style, and then it would output a chunk of text at one time as it decoded the whole buffer at once. Then it would output in real time until the dit and dah lengths deviated too far from the current estimate, at which point the window was repositioned so the dit and dah lengths could be recalculated.
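
    The original Blitz Basic code isn't shown, so the following is only a rough Python sketch of the same idea: keep the pulses from the last 20-30 seconds, average them into a dit and a dah length, and restart the window when a pulse fits neither. The class name, the 40% tolerance, the minimum pulse count and the way dits and dahs are separated are all assumptions.

```python
from collections import deque

class SlidingTiming:
    """Track average dit/dah length over a sliding time window."""

    def __init__(self, window_s=25.0, tolerance=0.4, min_pulses=6):
        self.window_s = window_s
        self.tolerance = tolerance
        self.min_pulses = min_pulses
        self.pulses = deque()  # (time_s, duration_s) of each tone pulse

    def estimate(self):
        """Return (dit, dah) averages, or None without enough evidence."""
        if len(self.pulses) < self.min_pulses:
            return None
        durations = [d for _, d in self.pulses]
        if max(durations) < 2.0 * min(durations):
            return None  # only one kind of element seen so far
        split = (min(durations) * max(durations)) ** 0.5
        dits = [d for d in durations if d <= split]
        dahs = [d for d in durations if d > split]
        return sum(dits) / len(dits), sum(dahs) / len(dahs)

    def add(self, time_s, duration):
        """Add a pulse; return True if it forced the window to restart."""
        reset = False
        current = self.estimate()
        if current is not None:
            dit, dah = current
            nearest = dit if abs(duration - dit) < abs(duration - dah) else dah
            if abs(duration - nearest) > self.tolerance * nearest:
                # Fits neither element: likely a new operator, so this
                # pulse becomes the left edge of a fresh window.
                self.pulses.clear()
                reset = True
        self.pulses.append((time_s, duration))
        while time_s - self.pulses[0][0] > self.window_s:
            self.pulses.popleft()
        return reset
```

    Note that a new operator whose dit happens to match the previous operator's dah would not trigger a reset on that pulse; the change would only show up on a later element.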

    There are two discrete problems to address, and it sounds like you're lumping them together, which may not be a good way to proceed. The first is the audio filtering / notch filter, which tries to isolate a specific Morse code signal from other transmissions on adjoining frequencies and from general background noise. The other is simply decoding the Morse code message. Ideally, step 1 should be the analog portion, and step 2 should be purely digital.

    --
    Better known as 318230.
  10. Re:FANN Neural Net by mni12 · · Score: 2

    I did some testing using classifiers in the WEKA package but was quite disappointed with the results. My next attempt was to leverage a PNN (Probabilistic Neural Network), which gave somewhat better results. In test runs with noisy audio files containing Morse code I got up to 90% accuracy in classifying dits and dahs. I have not used the FANN package much, though I installed it on my development machine 1-2 years ago. What are your thoughts about FANN exactly? How would you go about using the package?
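
    For readers unfamiliar with PNNs, here is a minimal generic sketch, not the code used in those test runs. A PNN scores each class by averaging a Gaussian kernel over that class's training examples and picks the highest score. The single feature (pulse length in dot units), the training values and the sigma of 0.3 are made up for illustration.

```python
import math

def pnn_classify(x, training, sigma=0.3):
    """Classify feature vector x with a Probabilistic Neural Network.

    training: dict mapping each label to a list of example feature vectors.
    sigma: width of the Gaussian kernel placed on every example.
    """
    scores = {}
    for label, examples in training.items():
        total = 0.0
        for example in examples:
            dist2 = sum((a - b) ** 2 for a, b in zip(x, example))
            total += math.exp(-dist2 / (2.0 * sigma ** 2))
        scores[label] = total / len(examples)
    return max(scores, key=scores.get)

# Hypothetical training data: pulse lengths in dot units.
TRAINING = {
    "dit": [(0.9,), (1.0,), (1.2,)],
    "dah": [(2.7,), (3.0,), (3.4,)],
}
```

    With these examples, pnn_classify((1.3,), TRAINING) lands on "dit" and pnn_classify((2.5,), TRAINING) on "dah"; real use would add features such as the preceding gap length and signal strength.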