I’ve been coding more on my Rust SDR framework, and want to improve my ability to send/receive data packets efficiently and reliably.

There are two main ways I learn to do this better: designing a new protocol, and making the best implementation possible for an existing one. This post is about the latter.

AX.25 and APRS

First a detour, or background.

AX.25 is the standard amateur radio data protocol. It’s mostly an OSI layer 2-4 protocol, mashing the layers together into one. Contrast this with IP, which just encapsulates the next layer.

Layer 3 (IP stack equivalent: IP itself) consists of the ability to add, in addition to source and destination, a variable number of intermediate repeaters. This allows limited source routing. In APRS the repeaters are usually not named; instead, packets use “virtual” hops like WIDE1-1.

Layer 4 (IP stack equivalent: TCP and UDP) allows both connected and disconnected communication channels. In my experience connected AX.25 works better over slow simplex radio than TCP. If TCP was ever optimized for high delay low bandwidth, it’s not anymore.

For the physical layer, there are three main “modems”:

  1. 300 baud Bell 103, used on HF. Partly because until a few days ago, Americans could not use more than 300 baud on HF.

  2. 1200 baud Bell 202, used on 144.800MHz in region 1, and 144.390MHz in region 2, for APRS. It’s used by BBSs and other applications too. This is by far the most common amateur radio modem, and is often just called “packet”. Anything that supports “APRS” will support 1200 baud.

  3. 9600 baud G3RUH. This is implemented in some radios that already support 1200 baud, such as Kenwood TH-D74 and Yaesu FT5D. There are also dedicated hardware TNCs for it.

I say “baud” (symbols per second), but since all these use one bit per symbol, you can equally call it “bps”.

300 and 1200 bps both use two audio tones. They’re almost exactly as simple as you think they are. The current tone is FM modulated, and transmitted. It has a very distinct sound if you hear it over the air.

The receiver then FM demodulates as usual, and then does another demodulation, this time of the audio frequency. Because the second demodulation outputs binary, the second demodulation is usually called FSK demodulation instead of FM demodulation. So it’s FSK inside FM.

It’s worth digging into detail about how this works, to see how it’s different from 9600bps.

An FM radio can be seen as a device that takes an analog signal X as input, and the tuned frequency Y, and outputs a solid carrier at frequency Y+X.

If the analog signal X is, say, the constant positive number 6, then it outputs a solid carrier (not an FM modulated signal) on frequency Y+6. Units don’t matter for this explanation, it only matters that constant number as input means constant pure carrier at some frequency.

When the analog input is speech, which it usually is, or a sine wave (solid tone), this means that the transmitted carrier frequency Y+X is something like Y + amplitude*sin(t) (for sine wave), or Y + s[n] (for the speech sound wave s).
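To make the Y+X picture concrete, here is a minimal Python sketch of an FM modulator working at complex baseband (so Y sits at 0 Hz). The function names, the 50 ksps sample rate and the 3 kHz deviation are illustrative assumptions, not values from any particular radio.

```python
import cmath
import math

def fm_modulate(samples, sample_rate, deviation_hz):
    """Turn input values in [-1, 1] into complex baseband FM samples.

    The instantaneous frequency offset is deviation_hz * x, so the phase
    is the running sum (integral) of that offset.
    """
    phase = 0.0
    out = []
    for x in samples:
        out.append(cmath.exp(1j * phase))
        phase += 2 * math.pi * deviation_hz * x / sample_rate
    return out

def instantaneous_freq(iq, sample_rate):
    """Quadrature demodulation: phase step between neighbouring samples, in Hz."""
    return [
        cmath.phase(b * a.conjugate()) * sample_rate / (2 * math.pi)
        for a, b in zip(iq, iq[1:])
    ]

# A constant input gives a constant, pure carrier offset from Y.
iq = fm_modulate([0.5] * 100, sample_rate=50_000, deviation_hz=3_000)
freqs = instantaneous_freq(iq, 50_000)
```

With a constant input of 0.5 and 3 kHz deviation, every value in `freqs` comes out at 1500 Hz: a solid carrier, just not on Y.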

So 300/1200 baud modems are:

[ 0 or 1 ]
    |
    V
[ audio frequency A or B ]
    |
    V
[ FM modulator ]
    |
    V
[signal around Y Hz]
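The first arrow in that diagram can be sketched in a few lines of Python. Bell 202 uses 1200 Hz and 2200 Hz as its two tones. The 48 ksps sample rate and the function name are assumptions made for this example, and the sketch maps bits straight to tones, skipping the NRZI encoding that a real AX.25 transmitter applies first.

```python
import math

MARK_HZ = 1200.0   # Bell 202 "mark" tone
SPACE_HZ = 2200.0  # Bell 202 "space" tone

def afsk_audio(bits, baud=1200, sample_rate=48_000):
    """Return audio samples for a bit sequence, keeping the phase continuous.

    Sketch only: maps 1 -> mark and 0 -> space directly. A real AX.25
    transmitter NRZI-encodes the bits first.
    """
    samples_per_bit = sample_rate // baud
    phase = 0.0
    out = []
    for bit in bits:
        tone = MARK_HZ if bit else SPACE_HZ
        for _ in range(samples_per_bit):
            out.append(math.sin(phase))
            phase += 2 * math.pi * tone / sample_rate
    return out

audio = afsk_audio([1, 0, 1, 1, 0])
```

The resulting samples are ordinary audio, which is why this kind of modem works through any radio's microphone input.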

9600 bps is different. It’s direct FSK over the air. This can’t be sent as audio to most radios, since there is no audio spectrum in the first place.

[ 0 or 1 ]
    |
    V
[ FM modulator ]
    |
    V
[signal either at Y+X or Y-X Hz]

Of course it has to be filtered a bit before going out, since instant transitions between two frequencies would splatter too much, but that’s a detail not worth getting into further here.

This may explain why 9600 baud is much more rare. You have to trick the radio into accepting an input signal that is definitely not audio, and treat it as audio, without filtering away what looks like high frequencies. The input doesn’t have “frequencies”; it’s not audio!

For example, if you send a bunch of +X values in a row to the radio, the radio will just “think” that the microphone has a different reference to ground, and it’ll treat it as 0. I say “think”, but in hardware this could just be a capacitor, which will filter this offset. (I’m glossing over details)
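A rough way to see the capacitor effect is to model the audio input as a one-pole DC blocker. This is a simplified stand-in for a real radio's audio path, and the `alpha` value is an arbitrary assumption.

```python
def dc_block(samples, alpha=0.99):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + alpha * y[n-1]."""
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + alpha * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out

# A long run of the same symbol level (+1.0) decays towards zero.
filtered = dc_block([1.0] * 1000)
```

The first output sample is still 1.0, but by the end of the run the value has decayed to nearly nothing, so a long run of identical symbols would be lost.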

Put another way: Just because something is values that can be put inside a .wav file, that doesn’t make it audio.

Some people do hardware modifications to their radios to make them accept this direct binary input to the FM modulator. Most of the documentation out there about 9600 packet radio assumes you’ll be doing that, which makes this a bit confusing.

This is not what interests me. Either I’ll use a radio with 9600 built in, or I’ll use a software defined radio which already gives me full spectrum access.

The lovely Kenwood TH-D74 provides a whole 9600 baud modem (TNC), so that you can send whole AX.25 frames and it’ll modulate and transmit them for you. No modding or fake analog signal trickery needed.

My goal today

I want to improve my 9600bps receiver code. In order to know when it gets better, I have to have sample inputs. For 1200bps there’s a standard CD with over a thousand packets captured over the air, from various radios.

Even if this CD didn’t exist, it’d be easy to create one. Just ask someone in San Francisco to record 144.390MHz for a bit. It was basically fully saturated with various beacons, when I was last there.

For 9600, there’s nothing. Nothing that I could find, at least.

WB2OSZ has a great doc on demodulating 9600bps, where he seems to have started a recording at home, and then driven around beaconing data.

So let’s do that.

  • Step 1: Record data.
  • Step 2: compare and iterate on improving implementations.

Transmitter

First I generated some packets, using my ax25ms project:

mkdir data
for i in $(seq -w 10000); do
  ~/ax25ms/generate M0THC-1 aprs msg M0THC-2 "Decode test 2023-11-18 seq $i" > data/$i.bin
done
mkdir kiss
for f in data/*; do
  ./kiss_encode.py < "$f" > "kiss/$(basename "$f")"
done

Another detour: The kiss_encode.py script was, except for the last line, all written by GPT-4. Not like I can’t write it, or haven’t written it before. But you know what’s faster than looking up the constants and writing it? Not doing it.
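For reference, KISS framing itself is small. The following is a sketch of the framing rules, not the actual kiss_encode.py script (which isn't shown here). The function name and the default port are assumptions.

```python
FEND = 0xC0   # frame delimiter
FESC = 0xDB   # escape byte
TFEND = 0xDC  # escaped FEND
TFESC = 0xDD  # escaped FESC

def kiss_encode(frame: bytes, port: int = 0) -> bytes:
    """Wrap one AX.25 frame as a KISS data frame for the given TNC port."""
    out = bytearray([FEND, (port << 4) | 0x00])  # low nibble 0 = data frame
    for b in frame:
        if b == FEND:
            out += bytes([FESC, TFEND])
        elif b == FESC:
            out += bytes([FESC, TFESC])
        else:
            out.append(b)
    out.append(FEND)
    return bytes(out)

encoded = kiss_encode(b"\x01\xc0\x02\xdb\x03")
```

Here `encoded` is `c0 00 01 db dc 02 db dd 03 c0`: the two special bytes in the payload are escaped, and the frame is delimited by FEND on both sides.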

Now I have the data ready to send to a modem that takes packets using KISS. As you may guess by now, I used the D74.

To send the packets I put a raspberry pi zero and a battery pack in my backpack, and sent packets to the radio via bluetooth.

Photo of radio, battery pack, and raspberry pi zero

Yet another detour: when in the field, the easiest way to SSH to a raspberry pi is to SSH over bluetooth. I have it set up on all my raspberry pies.

To connect to the radio and send packets I started two loops (run separately, since the first one never exits), in case bluetooth decided to be a problem:

# Keep reconnecting if it disconnects.
while true; do
  sudo rfcomm connect /dev/rfcomm0 24:71:89:XX:XX:XX 2
  sleep 2
done
# Send packets every 10 seconds.
for i in kiss/*.bin; do
  cat "$i" > /dev/rfcomm0
  sleep 10
done

I arbitrarily chose 144.390MHz to send on, because I didn’t want everyone on 144.800MHz to interfere with my test. I also made sure to put the volume up, so I could hear anyone asking me to stop.

Then, because I don’t live in a car dependent dystopia, I took a walk.

Receiver

I set up a simple GNU Radio receiver using a USRP B200 connected to a Diamond VX30 on my roof, filtering/downsampling it down to 50ksps.

With SDRs, remember to avoid tuning exactly onto the signal you’re interested in. The B200 is very good in this regard, but it’s still better to tune a bit off frequency, and then frequency translate in software.
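Frequency translation in software is just a multiplication by a complex exponential. Here is a minimal sketch, assuming the SDR is tuned 100 kHz below the signal and samples at 1 Msps. Both numbers and the function name are made up for the example, and a real receiver would follow this with a low-pass filter and decimation.

```python
import cmath
import math

def frequency_translate(iq, offset_hz, sample_rate):
    """Shift a complex baseband signal down by offset_hz."""
    step = -2 * math.pi * offset_hz / sample_rate
    return [s * cmath.exp(1j * step * n) for n, s in enumerate(iq)]

# Hypothetical numbers: SDR tuned 100 kHz below the signal, sampling at 1 Msps.
SAMPLE_RATE = 1_000_000
OFFSET_HZ = 100_000
carrier = [
    cmath.exp(2j * math.pi * OFFSET_HZ * n / SAMPLE_RATE) for n in range(1000)
]
centered = frequency_translate(carrier, OFFSET_HZ, SAMPLE_RATE)
```

After translation the test carrier sits at 0 Hz (every sample is 1+0j, within rounding), while anything the hardware leaves at the tuned centre frequency ends up 100 kHz away, where the filter can remove it.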

Problems capturing

Did I mention Bluetooth sucks?

The bluetooth randomly disconnected sometimes. And the radio sometimes rebooted. And sometimes the radio appeared fine, but ignored all requests for transmissions.

So while walking around I had to keep an eye on the radio, making sure the TX lamp lit approximately every 10 seconds. If it didn’t, then I restarted the radio and waited for it to start working again.

A perfect recording would have a packet exactly every 10 seconds, to make it easier to zoom in where there should be a packet, and see why it’s not decoding. Maybe another time…

Captured data

I created two big captures. One 51 minutes, one 38 minutes. The latter suffers less from packets failing to send (since I’d spotted the problem), so it’s the more interesting one.

I don’t know how many packets are actually decodable. Since I tried to send every 10s it’s no more than 306 and 233, respectively. But because of the aforementioned problems, it’s likely far fewer.

$ ls -l aprs_capture_test*c32
-rw-r--r-- 1 thomas thomas 1225467648 Nov 18 15:30 aprs_capture_test1.c32
-rw-r--r-- 1 thomas thomas  932427072 Nov 18 16:24 aprs_capture_test2.c32
$ sha1sum aprs_capture_test*
315b15a97e7d7a63205cd840802b84bc7fda07f6  aprs_capture_test1.c32
ba5f6526cdfe67938b4c000b415fc0ef32bf02a7  aprs_capture_test2.c32
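The file sizes can be cross-checked against those upper bounds. This sketch assumes that .c32 means complex samples made of two 32-bit floats (8 bytes per sample) and that the files are at the 50ksps rate mentioned earlier.

```python
BYTES_PER_SAMPLE = 8   # assumption: .c32 = complex, two 32-bit floats
SAMPLE_RATE = 50_000   # samples per second
BEACON_INTERVAL = 10   # seconds between packets

def capture_stats(size_bytes):
    """Return (duration in seconds, max packet count) for a capture file."""
    seconds = size_bytes / BYTES_PER_SAMPLE / SAMPLE_RATE
    return seconds, int(seconds // BEACON_INTERVAL)

for name, size in [("aprs_capture_test1.c32", 1225467648),
                   ("aprs_capture_test2.c32", 932427072)]:
    seconds, packets = capture_stats(size)
    print(f"{name}: {seconds / 60:.1f} min, at most {packets} packets")
```

Under those assumptions the files are about 3064 and 2331 seconds long, which gives the 306 and 233 packet ceilings. The 2331 seconds is also consistent with the "125.9 x realtime" that Direwolf reports for file 2 further down (2331 / 18.511).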

If you want the data, I created a torrent for it.

Decoders

I tested four decoders, two of which I’d written myself.

  1. GNU Radio with grsat
  2. Direwolf, the state of the art for these modems.
  3. My streamed implementation, which has a very primitive clock recovery.
  4. My Whole Packet Clock Recovery (WPCR) implementation. Check out this video on WPCR for an explanation. It’s pretty amazing.

Decoding performance

This is the performance of the four implementations, both in terms of CPU usage, and ability to decode.

These results are true as of commit 339affb506a96e9633bb28349d166b198ff72223 of RustRadio, and Direwolf dev at commit 2260df15a554131b3c24209a7ed17ed509009fec.

Summary

Sorted from best to worst decoder, ignoring CPU performance.

Method                    File1  File1 CPU time  File2  File2 CPU time
----------------------------------------------------------------------
Direwolf -F 1 -P +           78           2m40s    174            2m0s
Direwolf -F 1                77           24.6s    169           18.5s
Direwolf                     76           12.8s    169            9.8s
WPCR threshold 0.000001      73            6.2s    134            5.4s
GNU Radio with grsat         39             50s     84             37s
Streamed                     33            3.9s     32            3.0s

WPCR is surprisingly good! Its main downside is that it requires a magic value: the burst power threshold. But because this is software, we could just run many decoders, each with a different threshold. After all, running multiple decoders at once is what Direwolf does with -P +.

I’m sure the main problem with Streamed is the clock recovery. It needs to be WAY smarter. I want to write something like the one described by Andy Walls.

But now I have something to iterate on!

Details: GRSat

$ time ./grsat_decode.py > decoded1
[… wait for CPU to die down, then press enter …]

real    1m21.621s
user    0m47.201s
sys     0m2.479s
$ grep -c '0000: 82 a0 b4 60 60 62 60 9a 60 a8 90 86 40 e3 03 f0' decoded1
39
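The grep pattern is the first 16 bytes of every test packet: the AX.25 destination and source addresses, followed by the control and PID bytes. In AX.25 each callsign character is shifted left by one bit, and the seventh byte of each address carries the SSID. A small decoding sketch (the function name is made up for this example):

```python
def decode_address(field: bytes):
    """Decode one 7-byte AX.25 address field into (callsign, ssid, is_last)."""
    callsign = "".join(chr(b >> 1) for b in field[:6]).rstrip()
    ssid = (field[6] >> 1) & 0x0F
    is_last = bool(field[6] & 0x01)  # low bit marks the final address
    return callsign, ssid, is_last

header = bytes.fromhex("82 a0 b4 60 60 62 60 9a 60 a8 90 86 40 e3 03 f0")
dest = decode_address(header[0:7])
src = decode_address(header[7:14])
control, pid = header[14], header[15]
```

This decodes to destination APZ001, source M0THC with SSID 1 (the M0THC-1 used when generating the packets), control 0x03 (UI frame) and PID 0xf0 (no layer 3 protocol).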

Details: WPCR

Most of the time goes to juggling PDUs. The initial FftFilter helps marginally (adds two decodes for file2).

$ cargo build \
    -F fast-math \
    --release \
    --example ax25-9600-rx \
    --example ax25-9600-wpcr
$ time ./target/release/examples/ax25-9600-wpcr \
    -o packets \
    -r aprs_capture_test1.c32 \
    -v 2 \
    --sample_rate 50000 \
    --threshold 0.000001
[…]
Block name                Seconds  Percent
------------------------------------------
FileSource                  0.510    8.26%
FftFilter                   1.060   17.15%
RationalResampler           0.278    4.49%
Tee                         0.084    1.37%
ComplexToMag2               0.048    0.78%
SinglePoleIIRFilter<T>      0.313    5.06%
QuadratureDemod             0.398    6.43%
Burst Tagger                0.467    7.56%
StreamToPdu                 2.807   45.42%
Midpointer                  0.070    1.13%
WPCR                        0.087    1.41%
VecToStream                 0.002    0.03%
BinarySlicer                0.002    0.03%
NrziDecode                  0.001    0.02%
Descrambler                 0.001    0.02%
HDLC Deframer               0.004    0.06%
PDU Writer                  0.048    0.77%
------------------------------------------
All blocks                  6.181   99.96%
Non-block time              0.002    0.04%
Elapsed seconds             6.183  100.00%

2023-11-18T19:23:17+00:00 - INFO - HDLC Deframer: Decoded 75 (incl 0 bitfixes), CRC error 1
2023-11-18T19:23:17+00:00 - INFO - PDU Writer: wrote 75

real    0m6.188s
user    0m5.837s
sys     0m0.312s

(Then some manual confirmation of correct decodes, which is why it’s 73 and not 75 in the summary. Similarly below, for 33 instead of 42.)
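The NrziDecode and Descrambler blocks at the end of that list are tiny. Here is a sketch of what such blocks do, written from the general definitions (AX.25 NRZI, and the G3RUH scrambler polynomial 1 + x^12 + x^17) rather than from the RustRadio source. The encode and scramble functions are only there to give the round trip something to undo, mirroring the receive order in the block list.

```python
def nrzi_decode(levels):
    """AX.25 NRZI: no change in level is a 1 bit, a change is a 0 bit."""
    prev = 0
    bits = []
    for level in levels:
        bits.append(1 if level == prev else 0)
        prev = level
    return bits

def nrzi_encode(bits):
    """Inverse of nrzi_decode(): toggle the level for every 0 bit."""
    level = 0
    out = []
    for bit in bits:
        if bit == 0:
            level ^= 1
        out.append(level)
    return out

def scramble(bits):
    """G3RUH scrambler, polynomial 1 + x^12 + x^17 (feeds back its output)."""
    history = [0] * 17  # most recent output first
    out = []
    for bit in bits:
        s = bit ^ history[11] ^ history[16]
        out.append(s)
        history = [s] + history[:-1]
    return out

def descramble(bits):
    """Inverse of scramble(): self-synchronising, uses only received bits."""
    history = [0] * 17  # most recent input first
    out = []
    for bit in bits:
        out.append(bit ^ history[11] ^ history[16])
        history = [bit] + history[:-1]
    return out

payload = [0, 1, 1, 1, 1, 1, 1, 0] * 8  # a run of HDLC flag bytes (0x7e)
on_air = nrzi_encode(scramble(payload))
recovered = descramble(nrzi_decode(on_air))
```

Both steps are a handful of XORs per bit, which fits with them barely registering in the timing table.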

Streaming

$ time ./target/release/examples/ax25-9600-rx \
    -o packets \
    -r aprs_capture_test1.c32 \
    -v 2 \
    --sample_rate 50000
[…]
Block name           Seconds  Percent
-------------------------------------
FileSource             0.545   14.10%
FftFilter              1.044   27.02%
RationalResampler      0.278    7.20%
QuadratureDemod        0.400   10.36%
ZeroCrossing           1.153   29.84%
BinarySlicer           0.008    0.21%
NrziDecode             0.004    0.11%
Descrambler            0.082    2.12%
HDLC Deframer          0.337    8.72%
PDU Writer             0.012    0.31%
-------------------------------------
All blocks             3.865   99.97%
Non-block time         0.001    0.03%
Elapsed seconds        3.866  100.00%

2023-11-18T19:28:10+00:00 - INFO - HDLC Deframer: Decoded 42 (incl 9 bitfixes), CRC error 4892
2023-11-18T19:28:10+00:00 - INFO - PDU Writer: wrote 42

real    0m3.869s
user    0m3.626s
sys     0m0.236s

Direwolf

First the input needs to be converted to wav, using iq_to_wav.grc.

Then I used atest from Direwolf:

$ src/atest -F 1 -B 9600 aprs_capture_test2.wav
[…]
169 packets decoded in 18.511 seconds.  125.9 x realtime

-F 1 means try to fix one bit error. -P + means try multiple decoders, per this doc.
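One simple way to do single-bit fixing is to flip each bit in turn and re-check the frame check sequence. The sketch below illustrates the idea and is not Direwolf's actual implementation. It uses the 16-bit CRC that AX.25 puts at the end of each frame, sent low byte first. The helper names are made up.

```python
def crc16_x25(data: bytes) -> int:
    """CRC used for the AX.25 frame check sequence (reflected 0x1021)."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF

def add_fcs(payload: bytes) -> bytes:
    """Append the FCS, low byte first."""
    return payload + crc16_x25(payload).to_bytes(2, "little")

def fcs_ok(frame: bytes) -> bool:
    """True if the last two bytes match the CRC of the rest."""
    return len(frame) > 2 and add_fcs(frame[:-2]) == frame

def fix_single_bit(frame: bytes):
    """Return the frame if valid, else try every single-bit flip; None if no fix."""
    if fcs_ok(frame):
        return frame
    for i in range(len(frame) * 8):
        candidate = bytearray(frame)
        candidate[i // 8] ^= 1 << (i % 8)
        if fcs_ok(bytes(candidate)):
            return bytes(candidate)
    return None

good = add_fcs(b"Decode test")
damaged = bytearray(good)
damaged[3] ^= 0x10  # simulate one bit error on the air
fixed = fix_single_bit(bytes(damaged))
```

The cost is one CRC per bit in the frame, and every extra candidate is another chance for noise to pass a 16-bit check, which is one reason to confirm decodes manually.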