Planet Linux Australia

Planet Linux Australia - http://planet.linux.org.au

Stewart Smith: 1 Million SQL Queries per second: GA MariaDB 10.1 on POWER8

Mon, 2015-10-19 12:26

A couple of days ago, MariaDB announced that MariaDB 10.1 is stable GA – around 19 months since the GA of MariaDB 10.0. With MariaDB 10.1 come some important scalability improvements, especially for POWER8 systems. On POWER, we’re a bit unique in that we’re on the higher end of CPUs, have many cores, and up to 8 threads per core (selectable at runtime: 1, 2, 4 or 8/core) – so a dual socket system can easily be a 160 thread machine.

Recently, we (being IBM) announced availability of a couple of new POWER8 machines – machines designed for Linux and cloud environments. They are very much OpenPower machines, and more info is available here: http://www.ibm.com/marketplace/cloud/commercial-computing/us/en-us

Combine these two together, with Axel Schwenke running some benchmarks and you get 1 Million SQL Queries per second with MariaDB 10.1 on POWER8.

Having worked a lot on both MySQL for POWER and the firmware that ships in the S882LC, I’m rather happy that 1 Million queries per second is beyond what it was in June 2014, which was a neat hack on MySQL 5.7 that showed the potential of MySQL on POWER8 but wasn’t yet a product. Now, you can run a GA release of MariaDB on GA POWER8 hardware designed for scale-out cloud environments and get 1 Million SQL queries/second (with fewer cores than my initial benchmark last year!)

What’s even more impressive is that this million queries per second is in a KVM guest!

David Rowe: All Your Modem are Belong To Us

Mon, 2015-10-19 06:30

Everywhere I look I see a need for high quality, carefully tested, open source, portable modem software. This year I built the cohpsk HF modem, which works well on fading HF channels and is the key to the low SNR performance of the FreeDV 700B mode.

In the past I’ve investigated the implementation of AFSK and GMSK over legacy FM radios and found 7dB losses in modem implementations commonly used for VHF digital radio. As if this wasn’t enough to reduce me to tears, Brady O’Brien, KC9TPA, is currently simulating 4FSK modems such as those used in DMR and C4FSK. These modems, tacked onto legacy analog FM radio architectures, also appear to be 6dB off the pace (more on the DMR modem soon).

Near Space Balloons and RTTY

Mark invited me along to a Horus launch that was about a week away. We started talking about the Horus telemetry modem. Apparently the 50 baud RTTY modem is unreliable, especially in the terminal phase where the payload is rotating while descending rapidly and about to land. Which is just when you would really like to know exactly where it is.

Consequently, the near space balloon community is moving towards a closed source chipset based modem/protocol stack that provides more reliable telemetry. It may even have a patent rattling around inside it.

Well that’s just not good enough.

Closed source and proprietary chipsets are nasty, a glaring problem in a cool geeky field that is otherwise open source. It’s got to be fixed.

Balloon chasing systems also use various hacks to connect the modem software to mapping software, which in the case of the Horus balloon chasers runs on this legacy operating system called “Windows”. Yeah I’ve never heard of it either. Used to be popular apparently, and I’m told it’s good for games. However it would be nice to do everything in Open Source.

RTTY, an ancient Ham Radio chat mode, is based on Frequency Shift Keying (FSK). Just like these fancy new Digital Voice modes like DMR and C4FSK. Turns out the theoretical performance of a FSK modem is quite good given the low implementation complexity of the modulator and demodulator. Where RTTY falls down is “frame sync”. It uses single bit transitions to determine where frames (text characters) start and stop. A bit error anywhere tends to wipe out many subsequent bits.

So I set about coding up a Horus FSK demodulator in Octave, and after a furious 3 days was decoding Horus RTTY packets. Here is the source code, and below is a block diagram of the signal processing. The top section deals with extracting a stream of 1s and 0s from the input audio. The lower section deals with framing and plucking out valid telemetry packets:

A couple of interesting features:

  1. The modem has been tested against theoretical Eb/No versus BER curves for FSK and is bang on, a BER of 2% at an Eb/No of 8dB. You can (and should) repeat these tests as part of any modem development. However I see very little formal testing of open source modems going on. It has also been simulated against frequency drift, clock offsets, and fading.
  2. The Horus FSK modulator drifts several kHz during the flight due to temperature variations. I tested this by putting a Horus payload in my freezer. Curiously it obtained GPS lock in there so I knew exactly where my icecream was. Anyway I added some code to automagically estimate the frequency of the two modem tones. Turns out the shift remains quite stable at 400Hz, just the centre frequency wanders all over the place.
  3. The fsk_horus demod tracks the baud rate of the modem. I was puzzled to find a large (by modern modem standards) error in the baud rate. The tx side seemed to be running at 50.08 rather than 50 baud (1600ppm error). Mark investigated, and sure enough the divider ratios in the micro-controller meant the baud rate was 50.08. The good news is that it’s rock solid (doesn’t drift). It’s just a bit off. Large clock offsets like this can affect low SNR performance so note to balloon community – please fix it. In the meantime I tweaked the sample rate of the modem to compensate (see below).
  4. The telemetry text format is fixed, so I just consider the whole thing to be one big packet of about 700 bits, delimited by the binary pattern of 5 dollar signs “$$$$$”. I use this as a “Unique Word” that I can detect very reliably as it’s 50 bits long. Good bye RS232 framing, hello reliable frame sync.
  5. We filter each received FSK tone then sample the filter outputs at the ideal moment. Choosing the right sampling time is called fine timing estimation. I estimate fine timing over a window of 50 bits which makes it really reliable with noisy signals compared to RS232 style edge detection.
  6. Even with a bad checksum, there is often plenty of useful information in a packet. For example the GPS coords might still be OK. So I print out all packets so we can use “human FEC” to see what’s in there.
  7. I apply a few rules to clean up the data. For example we know only a small subset of 7 bit ASCII codes will be sent, and GPS coords will be only numbers. So anything outside these ranges is marked as an “erasure”. This prevents a lot of line noise being displayed.
  8. Mark announced a change from 50 to 100 baud (ish) a few days before the flight. No matter, just a few tweaks.
  9. I patched FoxtrotGPS to load a track file via a UDP port command. I found the FoxtrotGPS code really easy to understand, well done to the authors. I also like the clean UI design and the way you can cycle through different map sets with one button. And a native app is so much faster than browser based mapping. Breath of fresh air. Patching instructions below.
  10. I wrote the modem in GNU Octave, and it runs in real time. Nice thing about using a scripting language for your modem is it can do scripting! Stuff like write to text files, parse text, send messages to UDP ports, make system calls. The glue code contains the modem now. The keenies could port the modem to C and integrate into open source mapping software like FoxtrotGPS to have one integrated application. Try that with your closed source chip set!
  11. The demod has some simple soft decision error correction. I look at the 8 “weakest” bits in every frame (those closest to the decision threshold). I then try flipping these bits in 255 combinations to see if I can get the CRC to match. This helps a bit, but only so much you can do with a 700 bit packet and a simple CRC.
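Item 11 can be sketched in a few lines. This is a rough illustration in Python rather than the Octave code in fsk_horus.m, and it rests on several assumptions that are not from the post: the soft decisions are signed floats with the threshold at zero, the checksum is a 16-bit CCITT CRC computed with Python's `binascii.crc_hqx`, and the expected CRC value is passed in by the caller. The function names are hypothetical.

```python
import binascii

def bits_to_bytes(bits):
    """Pack a list of 0/1 values (MSB first) into bytes.
    Assumes len(bits) is a multiple of 8."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

def try_bit_flips(soft, expected_crc, n_weak=8):
    """soft: one float per bit. The sign is the hard decision, the
    magnitude is the confidence. Returns the first bit list whose CRC
    matches expected_crc, or None if no combination works."""
    hard = [1 if s > 0 else 0 for s in soft]
    # Indices of the n_weak bits closest to the decision threshold (zero).
    weak = sorted(range(len(soft)), key=lambda i: abs(soft[i]))[:n_weak]
    # Pattern 0 is the unmodified frame; patterns 1..255 are the flips.
    for pattern in range(2 ** len(weak)):
        trial = list(hard)
        for k, idx in enumerate(weak):
            if (pattern >> k) & 1:
                trial[idx] ^= 1
        # CRC choice (CCITT, initial value 0xFFFF) is an assumption.
        if binascii.crc_hqx(bits_to_bytes(trial), 0xFFFF) == expected_crc:
            return trial
    return None
```

The search is cheap (at most 256 CRC evaluations per frame), but as item 11 says, a 16-bit check over a long packet limits how much this can safely recover: the more patterns you try, the more likely a wrong one passes the CRC by chance.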

Testing

I carefully tested the modem in simulated AWGN and fading channels. The fading simulated the rotation of the payload plummeting to Earth. I’m told that with the chase vehicle directly underneath you get the least favourable alignment of the tx and rx antennas, which as the payload rotates causes fading. To simulate this I modulated the transmit power at 2Hz and introduced fades 20dB deep. I was told that 2Hz was about as fast as the payload could spin.
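A minimal sketch of that fading impairment, in Python rather than Octave. The post gives only the 2Hz rate and the 20dB depth, so the raised-cosine fade shape, the 8000Hz default sample rate, and the function names are all assumptions for illustration.

```python
import math

def fading_gain(n_samples, fs=8000.0, fade_hz=2.0, depth_db=20.0):
    """Linear amplitude gain per sample: 0 dB at the peaks and
    -depth_db at the bottom of each fade (fade shape is an assumption)."""
    gains = []
    for n in range(n_samples):
        # Raised cosine between 1 (no fade) and 0 (deepest fade).
        x = 0.5 * (1.0 + math.cos(2.0 * math.pi * fade_hz * n / fs))
        gain_db = -depth_db * (1.0 - x)
        gains.append(10.0 ** (gain_db / 20.0))
    return gains

def apply_fading(samples, fs=8000.0, fade_hz=2.0, depth_db=20.0):
    """Multiply the transmit samples by the fading gain."""
    gains = fading_gain(len(samples), fs, fade_hz, depth_db)
    return [s * g for s, g in zip(samples, gains)]
```

Noise would be added after the fading, so the SNR drops by up to 20dB at the bottom of each fade while the noise floor stays constant.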

Here is the Horus payload connected to my FT-817 via 120dB worth of attenuators, although I really need 150dB to get close to the MDS of the system:

I also tested the entire system around my suburb, e.g. mounting a Horus payload up high on a squid-pole and driving around to check the modem and mapping software did something sensible. This also let me make a check list of equipment I would need on the day. Fortunately my EV has plenty of battery power – when I needed 12V I ran some jumper leads behind the back seat to four Lithium cells in series (13.4V).

Real Time DSP in GNU Octave

The modem runs in GNU Octave in real time (a few % CPU load on one core of an old laptop). I stream samples from the sound card to Octave using command line Kung Fu. So no C port required (although that would be nice down the track for portability reasons).

This is a bit surprising – real time DSP in a scripting language.

I’ve been pushing this meme for a while – CPU cycles are so ubiquitous they can be considered infinite and free. We are wearing enough CPU for many DSP applications. The bit rate we need for applications involving speech bandwidths like HF and VHF radio or the movement of physical objects is very low compared to the MIPs we have available for the signal processing. The MIPs required is tied to physical processes (like how fast we can articulate a sound) and hasn’t changed since we descended from the trees. But the CPUs just keep getting faster.

So the hardware required for the physical implementation of the DSP (e.g. a few % of your laptop or phone) is effectively free. We just need good quality free DSP software to go along with it.

There is no need for hardware, FPGAs, custom DSP devices, or magic chip sets any more. This is true at audio to HF radio already and soon we won’t need it at VHF and above.

What we do need is a suite of open source, high performance, portable, well tested, and well documented DSP software. Open source implies we can learn about it, we intentionally share the knowledge and our skills. Demystifying DSP. Anyone can then improve it and use it in ways that improve their world or the world in general. Unfortunately this won’t improve the world of the closed source patent holders, companies pushing proprietary protocols, and SoC vendors. They won’t like this. Boo Hoo. We don’t need them anymore. We can do a better job than they can. The value is drifting out of the proprietary space. It always does.

Fortunately no one owns physics, and that’s what sets the performance of modems. You can’t seek rent on physics. All your modems are belong to us.

Octave Code

The file fsk_horus.m contains the modem code, and code to test the modem. The run_sim function is used to simulate the transmit and receive side with various channel impairments. It also generates a raw file of samples that can be used as input to the demod_file() function or other demodulators such as dl-digi.

The fsk_horus_stream.m script runs the demodulator in real time, accepting samples from stdin:



~/codec2-dev/octave$ rec -t raw -r 8013 -s -2 -c 1 - -q | ./fsk_horus_stream.m



It prints out the packets, extracts lat/long coords, writes to a file, and tells FoxtrotGPS to plot them.

Testing on a Real Balloon Flight

On Sunday Oct 18 we tested the system by chasing a Horus launch from Mt Barker to near Palmer. Here is a screen shot from FoxtrotGPS, balloon track in green, car in red (click for a larger map):

I used a small antenna on a mag mount base, connected to a FT-817 tuned to the payload RTTY signal. This was connected to my laptop running the fsk_horus_stream.m script. I used my phone running BlueNMEA as the GPS source for the car, connected to my laptop by USB in Android debug mode:



adb forward tcp:4352 tcp:4352

gpsd -N -n -D5 tcp://localhost:4352

Here is a block diagram of how it all fit together:

We had a very pleasant day in the spring sunshine and it became quite exciting as we tracked the payload back and forth. The last packet before we lost the signal was 213m from the ground:



VK5ARG-1,1555,01:46:53,-34.890294,139.178262,705,8,1529,08 CRC OK        

VK5ARG-1,1556,01:47:01,-34.890531,139.178020,655,10,1530,08 CRC OK        

VK5ARG-1,1557,01:47:09,-34.890896,139.177668,601,8,1530,08 CRC OK        

VK5ARG-1,1558,01:47:13,-34.891132,139.177197,551,9,1531,08 CRC OK        

VK5ARG-1,1559,01:47:21,-34.891275,139.177033,516,10,1532,08 CRC OK        

VK5ARG-1,1560,01:47:29,-34.891644,139.176765,457,10,1533,08 CRC OK        

VK5ARG-1,1561,01:47:37,-34.892021,139.176512,402,7,1534,08 CRC OK        

VK5ARG-1,1562,01:47:41,-34.892331,139.176160,348,8,1534,08 CRC BAD        

VK5ARG-1,1563,01:47:49,-34.892495,139.176033,315,8,1535,08 CRC OK        

VK5ARG-1,1564,01:47:57,-34.892787,139.175680,267,9,1536,08 CRC OK        

VK5ARG-1,1565,01:48:01,-34.892958,139.175429,213,10,1537,08 CRC OK
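For anyone wanting to post-process a log of these lines, here is a small parsing sketch in Python. The field meanings are partly inferred: the sixth field is taken to be altitude in metres because the last packet reads 213, the fields after it are left uninterpreted, and the `parse_packet` name and dictionary keys are hypothetical.

```python
def parse_packet(line):
    """Split one decoded telemetry line into fields.
    Returns a dict, or None if the line is too garbled to parse."""
    text = line.strip()
    crc_ok = None
    if " CRC " in text:
        text, status = text.split(" CRC ", 1)
        # "OK" and "OK fixed" both count as a passing checksum.
        crc_ok = status.strip().startswith("OK")
    fields = text.split(",")
    if len(fields) < 6:
        return None
    try:
        return {
            "callsign": fields[0],
            "sequence": int(fields[1]),
            "time": fields[2],
            "lat": float(fields[3]),
            "lon": float(fields[4]),
            "alt_m": int(fields[5]),  # assumption: altitude in metres
            "extra": fields[6:],      # meaning not stated in the post
            "crc_ok": crc_ok,
        }
    except ValueError:
        # A corrupted numeric field, e.g. "138.543)66".
        return None
```

Returning None for lines with damaged numeric fields keeps bad coordinates off the map, at the cost of discarding packets a human could still read.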

The GPS coordinates were a few hundred metres from where the payload was found. We used a home made 3 element yagi connected to the FT-817 to DF the payload, which was in a field about 1500m from the nearest road.

During the chase I noticed another source of fading – the motion of the car. Judging by ear this was much faster than 2Hz. Really need some FEC to handle that.

The modem software worked well, and integrated nicely with FoxtrotGPS. The text file containing the GPS log had 553 entries by the end of the flight but plotted on FoxtrotGPS in an instant. I also enjoyed using FoxtrotGPS to navigate, and was pleasantly surprised to find that OpenStreetMap nicely covered the dusty unsealed roads we travelled on.

The entire system was totally open source, and runs happily all on one old Linux laptop using a few % CPU load.

Future Work

Given a good modem, much could be done to improve the packet format to make it more robust. I’m ignoring all the RTTY start/stop bits, so they can go. Use a short packet (maybe 100 bits?) to reduce the probability of packet errors and get fast updates. Add some FEC, and interleaving. I am sure we could decode packets at very low SNRs. The C code changes to the current payload uC software would be modest. No hardware changes would be required.

I need to move on to other projects now but am happy to assist anyone who wants to work on an improved, open source balloon tracking packet protocol.

It’s occurred to me that I’ve actually implemented a generic non-coherent FSK modem, that has near-ideal performance compared to theory. This could be used for VHF digital voice, HF text chat modes, or telemetry. Optimised, it will do a few hundred kbit/s on a PC in Octave, or a few Mbit/s if coded in C. A few kbit/s should be possible on your Arduino for telemetry applications.
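To put a number on “theory”: the textbook bit error rate for binary non-coherent FSK with orthogonal tones is 0.5*exp(-Eb/No/2), with Eb/No as a linear ratio. A short Python sketch (the function name is hypothetical, and the formula assumes orthogonal tones and an AWGN channel):

```python
import math

def ncfsk_ber(ebno_db):
    """Theoretical BER of binary non-coherent orthogonal FSK in AWGN."""
    ebno = 10.0 ** (ebno_db / 10.0)  # dB to linear ratio
    return 0.5 * math.exp(-ebno / 2.0)

if __name__ == "__main__":
    for ebno_db in (4, 6, 8, 10, 12):
        print("Eb/No %2d dB  BER %.5f" % (ebno_db, ncfsk_ber(ebno_db)))
```

At an Eb/No of 8dB this gives a BER of about 0.021, which lines up with the 2% figure measured for the modem.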

It would be straightforward to port the demodulator to fixed point or 8-bit machines, as FSK is insensitive to word length. FSK just needs the “sign” bit (i.e. a fully limited signal). This post on fixed point Goertzel shows you how – very similar DFT code to the horus demod.

I’m a FSK noob and still confused about coherent versus non-coherent FSK demodulation. This code uses an “integrate and dump” approach, but it could also be interpreted as a filter. I am not phase locked to the tx signal. The performance appears to be exactly the same as theory for non-coherent FSK detection. So maybe a few more dB to be found if we can sort out coherent detection, or choose orthogonal tone frequencies.
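For readers who have not met “integrate and dump” before, here is a bare-bones Python sketch of the idea. It is not a port of fsk_horus.m: it assumes symbol timing and both tone frequencies are already known, and it skips the frequency and fine timing estimation that the real demod does. The function names and the tone frequencies used in any example call are hypothetical.

```python
import cmath
import math

def fsk_mod(bits, fs, rs, f_low, f_high):
    """Continuous-phase 2FSK modulator, only here to exercise the demod.
    Assumes fs/rs is (close to) an integer number of samples per symbol."""
    n_sym = int(round(fs / rs))
    phase = 0.0
    out = []
    for b in bits:
        f = f_high if b else f_low
        for _ in range(n_sym):
            out.append(math.cos(phase))
            phase += 2.0 * math.pi * f / fs
    return out

def fsk_demod(samples, fs, rs, f_low, f_high):
    """Non-coherent integrate-and-dump detector. For each symbol, mix
    down by each tone, integrate over the symbol, and compare magnitudes.
    Taking the magnitude discards the carrier phase, so no phase lock
    to the transmitter is needed."""
    n_sym = int(round(fs / rs))
    bits, soft = [], []
    for start in range(0, len(samples) - n_sym + 1, n_sym):
        acc_low = 0j
        acc_high = 0j
        for k in range(n_sym):
            n = start + k
            acc_low += samples[n] * cmath.exp(-2j * math.pi * f_low * n / fs)
            acc_high += samples[n] * cmath.exp(-2j * math.pi * f_high * n / fs)
        metric = abs(acc_high) - abs(acc_low)  # soft decision
        soft.append(metric)
        bits.append(1 if metric > 0 else 0)
    return bits, soft
```

For example, `fsk_demod(fsk_mod(tx, 8000.0, 100.0, 1200.0, 1600.0), 8000.0, 100.0, 1200.0, 1600.0)` recovers `tx` on a noise-free channel. The 400Hz shift matches the payload; the absolute tone frequencies are made up. The `metric` values are the kind of soft decision that bit-flipping error correction can rank by magnitude.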

Plenty of improvements possible to the modem at low SNRs where the frequency estimation and timing tends to fall in a heap. For example a wider frequency estimation window.

More testing and review by others would be useful. All this work has been done in 1 week, so was a bit of a rush job. Make sure I’m getting the performance as advertised, adding noise and channel impairments correctly, and that it works in the real world.

Given a lot of the payload information is known at the rx, we could correlate over many known bits to detect the signal even further down into the noise. We could also come up with a “fox hunt” mode to detect the signal far down into the noise when it’s on the ground and severely attenuated.
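A minimal Python sketch of correlating against known bits, using the 50-bit “$$$$$” unique word as the known sequence. The framing of each character (one start bit, 7 data bits sent LSB first, two stop bits) is an assumption chosen to give 10 bits per character; the error threshold and function names are hypothetical too.

```python
def uw_bits(text="$$$$$"):
    """Expand text into RS232-style framed bits (framing is an assumption):
    start bit 0, 7 data bits LSB first, two stop bits 1."""
    bits = []
    for ch in text:
        code = ord(ch)
        bits.append(0)
        bits.extend((code >> k) & 1 for k in range(7))
        bits.extend([1, 1])
    return bits

def find_unique_word(rx_bits, uw, max_errors=2):
    """Slide the known pattern along the received bits and report every
    offset where at most max_errors bits disagree."""
    hits = []
    for start in range(len(rx_bits) - len(uw) + 1):
        window = rx_bits[start:start + len(uw)]
        errors = sum(1 for a, b in zip(window, uw) if a != b)
        if errors <= max_errors:
            hits.append((start, errors))
    return hits
```

The longer the known pattern, the more bit errors can be tolerated before false detections become likely, which is why correlating over more of the fixed packet fields should reach further into the noise than the unique word alone.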

This work has given me some ideas for FreeDV. A small parity check plus the knowledge of transitions between states could be a powerful addition to the trellis decoding work. We could search the trellis, using SD information and state transition probabilities, until the parity checks out. The parity expression could also be formulated in terms of soft decision bits, as the parity bits may be in error. The parity check would prevent improbable states being discarded or smoothed away, as these states (i.e. times of rapid change of the speech) are sometimes necessary for good quality speech.

Building and Patching FoxtrotGPS

I followed the FoxtrotGPS build instructions. On my Ubuntu 14 system I needed these packages:



$ sudo apt-get install intltool libglade2-dev gconf-dev libgconf2-dev libsqlite3-dev libexif-dev libgps-dev



Here is a FoxtrotGPS 1.2.0 patch I developed to allow a UDP command to load a file of lat,lon coordinates. It can be tested using netcat:



echo -n "/home/david/Desktop/test.log" | nc -u -q1 127.0.0.1 21234
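The same command can be sent from a script. Here is a Python sketch equivalent to the netcat line, assuming the patched FoxtrotGPS listens on UDP port 21234 as above and treats the datagram payload as a file path. The `send_track_file` name is hypothetical.

```python
import socket

def send_track_file(path, host="127.0.0.1", port=21234):
    """Send the path of a lat,lon log file as a single UDP datagram.
    Returns the number of bytes sent."""
    payload = path.encode("utf-8")  # no trailing newline, like echo -n
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))
```

UDP is fire-and-forget, so the call succeeds even if FoxtrotGPS is not running, which suits a real-time demod that should never block on the mapping software.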



Here is how I start FoxtrotGPS when messing with the source (i.e. before make install):



foxtrotgps-1.2.0$ ./src/foxtrotgps --gui data/foxtrotgps.glade



I used this line to extract the patch:



diff -uN --exclude="Makefile" foxtrotgps-1.2.0-orig/src/ foxtrotgps-1.2.0/src/ > foxtrotgps_udp.patch

Adjusting Sample Rate for ppm

Modems work better when the clock offsets are small. So I adjusted the sample rate to correct the baud rate offset. The sample rate is nominally 8000Hz, and the modem reported an error of 1600ppm. So I adjust the sample rate by 8000*1600*1E-6 = 12.8Hz (8013Hz total) to remove most of the baud rate error.
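The arithmetic as a small Python sketch, in case the baud rate changes again (function names are hypothetical):

```python
def ppm_error(measured, nominal):
    """Clock offset in parts per million."""
    return (measured - nominal) / nominal * 1e6

def corrected_sample_rate(nominal_hz, ppm):
    """Sample rate that cancels a transmit clock running ppm fast."""
    return nominal_hz * (1.0 + ppm * 1e-6)

if __name__ == "__main__":
    ppm = ppm_error(50.08, 50.0)            # about 1600 ppm
    fs = corrected_sample_rate(8000.0, ppm)  # about 8012.8 Hz
    print(round(ppm), round(fs))
```

Rounding 8012.8Hz up to 8013Hz leaves a residual offset of roughly 25ppm, small enough for the demod's own baud rate tracking to absorb.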

Testing with dl-digi

I sampled a strong off-air signal:



~/codec2-dev/octave$ rec -t raw -r 8013 -s -2 -c 1 horus.raw -q trim 0 60

I then used some channel simulation software to add enough noise to produce about 50% packet errors:



~/codec2-dev/octave$ ../build_linux/src/cohpsk_ch horus.raw - -25 0 0 1 | ./fsk_horus_stream.m

 

VK5A G-3,178,21:56*31,-34.874979,138.543949,6,6,1&77,08 CRC BAD        

VC5ARG-3,179,21:56:35(-14.874919,138.543)66,8,7,16 7,08 CRC BAD        

VK5ARG-3,180,21:56:43,-34.874927,138.543918,8,7,1676,08 CRC OK        

VK5ARG-3,181,21:56:51,-34. 74998,138.5 3896,9,7,1676,08 CRC BAD        

V 5ARG-3,082,21:56:15,-34.875002,130.543870,15,?,1676,08 CRC BAD        

VK ARG-3,183 21:57:03,-34.875025,138.543839(17,7,1676,08 CRC BAD        

VK5ARG-3,184,21:57:11,-34.875056,138.543804,20,7,1675,08 CRC OK fixed        

VK5ARG-3,185,21:57:15,-34.875022,138.543832,25,7,1675,08 CRC OK fixed        

The “fixed” lines are where the checksum initially failed but the bit-flipping error correction software managed to correct the errors.

Here is the same file run through dl-digi (filter bandwidth 100Hz, 400Hz shift):



$$$$$VK5ARG-3V179,2!:56:6-#C;#KKCk#33C;3;*Dy\7

$$$$$VK5AR-3,1<8,21:u6:43,-14.874)27,1#8.543918,8,3;3CC3C$R4$VK5ARG-3,18!,21

8=651,-34.x74998,138.0t3896.9,7,167,08*D12E

$$$$$Vk5ARG-3,080,21:56:55,-34.875002130.74#870,15,?,1[CF4D

$$$$$VKwARG-3,183l21:%703,-34.875025,138.543839,17,7,1676,08*y483

$$$$$VK5ARG-3,1#3+7S3;++3C+#C#;3;+CC0T

4$,$$VK5ARG-3,185,21:57;15,-4.87522,118.54383x,r5,7,1670,08*DFD3

$$$$VK5ARF-3,186,21:57:23,-4.875022<130.543800,28,7,1675,0

Sridhar Dhanapalan: Twitter posts: 2015-10-12 to 2015-10-18

Mon, 2015-10-19 01:27