Planet Linux Australia
Australians’ rights and freed… dhanapalan.com/blog/2016/03/0…
Despite having a baby myself, I still wonder about this j.mp/1oOkaxm https://t.co/sdszRCwfSS
Melbourne is just the second city in the world (after Brisbane) to appoint a chief digital officer (CDO) j.mp/1L4fGx8
Mercedes kicks robots off assembly line. Automation can’t solve all problems. j.mp/1VNs8BN
Opportunities abound as new jobs are created through automation j.mp/1LmJ7u4
Australia struggles to bring equality to its indigenous population j.mp/1oDZNmy https://t.co/WDdv4vH6IB
More inventions than Edison: Artur Fischer j.mp/1oNX2iZ
Walcome to Malbourne: A curious transformation is happening to Victoria’s vowels, and it’s not going unnoticed j.mp/1oLzOK4
Rising inequality holds back economic growth: OECD report j.mp/1ohCrDL
In our research for OpenSTEM material we often find (or rediscover) that the “famous” person we all know is not the person who actually first did whatever it was. This applies to inventors, scientists, explorers.
Marco Polo was not the first to go East and hang out with the heirs of Genghis Khan, Magellan did not actually circumnavigate the world (he died on the way, in the Philippines), and so on.
In the field of science this has also happened quite often and it’s quite frustrating (to put it mildly). It’s important that the people who do the work get the credit – and particularly that other people don’t claim (or otherwise get, such as through a Nobel prize) that work as their own. That’s distinctly uncool.
Rosalind Franklin was an accomplished British chemist and X-ray crystallographer. It was her work that first showed the double-helix form of DNA. Watson & Crick (with Wilkins) ran with it (without her permission even) and they only mentioned her name in a footnote. As we all know, Watson, Crick and Wilkins received the Nobel prize for “discovering DNA”. False history.
X-ray diffraction image of the double helix structure of the DNA molecule, taken 1952 by Raymond Gosling, commonly referred to as “Photo 51”, during work by Rosalind Franklin on the structure of DNA
(Raymond Gosling/King’s College London)
While it’s not exclusively women who get a bad deal here, there are a fair number, and the research shows that this is often as a result of some very arrogant other people in their surroundings who grab and run with the work. Sexism and chauvinism have played a big role there.
An article by Katherine Handcock at A Mighty Girl provides short bios of 15 women scientists – many of whom you may never have heard of, but all of whom did critical work. She writes:
For centuries, women have made important contributions to the sciences, but in many cases, it took far too long for their discoveries to be recognized — if they were acknowledged at all. And too often, books and academic courses that explore the history of science neglect the remarkable, ground breaking women who changed the world. In fact, it’s a rare person, child or adult, who can name more than two or three female scientists from history — and, even in those instances, the same few names are usually mentioned time and again.
Read the full article at A Mighty Girl: Those Who Dared To Discover: 15 Women Scientists You Should Know
This post describes how the masking model frequency/amplitude pairs are quantised.
This work is very new. There are so many different areas to pursue. However, I decided to “release early and often” – push a first pass right through the quantisation process. That way I can write about it, publish the results, and get some feedback. This post presents a first pass including samples at 700 and 1000 bit/s.
Quantisation takes a floating point value (a real number) and represents it with a small number of bits. In this case we have 4 frequencies and 4 amplitudes: eight numbers in total that we must send over the channel. If we sent the floating point values that would be 8×32 = 256 bits/frame. With a 40ms frame update rate that is 256/0.04 = 6400 bit/s. Too high. So we need to come up with efficient quantisers to minimise the number of bits flowing over the channel, while keeping a reasonable speech quality. Speech coding is the art of asking “what can I throw away?”
There are a few tricks we can use. The dynamic range (the span between the minimum and maximum values) tends to be limited. A good way to look at the range is using a histogram of each value. I ran samples of 10 speakers through the simulations and logged the frequencies and amplitudes to generate some histograms.
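The arithmetic above, and the basic idea of a quantiser, can be sketched in a few lines of Python. This is only an illustration: the uniform quantiser, the function names, the 7.3 dB test value and the -20 to 20 dB range are all assumptions made for the example, not the quantisers used in the simulations.

```python
FLOAT_BITS = 32        # size of a single-precision float
NUM_PARAMS = 8         # 4 frequencies + 4 amplitudes
FRAME_PERIOD_S = 0.04  # 40 ms frame update rate


def bit_rate(bits_per_frame, frame_period_s=FRAME_PERIOD_S):
    """Bit rate in bit/s for a given number of bits per frame."""
    return bits_per_frame / frame_period_s


def quantise(value, lo, hi, bits):
    """Map value to the nearest of 2**bits evenly spaced levels in [lo, hi].

    Returns the level index; out-of-range values are clamped.
    """
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    index = round((value - lo) / step)
    return max(0, min(levels - 1, index))


def dequantise(index, lo, hi, bits):
    """Reconstruct the value that a level index stands for."""
    step = (hi - lo) / (2 ** bits - 1)
    return lo + index * step


raw_bits = NUM_PARAMS * FLOAT_BITS
print("unquantised:", raw_bits, "bits/frame,", bit_rate(raw_bits), "bit/s")

# Hypothetical amplitude of 7.3 dB in an assumed range of -20..20 dB:
# fewer bits means a coarser reconstruction.
for bits in (2, 3, 4, 5):
    idx = quantise(7.3, -20.0, 20.0, bits)
    print(bits, "bits -> index", idx, "decoded", dequantise(idx, -20.0, 20.0, bits))
```

Each extra bit doubles the number of levels, so the trade-off is always between bits sent and reconstruction error.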
Here are the frequencies and differences between each frequency. The frequencies were first sorted into ascending order.
Here are the amplitudes, with the mean (frame energy) removed:
Voiced speech tends to have a “low pass” spectral slope – more energy at low frequencies than high frequencies. Unvoiced speech tends to be “high pass”. As discussed in the first post the ear is not very sensitive to fixed “filtering” of speech, i.e. the absolute value of the formant amplitudes. You can have a gentle band pass filter, some high pass or low pass filtering, and it all sounds fine.
So I reasoned I could fit a straight line to the amplitudes, like this:
The first plot is the time domain speech: a frame of Mark saying “five”. The second plot shows the spectral amplitudes Am (red) and the mask (purple) we have fit to them. The mask is described by just four frequency/amplitude points. The frequencies are marked by the black crosses.
The last plot shows the four frequency/amplitude points (red), and a straight line fit to them (blue line). The error in the straight line fit to each red point is also shown.
So keeping this straight line fit in mind, let’s get back to the histograms. Here are the histograms of the amplitudes, followed by the histograms of the amplitude errors after the straight line fit:
Note how narrow the 2nd plot’s histograms are compared to the first? This makes the values “easier” to represent with a small number of bits. In statistics, we would say the variance of these variables is smaller.
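To make the straight line idea concrete, here is a rough sketch of fitting a line to four frequency/amplitude points and comparing the variance of the amplitudes with the variance of the fit errors. It assumes an ordinary least-squares fit and frequencies in kHz; the post doesn't say which fitting method or units the simulation uses, and the four points below are made up.

```python
def fit_line(freqs_khz, amps_db):
    """Least-squares straight line through (frequency, amplitude) points.

    Returns (gradient, intercept, errors), where errors[i] is the distance
    of point i above (+) or below (-) the fitted line.
    """
    n = len(freqs_khz)
    mean_f = sum(freqs_khz) / n
    mean_a = sum(amps_db) / n
    sxx = sum((f - mean_f) ** 2 for f in freqs_khz)
    sxy = sum((f - mean_f) * (a - mean_a) for f, a in zip(freqs_khz, amps_db))
    gradient = sxy / sxx
    intercept = mean_a - gradient * mean_f
    errors = [a - (gradient * f + intercept) for f, a in zip(freqs_khz, amps_db)]
    return gradient, intercept, errors


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical frame: four mask points with a "low pass" slope.
freqs = [0.6, 1.3, 2.2, 3.2]      # kHz (assumed units)
amps = [12.0, 3.0, -4.0, -11.0]   # dB, mean removed

gradient, intercept, errors = fit_line(freqs, amps)
print("gradient (dB/kHz):", gradient)
print("intercept (dB):", intercept)
print("variance of amplitudes:", variance(amps))
print("variance of fit errors:", variance(errors))
```

For a least-squares fit the error variance can never exceed the amplitude variance, which is the narrowing seen in the histograms.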
OK, but now we need to send the parameters that describe the straight line. That would be the gradient and y-intercept. Here are their histograms:
Notice the mean of the gradient is skewed to negative values? Speech contains more voiced (vowels) than unvoiced speech (consonants), and voiced speech is “low pass” (a negative gradient).
The y-intercept is a fancy way of saying the “frame energy”. It goes up and down with the level of the speech.
Quantisation of Frequencies
The frequencies are found in random order by the AbyS algorithm. We can also transmit them in any order, and reconstruct the same spectral envelope at the decoder. For convenience we sort them into ascending order. This reduces the distance between each frequency sample, and lets us delta code the frequencies. You can see above that the histograms of the last 3 delta frequencies cover about the same range.
I used 3 bits for each frequency, giving a total of 12 bits/frame.
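Sorting and delta coding might look something like the sketch below. The quantiser ranges are hypothetical (the real ones would be read off the histograms), and quantising each difference against the previously decoded frequency is a design choice of this example rather than something stated above.

```python
BITS = 3
# Hypothetical ranges in Hz; the real ones would come from the histograms.
FIRST_RANGE_HZ = (200.0, 1000.0)   # range of the lowest frequency
DELTA_RANGE_HZ = (100.0, 1500.0)   # range of each later difference


def quantise(value, lo, hi, bits=BITS):
    """Uniform quantiser returning a clamped level index."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    return max(0, min(levels - 1, round((value - lo) / step)))


def dequantise(index, lo, hi, bits=BITS):
    """Value represented by a level index."""
    return lo + index * (hi - lo) / (2 ** bits - 1)


def encode_freqs(freqs_hz):
    """Sort into ascending order, then code each step up from the last one."""
    indexes = []
    prev = 0.0  # last frequency as the decoder will reconstruct it
    for i, f in enumerate(sorted(freqs_hz)):
        lo, hi = FIRST_RANGE_HZ if i == 0 else DELTA_RANGE_HZ
        idx = quantise(f - prev, lo, hi)
        indexes.append(idx)
        prev += dequantise(idx, lo, hi)
    return indexes


def decode_freqs(indexes):
    """Rebuild the ascending frequencies by accumulating decoded deltas."""
    freqs = []
    prev = 0.0
    for i, idx in enumerate(indexes):
        lo, hi = FIRST_RANGE_HZ if i == 0 else DELTA_RANGE_HZ
        prev += dequantise(idx, lo, hi)
        freqs.append(prev)
    return freqs


found = [2500.0, 400.0, 3200.0, 1100.0]  # made-up frequencies in "found" order
codes = encode_freqs(found)
print(codes, len(codes) * BITS, "bits/frame")
print(decode_freqs(codes))
```

Tracking the decoder's reconstruction inside the encoder stops the quantisation error of one delta from piling up on the next.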
Quantisation of Amplitudes
I used the straight line fit method for the amplitudes. I used 3 bits for the gradient, and 3 bits for the errors in 5dB steps over the range of -15 to 15 dB. I assumed the y intercept would require as many bits as the frame energy (5 bits/frame) that is used for the existing Codec 2 modes.
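As a rough sketch of the error quantiser: 5dB steps from -15 to 15 dB give seven levels, which fit in a 3 bit code with one code left over. How the simulation actually maps levels to codes isn't described here, so the mapping below (index 0 for -15 dB up to index 6 for 15 dB) is an assumption.

```python
STEP_DB = 5.0
MIN_DB = -15.0
MAX_DB = 15.0
NUM_POINTS = 4      # one error per frequency/amplitude point
BITS_PER_ERROR = 3  # 7 levels used out of the 8 available codes


def quantise_error(error_db):
    """Index (0..6) of the nearest 5 dB level; out-of-range errors are clamped."""
    clamped = max(MIN_DB, min(MAX_DB, error_db))
    return round((clamped - MIN_DB) / STEP_DB)


def dequantise_error(index):
    """Error in dB represented by an index."""
    return MIN_DB + index * STEP_DB


# Made-up straight line fit errors for one frame, in dB.
errors_db = [4.0, -7.2, 1.1, 21.0]
indexes = [quantise_error(e) for e in errors_db]
print(indexes)
print([dequantise_error(i) for i in indexes])
print(NUM_POINTS * BITS_PER_ERROR, "bits/frame for the errors")
```

The last made-up value (21 dB) is outside the range and gets clamped to 15 dB, which is the overload case worth watching for.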
The quantisation work leads us to a simulation of quantised 1000 and 700 bit/s codecs with the following bit allocations. In the 700 bit/s mode, we don’t transmit the straight line fit errors.

Parameter            Bits/frame (High Rate)   Bits/frame (Low Rate)
Pitch (Wo)           7                        7
Voicing              1                        1
Energy               5                        5
Mask freqs           12                       12
Mask amp gradient    3                        3
Mask amp errors      12                       0
Bits/frame           40                       28
Frame period (s)     0.04                     0.04
Bits/s               1000                     700
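A quick way to check that the allocations add up is to total them and divide by the frame period. The dictionary keys are just labels made up for this sketch; the numbers are the ones in the bit allocation above.

```python
FRAME_PERIOD_S = 0.04  # 40 ms frames

# (high rate, low rate) bits per frame for each parameter
ALLOCATION = {
    "pitch_wo": (7, 7),
    "voicing": (1, 1),
    "energy": (5, 5),
    "mask_freqs": (12, 12),
    "mask_amp_gradient": (3, 3),
    "mask_amp_errors": (12, 0),
}


def totals(allocation):
    """Total bits/frame for the high and low rate modes."""
    high = sum(h for h, _ in allocation.values())
    low = sum(l for _, l in allocation.values())
    return high, low


high_bits, low_bits = totals(ALLOCATION)
print("high rate:", high_bits, "bits/frame,", round(high_bits / FRAME_PERIOD_S), "bit/s")
print("low rate:", low_bits, "bits/frame,", round(low_bits / FRAME_PERIOD_S), "bit/s")
```

Dropping the 12 bits of straight line fit errors is the whole difference between the two modes.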
Here are some samples of the first pass 700 and 1000 bit/s codecs. Also provided are the samples from Part 3 (the unquantised AbyS, and Codec 2 700B and 1300). The synthetic phase spectrum is derived from the decoded amplitude spectrum. “newamp” is the name for the simulations using the masking model.

Sample        700B     1300     newamp AbyS   newamp AbyS 700   newamp AbyS 1000
ve9qrp_10s    Listen   Listen   Listen        Listen            Listen
mmt1          Listen   Listen   Listen        Listen            Listen
vk5qi         Listen   Listen   Listen        Listen            Listen

I can hear some “tinkles” and “running water” type sounds on the quantised newamp samples. This could be chunks of spectrum coming and going quickly. There is also some roughness on long vowels, like “five” in the vk5qi samples.
I feel the newamp 700 samples are better than 700B, which is the direction I want to be heading. Please tell me what you think.
codec2-dev SVN revision 2716
octave:49> newamp_batch("../build_linux/src/vk5qi","../build_linux/src/vk5qi_am.out", "../build_linux/src/vk5qi_aw.out")
~/codec2-dev/build_linux/src$ ./c2sim ../../wav/vk5qi.wav --amread vk5qi_am.out --awread vk5qi_aw.out --phase0 --postfilter -o - | play -t raw -r 8000 -s -2 -
A bunch of ideas that come to mind:
- The roughness in long vowels could be frame to frame amplitude variations. It would be interesting to explore these errors, for example plotting them.
- Try delta-time approach. This might help with gentle evolution of the parameters and frame-frame noise. A gentle evolution of the slope of the straight line might sound better than the current scheme – as there will be less frame to frame noise.
- Plotting trajectories of the parameters over time would give us some more insight into quantisation, and help us determine if we can use Trellis Decoding for additional robustness.
- It may be possible to weight certain errors on the AbyS loop, for example steer it away from outlier amplitude points that have a poor straight line fit.
- Can we choose a set of amplitudes that fit exactly to a straight line but still sound OK? Can we modify the model in a way that doesn’t affect the speech quality but helps us quantise to a compact set of bits?
- Look for quantiser overload – values outside of the quantiser range.
- Take another look at what the AbyS loop is doing, track down some problem frames.
- Try a higher and lower (e.g. 3) number of frequency/amplitude points.
- Frequencies don’t have to be on the harmonic amplitude mWo grid.
- Experiment with the shape of the masking functions.
- Try Vector Quantisation.
- The parameters are all quite orthogonal, which lets us modify them independently. For example we could move the amplitudes a little to aid quantisation, but keep the frequencies the same. Compressing the frame energy will leave intelligibility unchanged.
- Get something a little better than what we have here and put it on the air and get some real tests over real conversations.
- Samples like vk5qi have a lot of low pass energy. We might be wasting bits coding that. The 4th frequency histogram has a mean of 3.2kHz. This is very high and it could be argued we don’t need this sample, or it could be coded with low resolution.
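One of the ideas above, looking for quantiser overload, is easy to sketch: count how often a logged parameter falls outside the quantiser range. The function name and the logged values are made up for illustration; the -15 to 15 dB range is the one used for the amplitude errors.

```python
def overload_fraction(values, lo, hi):
    """Fraction of values that fall outside the quantiser range [lo, hi]."""
    if not values:
        return 0.0
    overloaded = sum(1 for v in values if v < lo or v > hi)
    return overloaded / len(values)


# Made-up straight line fit errors (dB) logged over a few frames.
logged_errors_db = [-3.0, 12.5, -18.0, 4.2, 0.7, 16.1, -9.9, 2.0]

fraction = overload_fraction(logged_errors_db, -15.0, 15.0)
print("overloaded:", fraction * 100, "% of values")
```

A high fraction would suggest widening the range or spending more bits; a fraction near zero might mean the range could be narrowed.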
My 17 year old son has recently been given a driver’s licence, and has, like his sister before him, taken over my EV. Free driving (Dad pays the electricity bills) is kind of irresistible. And, just as with his sister before him, I was full of fear and angst. You see my EV is something of a prototype, and likes to be babied by its Creator. Lots of traps for the unwary. If you want an EV fit for general consumption talk to Tesla.
Sure enough, just 4 weeks later, I get “the phone call” from my son. The EV has died. Motor running but making nasty sounds and won’t move. Fortunately, it stopped a few blocks from home. I attended the scene, and all I could hear was a grating sound from the front. We pushed it home and I consulted my motor vehicle brains trust (friends Kyle and Scott) who pronounced a likely transmission failure.
I prepared to drop the motor and gearbox, something I haven’t done since I blew up the armature 6 years ago. Looking forward to the project, as doing something mechanical is a welcome change in my lifestyle. Feeling determined as well, my EV must be kept running!
I was talking about the problem on the local repeater when Gary, VK5FGRY popped up and said he might be able to help. Gary has a fully equipped workshop and years of experience with car repairs. He also has the most important resource of all – time. Gary came around this morning at 9:30am to assess the situation. He suggested we make a start, so we could at least work out what the problem was.
Within a few hours the gearbox was on the ground and the problem found – a stripped spline in the hub of the clutch plate. The noise I could hear was the stripped spline being filed away by the (somewhat harder) gearbox input shaft. Gary also discovered a few minor issues with stripped or missing gearbox mounting bolts. Note the filings on the inside of the hub, and splines mostly gone:
This was a lucky escape – I thought I was up for a new gearbox ($500 second hand). We guessed the torque of the electric motor (200Nm, about twice that of the original infernal combustion engine) had caused the fault. Well it’s been Electric for 8 years and 50,000km, so I guess I can’t complain.
We headed out and bought a new clutch plate ($100), some bolts, and transmission oil. After a nice lunch of home made bread and condiments, we picked up the tools again and by 6pm my little EV was on the road! Yayyyyyyy. Thank you so much Gary!
Here is the clutch (purple) re-assembled and attached to the electric motor, just before the gearbox was re-installed. Silver metal is the adapter plate. Lots of cables all over the place:
It was a great day. Nice change from my usual keyboard and laptop filled life. Lovely summer day, working outside with the tools, getting a bit dirty (but not oily – this is an electric car so no grease under my bonnet). So much nicer to do it with good company – especially someone as experienced as Gary.
I do live an interesting life. What did you do today Dad? “I worked with a friend to fix a home brew Electric Car!”
My EValbum page. Lots of pictures and technical stuff.
A marvellous piece of engineering using mainly wood, 2000 marbles, some Lego technic bits, and some electronics. Hand-driven. Watch and enjoy! (4m30s)
Marble Machine built and composed by Martin Molin
Video filmed and edited by Hannes Knutsson