Planet Linux Australia
I posted a message to the internal mailing lists at MariaDB Corporation. I have departed (I resigned) the company, but definitely not the community. Thank you all for the privilege of serving the large MariaDB Server community of users, all 12 million+ of you. See you on the mailing lists, IRC, and the developer meetings.
The Japanese have a saying, “leave when the cherry blossoms are full”.
I’ve been one of the earliest employees of this post-merge company, and was on the founding team of MariaDB Server, having been around since 2009. I didn’t make the first company meeting in Mallorca (August 2009) due to the chickenpox, but I’ve been to every one since.
We made the first stable MariaDB Server 5.1 release in February 2010. Our first Linux distribution release was in openSUSE. Our then tagline: MariaDB: Community Developed. Feature Enhanced. Backward Compatible.
In 2013, we had to make a decision: merge with our sister company SkySQL or take on investment of equal value to compete; the majority of us chose to work with our family.
Our big deal was releasing MariaDB Server 5.5 – Wikipedia migrated, Google wanted in, and Red Hat pushed us into the enterprise space.
Besides managing distributions and other community related activities (and in the pre-SkySQL days Rasmus and I did everything from marketing to NRE contract management, down to even doing press releases – you wear many hats when you’re in a startup of fewer than 20 people), in this time I’ve written over 220 blog posts, spoken at over 130 events (an average of 18 per year), and given over 250 talks, tutorials and keynotes in total. I’ve had numerous face-to-face meetings with customers, figuring out what NRE they may need and providing them solutions. I’ve done numerous internal presentations, with audiences ranging from the professional services & support teams to the management team. I’ve even technically reviewed many books, including one of the best introductions by our colleague, Learning MySQL & MariaDB.
It’s been a good run. Seven years. An uncountable number of flights. Too many weekends away working for the cause. A whole bunch of great meetings with many of you. I’ve seen the company go from bootstrap to merger, Series A, and Series B.
It’s been a true privilege to work with many of you. I have the utmost respect for Team MariaDB (and of course my SkySQL brethren!). I’m going to miss many of you. The good thing is that MariaDB Server is an open source project, and I’m not going to leave the project or #maria. I in fact hope to continue speaking and working on MariaDB Server.
I hope to remain connected to many of you.
Thank you for this great privilege.
Alex and another Canberran on the Razorback
Alex and I signed up for the Razorback Ultra because it is in an amazing part of the country and sounded like a fun event to go do. I was heading into it a week after Six Foot; however, this is all just training for UTA100, so why not. All I can say is every trail runner should do this event; it is amazing.
The atmosphere at the race is laid back and it is all about heading up into the mountains and enjoying yourself. I will be back for sure.
My words and photos are online in my Razorback Ultra 2016 gallery. This is truly one of the best runs in Australia.
My Mirage 730 - Matilda, having a rest while we ran around
I have fun at Geoquest and love doing the event; however, I have been a bit iffy about trying to organise a team for a few years. As many say, one of the hardest things in the event is getting 4 people to the start line ready to go.
This year my attitude was similar to last year's: if I was asked to join a team I would probably say yes. I was asked and thus ended up racing with a bunch of fun guys under the banner of Michael's company Resultz Racing. Another great weekend on the mid north NSW coast with some amazing scenery (the two rogaines were highlights, especially the punchbowl waterfall on the second one).
My words and photos are online in my Geoquest 2016 gallery. Always good fun and a nice escape from winter.
Vote Green Maybe I threw a wish in the well For a better Australia today I looked at our leaders today And now they're in our way I'll not trade my freedom for them All our dollars and cents to the rich I wasn't looking for this But now they're in our way Our democracy is squandered Broken promises Lies everywhere Hot nights Winds are blowing Freak weather events, climate change Hey I get to vote soon And this isn't crazy But here's my idea So vote Greens maybe It's hard to look at our future But here's my idea So vote Greens maybe Hey I get to vote soon And this isn't crazy But here's my idea So vote Greens maybe And all the major parties Try to shut us up But here's my idea So vote Greens maybe Liberal and Labor think they should rule I take no time saying they fail They gave us nothing at all And now they're in our way I beg for a fairer Australia At first sight our policies are real I didn't know if you read them But it's the Greens way Your vote can fix things Healthier people Childrens education Fairer policies A change is coming Where you think you're voting, Greens? Hey I get to vote soon And this isn't crazy But here's my idea So vote Greens maybe It's worth a look to a brighter future But here's my idea So vote Greens maybe Before this change in our lives I see children in detention I see humans fleeing horrors I see them locked up and mistreated Before this change in our lives I see a way to fix this And you should know that Voting Green can help fix this, Green, Green, Green... 
It's bright to look at our future But here's my idea So vote Greens maybe Hey I get to vote soon And this isn't crazy But here's my idea So vote Greens maybe And all the major parties Try to shut us up But here's my idea So vote Greens maybe Before this change in our lives I see children in detention I see humans fleeing horrors I see them locked up and mistreated Before this change in our lives I see a way to fix this And you should know that So vote Green Saturday Call Me Maybe (Carly Rae Jepsen) I threw a wish in the well Don't ask me I'll never tell I looked at you as it fell And now you're in my way I trade my soul for a wish Pennies and dimes for a kiss I wasn't looking for this But now you're in my way Your stare was holding Ripped jeans Skin was showing Hot night Wind was blowing Where you think you're going baby? Hey I just met you And this is crazy But here's my number So call me maybe It's hard to look right at you baby But here's my number So call me maybe Hey I just met you And this is crazy But here's my number So call me maybe And all the other boys Try to chase me But here's my number So call me maybe You took your time with the call I took no time with the fall You gave me nothing at all But still you're in my way I beg and borrow and steal At first sight and it's real I didn't know I would feel it But it's in my way Your stare was holding Ripped jeans Skin was showing Hot night Wind was blowing Where you think you're going baby? Hey I just met you And this is crazy But here's my number So call me maybe It's hard to look right at you baby But here's my number So call me maybe Before you came into my life I missed you so bad I missed you so bad I missed you so so bad Before you came into my life I missed you so bad And you should know that I missed you so so bad, bad, bad, bad.... 
It's hard to look right at you baby But here's my number So call me maybe Hey I just met you And this is crazy But here's my number So call me maybe And all the other boys Try to chase me But here's my number So call me maybe Before you came into my life I missed you so bad I missed you so bad I missed you so so bad Before you came into my life I missed you so bad And you should know that So call me, maybe
No reflections
None outside either
Better when full/open
Also better when closed, much brightness

For over a year I have been planning to do this. My Crumpler bag (the complete seed), which I bought in 2008, has been my primary commuting and daily use bag since that time, and as much as I love the bag there is one major problem: no reflective marking anywhere on the bag.
Some newer crumplers have reflective strips and other such features and if I really wanted to spend big I could get them to do a custom bag with whatever colours and reflective bits I can dream up. There are also a number of other brands that do a courier bag with reflective bits or even entire panels or similar that are reflective. However this is the bag I own and it is still perfectly good for daily use so no need to go buy something new.
So I got a $4 sewing kit I had sitting around in the house, some great 3M reflective tape material and finally spent the time to rectify this feature missing from the bag. After breaking 3 needles and spending a while getting it done I now have a much safer bag especially commuting home on these dark winter nights. The sewing work is a bit messy however it is functional which is all that matters to me.
Twenty-five years ago, a small band of programmers from the University of Minnesota ruled the internet. And then they didn’t.
The committee meeting where the team first presented the Gopher protocol was a disaster, “literally the worst meeting I’ve ever seen,” says Alberti. “I still remember a woman in pumps jumping up and down and shouting, ‘You can’t do that!’ ”
Among the team’s offenses: Gopher didn’t use a mainframe computer and its server-client setup empowered anyone with a PC, not a central authority. While it did everything the U (University of Minnesota) required and then some, to the committee it felt like a middle finger. “You’re not supposed to have written this!” Alberti says of the group’s reaction. “This is some lark, never do this again!” The Gopher team was forbidden from further work on the protocol.
Read the full article (a good story of Gopher and WWW history!) at https://www.minnpost.com/business/2016/08/rise-and-fall-gopher-protocol
Vicky talked about the importance of non-committing contributors, but the primary focus was on committing contributors due to time limits.
Covered the different types of drive-thru contributors and why they show up.
- Scratching an itch.
- Unwilling / Unable to find an alternative to this project
- They like you.
Why do they leave?
- Itch has been scratched.
- Not enough time.
- No longer using the project.
- Often a high barrier to contribution.
- Absence of appreciation.
- Unpleasant people.
- Inappropriate attribution.
- It takes more time to help them land patches
- Reluctance to help them "as they're not community".
It appears that many projects see community as the foundation, but Vicky contended it is contributors.
More drive-thru contributors are a sign of a healthy project and can lead to a larger community.
- Have better processes in place.
- Faster patch and release times.
- More eyes and shallower bugs
- Better community, code and project reputation.
Leads to a healthier overall project.

Methods for maximising drive-thru contributions:
- give your project super powers.
- Ensures efficient and successful contributions.
- Minimises questions.
- Standardises processes.
- Vicky provided a documentation quick start guide.
- Code review.
- "Office hours" for communication.
- New contributor events.
- Tag starter bugs
- Contributor SLA
- Use containers / VM of dev environment
- Value contributions and contributors
- Culture of documentation
- Default to assistance
Outreach!
- Gratitude
- Recognition
- Follow-up!
Institute the "No Asshole" rule.
Keith spoke about porting Python to mobile devices. CPython being written in C enables it to leverage the supported platforms of the C language and be compiled for a wide range of platforms.
There was a deep dive into the options and pitfalls when selecting a method for, and implementing, Python on Android phones.
Ouroboros is a pure Python implementation of the Python standard library.
Most of the tools discussed are at an early stage of development.

Why?
- Being able to run on new or mobile platforms addresses an existential threat.
- The threat also presents an opportunity to grow, broaden and improve Python.
- Wants Python to be a "first contact" language, like (Visual) Basic once was.
- Unlike Basic, Python also supports very complex concepts and operations.
- Presents an opportunity to encourage broader usage by otherwise passive users.
- Technical superiority is rarely enough to guarantee success.
- A breadth of technical domains is required for Python to become this choice.
- Technical problems are the easiest to solve.
- The most difficult problems are social and community and require more attention.
Keith will be putting his focus into BeeWare and related projects.
Fortune favours the prepared mind
One of my clients has an important server running ZFS. They need a filesystem that detects corruption: while regular RAID is good for the case where a disk gives read errors, it doesn’t cover the case where a disk returns bad data and claims it to be good (which I’ve witnessed in BTRFS and ZFS systems). BTRFS is good for the case of a single disk or a RAID-1 array but I believe that the RAID-5 code for BTRFS is not sufficiently tested for business use. ZFS doesn’t perform very well due to the checksums on data and metadata requiring multiple writes for a single change, which also causes more fragmentation. This isn’t a criticism of ZFS, it’s just an engineering trade-off for the data integrity features.
ZFS supports read-caching on an SSD (the L2ARC) and can put the ZIL (ZFS Intent Log) on a separate SSD log device, which speeds up synchronous writes. To get the best benefit of L2ARC and ZIL you need fast SSD storage. So now with my client investigating 10 gigabit Ethernet I have to investigate SSD.
For some time SSDs have been in the same price range as hard drives, starting at prices well below $100. Now there are some SSDs on sale for as little as $50. One issue with SATA for server use is that SATA 3.0 (which was released in 2009 and is most commonly used nowadays) is limited to 600MB/s. That isn’t nearly adequate if you want to serve files over 10 gigabit Ethernet, which has a raw line rate of 1250MB/s. SATA 3.2 was released in 2013 and supports 1969MB/s but I doubt that there’s much hardware supporting that. See the SATA Wikipedia page for more information.
Another problem with SATA is getting the devices physically installed. My client has a new Dell server that has plenty of spare PCIe slots but no spare SATA connectors or SATA power connectors. I could have removed the DVD drive (as I did for some tests before deploying the server) but that’s ugly and only gives 1 device while you need 2 devices in a RAID-1 configuration for ZIL.

M.2
M.2 is a new standard for expansion cards. It supports SATA and PCIe interfaces (and USB but that isn’t useful at this time). The Wikipedia page for M.2 is interesting to read for background knowledge but isn’t helpful if you are about to buy hardware.
The first M.2 card I bought had a SATA interface, then I was unable to find a local company that could sell a SATA M.2 host adapter. So I bought an M.2 to SATA adapter which made it work like a regular 2.5″ SATA device. That’s working well in one of my home PCs but isn’t what I wanted. Apparently systems that have an M.2 socket on the motherboard will usually take either SATA or NVMe devices.
The most important thing I learned is to buy the SSD storage device and the host adapter from the same place, so that you are entitled to a refund if they don’t work together.
The alternative to the SATA (AHCI) interface on an M.2 device is known as NVMe (Non-Volatile Memory Express), see the Wikipedia page for NVMe for details. NVMe not only gives a higher throughput but it gives more command queues and more commands per queue which should give significant performance benefits for a device with multiple banks of NVRAM. This is what you want for server use.
Eventually I got an M.2 NVMe device and a PCIe card for it. A quick test showed sustained transfer speeds of around 1500MB/s which should permit saturating a 10 gigabit Ethernet link in some situations.
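As a rough check on the throughput figures quoted in this post, the sketch below converts the raw 10 gigabit Ethernet line rate to MB/s and compares it with the SATA and NVMe numbers. It assumes the raw line rate with no Ethernet, IP or file serving protocol overhead, so the usable figure will be somewhat lower, and the function and variable names are only for illustration.

```python
# Rough comparison of storage throughput against a 10 gigabit Ethernet link.
# Assumption: raw line rate only, no framing or protocol overhead.

LINK_BITS_PER_SECOND = 10 * 10**9


def link_megabytes_per_second(bits_per_second):
    """Convert a line rate in bits/s to decimal megabytes per second."""
    return bits_per_second / 8 / 10**6


# Throughput figures in MB/s as quoted in the text above.
storage_mb_per_s = {
    "SATA 3.0": 600,
    "SATA 3.2": 1969,
    "M.2 NVMe (measured)": 1500,
}

link_rate = link_megabytes_per_second(LINK_BITS_PER_SECOND)

for name, rate in storage_mb_per_s.items():
    verdict = "can" if rate >= link_rate else "cannot"
    print(f"{name}: {rate} MB/s {verdict} fill a {link_rate:.0f} MB/s link "
          f"({rate / link_rate:.0%} of line rate)")
```

On these numbers a single SATA 3.0 device covers a bit under half of the link, while the measured NVMe speed exceeds the raw line rate, which is consistent with the "in some situations" caveat above.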
One annoyance is that the M.2 devices have a different naming convention to regular hard drives. I have devices /dev/nvme0n1 and /dev/nvme1n1; apparently that is to support multiple storage devices on one NVMe interface. Partitions have device names like /dev/nvme0n1p1 and /dev/nvme0n1p2.

Power Use
I recently upgraded my Thinkpad T420 from a 320G hard drive to a 500G SSD which made it faster but also surprisingly quieter – you never realise how noisy hard drives are until they go away. My laptop seemed to feel cooler, but that might be my imagination.
The i5-2520M CPU in my Thinkpad has a TDP of 35W but uses a lot less than that as I almost never have all 4 hardware threads (it has 2 cores with hyper-threading) in use. The z7k320 320G hard drive is listed as having 0.8W “low power idle” and 1.8W for read-write; maybe Linux wasn’t putting it in the “low power idle” mode. The Samsung 500G 850 EVO SSD is listed as taking 0.4W when idle and up to 3.5W when active (which would not be sustained for long on a laptop). If my CPU is taking an average of 10W then replacing the hard drive with a SSD might have reduced the power use of the non-screen part by 10%, but I doubt that I could notice such a small difference.
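The arithmetic behind that estimate can be sketched as follows. This is only a back-of-envelope calculation under stated assumptions: the CPU averages 10W, the disk is the only other non-screen load counted, and the SSD is idle nearly all of the time. The constant and function names are made up for this example.

```python
# Back-of-envelope estimate of the power saved by swapping the hard drive
# for an SSD, using the data sheet figures quoted above.
# Assumptions: the CPU averages 10W, the disk is the only other non-screen
# load counted, and the SSD spends nearly all of its time idle.

CPU_AVERAGE_W = 10.0
HDD_LOW_POWER_IDLE_W = 0.8
HDD_READ_WRITE_W = 1.8
SSD_IDLE_W = 0.4


def saving_fraction(old_disk_w, new_disk_w, cpu_w=CPU_AVERAGE_W):
    """Fraction of (CPU + disk) power saved by replacing the disk."""
    before = cpu_w + old_disk_w
    after = cpu_w + new_disk_w
    return (before - after) / before


# Best case for the SSD: the hard drive never reached low power idle.
best = saving_fraction(HDD_READ_WRITE_W, SSD_IDLE_W)
# Worst case: the hard drive was already in low power idle.
worst = saving_fraction(HDD_LOW_POWER_IDLE_W, SSD_IDLE_W)

print(f"saving between {worst:.1%} and {best:.1%}")
```

Under these assumptions the saving lands somewhere between roughly 4% and 12%, so the 10% figure corresponds to a hard drive that rarely dropped into its low power idle state.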
I’ve read some articles about power use on the net which can be summarised as “SSDs can draw more power than laptop hard drives but if you do the same amount of work then the SSD will be idle most of the time and not use much power”.
I wonder if the SSD being slightly thicker than the HDD it replaced has affected the airflow inside my Thinkpad.
From reading some of the reviews it seems that there are M.2 storage devices drawing over 7W! That’s going to create some cooling issues on desktop PCs but should be OK in a server. For laptop use they will hopefully release M.2 devices designed for low power consumption.

The Future
M.2 is an ideal format for laptops due to being much smaller and lighter than 2.5″ SSDs. Spinning media doesn’t belong in a modern laptop and using a SATA SSD is an ugly hack when compared to M.2 support on the motherboard.
Intel has released the X99 chipset with M.2 support (see the Wikipedia page for Intel X99) so it should be commonly available on desktops in the near future. For most desktop systems an M.2 device would provide all the storage that is needed (or 2*M.2 in a RAID-1 configuration for a workstation). That would give all the benefits of reduced noise and increased performance that regular SSDs provide, but with better performance and fewer cables inside the PC.
For a corporate desktop PC I think the ideal design would have only M.2 internal storage and no support for 3.5″ disks or a DVD drive. That would allow a design that is much smaller than a current SFF PC.
This is continuing on from my previous blog about NERSC’s Shifter which lets you safely use Docker containers in an HPC environment.
Getting Shifter to work in Slurm is pretty easy: it includes a plugin that you must install and tell Slurm about. My test config was just:

    required /usr/lib64/shifter/shifter_slurm.so shifter_config=/etc/shifter/udiRoot.conf

as I was installing by building RPMs (our preferred method is to install the plugin into our shared filesystem for the cluster so we don’t need to have it in the RAM disk of our diskless nodes). Once that is done you can add the shifter program's arguments to your Slurm batch script and then just call shifter inside it to run a process, for instance:

    #!/bin/bash
    #SBATCH -p debug
    #SBATCH --image=debian:wheezy

    shifter cat /etc/issue

results in the following on our RHEL compute nodes:

    [samuel@bruce Shifter]$ cat slurm-1734069.out
    Debian GNU/Linux 7 \n \l

simply demonstrating that it works. The advantage of using the plugin and this way of specifying the images is that the plugin will prep the container for us at the start of the batch job and keep it around until it ends so you can keep running commands in your script inside the container without the overhead of having to create/destroy it each time. If you need to run something in a different image you just pass the --image option to shifter and then it will need to set up & tear down that container, but the one you specified for your batch job is still there.
That’s great for single CPU jobs, but what about parallel applications? Well turns out that’s easy too – you just request the configuration you need and slap srun in front of the shifter command. You can even run MPI applications this way successfully. I grabbed the dispel4py/docker.openmpi Docker container with shifterimg pull dispel4py/docker.openmpi and tried its Python version of the MPI hello world program:

    #!/bin/bash
    #SBATCH -p debug
    #SBATCH --image=dispel4py/docker.openmpi
    #SBATCH --ntasks=3
    #SBATCH --tasks-per-node=1

    shifter cat /etc/issue

    srun shifter python /home/tutorial/mpi4py_benchmarks/helloworld.py

This prints the MPI rank to demonstrate that the MPI wire up was successful and I forced it to run the tasks on separate nodes and print the hostnames to show it’s communicating over a network, not via shared memory on the same node. But the output bemused me a little:

    [samuel@bruce Python]$ cat slurm-1734135.out
    Ubuntu 14.04.4 LTS \n \l

    libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
    libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
    --------------------------------------------------------------------------
    [[30199,2],0]: A high-performance Open MPI point-to-point messaging module
    was unable to find any relevant network interfaces:

    Module: OpenFabrics (openib)
      Host: bruce001

    Another transport will be used instead, although this may result in
    lower performance.
    --------------------------------------------------------------------------
    libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
    libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
    Hello, World! I am process 0 of 3 on bruce001.
    libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
    --------------------------------------------------------------------------
    [[30199,2],1]: A high-performance Open MPI point-to-point messaging module
    was unable to find any relevant network interfaces:

    Module: OpenFabrics (openib)
      Host: bruce002

    Another transport will be used instead, although this may result in
    lower performance.
    --------------------------------------------------------------------------
    Hello, World! I am process 1 of 3 on bruce002.
    libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
    --------------------------------------------------------------------------
    [[30199,2],2]: A high-performance Open MPI point-to-point messaging module
    was unable to find any relevant network interfaces:

    Module: OpenFabrics (openib)
      Host: bruce003

    Another transport will be used instead, although this may result in
    lower performance.
    --------------------------------------------------------------------------
    Hello, World! I am process 2 of 3 on bruce003.

It successfully demonstrates that it is using an Ubuntu container on 3 nodes, but the warnings are triggered because Open-MPI in Ubuntu is built with Infiniband support and it is detecting the presence of the IB cards on the host nodes. This is because Shifter is (as designed) exposing the system's /sys directory to the container. The problem is that this container doesn’t include the Mellanox user-space library needed to make use of the IB cards and so you get warnings that they aren’t working and that it will fall back to a different mechanism (in this case TCP/IP over gigabit Ethernet).
Open-MPI allows you to specify what transports to use, so adding one line to my batch script:

    export OMPI_MCA_btl=tcp,self,sm

cleans up the output a lot:

    Ubuntu 14.04.4 LTS \n \l

    Hello, World! I am process 0 of 3 on bruce001.
    Hello, World! I am process 2 of 3 on bruce003.
    Hello, World! I am process 1 of 3 on bruce002.

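The check being made by eye here, that every rank reported in and that each one ran on a different node, can also be done mechanically. The following is a minimal sketch that parses the hello world lines from the job output; the helper name and the regular expression are my own and are not part of Shifter, Slurm or the dispel4py container.

```python
import re

# Output lines copied from the cleaned-up job output above.
job_output = r"""Ubuntu 14.04.4 LTS \n \l

Hello, World! I am process 0 of 3 on bruce001.
Hello, World! I am process 2 of 3 on bruce003.
Hello, World! I am process 1 of 3 on bruce002.
"""

HELLO = re.compile(r"I am process (\d+) of (\d+) on (\S+)\.$")


def ranks_by_host(text):
    """Map each MPI rank to the hostname it reported."""
    placement = {}
    for line in text.splitlines():
        match = HELLO.search(line)
        if match:
            rank, _size, host = match.groups()
            placement[int(rank)] = host
    return placement


placement = ranks_by_host(job_output)
world_size = 3  # matches --ntasks=3 in the batch script

all_ranks_present = sorted(placement) == list(range(world_size))
one_rank_per_node = len(set(placement.values())) == world_size
print(placement, all_ranks_present, one_rank_per_node)
```

Both checks pass for the output above, which matches the request for three tasks with one task per node.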
This also raises the question – what does this do for latency? The image contains a Python version of the OSU latency testing program which uses different message sizes between 2 MPI ranks to provide a histogram of performance. Running this over TCP/IP is trivial with the dispel4py/docker.openmpi container, but of course it’s lacking the Mellanox library I need and as the whole point of Shifter is security I can’t get root access inside the container to install the package. Fortunately the author of dispel4py/docker.openmpi has their implementation published on GitHub and so I forked their repo, signed up for Docker and pushed a version which simply adds the libmlx4-1 package I needed.
Running the test over TCP/IP is simply a matter of submitting this batch script which forces it onto 2 separate nodes:

    #!/bin/bash
    #SBATCH -p debug
    #SBATCH --image=chrissamuel/docker.openmpi:latest
    #SBATCH --ntasks=2
    #SBATCH --tasks-per-node=1

    export OMPI_MCA_btl=tcp,self,sm

    srun shifter python /home/tutorial/mpi4py_benchmarks/osu_latency.py

giving these latency results:

    [samuel@bruce MPI]$ cat slurm-1734137.out
    # MPI Latency Test
    # Size [B]        Latency [us]
    0                        16.19
    1                        16.47
    2                        16.48
    4                        16.55
    8                        16.61
    16                       16.65
    32                       16.80
    64                       17.19
    128                      17.90
    256                      19.28
    512                      22.04
    1024                     27.36
    2048                     64.47
    4096                    117.28
    8192                    120.06
    16384                   145.21
    32768                   215.76
    65536                   465.22
    131072                  926.08
    262144                 1509.51
    524288                 2563.54
    1048576                5081.11
    2097152                9604.10
    4194304               18651.98

To run that same test over Infiniband I just modified the export in the batch script to force it to use IB (and thus fail if it couldn’t talk between the two nodes):

    #!/bin/bash
    #SBATCH -p debug
    #SBATCH --image=chrissamuel/docker.openmpi:latest
    #SBATCH --ntasks=2
    #SBATCH --tasks-per-node=1

    export OMPI_MCA_btl=openib,self,sm

    srun shifter python /home/tutorial/mpi4py_benchmarks/osu_latency.py

which then gave these latency numbers:

    [samuel@bruce MPI]$ cat slurm-1734138.out
    # MPI Latency Test
    # Size [B]        Latency [us]
    0                         2.52
    1                         2.71
    2                         2.72
    4                         2.72
    8                         2.74
    16                        2.76
    32                        2.73
    64                        2.90
    128                       4.03
    256                       4.23
    512                       4.53
    1024                      5.11
    2048                      6.30
    4096                      7.29
    8192                      9.43
    16384                    19.73
    32768                    29.15
    65536                    49.08
    131072                   75.19
    262144                  123.94
    524288                  218.21
    1048576                 565.15
    2097152                 811.88
    4194304                1619.22

So you can see that’s basically an order of magnitude improvement in latency using Infiniband compared to TCP/IP over gigabit Ethernet (which is what you’d expect).
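To put numbers on that comparison, here is a small sketch that divides the TCP/IP latency by the Infiniband latency for a few of the message sizes reported above. The values are copied from the two job outputs; the choice of which sizes to include and the function name are mine.

```python
# Latency in microseconds by message size in bytes, copied from the two
# job outputs above (a subset of sizes, to keep the example short).
tcp_latency_us = {0: 16.19, 1024: 27.36, 65536: 465.22, 4194304: 18651.98}
ib_latency_us = {0: 2.52, 1024: 5.11, 65536: 49.08, 4194304: 1619.22}


def speedups(slow, fast):
    """Return how many times lower the latency is for each message size."""
    return {size: slow[size] / fast[size] for size in slow}


ratios = speedups(tcp_latency_us, ib_latency_us)
for size, ratio in ratios.items():
    print(f"{size:>8} bytes: Infiniband latency is {ratio:.1f}x lower")
```

For these sizes the ratio runs from a little over 5x at 1KB to about 11.5x at 4MB, so "an order of magnitude" is a fair summary for the larger messages and slightly generous for the small ones.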
Because there’s no virtualisation going on here there is no extra penalty to pay when doing this, no need to configure any fancy device pass through, no loss of any CPU MSR access, and so I’d argue that Shifter makes Docker containers way more useful for HPC than virtualisation or even Docker itself for the majority of use cases.
Am I excited about Shifter – yup! The potential to allow users to build an application stack themselves right down to the OS libraries and (with a little careful thought) having something that could get native interconnect performance is fantastic. Throw in the complexities of dealing with conflicting dependencies between Python modules, system libraries, bioinformatics tools, etc, etc, and needing to provide simple methods for handling these and the advantages seem clear.
So the plan is to roll this out into production at VLSCI in the near future. Fingers crossed!