Planet Linux Australia


Linux Users of Victoria (LUV) Announce: LUV Beginners November Meeting: Website working group

Thu, 2016-11-17 03:03
Start: Nov 19 2016 12:30
End: Nov 19 2016 16:30
Location: Infoxchange, 33 Elizabeth St. Richmond

Website working group

We will be planning modernisation and upgrading of the LUV website. This will include looking at what features are used and what technologies are suitable for the future.

There will also be the usual casual hands-on workshop, Linux installation, configuration and assistance and advice. Bring your laptop if you need help with a particular issue. This will now occur BEFORE the talks from 12:30 to 14:00. The talks will commence at 14:00 (2pm) so there is time for people to have lunch nearby.

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121 (enter via the garage on Jonas St.)

Late arrivals, please call (0490) 049 589 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria Inc. is an incorporated association, registration number A0040056C.

November 19, 2016 - 12:30

Jeremy Kerr: OpenPOWER console implementations

Fri, 2016-11-11 23:01

I often have folks asking how the text & video consoles work on OpenPOWER machines. Here's a bit of a rundown on how it's implemented, and what may seem a little different from x86 platforms that you may already be used to.

On POWER machines, we get the console up and working super early in the boot process. This means that we can get debug, error and state information out using a text console with very little hardware initialisation, and in a human-readable format. So, we tend to use simpler devices for the console output - typically a serial UART - rather than graphical-type consoles, which require a GPU to be up and running. This keeps the initialisation code clean and simple.

However, we still want a facility for admins who are more used to a keyboard & monitor directly plugged-in to have a console facility too. More about that later though.

The majority of OpenPOWER platforms will rely on the attached management controller (BMC) to provide the UART console (as of November 2016: unless you've designed your own OpenPOWER hardware, this will be the case for you). This will be based on ASpeed's AST2400 or AST25xx system-on-chip devices, which provide a few methods of getting console data from the host to the BMC.

Between the host and the BMC, there's an LPC bus. The host is the master of the LPC bus, and the BMC the slave. One of the facilities that the BMC exposes over this bus is a set of UART devices. Each of these UARTs appears as a standard 16550A register set, so having the host interface to a UART is very simple.
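To illustrate why a 16550A-compatible register set keeps the host side simple, here is a minimal sketch of polled output against a simulated register file. The register offsets and the "transmit holding register empty" bit are the standard 16550A ones; the FakeUart class and the putc/puts helpers are hypothetical names made up for this illustration, and real host firmware reaches these registers over LPC I/O rather than through a Python object.

```python
# Standard 16550A register offsets (relative to the UART base address).
THR = 0  # Transmit Holding Register (write)
LSR = 5  # Line Status Register (read)
LSR_THRE = 0x20  # LSR bit 5: transmit holding register empty


class FakeUart:
    """Hypothetical stand-in for the BMC-provided UART; records bytes written."""

    def __init__(self):
        self.sent = bytearray()

    def read(self, offset):
        # This simulated transmitter is always ready to accept a byte.
        return LSR_THRE if offset == LSR else 0

    def write(self, offset, value):
        if offset == THR:
            self.sent.append(value & 0xFF)


def putc(uart, byte):
    """Polled transmit: wait for THRE, then write one byte to THR."""
    while not uart.read(LSR) & LSR_THRE:
        pass  # busy-wait until the transmitter can take another byte
    uart.write(THR, byte)


def puts(uart, text):
    """Send an ASCII string one byte at a time."""
    for byte in text.encode("ascii"):
        putc(uart, byte)


uart = FakeUart()
puts(uart, "ISTEP 6. 3\n")
print(uart.sent.decode("ascii"), end="")
```

Two registers and one status bit are enough to get text out, which is why this kind of console can be used before memory or PCI are available.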

As the host is booting, the host firmware will initialise the UART console, and start outputting boot progress data. First, you'll see the ISTEP messages from hostboot, then skiboot's "msglog" output, then the kernel output from the petitboot bootloader.

Because the UART is implemented by the BMC (rather than a real hardware UART), we have a bit of flexibility about what happens to the console data. On a typical machine, there are four ways of getting access to the console:

  • Direct physical connection: using the DB-9 RS232 port on the back of the machine;
  • Over network: using the BMC's serial-over-LAN interface, using something like ipmitool [...] sol activate;
  • Local keyboard/video/mouse: connected to the VGA & USB ports on the back of the machine, or
  • Remote keyboard/video/mouse: using "remote display" functionality provided by the BMC, over the network.

The first option is fairly simple: the RS232 port on the machine is actually controlled by the BMC, and not the host. Typically, the BMC firmware will just transfer data between this port and the LPC UART (which the host is interacting with). Figure 1 shows the path of the console data.

Figure 1: Local UART console.

The second is similar, but instead of the BMC transferring data between the RS232 port and the host UART, it transfers data between a UDP serial-over-LAN session and the host UART. Figure 2 shows the redirection of the console data from the host over LAN.

Figure 2: Remote UART console.

The third and fourth options are a little more complex, but basically involve the BMC rendering the UART data into a graphical format, and displaying that on the VGA port, or sending over the network. However, there are some tricky details involved...

UART-to-VGA mirroring

Earlier, I mentioned that we start the console super-early. This happens way before any VGA devices can be initialised (in fact, we don't have PCI running; we don't even have memory running!). This means that it's not possible to get these super-early console messages out through the VGA device.

In order to be useful in deployments that use VGA-based management though, most OpenPOWER machines have functionality to mirror the super-early UART data out to the VGA port. During this process, it's the BMC that drives the VGA output, and renders the incoming UART text data to the VGA device. Figure 3 shows the flow for this, with the GPU rendering text console to the graphical output.

Figure 3: Local graphical console during early boot.

In the case of remote access to the VGA device, the BMC takes the contents of this rendered graphic and sends it over the network, to a BMC-provided web application. Figure 4 illustrates the redirection to the network.

Figure 4: Remote graphical console during early boot, with graphics sent over the network

This means we have console output, but no console input. That's okay though, as this is purely to report early boot messages, rather than provide any interaction from the user.

Once the host has booted to the point where it can initialise the VGA device itself, it takes ownership of the VGA device (and the BMC relinquishes it). The first software on the host to start interacting with the video device is the Linux driver in petitboot. From there on, video output is coming from the host, rather than the BMC. Because we may have user interaction now, we use the standard host-controlled USB stack for keyboard & mouse control.

Figure 5: Local graphical console later in boot, once the host video driver has started.

Remote VGA console follows the same pattern - the BMC captures the video data that has been rendered by the GPU, and sends it over the network. In this case, the console input is implemented by virtual USB devices on the BMC, which appear as a USB keyboard and mouse to the operating system running on the host.

Figure 6: Remote graphical console later in boot, once the host video driver has started.

Typical console output during boot

Here's a few significant points of the boot process:

3.60212|ISTEP 6. 3
4.04696|ISTEP 6. 4
4.04771|ISTEP 6. 5
10.53612|HWAS|PRESENT> DIMM[03]=00000000AAAAAAAA
10.53612|HWAS|PRESENT> Membuf[04]=0C0C000000000000
10.53613|HWAS|PRESENT> Proc[05]=C000000000000000
10.62308|ISTEP 6. 6

- this is the initial output from hostboot, doing early hardware initialisation in discrete "ISTEP"s

41.62703|ISTEP 21. 1
55.22139|htmgt|OCCs are now running in ACTIVE state
63.34569|ISTEP 21. 2
63.33911|ISTEP 21. 3
[ 63.417465577,5] SkiBoot skiboot-5.4.0 starting...
[ 63.417477129,5] initial console log level: memory 7, driver 5
[ 63.417480062,6] CPU: P8 generation processor(max 8 threads/core)
[ 63.417482630,7] CPU: Boot CPU PIR is 0x0430 PVR is 0x004d0200
[ 63.417485544,7] CPU: Initial max PIR set to 0x1fff
[ 63.417946027,5] OPAL table: 0x300c0940 .. 0x300c0e10, branch table: 0x30002000
[ 63.417951995,5] FDT: Parsing fdt @0xff00000

- here, hostboot has loaded the next firmware stage, skiboot, and we're now executing that.

[ 22.120063542,5] INIT: Waiting for kernel...
[ 22.154090827,5] INIT: Kernel loaded, size: 15296856 bytes (0 = unknown preload)
[ 22.197485684,5] INIT: 64-bit LE kernel discovered
[ 22.218211630,5] INIT: 64-bit kernel entry at 0x20010000, size 0xe96958
[ 22.247596543,5] OCC: All Chip Rdy after 0 ms
[ 22.296864319,5] Free space in HEAP memory regions:
[ 22.304756431,5] Region ibm,firmware-heap free: 9b4b78
[ 22.322076546,5] Region ibm,firmware-allocs-memory@2000000000 free: 10cd70
[ 22.341542329,5] Region ibm,firmware-allocs-memory@0 free: afec0
[ 22.392470901,5] Total free: 11999144
[ 22.419746381,5] INIT: Starting kernel at 0x20010000, fdt at 0x305dbae8 (size 0x1d251)

next, the skiboot firmware has loaded the petitboot bootloader kernel (in zImage.epapr format), and is setting up memory regions in preparation for running Linux.

zImage starting: loaded at 0x0000000020010000 (sp: 0x0000000020e94ed8)
Allocating 0x1545554 bytes for kernel ...
gunzipping (0x0000000000000000 <- 0x000000002001d000:0x0000000020e9238b)...done 0x13c0300 bytes

Linux/PowerPC load:
Finalizing device tree... flat tree at 0x20ea1520
[ 24.074353446,5] OPAL: Switch to little-endian OS
 -> smp_release_cpus()
spinning_secondaries = 159
 <- smp_release_cpus()
 <- setup_system()

we then get the output from the zImage wrapper, which expands the actual kernel code and executes it. In recent firmware builds, the petitboot kernel will suppress most of the Linux boot messages, so we should only see high-priority warnings or error messages.

next up, the petitboot UI will be shown:

 Petitboot (v1.2.3-a976d01)                                   8335-GCA 2108ECA
 ──────────────────────────────────────────────────────────────────────────────
  [Disk: sda1 / 590328e2-1095-4fe7-8278-0babaa9b9ca5]
    Ubuntu, with Linux 4.4.0-47-generic (recovery mode)
    Ubuntu, with Linux 4.4.0-47-generic
    Ubuntu
  [Network: enP3p3s0f3 / 98:be:94:67:c0:1b]
    Ubuntu 14.04.x installer
    Ubuntu 16.04 installer
    test kernel

  System information
  System configuration
  Language
  Rescan devices
  Retrieve config from URL
 *Exit to shell
 ──────────────────────────────────────────────────────────────────────────────
 Enter=accept, e=edit, n=new, x=exit, l=language, h=help

During Linux execution, skiboot will retain control of the UART (rather than exposing the LPC registers directly to the host), and provide a method for the Linux kernel to read and write to this console. That facility is provided by the OPAL_CONSOLE_READ and OPAL_CONSOLE_WRITE calls in the OPAL API.
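If you capture the console for later analysis, the two firmware log formats shown in this section are easy to tell apart. The sketch below is one way to do that in Python. The field meanings are inferred from the sample output only: hostboot lines look like a timestamp followed by "|" and a message, and skiboot lines look like "[ seconds.nanoseconds,level] message". The function name parse_console_line is hypothetical, not part of any firmware tooling.

```python
import re

# Hostboot lines in the samples look like "<seconds>|<message>".
HOSTBOOT_RE = re.compile(r"^\s*(\d+\.\d+)\|(.*)$")
# Skiboot lines look like "[ <seconds>.<nanoseconds>,<level>] <message>".
SKIBOOT_RE = re.compile(r"^\[\s*(\d+\.\d+),(\d+)\]\s?(.*)$")


def parse_console_line(line):
    """Return (source, seconds, level, message), or None if unrecognised.

    level is None for hostboot lines, which carry no log level.
    """
    match = SKIBOOT_RE.match(line)
    if match:
        seconds, level, message = match.groups()
        return ("skiboot", float(seconds), int(level), message)
    match = HOSTBOOT_RE.match(line)
    if match:
        seconds, message = match.groups()
        return ("hostboot", float(seconds), None, message)
    return None


for sample in (
    "10.62308|ISTEP 6. 6",
    "[ 63.417465577,5] SkiBoot skiboot-5.4.0 starting...",
    "zImage starting: loaded at 0x0000000020010000 (sp: 0x0000000020e94ed8)",
):
    print(parse_console_line(sample))
```

Lines that match neither pattern, such as the zImage wrapper and petitboot output, come back as None so that a caller can pass them through untouched.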

Which one should we use?

We tend to prefer the text-based consoles for managing OpenPOWER machines - either the RS232 port on the machines for local access, or IPMI Serial over LAN (SOL) for remote access. This means that console connections need much less bandwidth and have lower latency, and there is a simpler path for the console data. It's also more reliable during low-level debugging, as serial access involves fewer components of the hardware, software and firmware stacks.

That said, the VGA mirroring implementation should still work well, and is also accessible remotely by the current BMC firmware implementations. If your datacenter is not set up for local RS232 connections, you may want to use VGA for local access, and SOL for remote - or whatever works best in your situation.

Tridge on UAVs: Calling all testers! New ArduPilot sensor drivers

Fri, 2016-11-11 18:15

We'd love the diydrones community to jump aboard our testing effort for the new ArduPilot sensor system. See for details

We have just implemented a major upgrade to the ArduPilot sensor drivers for STM32 based boards (PX4v1, Pixhawk, Pixhawk2, PH2Slim, PixRacer and PixhawkMini).

The new sensor drivers are a major departure from the previous drivers which used the PX4/Firmware drivers. The new drivers are all "in-tree" drivers, bringing a common driver layer for all our supported boards, so we now use the same drivers on all the Linux boards as we do on the PX4 variants.

We would really appreciate some testing on as many different types of boards as possible. The new driver system has an auto-detection system to detect the board type and while we think it is all correct some validation against a wide variety of boards would be appreciated.

Advantages of the new system

The new device driver system has a number of major advantages over the old one:
* common drivers with our Linux based boards
* significantly lower overhead (faster drivers) resulting in less CPU usage
* significantly less memory usage, leaving more room for other options
* significantly less flash usage, leaving more room for code growth
* remote access to raw SPI and I2C devices via MAVLink2 (useful for development)
* much simpler driver structure, making contributions easier (typical drivers are well under half the lines of code of the PX4 equivalent)
* support for much higher sampling rates for IMUs that support it (ICM-20608 and MPU-9250), leading to better vibration handling

..... see the posting on for more details on how to participate in testing. Thanks!

Michael Davies: Speaking at the OpenStack Summit in Barcelona

Fri, 2016-11-11 09:26
Last month I had the wonderful opportunity to visit Barcelona, Spain to attend the OpenStack Ocata Summit where I was able to speak about the work we've been doing integrating Ironic into OpenStack-Ansible.

(Video link:
I really enjoyed the experience, and very much enjoyed co-presenting with Andy McCrae - a great guy and a very clever and hard-working developer. If I ever make it to the UK (and see more than the inside of LHR), I'm sure going to go visit!

And here's a few happy snaps from Barcelona - lovely place, wish I could've stayed more than 6 days :-)

Colin Charles: CfP for Percona Live Santa Clara closes November 13!

Fri, 2016-11-11 05:01

At Percona Live Amsterdam recently, the conference expanded beyond just its focus areas of MySQL & its ecosystem and MongoDB to also include PostgreSQL and other open source databases (just look at the recent poll). The event was a sold-out success.

This will continue for Percona Live Santa Clara 2017, happening April 24-27 2017 – and the call for papers is open till November 13 2016, so what are you waiting for? Submit already!

I am on the conference committee and am looking forward to making the best program possible. Looking forward to your submissions!

Russell Coker: Is a Thinkpad Still Like a Rolls-Royce

Sun, 2016-11-06 17:02

For a long time the Thinkpad has been widely regarded as the “Rolls-Royce of laptops”. Since 2003 one could argue that Rolls-Royce is no longer the Rolls-Royce of cars [1]. The way that IBM sold the Think business unit to Lenovo and the way that Lenovo is producing both Thinkpads and cheaper Ideapads is somewhat similar to the way the Rolls-Royce trademark and car company were separately sold to companies that are known for making cheaper cars.

Sam Varghese has written about his experience with Thinkpads and how he thinks it’s no longer the Rolls-Royce of laptops [2]. Sam makes some reasonable points to support this claim (one of which only applies to touchpad users – not people like me who prefer the Trackpoint), but I think that the real issue is whether it’s desirable to have a laptop that could be compared to a Rolls-Royce nowadays.


The Rolls-Royce car company is known for great reliability and support as well as features that other cars lack (mostly luxury features). The Thinkpad marque (both before and after it was sold to Lenovo) was also known for great support. You could take a Thinkpad to any service center anywhere in the world and if the serial number indicated that it was within the warranty period it would be repaired without any need for paperwork. The Thinkpad service centers never had any issue with repairing a Thinkpad that lacked a hard drive just as long as the problem could be demonstrated. It was also possible to purchase an extended support contract at any time which covered all repairs including motherboard replacement. I know that not everyone had as good an experience as I had with Thinkpad support, but I’ve been using them since 1998 without problems – which is more than I can say for most hardware.

Do we really need great reliability from laptops nowadays? When I first got a laptop hardly anyone I knew owned one. Nowadays laptops are common. Having a copy of important documents on a USB stick is often a good substitute for a reliable laptop: when you are in an environment where most people own laptops, it’s usually not difficult to find someone who will let you use theirs for a while. I think that there is a place for a laptop with RAID-1 and ECC RAM. It’s a little-known fact that Thinkpads have a long history of supporting the replacement of a CD/DVD drive with a second hard drive (I don’t know if this is still supported), but AFAIK they have never supported ECC RAM.

My first Thinkpad cost $3,800. In modern money that would be something like $7,000 or more. For that price you really want something that’s well supported to protect the valuable asset. Sam complains about his new Thinkpad costing more than $1,000 and needing to be replaced after 2.5 years. Mobile phones start at about $600 for the more desirable models (i.e. anything that runs Pokemon Go) and the new Google Pixel phones range from $1,079 to $1,419. Phones aren’t really expected to be used for more than 2.5 years. Phones are usually impractical to service in any way, so for most of the people who read my blog (who tend to buy the more expensive hardware) they are pretty much a disposable item costing $600+. I previously wrote about a failed Nexus 5 and the financial calculations for self-insuring an expensive phone [3]. I think there’s no way that a company can provide extended support/warranty while making a profit and offering a deal that’s good value to customers who can afford to self-insure. The same applies for the $499 Lenovo Ideapad 310 and other cheaper Lenovo products. Thinkpads (the higher end of the Lenovo laptop range) are slightly more expensive than the most expensive phones but they also offer more potential for the user to service them.


My first Thinkpad was quite underpowered when compared to desktop PCs: it had 32M of RAM and could only be expanded to 96M at a time when desktop PCs could be expanded to 128M easily and 256M with some expense. It had an 800*600 display when my desktop display was 1280*1024 (37% of the pixels). Nowadays laptops usually start at about 8G of RAM (with a small minority that have 4G) and laptop displays start at about 1366*768 resolution (51% of the pixels in a FullHD display). That compares well to desktop systems and also is capable of running most things well. My current Thinkpad is a T420 with 8G of RAM and a 1600*900 display (69% of FullHD). It would be nice to have higher resolution, but this works well and it was going cheap when I needed a new laptop.
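The percentages above are ratios of total pixel counts. As a quick check, here is a small Python sketch that reproduces them, taking FullHD to be 1920*1080; the helper name pixel_share is just an illustrative choice.

```python
def pixel_share(width, height, ref_width, ref_height):
    """Pixels in one display as a rounded percentage of a reference display."""
    return round(100 * (width * height) / (ref_width * ref_height))


FULL_HD = (1920, 1080)

print(pixel_share(800, 600, 1280, 1024))  # first Thinkpad vs. desktop display: 37
print(pixel_share(1366, 768, *FULL_HD))   # entry-level laptop vs. FullHD: 51
print(pixel_share(1600, 900, *FULL_HD))   # T420 vs. FullHD: 69
```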

Modern Thinkpads don’t have some of the significant features that older ones had. The legendary Butterfly Keyboard is long gone, killed by the wide displays that economies of scale and 16:9 movies have forced upon us. It’s been a long time since Thinkpads had some of the highest resolution displays and since anyone really cared about it (you only need pixels to be small enough that you can’t see them).

For me one of the noteworthy features of the Thinkpads has been the great keyboard. Mechanical keys that feel like a desktop keyboard. It seems that most Thinkpads are getting the rubbery keyboard design made popular by Apple. I guess this is due to engineering factors in designing thin laptops and the fact that most users don’t care.

Matthew Garrett has blogged about the issue of Thinkpad storage configured as “RAID mode” without any option to disable it [4]. This is an annoyance (which incidentally has been worked around) and there are probably other annoyances like it. Designing hardware and an OS are both complex tasks. The interaction between Windows and the hardware is difficult to get right from both sides and the people who design the hardware often don’t think much about Linux support. It has always been this way: the early Thinkpads had no Linux support for special IBM features (like fan control) and support for ISA-PnP was patchy. It is disappointing that Lenovo doesn’t put a little extra effort into making sure that Linux works well on their hardware and this might be a reason for considering another brand.

Service Life

I bought my current Thinkpad T420 in October 2013 [5]. It’s more than 3 years old and has no problems even though I bought it refurbished with a reduced warranty. This is probably the longest I’ve had a Thinkpad working well, which seems to be a data point against the case that modern Thinkpads aren’t as good.

I bought a T61 in February 2010 [6]. It started working again (after mysteriously not working for a month in late 2013) and, apart from the battery lasting 5 minutes and a CPU cooling problem, it still works well. If that Thinkpad had cost $3,800 then I would have got it repaired, but as it cost $796 (plus the cost of a RAM upgrade) and a better one was available for $300 it wasn’t worth repairing.

In the period 1998 to 2010 I bought a 385XD, a 600E, a T21, a T43, and a T61 [6]. During that time I upgraded laptops 4 times in 12 years (I don’t have good records of when I bought each one). So my average Thinkpad has lasted 3 years. The first 2 were replaced to get better performance, the 3rd was replaced when an employer assigned me a Thinkpad (and sold it to me when I left), and 4 and 5 were replaced due to hardware problems that could not be fixed economically given the low cost of replacement.


Thinkpads possibly don’t have the benefits over other brands that they used to have. But in terms of providing value for the users it seems that they are much better than they used to be. Until I wrote this post I didn’t realise that I’ve broken a personal record for owning a laptop. It just keeps working and I hadn’t even bothered looking into the issue. For some devices I track how long I’ve owned them while thinking “can I justify replacing it yet”, but the T420 just does everything I want. The battery still lasts 2+ hours, which is a new record too; with every other Thinkpad I’ve owned the battery life has dropped to well under an hour within a year of purchase.

If I replaced this Thinkpad T420 now it would have cost me less than $100 per year (or $140 per year including the new SSD I installed this year). That’s about 3 times better than any previous laptop! I wouldn’t feel bad about replacing it as I’ve definitely got great value for money from it. But I won’t replace it as it’s doing everything I want.
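For anyone who wants to run the same sum on their own hardware, here is a rough Python sketch of the cost-per-year figure. It assumes the T420 is the $300 replacement mentioned in the T61 paragraph, which the post implies but does not state outright, and it counts ownership from October 2013 to the date of this post in November 2016. The SSD price is not given, so the $140 figure is not modelled.

```python
def cost_per_year(price, months_owned):
    """Average yearly cost of a device owned for the given number of months."""
    return price * 12 / months_owned


# Assumption: the T420 is the "$300" replacement mentioned earlier in the post.
T420_PRICE = 300
# October 2013 (purchase) to November 2016 (this post) is 37 months.
MONTHS_OWNED = (2016 - 2013) * 12 + (11 - 10)

print(round(cost_per_year(T420_PRICE, MONTHS_OWNED), 2))  # 97.3
```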

I’ve just realised that by every measure (price, reliability, and ability to run all software I want to run) I’ve got the best Thinkpad I’ve ever had. Maybe it’s not like a Rolls-Royce, but I’d much rather drive a 2016 Tesla than a 1980 Rolls-Royce anyway.

OpenSTEM: Upcoming free event in Brisbane (24 November): Understanding Our World – showcase @ Seville (Holland Park)

Thu, 2016-11-03 17:04

At the start of 2016, Seville Road State School replaced their existing History and Geography units with the integrated Understanding Our World HASS + Science materials developed by OpenSTEM. As this school year draws to a close, we’d like to invite you to come and celebrate the marvellous outcomes that the students and teachers have achieved: Thursday 24 November 2016, 3:30pm.

This is your perfect opportunity to look at the materials and actual class work, as well as ask lots of questions! Meet with Cheryl Rowe (Principal), Andrea Budd (Head of Curriculum), several teachers, OpenSTEM’s Arjen Lentz (Chief Explorer) and Dr Claire Reeler, who leads the Understanding Our World program development.

Join us on Thursday 24 November 2016, 3:30pm at Seville Road State School library!

Free event.  RSVP today.  Drinks and nibbles will be provided.

You can also download this invite as a PDF leaflet to pass on to others.

What is Understanding Our World?
  • Complete integrated units, 4 units per year level
  • Prep to Year 6, with multi-year level integration:
    • P-2: History and Geography + Science
    • 3-4: History, Geography and Civics & Citizenship + Science
    • 5-6: History, Geography, Civics & Citizenship and Economics & Business + Science
  • Aligned with the Australian Curriculum for HASS + Science (comprehensive)
  • Assessment tasks aligned with achievement standards in the Australian Curriculum
  • Teacher handbooks, student workbooks, assessment guides
  • Hundreds of beautiful colour resource PDFs with many custom maps and photos
  • More information:

Can’t make it on the 24th?
Contact OpenSTEM to arrange a visit to your school and discuss implementation options.

Tim Serong: Hello Salty Goodness

Wed, 2016-11-02 19:04

Anyone who’s ever deployed Ceph presumably knows about ceph-deploy. It’s right there in the Deployment chapter of the upstream docs, and it’s pretty easy to use to get a toy test cluster up and running. For any decent sized cluster though, ceph-deploy rapidly becomes cumbersome… As just one example, do you really want to have to `ceph-deploy osd prepare` every disk? For larger production clusters it’s almost certainly better to use a fully-fledged configuration management tool, such as Salt, which is what this post is about.

For those not familiar with Salt, the quickest way I can think to describe it is as follows:

  • One host is the Salt master.
  • All the hosts you’re managing are Salt minions.
  • From the master, you can do things to the minions, for example you can run arbitrary commands. More interestingly though, you can create state files which will ensure minions are configured in a certain way – maybe a specific set of packages are installed, or some configuration files are created or modified.
  • Salt has a means of storing per-minion configuration data, in what’s called a pillar. All pillar data is stored on the Salt master.
  • Salt has various runner modules, which are convenience applications that execute on the Salt master. One such is the orchestrate runner, which can be used to apply configuration to specific minions in a certain order.

There’s a bit more to Salt than that, but the above should hopefully provide enough background that what follows below makes sense.

To use Salt to deploy Ceph, you obviously need to install Salt, but you also need a bunch of Salt state files, runners and modules that know how to do all the little nitty gritty pieces of Ceph deployment – installing the software, bootstrapping the MONs, creating the OSDs and so forth. Thankfully some of my esteemed colleagues on the SUSE Enterprise Storage team have created DeepSea, which provides exactly that. To really understand DeepSea, you should probably review the Intro, Management and Policy docs, but for this blog post, I’m going to take the classic approach of walking through set up of (you guessed it) a toy test cluster.

I have six hosts here, imaginatively named:


They’re all running SLES 12 SP2, and they all have one extra disk, most of which will be used for Ceph OSDs.

ses4-0 will be my Salt master. Every host will also be a Salt minion, including ses4-0. This is because DeepSea needs to perform certain operations on the Salt master, notably automatically creating initial pillar configuration data.

So, first we have to install Salt. You can do this however you see fit, but in my case it goes something like this:

ssh ses4-0 'zypper --non-interactive install salt-master ;
    systemctl enable salt-master ;
    systemctl start salt-master'
for n in $(seq 0 5) ; do
    ssh ses4-$n 'zypper --non-interactive install salt-minion ;
        echo "master:" > /etc/salt/minion.d/master.conf ;
        systemctl enable salt-minion ;
        systemctl start salt-minion'
done

Then, on ses4-0, accept all the minion keys, and check that every minion responds if you like:

ses4-0:~ # salt-key -A
The following keys are going to be accepted:
Unaccepted Keys:
Proceed? [n/Y] y
Key for minion accepted.
Key for minion accepted.
Key for minion accepted.
Key for minion accepted.
Key for minion accepted.
Key for minion accepted.
ses4-0:~ # salt '*'
True
True
True
True
True
True

That works, so now DeepSea needs to be installed on the salt-master host (ses4-0). There’s RPMs available for various SUSE distros, so I just ran `zypper in deepsea`, but for other distros you can run `make install` from a clone of the source tree to get everything in the right place. If you’re doing the latter, you’ll need to `systemctl restart salt-master` so that it picks up /etc/salt/master.d/modules.conf and /etc/salt/master.d/reactor.conf included with DeepSea. Note: we’ll happily accept patches from anyone who’s interested in helping with packaging for other distros.

There’s one other tweak required. /srv/pillar/ceph/master_minion.sls specifies the hostname of the salt master. When installing the RPM, this is automatically set to $(hostname -f), so in my case I have:

ses4-0:~ # cat /srv/pillar/ceph/master_minion.sls
master_minion:

If you’re not using the RPM, you’ll need to tweak that file appropriately by hand.

Now comes the interesting part. DeepSea splits Ceph deployment into several stages, as follows:

Stage 0: Provisioning
Ensures latest updates are installed on all minions (technically this is optional).
Stage 1: Discovery
Interrogates all the minions and creates pillar configuration fragments in /srv/pillar/ceph/proposals.
Stage 2: Configure
Before running this stage, you have to create a /srv/pillar/ceph/proposals/policy.cfg file, specifying which nodes are to have which roles (MON, OSD, etc.). This stage then merges the pillar data into its final form.
Stage 3: Deploy
Validates the pillar data to ensure the configuration is correct, then deploys the MONs and OSDs.
Stage 4: Services
Deploys non-core services (iSCSI gateway, CephFS, RadosGW, openATTIC).
Stage 5: Removal
Used to decommission hosts.
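Each stage is run as a Salt orchestration. As a rough sketch, the Python below builds the command line for each deployment stage using the `salt-run state.orch ceph.stage.N` form that this post uses for stages 0 and 1; applying the same pattern to stages 2 to 4 is an assumption here, and the names DEPLOY_STAGES and stage_command are made up for illustration. It only prints the commands rather than running them, since policy.cfg has to be written by hand before stage 2.

```python
# Stage numbers and names as listed above; stage 5 (removal) is left out
# because it decommissions hosts rather than deploying them.
DEPLOY_STAGES = {
    0: "Provisioning",
    1: "Discovery",
    2: "Configure",
    3: "Deploy",
    4: "Services",
}


def stage_command(stage):
    """Build the salt-run invocation for one DeepSea stage."""
    return ["salt-run", "state.orch", f"ceph.stage.{stage}"]


for number, name in DEPLOY_STAGES.items():
    print(f"# Stage {number}: {name}")
    print(" ".join(stage_command(number)))
```

Returning the command as a list means it could later be handed to something like subprocess.run without any shell quoting concerns.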

Let’s give it a try. Note that there’s presently a couple of SUSE-isms in DeepSea (notably some invocations of `zypper` to install software), so if you’re following along at home on a different distro and run into any kinks, please either let us know what’s broken or open a PR if you’ve got a fix.

ses4-0:~ # salt-run state.orch ceph.stage.0
master_minion : valid
ceph_version : valid
None
###########################################################
The salt-run command reports when all minions complete.
The command may appear to hang. Interrupting (e.g. Ctrl-C) does not stop the command.
In another terminal, try 'salt-run' or 'salt-run state.event pretty=True' to see progress.
###########################################################
False
True
[WARNING ] All minions are ready
True
ses4-0.example.com_master:
----------
ID: sync master
Function: salt.state
Result: True
Comment: States ran successfully. Updating
Started: 23:04:06.166492
Duration: 459.509 ms
Changes:
----------
ID: load modules
Function:
Name: saltutil.sync_all
Result: True
Comment: Module function saltutil.sync_all executed
Started: 23:04:06.312110
Duration: 166.785 ms
Changes:
----------
ret:
----------
beacons:
grains:
log_handlers:
modules:
output:
proxymodules:
renderers:
returners:
sdb:
states:
utils:

Summary for
------------
Succeeded: 1 (changed=1)
Failed: 0
------------
Total states run: 1
[...]

There’s a whole lot more output than I’ve quoted above, because that’s what happens with Salt when you apply a whole lot of state to a bunch of minions, but it finished up with:

Summary for ses4-0.example.com_master
-------------
Succeeded: 15 (changed=9)
Failed: 0
-------------
Total states run: 15

Stage 1 (discovery) is next, to interrogate all the minions and create configuration fragments:

ses4-0:~ # salt-run state.orch ceph.stage.1
[WARNING ] All minions are ready
True
- True
ses4-0.example.com_master:
----------
ID: ready
Function: salt.runner
Name: minions.ready
Result: True
Comment: Runner function 'minions.ready' executed.
Started: 23:05:02.991371
Duration: 588.429 ms
Changes:
Invalid Changes data: True
----------
ID: discover
Function: salt.runner
Name: populate.proposals
Result: True
Comment: Runner function 'populate.proposals' executed.
Started: 23:05:03.580038
Duration: 1627.563 ms
Changes:
Invalid Changes data: [True]

Summary for ses4-0.example.com_master
------------
Succeeded: 2 (changed=2)
Failed: 0
------------
Total states run: 2

Now there’s a bunch of interesting data from all the minions stored as SLS and YAML files in subdirectories of /srv/pillar/ceph/proposals. This is the point at which you get to decide exactly how your cluster will be deployed – what roles will be assigned to each node, and how the OSDs will be configured. You do this by creating a file called /srv/pillar/ceph/proposals/policy.cfg, which in turn includes the relevant configuration fragments generated during the discovery stage.

Creating /srv/pillar/ceph/proposals/policy.cfg is probably the one part of using DeepSea that’s easy to screw up. We’re working on making it easier, and the policy docs and examples will help, but in the meantime the approach I’ve taken personally is to generate a policy.cfg including every possible option, then get rid of the ones I don’t want. Here’s a dump of every single configuration fragment generated by the discovery stage on my toy test cluster:

ses4-0:~ # cd /srv/pillar/ceph/proposals/
ses4-0:/srv/pillar/ceph/proposals # find * -name \*.sls -o -name \*.yml | sort > policy.cfg
ses4-0:/srv/pillar/ceph/proposals # cat policy.cfg
cluster-ceph/cluster/
cluster-ceph/cluster/
cluster-ceph/cluster/
cluster-ceph/cluster/
cluster-ceph/cluster/
cluster-ceph/cluster/
cluster-unassigned/cluster/
cluster-unassigned/cluster/
cluster-unassigned/cluster/
cluster-unassigned/cluster/
cluster-unassigned/cluster/
cluster-unassigned/cluster/
config/stack/default/ceph/cluster.yml
config/stack/default/global.yml
profile-1QEMU24GB-1/cluster/
profile-1QEMU24GB-1/cluster/
profile-1QEMU24GB-1/cluster/
profile-1QEMU24GB-1/cluster/
profile-1QEMU24GB-1/cluster/
profile-1QEMU24GB-1/cluster/
profile-1QEMU24GB-1/stack/default/ceph/minions/
profile-1QEMU24GB-1/stack/default/ceph/minions/
profile-1QEMU24GB-1/stack/default/ceph/minions/
profile-1QEMU24GB-1/stack/default/ceph/minions/
profile-1QEMU24GB-1/stack/default/ceph/minions/
profile-1QEMU24GB-1/stack/default/ceph/minions/
role-admin/cluster/
role-admin/cluster/
role-admin/cluster/
role-admin/cluster/
role-admin/cluster/
role-admin/cluster/
role-client-cephfs/cluster/
role-client-cephfs/cluster/
role-client-cephfs/cluster/
role-client-cephfs/cluster/
role-client-cephfs/cluster/
role-client-cephfs/cluster/
role-client-iscsi/cluster/
role-client-iscsi/cluster/
role-client-iscsi/cluster/
role-client-iscsi/cluster/
role-client-iscsi/cluster/
role-client-iscsi/cluster/
role-client-radosgw/cluster/
role-client-radosgw/cluster/
role-client-radosgw/cluster/
role-client-radosgw/cluster/
role-client-radosgw/cluster/
role-client-radosgw/cluster/
role-igw/cluster/
role-igw/cluster/
role-igw/cluster/
role-igw/cluster/
role-igw/cluster/
role-igw/cluster/
role-igw/stack/default/ceph/minions/
role-igw/stack/default/ceph/minions/
role-igw/stack/default/ceph/minions/
role-igw/stack/default/ceph/minions/
role-igw/stack/default/ceph/minions/
role-igw/stack/default/ceph/minions/
role-master/cluster/
role-master/cluster/
role-master/cluster/
role-master/cluster/
role-master/cluster/
role-master/cluster/
role-mds/cluster/
role-mds/cluster/
role-mds/cluster/
role-mds/cluster/
role-mds/cluster/
role-mds/cluster/
role-mds-nfs/cluster/
role-mds-nfs/cluster/
role-mds-nfs/cluster/
role-mds-nfs/cluster/
role-mds-nfs/cluster/
role-mds-nfs/cluster/
role-mon/cluster/
role-mon/cluster/
role-mon/cluster/
role-mon/cluster/
role-mon/cluster/
role-mon/cluster/
role-mon/stack/default/ceph/minions/
role-mon/stack/default/ceph/minions/
role-mon/stack/default/ceph/minions/
role-mon/stack/default/ceph/minions/
role-mon/stack/default/ceph/minions/
role-mon/stack/default/ceph/minions/
role-rgw/cluster/
role-rgw/cluster/
role-rgw/cluster/
role-rgw/cluster/
role-rgw/cluster/
role-rgw/cluster/
role-rgw-nfs/cluster/
role-rgw-nfs/cluster/
role-rgw-nfs/cluster/
role-rgw-nfs/cluster/
role-rgw-nfs/cluster/
role-rgw-nfs/cluster/
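
For anyone who would rather script this step, here is a rough Python sketch of the same "list every fragment, then prune" idea. It is only an approximation of the find | sort pipeline above: the function name list_fragments and the file names in the demo are made up for illustration, and Python's sorted() orders by code point, which may differ from what sort does under your locale.

```python
import os
import tempfile


def list_fragments(proposals_dir):
    """Return sorted relative paths of every .sls and .yml file under proposals_dir."""
    found = []
    for root, _dirs, files in os.walk(proposals_dir):
        for name in files:
            if name.endswith((".sls", ".yml")):
                full = os.path.join(root, name)
                found.append(os.path.relpath(full, proposals_dir))
    return sorted(found)


if __name__ == "__main__":
    # Demo against a throwaway directory; the file names here are hypothetical.
    with tempfile.TemporaryDirectory() as tmp:
        for rel in ("role-mon/cluster/node1.sls", "config/global.yml", "notes.txt"):
            path = os.path.join(tmp, rel)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "w").close()
        # Only the .sls and .yml files are listed; notes.txt is ignored.
        print("\n".join(list_fragments(tmp)))
```

On a real master you would point it at /srv/pillar/ceph/proposals and write the result to policy.cfg before editing it down by hand.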

What I actually wanted to deploy was a Ceph cluster with MONs on ses4-1, ses4-2 and ses4-3, and OSDs on ses4-1, ses4-2, ses4-3 and ses4-4. I didn’t want DeepSea to do anything with ses4-5, and I’ve elected not to deploy CephFS, RadosGW, openATTIC or any other services, because this blog post is going to be long enough as it is. So here’s what I pared my policy.cfg back to:

ses4-0:/srv/pillar/ceph/proposals # cat policy.cfg
cluster-unassigned/cluster/*.sls
cluster-ceph/cluster/ses4-[0-4]
config/stack/default/ceph/cluster.yml
config/stack/default/global.yml
profile-1QEMU24GB-1/cluster/ses4-[1-4]
profile-1QEMU24GB-1/stack/default/ceph/minions/ses4-[1-4]
role-master/cluster/
role-admin/cluster/ses4-[1-3]
role-mon/cluster/ses4-[1-3]
role-mon/stack/default/ceph/minions/ses4-[1-3]

The cluster-unassigned line defaults all nodes to not be part of the Ceph cluster. The following cluster-ceph line adds only those nodes I want DeepSea to manage (this is how I’m excluding ses4-5). Ordering is important here, as later lines will override earlier lines.
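
To make the "later lines override earlier lines" behaviour concrete, here is a toy model in Python. It is not how DeepSea actually loads policy.cfg; the resolve function and the rules structure are invented for illustration, and it matches glob patterns against bare node names rather than file paths. The only point is to show that a broad default followed by a narrower pattern leaves the unmatched node on the default.

```python
from fnmatch import fnmatchcase


def resolve(nodes, rules):
    """Apply rules in order; a later matching rule overwrites earlier values."""
    result = {node: {} for node in nodes}
    for pattern, settings in rules:
        for node in nodes:
            if fnmatchcase(node, pattern):
                result[node].update(settings)
    return result


# Six nodes, ses4-0 through ses4-5, as in the test cluster described here.
nodes = [f"ses4-{i}" for i in range(6)]

# Hypothetical stand-ins for the cluster-unassigned and cluster-ceph lines.
rules = [
    ("*", {"cluster": "unassigned"}),
    ("ses4-[0-4]", {"cluster": "ceph"}),
]

for node, settings in resolve(nodes, rules).items():
    print(node, settings["cluster"])
# ses4-0 through ses4-4 print "ceph"; ses4-5 prints "unassigned"
```

Swapping the two rules would leave every node unassigned, because the catch-all pattern would then be applied last.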

The role-* lines determine which nodes are going to be MONs. role-admin is needed on the MON nodes to ensure the Ceph admin keyring is installed on those nodes.

The profile-* lines determine how my OSDs will be deployed. In my case, because this is a ridiculous toy cluster, I have only one disk configuration on all my nodes (a single 24GB volume). On a real cluster there may be several profiles available to choose from, potentially mixing drive types and using SSDs for journals. Again, this is covered in more detail in the policy docs.

Now that policy.cfg is set up correctly, it’s time to run stages 2 and 3:

ses4-0:~ # salt-run state.orch ceph.stage.2
True
True
ses4-0.example.com_master:
----------
          ID: push proposals
    Function: salt.runner
        Name: push.proposal
      Result: True
     Comment: Runner function 'push.proposal' executed.
     Started: 23:13:43.092320
    Duration: 209.321 ms
     Changes: Invalid Changes data: True
----------
          ID: refresh_pillar1
    Function: salt.state
      Result: True
     Comment: States ran successfully. Updating,,,,,
     Started: 23:13:43.302018
    Duration: 705.173 ms
[...]

Again, I’ve elided quite a bit of Salt output above. Stage 3 (deployment) can take a while, so if you’re looking for something to do while that’s happening, you can either play with `salt-run` or `salt-run state.event pretty=True` in another terminal, or you can watch a video:

ses4-0:~ # salt-run state.orch ceph.stage.3
firewall : disabled
True
fsid : valid
public_network : valid
public_interface : valid
cluster_network : valid
cluster_interface : valid
monitors : valid
storage : valid
master_role : valid
mon_role : valid
mon_host : valid
mon_initial_members : valid
time_server : valid
fqdn : valid
True
[ERROR ] Run failed on minions:,,,,
Failures:
----------
          ID: ntp
    Function: pkg.installed
      Result: True
     Comment: All specified packages are already installed
     Started: 23:16:18.251858
    Duration: 367.172 ms
     Changes:
----------
          ID: sync time
    Function:
        Name: sntp -S -c
      Result: False
     Comment: Command "sntp -S -c" run
     Started: 23:16:18.620013
    Duration: 37.438 ms
     Changes:
              ----------
              pid: 11002
              retcode: 1
              stderr: sock_cb: not in sync, skipping this server
              stdout: sntp 4.2.8p8@1.3265-o Mon Jun 6 08:12:56 UTC 2016 (1)
[...]
----------
          ID: packages
    Function: salt.state
      Result: True
     Comment: States ran successfully. Updating,,,,
     Started: 23:16:19.035412
    Duration: 15967.272 ms
     Changes:
----------
          ID: ceph
    Function: pkg.installed
      Result: True
     Comment: The following packages were installed/updated: ceph
     Started: 23:16:19.666218
    Duration: 15134.487 ms
[...]
----------
          ID: monitors
    Function: salt.state
      Result: True
     Comment: States ran successfully. Updating,,
     Started: 23:16:36.622000
    Duration: 891.694 ms
[...]
----------
          ID: osd auth
    Function: salt.state
      Result: True
     Comment: States ran successfully. Updating
     Started: 23:16:37.513840
    Duration: 540.991 ms
[...]
----------
          ID: storage
    Function: salt.state
      Result: True
     Comment: States ran successfully. Updating,,,
     Started: 23:16:38.054970
    Duration: 10854.171 ms
[...]

The only failure above is a minor complaint about NTP. Everything else (installing the packages, deploying the MONs, creating the OSDs etc.) ran through just fine. Check it out:

ses4-1:~ # ceph status
    cluster 9b259825-0af1-36a9-863a-e058e4b0706b
     health HEALTH_OK
     monmap e1: 3 mons at {ses4-1=,ses4-2=,ses4-3=}
            election epoch 4, quorum 0,1,2 ses4-3,ses4-2,ses4-1
     osdmap e9: 4 osds: 4 up, 4 in
            flags sortbitwise
      pgmap v18: 64 pgs, 1 pools, 16 bytes data, 3 objects
            133 MB used, 77646 MB / 77779 MB avail
                  64 active+clean

We now have a running Ceph cluster. Like I said, this one is a toy, and I haven’t demonstrated stage 4 (services), but hopefully this has demonstrated the scalability possible when using Salt with DeepSea to deploy Ceph. The above process is the same whether you have four nodes or four hundred; it’s just the creation of a policy.cfg file plus a few `salt-run` invocations.
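
Since the whole deployment boils down to a handful of salt-run invocations, it is tempting to wrap them in a script. The Python sketch below shows one way that might look. The helper names (stage_command, run_stages, dry_run) are hypothetical, and it assumes a non-zero exit status from salt-run means a stage failed, which may not hold in practice: in the stage 3 run above, a failing NTP state was reported and the cluster still came up. Note also that policy.cfg has to be written between stages 1 and 2, so a real wrapper would pause there.

```python
import subprocess


def stage_command(stage):
    """Build the salt-run command line for one DeepSea stage."""
    return ["salt-run", "state.orch", f"ceph.stage.{stage}"]


def run_stages(stages, runner=subprocess.call):
    """Run each stage in order, stopping at the first non-zero exit status.

    Returns a list of (stage, status) pairs for the stages that were attempted.
    """
    results = []
    for stage in stages:
        status = runner(stage_command(stage))
        results.append((stage, status))
        if status != 0:
            break
    return results


def dry_run(command):
    """Stand-in runner: print the command instead of executing it."""
    print(" ".join(command))
    return 0


if __name__ == "__main__":
    # Prints the four commands for stages 0 to 3 without running anything.
    run_stages(range(4), runner=dry_run)
```

Passing the runner in as a parameter keeps the sketch safe to try on a machine without Salt installed; leaving the default in place would call the real salt-run.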

Finally, if you’re wondering about the title of this post, it’s what Cordelia Chase said the first time she saw Angel in Buffy The Vampire Slayer (time index 0:22 in this video). I was going to lead with that, but after watching the episode again, there’s teenage angst, a trip to the morgue and all sorts of other stuff, none of which really makes a good analogy for the technology I’m talking about here. The clickbait in my “Salt and Pepper Squid with Fresh Greens” post was much better.

Ben Martin: Houndbot progresses

Tue, 2016-11-01 21:08
All four new shocks are now fitted! The tires are still deflated so they look a little wobbly. I ended up using a pillow mount with a 1/4 inch channel below it. The pillow is bolted to the channel from below and the channel is then bolted from the sides through the alloy beams. The glory here is that the pillows will never come off. If the bolts start to vibrate loose they will hit the beam and be stopped. They can not force the pillow mount up to get more room because of the bolts securing the 1/4 inch channel to the alloy beams coming in from the sides.

I'm not overly happy with the motor hub mount to wheel connection, which will be one of the next points of update. Hopefully soon I will have access to a CNC with a high power spindle and can machine some alloy crossover parts for the wheel assembly. It has been great to use a dual vice drill and other basic power and hand tools to make alloy things so far. But the powerful CNC will open the door to much 2.5D stuff using cheapish sheet alloy.

But for now, the houndbot is on the move again. No longer do the wheels just extend outward under load. Though I don't know if I want to test the 40km/h top speed without updating some of the mountings and making some bushings first.

James Bromberger: The Debian Cloud Sprint 2016

Tue, 2016-11-01 03:04

I’m at an airport, about to board the first of three flights across the world, from timezone +8 to timezone -8. I’ll be in transit 27 hours to get to Seattle, Washington state. I’m leaving my wife and two young children behind.

My work has given me a day’s worth of leave under the Corporate Social Responsibility program, and I’m taking three days’ annual leave, to do this. 27 hours each way in transit, for 3 days on the ground.



I started playing in technology as a kid in the 1980s; my first PC was a clone (as they were called) 286 running MS-DOS. It was clunky, and the most I could do to extend it was to write batch scripts. As a child I had no funds for commercial compilers, no network connections (this was pre Internet in Australia), no access to documentation, and no idea where to start programming properly.

It was a closed world.

I hit university in the summer of 1994 to study Computer Science and French. I’d heard of Linux, and soon found myself installing the Linux distributions of the day. The freedom of the licensing and the encouragement to use, modify and share were in stark contrast to the world of consumer PCs of the late 1980s.

It was there at the UCC at UWA I discovered Debian. Some of the kind network/system admins at the University maintained a Debian mirror on the campus LAN, updated regularly and always online. It was fast, and more importantly, free for me to access. Back in the 1990s, bandwidth in Australia was incredibly expensive. The vast distances of the country meant that bandwidth was scarce. Telcos were in races to put fiber between Perth and the Eastern States, and without that in place, IP connectivity was constrained, and thus costly.

Over many long days and nights I huddled down, learning window managers, protocols, programming and scripting languages. I became… a system/network administrator, web developer, dev ops engineer, etc. My official degree workload, algorithmic complexity, protocol stacks, were interesting, but fiddling with Linux based implementations was practical.


After years of consuming the output of Debian – and running many services with it – I decided to put my hand up and volunteer as a Debian Developer: it was time to give back. I had benefited from Debian, and I saw others benefit from it as well.

As the 2000s started, I had my PGP key in the Debian key ring. I had adopted a package and was maintaining it – load balancing Apache web servers. The web was yet to expand to the traffic levels you see today; most web sites were served from one physical web server. Site Reliability Engineering was a term not yet dreamed of.

What became more apparent was the applicability of Linux, Open Source, and in my line-of-sight Debian to a wider community beyond myself and my university peers. Debian was being used to revive recycled computers that were being donated to charities; in some cases, commercial software licenses could not be transferred with hardware that was no longer required by organisations that had upgraded. It appeared that Debian was being used as a baseline above which society in general had access to the fundamental capability of computing and network services.

The removal of subscriptions, registrations, and the encouragement of distribution meant this occurred at rates that could never be tracked, and more importantly, the consensus was that it should not be automatically tracked. The privacy of the user is paramount – more important than some statistics for the Developer to ponder.

When the Bosnia-Herzegovina war ended in 1995, I recall an email from academics there, having found some connectivity, writing to ask if they would be able to use Debian as part of their re-deployment of services for the Tertiary institutions in the region. This was an unnecessary request as Debian GNU/Linux is freely available, but it was a reminder that, for the country to have tried to procure commercial solutions at that time would have been difficult. Instead, those that could do the task just got on with it.

There have been many similar projects where grass-roots organisations – non profits, NGOs, and even just loose collectives of individuals – have turned to Linux, Open Source, and sometimes Debian to solve their problems. Many fine projects have been established to make technology accessible to all, regardless of race, gender, nationality, class, or any other label society has used to divide humans. Big hat tip to Humanitarian Open Street Map, Serval Project.

I’ve always loved Debian’s position on being the Universal operating system. Its vast range of packages and wide range of supported computing architectures mean that quite often a litmus test of “is project X a good project?” was met with “is it packaged for Debian?”. That wide range of architectures has meant that administrators of systems had fewer surprises and a faster adoption cycle when changing platforms, such as the switch from x86 32 bit to x86 64 bit.

Enter the Cloud

I first laid eyes on the AWS Cloud in 2008. It was nothing like the rich environment you see today. The first thing I looked for was my favourite operating system, so that what I already knew and was familiar with was available in this environment to minimise the learning curve. However there were no official images, which was disconcerting.

In 2012 I joined AWS as an employee. Living in Australia they hired me into the field sales team as a Solution Architect – a sort of pre-sales tech – with a customer focused depth in security. It was a wonderful opportunity, and I learnt a great deal. It also made sense (to me, at least) to do something about getting Debian’s images blessed.

It turned out that I had to almost define what that was: images endorsed by a Debian Developer, handed to the AWS Marketplace team. And so since 2013 I have done so, keeping track of Debian’s releases across the AWS regions, collaborating with other Debian folk on other cloud platforms to attempt a unified approach to generating and maintaining these images. This included (for a stint) generating them into the AWS GovCloud Region, and still into the AWS China (Beijing) Region – the other side of the so-called Great Firewall of China.

So why the trip?

We’ve had focus groups at DebConf (the Debian conference) around the world, but it’s often difficult to get the right group of people in the same rooms at the same time. So the proposal was to hold a focused Debian Cloud Sprint. Google was good enough to host this, for all the volunteers across all the cloud providers. Furthermore, donated funds were found to secure the travel for a set of people to attend who otherwise could not.

I was lucky enough to be given a flight.

So here I am, in the terminal in Australia: my kids are tucked up in bed, dreaming of the candy they just collected for Halloween. It will be a draining week I am sure, but if it helps set and improve the state of Debian then it’s worth it.

Stewart Smith: Fast Reset, Trusted Boot and the security of /sbin/reboot

Mon, 2016-10-31 13:00

In OpenPOWER land, we’ve been doing some work on Secure and Trusted Boot while at the same time doing some work on what we call fast-reset (or fast-reboot, depending on exactly what mood someone was in at any particular time…. we should start being a bit more consistent).

The basic idea for fast-reset is that when the OS calls OPAL reboot, we gather all the threads in the system using a combination of patching the reset vector and soft-resetting them, then clean up a few bits of hardware (we do re-probe PCIe for example), and reload & restart the bootloader (petitboot).

What this means is that typing “reboot” on the command line goes from a ~90-120+ second affair (through firmware to petitboot, Linux distros still take ages to shut themselves down) down to about a 20 second affair (to petitboot).

If you’re running a (very) recent skiboot, you can enable it with a special hidden NVRAM configuration option (although we’ll likely enable it by default pretty soon, it’s proving remarkably solid). If you want to know what that NVRAM option is… Use the source, Luke! (or git history, but I’ve yet to see a neat Star Wars reference referring to git commit logs).

So, there’s nothing like a demo. Here’s a demo with Ubuntu running off an NVMe drive on an IBM S822LC for HPC (otherwise known as Minsky or Garrison) which was running the HTX hardware exerciser, through fast-reboot back into Petitboot and then booting into Ubuntu and auto-starting the exerciser (HTX) again.

Apart from being stupidly quick when compared to a full IPL (Initial Program Load – i.e. boot), since we’re not rebooting out of band, we have no way to reset the TPM, so if you’re measuring boot, each subsequent fast-reset will result in a different set of measurements.

This may be slightly confusing, but it’s not really a problem. You see, if a machine is compromised, there’s nothing stopping me replacing /sbin/reboot with something that just prints things to the console that look like your machine rebooted but in fact left my rootkit running. Indeed, fast-reset and a full IPL should measure different values in the TPM.
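
To see why the measurements differ, it helps to remember that a TPM PCR is never written directly: it can only be extended, meaning the new value is a hash of the old value concatenated with the digest of whatever is being measured. The Python sketch below models that with SHA-256. The component names in boot_chain are placeholders and say nothing about what skiboot or petitboot actually measure; the only thing it demonstrates is that replaying the same measurements on top of a PCR that was not reset gives a different final value.

```python
import hashlib


def extend(pcr, measurement):
    """TPM-style extend: new PCR = SHA-256(old PCR || SHA-256(measurement))."""
    digest = hashlib.sha256(measurement).digest()
    return hashlib.sha256(pcr + digest).digest()


def measure_boot(pcr, components):
    """Extend the PCR once per component, in order, and return the final value."""
    for component in components:
        pcr = extend(pcr, component)
    return pcr


RESET = bytes(32)  # PCR contents after a real (out of band) reset: all zeroes

# Placeholder component names, purely for illustration.
boot_chain = [b"bootloader", b"kernel", b"initramfs"]

full_ipl = measure_boot(RESET, boot_chain)
# Fast-reset: same components measured again, but the PCR was never reset.
fast_reset = measure_boot(full_ipl, boot_chain)

print(full_ipl == measure_boot(RESET, boot_chain))  # True: a full IPL is repeatable
print(fast_reset == full_ipl)                       # False: the old value is still mixed in
```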

It also means that if you ever want to re-establish trust in your OS, never do a reboot from the host – always reboot out of band (e.g. from a BMC). This, of course, means you’re trusting your BMC to not be compromised, which I wouldn’t necessarily do if you suspect your host has been.