All posts by pjessen

Xen consolidation

An HP Proliant DL380 G4
This will be old hat to you experienced Xen users, but I have mostly left my engineers to look after our Xen/cloud hosting and have not had much hands-on exposure to Xen myself. I have, however, been looking after two oldish Xen hosts, “oslo” and “airbus”, which together run a collection of miscellaneous test and infrastructure machines. The hardware is ancient, a couple of HP Proliant DL380 G4s, but those who know this hardware will also know they just won’t die. However, we have been putting more and more load on them, and lately, after migrating off ISDN, we wanted to virtualise our Asterisk box but simply did not have the room.

This weekend I grabbed another elderly box, an HP Proliant DL380 G5 with 32GB of RAM and dual quad-core CPUs. Plenty for our infrastructure needs. I got it up and running openSUSE Leap 15 in no time, then I started contemplating what to move first.
The answer was obvious – the test systems, “test150” and “test422”. I started with test150: I shut it down, created the logical volume on “mirage”, the new host, and copied over the contents of the logical volume from “oslo” with:

dd bs=1048576 if=/dev/xenspace/test150 | ssh mirage "dd of=/dev/xenspace/test150"

I copied the config over, then cranked it up with “xl create /srv/xen/auto/test150”. Job done.
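The copy above was not explicitly verified. For anyone who wants to double-check a volume copied with dd before starting the guest, one possible approach is to compare checksums on both hosts. This is only a sketch: the helper names digest_of and digests_match are made up for illustration, and it assumes sha256sum (GNU coreutils) is installed on both machines.

```shell
# Print the SHA-256 digest of a file or block device (hypothetical helper).
digest_of() {
    [ -r "$1" ] || return 2
    sha256sum < "$1" | cut -d' ' -f1
}

# Succeed only if both digests are non-empty and identical (hypothetical helper).
digests_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Possible usage on the old host, with the guest shut down
# (host and volume names as in the text above):
#   local_sum=$(digest_of /dev/xenspace/test150)
#   remote_sum=$(ssh mirage "sha256sum < /dev/xenspace/test150" | cut -d' ' -f1)
#   digests_match "$local_sum" "$remote_sum" && echo "copy verified"
```

Reading the whole volume twice takes a while, so this is mostly worthwhile for guests you care about. It also assumes both logical volumes are exactly the same size, otherwise the digests will differ even if the copy went fine.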

I am just pleasantly surprised how smooth and easy it all went. Chapeau Xen!

Completing the migration of the remaining machines, 9 or 10 of them, from “oslo” and “airbus” to “mirage” should be done tomorrow. That’ll leave us plenty of extra capacity for the future as well as reduce power consumption, which is always good.

Telephony: moving from ISDN to VoIP

Around 2013, Swisscom announced their intention to migrate all residential and business customers off analogue and ISDN connections and on to VoIP. At ENIDAN as well as at home, we have been on ISDN for more than 15 years, but there is little doubt that the technology is outdated and that we need to move on. In fact, although the connection to Swisscom is ISDN, we switched ISDN off internally back around 2007 and moved everything to VoIP. Moving 100% to VoIP should not be an issue.

To begin with, I contacted Swisscom to inquire whether they provided a plain SIP service. “Certainly, we offer a SIP trunk service”. That’s great, I thought, but it turned out this was only available with a Swisscom uplink too, which we don’t have and don’t want. Then I started looking at other VoIP providers. A few years back I had already played with a service from www.sipcall.ch, so I looked them up again.

In June 2015 I decided it was time to get serious, so I created a new account with Sipcall, and routed all our outbound private telephony over that. After a while of testing and trying to get a feel for the Sipcall stability (which is fine), in October 2016 I decided to go the whole way and have our private numbers ported (Vollportierung) to Sipcall. It went very smoothly, no problems at all. Now, about a year later, I have just received confirmation of the porting of our business setup to Sipcall too.

For the business line, we have a few issues to resolve:

  • sending SMSes
  • diverting calls
  • sending/receiving faxes.

Sending SMSes

For critical alerts in the datacentre, we generate SMSes and send them over the Swisscom SMSC (SMS service centre). Sipcall is not a mobile operator, hence they don’t offer an SMSC service. I wrote a separate entry on the technical solution; the short version is that we have reverted to the old-fashioned solution of using a USB GSM modem with a pre-paid subscription.

Diverting calls

With ISDN, you have the option of diverting calls. This is done in the ISDN exchange, not in our local Asterisk. When a call for a diverted number is received, the exchange is signalled to divert to another number. The advantage over rerouting with Asterisk is that the new recipient will see the original caller-id (instead of our own numbers).
I have not yet investigated how to do this with our new SIP/Sipcall setup.

Sending/receiving faxes

Telefax is a very old-fashioned technology and we actually have very little use for it – nonetheless, our Asterisk setup is able to send and receive telefaxes, they are passed to Hylafax and routed to a recipient by email (with the fax attached in PDF format). We still receive faxes quite regularly, maybe once a week, although only spam and advertising. It’s been years since anybody sent a fax from our office.
I have not investigated this in detail, but according to various articles I have read, sending and receiving faxes is not possible with SIP. I doubt it’ll be a great loss, and testing it with Sipcall is not a priority.

ECC memory errors

Yes, they do happen, and you may not even know about it. ECC memory will automatically correct 1-bit errors, but each correction should still raise an alarm, e.g. via SNMP. Depending on your hardware and your setup, it may or may not. Sometimes a server will start misbehaving without having given any indication of a memory issue. In a recent case, an HP Proliant server had an episode of random restarts about a year back, but then apparently recovered and subsequently showed no symptoms. Until last week, when it became really unstable. It would run fine for a few hours, then lapse into a series of restarts. Only a cold start would get it running again. We were preparing to replace it when one of my engineers noticed it would also fail during the POST memory check. Not consistently, just every now and then. We had no window for running a full memory test, so we just yanked the complete set of memory modules and replaced it – problem fixed.

Lesson learned: ECC memory errors are rare, but can cause havoc without any clear indications. Keep it in mind.
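On Linux, one way to keep it in mind is to look at the corrected-error counters that the kernel EDAC subsystem exposes in sysfs. The following is a minimal sketch, not what we actually run: it assumes an EDAC driver for your memory controller is loaded (which is not a given on every server), and the function name edac_corrected_total is made up for illustration.

```shell
# Sum the corrected-error counters (ce_count) of all memory controllers
# found below an EDAC sysfs directory. The directory can be overridden,
# the default is the usual kernel location.
edac_corrected_total() {
    base=${1:-/sys/devices/system/edac/mc}
    total=0
    for f in "$base"/mc*/ce_count; do
        # Skip silently if no controller is present or readable.
        [ -r "$f" ] || continue
        n=$(cat "$f")
        total=$((total + n))
    done
    echo "$total"
}

# Possible use from cron (hypothetical alerting):
#   [ "$(edac_corrected_total)" -gt 0 ] && logger -p kern.warning "ECC corrected errors seen"
```

A result of 0 only means that no corrected errors were counted since boot, or that no EDAC driver is active. It does not prove the memory is healthy.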

What’s in a hostname?

When we install a new machine, be it a desktop, a server, a Xen guest or something else, one of the first things to decide is what to call it, i.e. the hostname.

New hostnames
When setting up a new system, sometimes it’s difficult to come up with a good hostname, at other times they just appear out of nowhere. On this page I am maintaining a collection of hostnames. Some we have already used, some are new, some may never be used.

What qualifies a “good” hostname?
First and foremost, a hostname has to be memorable – this makes it easier to associate a particular function with it. Depending on the language used, a hostname should also be easy to pronounce, hence not too long. I prefer two-syllable names, but up to three syllables is acceptable too.

Category “City names”
oxford, leningrad, hamburg, newyork, chicago, sydney, dresden, toulouse, marseille, copenhagen, stockholm, cologne, heidelberg
madrid
bremen
london

Uncategorised
frost – “Frost”, a long-running ITV detective-series. DCI = Detective Chief Inspector.
waitrose – a UK supermarket chain
dartmoor
exmoor
camus
sherlock
watson
nietsche
gaillard
newton
chronicle
journal
baileys
johnny
walker
kepler
davinci
bradbury
bentley
veyron
bugatti
newton
dupont
haddock
tintin
spirou
figaro
gaston
lagaffe
tintin
idefix
majestix
obelix
asterix
hathaway
saopaulo
hobson – Laura Hobson, the pathologist in the TV-series “Lewis”.
greengrass – character in “Heartbeat”, ITV TV-series.
heartbeat
brideshead
vogelsang
hospital
emergency
catscan
elleryqueen
celcius
fahrenheit
wychwood – village in a Miss Marple mystery
theodora
madrid
moriarty
mycroft
sherlock
redheat
midsommer
glasgow
kestrel
redkite
orchid
flieder
elderberry / holunder
eagle
watson
frankfurt
sierra
kiowa
apache
sioux
stuttgart
rotterdam
amsterdam
rosewood
cherrywood
watchmaker
smith
fletcher
tailor
tinker
cobbler
joiner
dentist
psychic
gabriel
debenham
rowntree
hemlock
earlgrey
lipton
pgtips
tetley
guiness
chess
ludo
mozart
kinabalu
nietsche
kierkegaard
salman
rushdie
jensen
hansen
nielsen
rasmussen
hopper (in honour of former USN Rear Admiral Grace Hopper)
turing (in honour of Alan Turing).
deveraux
shorewood
turnbull
northmoor
southmoor
corrigan
tallyho
rotholz
rotkreuz
woodentop
blenheim
carmine
deadwood
chestnut
cherry

Upgrading our cooling system

It’s been almost 8 years since we moved the datacentre and had a new air-conditioning system installed. Overall it’s been working fine, but we’re nearing maximum capacity and need to consider upgrading. As it also failed miserably about two weeks ago, and we have some decent winter temperatures (coldest January in 30 years), now seems a good time to look at replacing it.

Things to consider for a new system
– size/capacity
– redundancy
– free cooling
– supplier

Supplier
I was never really happy with our current supplier (no need to name them). Originally, we chose them because they had already installed and maintained an existing cooling system in our buildings, but in retrospect it should have been clear that they had little or no experience with cooling a datacentre, not even small server-rooms. I don’t blame them; it was really our fault for not taking the time to do our research and solicit a few proposals from different suppliers.
There are plenty of companies who do this type of thing, so for starters I’ve solicited proposals from three of them.

Capacity and redundancy
As Niels Bohr so wisely said, “it is difficult to predict, especially the future”. Trying to decide what we might need in the next 2-3 years is not easy at all. Our datacentre is not that big, only some 40 m², and there is still plenty of room for expansion. Current power consumption is about 12kW plus the cooling system. The UPS is 15kW for now, but we already have a 30kW model in storage, so I am thinking we should aim for 20-25kW capacity for cooling.
As we’re expecting a new cooling plant anyway, redundancy is also a topic to consider. Essentially redundancy means a double system, so twice the investment.

Free cooling
When the current cooling plant broke down, we hauled out a big fan and installed it in the doorway. It simply draws in cold outside air through two 25cm venting holes, thereby easily maintaining a steady temperature of 15°C (outside -5°C). This has now been cooling the datacentre for about two weeks. It seemed reasonable to look for solutions for cooling with outside air, and after googling a while, I found the expression “free cooling”.
Free cooling devices are more expensive than plain air conditioning devices. At least that is my impression; I have not actually had any proposals yet. Some are very big, intended for large datacentres (hundreds of m²) with double flooring etc., but some also come in fairly compact sizes, intended for up to 30kW. The general idea is to reduce electricity consumption by utilising the outside air for as long as its temperature stays below the target temperature for the datacentre. I have just today spoken to one possible supplier; it’ll be interesting to see where this leads.

Using Let’s Encrypt from a central server

At ENIDAN, we have chosen to have a central server for managing Let’s Encrypt (LE) certificates. This is just a simple summary of our setup; perhaps somebody might be inspired.

To be honest, there is not much information here you will not find on the LE websites or elsewhere. I have not added much, no black magic or clever code. I’m only describing our Let’s Encrypt setup with a single central server.

Overall reasoning

There are plenty of alternative plugins available, but the basic LE setup appears to be focused on an environment where it will be run on the webserver where the certificates are needed.

When I started contemplating using LE, it wasn’t really appealing to me to have the LE scripts and directory structures installed in multiple locations; instead, I wanted to have a central location, a core server, for dealing with managing (e.g. issuing and renewing) LE certificates. Also, certificates are not used only by webservers, but also for TLS in email-exchange, signing emails, VPNs and probably a few other applications I haven’t thought of. I wanted to avoid installing webservers in all of those locations.

Core server

In our case, the core server is just a xen guest running apache. It could just as easily be a container (LXC, Docker etc.), but we already had a central server with apache running (doing other stuff). I added a virtual host, hostname = “core123.example.com”, whose sole purpose is to respond to the domain validation challenges from LE, i.e. serve the keys stored in .well-known/acme-challenge.

Webservers

Our webservers that need certificates are configured to proxy the domain-validation challenge to the core server:

ProxyPass /.well-known/acme-challenge/ http://core123.example.com/.well-known/acme-challenge/

You can obviously do this with URL rewriting too, but ProxyPass is really enough unless you need to add conditions.

Mail and other servers

Other servers with no webservice listening on port 80 are also “proxied” to the core server, just slightly differently:

iptables -A PREROUTING -t nat -p tcp --dport http -j DNAT --to-destination core123.example.com

An example server might be mail.example.com. We do not add this as a hostname to the Apache config on the core server, we just let it use the default vhost.

Getting a new certificate

To acquire a new certificate, on “core123.example.com” run this:

certbot-auto certonly --webroot -w /srv/www/vhosts/core/htdocs/ -d mail.example.com

The certificates are left in /etc/letsencrypt/live/mail.example.com/. For the time being, we copy them manually to their destinations.

Certificate renewal

Certificate renewal is run via cron twice daily on the core server (as per the LE recommendation, see section “Automating renewal”). Using “--renew-hook”, we call a script that currently only notifies us when a certificate has been renewed; we then manually update the affected server with the new certificates. We intend to extend this to copy new/renewed certificates to their destinations automatically.
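We have not written the automatic copying yet, but a renewal hook for it could look roughly like the sketch below. When certbot runs such a hook, it sets the environment variable RENEWED_LINEAGE to the live directory of the renewed certificate. Everything else here is an assumption made for illustration: the function name deploy_cert, the staging directory, and the idea of pushing from there to the final destination.

```shell
# Copy the certificate chain and private key of one renewed lineage into a
# staging directory named after the lineage (hypothetical helper).
deploy_cert() {
    lineage=$1      # e.g. /etc/letsencrypt/live/mail.example.com
    dest_root=$2    # hypothetical staging directory
    name=$(basename "$lineage")
    mkdir -p "$dest_root/$name" || return 1
    # -L follows the symlinks that certbot keeps in the live directory.
    cp -L "$lineage/fullchain.pem" "$lineage/privkey.pem" "$dest_root/$name/" || return 1
    chmod 600 "$dest_root/$name/privkey.pem"
}

# Possible use inside the hook script (staging path is made up):
#   deploy_cert "$RENEWED_LINEAGE" /srv/certs-staging
# followed by e.g. scp or rsync to the server that uses the certificate,
# and a reload of the service there.
```

Copying to a staging directory first keeps the hook simple and lets the actual distribution (and service reloads) be handled separately, but that is a design choice, not something LE prescribes.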

Summary

To sum up, with two very simple means (iptables DNAT and Apache Proxying) we have managed to keep the LE management on a single server and still cater to both webservers and non-webservers. The distribution of certificates still needs some work, but nothing major.

Hello world!

A “Hello world” post seems like a perfect way to begin our new engineering blog. I’ve had this idea nagging at me for a while now, but despite us having set up plenty of WordPress sites for our customers, we never really found the time to start one ourselves. Well, here it is – the idea is to give our customers, friends, families and anyone else who’s interested an insight into the comings and goings of an IT engineering firm. All ENIDAN employees are free to contribute, let’s see what happens.

Per Jessen, CEO, ENIDAN Technologies GmbH