We are updating our server infrastructure. From September 2014, it will be significantly upgraded to increase the levels of security and remove any single point of failure, so that Lamplight is available whenever you need it. This page gives a moderately technical description of the setup.
Everything is automated
Automated operations are faster and more reliable. You can repeat the same steps over and over without adding any human error along the way. And importantly, the scripts used to provision servers can be audited to ensure configurations are correct.
Server provisioning is scripted, so when we need to add servers, or if we need to rebuild the entire stack on new machines, we can get the raw servers up and running in one operation. We use ElasticHosts as our primary server provider, and we have created and released as open-source software a utility to set up multiple servers programmatically. The main cluster of servers is located near Portsmouth, UK, in a Tier 3 data centre with ISAE 3402 Type II accreditation.
We then use Ansible playbooks to set the raw servers up and install necessary software. We also use Ansible to deploy new versions of Lamplight without any interruptions to service.
What the servers do
There are multiple layers of servers, with at least two in each layer, so that if one has problems the others can take over. Each server is on separate hardware from the others, so if there are hardware problems on one machine, the second won’t be affected. In addition, within our main data centre, your data is replicated in real time across several machines, so if one has problems or slows down, the others can take over until the problems are resolved.
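As a rough illustration of the idea of redundant layers, the sketch below picks the first healthy server in each layer and falls through to the next one if it is down. This is not our actual routing code: the layer names, server names and the is_healthy check are all hypothetical, chosen only to show the principle.

```python
from typing import Callable, Dict, List


def pick_server(layer: List[str], is_healthy: Callable[[str], bool]) -> str:
    """Return the first healthy server in a layer, in order of preference."""
    for server in layer:
        if is_healthy(server):
            return server
    # Only reached if every server in the layer is down at the same time.
    raise RuntimeError("no healthy server available in this layer")


def route(layers: Dict[str, List[str]],
          is_healthy: Callable[[str], bool]) -> Dict[str, str]:
    """Choose one healthy server per layer."""
    return {name: pick_server(servers, is_healthy)
            for name, servers in layers.items()}


# Hypothetical layout: two servers per layer, each on separate hardware.
LAYERS = {
    "web": ["web-1", "web-2"],
    "database": ["db-1", "db-2"],
}
```

With "web-1" marked as down, route(LAYERS, ...) would select "web-2" for the web layer while the database layer carries on using "db-1".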
As well as having more servers, we’re upgrading their specifications to increase the computing power, and storing data on encrypted solid-state disks (SSDs), which are faster than conventional hard disks.
Separate provider, separate site
We also run an off-site replica of the main system using a different server provider, Linode, at a different UK data centre. This server continuously receives data updates from the main data centre. If our main data centre went down (i.e. all servers in the centre became unavailable due to network problems, fire, meteorite strike…), this off-site system would function as a read-only system with fully up-to-date data, meaning that you could continue to access your data while the primary system is restored. It also means that if some disaster befell ElasticHosts (we’re not quite sure what this would be, but still) we could continue to operate seamlessly.
Finally, this off-site replica serves as a backup source, so we can take off-site system snapshots for backup purposes without affecting the main service.
All data on the off-site system is encrypted.
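The read-only fallback described in this section can be sketched roughly as follows. It is a simplified illustration under stated assumptions rather than our production logic: the backend names "primary" and "offsite-replica", the ReadOnlyModeError exception and the choose_backend function are all hypothetical.

```python
class ReadOnlyModeError(Exception):
    """Raised when a write is attempted while only the replica is available."""


def choose_backend(operation: str, primary_available: bool) -> str:
    """Decide which system should handle a request.

    operation is either "read" or "write" (hypothetical labels).
    """
    if operation not in ("read", "write"):
        raise ValueError("unknown operation: " + operation)
    if primary_available:
        # Normal case: the main data centre handles everything.
        return "primary"
    if operation == "read":
        # Main data centre is down: reads are served by the off-site replica.
        return "offsite-replica"
    # Writes are refused until the primary system is restored.
    raise ReadOnlyModeError("primary unavailable; system is read-only")
```

The design choice here is that reads keep working during an outage, while writes fail loudly instead of being silently accepted by a system that cannot store them.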
We use a server monitoring system that continuously checks on the health and security of all the servers, and notifies us if there are any problems.
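To give a flavour of what such monitoring involves, here is a minimal sketch of one check cycle. It does not describe the monitoring product we use. The probe and notify callables and the server names are hypothetical stand-ins for a real health check and a real alerting channel.

```python
from typing import Callable, List


def find_unhealthy(servers: List[str],
                   probe: Callable[[str], bool]) -> List[str]:
    """Return the servers whose probe fails or raises an error."""
    unhealthy = []
    for name in servers:
        try:
            ok = probe(name)
        except Exception:
            # A probe that cannot complete is treated as a failed check.
            ok = False
        if not ok:
            unhealthy.append(name)
    return unhealthy


def run_check(servers: List[str],
              probe: Callable[[str], bool],
              notify: Callable[[str], None]) -> List[str]:
    """Run one check cycle and send a single alert if anything is wrong."""
    unhealthy = find_unhealthy(servers, probe)
    if unhealthy:
        notify("Unhealthy servers: " + ", ".join(unhealthy))
    return unhealthy
```

In practice a cycle like this would be repeated on a schedule, with probe covering security checks as well as availability.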