Dev blog 16 June 2017

Fun with server hosts...

I've already written a couple of times about our ongoing server management work. As we gear up for the F2P release, we have a few challenges still to solve. First, we need to have servers available in as many regions as possible so that people all over the world can have a good game experience. Second, we need to be able to handle a somewhat unknown level of demand. Third, we need to do both of these without bankrupting ourselves with hosting costs.

I'm going to make an assumption for the rest of this blog, that there will be 5000 simultaneous players online. This isn't actually the estimate we're working with internally, but it makes the maths easier and lets me provide some example figures to put things into perspective. For comparison, a game like Rocket League hits around 50,000 players simultaneously (on PC), and CS:GO is well over 500,000.

If you have an infinite money pot, solving the first problem is easy. You go to server hosts in every part of the world, and hand over a big pile of money to have lots of dedicated servers running 24/7. 5000 players all playing 5v5 means 500 servers (and more if there are 3v3 matches). 500 servers at approx €5 per month is €30,000 per year. This is far beyond our meagre budget.
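The arithmetic above is simple enough to sketch out, using the example figures from this post (5000 players, 5v5 matches, roughly €5 per server per month):

```python
# Back-of-envelope cost of keeping fixed dedicated servers running 24/7,
# using the example figures from this post.
players = 5000
players_per_match = 10        # 5v5, one match per server
cost_per_server_month = 5     # EUR, approximate

servers_needed = players // players_per_match
monthly_cost = servers_needed * cost_per_server_month
yearly_cost = monthly_cost * 12

print(servers_needed)  # 500
print(yearly_cost)     # 30000
```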

What if we guessed wrong? Maybe we don't get that many players, in which case we spent lots of money to have servers idling. What if we get more players? Then we have lots of people seeing error messages, not playing, and getting angry.

Furthermore, those 5000 players would not be spread evenly over the planet. It'll be 5000 in Europe at one time, then later on 5000 in the USA and so on as the world turns. Using the example above, we wouldn't have enough servers in each region to meet demand, but would have too many servers idling for half the day as people on that side of the world slept.

So the alternative is to try and dynamically scale the number of servers we run according to current regional demand. There are a few companies that offer such elastic systems, such as Amazon, Digital Ocean, and Vultr. These guys let you start up and stop servers on demand, charging by the hour of CPU time. It then becomes the job of our masterserver to detect that demand, and figure out how many game servers should be running in each region. We are currently writing a module in our masterserver to do just that.
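The core of that decision can be sketched quite simply. This is a hypothetical illustration of the idea, not our actual module; the function names and the headroom factor are made up for the example:

```python
# Hypothetical sketch: turn current regional player counts into target
# server counts. PLAYERS_PER_SERVER and HEADROOM are illustrative values,
# not figures from our real masterserver.
import math

PLAYERS_PER_SERVER = 10   # one 5v5 match per server
HEADROOM = 1.2            # spare capacity so new matches can start immediately

def desired_servers(players_in_region: int) -> int:
    """How many servers a region should run for its current player count."""
    needed = math.ceil(players_in_region * HEADROOM / PLAYERS_PER_SERVER)
    return max(needed, 1)  # keep at least one warm server per region

def scaling_plan(demand: dict[str, int]) -> dict[str, int]:
    """Map region -> target server count, given region -> current players."""
    return {region: desired_servers(players) for region, players in demand.items()}

plan = scaling_plan({"eu-west": 3200, "us-east": 900, "sa-east": 40})
# e.g. {"eu-west": 384, "us-east": 108, "sa-east": 5}
```

A real version would also need hysteresis, so that small fluctuations in demand don't cause servers to be started and stopped every few minutes.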

There are a couple of drawbacks with this approach. Each of these providers has a different way of managing their server instances. This means that for each provider we want to use, we have to write a new bit of server management code to talk to that provider's API. Also, no single provider offers great latency in all the regions we want to cover. In the short term, we're going to pick the one that hits as many regions as possible, and offers at least acceptable latency in the rest. Once we have one system running, we can add another provider if it looks like there is enough demand in a region to justify it.
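One common way to keep the per-provider code contained is to hide each provider's API behind a small shared interface. The sketch below is an assumption about how such a layer might look; the class and method names are hypothetical, not from any provider's real SDK:

```python
# Hypothetical provider abstraction: the masterserver talks only to this
# interface, and each hosting provider gets its own implementation.
from abc import ABC, abstractmethod

class ServerProvider(ABC):
    """The operations the masterserver needs from any hosting provider."""

    @abstractmethod
    def start_instance(self, region: str) -> str:
        """Boot a game server in the region and return its instance id."""

    @abstractmethod
    def stop_instance(self, instance_id: str) -> None:
        """Shut the instance down so we stop paying for it."""

    @abstractmethod
    def list_instances(self, region: str) -> list[str]:
        """Instance ids currently running in a region."""

class FakeProvider(ServerProvider):
    """In-memory stand-in, handy for testing the scaling logic offline."""

    def __init__(self) -> None:
        self._instances: dict[str, str] = {}  # instance id -> region
        self._next_id = 0

    def start_instance(self, region: str) -> str:
        instance_id = f"srv-{self._next_id}"
        self._next_id += 1
        self._instances[instance_id] = region
        return instance_id

    def stop_instance(self, instance_id: str) -> None:
        del self._instances[instance_id]

    def list_instances(self, region: str) -> list[str]:
        return [i for i, r in self._instances.items() if r == region]
```

Adding a second provider later then means writing one more subclass, rather than touching the scaling logic itself.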

On a final note, as part of this work we've discovered some interesting quirks with global routing. For example, did you know a player in Buenos Aires, Argentina would get a lower ping to a server in Miami (7000km away) than to a server in São Paulo, Brazil (1600km away)?