Running iOS Simulators In The Cloud

Crossfader for iOS Saving A Cross

As part of our ongoing effort to build a better end-user experience for the Crossfader iOS app, we wanted to move cross audio creation out of the client and onto the web.

Every time a user saves a cross they like, they trigger a 1-2 MB audio file upload. We can’t assume that a client will have a reliable internet connection to upload assets in a reasonable amount of time. In practice, we found that something like 50% of crosses never got their media to the server. This, in turn, introduced a ton of complexity in various parts of the Crossfader ecosystem.

To solve this, we wanted a “cross server”: the audio counterpart to an image assets server. I had put an image-generation server into production a few weeks earlier to generate cross artwork on demand.

In short, given a particular URL, the server should return a rendered, mixed audio file, using assets stored on the main API. Once a rendered file has been returned, it gets cached by the CDN forever.
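To make that contract concrete, here is a minimal sketch in Python. It is not the real cross server (that ended up being the iOS app itself, as described below), and the URL scheme, the `audio/mp4` content type, and the `render_cross` function are all assumptions made up for illustration. The part that matters is the `Cache-Control` header, which tells the CDN it can hang on to the file for as long as it likes.

```python
from http.server import BaseHTTPRequestHandler

# One year in seconds; a common stand-in for "forever" in Cache-Control.
ONE_YEAR = 365 * 24 * 60 * 60


def cache_headers(content_length):
    """Headers telling a CDN it may keep the rendered file indefinitely."""
    return {
        "Content-Type": "audio/mp4",  # assumption: the real format isn't stated
        "Content-Length": str(content_length),
        "Cache-Control": f"public, max-age={ONE_YEAR}, immutable",
    }


def render_cross(cross_id):
    """Hypothetical stand-in for the audio engine; returns placeholder bytes."""
    return f"rendered audio for cross {cross_id}".encode()


class CrossHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hypothetical URL scheme: /crosses/<id>.m4a
        cross_id = self.path.rsplit("/", 1)[-1].split(".")[0]
        body = render_cross(cross_id)
        self.send_response(200)
        for name, value in cache_headers(len(body)).items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, something like `HTTPServer(("127.0.0.1", 8080), CrossHandler).serve_forever()` (with `HTTPServer` imported from `http.server`) would serve it on port 8080.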

iPhones In The Cloud

After toying with the idea of building an Ubuntu-friendly version of our audio engine, our lead iOS developer, Devin, took a different route. Namely, a special build of the Crossfader app with its own HTTP server, and no UI.

I started off with an OS X VM image running on my MacBook with the required packages and dev tools. This part was pretty easy – took about 15 minutes worth of setup to get going. With that out of the way, the next step was to get it up and running on an AWS instance.

My first attempt was firing up an Ubuntu box, installing VirtualBox, and trying to copy the VM up to the cloud. I killed that idea as soon as I saw how many hours the upload was going to take. Ain’t nobody got time for that.

Ok… next idea…

I did a bit of Googlin’ and came across an existing OS X VM that I thought would work well. Downloading that directly to the EC2 node was much quicker – on the order of a few minutes. Except that the new VM wouldn’t actually boot, no matter what I tried. So I killed that EC2 instance, and tried the same trick again, except on a Windows Server instance. Also, no luck.

Side note: This part was to make sure I exhausted all options, as running a random VM in production is, in a word, dumb.

At that point it looked like running OS X on EC2 was not an option, and I eventually confirmed as much. It just doesn’t work, as far as I’ve been able to tell.

Failed Approach: Directly on OS X

After failing on AWS, I spent some time looking into Mac co-lo options.

The company I chose as my Mac Mini host was pretty easy to get up and running. I paid for the mini, waited a few minutes for it to get provisioned, and was able to VNC right in to take control. From there, I installed VirtualBox again, copied over my VM, and was able to launch OS X, on OS X, from OS X.

How wonderfully meta…

Trying to control the VM through VNC was a nightmare performance-wise, and didn’t seem like a real solution to my problem. Lag and general slowness would only compound with many VMs, and running many VMs was a key need of this whole project.

So I did some more Googlin’, which led me to discover VirtualBox Guest Additions (like I said… totally new to this…) which would drastically improve guest OS performance on the host machine. Unfortunately, that didn’t make a bit of difference for some reason.

So there I was, with a Mac Mini, running 1 iOS simulator, and a broken VM.

Getting Warmer: ESXi

I realized my fundamental mistake was trying to run a VM on top of an OS directly – a rookie move. After much reading, I discovered what I needed was a low-level hypervisor to sit between the metal and the VMs.

My hosting provider had this as an option during setup (‘choose your OS’) which I missed the first time around. After submitting a ticket and waiting for a few hours, they were able to relaunch the mini with ESXi running.

But shoot. There is no client to manage ESXi for Mac. Just Windows.

I fired up another Windows Server node on EC2. From there, got in via Remote Desktop, installed the vSphere client for Windows, and was back in to the Mac Mini.

Uhh… wat?

Success: Multiple VMs!

Once I was able to get the hang of the process for creating a VM through ESXi, I launched my first instance of OS X. Next, I had to do a little bit of singing and dancing to get more copies of the same pre-configured OS. I had expected there to be a “Clone this VM” button, but there wasn’t. (Though according to this documentation there might actually be a way. Didn’t ever end up working for me though.)

Instead, the process involved getting SSH access to the Mac Mini and using a provided command line tool to duplicate the VM files. From there, add them to the inventory listing back in the Windows client.

Word of warning: don’t try to cp the files around. For some reason, that results in corrupted virtual disks, and is slower than the provided duplication tool. See this blog post for the right way to do things.
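For the curious, here is roughly what the disk duplication step looks like, sketched in Python. I haven’t named the tool above, so treat this as an assumption: on ESXi the usual candidate is `vmkfstools`, whose `-i` option clones a virtual disk. The datastore paths and VM names below are hypothetical, and the destination directory is assumed to already exist.

```python
import shlex


def clone_disk_command(source_vmdk, dest_vmdk, disk_format="thin"):
    """Build the argv for cloning a virtual disk on the ESXi host.

    Assumes vmkfstools is the duplication tool; -i clones the source disk
    and -d picks the destination disk format.
    """
    return ["vmkfstools", "-i", source_vmdk, dest_vmdk, "-d", disk_format]


# Hypothetical paths: one base VM, one copy.
cmd = clone_disk_command(
    "/vmfs/volumes/datastore1/osx-base/osx-base.vmdk",
    "/vmfs/volumes/datastore1/osx-2/osx-2.vmdk",
)
print(shlex.join(cmd))
```

The printed line is what you would run over SSH on the host, once per copy, before adding the new VM to the inventory.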

Fail: The Simulator Keeps Crashing

Ok. So. Now that I had my 4 copies of OS X living on the same machine, it was just a matter of launching Xcode and the iOS Simulator. Except when launching the Crossfader app in the simulator, it would immediately crash. Each time, with a cryptic error about a missing audio device.

I reached out again to the hosting company, but they let me know I was pretty much on my own at this point. After (you guessed it…) even more looking around, I found this blog post about setting up audio devices for hosts managed by vSphere.

Editing through the UI doesn’t work, though. Be sure to enable SSH access for your vSphere box, and then edit the .vmx file from there.
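A .vmx file is just lines of `key = "value"`, so the edit is easy to script. Below is a small sketch of an "upsert" helper in Python. The two audio keys are an assumption on my part (they are the standard VMware options for a virtual sound card, but I can’t promise they match the linked post exactly), and the sample file contents are made up.

```python
import re

# Assumed settings for a virtual audio device; check them against your setup.
AUDIO_OPTIONS = {
    "sound.present": "TRUE",
    "sound.virtualDev": "hdaudio",
}


def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with `key = "value"` set, replacing any existing entry."""
    line = f'{key} = "{value}"'
    pattern = re.compile(rf"^{re.escape(key)}\s*=.*$", re.MULTILINE)
    if pattern.search(vmx_text):
        # A function replacement avoids backslash handling in `line`.
        return pattern.sub(lambda _match: line, vmx_text)
    if vmx_text and not vmx_text.endswith("\n"):
        vmx_text += "\n"
    return vmx_text + line + "\n"


# Made-up starting point: audio explicitly disabled.
vmx = 'displayName = "osx-1"\nsound.present = "FALSE"\n'
for key, value in AUDIO_OPTIONS.items():
    vmx = set_vmx_option(vmx, key, value)
print(vmx)
```

Make the change while the VM is powered off, and keep a backup copy of the original .vmx in case the VM refuses to boot afterwards.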

Post-Deploy Monitoring

This was a tricky problem, to be sure. At first, we resorted to using AppleScript to launch Xcode and the simulator when the virtual machine rebooted. Gross.

That, plus a CloudWatch alert to let me know when the whole thing needed a reboot worked fine at first. Keeping servers online by hand sucks though.

A few weeks after the initial launch, while I was looking into automated testing options for iOS, I found a cool solution. The Sim Launcher gem makes it easy to launch simulator apps direct from the command line. Somewhere along the line, Apple removed this feature from their suite of dev tools.

By using the Sim Launcher gem and a simple Sinatra server, I could manage the state of the simulator remotely. I made a small modification to my CloudWatch alert so that it not only sends me an SMS and email, but also pings the monitoring server to trigger a restart.
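The monitoring server itself is tiny. The real one is Ruby (Sinatra plus Sim Launcher); what follows is a sketch of the same idea in plain Python, so the route names and the `restart` callable are stand-ins rather than what is actually deployed. The routing lives in a plain function so it can be exercised without opening a socket.

```python
from http.server import BaseHTTPRequestHandler


def handle_request(method, path, restart):
    """Route one request and return (status_code, body).

    `restart` is whatever callable relaunches the simulator; it is only
    invoked for POST /restart (a hypothetical route name).
    """
    if method == "POST" and path == "/restart":
        restart()
        return 202, "restarting\n"
    if method == "GET" and path == "/health":
        return 200, "ok\n"
    return 404, "not found\n"


def make_handler(restart):
    """Build an http.server handler class bound to a restart callable."""

    class MonitorHandler(BaseHTTPRequestHandler):
        def _respond(self, method):
            status, body = handle_request(method, self.path, restart)
            payload = body.encode()
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def do_GET(self):
            self._respond("GET")

        def do_POST(self):
            self._respond("POST")

    return MonitorHandler
```

Passing `make_handler(some_restart_function)` to `http.server.HTTPServer` gives a server that an alert hook can POST to. A real deployment would want some kind of shared secret on `/restart`, since anyone who can reach it can bounce the simulator.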

The Final Setup

The final rig ended up looking like:

  • A Windows Server box on EC2 to act as ESXi manager.
  • A Mac Mini with 4 VMs
  • Each VM has a cross server and a monitoring server
  • Our general-purpose nginx box to load balance between VMs
  • A whole bunch of Route 53 DNS entries to give names to VMs
  • A CloudWatch check for each group of VMs
  • SNS hooks to notify me of failures and reboot those instances.
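On the receiving end of those SNS hooks, the job is to decide whether a delivery means "something is down". Here is a hedged sketch of that decision in Python. It relies on the documented shape of SNS HTTP deliveries (a JSON envelope with `Type` and `Message`) and of CloudWatch alarm messages (`AlarmName`, `NewStateValue`); the function name and the sample alarm name are made up, and signature verification is left out to keep it short.

```python
import json


def alarm_to_restart(sns_body):
    """Return the alarm name if this SNS delivery reports an alarm firing, else None."""
    envelope = json.loads(sns_body)
    if envelope.get("Type") != "Notification":
        return None  # e.g. a SubscriptionConfirmation message
    try:
        alarm = json.loads(envelope.get("Message", ""))
    except json.JSONDecodeError:
        return None  # plain-text message, not a CloudWatch alarm
    if not isinstance(alarm, dict):
        return None
    if alarm.get("NewStateValue") == "ALARM":
        return alarm.get("AlarmName")
    return None


# Hypothetical delivery for an alarm named "cross-server-vm1".
sample = json.dumps({
    "Type": "Notification",
    "Message": json.dumps({
        "AlarmName": "cross-server-vm1",
        "NewStateValue": "ALARM",
    }),
})
print(alarm_to_restart(sample))
```

Keying the restart off the alarm name is one way to map a failing check back to the VM that needs a kick.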

The Verdict

The most recent version of Crossfader that we shipped to the App Store stopped sending rendered media to the API when a user posts a cross. This means that every user’s crosses are now rendered server-side. So far there haven’t been any real issues, despite getting a few “OMG THE SERVER CRASHED FIX IT” alerts. By the time I make it to a computer, the problem has usually healed itself.

I’m sure I’ll end up eating my words soon enough, but it feels like a success so far.

Improvements For The Future

There is a lot of room for automation here. Being able to add a new Mac Mini host and then use Chef to provision new VMs would be ideal. My biggest worry at this point is being able to deal with traffic spikes in a sane way, which I can’t do just yet.

Another nice-to-have is the ability to restart an instance of the cross server after a certain number of requests. The iOS simulator isn’t stable over long periods, so pre-empting crashes with defined restarts means less to worry about.
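The bookkeeping for that is simple enough to sketch. This is a hypothetical Python helper, not something we run today: it counts requests and reports when a planned restart is due. It is not thread-safe, so a threaded server would need a lock around `record`.

```python
class RequestBudget:
    """Counts handled requests and signals when a planned restart is due."""

    def __init__(self, max_requests):
        if max_requests < 1:
            raise ValueError("max_requests must be at least 1")
        self.max_requests = max_requests
        self.handled = 0

    def record(self):
        """Record one request; return True once the budget is used up."""
        self.handled += 1
        return self.handled >= self.max_requests

    def reset(self):
        """Start counting again after a restart."""
        self.handled = 0


budget = RequestBudget(max_requests=3)
restarts = 0
for _ in range(7):
    if budget.record():
        restarts += 1  # a real server would relaunch the simulator here
        budget.reset()
print(restarts)  # → 2
```

The right value for `max_requests` would have to come from watching how long the simulator actually stays healthy under load.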

Also, figuring out a better bridge between the GUI and command line stuff would be great. The weird mix of doing some things through Remote Desktop and others through SSH gets annoying.