On Sunday Team Eureka released the new version (17250) of the custom Chromecast firmware Eureka-ROM. What followed was not pretty, and cost some people some time and money.
I’ll note here I’m not attacking Team Eureka. You know the risks when you’re rooting anything, or the risks when you’re connecting anything to your network. This just blew up this weekend and was bad, and it goes to reinforce a point I made yesterday about software distribution models.
As a preface – I’ve been extremely cautious about my bandwidth usage ever since I used 480GB in about 10 days due to Google Music and an issue with a Foscam camera, neither of which was rooted. I keep track of my usage because after three overages on my data plan they start charging you in $40 increments.
On Thursday of last week I was at ~70GB (out of 300) used for the month. I left on a camping trip Friday, and when I got back I had used 92% of my bandwidth for the month and was getting calls from Comcast’s bandwidth information robot.
Approximately 140 Chromecasts that had been rooted and had the custom Eureka ROM firmware placed on them attempted to receive the update. Eureka rolls them out slowly so that if there are any problems they can update the distribution ROM and not give everyone buggy software.
Since single-source HTTP downloads are notoriously unreliable, they include a separate MD5 file to verify the integrity of the ROM the user downloads before attempting to flash it. Unfortunately the MD5 file contained the wrong information (possibly the MD5 of the previous version; that info was not mentioned in the forum). This caused an MD5 mismatch.
From sometime Sunday through Monday 140 Chromecasts downloaded the 104MB ROM, checked the MD5, thought that it was bad and then went to re-download the ROM.
On my side, that was about 200GB worth of data that decided to start downloading when I went on a camping trip. On Team Eureka’s side, they blew through 6.5TB of data in probably a day and a half; put another way, each of the 140 Chromecasts that were in the queue gobbled on average 46.428GB of garbage while people weren’t looking.
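To put those numbers in perspective, here is a quick back-of-the-envelope check. It only uses the figures quoted above (6.5TB, 140 devices, 104MB per ROM); treating 1TB as 1000GB is my assumption, and the function names are made up for illustration.

```python
# Figures quoted in the post; decimal units (1 TB = 1000 GB) are an assumption.
TOTAL_TB = 6.5
DEVICES = 140
ROM_MB = 104


def per_device_gb(total_tb: float, devices: int) -> float:
    """Average data pulled by each device, in GB."""
    return total_tb * 1000 / devices


def downloads_per_device(total_tb: float, devices: int, rom_mb: float) -> float:
    """Roughly how many full ROM downloads that average represents."""
    return per_device_gb(total_tb, devices) * 1000 / rom_mb


print(round(per_device_gb(TOTAL_TB, DEVICES), 3))              # about 46.4 GB each
print(round(downloads_per_device(TOTAL_TB, DEVICES, ROM_MB)))  # about 446 downloads each
```

In other words, the average Chromecast in the queue pulled the same 104MB file several hundred times over.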
Are these the perils of rooting?
No. These are the generic perils of installing any software with Internet access. The same thing happened to me with two non-root applications a couple of months back – Google Music and the Foscam viewer ActiveX plugin. The first was a bug in the pinning of a song, the second a configuration issue with a camera.
This is a simple mistake that there was no code to catch, because single-sourced downloads are so freaking unreliable and the accepted method of software distribution is: download, if it’s broke re-download, repeat until it’s right.
The real problems
In this scenario, because the MD5 file was invalid, even perfectly good ROM downloads were being marked as bad and thrown away. There was no download attempt counter to stop after x number of downloads, and unless you had the TV turned to the channel the Chromecast was on, there was no notification that anything was happening.
Unless you had network monitoring software, the first you would know that a malfunctioning device was eating up all the bandwidth was when you got a call from Comcast.
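The missing attempt counter is a small amount of code. What follows is a minimal sketch of the idea, not Eureka-ROM's actual update logic: `fetch_verified`, `MAX_ATTEMPTS` and the `download` callable are all hypothetical names, and the limit of three is an arbitrary stand-in for "x number of downloads".

```python
import hashlib
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # hypothetical limit; the post only says "x number of downloads"


def fetch_verified(download: Callable[[], bytes], expected_md5: str,
                   max_attempts: int = MAX_ATTEMPTS) -> Optional[bytes]:
    """Try the download a bounded number of times; return None if every attempt fails."""
    for attempt in range(1, max_attempts + 1):
        data = download()
        if hashlib.md5(data).hexdigest() == expected_md5.lower():
            return data
        print(f"attempt {attempt}/{max_attempts}: MD5 mismatch, discarding")
    # Give up instead of looping forever; a real updater could notify the user here.
    return None
```

With a bad MD5 file, a loop like this would have cost each device a few hundred megabytes and then stopped, rather than re-downloading until someone noticed.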
I’m going to jump back on my BitTorrent distribution bandwagon again … as a note I swear I didn’t manufacture this issue just to attempt to sell the point of the article I wrote on Monday and published on Tuesday regarding this.
In the single-source distribution method, when a download fails the entire file needs to be downloaded again. For example, if something right at the start of this 104 megabyte ROM failed, say a CRC-32 failure (it is possible to get a CRC-32 match and still not have the correct data), then the rest of the 104 megabytes is still going to be downloaded.
The much more robust MD5 check then kicks in and can detect whether the signature of the whole file is correct, and if it’s not you discard and start again. So in that scenario we’ve got at least 208 megabytes downloaded to get the correct 104 megabyte file.
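For reference, a whole-file check of this kind looks roughly like the sketch below. I'm assuming the published MD5 file follows the common `md5sum` layout (hex digest first, then the file name); the function names and paths are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a 104 MB ROM is never held in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rom_is_intact(path: str, md5_path: str) -> bool:
    """Compare the ROM's MD5 with the first token of the published .md5 file.

    Assumes md5sum-style content: "<hex digest>  <file name>".
    """
    with open(md5_path, "r") as f:
        expected = f.read().split()[0].lower()
    return md5_of_file(path) == expected
```

Note that the answer is all or nothing: one digest covers the whole file, so a single bad byte (or a wrong MD5 file) condemns all 104 megabytes.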
In the multi-source BitTorrent distribution model, let’s say the first part of the file got messed up somehow during transportation. At the end of the transfer the client goes through and verifies that everything looks ok. In this case we’ve got a couple of bad areas of data. Rather than throwing away 104 megabytes of work, we simply re-request the areas of bad data, replace them, and bam: a two-kilobyte bad block can be replaced without downloading 104MB.
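The per-piece idea can be sketched in a few lines. BitTorrent's original protocol lists a SHA-1 hash for every fixed-size piece of the file, which is what lets a client pinpoint the damage. This is an illustration of that check only, not a BitTorrent client; the 256KiB piece size and the function names are assumptions of mine.

```python
import hashlib
from typing import List

PIECE_SIZE = 256 * 1024  # hypothetical piece length; real torrents choose their own


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> List[bytes]:
    """SHA-1 of each fixed-size piece, as a torrent's metadata would list them."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]


def bad_pieces(data: bytes, expected: List[bytes],
               piece_size: int = PIECE_SIZE) -> List[int]:
    """Indices of pieces that are missing or whose hash does not match."""
    actual = piece_hashes(data, piece_size)
    return [i for i, want in enumerate(expected)
            if i >= len(actual) or actual[i] != want]


good = bytes(1024 * 1024)             # stand-in for the ROM: 1 MiB of zeros
expected = piece_hashes(good)         # what the publisher would distribute
damaged = bytearray(good)
damaged[10] ^= 0xFF                   # corrupt one byte near the start
print(bad_pieces(bytes(damaged), expected))  # [0]: only the first piece is re-requested
```

Only the pieces in that list need to be fetched again; everything else that arrived intact is kept.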
Assuming there is a valid reason to stick to the single-point distribution model, which in this case cost them 6.5 terabytes of bandwidth charges and cost some users overage fees, turn off auto updates on any device you are not checking regularly. Email notification options would also have given most people an indication that this was an issue.
If you’re not watching your device’s data usage on some sort of monitoring console, unplug it when you’re not using it.
Perhaps more caffeine and closer monitoring of the bandwidth used during the initial distribution rollouts would have helped, but I’m not here to bash people who give me free software to play with that makes my Chromecast better.
But really, sticking a BitTorrent downloader (doesn’t even have to seed) in the update logic would have saved the day.
Imagine the usage if they’d done a full rollout.