Wednesday, February 27, 2013

Why Twitter pictures are sideways

We set out to demonstrate WebRTC on an Android device before this year's Mobile World Congress. I'm happy to say we made the deadline and were able to demonstrate some nice desktop-Firefox-to-Android video calls. We only got the demonstration code working shortly before the deadline, though. (In fact, about an hour after we had decided we'd likely not make it, I got to tell everyone to disregard that, because we just had, for a great "stop the press" moment.) So here's the story of how we got hung up for almost a week on something that looked quite trivial at first.

The WebRTC stack in Firefox for Android is based on the upstream webrtc.org codebase. That code is shared between Firefox and Chrome, but as Firefox is so far the only one of the two to use the Android parts, we half expected a fair amount of work before getting anything running. It didn't end up being so bad: we hit a bunch of problems with the gyp build system, and another bunch related to us running all the code off the main thread. One nice example that bit us: calling FindClass off the main Dalvik thread doesn't do what you want, because it then searches the system classloader instead of your application's and so can't find your own classes. But overall, we got some audio and video running quite rapidly. Not perfect, but certainly good enough for a first demo.

That is, aside from one small issue: the video was sideways in portrait mode.

Experienced Android developers who are reading this are surely snickering by now.

The underlying cause of this is that the actual cameras in some devices are rotated compared to the phone shell. You're supposed to probe the Android Camera stack for information about the orientation of the camera, probe the phone's orientation sensors to know how the user is holding it, and rotate appropriately.

If you do a Google search for "Android video rotated" or "Android video sideways" you'll get a lot of hits, so we knew we weren't the first to see this problem. In our case, we're specifically capturing data from an Android Camera object through a preview Surface. There are several threads on this issue on StackOverflow, some of them trying to cheat by pretending to be in landscape mode (a hack that didn't seem feasible for Firefox), but most of them conclude the issue must be solved with the setDisplayOrientation call.
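For reference, here's roughly what that looks like. This is a minimal sketch modeled on the example in the setDisplayOrientation documentation (note that Camera.CameraInfo is itself only available from API Level 9):

    import android.app.Activity;
    import android.hardware.Camera;
    import android.view.Surface;

    public class CameraOrientation {
        // Combine the camera's mounting orientation with the current display
        // rotation to figure out how far the preview must be rotated.
        public static void setCameraDisplayOrientation(Activity activity,
                                                       int cameraId,
                                                       Camera camera) {
            Camera.CameraInfo info = new Camera.CameraInfo();
            Camera.getCameraInfo(cameraId, info);

            int rotation = activity.getWindowManager().getDefaultDisplay().getRotation();
            int degrees = 0;
            switch (rotation) {
                case Surface.ROTATION_0:   degrees = 0;   break;
                case Surface.ROTATION_90:  degrees = 90;  break;
                case Surface.ROTATION_180: degrees = 180; break;
                case Surface.ROTATION_270: degrees = 270; break;
            }

            int result;
            if (info.facing == Camera.CameraInfo.CAMERA_FACING_FRONT) {
                result = (info.orientation + degrees) % 360;
                result = (360 - result) % 360;  // compensate for the front camera mirror
            } else {  // back-facing camera
                result = (info.orientation - degrees + 360) % 360;
            }
            camera.setDisplayOrientation(result);
        }
    }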

There are several downsides to this proposed solution. For starters, it's only available from API Level 8. Given that we have dropped Froyo support (the final WebRTC code will likely require at least Android 2.3, and almost certainly Android 4.0 for the best experience), that's not much of a problem for us, all the more so because Camera support on Froyo and older devices is quite spotty anyway. But it does make me wonder what Camera apps on Froyo are supposed to do.

The second problem is that setDisplayOrientation only changes the orientation of the preview Surface. For a normal Camera app that wants to show users the video they're capturing, this is fine. But that's not what we're doing. We don't want to show the preview picture (in Java) at all! What to do with the data captured through WebRTC getUserMedia is entirely up to the webpage, which means it's handled in JavaScript (Gecko) and is not the business of the Java layer at all. Maybe the JavaScript doesn't even want to show the local video, for that matter. So our application doesn't actually display the Android preview, and rotating it isn't going to make any difference. We just want the raw video data to send off to the video encoder.
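To illustrate, here's a minimal sketch of the kind of capture path we want: grab the raw NV21 frames from the preview callback and hand them off, without ever drawing anything. The nativeDeliverFrame method is a hypothetical stand-in for whatever passes the buffer down to the encoder.

    import android.hardware.Camera;

    public class RawFrameGrabber implements Camera.PreviewCallback {
        // Hypothetical native entry point that hands the frame to the encoder.
        private static native void nativeDeliverFrame(byte[] nv21, int width, int height);

        private final int width;
        private final int height;

        public RawFrameGrabber(Camera camera, int width, int height) {
            this.width = width;
            this.height = height;
            // Pre-allocate a buffer for one NV21 frame (12 bits per pixel).
            camera.addCallbackBuffer(new byte[width * height * 3 / 2]);
            camera.setPreviewCallbackWithBuffer(this);
        }

        @Override
        public void onPreviewFrame(byte[] data, Camera camera) {
            nativeDeliverFrame(data, width, height);  // off to the video encoder
            camera.addCallbackBuffer(data);           // recycle the buffer
        }
    }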

Now, to go off on a small tangent: it turns out that not showing the video preview on an Android device isn't actually reliably possible pre-ICS. We currently use a dummy SurfaceView, which is known not to work on some devices, and will switch to a SurfaceTexture (which is guaranteed to work) on Android 4.0 and newer devices.
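On those newer devices the trick looks something like the sketch below (setPreviewTexture and SurfaceTexture require API Level 11). Since nothing is ever rendered, the GL texture name is arbitrary here; real code would allocate one with glGenTextures.

    import android.graphics.SurfaceTexture;
    import android.hardware.Camera;
    import java.io.IOException;

    public class HeadlessPreview {
        // Give the Camera a preview target that is never displayed anywhere.
        public static void startHiddenPreview(Camera camera) throws IOException {
            SurfaceTexture dummyTexture = new SurfaceTexture(42 /* unused texture name */);
            camera.setPreviewTexture(dummyTexture);
            camera.startPreview();
        }
    }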

The Android Camera API also has a Camera.Parameters.setRotation call, which might rotate the image data, or might just set an EXIF flag saying the image was rotated. This can only work if you're capturing JPEG data, or if you're lucky and your phone actually rotates the data. None of the phones we wanted to use for the demo (Galaxy S3 and Nexus 4) do this. Capturing JPEG is out of the question: imagine the overhead of going NV21 -> JPEG compress -> JPEG decompress -> I420 -> VP8 compress for every frame at 60fps! For the same reason, we discarded the idea of doing the rotation with SurfaceTexture, setMatrix and letting the GPU do the work, because that would require a double NV21 <=> RGB conversion as well.
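For completeness, requesting the rotation that way is just a parameter change; whether the driver rotates the pixels or merely tags the EXIF data is explicitly left open by the documentation.

    import android.hardware.Camera;

    public class RotationHint {
        public static void requestRotation(Camera camera) {
            Camera.Parameters params = camera.getParameters();
            // Depending on the driver, this either rotates the actual picture
            // data or only writes an EXIF orientation tag into JPEG output.
            params.setRotation(90);
            camera.setParameters(params);
        }
    }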

Somewhere around this point, it started to dawn on me that this is the reason pictures I take with my Android phone come out sideways when posted on Twitter. The camera is actually capturing them sideways, and something along the chain misses the EXIF flag, so the picture gets displayed exactly as it was captured. What makes this sad is that you can actually rotate JPEG data losslessly, so this is entirely unneeded. Or they could just have made it easier for camera apps to rotate the data before saving. Crazy idea, I know.

Okay, so here we are. We're getting in the raw camera data, and the only format we can rely on being available on every device is NV21 (except for the emulator, which uses RGB565, because this is Android after all). So we need to rotate the NV21 data before sending it off towards our WebRTC video codec. We create a Bitmap with that image data and then rotate it using the Bitmap functions, right? Well, that would be fine, but unfortunately, even though NV21 is the only format you can rely on the camera generating, there is nothing else in Android that can understand it (there's android.graphics.YuvImage, which "helpfully" only allows JPEG output). The only way to deal with it is to write your own conversion code in Java.
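If you do go down that road, a straightforward (and completely unoptimized) 90-degree clockwise rotation of an NV21 frame looks something like this sketch. NV21 is a full-resolution Y plane followed by a half-resolution interleaved V/U plane, and both have to be rotated.

    public class Nv21Rotate {
        // Rotate an NV21 frame 90 degrees clockwise. The result is a valid
        // NV21 frame with dimensions height x width.
        public static byte[] rotate90(byte[] src, int width, int height) {
            byte[] dst = new byte[width * height * 3 / 2];

            // Luma (Y) plane: dst(row = x, col = height-1-y) = src(row = y, col = x).
            int i = 0;
            for (int x = 0; x < width; x++) {
                for (int y = height - 1; y >= 0; y--) {
                    dst[i++] = src[y * width + x];
                }
            }

            // Chroma (VU) plane: move the interleaved V/U pairs the same way,
            // filling the destination from the end backwards.
            i = width * height * 3 / 2 - 1;
            for (int x = width - 1; x > 0; x -= 2) {
                for (int y = 0; y < height / 2; y++) {
                    dst[i--] = src[width * height + y * width + x];
                    dst[i--] = src[width * height + y * width + (x - 1)];
                }
            }
            return dst;
        }
    }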

Because we have a C++ backend, and the webrtc.org part of it includes libyuv, we ended up passing the data down there. libyuv has optimized routines for handling YUV-style data (including NV21), which is nice, though the implementation in the WebRTC code turned out to be buggy as well. At least hooks to request the rotation were already in place, so someone down there must have known about this madness.

So we only needed to work around those known bugs, and voilà, we had our rotated image. Easy, huh? Android is great!