Friday, May 20, 2016

Technical Debt, Episode 1

One of the projects I'm working on for Mozilla is our Content Sandboxing. We've been using sandboxing for a while to protect some plugins like Flash, as well as media plugins, but now that Firefox can render webpages in a separate process, we can apply restrictions to what those "Web Content" processes can do, too. Those processes are the part of Firefox that is essentially exposed to the internet, and hence to potentially dangerous webpages.

Although we go to great lengths to make this impossible, there is always a chance that a bug in Firefox would allow an attacker to exploit and take over a Web Content process. But by using features provided by the operating system, we can prevent them from taking over the rest of the computing device by disallowing many ways to interact with it, for example by stopping them from starting new programs or reading or writing specific files.

This feature has been enabled on Firefox Nightly builds for a while, at least on Windows and Mac OS X. Due to the diversity of the ecosystem, it's taken a bit longer for Linux, but we are now ready to flip that switch too.

The initial version on Linux will block very, very little. It's our goal to get Firefox working and shipping with this first and foremost, while we iterate rapidly and hammer down the hatches as we go, shipping a gradual stream of improvements to our users.

One of the first things to hammer down is filesystem access. If an attacker is free to write to any file on the filesystem, he can quickly take over the system. Similarly, if he can read any file, it's easy to leak out confidential information to an attacking webpage. We're currently figuring out the list of files and locations the Web Content process needs to access (e.g. system font directories) and which ones it definitely shouldn't (your passwords database).

And that's where this story about technical debt really starts.

While tracing filesystem access, we noticed at some point that the Web Content process accesses /etc/passwd. Although on most modern Unix systems this file doesn't actually contain any (hashed) passwords, it still typically contains the complete real name of the users on the system, so it's definitely not something that we'd want to leave accessible to an attacker.

My first thought was that something was trying to enumerate valid users on the system, because that would've been a good reason to try to read /etc/passwd.

Tracing the system call to its origin revealed another caller, though. libfreebl, a part of NSS (Network Security Services) was reading it during its initialization. Specifically, we traced it to this array in the source. Reading on what it is used for is, eh, quite eyebrow-raising in the modern security age.

The NSS random number generator seeds itself by attempting to read /dev/urandom (good), ignoring whether that fails or not (not so good), and then continuing by reading and hashing the password file into the random number generator as additional entropy. The same code then goes on to read in several temporary directories (and I do mean directories, not the files inside them) and perform the same procedure.

Should all of this have failed, it will make a last ditch effort to fork/exec "netstat -ni" and hash the output of that. Note that the usage of fork here is especially "amusing" from the sandboxing perspective, as it's the one thing you'll absolutely never want to allow.

Now, almost none of this has ever been a *good* idea, but in its defense NSS is old and caters to many exotic and ancient configurations. The discussion about /dev/urandom reliability was raised in 2002, and I'd wager the relevant Linux code has seen a few changes since. I'm sure that 15 years ago, this might've been a defensible decision to make. Apparently one could even argue that some unnamed Oracle product running on Windows 2000 was a defensible use case to keep this code in 2009.

Nevertheless, it's technical debt. Debt that hurt on the release of Firefox 3.5, when it caused Firefox startup to take over 2 minutes on some people's systems.

It's not that people didn't notice this idea was problematic:
I'm fully tired of this particular trail of tears. There's no good reason to waste users' time at startup pretending to scrape entropy off the filesystem.
-- Brendan Eich, July 2009
RNG_SystemInfoForRNG - which tries to make entropy appear out of the air.
-- Ryan Sleevi, April 2014
Though sandboxing was clearly not considered much of a use case in 2006:
Only a subset of particularly messed-up applications suffer from the use of fork.
-- Well meaning contributor, September 2006
Nevertheless, I'm - still - looking at this code in the year of our Lord 2016 and wondering if it shouldn't all just be replaced by a single getrandom() call.

If your system doesn't have getrandom(), well maybe there's a solution for that too.

Don't agree? Can we then at least agree that if your /dev/urandom isn't secure, it's your problem, not ours?

Friday, April 4, 2014

Only what unites us

If anything, the last week has given us quite some things where we can have great discussions about. I'd love to do a treatise on how free speech interacts with the ability to do Internet witch burnings, or how universal truths relate to political opinions, but the nerves are tense and the knives sharpened, so attempting to do so would most likely only be preaching to the converted. Maybe another time.

Yet, when looking at the questions for the upcoming Mozillians Town Hall meeting, there is a pattern that worries me, and I would like to address.

Mozilla is a community that organizes around a mission. The mission is set out in the Manifesto. The Manifesto uses rather broad language, e.g. "must enrich the lives of individual human beings".

If we have to learn anything from the past 10 days, it is that we can only survive as a community if we interpret this mission only in its most narrow scope, where we can and should find common ground. Attempting to read the Manifesto in the widest possible manner and presuming to find that all of our fellow Mozillians have done so in the same way is the road to failure as a group and a community. Our cultural differences are immense and things which we find self-evident can be unimaginable to other. We should group among the narrow set of goals that unites us, not among what divides us.

Mission creep is death. As a result, Mozilla is pragmatic. We've thrown efforts under the bus when we believed it to be necessary to survive. It personally still pains me that we threw VP8 under the bus in favor of H264, but choices have to be made about what hurts us least.

Now, if anyone wants to make the argument that an "open, participatory and accessible" Internet obviously has a direct and inalienable relation to an already-repealed law in the state of California in the United States of America that tried to refine the legal definition of marriage, so Mozilla must fight for this cause, then fine. I am not going to argue with you. I simply want to point out that the remainder of this post is not addressed to you.

Why do I want to make this point now? I have looked at the list of questions for the Town Hall Meeting, and I could summarize a fair number of the highest up voted ones as "Why do we think it is acceptable that the personal opinion of a Mozillian influences his career prospects in our community?"

I understand this question. I've struggled with it myself. Up until yesterday, I wouldn't have believed Brendan would step down simply because of the enormous implication that has in relation to the above. When I joined Mozilla, I was encouraged by my manager (Stuart or Doug, sorry forgot which of you two!): "We expect our contributors to have an opinion and speak out on it". This obviously does not mix with what has happened this week.

The answer though, is above: it is not our fight to fight. Should a political/moral opinion and monetary support for it be grounds for a week-long internet shitstorm ending in resignation? The Fox has no opinion on this issue. The fact that we have such wildly differing opinions on it is a clear sign: this is far from the Mozilla mission, and it's outside our scope until we can find common ground. Should we fight global warming? The Fox has no opinion. Should abortion be allowed? The Fox most certainly has no opinion on that one. Should we legalize marihuana? The Fox wonders what you've been smoking that you're even asking him. Should we be able to anonymously participate on the Internet? You bet.

The "why do our private opinions affect our work" question is inappropriate for the Town Hall, because it is not Mozilla's problem. We have to choose to focus only on what unites us.

Wednesday, April 17, 2013

WebRTC support on Android

Yesterday, WebRTC support landed in Firefox for Android, which means it is available in today's Nightly. Wait a minute, you might say, didn't you demo this already at MWC a month and a half ago? Well yes. But to get the code used for that demo landed in the main Firefox repository (instead of a branch), some cleanup work was needed, and we needed to make sure that older, pre-ICS phones didn't just blow up when you tried to make a phone call on them.

Although, in theory, WebRTC should work on any device Firefox for Android supports, for an ideal experience you're going to want to have an Ice Cream Sandwich or Jelly Bean device. We'll do what we can for older devices, but realistically Android was missing some essential APIs before that time (especially before Android 2.3) and device performance will also limit what experience you get. Nevertheless I had no problems getting a video call up on my Galaxy Tab 10.1 with Android 3.2, and many devices will probably work just fine.

By default WebRTC is included in Firefox for Android, but behind a preference that is off by default. To enable, flip these preferences to "true":
  • media.navigator.enabled
  • media.peerconnection.enabled

Secondly, because the UI is currently very provisional, you might want to disable the dialog that asks for permission to use your camera and microphone (but remember the implications of this, you probably don't want to keep it that way after testing!). Enable this setting:

  • media.navigator.permission.disabled

Some example pages for WebRTC APIs are here:

Note that as the code only just landed, there are going to be bugs. Some known ones are the permissions dialog (if enabled) only remembering the first choice you make, giving you audio or video, but not both, and when disabling permissions, we always take the first camera, which is probably the one you didn't want. On some devices the video might be upside down, too. Expect most of these to be fixed in the next days and weeks. If you find any bugs that I haven't listed here, feel free to file in Bugzilla and we'll get to work on them.

Wednesday, February 27, 2013

Why Twitter pictures are sideways

We set out to do a demonstration of using WebRTC on an Android device before this years Mobile World Congress. I'm happy to say we made the deadline and were able to demonstrate some nice desktop Firefox to Android video calls. We only got the demonstration code working shortly before the deadline, though. (Actually, about 1 hour after we decided that we'd likely not make it, I got to tell everyone to disregard that because we just did, for a great "stop the press" moment). So here's a story about how we got hung up for almost a week on something that looked quite trivial at first.

The WebRTC stack in Firefox for Android is based on the upstream This codebase is shared between Firefox and Chrome, but as Firefox is the only one of those two to use the Android parts so far, we sort-of expected quite some work to do before getting anything working. That didn't end up being so bad: we hit a bunch of problems with the gyp buildsystem, another bunch were related to us running all the code off the main thread. One nice example that bit us is that running FindClass off of the main Dalvik thread doesn't do what you want. But overall, we found that we were able to get some audio and video running quite rapidly. Not perfect, but certainly good enough for a first demo.

That is, aside from one small issue: the video was sideways in portrait mode.

Experienced Android developers who are reading this are surely snickering by now.

The underlying cause of this is that the actual cameras in some devices are rotated compared to the phone shell. You're supposed to probe the Android Camera stack for information about the orientation of the camera, probe the phone's orientation sensors to know how the user is holding it, and rotate appropriately.

If you do a Google search for "Android video rotated" or "Android video sideways" you're going to get a lot of hits, so we knew we weren't the first to see this problem. For our use, we're specifically capturing data from an Android Camera object through a preview Surface. There are several threads on this issue on StackOverflow, some of them trying to cheat by pretending to be in landscape mode (a hack that didn't seem feasible for Firefox), but most of them conclude the issue must be solved through using the setDisplayOrientation call.

There are several downsides to this proposed solution: it is API Level >= 8 only, for starters. Given that we have dropped Froyo support (the final WebRTC code will likely require at least Android 2.3 or later, and almost certainly Android 4.0 or later for the best experience), that's not all that much of a problem for us, even more so because Camera support on Froyo or older devices is quite spotty anyway. But it makes me wonder what Camera apps on Froyo are supposed to do.

The second problem is that setDisplayOrientation only changes the orientation of the preview Surface. Now, for a normal Camera app that wants to show the user the video he's capturing, this is fine. But that's not what we're doing. We don't want to show the preview picture (in Java) at all! What to do with the data captured through WebRTC getUserMedia is entirely up to the webpage, which means it's to be handled in JavaScript (Gecko), and not the business of the Java layer at all. Maybe the JavaScript doesn't want to show the local video at all, for that matter. So our application doesn't actually want to display the Android preview, and so rotating it isn't going to make any difference. We just want the raw video data to send off to the video encoder.

Now, to go off on a small tangent, it turns out that not showing the video preview on an Android device isn't actually reliably possible pre-ICS. We currently use a dummy TextureView, which is known not to work on some devices, and will use a SurfaceTexture (which is guaranteed to work) on Android 4.0 or newer devices.

The Android Camera API also has a setRotate call, which might rotate the image data, or it might just put an EXIF flag that the image was rotated. This can only work if you're taking in JPEG data or if you're lucky and your phone actually rotates the data. None of the phones we wanted to use for the demo (Galaxy S3 and Nexus 4) do this. Taking in JPEG is out of the question, imagine the overhead going NV21->JPEG compress->JPEG decompress->I420->VP8 compress for every frame at 60fps! For the same reasons, we discarded the idea of doing the rotation with SurfaceTexture, setMatrix and letting the GPU do the work, because that would require a double NV21<=>RGB conversion as well.

Somewhere around this point, it was starting to dawn on me that this is the reason that pictures I'm taking with my Android phone come out sideways when posting on Twitter. The camera is actually capturing them sideways, and something along the chain misses that EXIF flag, which causes the picture to be displayed as it was taken. What makes this sad is that you can actually rotate JPEG data the right way in a lossless manner, so this is entirely unneeded. Or maybe they could just have made it easier for the camera apps to rotate the data before saving. Crazy idea, I know.

Okay, so here we are. We're getting in the raw camera data, and the only format we can rely on being available on every device is NV21 (except for the emulator, which will use RGB565, because this is Android after all). So, we need to rotate the NV21 data before sending it off towards our WebRTC video codec. We create a Bitmap with that image data and then rotate it using the Bitmap functions, right? Well, that would be fine, but unfortunately, even though NV21 is the only format you can rely on being generated by the camera, there is nothing else in Android that can understand it (there's, which "helpfully" only allows JPEG output). The only way to deal with is to write your own conversion code in Java.

Because we have a C++ backend, and the part of that includes libyuv, we ended up passing the data to that. libyuv has optimized code for YUV-style data handling (including NV21), so that's nice, though the implementation in the WebRTC code is buggy as well. At least there were some hooks to request the rotation included, so someone down there must have known about this madness.

So we only need to work around those know bugs, and voila, we have rotated our image. Easy, uh? Android is great!

Thursday, October 11, 2012

Phishing and malware protection arrives on mobile devices

I'm happy to announce malware and phishing protection is now enabled in Firefox for Android, and available in the Aurora and Nightly builds. It will be released in Firefox 18. As far as I could tell, Firefox is the first browser to offer this feature to mobile users.

The phishing and malware protection works by regularly updating a local list of known, bad sites, graciously provided by Google through their SafeBrowsing database. Whenever Firefox detects that you navigate to such a site, or that a page you visit is trying to pull data from it, Firefox will present you with a warning page and allow you to abort the operation.

I blogged previously about our intent to rework the database to a size acceptable to mobile devices, and the subsequent rewrite of the SafeBrowsing backend. The remainder of the work had to be postponed a little as we worked vigorously to finish the new Native Firefox for Android for phones and tablets.

Some of the remaining tasks were verifying that the new backend processes all SafeBrowsing updates correctly. This was done by writing an external Python program that can read in both the old and new format files, which turned out to be very useful in debugging the file format documentation as well.

Another issue that remained was the need for the updates to be very resilient on a mobile platform. For example on Android the application can be killed nearly at will by the Operating System, and care must be taken that the database not only doesn't get corrupted, but that as little data as possible is lost if this happens in the middle of an update. The old backend got this functionality mostly for free through the ACID features of SQLite, but for the new backend some extra work was necessary.

The final step was then updating the SafeBrowsing warning screens and support code for Firefox for Android, which we finished last week. For the future, our UX team is currently busy with a new, nicer visual design for the warning pages that will be shared between Firefox for Android and Firefox OS. But in the mean time, enjoy the added protection already. I sincerely wish you will never have to encounter either of these pages, anyway!

Phishing warning

New UX Design

Wednesday, June 20, 2012

The effectiveness of practice - a small study of StarCraft 2 balance

Edit 2012-06-21: Corrected "games" to "won games". Sorry about that!

Win rates have little to do with "balance"

I had some musings about StarCraft 2 balance a while ago and decided to write a program to check them out.

Basically, most discussions about balance end up mentioning win rates a lot. However, because the StarCraft 2 matchmaking tries to guarantee each player has a win rate around 50% in the middle-long term, win rates are actually quite a meaningless indicator of player strength, and hence balance. Regardless of balance, we're always expecting them to end up around 50%.

Imagine you take a player of grandmaster level, and now handicap him by disallowing the use of his races' macro mechanic. The hit to his strength is of a similar nature as would be when one of the races were weaker than the others. After playing for a while (and initially scoring way below 50%), he will likely drop down to somewhere like diamond league, for example, until his opponents are such that he gets a 50% win rate again. So we now know that this player is actually grandmaster, but his race handicap holds him back from realizing his potential. Yet his win rate is the same as always.

Win rates also suffer from temporal fluctuations if effective strategies for once race are developed and it takes a while to find a counter for the other races. Win rates can suddenly shift quite far away from 50%. The actual ability of the players did not change, nor did the game itself suddenly become imbalanced. This would only be the case if it eventually turns out that there is no counter to this strategy.

There is one particular case in which win rates can tell us something about balance issues: Random players. For Random players who play all races equally, we'd expect them to on average score the same with all races. If we take the data of all random players and a certain race appears to outscore the others, it's a good sign of a balance issue.

Effort needed for reaching a league

Unfortunately, gathering statistics of the race performance of Random players is something that seems hard to spider off of Battle.Net. So while Blizzard can and probably does use this information in their balancing discussions, it is not available to us.

Another way to look at race strength is to consider that if we could pick players of similar "innate" skill, give them a race and let them loose on the SC2 ladder, we could check if they end up in the same leagues.

So how do we find these players of similar "real" skill? We can make an assumption: a player improves the more StarCraft 2 games he plays. This isn't a very far fetched conclusion, as its known that most people improve at skilled tasks though practice. We can then rephrase our balance question: for a given amount of practice, will a player of a certain race end up being higher ranked than players of another race?

The data

I spidered data from about 26676 players from the EU StarCraft 2 servers, or about 10% of the active player population in that region. I rejected all players that have no games in Season 8 yet, and all Random players. They're not pertinent to our base question, and they require extra effort to distinguish from each other. I also threw away all players with less than 5 games, which hence have no league yet.

The table below shows the average amount of won games for players in a certain league, both overall, and broken down per race. After the number of games is the 95% standard error on the mean, which essentially gives some idea how accurate the numbers are. (They're mostly accurate to about 5-10%, except for the grandmasters due to extremely low sample size there). The next number is the amount of won games below which 95% of the players in that league are. This gives some idea of the amount of games where a player is extremely likely to get to the next league soon, as well as the actual variance of the amount of games players in a league have. Note that the variance is actually quite big - there's quite some players who have played twice the average amount of games yet are still stuck in a league.

The average number of won games the players in a league have is not the same as the amount of games you expect to need on average to be promoted into that league. That point is (very approximately) somewhere halfway between the average of the lower and the target league. The total number of games is about twice the number of won games - because the matchmaking should keep you around 50%.

So with this introduction, here is the actual data:

    EU Server, 26676 players sampled, 16243 valid samples
    6010 players in Protoss (37%)
    5248 players in Terran  (32%)
    4985 players in Zerg    (31%)
    Avg wins =   290 (+-    9), 95% upper ( 1021) for  5754 players in bronze     (35%)
    Avg wins =   533 (+-   17), 95% upper ( 1547) for  3468 players in silver     (21%)
    Avg wins =   708 (+-   22), 95% upper ( 1902) for  2715 players in gold       (17%)
    Avg wins =   939 (+-   33), 95% upper ( 2485) for  2133 players in platinum   (13%)
    Avg wins =  1239 (+-   48), 95% upper ( 3047) for  1386 players in diamond     (9%)
    Avg wins =  1680 (+-   86), 95% upper ( 4101) for   783 players in master      (5%)
    Avg wins =  2880 (+- 1544), 95% upper ( 5968) for     4 players in grandmaster (0%)
    Avg wins =   293 (+-   15), 95% upper ( 1022) for  2248 players in bronze_Protoss
    Avg wins =   305 (+-   16), 95% upper ( 1079) for  2163 players in bronze_Terran
    Avg wins =   262 (+-   17), 95% upper (  917) for  1343 players in bronze_Zerg
    Avg wins =   511 (+-   26), 95% upper ( 1467) for  1269 players in silver_Protoss
    Avg wins =   553 (+-   33), 95% upper ( 1681) for  1162 players in silver_Terran
    Avg wins =   537 (+-   29), 95% upper ( 1483) for  1037 players in silver_Zerg
    Avg wins =   687 (+-   34), 95% upper ( 1767) for   977 players in gold_Protoss
    Avg wins =   750 (+-   48), 95% upper ( 2115) for   777 players in gold_Terran
    Avg wins =   695 (+-   37), 95% upper ( 1848) for   961 players in gold_Zerg
    Avg wins =   949 (+-   55), 95% upper ( 2480) for   762 players in platinum_Protoss
    Avg wins =   886 (+-   59), 95% upper ( 2292) for   550 players in platinum_Terran
    Avg wins =   966 (+-   57), 95% upper ( 2608) for   821 players in platinum_Zerg
    Avg wins =  1217 (+-   77), 95% upper ( 2902) for   476 players in diamond_Protoss
    Avg wins =  1188 (+-   97), 95% upper ( 3065) for   371 players in diamond_Terran
    Avg wins =  1293 (+-   80), 95% upper ( 3156) for   539 players in diamond_Zerg
    Avg wins =  1719 (+-  135), 95% upper ( 3974) for   276 players in master_Protoss
    Avg wins =  1687 (+-  177), 95% upper ( 4349) for   225 players in master_Terran
    Avg wins =  1635 (+-  141), 95% upper ( 4017) for   282 players in master_Zerg
    Avg wins =  2867 (+- 3537), 95% upper ( 7870) for     2 players in grandmaster_Protoss
    Avg wins =  2893 (+- 1336), 95% upper ( 4782) for     2 players in grandmaster_Zerg


There is a clear correlation between having played more games, and being in a higher league. This validates our earlier assumption. More practice makes you a better player. (Or if you're pedantic about causation-correlation conclusions, at the very least better players practise more.) I know this sounds extremely obvious but it is nice to validate it. It also indicates smurfing etc isn't prevalent enough to mess with our results.

There are no significant differences between the races per league [1]. This means that you cannot get higher ranked faster by picking a specific race. You still need to put in the same amount of practice. This is the main thing we wanted to investigate and it's good news: StarCraft 2 is *really* well balanced, or at least it has been over the last 8 seasons, which this data aggregates.

If you want to get to masters, expect to put in about 3000 practice games, and maybe as much as 6000. If we assume an average game (including postmortem analysis etc) takes 20 real-life minutes, expect to put in about 1000 hours of practice.

If you played 2000 games and are still in bronze league, you're doing it wrong.

[1] This conclusion assumes that the majority of players play one race most of the time, i.e. that offracing only happens occasionally. Without this assumption we can't put the players in race groups to begin with. Note that if offracing were the norm, and the races not balanced, the offracing players would be able to detect this quickly, which in turn would made it less likely that they'd continue to offrace instead of sticking with the strongest race.

Thursday, May 24, 2012

History and bookmarks in Firefox for Android

The feature I have been working on now for quite some time is one I'd jokingly refer to as "a feature I hope most users will never need". It's the system that translates your browser history from XUL Fennec to Native Fennec. New users obviously don't need this, and I hope we get a lot of new users with the switch to the native UI, so there.

More seriously, for something that I initially described in a meeting as "doing a query in one DB and some inserts in another, how hard can it be?", this turned out to be a project that ended up sinking easily over a man-month of engineering time. Of course putting it like that in the meeting was just asking for problems, and I'm going to use this post to give some background into what happened, and explain some more things about our current History/bookmarks architecture and how databases work on Android.

The need for a new History & bookmarks database

First of all, you can ask yourself why we needed to change the format of browser history in the first place. In XUL Fennec, the history and bookmarks are managed by the Places system, the exact same as is powering desktop Firefox. It's a combination of SQLite with a C++ and JavaScript control layer.

The reason for rewriting the UI parts of Fennec in Java was to make it start up faster on Android, so that the user has something he can interact with instantly. Having bookmarks and history available early is quite important as that's what the user will most likely be interacting with, either through the AwesomeBar or the home screen that Native Fennec has. So we cannot wait for Gecko to load to access its Places system - history and bookmarks must be available immediately.

Secondly, Native Fennec now also includes Sync as a true Android service that works with the Android Accounts panel, background synchronization, etc. Having the history and bookmarks available natively in Java removes the need for Sync to load Gecko, which isn't something that you'd want a background service to do.

Lastly, storage on Android devices is often fairly limited. Places databases can get quite large for large if you've been using Firefox a long time and browse a lot, for example on my current desktop system the places.sqlite file is over 70Mb. This isn't desirable on Android, so we probably want to tune history expiration a bit differently.

Evolution of the new database

Our initial implementation of an Android history & bookmarks system was to simply use the built-in default database. This has some nice advantages such as automatic integration with the default Browser. If the user is transitioning from that to Firefox he will immediately have his history & bookmarks available to him. Feedback from the community came in that they weren't entirely comfortable with this. There was a feeling that although users trust Firefox with their private data, they don't necessarily trust Android or other applications that may access this data in the same way.

We started adding our own implementation, a local database (LocalDB) using Android's SQLite library, our own ContentProvider, and a wrapper layer (BrowserDB) that allowed most of the code to seamlessly switch between using the local database and Android's database.

As we started adding more features to our bookmarks system, and Sync implementation was underway, it became clear at some point that we couldn't reasonably support them well with the limited bookmark support of Android itself. For example, by default the Android implementation can only remember the last 250 URL's you visited, which makes the AwesomeBar's search sort-of useless. So eventually, support for Android's history & bookmarks database was phased out.

Remapping Places & Database constraints

Importing the Places database into our LocalDB format is mostly a straightforward remapping, although there are some minor differences. The Places database is very normalized, and for example each bookmarks and history entry refers to an entry in another table which contains the actual URL. LocalDB is a bit simpler in this regard as the URLs are contained within the entries themselves, saving some database work - which can help on slow devices. We made a small design mistake here in the Favicons table. They're big enough that we do not want them duplicated in the database, but Favicons can be the same for multiple URLs. We caught this issue a bit too late and as a result it will not be fixed in the first release - see bug 717428. You can occasionally see constraint violation warnings in the Android logs if this problem occurs.

Another small complication in transferring bookmarks is that they form a folder (tree) structure. We added some (SQL key) constraints in the new LocalDB to ensure no folders get orphaned even if we have bugs in our first versions, but this means that we must reconstruct the tree structure in the correct order when remapping the bookmarks data. A simple breadth-first algorithm makes multiple passes over the bookmarks data until we reach the leaves. Any bookmarks left over were orphaned and are not imported.

Places "tags" are stored in a fairly complicated manner in Places (using a multitude of bookmark folders) and due to this they are not supported in Native Fennec, and likely won't be for a while, so there is no need to migrate them.

Some fragmentation pains

During the development, at some point bug reports started coming in from some users that were migrating to Native Fennec, didn't get any bookmarks & history imported, and got an error "database is corrupted". After investigation, it turns out this was problem of combined SQLite and Android fragmentation. Firefox XUL (and Desktop) ships with the most recent SQLite revision, which means that it is currently using Write-Ahead-Logging for database operations.

Various Android devices come with various versions of SQLite, and only since (approximately) Honeycomb are the SQLite's new enough to understand the new database format that WAL requires. So older Android devices couldn't correctly read Firefox's SQLite databases with their built-in SQLite support.

We ended up writing a bridge driver that uses JNI to talk to Firefox's SQLite library, and provides an interface compatible with the Android SQLite one. This turned out to be useful for a number of other things too, such as accessing the original Passwords and Form History database from Gecko. In general, this kind of experiences made the many work spent on this feature worthwhile - quite a bit of the work was usable in other parts of the project as well.

Another small fragmentation issue we ran into is that foreign key constraints are not supported on some older Android devices. Because we basically only use these for error checking, it was not a large problem.

Performance aspects

The migration initially ran in the background on the first run, but it turns out this leads to horrible performance as the user tries to interact for the first time while there is heavy disk I/O and processing happening at the same time. Because that kind of nullifies the message we were trying to send that Native Fennec is a lot faster, we opted to show a modal screen instead that informs the user of what is happening. Of  course, popping up a loading screen doesn't exactly convey a message of speediness either, but we should be fine as long as the user only sees it once and it doesn't take too long.

The latter was a bit of a problem though. Initial implementations ran at a speed of a few bookmarks or history entries per second. It's not unusual for Firefox users to have hundreds of bookmarks and hundreds of thousands of history entries. Android provides an API to allow ContentProviders to support database transactions, and implementing support for those on both sides sped up our migration speed to about 50-100 records per second. This was another example of the work on this little-seen feature providing benefits elsewhere, as Sync is now also using the faster API.

There are some interesting interactions with the batch API in Android if part of the transaction fails. Basically, it is implementation-defined what to do if an operation fails for some operations (applyBatch), and the API seems to be designed to really allow both options (continue or abort) to work. For our purposes, doing as much as possible was the better option, but this turns out to have some problems because the API to tell SQLite what to do in such a case isn't available in Android <= 2.2, which we wanted to support. This lead to some "interesting" code that may make your eyes bleed.

Sync integration 

Because that's still not fast enough to migrate a users entire history on the first run, we also made an API that allows Sync to interact with the migration feature. The first run will migrate about 2000 entries and the entire bookmarks table. For users without Sync, this is all that they'll get, but it is enough to make the AwesomeBar quite usable and covers several weeks of browsing even for moderately heavy users.

If the users have Sync set up, the idea was that Sync will continue the migration from the database before starting to use the network. Feedback from the beta made us reconsider the previous approach: the problem of delaying Sync until the Places database has been entirely migrated is that the user will not get his more recent History updates until his updates from potentially weeks or months ago have been migrated. This leads to a suboptimal experience, and it's something we're going to try to address for the release.

The XUL to Native migration includes a feature that tries to migrate your Sync credentials automatically if you have them set up in Native Fennec. Unfortunately this had a bug in the very first beta release, so if you were one of the earliest adopters this will not have worked for you. The issue was fixed in the second beta release.

Future work 

Despite the first beta being out, work on this crucial feature is far from finished. Aside from the points already mentioned above, there are a few more notable areas that we plan to get addressed before the release ships:

  • After support for the Android bookmarks & history database was dropped, users who were previously using the built-in browser have no way to import their bookmarks and history into Fennec right now. We're currently working on adding this feature as a first priority. Eventually we will add export functionality as well (though we'd rather make Fennec so awesome that nobody ever wants to use the built-in browser ever again!).
  • There is currently no expiration in the history database. So although the format is a bit more compact than Places, it could theoretically grow unbounded if the user visits a lot of sites on his phone and keeps on using Firefox. The algorithms for this that Places uses are fairly complicated, so we need to study which parts are meaningful for a mobile device, and if we want to tune it differently for typical phone or tablet use scenarios.