Life as an engineer at Mozilla (II)
A look at the tasks a software engineer at Mozilla tackles every day.
In this post, I’m going to walk through the process of solving issues in Gecko. Gecko, as a platform, contains many different components that deal with all kinds of issues on the web. Bug-free software is impossible, so a good bug tracking tool helps developers organize their tasks and make plans.
Bug Tracking and Triaging
At Mozilla, we use Bugzilla to track all issues. When you file a bug, you choose from many different products and components, which helps route the issue to the right place from the start. For example, the component I mainly work on is Core::Audio/Video: Playback, which covers issues related to media elements (remember <audio> and <video>?).
After a bug is filed on Bugzilla, we first check whether the same bug has already been filed and whether it is filed in the right component. The list of bugs is huge and keeps growing. To be honest, with limited engineering resources it’s hard to keep up with how fast bugs come in, so at this stage we also need to identify which issues are really important and which are false alarms.
We use Triage Center to find bugs in the components we want to triage, then go through those bugs and diagnose the issues. An issue can have many possible causes: installed web extensions, incorrectly modified internal config, interference from external software, graphics driver issues, an unusual system setup, etc.
After we confirm that an issue is Firefox’s fault, we assign a priority and a severity to the bug. Every team has its own definition of bug priority, which ideally should be set only by team members, not by bug reporters. For severity, most teams follow the same definition.
Diagnosing Problems
There are several ways to ask bug reporters for information. The first thing we usually do is ask them to provide the information from the about:support page (type about:support in the Firefox URL bar). That page contains all the information about the user’s system, the version of Firefox they are using, and the config they are using (did they change any important config?). It also links to other pages that help with development, such as about:memory (memory usage for each individual tab) and about:performance (energy impact for each individual tab).
If an issue is performance related, such as video or audio stuttering, we usually ask the reporter to enable the Firefox Profiler, which does real-time profiling across multiple processes and threads. Because we add labels and markers in our code, the profiler knows the call stack and which component each function call belongs to. In the profiler settings, you can use the default preset or customize the settings to profile the threads you want to investigate.
Problem Solving and Code Review
Then, it’s time to roll up our sleeves and get things fixed! Mozilla has built a site called Searchfox, which lets you easily find references or definitions and check the history of the code. (I personally use it alongside my IDE, because searching for references in Searchfox is really fast.) By the way, Mozilla supports both Mercurial and Git for version control.
Let’s take this bug as an example, which is about failing to play certain mp3 files. I was able to reproduce the issue on my computer and had confirmed via the debug log that Firefox caused the problem, so we can dig deeper to investigate the root cause.
After some investigation, I found that the problem was in our mp3 ID3 parser, which couldn’t handle a file containing multiple ID3v2 tags. An ID3v2 tag holds media metadata such as artist, song name, etc. This is an unusual situation because one ID3v2 tag per mp3 file is normally enough, although the specification doesn’t prohibit having more.
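To make the idea concrete, here is a minimal Python sketch of skipping several consecutive ID3v2 tags at the start of a buffer. It is not the Gecko parser (which is written in C++ and is not shown in this post), and the function names are hypothetical. It relies only on the ID3v2 header layout: the 3-byte "ID3" marker, two version bytes, one flags byte, and a 4-byte synchsafe size, plus an optional 10-byte footer signalled by flag bit 0x10 in ID3v2.4.

```python
ID3V2_HEADER_SIZE = 10
FOOTER_FLAG = 0x10  # ID3v2.4: a 10-byte footer follows the tag body


def synchsafe_to_int(data: bytes) -> int:
    """Decode a synchsafe integer (only the low 7 bits of each byte count)."""
    value = 0
    for byte in data:
        value = (value << 7) | (byte & 0x7F)
    return value


def skip_id3v2_tags(buf: bytes) -> int:
    """Return the offset of the first byte after all leading ID3v2 tags."""
    offset = 0
    # Keep looping instead of stopping after the first tag; stopping early
    # is the kind of assumption that breaks on files with multiple tags.
    while (len(buf) - offset >= ID3V2_HEADER_SIZE
           and buf[offset:offset + 3] == b"ID3"):
        flags = buf[offset + 5]
        size = synchsafe_to_int(buf[offset + 6:offset + 10])
        offset += ID3V2_HEADER_SIZE + size
        if flags & FOOTER_FLAG:
            offset += ID3V2_HEADER_SIZE
    # A truncated tag may claim more bytes than the buffer holds.
    return min(offset, len(buf))


def make_tag(body: bytes) -> bytes:
    """Build a minimal ID3v2.3 tag around `body` (illustration helper only)."""
    size = len(body)
    synchsafe = bytes((size >> shift) & 0x7F for shift in (21, 14, 7, 0))
    return b"ID3" + bytes([3, 0, 0]) + synchsafe + body
```

For example, with a buffer built as make_tag(20 zero bytes) followed by make_tag(200 zero bytes) and then audio data, skip_id3v2_tags returns the offset where the audio data begins, rather than stopping after the first tag.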
With my solution in hand, I submitted it to Phabricator, a web-based software development collaboration tool that we use as our code review platform. Patches submitted to Phabricator are checked automatically by various tools, e.g. clang-format-linter, for syntax or coding style problems.
At Mozilla, modules have owners and peers who are responsible for the code in them, but the information on that page isn’t up to date. So if you can’t find them for a review, just look into the history and see who reviewed the most changes for the file you want to modify.
Testing! Testing! Testing!
Say it three times because it’s important! At Mozilla, we have various kinds of tests (that is an archived page, because I didn’t find a newer page that lists all of them). In contrast to those Gecko-only tests, we also run the web platform tests, which are used to create consistent tests across different browsers. If the thing we’re testing is related to web API behavior that should be consistent in all browsers, then writing a web platform test is the best choice. We can see the test results across different browsers on wpt.fyi. We also have a tool to measure test coverage in our codebase.
Only a limited number of unit tests are integrated into Phabricator, so we have another testing platform to run all automated tests: our try server. By using different selectors, we can choose which tests to run on the try server and which platforms to run them on.
In the example above, I chose to write a mochitest that plays an mp3 file with multiple ID3v2 tags through an audio element and checks whether it starts playing correctly. If the bug I’m fixing involves changes in a tab’s activity, I would write a higher level test, a browser chrome test, which operates tabs directly as if a user were performing real actions. If we need to compare visual differences, then a reftest is your friend.
The higher the level of a test, the more likely it is to be unstable, because compared with a lower level test it relies on more modules. So it’s important to choose the kind of test based on the scale of the problem you’re working on.
Landing The Fix
Finally, once we pass code review and all the tests on the try server, we’re only a step away from the destination. We have several branches for Firefox. The one we encounter first is “autoland”, where all new changes land first to see whether they break anything, from builds to tests. It’s like the front line of a battlefield where all the new changes battle each other, and we only want to pick the good and stable ones for the next stage.
After we make sure the new changes are good by baking them on autoland for a while (usually several hours to a day), they go to the next branch, “central”, which is used to build the Nightly version of Firefox. Besides that, we also have “beta” for Firefox Beta and Developer Edition, and “release”, which eventually becomes the exact Firefox that hundreds of millions of users use every day.
Currently, a new version of Firefox is released on a monthly basis, and the cycle is even longer for the Firefox Extended Support Release (ESR) version. That means the changes in “central” are merged into “beta” after a month, and the changes in “beta” are merged into “release” in the same way. You can check the release schedule here.
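As a rough illustration of that flow, the following Python sketch models the three branches as version counters and counts how many merges a change needs before it ships. It is a simplification based only on the description above, not Mozilla’s actual release tooling; the function names and version numbers are hypothetical.

```python
def advance_trains(trains: dict) -> dict:
    """One merge: beta becomes release, central becomes beta, central moves on."""
    return {
        "release": trains["beta"],
        "beta": trains["central"],
        "central": trains["central"] + 1,
    }


def merges_until_release(landed_version: int, trains: dict) -> int:
    """Count the merges needed until `landed_version` is the release version."""
    count = 0
    while trains["release"] < landed_version:
        trains = advance_trains(trains)
        count += 1
    return count


# Hypothetical starting point: a fix lands on central while it holds version 90.
trains = {"central": 90, "beta": 89, "release": 88}
print(merges_until_release(90, trains))
```

With a monthly cycle, a fix that lands on “central” in this model needs two merges, so it reaches release users roughly one to two months later depending on when in the cycle it landed.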
That’s what we go through in our daily development. Thank you for reading, and I hope you enjoyed it!