Aaron Klotz’s Software Blog

My Adventures as a Former Mozilla Employee

2018 Roundup: Q2, Part 3 - Fleshing Out the Launcher Process

This is the fourth post in my “2018 Roundup” series. For an index of all entries, please see my blog entry for Q1.

Yes, you are reading the dates correctly: I am posting this nearly two years after I began this series. I am trying to get caught up on documenting my past work!

Once I had landed the skeletal implementation of the launcher process, it was time to start making it do useful things.

Ensuring Medium Integrity

[For an overview of Windows integrity levels, check out this MSDN page – Aaron]

Since Windows Vista, security tokens for standard users have carried a medium integrity level (IL) by default. When UAC is enabled, members of the Administrators group also run as standard users with a medium IL, with the additional ability to “elevate” themselves to a high IL. When UAC is disabled, an administrator receives a token that always carries the high integrity level.

Running a process at a high IL is something that is not to be taken lightly: at that level, the process may alter system settings and access files that would otherwise be restricted by the OS.

While our sandboxed content processes always run at a low IL, I believed that defense-in-depth called for ensuring that the browser process did not run at a high IL. In particular, I was concerned about cases where elevation might be accidental. Consider, for example, a hypothetical scenario where a system administrator has two command prompts open, one elevated and one not, and they accidentally start Firefox from the one that is elevated.

This was a perfect use case for the launcher process: it detects whether it is running at high IL, and if so, it launches the browser with medium integrity.

Unfortunately, some users prefer to configure their accounts to run at all times as Administrator with high integrity! This is a terrible idea from a security perspective, but it is what it is; in my experience, most users who run with this configuration do so deliberately, and they have no interest in being lectured about it.

Users running under this account configuration will experience side effects of the Firefox browser process running at medium IL. Specifically, a medium IL process is unable to initiate IPC connections with a process running at a higher IL. This will break features such as drag-and-drop, since even the administrator’s shell processes are running at a higher IL than Firefox.

Being acutely aware of this issue, I included an escape hatch for these users: I implemented a command line option that prevents the launcher process from de-elevating when running with a high IL. I hate that I needed to do this, but moral suasion was not going to be an effective technique for solving this problem.
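
To make the decision described above concrete, here is a minimal, portable sketch of the logic; it is not the actual Firefox code. The `-no-deelevate` spelling of the switch, the `ChooseBrowserIntegrity` helper, and the choice to treat anything at or above the high RID as elevated are all assumptions made for illustration. On Windows, the launcher's integrity RID would typically be read from its token via `GetTokenInformation` with `TokenIntegrityLevel`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Mandatory integrity level RIDs as defined by the Windows SDK (winnt.h).
constexpr std::uint32_t kMediumRid = 0x2000;  // SECURITY_MANDATORY_MEDIUM_RID
constexpr std::uint32_t kHighRid = 0x3000;    // SECURITY_MANDATORY_HIGH_RID

// Assumed spelling of the opt-out switch; illustrative only.
const std::string kNoDeelevateFlag = "-no-deelevate";

enum class BrowserIntegrity {
  Inherit,      // launch the browser with the launcher's own token
  ForceMedium,  // launch the browser with a medium IL token
};

// Decides how the browser should be launched, given the launcher's own
// integrity RID and its command line (args[0] is the program name).
BrowserIntegrity ChooseBrowserIntegrity(std::uint32_t launcherRid,
                                        const std::vector<std::string>& args) {
  assert(!args.empty() && "args[0] is expected to be the program name");

  // Assumption of this sketch: high IL and anything above it count as elevated.
  const bool elevated = launcherRid >= kHighRid;

  // The escape hatch: the user explicitly asked us not to de-elevate.
  const bool optedOut =
      std::find(args.begin() + 1, args.end(), kNoDeelevateFlag) != args.end();

  if (elevated && !optedOut) {
    return BrowserIntegrity::ForceMedium;
  }
  return BrowserIntegrity::Inherit;
}
```

Keeping the decision in a pure function like this separates the policy from the token manipulation, so the policy can be exercised without an elevated prompt.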

Process Mitigation Policies

Another tool that the launcher process enables us to utilize is process mitigation options. Since Windows 8, the kernel has provided several opt-in flags that allow us to add prophylactic policies to our processes in an effort to harden them against attacks.

Additional flags have been added over time, so we must be careful to only set flags that are supported by the version of Windows on which we’re running.

We could have set some of these policies by calling the SetProcessMitigationPolicy API. Unfortunately this API is designed for a process to use on itself once it is already running. This implies that there is a window of time between process creation and the time that the process enables its mitigations where an attack could occur.

Fortunately, Windows provides a second avenue for setting process mitigation flags: These flags may be set as part of an attribute list in the STARTUPINFOEX structure that we pass into CreateProcess.

Perhaps you can now see where I am going with this: The launcher process enables us to specify process mitigation flags for the browser process at the time of browser process creation, thus preventing the aforementioned window of opportunity for attacks to occur!

While there are other flags that we could support in the future, the initial mitigation policy that I added was the PROCESS_CREATION_MITIGATION_POLICY_IMAGE_LOAD_PREFER_SYSTEM32_ALWAYS_ON flag. [Note that I am only discussing flags applied to the browser process; sandboxed processes receive additional mitigations. – Aaron] This flag forces the Windows loader to always use the Windows system32 directory as the first directory in its search path, which prevents library preload attacks. Using this mitigation also gave us an unexpected performance gain on devices with magnetic hard drives: most of our DLL dependencies are either loaded using absolute paths, or reside in system32. With system32 at the front of the loader’s search path, the resulting reduction in hard disk seek times produced a slight but meaningful decrease in browser startup time! How I made these measurements is addressed in a future post.
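
The version gating mentioned earlier can be sketched as a small function that builds the mitigation bitmask for the running OS. This is a portable illustration, not the Firefox implementation. The numeric value of the flag and the minimum build that supports it (assumed here to be Windows 10 version 1607, build 14393) are assumptions that should be checked against the Windows SDK headers and documentation; `WindowsVersion` and `ComputeMitigationPolicy` are hypothetical names. On Windows, the resulting mask would be handed to `UpdateProcThreadAttribute` with `PROC_THREAD_ATTRIBUTE_MITIGATION_POLICY` before calling `CreateProcess` with a `STARTUPINFOEX`.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Assumed to match the SDK definition of
// PROCESS_CREATION_MITIGATION_POLICY_IMAGE_LOAD_PREFER_SYSTEM32_ALWAYS_ON;
// verify against your own headers before relying on it.
constexpr std::uint64_t kPreferSystem32AlwaysOn = std::uint64_t{1} << 60;

// Hypothetical version triple; a real launcher would query the OS for it.
struct WindowsVersion {
  std::uint32_t majorVersion;
  std::uint32_t minorVersion;
  std::uint32_t buildNumber;
};

// Lexicographic comparison: major first, then minor, then build.
bool IsAtLeast(const WindowsVersion& have, const WindowsVersion& want) {
  return std::tie(have.majorVersion, have.minorVersion, have.buildNumber) >=
         std::tie(want.majorVersion, want.minorVersion, want.buildNumber);
}

// Assumption: the flag is first honoured by Windows 10 version 1607.
const WindowsVersion kFirstWithPreferSystem32{10, 0, 14393};

// Returns only the mitigation bits that the given Windows version supports.
std::uint64_t ComputeMitigationPolicy(const WindowsVersion& current) {
  // A zeroed version suggests the caller never queried the OS.
  assert(current.majorVersion != 0);

  std::uint64_t policy = 0;
  if (IsAtLeast(current, kFirstWithPreferSystem32)) {
    policy |= kPreferSystem32AlwaysOn;
  }
  // Further flags would be added here, each behind its own version check.
  return policy;
}
```

Returning an empty mask on older systems lets the caller skip the mitigation attribute entirely rather than risk process creation failing on an unsupported flag.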

Next Time

This concludes the Q2 topics that I wanted to discuss. Thanks for reading! Coming up in H2: Preparing to Enable the Launcher Process by Default.

2018 Roundup: Q2, Part 2 - Implementing a Skeletal Launcher Process

This is the third post in my “2018 Roundup” series. For an index of all entries, please see my blog entry for Q1.

Yes, you are reading the dates correctly: I am posting this nearly two years after I began this series. I am trying to get caught up on documenting my past work!

One of the things I added to Firefox for Windows was a new process called the “launcher process.” “Bootstrap process” would be a better name, but we already used the term “bootstrap” for our XPCOM initialization code. Instead of overloading that term and adding potential confusion, I opted for using “launcher process” instead.

The launcher process is intended to be the first process that runs when the user starts Firefox. Its sole purpose is to create the “real” browser process in a suspended state, set various attributes on the browser process, resume the browser process, and then self-terminate.

In bug 1454745 I implemented an initial skeletal (and opt-in) implementation of the launcher process.

This seems like pretty straightforward code, right? Naïvely, one could just rip a CreateProcess sample off of MSDN and call it a day. The actual launcher process implementation is more complicated than that, for reasons that I will outline in the following sections.

Built into firefox.exe

I wanted the launcher process to exist as a special “mode” of firefox.exe, as opposed to a distinct executable.

Performance

By definition, the launcher process lies on the critical path to browser startup. I needed to be very conscious of how we affect overall browser startup time.

Since the launcher process is built into firefox.exe, I needed to examine that executable’s existing dependencies to ensure that it is not loading any dependent libraries that are not actually needed by the launcher process. Other than the essential Win32 DLLs kernel32.dll and advapi32.dll (and their dependencies), I did not want anything else to load. In particular, I wanted to avoid loading user32.dll and/or gdi32.dll, as this would trigger the initialization of Windows’ GUI facilities, which would be a huge performance killer. For that reason, most browser-mode library dependencies of firefox.exe are either delay-loaded or are explicitly loaded via LoadLibrary.

Safe Mode

We wanted the launcher process to both respect Firefox’s safe mode and alter its behaviour as necessary when safe mode is requested.

There are multiple mechanisms used by Firefox to detect safe mode. The launcher process detects all of them except for one: Testing whether the user is holding the shift key. Retrieving keyboard state would trigger loading of user32.dll, which would harm performance as I described above.

This is not too severe an issue in practice: The browser process itself would still detect the shift key. Furthermore, while the launcher process may in theory alter its behaviour depending on whether or not safe mode is requested, none of its behaviour changes are significant enough to materially affect the browser’s ability to start in safe mode.

Also note that, for serious cases where the browser is repeatedly unable to start, the browser triggers a restart in safe mode via an environment variable, which is a mechanism that the launcher process honours.
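
As a rough illustration of detecting safe mode without touching keyboard state, the sketch below checks only the command line and an environment variable, so nothing in it would pull in user32.dll. It is not the actual launcher code: the `-safe-mode` switch, the `MOZ_SAFE_MODE_RESTART` variable name, the rule that a non-empty value means "requested", and both helper functions are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Assumed names, used for illustration only.
const std::string kSafeModeFlag = "-safe-mode";
const char kSafeModeRestartEnv[] = "MOZ_SAFE_MODE_RESTART";

// Pure check: the environment value is passed in so the logic is easy to
// exercise. Deliberately absent: any query of the shift key, which is left
// to the browser process.
bool IsSafeModeRequested(const std::vector<std::string>& args,
                         const char* envValue) {
  assert(!args.empty() && "args[0] is expected to be the program name");

  const bool viaFlag =
      std::find(args.begin() + 1, args.end(), kSafeModeFlag) != args.end();

  // Assumption: the variable counts as set only when it is non-empty.
  const bool viaEnv = envValue != nullptr && envValue[0] != '\0';

  return viaFlag || viaEnv;
}

// Convenience wrapper that reads the real process environment.
bool IsSafeModeRequestedNow(const std::vector<std::string>& args) {
  return IsSafeModeRequested(args, std::getenv(kSafeModeRestartEnv));
}
```

Because the shift key check is omitted, a launcher built this way can occasionally miss a safe mode request, which is acceptable only because the browser process repeats the full detection itself.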

Testing and Automation

We wanted the launcher process to behave well with respect to automated testing.

The skeletal launcher process that I landed in Q2 included code to pass its console handles on to the browser process, but there was more work necessary to completely handle this case. This gap was not yet an issue because the launcher process was opt-in at the time.

Error Recovery

We wanted the launcher process to gracefully handle failures even though, also by definition, it does not have access to facilities that internal Gecko code has, such as preferences and the crash reporter.

The skeletal launcher process that I landed in Q2 did not yet utilize any special error handling code, but this was also not yet an issue because the launcher process was opt-in at this point.

Next Time

Thanks for reading! Coming up in Q2, Part 3: Fleshing Out the Launcher Process

Coming Around Full Circle

One thing about me that most Mozillians don’t know is that, when I first applied to work at MoCo, I had applied to work on the mobile platform. When all was said and done, it was decided at the time that I would be a better fit for an opening on Taras Glek’s platform performance team.

My first day at Mozilla was October 15, 2012 — I will be celebrating my seventh anniversary at MoCo in just a couple short weeks! Some people with similar tenures have suggested to me that we are now “old guard,” but I’m not sure that I feel that way! Anyway, I digress.

The platform performance team eventually evolved into a desktop-focused performance team by late 2013. By the end of 2015 I had decided that it was time for a change, and by March 2016 I had moved over to work for Jim Mathies, focusing on Gecko integration with Windows. I ended up spending the next twenty or so months helping the accessibility team port their Windows implementation over to multiprocess.

Once Firefox Quantum 57 hit the streets, I scoped out and provided technical leadership for the InjectEject project, whose objective was to tackle some of the root problems with DLL injection that were causing us grief in Windows-land.

I am proud to say that, over the past three years on Jim’s team, I have done the best work of my career. I’d like to thank Brad Lassey (now at Google) for his willingness to bring me over to his group, as well as Jim and David Bolter (a11y manager at the time) for their confidence in me. As somebody who had spent most of his adult life having no confidence in his work whatsoever, their willingness to entrust me with taking on those risks and responsibilities made an enormous difference in my self-esteem and my professional life.

Over the course of H1 2019, I began to feel restless again. I knew it was time for another change. What I did not expect was that the agent of that change would be James Willcox, aka Snorp. In Whistler, Snorp planted the seed in my head that I might want to come over to work with him on GeckoView, within the mobile group which David was now managing.

The timing seemed perfect, so I made the decision to move to GeckoView. I had to finish tying up some loose ends with InjectEject, so all the various stakeholders agreed that I’d move over at the end of Q3 2019.

Which brings me to this week, when I officially join the GeckoView team, working for Emily Toop. I find it somewhat amusing that I am now joining the team that evolved from the team that I had originally applied for back in 2012. I have truly come full circle in my career at Mozilla!

So, what’s next?

  • I have a couple of InjectEject bugs that are pretty much finished, but just need some polish and code reviews before landing.

  • For the next month or two at least, I am going to continue to meet weekly with Jim to assist with the transition as he ramps up new staff on the project.

  • I still plan to be the module owner for the Firefox Launcher Process and the MSCOM library; however, most day-to-day work will be done by others going forward.

  • I will continue to serve as the mozglue peer in charge of the DLL blocklist and DLL interceptor, with the same caveat.

Switching over to Android from Windows does not mean that I am leaving my Windows experience at the door; I would like to continue to be a resource on that front, so I would encourage people to continue to ask me for advice.

On the other hand, I am very much looking forward to stepping back into the mobile space. My first crack at mobile was as an intern back in 2003, when I was working with some code that had to run on PalmOS 3.0! I have not touched Android since I shipped a couple of utility apps back in 2011, so I am looking forward to learning more about what has changed. I am also looking forward to learning more about native development on Android, which is something that I never really had a chance to try.

As they used to say on Monty Python’s Flying Circus, “And now for something completely different!”