Trimming Unnecessary Dependencies from Projects

When you’re working in large repositories with hundreds of MsBuild projects, you’re bound to have fairly complex build graphs. Over time, these can degrade, and you may end up with lots of dependencies between projects which are no longer needed. This can slow builds down as they become less parallelizable, and the developer experience can suffer as you unnecessarily rebuild libraries which falsely depend on libraries you changed.

Luckily, the Roslyn compiler is smart enough to only include references which are actually used in an assembly’s metadata. So you can compare the references which are passed to the compiler against the references that make it into the compiled assembly.

You can figure out which references make it into the assembly using a tool like ILSpy or dotPeek, or even use the reflection method Assembly.GetReferencedAssemblies.

There are three different ways that references can be included in your project: Reference, ProjectReference, and PackageReference.

References and Project References

References and Project References are both fairly straightforward. Each reference gets resolved, usually by its HintPath if it isn’t a framework assembly, and passed to the compiler. Similarly for project references, MsBuild does an inner build to determine each project reference’s target assembly and passes that to the compiler for the outer build. For both of these, it’s pretty straightforward to just compare each reference passed to the compiler and check whether it made it into the assembly’s metadata.

Transitivity

But wait, what if an indirect dependency is required at run time, but not at compile time? This can happen if you have a project with a dependency which itself has a dependency which isn’t directly required by the original project. To build the project, the indirect dependency may not be needed, however if the project produces something runnable like an exe or a unit test assembly, certain code paths may require the indirect dependency to get loaded into the App Domain.

So for projects which produce something runnable, we have to consider all transitive references and not just the references the main assembly has.
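As a rough sketch of the comparison described above, here is how the check could look in TypeScript. The ProjectInfo shape, the ReferenceMap type, and both function names are made up for illustration; gathering the real data (the references passed to the compiler and the ones recorded in each assembly’s metadata) is left out entirely.

```typescript
// Hypothetical model: none of these names come from MsBuild or any real tool.
interface ProjectInfo {
    /** References passed to the compiler. */
    passedReferences: string[];
    /** References recorded in the compiled assembly's metadata. */
    usedReferences: string[];
    /** True for something runnable, like an exe or a unit test assembly. */
    isRunnable: boolean;
}

/** Maps an assembly name to the references found in its own metadata. */
type ReferenceMap = Map<string, string[]>;

/** Walks the reference graph and returns every assembly reachable from the roots. */
function transitiveClosure(roots: string[], referencesOf: ReferenceMap): Set<string> {
    const seen = new Set<string>();
    const pending = [...roots];
    while (pending.length > 0) {
        const current = pending.pop()!;
        if (seen.has(current)) { continue; }
        seen.add(current);
        pending.push(...(referencesOf.get(current) ?? []));
    }
    return seen;
}

/** Returns the passed references which are not needed by the project. */
function findUnusedReferences(project: ProjectInfo, referencesOf: ReferenceMap): string[] {
    // Runnable projects also need indirect dependencies at run time.
    const needed = project.isRunnable
        ? transitiveClosure(project.usedReferences, referencesOf)
        : new Set(project.usedReferences);
    return project.passedReferences.filter(reference => !needed.has(reference));
}

// Example: LibA depends on LibB; nothing uses LibC.
const referencesOf: ReferenceMap = new Map<string, string[]>([
    ["LibA", ["LibB"]],
    ["LibB", []],
    ["LibC", []],
]);
const passed = ["LibA", "LibB", "LibC"];

// A library only needs what its own metadata mentions: LibB and LibC are reported.
console.log(findUnusedReferences(
    { passedReferences: passed, usedReferences: ["LibA"], isRunnable: false }, referencesOf));

// A runnable project also needs LibB at run time: only LibC is reported.
console.log(findUnusedReferences(
    { passedReferences: passed, usedReferences: ["LibA"], isRunnable: true }, referencesOf));
```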

Package References

Package references are where things get a little complicated. In the new Sdk-style projects, MsBuild and NuGet work together to form the entire dependency graph for the package references you specified, collect all the assemblies for all of those packages, and pass every single one of them to the compiler. With the packaging of the framework assemblies themselves (i.e. netstandard2.0 and netcoreapp2.0), this can get fairly huge. As an example, for a fairly simple web application I have, a whopping 366 /reference parameters are passed to csc.exe. As if things weren’t complicated enough, some packages like Microsoft.AspNetCore.All are really just meta-packages which themselves don’t have any assemblies but instead just have a number of dependencies which do contain assemblies to be referenced. And then packages like Microsoft.Net.Compilers don’t add any references but instead provide additional build tooling by way of MsBuild props and targets.

So for package references, we have to account for any assembly in the package or any packages it depends on, and also for cases where they don’t provide references at all.
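Here is a small sketch of that rule in TypeScript. The PackageInfo shape, the package ids, and the function names are all invented for the example and are not NuGet’s actual data structures; the point is only that a package counts as used if any assembly in its dependency closure is used, and that packages which contribute no assemblies at all are skipped rather than reported.

```typescript
// Hypothetical model of a restored package graph.
interface PackageInfo {
    /** Assemblies this package itself contributes as compiler references. */
    assemblies: string[];
    /** Ids of the packages this package depends on. */
    dependencies: string[];
}

type PackageGraph = Map<string, PackageInfo>;

/** Collects the assemblies of a package and of everything it depends on. */
function collectAssemblies(rootId: string, graph: PackageGraph): Set<string> {
    const assemblies = new Set<string>();
    const visited = new Set<string>();
    const pending = [rootId];
    while (pending.length > 0) {
        const id = pending.pop()!;
        if (visited.has(id)) { continue; }
        visited.add(id);
        const info = graph.get(id);
        if (info === undefined) { continue; }
        info.assemblies.forEach(assembly => { assemblies.add(assembly); });
        pending.push(...info.dependencies);
    }
    return assemblies;
}

/** Returns the directly referenced packages whose assemblies are never used. */
function findUnusedPackages(directIds: string[], graph: PackageGraph, used: Set<string>): string[] {
    return directIds.filter(id => {
        const assemblies = Array.from(collectAssemblies(id, graph));
        // No assemblies anywhere (e.g. build tooling only): can't be judged by references.
        if (assemblies.length === 0) { return false; }
        return !assemblies.some(assembly => used.has(assembly));
    });
}

// Made-up graph: a meta-package, a plain package, and a tooling-only package.
const graph: PackageGraph = new Map<string, PackageInfo>([
    ["Meta.All", { assemblies: [], dependencies: ["Web.Core"] }],
    ["Web.Core", { assemblies: ["Web.Core.dll"], dependencies: [] }],
    ["Json.Lib", { assemblies: ["Json.Lib.dll"], dependencies: [] }],
    ["Build.Tools", { assemblies: [], dependencies: [] }],
]);

const unused = findUnusedPackages(
    ["Meta.All", "Json.Lib", "Build.Tools"], graph, new Set(["Web.Core.dll"]));
console.log(unused); // only Json.Lib is reported
```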

Introducing ReferenceTrimmer

If all of this seems overwhelming enough to make you not want to bother cleaning up your projects, fear not. I’ve created a little tool called ReferenceTrimmer which does all the work for you.

When you run it, it accounts for each of the things discussed above and prints out what it believes are unnecessary references, project references, and package references.

I’m sure it doesn’t cover all cases yet, and it likely reports false positives and false negatives, but it was able to find issues even in some of my smaller projects. Contributions are always welcome if you find that it could do better though!

Setting up a Roaming Developer Console

Have you ever wanted to have your personal scripts and aliases just always available to you in any console session? Well, it’s possible!

In my setup I’ll be using OneDrive, but the concept applies to any cloud storage that syncs to your disk like Google Drive or Dropbox.

Setting it up

The Command Prompt has a little-known feature called Autorun, which allows for running a command every time cmd.exe starts. This is done via a registry key, but to make setup easy, you can write a script and put it in your cloud storage. You can take this a step further and have that Autorun script be in your cloud storage as well. This is the basis by which we’ll be creating the roaming dev console.

First, create a setup.cmd in your cloud storage. This script will need to be run once per machine you log into. In my example, I’m putting it under my OneDrive folder under Code\Scripts\setup.cmd.

@echo off

setlocal

REM The script to run each time cmd starts.
set initScript=%HOMEDRIVE%%HOMEPATH%\OneDrive\Code\Scripts\init.cmd

if not exist "%initScript%" (
    echo "%initScript%" does not exist!
    exit /B 1
)

REM Configure the current user's autorun
echo reg add "HKCU\Software\Microsoft\Command Processor" /v "Autorun" /d "\"%initScript%\"" /t REG_EXPAND_SZ /f
call reg add "HKCU\Software\Microsoft\Command Processor" /v "Autorun" /d "\"%initScript%\"" /t REG_EXPAND_SZ /f

echo Autorun configured!

REM init now to avoid the need to restart the console
echo Running "%initScript%" now
"%initScript%"

endlocal

This setup script simply adds the registry key which controls Command Prompt’s Autorun and points it to an init script under your cloud storage.

For those who prefer PowerShell, there is an equivalent concept to Autorun called Profiles. Instead of setting the registry key, add a PowerShell file at %UserProfile%\My Documents\WindowsPowerShell\profile.ps1. This will run initially in any PowerShell window you open. To make the roaming part work, have this profile.ps1 file call an init.ps1 under your cloud storage.

You can actually take this even further and have your setup script also configure other settings (user-wide git config), install various applications (git, nodejs, etc.), and anything else you may want to do once per computer you use.

The init script

Now that the init script is configured to run in each Command Prompt or PowerShell window, what kinds of things should you add to it?

One warning: the Autorun/Profile script runs before every invocation of cmd or powershell. It’s therefore best to avoid long-running work there and, perhaps more importantly, to avoid emitting any output. Any application which spawns child processes using cmd or powershell will incur the cost of your init script, and if it parses the output of the process it launched, it will see any output from your init script. So just keep it quiet, short, and sweet.

Here’s an example of what part of my init.cmd file looks like. In my example, I’m putting it under my OneDrive folder under Code\Scripts\init.cmd.

@echo off

SET PATH=%PATH%;%~dp0

DOSKEY ..=cd ..
DOSKEY n="%ProgramFiles(x86)%/Notepad++/notepad++.exe" $*

Again, for PowerShell users, you can just as easily put similar things in your init.ps1, like global functions.

While this doesn’t look like much, it sets up a few aliases for me and adds my Code\Scripts OneDrive folder to my PATH. I also keep other convenience scripts and tools which don’t require an install under this path.

To give an idea of some of the things I have in mine:

  • handle.exe – Helps identify which stubborn process has an open handle on a file.
  • kill.exe – Renamed from PsKill.exe. Easily kills processes.
  • nuget.cmd – Calls Nuget\nuget.exe, the NuGet CLI
  • procexp.exe – A better task manager.
  • procmon.exe – Shows real-time file and registry accesses, process spawns, etc.

The beauty of this is that if you add a new tool or alias, your cloud storage will automatically sync it to your other machines, so it will just be available there too!

Testing asynchronous code: async vs fake async

In the last post I explored implementing a mock which tested asynchronous code in a “fake” asynchronous way, and I promised to dive a little deeper into that concept and compare it with testing in an asynchronous way.

I say “fake” here because it’s still using async/await, but the way of testing is more of a step by step approach where the unit test ends up effectively waiting on each task or promise to finish before moving on.

Angular has really clear examples of each pattern, so let’s see the differences and compare the pros and cons for each.

Async

This is the pattern I think C# developers are more familiar with, although I’ve seen it used in TypeScript as well. It generally involves the three traditional sections: Arrange, Act, and Assert. Here’s an example of a test using this pattern:

it("should display some links when the user is not logged in", async(() => {
    // Arrange (some arranging was already done in beforeEach as well)
    let authenticationService = TestBed.get(AuthenticationService) as AuthenticationService;
    spyOn(authenticationService, "userInfo").and.returnValue(new BehaviorSubject(notLoggedInUser));

    // Act (detectChanges ends up triggering ngOnInit which actually does the work we're testing)
    fixture.detectChanges();
    fixture.whenStable().then(() => {
        fixture.detectChanges();

        // Assert
        let expectedLinks: { text: string, url: string }[] = [ /* Omitted for brevity */ ];
        let navItems = fixture.debugElement.queryAll(By.css(".nav-item"));
        expect(navItems).not.toBeNull();
        expect(navItems.length).toEqual(expectedLinks.length);

        for (let i = 0; i < navItems.length; i++) {
            let link = navItems[i].query(By.css(".nav-link"));
            expect(link).not.toBeNull();

            let expectations = expectedLinks[i];
            expect(link.nativeElement.innerText).toEqual(expectations.text);
            expect(link.attributes.href).toEqual(expectations.url);
        }

        expect(authenticationService.userInfo).toHaveBeenCalled();
    });
}));

I won’t go into detail about the test as it was pulled from a real example, but the pattern should look familiar. You start in the Arrange section mocking calls and setting up the test, then you trigger the code that’s actually being tested, then finally you assert your expectations on any results or side-effects.

A minor note for those unfamiliar with the use of async here: it’s equivalent to using the done construct and calling done after the last expectation. Or for any C# developers, it’s equivalent to returning a Task from your test. The test framework just waits for all async work to finish before the test ends.

Fake Async

This pattern is used pretty widely in Angular apps, especially when mocking http calls. It’s also how I implemented my Testing.HttpClient C# library. It differs from the usual “Arrange, Act, and Assert” and instead interleaves and combines all three. Here’s an example:

it("should return some data", fakeAsync(() => {
    // Act
    let response: IUser;
    let error: HttpErrorResponse;
    userService.getUser(userName)
        .then((r: IUser) => response = r)
        .catch((e: HttpErrorResponse) => error = e);

    // Arrange, with some implicit assertions
    let expectedResponse: IUser = { name: "someName" };
    let request = httpMock.expectOne({ method: "get", url: `/api/users/${userName}` });
    request.flush(expectedResponse);
    tick();

    // Assert
    expect(response).toEqual(expectedResponse, "should return the expected response");
    expect(error).toBeUndefined();
    expect(httpErrorHandlerService.logError).not.toHaveBeenCalled();
    expect(httpErrorHandlerService.getValidationErrors).not.toHaveBeenCalled();
}));

In this example, the call being tested is made immediately (although there is some setup in a beforeEach not shown). It returns a promise, but the promise is unresolved at that point. Next there is an implicit assertion about what the request made by userService.getUser looked like, and a mock response is provided. Then the promises are flushed with tick, which is why these are fake async tests. Finally, assertions are made.
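To make the flushing step a little less magical, here’s a toy model of the idea in plain TypeScript, with no Angular involved. The FakeScheduler and the callback-based UserService below are invented for this sketch, and this is not how Angular actually implements fakeAsync or tick. It only shows queued work being held back until the test decides to flush it.

```typescript
// Toy model only: not Angular's implementation of fakeAsync or tick.
type Callback = () => void;

class FakeScheduler {
    private readonly queue: Callback[] = [];

    /** Queues work instead of running it, much like an unresolved promise. */
    schedule(callback: Callback): void {
        this.queue.push(callback);
    }

    /** Runs everything queued so far, including work queued while flushing. */
    tick(): void {
        while (this.queue.length > 0) {
            const next = this.queue.shift()!;
            next();
        }
    }
}

interface IUser { name: string; }

// Hypothetical service under test: results are delivered through the scheduler.
class UserService {
    constructor(
        private readonly scheduler: FakeScheduler,
        private readonly fetchUser: (userName: string) => IUser) {}

    getUser(userName: string, onResult: (user: IUser) => void): void {
        this.scheduler.schedule(() => onResult(this.fetchUser(userName)));
    }
}

const scheduler = new FakeScheduler();
const service = new UserService(scheduler, userName => ({ name: userName }));

// Act: the call is made, but nothing has been delivered yet.
const responses: IUser[] = [];
service.getUser("someName", user => { responses.push(user); });
console.log(responses.length); // 0

// Flush the queued work, then assert.
scheduler.tick();
console.log(responses.length); // 1
```

Because the test owns the queue, it decides exactly when the “async” work happens, which is what lets it interleave acting, arranging, and asserting.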

Comparing the two

Ironically, the Async pattern more closely follows how you should test synchronous code. Both tend to follow the “Arrange, Act, and Assert” structure. This pattern sets up a bunch of mocks and rules beforehand to “wind up” the test, then lets the code under test run to completion, and finally checks the expectations. This pattern is good for testing how code responds to inputs and the results it produces.

The fake async pattern is a lot more granular, as you essentially interleave the test code with the code being tested. It’s like stepping through a debugger and providing mocks and asserting as you go, line by line. This pattern is better at testing code that creates side-effects, or code that needs to run in a specific order. Anecdotally though, the granularity can come at a cost of brittleness. Refactoring can easily break tests even though the code remained functionally correct, so you can end up wasting time fixing tests.

One downside of the fake async pattern is that it tends to require extra code to get the async parts of the code flushed. In Angular tests, the tick function does this magic for you, as Angular is able to wrap all Promises and so tick can wait for their completion for you. In C#, this is fairly cumbersome, as there isn’t a way to provide a replacement to the default TaskFactory, so you end up having to expose TaskCompletionSource objects from all your mocks to provide the flushing functionality.
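For a feel of what that manual flushing looks like, here’s a rough TypeScript analogue of the TaskCompletionSource approach. The Deferred class, the MockUserApi, and exampleTest are all hypothetical names for this sketch and aren’t taken from any library. The mock hands out promises it never completes on its own, and the test has to complete and await them by hand.

```typescript
// A promise plus the handles to complete it by hand, similar in spirit to a TaskCompletionSource.
class Deferred<T> {
    readonly promise: Promise<T>;
    resolve!: (value: T) => void;
    reject!: (reason: unknown) => void;

    constructor() {
        this.promise = new Promise<T>((resolve, reject) => {
            this.resolve = resolve;
            this.reject = reject;
        });
    }
}

interface IUser { name: string; }

// Hypothetical mock: every call stays pending until the test completes it.
class MockUserApi {
    readonly pending: Array<{ userName: string; deferred: Deferred<IUser> }> = [];

    getUser(userName: string): Promise<IUser> {
        const deferred = new Deferred<IUser>();
        this.pending.push({ userName, deferred });
        return deferred.promise;
    }
}

async function exampleTest(): Promise<void> {
    const api = new MockUserApi();
    const results: IUser[] = [];

    // Act: start the call; its promise is still unresolved.
    const call = api.getUser("someName").then(user => { results.push(user); });

    // Arrange, with an implicit assertion that exactly one request was made.
    if (api.pending.length !== 1) { throw new Error("expected one pending request"); }
    api.pending[0].deferred.resolve({ name: "someName" });

    // Flush: wait for the continuation to run.
    await call;

    // Assert
    if (results.length !== 1 || results[0].name !== "someName") {
        throw new Error("unexpected response");
    }
}
```

Every mock in the test needs this kind of plumbing, which is the extra code the fake async pattern tends to cost when the framework doesn’t provide it.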

Additionally, given the right tools, any fake async test could be made into an async test. In Angular, instead of using the HttpTestingController, you could provide your own mock HttpClient object and use spies to both match specific request patterns and provide responses.

Verdict

When writing up these examples, I actually had attempted to use the same test written in each pattern, but one would always feel a little awkward. There is definitely something to be said about using the right tool for the job, so in Angular tests if you find yourself testing code that makes http calls or uses timers, feel free to use the fake async pattern. The magic is provided for you, so you might as well use it.

However, I also feel that usage of fake async is fairly niche. It doesn’t work well in all scenarios, and it’s not even a very common pattern in some languages like C#. The async pattern, on the other hand, works well in almost all cases, and so it’s the pattern I’d suggest any well-rounded software engineer master. It’s a commonly used, widely applicable, well-known, and well-understood pattern.

tl;dr if I was stuck on a desert island with only one unit testing pattern, I’d pick the async pattern over the fake async one.