Buffer overruns, license violations, and bad code: FreeBSD 13’s close call

At first glance, Matthew Macy seemed like a perfectly reasonable choice to port WireGuard into the FreeBSD kernel. WireGuard is an encrypted point-to-point tunneling protocol, part of what most people think of as a "VPN." FreeBSD is a Unix-like operating system that powers everything from Cisco and Juniper routers to Netflix's network stack, and Macy had plenty of experience on its dev team, including work on multiple network drivers.

So when Jim Thompson, the CEO of Netgate, which makes FreeBSD-powered routers, decided it was time for FreeBSD to enjoy the same level of in-kernel WireGuard support that Linux does, he reached out to offer Macy a contract. Macy would port WireGuard into the FreeBSD kernel, where Netgate could then use it in the company's popular pfSense router distribution. The contract was offered without deadlines or milestones; Macy was simply to get the job done on his own schedule.

With Macy's level of experience—with kernel coding and network stacks in particular—the project looked like a slam dunk. But things went awry almost immediately. WireGuard founding developer Jason Donenfeld didn't hear about the project until it surfaced on a FreeBSD mailing list, and Macy didn't seem interested in Donenfeld's assistance when offered. After roughly nine months of part-time development, Macy committed his port—largely unreviewed and inadequately tested—directly into the HEAD section of FreeBSD's code repository, where it was scheduled for incorporation into FreeBSD 13.0-RELEASE.

This unexpected commit raised the stakes for Donenfeld, whose project would ultimately be judged on the quality of any production release under the WireGuard name. Donenfeld identified numerous problems with Macy's code, but rather than object to the port's release, Donenfeld decided to fix the issues. He collaborated with FreeBSD developer Kyle Evans and with Matt Dunwoodie, an OpenBSD developer who had worked on WireGuard for that operating system. The three replaced almost all of Macy's code in a mad week-long sprint.

This went over very poorly with Netgate, which sponsored Macy's work. Netgate had already taken Macy's beta code from a FreeBSD 13 release candidate and placed it into production in pfSense's 2.5.0 release. The forklift upgrade performed by Donenfeld and collaborators—along with Donenfeld's sharp characterization of Macy's code—presented the company with a serious PR problem.

Netgate's public response included accusations of "irrational bias against mmacy and Netgate" and irresponsible disclosure of "a number of zero-day exploits"—despite Netgate's near-simultaneous declaration that no actual vulnerabilities existed.

This combative response from Netgate raised increased scrutiny from many sources, which uncovered surprising elements of Macy's own past. He and his wife Nicole had been arrested in 2008 after two years spent attempting to illegally evict tenants from a small San Francisco apartment building the pair had bought.

OH MY GOD THE PERSON WHO WROTE THE PFSENSE WIREGUARD CODE IS THIS GUY https://t.co/e3m2ETLv2c
— Matthew Garrett (@mjg59@nondeterministic.computer) (@mjg59) March 16, 2021

The Macys' attempts to force their tenants out included sawing through floor support joists to make the building unfit for human habitation, sawing holes directly through the floors of tenants' apartments, and forging extremely threatening emails appearing to be from the tenants themselves. The couple fled to Italy to avoid prosecution but were eventually extradited back to the US—where they pled guilty to a reduced set of felonies and served four years and four months each.

Macy's history as a landlord, unsurprisingly, dogged him professionally—which contributed to his own lack of attention to the doomed WireGuard port.

"I didn't even want to do this work," Macy eventually told us. "I was burned out, spent many months with post-COVID syndrome... I'd suffered through years of verbal abuse from non-doers and semi-non-doers in the project whose one big one up on me is that they aren't felons. I jumped at the opportunity to leave the project in December... I just felt a moral obligation to get [the WireGuard port] over the finish line. So you'll have to forgive me if my final efforts were a bit half-hearted."

This admission answers why such an experienced, qualified developer might produce inferior code—but it raises much larger questions about process and procedure within the FreeBSD core committee itself.

How did so much sub-par code make it so far into a major open source operating system? Where was the code review which should have stopped it? And why did both the FreeBSD core team and Netgate seem more focused on the fact that the code was being disparaged than its actual quality?

Code quality

The first issue is whether Macy's code actually had significant problems. Donenfeld said that it did, and he identified a number of major issues:

Sleep to mitigate race conditions
Validation functions which simply return true
Catastrophic cryptographic vulnerabilities
Pieces of the wg protocol left unimplemented
Kernel panics
Security bypasses
Printf statements deep in crypto code
"Spectacular" buffer overflows
Mazes of Linux→FreeBSD ifdefs

But Netgate argued that Donenfeld had gone overboard with his negative assessment. The original Macy code, they argued, was simply not that bad.

Despite not having any kernel developers on-staff, Ars was able to verify at least some of Donenfeld's claims directly, quickly, and without external assistance. For instance, finding a validation function which simply returned true—and printf statements buried deep in cryptographic loops—required nothing more complicated than grep.

Empty validation function

In order to confirm or deny the claim of an empty validation function—one which always "returns true" rather than actually validating the data passed to it—we searched for instances of return true or return (true) in Macy's if_wg code, as checked into FreeBSD 13.0-HEAD.

root@banshee:~/macy-freebsd-wg/sys/dev/if_wg# grep -ir 'return.*true' . | wc -l
21

This is a small enough number of returns to easily hand-audit, so we then used grep to find the same data but with three lines of code coming immediately before and after each return true:

root@banshee:~/macy-freebsd-wg/sys/dev/if_wg# grep -ir -A3 -B3 'return.*true' .

Among the valid uses of return true, we discovered one empty validation function, in module/module.c:

wg_allowedip_valid(const struct wg_allowedip *wip)
{

 return (true);
}

It's probably worth mentioning that this empty validation function is not buried at the bottom of a sprawling mass of code—module.c as written is only 863 total lines of code.

We did not attempt to chase down the use of this function any further, but it appears to be intended to check whether a packet's source and/or destination belongs to WireGuard's allowed-ips list, which determines what packets may be routed down a given WireGuard tunnel.

Printf in crypto loops

Some pfSense 2.5.0 users reported strange hexadecimal output spamming the root console of their router. This matches Donenfeld's description of printf statements deep in crypto code, and we were able to easily discover the source in much the same way we found the empty validation function above.

root@banshee:~/macy-freebsd-wg/sys/dev/if_wg# grep -ir printf . | wc -l
68
root@banshee:~/macy-freebsd-wg/sys/dev/if_wg# grep -ir printf module/crypto | wc -l
7
root@banshee:~/macy-freebsd-wg/sys/dev/if_wg# grep -ir printf module/crypto/zinc/chacha20poly1305.c 
		printf("src_len too short\n");
		printf("calculated: %16D\n", b.mac, "");
		printf("sent      : %16D\n", src + dst_len, "");

root@banshee:~/macy-freebsd-wg/sys/dev/if_wg# grep -ir -A3 -B3 'printf("calc' module/crypto/zinc/chacha20poly1305.c 
	if (likely(!ret))
		chacha20(&chacha20_state, dst, src, dst_len, simd_context);
	else {
		printf("calculated: %16D\n", b.mac, "");
		printf("sent      : %16D\n", src + dst_len, "");
	}
	memzero_explicit(&chacha20_state, sizeof(chacha20_state));

In mostly backend code like a WireGuard implementation, printf statements in general are a bit of a red flag. There's obviously nothing explicitly wrong or unsafe about emitting text to the console—but it's something that's generally done rarely. One of the most frequent uses of printf in this scenario is temporary debugging, when a developer is trying to figure out what's going wrong during the process of writing code.

Usually, when you're done with a bout of this "printf debugging," you remove the printf statements you were using to check data values along the way—and it's not uncommon to specifically grep for stray instances of printf when a piece of code is finished and ready for production, just to make certain you've cleaned up after yourself.

The stray printf statements we found in Macy's chacha20poly1305.c are clearly just such an example of temporary debugging code. By themselves, they're a mild issue—indicative mostly of sloppiness in getting a piece of code ready for prime-time production use. The fact that those statements are still being triggered frequently in production use is a more serious issue—indicative of likely problems in the crypto implementation itself.

A spectacular buffer overflow

There are multiple buffer overflows present in Macy's if_wg implementation, with varying preconditions necessary to exploit them. The first one we became aware of exploits a failure of Macy's code to anticipate the use of jumbo frames.

In most Ethernet networks, the MTU—Maximum Transmission Unit, which roughly speaking is the largest allowable packet size—is either 1,492 bytes or 1,500 bytes. The term Jumbo frames is a human-friendly way to describe frames larger than 1,500 bytes. In networks that need extremely high throughput, configuring jumbo frames—most commonly, up to 9,000 bytes—is a way to increase throughput significantly by decreasing the packet count necessary to transmit a given (presumably large) volume of data.

Macy's code does not take the possibility of jumbo frame configuration into account and blindly allocates a 2,048 byte chunk of kernel memory to fit a packet in. With a typical MTU of 1500, this is sufficient. But when jumbo frames are configured, the packets overrun this 2,048 byte buffer and overwrite sensitive kernel memory.

When we first witnessed this bug demonstrated, the kernel crashed after a few hundred 3,000 byte pings. In the interest of getting a more concise set of screenshots, we used a 64K MTU, and a 64,000 byte packet—and panicked the kernel immediately, on the first ping.

This obviously represents a DoS (Denial of Service) vulnerability—after a kernel panic, a router will generally be out of service until a human operator manually reboots it. (It's also possible, but unlikely and typically considered a very bad idea, to configure the system to reboot on panic rather than hanging.)

Although this is the most "spectacular" buffer overflow we've seen in the code, there are others—some of which do not require configuration of jumbo frames at all.

The rest of the laundry list

Ars Technica isn't Kernel Debugging Quarterly, so we'll leave the rest of the listed quality issues to others. FreeBSD developer Kyle Evans himself verified several of the others, including sleep to mitigate race conditions and at least one security bypass. Other developers and researchers pointed out several other problems privately, including questionable and difficult-to-maintain uses of ifdef.

We also found license problems in if_wg—code in the Linux kernel is licensed GPL v2.0, while code in the FreeBSD kernel must be licensed with weak permissive licenses such as BSD or MIT only. The crypto_xor routine Macy lifted from Linux's kernel does not appear to be properly licensed for use in FreeBSD.

In fairness to Macy, this is an easy mistake to make—and it would be an easy one to forgive if Netgate employee and FreeBSD committer Scott Long hadn't just raked Donenfeld over the coals for a similar error regarding Netgate's copyright:

The Netgate copyright was unilaterally removed. Yes, it was re-instated a few minutes ago, and with good reason. Please don’t do that again.

... An accusation was made, again tonight, that Netgate merely paid to benchmark the code, that we contributed nothing to it, and that it’s bad code that can’t be audited. What I see is that the meat of the implementation is copyrighted by Jason and others. There’s a modest but non-trivial amount of glue code that has (or had) the Netgate copyright. I’m not going to touch the audit nonsense.

(Update: Donenfeld points out in the comments that Netgate's copyright was only missing temporarily while inside a local staging repo with "a ton of churn"—not from any commit to an official FreeBSD repository.)

The big takeaway here is that whatever you think of any one developer's diplomacy and tact, this code was not in a condition ready for production use on a global scale—which is exactly where it was headed.

Netgate had already implemented Macy's if_wg in its pfSense router distribution despite its non-production, release candidate status in FreeBSD—and there was no indication that the FreeBSD community (which includes many Netgate employees) would have caught the issues itself without outside intervention.

Poor code wasn’t par for the course

Was this sort of code consistent with developer Matthew Macy's ability and history, or was it an outlier? Why was Netgate so focused on getting his implementation into the FreeBSD kernel—and into pfSense—despite obvious, easily discoverable indicators of its low quality? And what, if any, procedures were in place that might have caught such issues prior to if_wg going out into the world in FreeBSD 13.0-RELEASE?

We weren't specifically familiar with Macy's prior work—and when we began digging, we discovered Macy's criminal history in San Francisco more easily than his coding history. That partially answered the question of why Macy's work had been difficult to unearth—he's only been known professionally as Matthew Macy for a few years. Armed with both of his professional names, we were able to begin finding answers—Macy has been involved directly and substantially with many high-level and well-regarded projects over the years.

Those projects include multiple network drivers and a driver abstraction library. This qualified him more substantially to write a WireGuard port—WireGuard presents as a virtual network interface, no different in theory from a physical one. But that also made some of his errors even more puzzling—an experienced network interface developer should never have hard-coded packet sizes, leading to the horrendous buffer overflow we demonstrated above.

Interviewing other prominent developers who worked with Macy on those projects proved more difficult. To his credit, Macy does not appear to have hidden his past from coworkers, and the few who would speak to us said he volunteered that information directly. But that same toxic past makes Macy a very touchy subject. Few of his prior co-workers would speak to us at all, and none were willing to go on the record.

Those who did speak described him as a talented, professional coder who produced no more bugs than most and responded well to code reviews and the need to squash bugs when identified. The condition of Macy's if_wg implementation came as a surprise to those who had worked with him in the past—sloppy, half-finished code wasn't simply par for the course when working with him.

Macy himself spoke to us via email over the course of several messages. He initially seemed focused on deflecting challenges to the quality of his code, making accusations that Ars had simply taken Donenfeld's word about all the problems. When we pointed out that we had independently verified many of those claims—and included the simple grep commands necessary to verify them—Macy became more forthcoming, although he was still aggrieved, saying:

That's nice. What were you doing the months it was up for review? (sarcasm) [...] This points more to the absence of quality reviewers.

This is when he described a situation full of personal stress and admitted that his final effort had been "half-hearted."

But Macy went on to recount an impressive number of projects he's been a key contributor to in the past, and he concluded:

Which is all to say that you've indirectly called in to question the integrity of large chunks of the [FreeBSD] kernel.

Macy appears to have intended this statement as an attack—but it's actually a reasonable point.

The kernel's integrity isn't in question simply because of Macy's skills, ethics, or even criminal history. As he himself noted, the real issue is an absence of quality reviewers. More importantly, there seems to be an absence of process to ensure quality code review.

Still, in a further email, Macy argued that the grass is no greener on the other side:

While we definitely fell down to some degree in this particular case, I think you may be overestimating what review catches immediately. There's a reason why many large orgs are still on (for example) Linux 4.9 and many much older. Every version of Linux I've run has had weird crashes.

Netgate responds

We wanted to know how such a developer had produced uncharacteristically poor code—but we also wanted to know why it had been so thoroughly championed by Netgate, even when its problems were placed under a spotlight. We also wanted to know why FreeBSD itself seemed to be merely trailing in Netgate's wake.

In emails to Donenfeld, blog posts at Netgate, and emails to FreeBSD mailing lists, Netgate employee Scott Long seemed to shift seamlessly from speaking in proprietary ways about pfSense—Netgate's FreeBSD-based router distribution—and FreeBSD itself:

I’ve also spoken with the FreeBSD Security Officer, and we’ve agreed that wireguard will be removed from all branches of FreeBSD until further notice. I’ve also informed Kyle of this. I do not support its reintroduction into FreeBSD, whether in the src tree or in the ports tree, at this time. As for pfSense, we are conducting an audit and will decide on the best course of action for our customers and our company.

This declaration comes hot on the heels of an accusation, in the same email, of irresponsibly dropping "zero-day" vulnerabilities:

What you and Kyle did was tell the world that there are a number of zero-day exploits in the code. You gave us no details until after the fact, gave us no time to mitigate, correct, and publish before your announcement and Kyle's code drop, and used the opportunity to bash the code, and by extension us, for your own self-gain.

In the context of FreeBSD, however, these vulnerabilities weren't "zero-day" at all—they were found in beta code, existing in a Release Candidate build which would become production code on March 30 if not addressed before then. From Netgate's perspective, they were "zero-day" because Netgate backported this beta code into pfSense 2.5.0, released in mid-February.

Look if you're trying to integrate your broken code into an upstream project and someone says "this is broken" then complaining that you've already deployed your broken code elsewhere isn't a reason for them to give you a break
— Matthew Garrett (@mjg59@nondeterministic.computer) (@mjg59) March 16, 2021

As prominent Linux developer Matthew Garrett points out above, this was FreeBSD kernel code—and it was not in production status there. This struck us as evidence that the delineation between Long's role as FreeBSD committer and Netgate employee—and possibly between Netgate's own roles as private company and FreeBSD sponsor—might need more clarification.

To investigate further, we spoke on the record with Netgate CEO Jim Thompson, and Netgate Director of Development / FreeBSD committer Scott Long.

Ars: "Can you describe the relationship between Netgate and the FreeBSD core team and foundation?

Thompson: The relationship is that Netgate has two software products and one of them is based on FreeBSD. And in that, they're the vendor for FreeBSD. We have a product based on FreeBSD. There are several employees here, employees or contractors, who are or have been on [FreeBSD] core. There's almost a Chinese wall between activities at Netgate and their activities.

Ars: OK. Do your employees who do work with FreeBSD receive any kind of training or guidance on how to separate their roles as FreeBSD people and as Netgate employees?

Thompson: There actually are explicit requirements in the employee contract that separate Netgate from FreeBSD. However, it's understood that there could be... We're trying to eliminate any kind of influence which is, I think, what you're trying to imply. What they do with FreeBSD and their roles as core committers is them on their own time. That's not a form or function of their employment with Netgate.

Long: So I'll say that being a member of the core team gives me no special privileges to add onto the code, and my voice is one of many. I'm a very small voice in the leadership. I'm a small voice in the community. The structure of FreeBSD is much more flat than in, say, Linux. Everyone has some control and privileges and equal access. But that doesn't mean that any one person is liable to run roughshod or to push their agenda without oversight. The FreeBSD community is very much built on oversight. Very much built on pretty strict distribution of authority and responsibility. And like I said, I'm just one small voice."

Although Long describes himself as "one small voice" inside a system built on "strict distribution of authority and responsibility," his voice was very nearly the only one heard on the subject of this code. Long has seemed concerned with shouting down potential challenges to the quality of pfSense rather than to finding and fixing potential problems in FreeBSD's kernel. We see this in a Netgate official blog post he wrote to pfSense customers:

Right now, we have not found any issues that would result in a remote or unprivileged vulnerability for pfSense users who are running Wireguard. We’ve identified several low-risk issues that are unlikely to be exploitable, except by an attacker who has already compromised the admin permissions of the system. Also, the use of Jumbo Frames appears to be problematic...

When Ars pushed Thompson and Long on this claim in the interview—using the kernel-panicking ping above as a counterpoint—their response was heated. Thompson claimed it was "not an exploit" and that an attacker would "need admin privileges" to exploit it.

Both claims seem suspect. Denial of Service is an entire category of CVE, and an attacker does not need admin privileges in order to exploit this one—the admin merely needs to have made the mistake of configuring a tunnel with jumbo frames enabled.

Long pointed out that his blog post then declared that jumbo frames "appear to be problematic"—but this is thin gruel indeed. "Problematic" is an awfully vague description, and it followed a bold-faced declaration that the code has "no issues that would result in [a] vulnerability."

The response from FreeBSD Core

Where was the FreeBSD Core team while all this was happening? We reached out to them, including directly addressing many of the group's most prominent committers, at the same time that we requested an interview from Netgate.

Several FreeBSD community members would only speak off the record. In essence, most seem to agree, you either have a commit bit (enabling you to commit code to FreeBSD's repositories) or you don't. It's hard to find code reviews, and there generally isn't a fixed process ensuring that vitally important code gets reviewed prior to inclusion. This system thus relies heavily on the ability and collegiality of individual code creators.

In a FreeBSD mailing list posting, developer Alexey Dokuchaev points out that Macy closed code review tasks without actually completing them:

... my questions to mmacy@ about original wireguard commit (and ZoL, FWIW) or Phabricator comments had not been answered; I also recall reviews being closed in "not accepted" state by him. While I appreciate the heavy-lifting, developers should be ready to explain and sometimes defend their work on public forums.

The FreeBSD Core team did not respond to Ars' requests for comment for several days. The response, when it finally came, said:

Core unconditionally values the work of all contributors, and seeks a culture of cooperation, respect, and collaboration. The public discourse over WireGuard in the past week does not meet these standards and is damaging to our community if not checked. As such, WireGuard development for FreeBSD will now proceed outside of the base system. For those who wish to evaluate, test, or experiment with WireGuard, snapshots will be available via the ports and package systems.

As a project, we remain committed to continually improving our development process. We’ll also continue to refine our tooling to make code reviews and continuous integration easier and more effective. The Core Team asks that the community use these tools and work together to improve FreeBSD.

Concluding a difficult saga

FreeBSD is an important project that deserves to be taken seriously. Its downstream consumers include industry giants such as Cisco, Juniper, NetApp, Netflix, Sony, Sophos, and more. The difference in licensing between FreeBSD and Linux gives FreeBSD a reach into many projects and spaces where the Linux kernel would be a difficult or impossible fit.

Neither Netgate's responses, FreeBSD Core's, nor the off-record responses we heard from independent FreeBSD community members lead us to believe that there was in fact any process in place that could reasonably have been expected to catch this issue prior to it going out into the world in 13.0-RELEASE.

We take some heart in the fact that FreeBSD Core team's expressed a commitment to improving processes, refining tooling, and making code reviews more effective—but it's impossible to ignore the fact that this commitment comes as an afterthought to attacking "public discourse" that highlighted the need for those improved processes, refined tools, and more effective reviews in the first place.

You can follow the expected launch of FreeBSD 13.0-RELEASE—without WireGuard—on March 30.