ARM64 Boot Camp

(c) 2024 by Darek Mihocka, founder, Emulators.com.

updated January 16 2024

[ARM64 Boot Camp: Table Of Contents]  [Return to Emulators.com]

A Brief History of Windows on ARM

As I mentioned, my deep dive into all things ARM began in 2013 shortly after the release of Windows RT and my focus as an engineer on the Visual C/C++ compiler went from x86 to ARM32.  Let's look at the evolution of the Windows operating system over this past decade as it pertains to ARM32 and ARM64, and as I got to experience it first hand.

2012 - Microsoft launched the Windows RT operating system on the Surface RT tablet (powered by 1.3 GHz NVIDIA Tegra3 CPU supporting Thumb2 instruction set).  Windows RT was essentially Windows 8 for 32-bit ARM and visually identical to Windows 8 on an Intel machine.

The Tegra3 wasn't a bad mobile CPU at the time. In fact when I ported my Xformer 8-bit emulator and the Bochs open source x86 emulator to Windows RT in 2013, I was pleasantly surprised it was keeping up and even outperforming the 1.6 GHz Intel Atom (a common mobile CPU in "netbooks" and small laptops at the time). It was clear that in a head-to-head competition running similar code, ARM native code could in fact hold its own against Intel x86 native code. (Until Windows RT, ARM and Intel devices never really intersected in such a manner since iPod and Android apps on ARM aren't comparable to Win32 apps on Intel).

Home run, yes?  No. Unfortunately, a retail Surface RT sold to the paying public was locked down so that the average user could mainly run only Windows Store apps, similar to what was later released as Windows 10 "S" mode.  It was Windows 8 with a lot of restrictions.  Since I had the benefit of working at Microsoft and having an unlocked Surface RT developer kit, I could arbitrarily compile and run unsigned binaries and treat the developer kit almost as a normal Windows 8 desktop PC.  My experience was great; the paying customer's was not.  Locking down Windows RT was a business decision, not a technical limitation, since it was built from the same source code as Windows 8.  Interestingly when Microsoft launched the Intel Atom based Surface 3 tablet in 2015 to replace the Surface RT, it was left fully unlocked.

The real stake-through-the-heart deal breaker for RT was the lack of any kind of x86 compatibility (which the Intel Atom of course supported).  Windows RT had no built-in x86 emulator, so other than compiling your own private build of Bochs and running it on an unlocked RT device as I did, there was no easy way to run legacy Windows XP or Windows 7 apps on the Surface RT.

2015 - Windows RT was cancelled.  But development work on ARM was far from dead!

2016 - Microsoft and Qualcomm jointly announced an effort to bring full ("unlocked") desktop Windows to 64-bit ARM64 devices; this time with x86 emulation support to offer the backward compatibility that Windows RT was missing.  This was a good business decision and I was fully on board!  It was an exciting moment for me, because this new project was about to go out of "science project mode" and into real production.

2017 - A mad dash of development was happening as we tried to get the x86 emulator in shape and test as many Windows XP and Windows 7 apps as possible to make sure they emulated correctly.  That initial x86 emulator for ARM64 was a codebase previously used for the x86-on-PowerPC emulation for Xbox 360 a decade earlier, which itself was based on Virtual PC for Mac from the 1990's and acquired by Microsoft in 2003.  In 2005 it was sufficient to emulate a 32-bit Pentium III supporting Intel's SSE instruction set for Xbox compatibility, as well as Windows XP compatibility for "Project Helium".  But by 2017, as Windows 10 had raised the minimum hardware requirement to SSE2 (and for all intents and purposes it was effectively SSE3), our team had dozens of new Intel SSE2 and SSE3 instructions to implement and test, while simultaneously porting thousands of lines of assembly code and C++ code from PowerPC to ARM64.

2018 - the first wave of Windows 10 on ARM (or as they call them at Qualcomm - Windows on Snapdragon) devices launched - you can read my old blog posts and watch my unboxing videos of the HP Envy X2, the Lenovo Miix, and the ASUS Novago.  These devices were in hindsight Minimum Viable Products, launched using the Qualcomm Snapdragon 835 processor with a bare minimum of RAM (typically 4GB) and disk space (typically 64GB).  As I mentioned in the intro, these were basically cell phones with a keyboard trying to run full Windows 10.  Unlike Windows RT, Windows 10 for ARM came with a built-in 32-bit Intel x86 emulator called 'xtajit'.   However, we had a big problem - the 835 only supported the ARMv8.0 instruction set - which is lacking a few things handy for emulation such as Intel-style compare-swap atomics:  ARMv8.0 only supports the old ARM32-style atomics, which do not scale well as CPU core count increases on larger processors.

This is not the first time this kind of CPU-misstep has happened.  When Intel released its first 32-bit x86 CPU the 80386 (or just 386 for short) which was supported by Windows NT 3.1 and by Windows 95, it was soon realized that the 386 was missing a few key instructions such as compare-swap atomics to efficiently support multi-threading.  The older 8086-style LOCK prefix for atomics was not sufficient to implement certain types of locking primitives since it was impossible to know the previous value of an atomic variable.  The addition of the CMPXCHG and XADD instructions to the 486 instruction set solved these problems, which is why going forward Windows NT 4.0 and Windows 98 required a 486.  Unfortunately Microsoft repeated history by going with an ARMv8.0-only Snapdragon 835.
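To make the "previous value" point concrete, here is a minimal C++ sketch of the two kinds of primitive that CMPXCHG and XADD made practical. The function names `take_ticket` and `atomic_store_max` are hypothetical and chosen only for illustration; this is not code from Windows. Both rely on the atomic operation handing back what the variable held before the update, which is exactly what a plain LOCK-prefixed read-modify-write could not provide.

```cpp
#include <atomic>

// Ticket dispenser: each caller needs the value the counter held *before*
// its own increment. That is an atomic fetch-and-add (XADD on x86).
inline unsigned take_ticket(std::atomic<unsigned>& next_ticket) {
    return next_ticket.fetch_add(1, std::memory_order_acq_rel);
}

// Lock-free "store the maximum": replace the value only if no other thread
// changed it since we last observed it. That is compare-and-swap
// (CMPXCHG on x86).
inline void atomic_store_max(std::atomic<int>& target, int candidate) {
    int observed = target.load(std::memory_order_relaxed);
    while (observed < candidate &&
           !target.compare_exchange_weak(observed, candidate,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
        // On failure compare_exchange_weak refreshes `observed` with the
        // current value, so the loop re-checks against fresh data.
    }
}
```

Without the returned previous value, two threads calling `take_ticket` could not tell which of them got which ticket, and `atomic_store_max` could not detect that another thread had raced ahead of it.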

A second wave of devices in late 2018 - the Lenovo Yoga C630 and the Samsung Galaxybook 2 - were based on the Snapdragon 850 processors which did in fact support ARMv8.1 and delivered a nice 25% or higher performance speedup (a combination of a faster 3.0 GHz clock speed and improved IPC).  Looking back, a great deal of engineering effort was wasted supporting ARMv8.0, which was a speed hurdle not only for emulation but also for plain old native ARM64 code.  As I'll explain later, the ghost of ARMv8.0 still haunts us today.

2019 - Porting code from x86 or x64 to ARM64 is not trivial, especially if Intel-specific compiler intrinsics are being used in C/C++ source code (the native ARM64 compiler can parse neither those intrinsics nor Intel's __m128 data type).  So in the spring of 2019 I began work on a "soft intrinsics" layer for ARM64 to aid in porting of existing C/C++ source code that relied on such Intel intrinsics.  This was a necessary step to helping developers move away from _emulating_ their apps on ARM64 to _porting_ their apps to ARM64.  The idea was that if porting to ARM64 was made easier there would be less need to keep improving emulation or to support 64-bit Intel apps.  In my opinion, soft intrinsics were a necessary tool regardless; but to hope that this would avoid the need to truly support the full x86 instruction set was a bad bet which proved to be problematic when Microsoft launched the Surface Pro X in late 2019.  Pro X was based on a new generation of Qualcomm Snapdragon 8180 processor (also called "8cx gen 1" and "SQ1") and this CPU offered about an additional 50% speedup over the 850.  Pro X should have been a grand slam home run, but unfortunately the Pro X was generally panned in reviews for its continued lack of 64-bit emulation support, which coincided with apps such as Cinebench and Photoshop dropping their 32-bit x86 builds.

If more Windows apps followed suit and went 64-bit Intel only, this would quickly degrade Windows on ARM back to Windows RT doorstop status.  Microsoft overlooked that x86 is a constantly evolving moving target so for as long as AMD and Intel continue to crank out new x86 processors, then developers will keep releasing new games and applications to target those new x86 processors; emulation has to follow and keep up with the latest x86 features.  Just because emulation worked sufficiently at emulating 32-bit x86 in 2018 does _not_ make it a "one and done".  Fortunately my existing work on soft intrinsics was about to come in very handy...
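For readers who have not seen the idea before, a "soft intrinsic" can be pictured roughly as follows. This is only a minimal sketch under my own assumptions: the names `soft_m128`, `soft_mm_set_ps`, `soft_mm_add_ps`, and `soft_mm_cvtss_f32` are hypothetical stand-ins, not the names or the implementation used in the Microsoft layer described above. The point is simply that an Intel vector type and its intrinsics can be expressed in plain C++ that any ARM64 compiler accepts.

```cpp
#include <cstddef>

// Hypothetical stand-in for Intel's 128-bit __m128 type: four packed floats.
struct alignas(16) soft_m128 {
    float lane[4];
};

// Hypothetical equivalent of _mm_set_ps. Intel's argument order lists the
// highest lane first, so e0 ends up in lane 0.
inline soft_m128 soft_mm_set_ps(float e3, float e2, float e1, float e0) {
    return soft_m128{{e0, e1, e2, e3}};
}

// Hypothetical equivalent of _mm_add_ps (the SSE ADDPS instruction):
// lane-by-lane single-precision addition.
inline soft_m128 soft_mm_add_ps(soft_m128 a, soft_m128 b) {
    soft_m128 r;
    for (std::size_t i = 0; i < 4; ++i) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}

// Hypothetical equivalent of _mm_cvtss_f32: return the lowest lane.
inline float soft_mm_cvtss_f32(soft_m128 a) {
    return a.lane[0];
}
```

A production layer would presumably map such helpers onto NEON instructions rather than scalar loops; the scalar form is used here only because it is easy to read and to check.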

2020/2021 - Microsoft management gave us the green light to implement emulation of 64-bit x86 (which as explained earlier I will refer to as 'x64' to distinguish it from 32-bit 'x86').  As part of my work to bring up soft intrinsics, I'd written an x64 test interpreter called 'xtabase' which used and tested those soft intrinsics to implement SSE handlers, so I already had most of a command-line based x64 test emulator at the ready if needed. In early February of 2020 my colleagues and I wired xtabase into the OS and within days we had an x64 "hello world" running; and within weeks we had GUI apps (including the x64 build of Xformer 10) running. You can still find the functional xtabase.dll in the C:\Windows\System32 directory on Windows 11 on ARM releases. Only months later in June 2020 did Apple announce their own ARM64-based M1 processor and demonstrate x64 emulation on Rosetta, and we'd unfortunately passed the opportunity to announce and demo x64 emulation first at Microsoft's BUILD developer conference a month earlier, doh!  Microsoft's x64 emulation based on JIT (using a fork of xtajit called 'xtajit64') was finally announced in October 2020 (which made it look like Microsoft was playing copycat with Apple) and officially launched in 2021 as part of the first Windows 11 (SV1) build 22000 or "Cobalt".

2022 - The annual Windows upgrade Windows 11 (SV2) launched in late 2022 as build 22621 or "Nickel" and contained quite a few compatibility improvements, for example the addition of SSE4.2 emulation to both the x86 and x64 emulators.  The lack of SSE4.2 emulation had been a painful source of crashes and performance issues and was technical debt that needed to be fixed.  We'd also worked with game companies such as Valve to dig into and fix issues with things like anti-cheat which had been preventing some games from working correctly with Cobalt.  SV2/22621/Nickel is also known as Windows 11 22H2 and is currently, as of this writing in January 2024, the latest official shipping release of Windows 11.  Despite Microsoft's marketing claim that Windows 11 on ARM _requires_ ARMv8.1, Nickel is actually still compatible with ARMv8.0 and thus with the Snapdragon 835.  I run Nickel on both my HP Envy X2 and my Lenovo Miix tablets in addition to my supported ARMv8.1 devices such as the Lenovo Yoga, Surface Pro X, and Surface Pro 9.  The ARMv8.0 "tax" is unfortunately still being paid by anyone not using the more recent Windows Insider releases.

On the hardware front, Qualcomm released an updated 3rd generation 8cx in 2022 called the Snapdragon 8280, which brought another significant jump in performance from the previous 8180 generation - about 30% to 100% speedup depending on the app.  The 8280 is the same chip used by the Lenovo x13s laptop and supports the ARMv8.4 instruction set, similar to what the Apple M1 supports.  Not surprisingly the 8280 is just a hair slower than the M1 (partly due to M1's higher clock speed).

What is remarkable is that in just a 4-year span the Windows on ARM devices based on Snapdragon experienced about a 3x to 4x performance increase natively, and typically over 5x emulated.  I will show some side-by-side comparisons in demo videos.

As a sign that Windows on ARM devices are achieving parity with their AMD and Intel counterparts, Microsoft finally dropped the "X" (as in Surface Pro X) when it launched both the Intel and Qualcomm based versions of the Surface Pro 9 in October 2022 - no more "RT" or "X" designations to signify ARM64.  This was a significant milestone, because it reduces the confusion (um, well...) that consumers experience with that whole "Pro" vs "Pro X" naming convention and brings ARM into line with the common model naming they've used for AMD-based Surface devices.  After all, a Windows on ARM PC is "just a PC" as the saying in 2018 went; it's getting closer.

A third 8280-based device (and my favorite from 2022) is "Volterra", properly known as the "Microsoft Windows Dev Kit 2023":

https://arstechnica.com/gadgets/2022/11/project-volterra-review-microsofts-600-arm-pc-that-almost-doesnt-suck/

Volterra is a non-tablet ARM64 device, and at $600 a fantastic developer kit and starter device for those looking to ramp up on ARM64 without spending upwards of $2000 on a similar tablet. With 32GB of RAM, 512GB SSD, DisplayPort support, wired Ethernet, it's Microsoft's equivalent of a Mac Mini M1. I have my Volterra docked to a keyboard and 4k monitor and use it as my primary development device these days. And it has the RAM and plenty of disk space to host multiple Hyper-V ARM64 VMs, which allows me to easily compare the behavior of different Windows builds on the same device. I highly recommend the Volterra as a developer starter kit.

2023 - the first big news of 2023 came in February with the announcement that Microsoft officially supports Windows on ARM running on Apple Silicon (i.e. the M1 and M2).  Your Apple Macbook M1 or Mac Mini M2 is as legitimate a Windows device as a Surface Pro X or Surface Pro 9, which makes it a two-vendor race - Apple and Qualcomm.

In October 2023 both companies raised the stakes by announcing new processors (the Apple M3 and the Snapdragon X) which both push ARM64 clock speeds above the 4.0 GHz threshold.  No wonder then that rumors abound that other silicon vendors such as AMD, NVIDIA, and SAMSUNG might join the Windows on ARM party:

https://www.sammobile.com/news/samsung-may-be-developing-exynos-chip-for-windows-on-arm-laptop/

https://www.reuters.com/technology/nvidia-make-arm-based-pc-chips-major-new-challenge-intel-2023-10-23/

https://www.digitaltrends.com/computing/nvidia-and-amd-may-rival-apple-with-arm-processors/

https://www.tomshardware.com/pc-components/cpus/windows-on-arm-may-be-a-thing-of-the-past-soon-arm-ceo-confirms-qualcomms-exclusivity-agreement-with-microsoft-expires-this-year/

This is all pure speculation as far as I'm concerned, but to go from an exclusive supplier to two CPU suppliers to _five_ would do wonders for the Windows on ARM ecosystem.  Remember how fast innovation took place in x86 during the late 1990's when you had not just AMD but also Transmeta, Cyrix, VIA, and others competing in the x86 desktop market.  If we've learned anything over the years, Intel and Microsoft tend to do their best work when there is real competition and I'm hoping to see this same sense of competition light up the ARM hardware world.

Another cool hardware device based on Snapdragon 8280 which released in 2023 is the Samsung Galaxybook2 Pro 360.  This beefed up 8280 successor to their older 7cx-based Galaxybook Go features 16GB of RAM, a beautiful OLED screen, and three USB-C ports.  Like the 2018 Lenovo Yoga C630 it is a clamshell laptop form factor where the keyboard flips around 360 degrees to turn into a tablet - none of that flimsy detachable keyboard nonsense.  And best of all the GB2 Pro 360 is _lighter_ than the Surface Pro X, the Surface Pro 9, or the Macbook M1 weighing in at just 36 ounces - just over 1 kilogram.

2024 - Remember I mentioned that the ghost of Snapdragon 835 still haunts Windows on ARM today.  That is true for Windows 10, Windows 11 SV1, and Windows 11 SV2.  However, for those not tracking the week-to-week Windows Insider builds, 2023 brought a lot of nice improvements to Windows on ARM which can be tested and evaluated by installing new Insider builds.  Here is a small list of what I have observed:

- the native ARM64 release of Visual Studio and ARM64 .NET no longer require an Intel-hosted cross-compile or ARM64-hosted emulated environment to develop for ARM.  i.e. you can edit, compile, run, and debug all on the same ARM64 device now.

- new performance improvements in Visual Studio's native ARM64 compiler code quality targeting modern ARM64 processors - the first batch of ARM64 code quality improvements arrived in release 17.4 and a second batch of improvements in 17.6, along with corresponding optimizations in the Windows Insider SDK.

- the Windows kernel dropped ARMv8.0 support shortly after build 22621, now requiring and enforcing a minimum of ARMv8.1 and thus the new atomics instructions.  These improvements showed up during the "Zinc" milestone (ZN_RELEASE builds) which eliminated much of the ghost of ARMv8.0.

- additional native performance improvements arrived in the latest 26xxx builds as in-box binaries are being recompiled from using ARMv8.0 instructions (such as DMB, LDXR, and STXR) to using more optimized barrier and atomics instructions.  You can easily see this by disassembling binaries in C:\Windows\System32 and comparing the build 22621 code sequences against the latest Insider builds.

- the Windows kernel dropped support for 32-bit ARM (Thumb2) binaries entirely after the ZN_RELEASE build 25393.  Starting with build 25905, only 64-bit ARM binaries are installed, which frees up about 1GB of disk space on the C:\ drive (i.e. the "C:\Program Files (Arm)" and "C:\Windows\SysArm32" directories have effectively gone away) and leaves the Windows on-disk layout very similar to the AMD/Intel layout.  This was necessary since newer processors such as Apple M1/M2 and new Snapdragon X have no hardware support for ARM32.

- very measurable improvements in emulation performance since SV2 for all existing supported silicon (Snapdragon 850, 8180, 8280, and Apple M1/M2).  This is from a combination of translation improvements of existing ARMv8.0 and v8.1 code sequences, and the use of newer ARMv8.4 instructions exposed by the M1/M2 and 8280 on that newer hardware.
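To illustrate the ARMv8.0 versus ARMv8.1 recompilation mentioned in the list above, consider a trivial reference-count style increment in C++. The function name `add_ref` is hypothetical, and the instruction sequences in the comments are only what compilers typically produce; the exact output varies by compiler, version, and flags (for example `-march=armv8.1-a` on GCC/Clang or `/arch:armv8.1` on MSVC, both of which you should check against your own toolchain).

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical reference-count increment shared by many threads.
// Returns the count as it was before the increment.
inline std::uint64_t add_ref(std::atomic<std::uint64_t>& ref_count) {
    // Targeting ARMv8.0, this typically becomes a load-exclusive /
    // store-exclusive retry loop (possibly with DMB barriers), roughly:
    //   retry: ldaxr x8, [x0]
    //          add   x9, x8, #1
    //          stlxr w10, x9, [x0]
    //          cbnz  w10, retry
    // Targeting ARMv8.1 or later, the same source line can become a single
    // atomic instruction from the newer atomics extension, roughly:
    //          ldaddal x9, x8, [x0]
    return ref_count.fetch_add(1, std::memory_order_seq_cst);
}
```

The source code is identical in both cases; only the target architecture given to the compiler changes, which is why recompiling the in-box binaries is enough to shed the older sequences.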

As of this writing the current Windows Insider build is 26020 (and will typically go up about 4 or 5 build numbers every weekly release).  You can see the Geekbench performance improvement in my brand new SV2 vs. Insider demo video:

Even on the $200-ish Snapdragon 7cx (7180) Samsung Go laptop that I demoed on, the native Geekbench scores are up almost 10% for single-core benchmarks and closer to 20% for multi-core benchmarks.  That's an older CPU design.  For emulation (as I will demo in future videos) I am seeing improvements of 20% to 40% since SV2.  This is the cumulative effect of the switch to newer ARMv8.1 instructions, better native ARM64 code optimizations by the compiler, the elimination of ARM32 gunk in the kernel and on disk, and emulator improvements.  This was part of a very deliberate focus on performance by Microsoft and Qualcomm which I participated in throughout 2023.

This is a summary of major Windows 10 and Windows 11 releases for ARM since 2020 which illustrates the build-to-build differences that I just described:

| Build number | Build Branch | Elemental Codename | Name | x86 emulation | x64 emulation | SSE4.2 emulation | ARMv8.0 / SD835 support | ARM32 support | Apple Silicon support |
|---|---|---|---|---|---|---|---|---|---|
| 19041 | VB_RELEASE | Vibranium | Windows 10 20H1 | Yes | No | No | Yes | Yes | No |
| 19042 | VB_RELEASE | Vibranium | Windows 10 20H2 | Yes | No | No | Yes | Yes | No |
| 19043 | VB_RELEASE | Vibranium | Windows 10 21H1 | Yes | No | No | Yes | Yes | No |
| 19044 | VB_RELEASE | Vibranium | Windows 10 21H2 | Yes | No | No | Yes | Yes | No |
| 19045 | VB_RELEASE | Vibranium | Windows 10 22H2 | Yes | No | No | Yes | Yes | No |
| 22000 | CO_RELEASE | Cobalt | Windows 11 21H2 | Yes | Yes | No | Yes | Yes | No |
| 22621 | NI_RELEASE | Nickel | Windows 11 22H2 | Yes | Yes | Yes | Yes | Yes | Yes |
| 22631 | NI_RELEASE | Nickel | Windows 11 23H2 | Yes | Yes | Yes | untested | Yes | Yes |
| 25393 | ZN_RELEASE | Zinc | Insider Preview 23H2 | Yes | Yes | Yes | No | Yes | Yes |
| 26020 | RS_RELEASE | TBD | Insider Preview (24H1?) | Yes | Yes | Yes | No | No | Yes |

Yes, I have personally tested all of these builds on my assortment of ARM devices!

