ARM64 Boot Camp
(c) 2024 by Darek Mihocka, founder, Emulators.com.
updated January 16 2024
A Brief History of Windows on ARM
As I mentioned, my deep dive into all things ARM began in 2013
shortly after the release of Windows RT and my focus as an engineer
on the Visual C/C++ compiler went from x86 to ARM32. Let's look at the evolution of the Windows operating system over this
past decade as it pertains to ARM32 and ARM64, and as I got to experience it
first hand.
2012 - Microsoft launched the Windows RT operating system on the
Surface RT tablet (powered by a 1.3 GHz NVIDIA Tegra3 CPU supporting
the Thumb2 instruction set). Windows RT was essentially Windows 8
for 32-bit ARM and visually identical to Windows 8 on an Intel
machine.
The Tegra3 wasn't a bad mobile CPU at the time. In fact when I
ported my Xformer 8-bit emulator and the Bochs open source x86
emulator to Windows RT in 2013, I was pleasantly surprised it was
keeping up and even outperforming the 1.6 GHz Intel Atom (a common
mobile CPU in "netbooks" and small laptops at the time). It was
clear that in a head-to-head competition running similar code,
ARM native code could in fact hold its own against Intel x86 native
code. (Until Windows RT, ARM and Intel devices never really
intersected in such a manner since iPod and Android apps on ARM
aren't comparable to Win32 apps on Intel).
Home run, yes? No. Unfortunately, for the paying public a retail
Surface RT was locked down to mainly support Windows Store apps,
similar to what was later released as Windows 10 "S" mode.
It was Windows 8 with a lot of restrictions. Since I had
the benefit of working at Microsoft and having an unlocked Surface RT developer kit, I could arbitrarily compile and run unsigned
binaries and treat the developer kit almost as a normal Windows 8 desktop PC. My
experience was great; the paying customer's was not. Locking down
Windows RT was a business decision, not a technical limitation,
since it was built from the same source code as Windows 8. Interestingly when Microsoft launched the Intel Atom based Surface 3
tablet in 2015 to replace the Surface RT, it was left fully
unlocked.
The real stake-through-the-heart deal breaker for RT was the lack of any kind
of x86 compatibility (which the Intel Atom of course supported). Windows RT had no built-in x86 emulator, so other than compiling
your own private build of Bochs and running it on an unlocked RT
device as I did, there was no easy way to run legacy Windows XP or
Windows 7 apps on the Surface RT.
2015 - Windows RT was cancelled. But development work on ARM was
far from dead!
2016 - Microsoft and Qualcomm jointly announced an effort to
bring full ("unlocked") desktop
Windows to 64-bit ARM64 devices;
this time with x86 emulation support to offer the backward
compatibility that Windows RT was missing. This was a good business decision
and I was fully on board! It was an exciting moment for me, because this new project was about
to go out of "science project mode" and into real production.
2017 - A mad dash of development
was happening as we tried to
get the x86 emulator in shape and test as many Windows XP and
Windows 7 apps as possible to make sure they emulated correctly.
That initial x86 emulator for ARM64 was a codebase previously used
for the
x86-on-PowerPC emulation for Xbox 360 a decade earlier, which
itself was based on Virtual PC for Mac from the 1990's and acquired
by Microsoft in 2003. In 2005
it was sufficient to emulate a 32-bit Pentium III supporting
Intel's SSE instruction set for Xbox compatibility, as well as
Windows XP compatibility for "Project
Helium". But by 2017, as Windows 10 had raised the minimum hardware
requirement to SSE2 (and for all intents and purposes it was
effectively SSE3), our team had dozens of new Intel SSE2 and SSE3
instructions to implement and test, while simultaneously porting
thousands of lines of assembly code and C++ code from PowerPC to ARM64.
2018 - The first wave of Windows 10 on ARM devices (or as they call
them at Qualcomm, Windows on Snapdragon devices) launched - you can
read my old blog posts and watch my unboxing videos of the HP Envy X2,
the Lenovo Miix, and the ASUS Novago. These devices were
in hindsight Minimum Viable Products, launched using the Qualcomm
Snapdragon 835 processor with a bare minimum of RAM (typically 4GB)
and disk space (typically 64GB). As I mentioned in the intro,
these were basically cell phones
with a keyboard trying to run full Windows 10. Unlike Windows RT, Windows 10 for ARM came with a built-in 32-bit Intel x86
emulator called 'xtajit'.
However, we had a big problem - the 835 only supported the ARMv8.0
instruction set, which lacks a few things handy for emulation such as
Intel-style compare-swap atomics. ARMv8.0 only supports the old
ARM32-style load-exclusive/store-exclusive atomics (LDXR and STXR on
ARM64), which do not scale well as CPU core count increases on
larger processors.
This is not the first time this kind of CPU-misstep has happened. When Intel
released its first 32-bit x86 CPU the 80386 (or just 386 for short)
which was supported by Windows NT 3.1 and by Windows 95, it was soon
realized that the 386 was missing a few key instructions such as
compare-swap atomics to efficiently support multi-threading. The
older 8086-style LOCK prefix for atomics was not sufficient to
implement certain types of locking primitives since it was
impossible to know the previous value of an atomic variable. The
addition of the CMPXCHG and
XADD instructions to the 486
instruction set solved these problems, which is why going forward
Windows NT 4.0 and Windows 98 required a 486. Unfortunately Microsoft repeated history by going with an ARMv8.0-only Snapdragon 835.
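To make the compare-swap point concrete, here is a minimal C++ sketch (not code from Windows or from the emulator) of the kind of primitives that depend on learning the previous value of an atomic variable. The helper names take_ticket, try_lock, and atomic_max are hypothetical; only std::atomic is a real API. On x86 a compiler will typically lower fetch_add to LOCK XADD and compare_exchange to LOCK CMPXCHG; on ARM64 the same source typically becomes single LDADD and CAS instructions when targeting ARMv8.1, or an LDXR/STXR retry loop when limited to ARMv8.0.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helpers for illustration; only std::atomic is a real API.

// Hand out unique ticket numbers. fetch_add returns the value seen *before*
// the add, which a plain locked add cannot report back to the caller.
int take_ticket(std::atomic<int>& next_ticket) {
    return next_ticket.fetch_add(1);
}

// Try to take ownership of a lock word: 0 means free, anything else is the
// owner's id. Succeeds only if the word was still 0 at the instant of the swap.
bool try_lock(std::atomic<int>& owner, int my_id) {
    assert(my_id != 0);  // 0 is reserved to mean "free"
    int expected = 0;
    return owner.compare_exchange_strong(expected, my_id);
}

// Raise target to at least candidate using a compare-swap retry loop.
// Returns the value that was observed before any update took place.
int atomic_max(std::atomic<int>& target, int candidate) {
    int seen = target.load();
    while (seen < candidate &&
           !target.compare_exchange_weak(seen, candidate)) {
        // On failure, seen has been refreshed with the current value; retry.
    }
    return seen;
}
```

One way to see the difference described above is to compile these functions with clang for `-march=armv8-a` and again for `-march=armv8.1-a` and compare the disassembly of the two builds.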
A second wave of devices in late 2018 - the Lenovo Yoga C630 and
the Samsung Galaxybook 2 - were based on the Snapdragon 850
processor, which did in fact support ARMv8.1 and delivered a nice
25% or higher performance speedup (a combination of a faster 3.0 GHz clock
speed and improved IPC). Looking back, a great deal of
engineering effort was wasted supporting ARMv8.0, which was a speed
hurdle not only for emulation but also for plain old native ARM64
code. As I'll explain later, the ghost of ARMv8.0 still haunts us
today.
2019 - Porting code from x86 or x64 to ARM64 is not trivial,
especially if Intel-specific compiler intrinsics are being used in
C/C++ source code (the native ARM64 compiler can parse neither those
intrinsics nor Intel's __m128 data type). So in the spring of
2019 I began work on a "soft intrinsics" layer for
ARM64 to aid in porting of existing C/C++ source code that relied on
such Intel intrinsics. This was a necessary step to helping developers move away from
_emulating_ their apps on ARM64 to _porting_ their apps to ARM64.
The idea was that if porting to ARM64 was made easier there would be
less need to keep improving emulation or to support 64-bit Intel
apps. In my opinion, soft intrinsics were a necessary tool regardless; but
to hope that this would avoid the need to truly support the full x86
instruction set was a bad bet which proved to be problematic when
Microsoft launched the Surface Pro X in late 2019. Pro X was based
on a new generation of Qualcomm Snapdragon 8180 processor (also
called "8cx gen 1" and "SQ1") and this CPU offered about an
additional 50% speedup over the 850. Pro X should have been a grand
slam home run, but unfortunately it was generally panned in reviews
for its continued lack of 64-bit emulation support, which coincided
with apps such as Cinebench and Photoshop dropping their 32-bit x86 builds.
If more Windows apps followed suit
and went 64-bit Intel only, this
would quickly degrade Windows on ARM back to Windows RT doorstop
status. Microsoft overlooked that x86 is a constantly moving target:
for as long as AMD and Intel continue to crank out new x86
processors, developers will keep releasing new games and
applications to target those new processors, and emulation has to
follow and keep up with the latest x86 features. Just because
emulation was sufficient for 32-bit x86 in 2018 does
_not_ make it a "one and done". Fortunately my existing work on soft intrinsics was about to come in very handy...
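The idea behind a soft intrinsics layer can be sketched in a few lines of portable C++. This is an illustration only, not the actual Microsoft implementation: the type soft_m128i and the soft_mm_* function names are hypothetical stand-ins that mirror Intel's __m128i, _mm_set_epi32, and _mm_add_epi32, so that source code written against the Intel intrinsics has something equivalent to compile against on ARM64.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Intel's 128-bit integer vector type.
struct soft_m128i {
    std::uint32_t lane[4];  // lane[0] holds the lowest 32 bits
};

// Mirrors the argument order of _mm_set_epi32: highest lane first.
inline soft_m128i soft_mm_set_epi32(std::int32_t e3, std::int32_t e2,
                                    std::int32_t e1, std::int32_t e0) {
    soft_m128i v;
    v.lane[0] = static_cast<std::uint32_t>(e0);
    v.lane[1] = static_cast<std::uint32_t>(e1);
    v.lane[2] = static_cast<std::uint32_t>(e2);
    v.lane[3] = static_cast<std::uint32_t>(e3);
    return v;
}

// Lane-wise 32-bit add with wraparound, matching the behavior of
// _mm_add_epi32. Unsigned arithmetic keeps the wraparound well defined.
inline soft_m128i soft_mm_add_epi32(soft_m128i a, soft_m128i b) {
    soft_m128i r;
    for (int i = 0; i < 4; ++i) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}

// Read one lane back as a signed 32-bit value (helper for this sketch only).
inline std::int32_t soft_extract_epi32(soft_m128i v, int index) {
    assert(index >= 0 && index < 4);
    return static_cast<std::int32_t>(v.lane[index]);
}
```

A production layer would more likely map each operation onto ARM64 NEON vector instructions rather than scalar loops; the scalar form is used here only to keep the semantics obvious.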
2020/2021 - Microsoft management gave us the green light to
implement emulation of 64-bit x86 (which as explained
earlier I will refer
to as 'x64' to distinguish it from 32-bit 'x86'). As part of my work
to bring up soft intrinsics, I'd written an x64 test interpreter
called 'xtabase' which used and tested those soft intrinsics to
implement SSE handlers, so I already had most of a command-line
based x64 test emulator at the ready if needed. In early February of
2020 my colleagues and I wired xtabase into the OS and within days
we had an x64 "hello world" running; and within weeks we had GUI
apps (including the x64 build of Xformer 10) running. You can still
find the functional xtabase.dll in the C:\Windows\System32 directory
on Windows 11 on ARM releases. Only months later in June 2020 did
Apple announce their own ARM64-based M1 processor and demonstrate
x64 emulation on Rosetta, and we'd unfortunately passed the
opportunity to announce and demo x64 emulation first at Microsoft's
BUILD developer conference a month earlier, doh! Microsoft's x64
emulation based on JIT (using a fork of xtajit called 'xtajit64')
was finally announced in October 2020 (which made it look like
Microsoft was playing copycat with Apple) and officially launched in
2021 as part of the first Windows 11 (SV1) build 22000 or "Cobalt".
2022 - The annual Windows upgrade Windows 11 (SV2) launched in
late 2022 as build 22621 or "Nickel" and contained quite a few
compatibility improvements, for example the addition of SSE4.2
emulation to both the x86 and x64 emulators. The lack of SSE4.2
emulation had been a painful source of crashes and performance
issues and was technical debt that needed to be fixed. We'd also
worked with game companies such as Valve to dig into and fix issues
with things like anti-cheat which had been preventing some games
from working correctly with Cobalt. SV2/22621/Nickel is also known as
Windows 11 22H2 and is currently as of this writing in January 2024
the latest official shipping release of Windows 11. Despite
Microsoft's marketing claim that Windows 11 on ARM _requires_ ARMv8.1,
Nickel is actually still compatible with ARMv8.0 and thus with the
Snapdragon 835. I run Nickel on both my HP Envy X2 and my Lenovo Miix tablets in addition to on my supported ARMv8.1 devices such as
the Lenovo Yoga, Surface Pro X, and Surface Pro 9. The ARMv8.0 "tax"
is unfortunately still being paid by anyone not using the more
recent Windows Insider releases.
On the hardware front, Qualcomm released an updated 3rd generation
8cx in 2022 called the Snapdragon 8280, which brought another
significant jump in performance from the previous 8180 generation -
about 30% to 100% speedup depending on the app. The 8280 is the same
chip used by the Lenovo x13s laptop and supports the ARMv8.4
instruction set, similar to what the Apple M1 supports. Not
surprisingly the 8280 is just a hair slower than the M1 (partly due
to M1's higher clock speed).
What is remarkable is that in just a 4-year span the
Windows on ARM devices based on Snapdragon experienced about a 3x to
4x performance increase natively, and typically over 5x emulated.
I will show some side-by-side comparisons in demo videos.
As a sign that Windows on ARM devices are achieving parity with
their AMD and Intel counterparts, Microsoft finally dropped the "X"
(as in Surface Pro X) when it launched both the Intel and Qualcomm
based versions of the Surface Pro 9 in October 2022 - no more "RT"
or "X" designations to signify ARM64. This was a significant milestone, because it reduces
the confusion (um, well...) that consumers experience with that whole "Pro" vs
"Pro X" naming convention and brings ARM into line with the common
model naming they've used for AMD-based Surface devices. After all,
a Windows on ARM PC is "just a PC" as the saying in 2018
went; it's getting closer.
A third device from 2022 (and my favorite) is the 8280-based "Volterra",
properly known as the "Microsoft Windows Dev Kit 2023":
https://arstechnica.com/gadgets/2022/11/project-volterra-review-microsofts-600-arm-pc-that-almost-doesnt-suck/
Volterra is a non-tablet ARM64
device, and at $600 a fantastic developer kit and starter device for
those looking to ramp up on ARM64 without spending upwards of $2000
on a similar tablet. With 32GB of RAM, 512GB SSD, DisplayPort
support, wired Ethernet, it's Microsoft's equivalent of a Mac Mini
M1. I have my Volterra docked to a keyboard and 4k monitor and use
it as my primary development device these days. And it has the RAM
and plenty of disk space to host multiple Hyper-V ARM64 VMs, which
allows me to easily compare the behavior of different Windows builds
on the same device. I highly recommend the Volterra as a developer
starter kit.
2023 - The first big news of 2023 came in February with the
announcement that Microsoft officially supports Windows on ARM
running on Apple Silicon (i.e. the M1 and M2). Your Apple Macbook M1 or
Mac Mini M2 is as legitimate a Windows device as a Surface Pro X or
Surface Pro 9, which makes it a two-vendor race - Apple and Qualcomm.
In October 2023 both companies raised the stakes by announcing new processors (the Apple M3 and the Snapdragon X) which both push ARM64 clock speeds above the 4.0 GHz threshold. No wonder then that rumors abound that other silicon vendors such as AMD, NVIDIA, and Samsung might join the Windows on ARM party:
https://www.sammobile.com/news/samsung-may-be-developing-exynos-chip-for-windows-on-arm-laptop/
https://www.digitaltrends.com/computing/nvidia-and-amd-may-rival-apple-with-arm-processors/
This is all pure speculation as far as I'm concerned, but to go from an exclusive supplier to two CPU suppliers to _five_ would do wonders for the Windows on ARM ecosystem. Remember how fast innovation took place in x86 during the late 1990's when you had not just AMD but also Transmeta, Cyrix, VIA, and others competing in the x86 desktop market. If we've learned anything over the years, Intel and Microsoft tend to do their best work when there is real competition and I'm hoping to see this same sense of competition light up the ARM hardware world.
Another cool hardware device based on Snapdragon 8280 which was released in 2023 is the Samsung Galaxybook2 Pro 360. This beefed-up 8280 successor to their older 7cx-based Galaxybook Go features 16GB of RAM, a beautiful OLED screen, and three USB-C ports. Like the 2018 Lenovo Yoga C630 it is a clamshell laptop form factor where the keyboard flips around 360 degrees to turn into a tablet - none of that flimsy detachable keyboard nonsense. And best of all the GB2 Pro 360 is _lighter_ than the Surface Pro X, the Surface Pro 9, or the Macbook M1, weighing in at just 36 ounces - just over 1 kilogram.
2024 - Remember I mentioned that the ghost of Snapdragon 835 still haunts Windows on ARM today. That is true for Windows 10, Windows 11 SV1, and Windows 11 SV2. However, for those not tracking the week-to-week Windows Insider builds, 2023 brought a lot of nice improvements to Windows on ARM which can be tested and evaluated by installing new Insider builds. Here is a small list of what I have observed:
- the native ARM64 release of Visual Studio and ARM64 .NET no longer require an Intel-hosted cross-compile or ARM64-hosted emulated environment to develop for ARM, i.e. you can edit, compile, run, and debug all on the same ARM64 device now.
- new performance improvements in Visual Studio's native ARM64 compiler code quality targeting modern ARM64 processors - the first batch of ARM64 code quality improvements arrived in release 17.4 and a second batch of improvements in 17.6, along with corresponding optimizations in the Windows Insider SDK.
- the Windows kernel dropped ARMv8.0 support shortly after build 22621, now requiring and enforcing a minimum of ARMv8.1 and thus the new atomics instructions. These improvements showed up during the "Zinc" milestone (ZN_RELEASE builds) which eliminated much of the ghost of ARMv8.0.
- additional native performance improvements arrived in the latest 26xxx builds as in-box binaries are being recompiled from using ARMv8.0 instructions (such as DMB, LDXR, and STXR) to using more optimized barrier and atomics instructions. You can easily see this by disassembling binaries in C:\Windows\System32 and comparing the build 22621 code sequences against the latest Insider builds.
- the Windows kernel dropped support for 32-bit ARM (Thumb2) binaries entirely after the ZN_RELEASE build 25393. Starting with build 25905, only 64-bit ARM binaries are installed, which frees up about 1GB of disk space on the C:\ drive (i.e. the "C:\Program Files (Arm)" and "C:\Windows\SysArm32" directories have effectively gone away) and leaves the Windows on-disk layout very similar to the AMD/Intel layout. This was necessary since newer processors such as the Apple M1/M2 and the new Snapdragon X have no hardware support for ARM32.
- very measurable improvements in emulation performance since SV2 for all existing supported silicon (Snapdragon 850, 8180, 8280, and Apple M1/M2). This is from a combination of translation improvements of existing ARMv8.0 and v8.1 code sequences, and the use of newer ARMv8.4 instructions exposed by the M1/M2 and 8280 on that newer hardware.
As of this writing the current Windows Insider build is 26020 (and will typically go up about 4 or 5 build numbers every weekly release). You can see the Geekbench performance improvement in my brand new SV2 vs. Insider demo video:
Even on the $200-ish Snapdragon 7cx (7180) Samsung Go laptop that I demoed on, the native Geekbench scores are up almost 10% for single-core benchmarks and closer to 20% for multi-core benchmarks. That's an older CPU design. For emulation (as I will demo in future videos) I am seeing improvements of 20% to 40% since SV2. This is the cumulative effect of the switch to newer ARMv8.1 instructions, better native ARM64 code optimizations by the compiler, the elimination of ARM32 gunk in the kernel and on disk, and emulator improvements. This was part of a very deliberate focus on performance by Microsoft and Qualcomm which I participated in throughout 2023.
This is a summary of major Windows 10 and Windows 11 releases for ARM since 2020 which illustrates the build-to-build differences that I just described:
Build number | Build Branch | Elemental Codename | Name                    | x86 emulation | x64 emulation | SSE4.2 emulation | ARMv8.0 / SD835 support | ARM32 support | Apple Silicon support
-------------+--------------+--------------------+-------------------------+---------------+---------------+------------------+-------------------------+---------------+----------------------
19041        | VB_RELEASE   | Vibranium          | Windows 10 20H1         | Yes           | No            | No               | Yes                     | Yes           | No
19042        | VB_RELEASE   | Vibranium          | Windows 10 20H2         | Yes           | No            | No               | Yes                     | Yes           | No
19043        | VB_RELEASE   | Vibranium          | Windows 10 21H1         | Yes           | No            | No               | Yes                     | Yes           | No
19044        | VB_RELEASE   | Vibranium          | Windows 10 21H2         | Yes           | No            | No               | Yes                     | Yes           | No
19045        | VB_RELEASE   | Vibranium          | Windows 10 22H2         | Yes           | No            | No               | Yes                     | Yes           | No
22000        | CO_RELEASE   | Cobalt             | Windows 11 21H2         | Yes           | Yes           | No               | Yes                     | Yes           | No
22621        | NI_RELEASE   | Nickel             | Windows 11 22H2         | Yes           | Yes           | Yes              | Yes                     | Yes           | Yes
22631        | NI_RELEASE   | Nickel             | Windows 11 23H2         | Yes           | Yes           | Yes              | untested                | Yes           | Yes
25393        | ZN_RELEASE   | Zinc               | Insider Preview 23H2    | Yes           | Yes           | Yes              | No                      | Yes           | Yes
26020        | RS_RELEASE   | TBD                | Insider Preview (24H1?) | Yes           | Yes           | Yes              | No                      | No            | Yes
Yes, I have personally tested all of these builds on my assortment of ARM devices!