[librm] Work around two errata in the 386's "popal" instruction

Detailed experiments show that at least one model of 386 CPU has a previously undocumented errata in the "popal" instruction. Specifically: when the stack-address size is 16 bits and the operand size is 32 bits, the "popal" instruction will erroneously load the high 16 bits of %esp from the value stored on the stack.

The "movl -20(%esp), %esp" instruction near the end of virt_call() currently relies on the assumption that the high 16 bits of %esp will already be zero, since they were set to zero by the "movzwl %bp, %esp" instruction at the end of prot_to_real() and will not have been subsequently modified by the "popal".

This 386 CPU errata invalidates that assumption, with the result that we end up loading the stack pointer from an essentially undefined memory location.

Fix by inserting a "movzwl %sp, %esp" after the "popal" to explicitly zero the high 16 bits of %esp.

Inserting this instruction also happens to work around another (known and documented) errata in the 386, in which the CPU may malfunction if "popal" is followed immediately by an instruction that uses a base address register to form an effective address.

Debugged-by: Jaromir Capik <jaromir.capik@email.cz>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2026-05-05 12:26:37 +02:00 · 2026-02-10 10:00:27 +00:00 · 2026-02-10 10:00:27 +00:00 · 481e043116
commit 481e043116
parent cd9b44e574
1 changed file with 18 additions and 2 deletions
--- a/src/arch/x86/transitions/librm.S
+++ b/src/arch/x86/transitions/librm.S
@@ -1103,9 +1103,25 @@ vc_rmode:
 	popal
 	/* popal skips %esp.  We therefore want to do "movl -20(%sp),
 	 * %esp", but -20(%sp) is not a valid 80386 expression.
-	 * Fortunately, prot_to_real() zeroes the high word of %esp, so
-	 * we can just use -20(%esp) instead.
+	 *
+	 * In theory, the high word of %esp is already zero at this
+	 * point (since prot_to_real() should set it to zero), and the
+	 * popal instruction does not load %esp from the saved
+	 * register dump.  This would allow us to just use -20(%esp)
+	 * instead.
+	 *
+	 * However, some 386 chips are observed to have an
+	 * undocumented errata that causes the popal instruction to
+	 * load the high 16 bits of %esp.  We therefore explicitly
+	 * zero-extend %sp to %esp to work around this errata.
+	 *
+	 * Inserting this instruction also happens to work around
+	 * another (known and documented) errata in the 386, in which
+	 * the CPU may malfunction if popal is followed immediately by
+	 * an instruction that uses a base address register to form an
+	 * effective address.
 	 */
+	movzwl	%sp, %esp
 	addr32 movl -20(%esp), %esp
 	popfl
 	popw	%ss /* padding */
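
The reasoning in the commit message can be checked with a small arithmetic model. The C sketch below is not part of the commit or of iPXE. The function names (`esp_after_popal`, `movzwl_sp_esp`, `saved_esp_slot`, `check_workaround`) and the sample register values are hypothetical, and the model assumes only the behaviour described above: with a 16-bit stack-address size, "popal" advances %sp by 32 bytes, and an affected 386 additionally loads the high 16 bits of %esp from the saved register dump.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of %esp after "popal" with a 16-bit stack-address
 * size and a 32-bit operand size.  popal pops eight dwords (32 bytes),
 * so only the low word (%sp) advances.  On a correct CPU the high word
 * is left untouched; on an affected 386 it is taken from the saved
 * %esp value in the register dump (assumption taken from the commit
 * message).
 */
static uint32_t esp_after_popal(uint32_t esp, uint32_t saved_esp,
                                int has_erratum)
{
    uint16_t sp = (uint16_t)(esp + 32);
    uint32_t high = (has_erratum ? saved_esp : esp) & 0xffff0000u;

    return high | sp;
}

/* Model of "movzwl %sp, %esp": zero-extend the low word into %esp. */
static uint32_t movzwl_sp_esp(uint32_t esp)
{
    return esp & 0x0000ffffu;
}

/* Effective address used by "addr32 movl -20(%esp), %esp".  The saved
 * %esp slot sits 12 bytes into the 32-byte dump, i.e. 20 bytes below
 * the stack pointer once popal has completed.
 */
static uint32_t saved_esp_slot(uint32_t esp)
{
    return esp - 20;
}

/* Walk through both cases with illustrative values. */
static void check_workaround(void)
{
    uint32_t esp = 0x00001000u;       /* high word zeroed by prot_to_real() */
    uint32_t saved_esp = 0x12345678u; /* arbitrary saved value on the stack */
    uint32_t good = esp_after_popal(esp, saved_esp, 0);
    uint32_t bad = esp_after_popal(esp, saved_esp, 1);

    /* Without the erratum, -20(%esp) already addresses the right slot. */
    assert(saved_esp_slot(good) == 0x0000100cu);
    /* With the erratum, the address is polluted by the high word. */
    assert(saved_esp_slot(bad) == 0x1234100cu);
    /* Zero-extending %sp first gives the right slot in both cases. */
    assert(saved_esp_slot(movzwl_sp_esp(bad)) == saved_esp_slot(good));
    assert(saved_esp_slot(movzwl_sp_esp(good)) == saved_esp_slot(good));
}
```

Calling `check_workaround()` from a test harness exercises both cases. The point of the model is that the zero-extension is harmless on a correct CPU and restores the intended address on an affected one, which is why the fix can be applied unconditionally.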