pmap_emulate_modify assumed that the pmap could not change between the
TLB signaling the fault and pmap_emulate_modify's acquisition of the
pmap lock, but that is not true even in the uniprocessor case, never
mind the SMP case.
Address each possibility in turn.
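
As a rough illustration of the pattern (re-validating the mapping after
the lock is taken rather than trusting the state seen at fault time),
here is a minimal user-space sketch.  It is not the kernel code: the
toy_* names, the flag values, the array-backed page table and the return
convention (0 = handled or retry the access, 1 = treat as a real fault)
are all assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PTE flag bits for this sketch; not the real MIPS encodings. */
#define PTE_VALID 0x1u
#define PTE_RO    0x2u
#define PTE_DIRTY 0x4u

#define NPTES 8

/* Toy stand-in for a pmap: one lock and one flat array of PTEs. */
struct toy_pmap {
	pthread_mutex_t lock;
	uint32_t *ptes;		/* NULL models a freed page-table page */
};

/* Return the PTE for a page index, or NULL if it no longer exists. */
static uint32_t *
toy_pte(struct toy_pmap *pm, size_t idx)
{
	if (pm->ptes == NULL || idx >= NPTES)
		return (NULL);
	return (&pm->ptes[idx]);
}

/*
 * Emulate a "modified" fault.  Every property of the mapping is checked
 * again under the lock, because it may have changed since the fault.
 * Returns 0 if the fault was handled or the access should be retried,
 * 1 if the caller must treat it as a genuine protection fault.
 */
static int
toy_emulate_modify(struct toy_pmap *pm, size_t idx)
{
	uint32_t *pte;
	int rv = 0;

	assert(pm != NULL);
	pthread_mutex_lock(&pm->lock);
	pte = toy_pte(pm, idx);
	if (pte == NULL) {
		/* The page table went away; let the access be retried. */
	} else if ((*pte & PTE_VALID) == 0) {
		/* The mapping was invalidated; retry via the normal path. */
	} else if ((*pte & PTE_DIRTY) != 0) {
		/* Someone else already marked the page dirty. */
	} else if ((*pte & PTE_RO) != 0) {
		/* Still (or newly) read-only: a real fault. */
		rv = 1;
	} else {
		/* The state at fault time still holds; mark it dirty. */
		*pte |= PTE_DIRTY;
	}
	pthread_mutex_unlock(&pm->lock);
	return (rv);
}
```

In this sketch only the last branch matches what the fault handler
expected to find; the other four are the cases that the unlocked window
makes possible.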
Thanks to Konstantin Belousov and Mark Johnston for confirming the bug.