DragonFly BSD
DragonFly kernel List (threaded) for 2013-07

Re: [GSOC] Implement hardware nested page table support for vkernels


From: Mihai Carabas <mihai.carabas@xxxxxxxxx>
Date: Mon, 29 Jul 2013 14:44:01 +0300

Hello,


> I have some problems with the stack mapping (I get some weird page faults
> at address 0 when accessing the stack - I missed something about the stack
> growing I guess). I will investigate this issue in order to go further and
> run the vkernel process in the GUEST context.
>
The problem was not the stack. I had introduced a silly bug in my ASM code:
RDI (the first parameter register in the x86-64 calling convention) and R11
were saved in the same memory location, so RDI was overwritten with a bogus
value on restore.
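
One way to guard against this class of bug is to derive the save-slot
offsets from a C struct instead of typing them by hand in the assembly.
The sketch below is only an illustration: struct guest_regs, its field
order, the GUEST_*_OFF macros and slots_distinct() are hypothetical names,
not the layout or helpers used by the actual VMM code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest register save area (illustrative field order). */
struct guest_regs {
    uint64_t rdi;   /* first integer argument register on x86-64 */
    uint64_t rsi;
    uint64_t rdx;
    uint64_t rcx;
    uint64_t r8;
    uint64_t r9;
    uint64_t r10;
    uint64_t r11;
};

/* Offsets for the assembly save/restore code, derived from the struct. */
#define GUEST_RDI_OFF offsetof(struct guest_regs, rdi)
#define GUEST_R11_OFF offsetof(struct guest_regs, r11)

/* Compile-time check that the two registers never share a slot. */
static_assert(offsetof(struct guest_regs, rdi) !=
              offsetof(struct guest_regs, r11),
              "RDI and R11 must not share a save slot");

/* Runtime check for a hand-written offset table: returns 1 if every
 * offset is unique, 0 if two registers would land in the same slot. */
int slots_distinct(const size_t *offs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (offs[i] == offs[j])
                return 0;
    return 1;
}
```

With offsets generated this way, a duplicated slot cannot be introduced by
a typo in the assembly; slots_distinct() is only useful for tables that are
still maintained by hand.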

With that fixed, simple programs ran fine in the VMX non-root context, so I
started modifying the vkernel code to run there as well. First, the vkernel
needed the cpuid instruction to be emulated (this was straightforward to
implement). I also had to intercept the set_tls_area syscall in order to
configure the %gs and %fs bases as needed. Then several issues came up with
the signal handling syscalls, which set the RIP to a custom value. (If you
remember from my last e-mail, I had to increment the RIP by the size of the
syscall ASM instruction in order to step over it. If the RIP has been
changed by the syscall, I should not modify it; I should advance it only
when returning to the original syscall ASM instruction.) I lost two days on
this issue because at first the fault reported by the debugger indicated
that the %fs base was 0. I started investigating that (it wasn't easy,
because I couldn't read the %fs base from userspace, but with the help of
vsrinivas and dillon I managed to do it). After confirming that %fs was
fine, I started printing the RIP/RSP of the VM context and then
disassembled the binaries to see at which instructions the code started to
misbehave. This led me to the signal handling syscalls that were modifying
the RIP. In addition, while a signal is being handled, another signal can
be raised and handled before sigreturn is called.
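
The RIP rule described above can be sketched roughly as follows. This is a
minimal illustration under stated assumptions: struct guest_ctx and
finish_syscall() are hypothetical names, and the only hard fact used is
that the x86-64 SYSCALL instruction is two bytes long (0F 05).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYSCALL_INSN_LEN 2   /* SYSCALL is encoded as 0F 05 */

/* Hypothetical subset of the guest state kept by the VMM. */
struct guest_ctx {
    uint64_t rip;
};

/* Called after the host has serviced a syscall for the guest.
 * entry_rip is the guest RIP captured at the exit, i.e. the address of
 * the SYSCALL instruction itself. */
void finish_syscall(struct guest_ctx *ctx, uint64_t entry_rip)
{
    assert(ctx != NULL);

    /* Step over the instruction only if the syscall left RIP alone. */
    if (ctx->rip == entry_rip)
        ctx->rip += SYSCALL_INSN_LEN;
    /* Otherwise the syscall (for example signal delivery or sigreturn)
     * installed its own RIP, and it must be kept as is. */
}
```

Comparing against the entry RIP is a simplification: it would misfire if a
syscall deliberately set the RIP back to the address of the original
instruction, so a real implementation would more likely track the change
with an explicit flag.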

Going further, some faults appeared due to bogus %fs/%gs bases. The
set_tls_area syscall was not the only path modifying the base addresses:
vmspace_ctl was also calling the syscall function directly from within the
kernel. To solve this, I removed the set_tls_area hook from my VMM module
and put it in the code of the system call itself. At this point the init
process started, but the /bin/sh process created by init was killed with
signal 12 (undefined syscall). This happened because, for the moment, I
don't check the instruction opcode when I get a UD (undefined instruction)
fault; I simply assume it is a syscall. I went through my VMM logs and
found the RIP that was causing this. Then I disassembled the binaries again
and saw that the "cvttsd2si" instruction was being executed. I checked the
manual and saw that I hadn't enabled CR4_FXSR and CR4_XMM, which caused the
UD fault.
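
For reference, the two CR4 bits involved are OSFXSR (bit 9) and OSXMMEXCPT
(bit 10) in the Intel manuals; these are presumably what the message calls
CR4_FXSR and CR4_XMM. SSE instructions such as cvttsd2si raise #UD while
CR4.OSFXSR is clear. The helpers below are a hypothetical sketch that only
computes the value to load into the guest CR4; how the VMM actually writes
it is not shown here.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_OSFXSR     (1ULL << 9)   /* OS supports FXSAVE/FXRSTOR and SSE */
#define CR4_OSXMMEXCPT (1ULL << 10)  /* OS handles unmasked SIMD FP faults */

static_assert((CR4_OSFXSR | CR4_OSXMMEXCPT) == 0x600,
              "unexpected CR4 SSE bit layout");

/* Returns the guest CR4 value with the SSE-related bits enabled. */
uint64_t guest_cr4_with_sse(uint64_t cr4)
{
    return cr4 | CR4_OSFXSR | CR4_OSXMMEXCPT;
}

/* Returns nonzero if SSE instructions would not fault with #UD because
 * of CR4.OSFXSR being clear. */
int guest_cr4_allows_sse(uint64_t cr4)
{
    return (cr4 & CR4_OSFXSR) != 0;
}
```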


At this point I have a single-core vkernel running in the VMX non-root
context, without sendmail: sendmail is throwing a UD fault. I will
investigate today and see which instruction is causing it. I will also
implement the check for UD faults (whether the opcode is "syscall" or
something else). Another task is modifying the vkernel a bit further so
that it can run with multiple cores.
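
The planned opcode check could look roughly like this. It is a sketch, not
the eventual implementation: classify_ud() and enum ud_cause are
hypothetical names, and the code assumes the instruction bytes at the
guest RIP have already been copied into a host buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum ud_cause {
    UD_SYSCALL,   /* the faulting instruction is SYSCALL (0F 05) */
    UD_OTHER      /* anything else: a genuine undefined instruction */
};

/* insn points at the bytes fetched from the guest RIP, len is how many
 * bytes are valid in that buffer. */
enum ud_cause classify_ud(const uint8_t *insn, size_t len)
{
    assert(insn != NULL);

    if (len >= 2 && insn[0] == 0x0F && insn[1] == 0x05)
        return UD_SYSCALL;
    return UD_OTHER;
}
```

With a check like this, a fault on cvttsd2si (encoded as F2 0F 2C /r) would
be reported as UD_OTHER instead of being treated as a syscall.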

I also need to find out whether the cothreads (the ones that handle I/O)
need to run in the VMX non-root context once I start implementing EPT.
Another thing to investigate is whether the migration of the VMX thread
from one CPU to another is handled correctly (I have seen some failures,
but they weren't reproducible).

That is all for now. I will keep you posted on any new progress :).

Thanks,
Mihai
