MIT_6.828_Lab4 Part B
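Lab page: Lab 4
In the previous part, dumbfork() allocated memory for the child and copied the parent's code and data into it to create the new environment. This can be wasteful: fork() is usually followed almost immediately by exec(), which replaces the code and data with those of another program, so most of the copying done for the child is thrown away.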
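fork() therefore takes a different approach: copy-on-write. Parent and child initially map the same physical pages, and every writable page is mapped read-only (marked copy-on-write) in both. When either of them writes to such a page, a page fault is raised. The page fault handler allocates a new physical page, copies the contents of the faulting page into it, and remaps the faulting address to the new page with write permission, after which the faulting environment carries on. The gain is relative: the parent now pays for a fault and a page copy on its first write to each shared page, while a child that calls exec() right away avoids nearly all copying. On balance this is usually a win.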
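The copy-on-write idea can be sketched in ordinary user-space C. This is a toy model, not JOS code: the names sim_page, sim_space, sim_create, sim_fork and sim_write are hypothetical, the "address space" is a single tiny page, and the "page fault" is just a branch inside sim_write.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SIM_PGSIZE 16   /* tiny page size, for illustration only */

/* A simulated physical page with a reference count. */
struct sim_page {
    int refs;
    unsigned char data[SIM_PGSIZE];
};

/* A simulated one-page address space: one mapping plus a writable bit. */
struct sim_space {
    struct sim_page *page;
    int writable;
};

/* Create a space backed by a fresh, zeroed, writable page. */
static struct sim_space sim_create(void)
{
    struct sim_space s;
    s.page = calloc(1, sizeof(*s.page));
    assert(s.page != NULL);
    s.page->refs = 1;
    s.writable = 1;
    return s;
}

/* fork: share the page and drop write permission on both sides. */
static struct sim_space sim_fork(struct sim_space *parent)
{
    struct sim_space child = *parent;
    parent->page->refs++;
    parent->writable = 0;
    child.writable = 0;
    return child;
}

/* Write one byte; a write through a read-only mapping "faults" and copies. */
static void sim_write(struct sim_space *s, int off, unsigned char v)
{
    assert(off >= 0 && off < SIM_PGSIZE);
    if (!s->writable) {
        /* "Page fault handler": private copy, then remap writable. */
        struct sim_page *copy = malloc(sizeof(*copy));
        assert(copy != NULL);
        memcpy(copy->data, s->page->data, SIM_PGSIZE);
        copy->refs = 1;
        if (--s->page->refs == 0)
            free(s->page);
        s->page = copy;
        s->writable = 1;
    }
    s->page->data[off] = v;
}
```

After sim_fork, parent and child point at the same sim_page; the first sim_write on either side gives the writer a private copy and leaves the other side's data untouched.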
User-level page fault handling
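The goal of this section is a user-level page fault handler. The fault still traps into the kernel first, but the kernel only forwards it: the handler itself runs in user mode, on a separate user exception stack.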
Setting the Page Fault Handler
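The task here is to register the page fault handler entry point for an environment. When a page fault occurs, the kernel saves the relevant context and then transfers control to the entry point registered here.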
Exercise 8. Implement the sys_env_set_pgfault_upcall system call. Be sure to enable permission checking when looking up the environment ID of the target environment, since this is a “dangerous” system call.
Note the definition of struct Env:
struct Env {
struct Trapframe env_tf; // Saved registers
struct Env *env_link; // Next free Env
envid_t env_id; // Unique environment identifier
envid_t env_parent_id; // env_id of this env's parent
enum EnvType env_type; // Indicates special system environments
unsigned env_status; // Status of the environment
uint32_t env_runs; // Number of times environment has run
int env_cpunum; // The CPU that the env is running on
// Address space
pde_t *env_pgdir; // Kernel virtual address of page dir
// Exception handling
void *env_pgfault_upcall; // Page fault upcall entry point
// Lab 4 IPC
bool env_ipc_recving; // Env is blocked receiving
void *env_ipc_dstva; // VA at which to map received page
uint32_t env_ipc_value; // Data value sent to us
envid_t env_ipc_from; // envid of the sender
int env_ipc_perm; // Perm of page mapping received
};
It has a field void *env_pgfault_upcall;. Our task is to set this field for the environment identified by envid:
static int
sys_env_set_pgfault_upcall(envid_t envid, void *func)
{
// LAB 4: Your code here.
struct Env*e;
if (envid2env(envid,&e,1)< 0)
return -E_BAD_ENV;
e->env_pgfault_upcall = func;
return 0;
// panic("sys_env_set_pgfault_upcall not implemented");
}
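The implementation is straightforward: look up the environment with permission checking enabled (the third argument to envid2env is 1), then store the entry point.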
Normal and Exception Stacks in User Environments
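Until now, all fault handling ran in kernel mode on the kernel stack. The page fault handler here runs in user mode, so it needs a stack in the user address space. According to inc/memlayout.h, the user exception stack is the one-page region [UXSTACKTOP-PGSIZE, UXSTACKTOP).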
Invoking the User Page Fault Handler
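A page fault raises a page fault exception, which the kernel handles in void page_fault_handler(struct Trapframe *tf). Our goal is to modify page_fault_handler so that it saves the trap-time state and transfers control to the previously registered page fault handler, which runs in user mode.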
Exercise 9. Implement the code in page_fault_handler in kern/trap.c required to dispatch page faults to the user-mode handler. Be sure to take appropriate precautions when writing into the exception stack. (What happens if the user environment runs out of space on the exception stack?)
The implementation:
void
page_fault_handler(struct Trapframe *tf)
{
uint32_t fault_va;
// Read processor's CR2 register to find the faulting address
fault_va = rcr2();
// Handle kernel-mode page faults.
//
// LAB 3: Your code here.
if ((tf->tf_cs & 3) == 0)
panic("page_fault_handler: page fault in kernel mode at va %08x", fault_va);
// We've already handled kernel-mode exceptions, so if we get here,
// the page fault happened in user mode.
// Call the environment's page fault upcall, if one exists. Set up a
// page fault stack frame on the user exception stack (below
// UXSTACKTOP), then branch to curenv->env_pgfault_upcall.
//
// The page fault upcall might cause another page fault, in which case
// we branch to the page fault upcall recursively, pushing another
// page fault stack frame on top of the user exception stack.
//
// It is convenient for our code which returns from a page fault
// (lib/pfentry.S) to have one word of scratch space at the top of the
// trap-time stack; it allows us to more easily restore the eip/esp. In
// the non-recursive case, we don't have to worry about this because
// the top of the regular user stack is free. In the recursive case,
// this means we have to leave an extra word between the current top of
// the exception stack and the new stack frame because the exception
// stack _is_ the trap-time stack.
//
// If there's no page fault upcall, the environment didn't allocate a
// page for its exception stack or can't write to it, or the exception
// stack overflows, then destroy the environment that caused the fault.
// Note that the grade script assumes you will first check for the page
// fault upcall and print the "user fault va" message below if there is
// none. The remaining three checks can be combined into a single test.
//
// Hints:
// user_mem_assert() and env_run() are useful here.
// To change what the user environment runs, modify 'curenv->env_tf'
// (the 'tf' variable points at 'curenv->env_tf').
// LAB 4: Your code here.
struct UTrapframe *utf;
if (curenv->env_pgfault_upcall)
{
if (UXSTACKTOP-PGSIZE <= tf->tf_esp &&
tf->tf_esp <= UXSTACKTOP-1)
utf = (struct UTrapframe*)(tf->tf_esp - (sizeof(struct UTrapframe)+sizeof(uint32_t)));
else
utf = (struct UTrapframe*)(UXSTACKTOP - sizeof(struct UTrapframe));
user_mem_assert(curenv,(void*)utf,sizeof(struct UTrapframe),PTE_W|PTE_U);
utf->utf_fault_va = fault_va;
utf->utf_err = tf->tf_err; // hardware error code (FEC_* bits), not the trap number
utf->utf_regs = tf->tf_regs;
utf->utf_eip = tf->tf_eip;
utf->utf_eflags = tf->tf_eflags;
utf->utf_esp = tf->tf_esp;
tf->tf_eip = (uint32_t )(curenv->env_pgfault_upcall);
tf->tf_esp = (uint32_t)utf;
env_run(curenv);
}
// Destroy the environment that caused the fault.
cprintf("[%08x] user fault va %08x ip %08x\n",
curenv->env_id, fault_va, tf->tf_eip);
print_trapframe(tf);
env_destroy(curenv);
}
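If faults nest, each new UTrapframe is pushed below the one already on the exception stack, with one empty 32-bit word left between them.
A more detailed analysis follows below.
If the exception stack is exhausted, the new frame falls below UXSTACKTOP-PGSIZE, where no page is mapped, so user_mem_assert() fails and the kernel destroys the environment.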
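The placement rule used by page_fault_handler can be checked with plain address arithmetic. The sketch below runs on a host, not in the kernel; the constants are assumed to match JOS (UXSTACKTOP = 0xeec00000, 4 KiB pages, a 52-byte struct UTrapframe), and on_exception_stack, next_utf_addr and frame_fits are hypothetical helper names.

```c
#include <assert.h>   /* assert() is used by the checks in the usage note */
#include <stdint.h>

/* Assumed values, mirroring JOS's inc/memlayout.h and inc/trap.h. */
#define PGSIZE      4096u
#define UXSTACKTOP  0xeec00000u
#define UTF_SIZE    52u   /* sizeof(struct UTrapframe) on 32-bit x86 */

/* Nonzero if esp lies within [UXSTACKTOP-PGSIZE, UXSTACKTOP). */
static int on_exception_stack(uint32_t esp)
{
    return esp >= UXSTACKTOP - PGSIZE && esp < UXSTACKTOP;
}

/* Address at which the next UTrapframe would be placed. */
static uint32_t next_utf_addr(uint32_t trap_esp)
{
    if (on_exception_stack(trap_esp))
        return trap_esp - 4u - UTF_SIZE;   /* nested: leave one scratch word */
    return UXSTACKTOP - UTF_SIZE;          /* first fault: top of the stack */
}

/* Nonzero if a frame at utf lies entirely inside the exception stack page. */
static int frame_fits(uint32_t utf)
{
    return utf >= UXSTACKTOP - PGSIZE && utf + UTF_SIZE <= UXSTACKTOP;
}
```

Under these assumptions, a fault taken on the normal stack gives assert(next_utf_addr(0xeebfdf00u) == 0xeebfffccu), and a nested fault taken with esp at that frame lands 56 bytes lower. Once a frame would start below UXSTACKTOP-PGSIZE, frame_fits returns 0, which corresponds to the case where user_mem_assert() destroys the environment.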
User-mode Page Fault Entrypoint
Exercise 10. Implement the _pgfault_upcall routine in lib/pfentry.S. The interesting part is returning to the original point in the user code that caused the page fault. You’ll return directly there, without going back through the kernel. The hard part is simultaneously switching stacks and re-loading the EIP.
First, the implementation:
_pgfault_upcall:
// Call the C page fault handler.
pushl %esp // function argument: pointer to UTF
movl _pgfault_handler, %eax
call *%eax
addl $4, %esp // pop function argument
// Now the C page fault handler has returned and you must return
// to the trap time state.
// Push trap-time %eip onto the trap-time stack.
//
// Explanation:
// We must prepare the trap-time stack for our eventual return to
// re-execute the instruction that faulted.
// Unfortunately, we can't return directly from the exception stack:
// We can't call 'jmp', since that requires that we load the address
// into a register, and all registers must have their trap-time
// values after the return.
// We can't call 'ret' from the exception stack either, since if we
// did, %esp would have the wrong value.
// So instead, we push the trap-time %eip onto the *trap-time* stack!
// Below we'll switch to that stack and call 'ret', which will
// restore %eip to its pre-fault value.
//
// In the case of a recursive fault on the exception stack,
// note that the word we're pushing now will fit in the
// blank word that the kernel reserved for us.
//
// Throughout the remaining code, think carefully about what
// registers are available for intermediate calculations. You
// may find that you have to rearrange your code in non-obvious
// ways as registers become unavailable as scratch space.
//
// LAB 4: Your code here.
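// Store the trap-time eip in the reserved word just below the trap-time esp,
// and write the decremented esp back into the frame, so 'ret' can pop it later.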
movl 48(%esp),%ebp
subl $4,%ebp
movl %ebp,48(%esp)
movl 40(%esp),%eax
movl %eax,(%ebp)
// Restore the trap-time registers. After you do this, you
// can no longer modify any general-purpose registers.
// LAB 4: Your code here.
addl $8,%esp
popal
// Restore eflags from the stack. After you do this, you can
// no longer use arithmetic operations or anything else that
// modifies eflags.
// LAB 4: Your code here.
addl $4,%esp
popfl
// Switch back to the adjusted trap-time stack.
// LAB 4: Your code here.
popl %esp
// Return to re-execute the instruction that faulted.
// LAB 4: Your code here.
ret
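Before this code runs, %esp points at the UTrapframe that the kernel pushed on the exception stack: utf_fault_va at 0(%esp), utf_err at 4(%esp), the eight saved registers at 8(%esp), utf_eip at 40(%esp), utf_eflags at 44(%esp), and utf_esp at 48(%esp).
After executing
// Store the trap-time eip in the reserved word just below the trap-time esp,
// and write the decremented esp back into the frame, so 'ret' can pop it later.
movl 48(%esp),%ebp
subl $4,%ebp
movl %ebp,48(%esp)
movl 40(%esp),%eax
movl %eax,(%ebp)
the stack has changed: the saved utf_esp is 4 lower than before, and the word it now points to on the trap-time stack holds the trap-time eip.
After executing
// LAB 4: Your code here.
addl $8,%esp
popal
// Restore eflags from the stack. After you do this, you can
// no longer use arithmetic operations or anything else that
// modifies eflags.
// LAB 4: Your code here.
addl $4,%esp
popfl
// Switch back to the adjusted trap-time stack.
// LAB 4: Your code here.
popl %esp
the general-purpose registers and eflags hold their trap-time values, and %esp points at the word on the trap-time stack that holds the trap-time eip.
Finally, ret pops that word into eip, which resumes execution at the faulting instruction with esp back at its trap-time value.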
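The offsets 40(%esp) and 48(%esp) used in the assembly follow from the layout of struct UTrapframe. The sketch below mirrors that struct on a host so the offsets can be checked with offsetof; the field types are fixed to uint32_t here as an assumption (JOS uses uintptr_t for utf_eip and utf_esp, which is 32 bits wide on its target), and push_eip_on_trap_stack is a hypothetical helper that models only the first five instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side mirror of JOS's struct PushRegs (order in which pusha stores). */
struct PushRegs {
    uint32_t reg_edi, reg_esi, reg_ebp, reg_oesp;
    uint32_t reg_ebx, reg_edx, reg_ecx, reg_eax;
};

/* Host-side mirror of JOS's struct UTrapframe. */
struct UTrapframe {
    uint32_t utf_fault_va;      /* offset 0  */
    uint32_t utf_err;           /* offset 4  */
    struct PushRegs utf_regs;   /* offset 8, 32 bytes */
    uint32_t utf_eip;           /* offset 40 */
    uint32_t utf_eflags;        /* offset 44 */
    uint32_t utf_esp;           /* offset 48 */
};

/*
 * Model of the first five instructions of the return path: lower the saved
 * esp by one word and report where the trap-time eip would be stored.
 */
static void push_eip_on_trap_stack(struct UTrapframe *utf,
                                   uint32_t *store_addr, uint32_t *store_val)
{
    assert(utf != NULL && store_addr != NULL && store_val != NULL);
    utf->utf_esp -= 4;
    *store_addr = utf->utf_esp;
    *store_val = utf->utf_eip;
}
```

With this layout, addl $8,%esp skips utf_fault_va and utf_err, popal consumes the 32 bytes of utf_regs, addl $4,%esp skips utf_eip, popfl consumes utf_eflags, and popl %esp loads the adjusted utf_esp.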
Exercise 11. Finish set_pgfault_handler() in lib/pgfault.c.
The implementation:
void
set_pgfault_handler(void (*handler)(struct UTrapframe *utf))
{
int r;
if (_pgfault_handler == 0) {
// First time through!
// LAB 4: Your code here.
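// Allocate the exception stack page (plain P|U|W, without the PTE_AVAIL bits
// that PTE_SYSCALL would add) and register the assembly entry point.
if ((r = sys_page_alloc(sys_getenvid(),(void*)(UXSTACKTOP-PGSIZE),PTE_P|PTE_U|PTE_W)) < 0)
panic("set_pgfault_handler: sys_page_alloc: %e",r);
if ((r = sys_env_set_pgfault_upcall(sys_getenvid(),_pgfault_upcall)) < 0)
panic("set_pgfault_handler: sys_env_set_pgfault_upcall: %e",r);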
//panic("set_pgfault_handler not implemented");
}
// Save handler pointer for assembly to call.
_pgfault_handler = handler;
}
Implementing Copy-on-Write Fork
First, lib/entry.S defines uvpt and uvpd:
.globl envs
.set envs, UENVS
.globl pages
.set pages, UPAGES
.globl uvpt
.set uvpt, UVPT
.globl uvpd
.set uvpd, (UVPT+(UVPT>>12)*4)
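Through uvpt and uvpd we can conveniently read the page directory entry (uvpd[PDX(va)]) and the page table entry (uvpt[PGNUM(va)]) for any virtual address va.
For how this works, see the clever mapping trick.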
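The index arithmetic behind uvpt and uvpd can be verified on a host with plain integers. This sketch assumes UVPT = 0xef400000 and the usual 32-bit x86 two-level paging split (10/10/12 bits); the helper names pdx, pgnum, uvpd_base, pte_addr and pde_addr are hypothetical stand-ins for the JOS macros and symbols.

```c
#include <assert.h>   /* assert() is used by the checks in the usage note */
#include <stdint.h>

#define PGSHIFT   12            /* log2(PGSIZE) */
#define PDXSHIFT  22            /* offset of the page directory index in a va */
#define UVPT      0xef400000u   /* assumed value from inc/memlayout.h */

/* Page directory index of va, like PDX(va). */
static uint32_t pdx(uint32_t va) { return (va >> PDXSHIFT) & 0x3ffu; }

/* Page number of va, like PGNUM(va). */
static uint32_t pgnum(uint32_t va) { return va >> PGSHIFT; }

/* Virtual address of uvpd, as computed in lib/entry.S. */
static uint32_t uvpd_base(void) { return UVPT + (UVPT >> PGSHIFT) * 4u; }

/* Virtual address of the PTE for va, i.e. &uvpt[PGNUM(va)]. */
static uint32_t pte_addr(uint32_t va) { return UVPT + pgnum(va) * 4u; }

/* Virtual address of the PDE for va, i.e. &uvpd[PDX(va)]. */
static uint32_t pde_addr(uint32_t va) { return uvpd_base() + pdx(va) * 4u; }
```

Under these assumptions uvpd_base() is 0xef7bd000, and pte_addr(UVPT) equals uvpd_base(): the "page table" that maps the UVPT region is the page directory itself, which is the essence of the trick.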
Exercise 12. Implement fork, duppage and pgfault in lib/fork.c.
Test your code with the forktree program. It should produce the following messages, with interspersed ‘new env’, ‘free env’, and ‘exiting gracefully’ messages. The messages may not appear in this order, and the environment IDs may be different.
The implementation first.
- pgfault
static void
pgfault(struct UTrapframe *utf)
{
void *addr = (void *) utf->utf_fault_va;
uint32_t err = utf->utf_err;
int r;
pte_t *pte=(pte_t*)UVPT;
// Check that the faulting access was (1) a write, and (2) to a
// copy-on-write page. If not, panic.
// Hint:
// Use the read-only page table mappings at uvpt
// (see <inc/memlayout.h>).
// LAB 4: Your code here.
if (!(err & FEC_WR) || !(uvpt[PGNUM(addr)] & PTE_COW))
panic("pgfault: err or PTE_COW is wrong!\n");
// Allocate a new page, map it at a temporary location (PFTEMP),
// copy the data from the old page to the new page, then move the new
// page to the old page's address.
// Hint:
// You should make three system calls.
// LAB 4: Your code here.
envid_t thisid = sys_getenvid();
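if ((r = sys_page_alloc(thisid,(void*)PFTEMP,PTE_W|PTE_U|PTE_P)) < 0)
panic("pgfault: sys_page_alloc: %e",r);
addr = ROUNDDOWN(addr,PGSIZE);
memmove(PFTEMP,addr,PGSIZE);
// sys_page_map replaces any existing mapping at addr, so no separate unmap
// of addr is needed; this keeps it to the three system calls from the hint.
if ((r = sys_page_map(thisid,PFTEMP,thisid,addr,PTE_W|PTE_U|PTE_P)) < 0)
panic("pgfault: sys_page_map: %e",r);
if ((r = sys_page_unmap(thisid,PFTEMP)) < 0)
panic("pgfault: sys_page_unmap: %e",r);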
//panic("pgfault not implemented");
}
This is the function that actually handles the page fault.
- duppage
static int
duppage(envid_t envid, unsigned pn)
{
int r;
// LAB 4: Your code here.
void *addr;
pte_t pte;
int perm;
addr = (void*)((uint32_t)(pn*PGSIZE));
pte = uvpt[pn];
perm = PTE_P | PTE_U;
if ((pte & PTE_W) || (pte & PTE_COW))
perm |= PTE_COW;
if ((r = sys_page_map(thisenv->env_id,addr,envid,addr,
perm)) < 0)
{
panic("duppage:page map failed:%e\n",r);
return r;
}
// Remap the page copy-on-write in the parent as well
if (perm & PTE_COW)
{
if ((r = sys_page_map(thisenv->env_id,addr,thisenv->env_id,
addr,perm)) < 0)
{
panic("duppage:map itself failed %e\n",r);
return r;
}
}
//panic("duppage not implemented");
return 0;
}
- fork
envid_t
fork(void)
{
// LAB 4: Your code here.
envid_t envid;
int r;
size_t pn;
// Install the page fault handler for the parent; the first call also
// allocates the parent's exception stack
set_pgfault_handler(pgfault);
assert((envid = sys_exofork()) >= 0);
if (envid == 0)
{
thisenv = &envs[ENVX(sys_getenvid())];
return 0;
}
for (pn = PGNUM(UTEXT); pn < PGNUM(UXSTACKTOP-PGSIZE); pn++)
{
if ((uvpd[pn>>10] & PTE_P) && (uvpt[pn] & PTE_P))
{
assert(duppage(envid,pn) == 0);
}
}
assert(sys_page_alloc(envid,(void*)(UXSTACKTOP-PGSIZE),PTE_P|PTE_W|PTE_U)
== 0);
extern void _pgfault_upcall(void);
assert(sys_env_set_pgfault_upcall(envid,_pgfault_upcall) == 0);
assert(sys_env_set_status(envid,ENV_RUNNABLE) == 0);
return envid;
//panic("fork not implemented");
}
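What fork does: every writable or copy-on-write page of the parent is remapped copy-on-write (and therefore read-only) in both parent and child, read-only pages are shared as they are, and the two environments keep using the same physical pages until one of them writes. A write then faults and invokes the page fault handler. Because the parent installed the handler before forking and the child maps all of the parent's code and data (everything except the exception stack), the child's _pgfault_handler variable already points to the same handler, pgfault().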
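The permission decision made in duppage can be isolated as a pure function and checked on a host. The flag values below are assumed to match JOS (PTE_P, PTE_W, PTE_U from inc/mmu.h and PTE_COW from inc/lib.h); dup_perm and needs_parent_remap are hypothetical names for the two decisions the code above makes.

```c
#include <assert.h>   /* assert() is used by the checks in the usage note */
#include <stdint.h>

/* PTE flag values, assumed to match JOS. */
#define PTE_P   0x001u   /* present */
#define PTE_W   0x002u   /* writable */
#define PTE_U   0x004u   /* user-accessible */
#define PTE_COW 0x800u   /* software-defined copy-on-write marker */

/* Permissions with which a page is mapped into the child. */
static uint32_t dup_perm(uint32_t pte)
{
    uint32_t perm = PTE_P | PTE_U;
    if (pte & (PTE_W | PTE_COW))
        perm |= PTE_COW;   /* writable or already COW: share copy-on-write */
    return perm;
}

/* Nonzero if the parent's own mapping must be remapped with the same perm. */
static int needs_parent_remap(uint32_t perm)
{
    return (perm & PTE_COW) != 0;
}
```

A writable page (PTE_P|PTE_U|PTE_W) comes out as PTE_P|PTE_U|PTE_COW with the write bit cleared, so the next write from either side faults; a read-only page keeps PTE_P|PTE_U and needs no remapping in the parent.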
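Let us walk through a page fault after fork. The fault raises a T_PGFLT exception; once the context has been saved, control reaches trap(), and from there page_fault_handler(). That function saves the trap-time state on the exception stack below UXSTACKTOP and transfers control to _pgfault_upcall, which calls pgfault(). pgfault() allocates a new page, copies the old contents into it, maps it writable, and returns. _pgfault_upcall then restores the registers and returns to the faulting instruction, which is re-executed. Note that the parent must register the upcall for the child explicitly with assert(sys_env_set_pgfault_upcall(envid,_pgfault_upcall) == 0);. Although the child shares the parent's code and data, the kernel dispatches a page fault through void *env_pgfault_upcall;, which lives in the kernel's struct Env and is not inherited, so it has to be set separately for the child.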
END.