Lab 4: Preemptive Multitasking and Program Loading
Due Friday, November 19
Introduction
In this lab you will implement preemptive multitasking among multiple
simultaneously active user-mode environments.
In part A, you will implement a Unix-like fork() function,
which allows one user-mode environment to fork off other, "child"
environments, which start off as virtual "clones" of the parent but
can subsequently execute independently of the parent.
In part B, you will add support for inter-process
communication (IPC), allowing different user-mode environments to
communicate and synchronize with each other explicitly. You will also
add support for hardware clock interrupts and preemption.
In part C, you'll add spawn(), an exokernel-style counterpart to fork() followed by exec().
Getting Started
To fetch the new source, use Git to commit your lab 3 and save that code
in your lab3 branch, which should have been created in the
last lab. Then fetch the latest version of the course repository, and
update your local branch based on our lab4 branch,
origin/lab4:
% git branch
... Check that your current branch is 'lab3'.
% git commit -am 'my solution to lab3'
% git fetch
% git checkout -b lab4 origin/lab4
Branch lab4 set up to track remote branch refs/remotes/origin/lab4.
Switched to a new branch "lab4"
% git merge lab3
Again, if you encounter conflicts, the git merge
command will tell you which files are conflicted, and you should first
resolve the conflict (by editing the relevant files) and then commit the
resulting files with git commit -a.
Or you can download the tarball if you'd
like.
Lab Requirements
You will need to do all of the regular exercises described in the lab
and at least one challenge problem.
If you cannot complete your challenge problem, give us a hint of what you
were aiming for. If you can complete it, provide a short
(e.g., one paragraph) description of what you did.
If you implement more than one challenge problem,
you only need to describe one of them in the write-up,
though of course you are welcome to do more.
There is no need to answer any questions this time; just write up your
challenges.
Place the write-up in a file called answers.txt (plain text)
or answers.html (HTML format)
in the top level of your lab4 directory
before handing in your work to CourseWeb.
Note: You can work on the parts of this lab in any
order.
Part A: Copy-on-Write Fork
The Unix fork() system call
copies the address space of the calling process (the parent)
to create a new process (the child).
Early Unix fork(), like JOS's dumbfork(),
allocated a whole new memory region for the child
and copied the parent's address space there.
But this became very expensive and increased memory pressure.
These costs seem especially unfortunate
when the child's address space remains mostly similar to the parent's.
This is actually pretty common.
For example, in Unix, a call to fork()
is often followed almost immediately
by a call to exec() in the child.
Any time spent copying the parent's address space is largely wasted,
because the child process will use
very little of its memory before calling exec().
Later versions of Unix took advantage
of virtual memory hardware
to safely share memory between parent and child.
A page is copied only when one of the processes actually modifies it.
This technique is known as copy-on-write.
To do this,
on fork() the kernel would
copy the address space mappings
from the parent to the child
instead of the contents of the mapped pages,
and at the same time mark the now-shared pages read-only.
When one of the two processes tries to write to one of these shared pages,
the process takes a page fault.
The kernel's page fault handler checks whether faults are due to copy-on-write.
For copy-on-write faults, the kernel
makes a new, private copy of the page
and transparently restarts the faulting process.
In this way, the contents of individual pages aren't actually copied
until they are actually written to.
This optimization makes a fork() followed by
an exec() in the child much cheaper:
the child will probably only need to copy one page
(the current page of its stack)
before it calls exec().
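The copy-on-write idea described above can be tried out in ordinary hosted C, with no page tables involved. The sketch below is only a toy model: the names sim_page, sim_space, sim_fork, and sim_write are made up for illustration, the "pages" are 16 bytes long, and the "fault" is just a check inside sim_write. It is not how JOS or any Unix kernel implements fork().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SIM_PGSIZE 16   /* tiny "page" so the example stays readable */
#define SIM_NPAGES 4    /* slots per simulated address space */

/* A simulated physical page with a reference count (hypothetical model). */
struct sim_page {
    int refs;
    unsigned char data[SIM_PGSIZE];
};

/* One slot of a simulated address space: a mapping plus a COW mark. */
struct sim_pte {
    struct sim_page *page;   /* NULL means "not mapped" */
    int cow;                 /* nonzero: must copy before writing */
};

struct sim_space {
    struct sim_pte pte[SIM_NPAGES];
};

/* Map a fresh, zero-filled, privately owned page. Returns 0 or -1. */
static int sim_map(struct sim_space *as, int pageno)
{
    assert(pageno >= 0 && pageno < SIM_NPAGES);
    struct sim_page *pg = calloc(1, sizeof *pg);
    if (pg == NULL)
        return -1;
    pg->refs = 1;
    as->pte[pageno].page = pg;
    as->pte[pageno].cow = 0;
    return 0;
}

/* "fork": copy the mappings, not the pages, and mark both sides COW.
 * The child space is assumed to be empty on entry. */
static void sim_fork(struct sim_space *parent, struct sim_space *child)
{
    for (int i = 0; i < SIM_NPAGES; i++) {
        child->pte[i].page = parent->pte[i].page;
        child->pte[i].cow = 0;
        if (parent->pte[i].page != NULL) {
            parent->pte[i].page->refs++;
            parent->pte[i].cow = 1;
            child->pte[i].cow = 1;
        }
    }
}

/* Write one byte. A write to a COW slot first gives the writer a private
 * copy of the page. Returns 0 on success, -1 if the slot is unmapped or
 * the copy cannot be allocated. */
static int sim_write(struct sim_space *as, int pageno, int off, unsigned char byte)
{
    assert(pageno >= 0 && pageno < SIM_NPAGES);
    assert(off >= 0 && off < SIM_PGSIZE);
    struct sim_pte *pte = &as->pte[pageno];
    if (pte->page == NULL)
        return -1;
    if (pte->cow) {
        struct sim_page *copy = malloc(sizeof *copy);
        if (copy == NULL)
            return -1;
        memcpy(copy->data, pte->page->data, SIM_PGSIZE);
        copy->refs = 1;
        if (--pte->page->refs == 0)
            free(pte->page);
        pte->page = copy;
        pte->cow = 0;
    }
    pte->page->data[off] = byte;
    return 0;
}
```

After sim_fork, parent and child point at the same sim_page; the first sim_write on either side replaces only that side's mapping, so the other side keeps seeing the old contents.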
In the next piece of this lab, you will implement a "proper"
Unix-like fork() with copy-on-write,
as a user space library routine.
Implementing fork() and copy-on-write support in user space
has the benefit that the kernel remains much simpler
and thus more likely to be correct.
It also lets individual user-mode programs
define their own semantics for fork().
A program that wants a slightly different implementation
(for example, the expensive always-copy version like dumbfork(),
or one in which the parent and child actually share memory afterward)
can easily provide its own.
But first, you have some cleanup to do.
Exercise 1.
Lab 4 introduces a special idle environment
which is always present as envs[0].
Change your scheduler in kern/sched.c
so that it does not run this idle environment
unless nothing else is runnable.
This should take just a line of code or two.
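As a rough illustration of the policy (not the code in kern/sched.c), here is a self-contained round-robin picker over a plain array of statuses. The type sim_status and the function sim_pick_next are hypothetical names; slot 0 stands in for the idle environment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified environment states. */
enum sim_status { SIM_FREE, SIM_RUNNABLE, SIM_NOT_RUNNABLE };

/*
 * Pick the next slot to run, round-robin, starting just after `cur`.
 * Slot 0 models the idle environment: it is returned only when no
 * other slot is runnable.
 */
static size_t sim_pick_next(const enum sim_status *envs, size_t nenv, size_t cur)
{
    assert(nenv > 0 && cur < nenv);
    for (size_t step = 1; step <= nenv; step++) {
        size_t i = (cur + step) % nenv;   /* the last step wraps back to cur */
        if (i != 0 && envs[i] == SIM_RUNNABLE)
            return i;
    }
    return 0;   /* nothing else to run: fall back to the idle slot */
}
```

Because the loop runs for nenv steps, the current slot itself is considered last, so a lone runnable environment keeps the CPU instead of handing it to the idle slot.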
User-level page fault handling
Code that implements copy-on-write must be part of the path that
handles page faults on write-protected pages,
so you'll first implement a way for user-level programs to handle their
own page faults.
This is useful for copy-on-write and for many other purposes.
It's common to set up a process's address space so that some page faults
do not result from bugs, but rather indicate that some action needs to take place.
For example,
most Unix kernels initially map a single page
in a new process's stack region,
and allocate and map additional stack pages later "on demand"
as the process's stack consumption increases
and causes page faults on stack addresses that are not yet mapped.
A typical Unix kernel must keep track of what action to take
when a page fault occurs in each region of a process's address space.
For example,
a fault in the stack region will typically
allocate and map a new page of physical memory.
A fault in the program's BSS region will typically
allocate a new page, fill it with zeroes, and map it.
In systems with demand-paged executables,
a fault in the text region will read the corresponding page
of the binary off of disk and then map it in.
Instead of taking the traditional Unix approach,
in JOS we push this fault handling functionality into user space.
Programs can define their memory regions with great flexibility.
Not only do we get the ability to implement copy-on-write fork(),
but you'll use user-level page fault handling later
for mapping and accessing files on a disk-based file system.
Setting the Page Fault Handler
In order to handle its own page faults,
a user environment will need to register
a page fault upcall with the JOS kernel.
The user environment registers its page fault upcall
via the new sys_env_set_pgfault_upcall system call.
We have added a new member to the Env structure,
env_pgfault_upcall,
to record this information.
Exercise 2.
Implement the sys_env_set_pgfault_upcall system call
in kern/syscall.c and hook it up to syscall.
Normal and Exception Stacks in User Environments
During normal execution,
a user environment in JOS
will run on the normal user stack:
its %esp register starts out pointing at USTACKTOP,
and the stack data it pushes resides on the page
between USTACKTOP-PGSIZE and USTACKTOP-1 inclusive.
When a page fault occurs in user mode,
however,
the kernel will restart the user environment
running a designated user-level page fault handler
on a different stack,
namely the user exception stack.
This resembles the way the x86 processor
switches stacks when a processor exception
transfers control from user mode to kernel mode.
The JOS user exception stack is also one page in size,
and its top is defined to be at virtual address UXSTACKTOP,
so the valid bytes of the user exception stack
are from UXSTACKTOP-PGSIZE through UXSTACKTOP-1 inclusive.
While running on this exception stack,
the user-level page fault handler
can use JOS's regular system calls to map new pages or adjust mappings
so as to fix whatever problem originally caused the page fault.
Then the user-level page fault handler returns,
via an assembly language stub,
to the faulting code on the original stack.
This return takes place entirely in user mode!
The sequence of events will look like this:
- Normal user code causes a page fault.
- The processor saves its state and branches to the kernel's IDT entry
for page faults, which calls
page_fault_handler.
- The kernel sets up an exception frame on the user exception stack and
branches to the environment's page fault upcall. All registers have the
same values as at fault time except for
%esp, %eip, and %eflags. The kernel sets
%esp to the exception frame's address and %eip to
the page fault upcall. The kernel does not intentionally change
%eflags, but it'll inevitably have a slightly different value
once the page fault handler starts executing; to see why, read about the RF flag.
- The page fault upcall handles the page fault, making sure to save any
important register values.
- After restoring the register values, the page fault upcall branches
directly to the
%eip that caused the fault.
Each user environment that wants to support user-level page fault handling
will need to allocate memory for its own exception stack, using the
sys_page_alloc() system call introduced in Lab
3.
Invoking the User Page Fault Handler
You will now need to
change the page fault handling code in kern/trap.c
to handle page faults from user mode as follows.
We will call the state of the user environment at the time of the
fault the trap-time state.
If there is no page fault handler registered,
the JOS kernel destroys the user environment with a message as before.
Otherwise,
the kernel sets up a trap frame on the exception stack that looks like
a struct UTrapframe from inc/trap.h.
(Offsets are relative to the %esp when the handler is run.)
                    <-- UXSTACKTOP
48(%esp)  utf_esp             trap-time esp
44(%esp)  utf_eflags          trap-time eflags
40(%esp)  utf_eip             trap-time eip
36(%esp)  utf_regs.reg_eax    trap-time eax    start of struct PushRegs
32(%esp)  utf_regs.reg_ecx    trap-time ecx
28(%esp)  utf_regs.reg_edx    trap-time edx
24(%esp)  utf_regs.reg_ebx    trap-time ebx
20(%esp)  utf_regs.reg_oesp   NOT USEFUL
16(%esp)  utf_regs.reg_ebp    trap-time ebp
12(%esp)  utf_regs.reg_esi    trap-time esi
 8(%esp)  utf_regs.reg_edi    trap-time edi    end of struct PushRegs
 4(%esp)  utf_err             error code
 0(%esp)  utf_fault_va        virtual address that caused fault
                    ^-- %esp when handler is run = UXSTACKTOP - 52
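The offsets in this diagram can be checked mechanically. The declarations below are reconstructed from the diagram alone, lowest address first; treat them as a stand-in for the real definitions in inc/trap.h, which remain authoritative. The static_assert lines (C11) confirm that a structure of thirteen 32-bit fields in this order matches the listed offsets and the total of 52 bytes.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Register block from the diagram, lowest address first. */
struct PushRegs {
    uint32_t reg_edi;
    uint32_t reg_esi;
    uint32_t reg_ebp;
    uint32_t reg_oesp;   /* not useful */
    uint32_t reg_ebx;
    uint32_t reg_edx;
    uint32_t reg_ecx;
    uint32_t reg_eax;
};

/* User trap frame as laid out in the diagram (a reconstruction). */
struct UTrapframe {
    uint32_t utf_fault_va;      /* 0(%esp)  */
    uint32_t utf_err;           /* 4(%esp)  */
    struct PushRegs utf_regs;   /* 8(%esp) through 36(%esp) */
    uint32_t utf_eip;           /* 40(%esp) */
    uint32_t utf_eflags;        /* 44(%esp) */
    uint32_t utf_esp;           /* 48(%esp) */
};

static_assert(offsetof(struct UTrapframe, utf_fault_va) == 0, "fault_va at 0");
static_assert(offsetof(struct UTrapframe, utf_err) == 4, "err at 4");
static_assert(offsetof(struct UTrapframe, utf_regs) == 8, "regs start at 8");
static_assert(offsetof(struct UTrapframe, utf_eip) == 40, "eip at 40");
static_assert(offsetof(struct UTrapframe, utf_eflags) == 44, "eflags at 44");
static_assert(offsetof(struct UTrapframe, utf_esp) == 48, "esp at 48");
static_assert(sizeof(struct UTrapframe) == 52, "frame is 52 bytes");
```

The 52-byte size is where the "UXSTACKTOP - 52" in the diagram comes from.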
The kernel then arranges for the user environment to resume
running the page fault handler
on the exception stack with this stack frame.
You must figure out how to make this happen.
The utf_fault_va is the virtual address
that caused the page fault.
If the user environment is already running on the user exception stack
when an exception occurs,
then the page fault handler itself has faulted.
In this case,
you should start the new stack frame just under the current
tf->tf_esp rather than at UXSTACKTOP.
You should first push an empty 32-bit word, then a struct UTrapframe.
          (...existing contents of exception stack...)
                    <-- trap-time %esp
52(%esp)  empty word
48(%esp)  utf_esp             trap-time esp
44(%esp)  utf_eflags          trap-time eflags
...
 0(%esp)  utf_fault_va        virtual address that caused fault
                    ^-- %esp when handler is run = trap-time %esp - 56
To test whether tf->tf_esp is already on the user
exception stack, check whether it is in the range
between UXSTACKTOP-PGSIZE and UXSTACKTOP-1, inclusive.
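The two cases (first fault versus recursive fault) reduce to a small piece of address arithmetic. The sketch below computes only where the frame would start; the helper names on_exception_stack and utf_frame_addr are made up, and the numeric values of PGSIZE and UXSTACKTOP are assumptions that should be taken from the JOS headers in real code. It does not write anything, and it does not check for exception stack overflow, which the kernel must guard against.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; the JOS headers are authoritative. */
#define PGSIZE      4096u
#define UXSTACKTOP  0xeec00000u
#define UTF_SIZE    52u   /* size of struct UTrapframe per the diagram */

static_assert(UTF_SIZE + 4u < PGSIZE, "a frame must fit in the one-page stack");

/* True if esp lies in [UXSTACKTOP-PGSIZE, UXSTACKTOP-1]. */
static bool on_exception_stack(uint32_t esp)
{
    return esp >= UXSTACKTOP - PGSIZE && esp <= UXSTACKTOP - 1u;
}

/* Address of the new frame, i.e. the %esp the handler starts with. */
static uint32_t utf_frame_addr(uint32_t trap_esp)
{
    if (on_exception_stack(trap_esp))
        return trap_esp - 4u - UTF_SIZE;   /* empty word, then the frame */
    return UXSTACKTOP - UTF_SIZE;          /* first fault: start at the top */
}
```

For a fault taken on the normal stack the result is UXSTACKTOP - 52; for a fault taken while already on the exception stack it is the trap-time %esp minus 56, matching the two diagrams.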
Exercise 3 (Long).
Implement the code in kern/trap.c
required to dispatch page faults to the user-mode handler.
Be sure to take appropriate precautions
when writing into the exception stack.
User-mode Page Fault Entrypoint
Next, you need to implement the user-mode assembly routine that will
take care of calling the C page fault handler and resume
execution at the original faulting instruction.
This assembly routine is the handler that will be registered
with the kernel using sys_env_set_pgfault_upcall().
Exercise 4 (Long).
Implement the _pgfault_upcall routine
in lib/pfentry.S.
The interesting part is returning to the original point in
the user code that caused the page fault.
You'll return directly there, without going back through
the kernel. This requires near-simultaneously
switching stacks and reloading the EIP!
Finally, you need to implement the C user library side
of the user-level page fault handling mechanism.
Exercise 5.
Finish set_pgfault_handler()
in lib/pgfault.c.
Testing
make run-faultread should produce:
[00000000] new env 00001001
[00001001] user fault va 00000000 ip 0080003a
TRAP frame ...
[00001001] free env 00001001
make run-faultdie should produce:
[00000000] new env 00001001
i faulted at va deadbeef, err 6
[00001001] exiting gracefully
[00001001] free env 00001001
make run-faultalloc should produce:
[00000000] new env 00001001
fault deadbeef
this string was faulted in at deadbeef
fault cafebffe
fault cafec000
this string was faulted in at cafebffe
[00001001] exiting gracefully
[00001001] free env 00001001
If you see only the first "this string" line,
it means you are not handling
recursive page faults properly.
make run-faultallocbad should produce:
[00000000] new env 00001001
[00001001] user_mem_check assertion failure for va deadbeef
[00001001] free env 00001001
Make sure you understand why user/faultalloc and
user/faultallocbad behave differently.
Challenge!
Extend your kernel so that not only page faults,
but all types of processor exceptions
that code running in user space can generate,
can be redirected to a user-mode exception handler.
Write user-mode test programs
to test user-mode handling of various exceptions
such as divide-by-zero, general protection fault,
and illegal opcode.
Implementing Copy-on-Write Fork
You now have the facilities
to implement copy-on-write fork
entirely in user space.
We have provided a skeleton for fork()
in lib/fork.c.
Like dumbfork(),
fork() creates a new environment,
then scans through the parent environment's entire address space
and sets up corresponding page mappings in the child.
The key difference is that,
while dumbfork() copied entire pages,
fork() will initially only copy page mappings.
Notice that the duppage() helper function in dumbfork
calls sys_page_alloc() to allocate a new page of physical memory
for each page in the parent,
and then calls memcpy() to copy the contents of the parent's page
into the child's new page.
These calls to memcpy() represent the bulk of the time
dumbfork() takes to run,
and so fork() attempts to "optimize away" most of this page copying
by copying pages lazily only when they are actually modified.
The basic control flow for fork() is as follows:
- The parent installs
pgfault()
as the C-level page fault handler,
using the set_pgfault_handler() function
you implemented above.
- The parent calls
sys_exofork() to allocate
a child environment.
- For each writable or copy-on-write page in its address space below UTOP,
the parent maps the page copy-on-write into the address
space of the child and then remaps the page copy-on-write
in its own address space. The parent sets both PTEs to
non-writable, and to contain PTE_COW. This bit
is part of the PTE_AVAIL field allocated for user
program use. We use it to distinguish copy-on-write pages from
genuine read-only pages.
The exception stack is not remapped this way, however.
Instead you need to allocate a fresh page in the child for
the exception stack. Since the page fault handler will be
doing the actual copying and the page fault handler runs
on the exception stack, the exception stack cannot be made
copy-on-write: who would copy it?
- The parent sets the page fault upcall for the child
to look like its own.
- The child is now ready to run, so the parent marks it runnable.
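The permission rule in the third step can be written as a pure function on PTE permission bits. This is a sketch under assumptions: the bit values below are the usual x86 and JOS ones but should be taken from the headers, cow_perm is a hypothetical helper name, and the actual mapping calls and the special handling of the exception stack are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values; check them against the JOS headers. */
#define PTE_P      0x001u   /* present */
#define PTE_W      0x002u   /* writable */
#define PTE_U      0x004u   /* user-accessible */
#define PTE_AVAIL  0xE00u   /* bits left to software */
#define PTE_COW    0x800u   /* copy-on-write mark */

static_assert((PTE_COW & PTE_AVAIL) == PTE_COW, "PTE_COW must lie in PTE_AVAIL");

/*
 * Given the current permission bits of a mapped page, return the bits
 * to use for both the child's mapping and the parent's remapping.
 * Writable or already-COW pages become non-writable + PTE_COW;
 * genuinely read-only pages are passed through unchanged.
 */
static uint32_t cow_perm(uint32_t perm)
{
    assert(perm & PTE_P);
    if (perm & (PTE_W | PTE_COW))
        return (perm & ~PTE_W) | PTE_COW;
    return perm;
}
```

Keeping genuinely read-only pages free of PTE_COW is what lets the fault handler tell a copy-on-write fault apart from an illegal write.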
Each time one of the environments writes a copy-on-write page that it
hasn't yet written, it will take a page fault.
Here's the control flow for the user page fault handler:
- The kernel propagates the page fault to
_pgfault_upcall,
which calls fork()'s pgfault() handler.
- pgfault() checks that the fault is a write to a copy-on-write page.
If not, it panics.
- pgfault() allocates a new page mapped
at a temporary location (namely, PFTEMP) and copies
the contents of the faulting page into it.
Then the fault handler maps the new page at the
appropriate address with read/write permissions,
in place of the old read-only mapping.
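The check in the second step combines two facts: the fault was a write, and the faulting page is marked copy-on-write. A minimal sketch of that predicate follows. The helper name is_cow_fault is made up, and the bit values (FEC_WR for the write bit of the page fault error code, PTE_P, PTE_COW) are assumptions to be checked against the JOS headers; how the handler obtains the PTE is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values; check them against the JOS headers. */
#define PTE_P    0x001u   /* present */
#define PTE_COW  0x800u   /* copy-on-write mark */
#define FEC_WR   0x2u     /* error code bit: the faulting access was a write */

static_assert((PTE_COW & PTE_P) == 0, "software bit must not alias PTE_P");

/* True if a fault with error code `err` on a page whose PTE is `pte`
 * is a write to a present copy-on-write page. */
static bool is_cow_fault(uint32_t err, uint32_t pte)
{
    return (err & FEC_WR) != 0 && (pte & PTE_P) != 0 && (pte & PTE_COW) != 0;
}
```

A read fault, a write to a genuinely read-only page, or a fault on an unmapped address all fail this test, which is the case where the handler panics.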
Exercise 6 (Long).
Implement fork and pgfault
in lib/fork.c.
Test your code with make run-forktree.
This should produce the following messages,
with interspersed 'new env', 'free env',
and 'exiting gracefully' messages, and
without any 'fork not implemented' messages.
(The messages may not appear in exactly this order, and
some of the environment IDs may be different.)
1001: I am ''
1002: I am '0'
1003: I am '00'
1004: I am '000'
2001: I am '01'
2002: I am '010'
2004: I am '011'
1005: I am '1'
1006: I am '001'
2003: I am '10'
3002: I am '100'
3004: I am '101'
4002: I am '11'
2005: I am '110'
4004: I am '111'
Challenge!
Implement a shared-memory fork
called sfork. This version should have the parent
and child share all their memory pages
(writes in one environment will appear in the other)
except for pages in the stack area,
which should be treated in the usual copy-on-write manner.
Modify user/forktree.c
to use sfork() instead of regular fork().
Also, once you have finished implementing IPC,
use your sfork() to run user/pingpongs.
You will have to find a new way to provide the functionality
of the global thisenv pointer.
Challenge! The current
copy-on-write fork copies more pages than necessary.
In particular, if both the parent and the child write to a page,
then that page will be copied twice: once in the parent, and
once in the child. The second of these copies is clearly
unnecessary. Write a version of fork that can avoid
this second copy in some circumstances. Make sure you correctly
handle the case where more than two environments share a
copy-on-write page.
Challenge!
Your implementation of fork
makes a huge number of system calls. On the x86, switching into
the kernel has non-trivial cost. Augment the system call interface
so that it is possible to send a batch of system calls at once.
Then change fork to use this interface.
How much faster is your new fork?
You can answer this (roughly) by using analytical
arguments to estimate how much of an improvement batching
system calls will make to the performance of your
fork: How expensive is an int 0x30
instruction? How many times do you execute int 0x30
in your fork? Is the TSS stack
switch also expensive? And so on...
Alternatively, you can boot your kernel on real hardware
and really benchmark your code. See the RDTSC
(read time-stamp counter) instruction, defined in the IA32
manual, which counts the number of clock cycles that have
elapsed since the last processor reset. QEMU doesn't emulate
this instruction faithfully.
This ends part A. As usual, you can grade your submission with
make grade.
Part B: Preemptive Multitasking and Inter-Process Communication (IPC)
In the second part of lab 4 you'll introduce true multitasking.
Your JOS kernel will preempt uncooperative environments
and let environments pass messages to each other.
Clock Interrupts and Preemption
Run the user/spin test program.
This test program forks off a child environment,
which simply spins forever in a tight loop
once it receives control of the CPU.
Neither the parent environment nor the kernel ever regains the CPU.
This is obviously not an ideal situation
in terms of protecting the system from bugs or malicious code
in user-mode environments,
because any user-mode environment can bring the whole system to a halt
simply by getting into an infinite loop and never giving back the CPU.
In order to allow the kernel to preempt a running environment,
that is, to forcefully retake control of the CPU from it,
we must extend the JOS kernel to support external hardware interrupts
from the clock hardware.
We'll program the hardware to generate clock interrupts
periodically,
which will force control back to the kernel.
Interrupt discipline
External interrupts (i.e., device interrupts) are referred to as IRQs.
There are 16 possible IRQs, numbered 0 through 15.
The mapping from IRQ number to IDT entry is not fixed.
The pic_init function in picirq.c maps IRQs 0-15
to IDT entries IRQ_OFFSET through IRQ_OFFSET+15.
In kern/picirq.h,
IRQ_OFFSET is defined to be decimal 32.
Thus the IDT entries 32-47 correspond to the IRQs 0-15.
The clock interrupt is IRQ 0,
so IDT[32] contains the address of
the clock's interrupt handler routine in the kernel.
The IRQ_OFFSET of 32 was chosen so that the device interrupts
do not overlap with the processor exceptions,
which could obviously cause confusion.
(In fact, in the early days of PCs running MS-DOS,
the IRQ_OFFSET effectively was zero,
which indeed caused massive confusion between handling hardware interrupts
and handling processor exceptions!)
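The IRQ-to-IDT mapping is plain offset arithmetic. The two helpers below (irq_to_vector and vector_to_irq, both hypothetical names) use the IRQ_OFFSET value of 32 given above; IRQ_COUNT is simply the 16 IRQs mentioned in the text.

```c
#include <assert.h>

#define IRQ_OFFSET 32   /* value given in the text */
#define IRQ_COUNT  16   /* IRQs 0 through 15 */

/* IDT vector that a given IRQ is delivered on. */
static int irq_to_vector(int irq)
{
    assert(irq >= 0 && irq < IRQ_COUNT);
    return IRQ_OFFSET + irq;
}

/* IRQ number for an IDT vector, or -1 if the vector is not one of the
 * sixteen device-interrupt entries. */
static int vector_to_irq(int vector)
{
    if (vector < IRQ_OFFSET || vector >= IRQ_OFFSET + IRQ_COUNT)
        return -1;
    return vector - IRQ_OFFSET;
}
```

With these values the clock (IRQ 0) lands on vector 32, while vector 14 (the page fault exception) and vector 48 (the 0x30 used for system calls) fall outside the device range.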
In JOS, we make a key simplification compared to Unix.
External device interrupts are always disabled
when in the kernel and always enabled when in user space.
External interrupts are controlled by the FL_IF flag bit
of the %eflags register
(see inc/mmu.h).
When this bit is set, external interrupts are enabled.
While the bit can be modified in several ways,
because of our simplification, we will handle it solely
through the process of saving and restoring the %eflags register
as we enter and leave user mode.
You will have to ensure that the FL_IF flag is set in
user environments when they run. That way, when an interrupt arrives, it
gets passed through to the processor and handled by your interrupt code.
Otherwise, interrupts are masked,
or ignored until interrupts are re-enabled.
Interrupts are masked by default after processor reset,
and so far we have never gotten around to enabling them.
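Since the flag is only ever changed through a saved %eflags image, enabling interrupts for an environment amounts to setting one bit in that saved value. The sketch below works on a bare 32-bit integer rather than a real trap frame. The helper names are made up, and FL_IF is written out as bit 9 of EFLAGS, which is the x86 definition; JOS code should use the constant from inc/mmu.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FL_IF 0x00000200u   /* EFLAGS interrupt-enable flag (bit 9) */

static_assert(FL_IF == (1u << 9), "IF is bit 9 of EFLAGS");

/* Return a saved eflags image with external interrupts enabled. */
static uint32_t eflags_enable_interrupts(uint32_t eflags)
{
    return eflags | FL_IF;
}

/* True if the saved eflags image has external interrupts enabled. */
static bool eflags_interrupts_enabled(uint32_t eflags)
{
    return (eflags & FL_IF) != 0;
}
```

When the kernel later restores this saved image on the way back to user mode, the processor starts accepting external interrupts again.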
Exercise 7.
Modify kern/trapentry.S and kern/trap.c
to initialize the appropriate entries in the IDT
and provide handlers for IRQs 0 through 15.
Make sure that all entry points into the kernel
turn off interrupts. (Check the calls to SETGATE.
You might want to re-read section 9.2 of the
80386 Reference Manual,
or section 5.8 of the
IA-32 Intel Architecture Software Developer's Manual, Volume 3.)
Then modify the code in env_alloc()
to ensure that user environments are always run with
interrupts enabled.
The processor never pushes an error code
or checks the Descriptor Privilege Level (DPL) of the IDT entry
when invoking a hardware interrupt handler.
After doing this exercise,
if you run your kernel with any test program
that runs for a non-trivial length of time
(e.g., dumbfork),
you should see a kernel panic shortly into the program's execution,
followed by some strange output from JOS.
This is because our code has set up the clock hardware
to generate clock interrupts,
and interrupts are now enabled in the processor,
but JOS isn't yet handling them.
Handling Clock Interrupts
Now, you'll write the code to handle clock interrupts.
(The calls to pic_init and kclock_init
in init.c, which we have written for you,
set up the clock and the interrupt controller to generate interrupts, but
JOS doesn't handle them yet.)
Exercise 8.
Modify the kernel's trap_dispatch() function
so that it calls sched_yield()
to find and run a different environment
whenever a clock interrupt takes place.
You should now be able to get the user/spin test to work:
the parent environment should fork off the child,
sys_yield() to it a couple times
but in each case regain control of the CPU after one time slice,
and finally kill the child environment and terminate gracefully.
Some optional questions for you:
- How many instructions of user code are executed between each
interrupt?
- How many instructions of kernel code are executed to handle the
interrupt?
Hint: use
make run-gdb and gdb's b
and si commands.
Inter-Process Communication (IPC)
(Technically in JOS this is "inter-environment communication" or "IEC",
but everyone else calls it IPC, so we'll use the standard
term.)
We've been focusing on the isolation aspects of the operating
system, the ways it provides the illusion that each program
has a machine all to itself. Another important OS service
is to let programs communicate
with each other when they want to. It can be quite powerful
to let programs interact with other programs. The Unix
pipe model is the canonical example.
There are many models for interprocess communication.
We'll implement a simple one and then try it out.
IPC in JOS
You will implement two additional JOS kernel system calls,
sys_ipc_recv and sys_ipc_try_send,
that collectively provide a simple interprocess communication mechanism.
Then you will implement two library wrappers
ipc_recv and ipc_send.
The "messages" that user environments can send to each other
using JOS IPC
consist of two components,
a single 32-bit value
and a single optional page mapping.
Allowing environments to pass page mappings in messages
provides an efficient way to transfer more data
than will fit into a single 32-bit integer,
and also allows environments to set up shared memory arrangements easily.
Sending and Receiving Messages
To receive a message, an environment calls
sys_ipc_recv.
This system call deschedules the current
environment and does not run it again until a message has
been received.
When an environment is waiting to receive a message,
any other environment can send it a message --
not just a particular environment,
and not just environments that have a parent/child arrangement
with the receiving environment.
In other words, the permission checking used in Lab 3's system calls
will not apply to IPC,
because the IPC system calls are carefully designed so as to be "safe":
an environment cannot cause another environment to malfunction
simply by sending it messages
(unless the target environment is also buggy).
To try to send a value, an environment calls
sys_ipc_try_send with both the receiver's
environment id and the value to be sent. If the named
environment is actually receiving (it has called
sys_ipc_recv and not gotten a value yet),
then the send delivers the message and returns 0. Otherwise
the send returns -E_IPC_NOT_RECV to indicate
that the target environment is not currently expecting
to receive a value.
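The receive and try-send rules form a two-state protocol that can be modeled without a kernel. In the sketch below, sim_env, sim_ipc_recv, and sim_ipc_try_send are hypothetical stand-ins, the receiver is a plain struct rather than a blocked environment, and the numeric value of SIM_E_IPC_NOT_RECV is illustrative only. Page transfer and scheduling are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_E_IPC_NOT_RECV 7   /* illustrative error number only */

/* Hypothetical, stripped-down receiver state. */
struct sim_env {
    bool     recving;   /* has asked to receive and not yet gotten a value */
    int      from;      /* id of the sender of the last message */
    uint32_t value;     /* last value received */
};

/* Model of the receive call: mark the environment as waiting. */
static void sim_ipc_recv(struct sim_env *self)
{
    assert(self != NULL);
    self->recving = true;
}

/* Model of the try-send call: deliver only if the target is waiting.
 * Returns 0 on delivery, -SIM_E_IPC_NOT_RECV otherwise. */
static int sim_ipc_try_send(struct sim_env *to, int sender_id, uint32_t value)
{
    assert(to != NULL);
    if (!to->recving)
        return -SIM_E_IPC_NOT_RECV;
    to->recving = false;   /* one message per receive */
    to->from = sender_id;
    to->value = value;
    return 0;
}
```

A second sender that arrives after the first delivery gets the error again until the receiver asks for another message, so a message that has been delivered is never overwritten.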
A library function ipc_recv in user space will take care
of calling sys_ipc_recv and then looking up
the information about the received values in the current
environment's struct Env,
and a library function ipc_send will
take care of repeatedly calling sys_ipc_try_send
until the send succeeds.
Transferring Pages
When an environment calls sys_ipc_recv
with a dstva parameter below UTOP,
the environment is stating that it is willing to receive a page mapping.
If the sender sends a page,
then that page should be mapped at dstva
in the receiver's address space.
If the receiver already had a page mapped at dstva,
then that previous page is unmapped.
When an environment calls sys_ipc_try_send
with a srcva parameter below UTOP,
it means the sender wants to send the page
currently mapped at srcva to the receiver,
with permissions perm.
After a successful IPC,
the sender keeps its original mapping
for the page at srcva in its address space,
but the receiver also obtains a mapping for this same physical page
at the dstva originally specified by the receiver,
in the receiver's address space.
As a result this page becomes shared between the sender and receiver.
If either the sender or the receiver does not indicate
that a page should be transferred,
then no page is transferred.
After any IPC
the kernel sets the new field env_ipc_perm
in the receiver's Env structure
to the permissions of the page received,
or zero if no page was received.
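The rule for when a page actually changes hands depends only on whether both addresses are below UTOP. The function below (ipc_result_perm, a hypothetical helper) returns what would be recorded in env_ipc_perm under that rule. The UTOP value is an assumption for illustration, and the alignment and permission validation that a real system call needs is left out.

```c
#include <assert.h>
#include <stdint.h>

#define UTOP 0xeec00000u   /* assumed value; the JOS headers are authoritative */

static_assert(UTOP % 4096u == 0, "UTOP is page-aligned");

/*
 * Value recorded as the receiver's env_ipc_perm after an IPC:
 * the page's permissions if both sides asked for a page transfer
 * (srcva and dstva both below UTOP), zero otherwise.
 */
static uint32_t ipc_result_perm(uint32_t srcva, uint32_t dstva, uint32_t perm)
{
    if (srcva < UTOP && dstva < UTOP)
        return perm;
    return 0;
}
```

Passing any address at or above UTOP on either side is therefore the way to say "value only, no page".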
Implementing IPC
Exercise 9.
Implement sys_ipc_recv and
sys_ipc_try_send in kern/syscall.c.
Then implement the user versions,
ipc_recv and ipc_send,
in lib/ipc.c.
Use the user/pingpong and user/primes programs
to test your IPC mechanism.
You might find it interesting to read user/primes.c
to see all the forking and IPC going on behind the scenes.
Challenge!
The ipc_send function is not very fair.
Run three copies of user/fairness and you will
see this problem. The first two copies are both trying to send to
the third copy, but only one of them will ever succeed.
Make the IPC fair, so that each copy has approximately
equal chance of succeeding.
Challenge!
Why does ipc_send
have to loop? Change the system call interface so it
doesn't have to. Make sure you can handle multiple
environments trying to send to one environment at the
same time.
Challenge!
The prime sieve is only one neat use of
message passing between a large number of concurrent programs.
Read C. A. R. Hoare, "Communicating Sequential Processes,"
Communications of the ACM 21(8) (August 1978), 666-677,
and implement the matrix multiplication example.
Challenge!
Make JOS's IPC mechanism more efficient
by applying some of the techniques from Liedtke's paper,
"Improving IPC by Kernel Design",
or any other tricks you may think of.
Feel free to modify the kernel's system call API for this purpose,
as long as your code is backwards compatible
with what our grading scripts expect.
Challenge!
Generalize the JOS IPC interface so it is more like L4's,
supporting more complex message formats.
This ends part B. As usual, you can grade your submission with
make grade. If you are trying to figure out why a
particular test case is failing, look at the generated
jos.out.FAILINGTEST files or try sh grade-lab4.sh -v.
Part C: Program Loading
In the final portion of the lab, you'll implement spawn, a
library OS function that creates a new environment, loads a program image
from the kernel, and then starts the child environment running this
program. The parent environment then continues running independently of the
child. The spawn function acts effectively like a UNIX
fork followed by an immediate exec.
We're implementing spawn rather than a UNIX-style
exec because spawn is easier to implement from
user space in "exokernel fashion", without special help from the
kernel. Think about what you would have to do in order to implement
exec in user space, and be sure you understand why it is
harder.
In later labs, you'll load program images from a tiny file system. But
we don't have a file system yet, so we've introduced three system calls
that let you access the user-level program binaries that are linked into
the kernel. The sys_program_lookup system call looks up a
named binary, such as "dumbfork" or
"faultallocbad", and returns its ID. The
sys_program_read system call copies data from a named binary
into an environment's memory.
Exercise 10. Check out the
kern/programs.h header file.
In kern/syscall.c, implement the
sys_program_read function.
Then change your
syscall() function to dispatch the relevant system calls
to the sys_program_lookup and
sys_program_read functions.
Use make run-programread
to check your work.
The other new system call you'll use is sys_env_set_trapframe,
which lets an environment set its children's struct
Trapframe (or its own) to an arbitrary value. Our
spawn will call sys_env_set_trapframe to make the
child environment start executing the loaded program, rather than starting at
the location of the instruction immediately following the parent's
sys_exofork.
Exercise 11.
In kern/syscall.c, implement the
sys_env_set_trapframe function.
Then change your syscall() function to dispatch
the relevant system call to sys_env_set_trapframe.
Use make run-settrapframe
and make run-evilsettrapframe
to check your work. The evilsettrapframe
program is working correctly when it takes a page fault exception
(or GPF)
without printing a message about
"sys_env_set_trapframe works inappropriately".
The skeleton for the spawn function is in
lib/spawn.c. We will put off the implementation of
argument passing until the next exercise. Fill in
spawn so that it operates roughly as follows:
- Read in the ELF header of the named program (we've done this already).
- Create a new environment.
- Load the program text, data, and bss at the appropriate addresses
specified in the ELF executable. The ELF loading pattern should be
familiar to you from Lab 3, but now you're doing it at user level.
- Set up a stack at
USTACKTOP - PGSIZE using the provided
init_stack function (which you must complete).
- Initialize the child's register state using the new
sys_env_set_trapframe system call.
- Start the environment running!
Exercise 12. Finish spawn and segment_alloc.
(You don't need to finish init_stack just yet.)
Test your code using make run-spawnhello.
This should print out something
like this, as the "parent environment" spawns off the hello
program:
[00000000] new env 00001001
i am parent environment 00001001
[00001001] new env 00001002
[00001001] exiting gracefully
[00001001] free env 00001001
hello, world
i am environment 00001002
[00001002] exiting gracefully
[00001002] free env 00001002
Challenge! Implement Unix-style exec.
Last but not least, you'll update init_stack in
lib/spawn.c to pass any command-line arguments into the
spawned environment, via argc and argv. Like
many operating systems, JOS stores these command-line arguments on the
child environment's initial stack.
There are two components of this work: what the parent does and what
the child does.
On the parent side,
spawn must set up the new environment's initial stack page
so that the arguments are available to the child's umain() function.
The parent formats the memory
according to the following diagram.
USTACKTOP:
        +--------------+
        |   block of   |   Block of strings. In the example
        |    memory    |   "simple", "-f", "foo", "-c", and
        | holding '\0' |   "junk" would be stored here.
        |  terminated  |
        | argv strings |
        +--------------+
        |     NULL     |
        |   &argv[n]   |   Next comes the argv array: an array of
        |      .       |   pointers to the strings. Each &argv[*] points
        |      .       |   into the "block of strings" above.
        |      .       |   The array is terminated with a NULL pointer.
        |   &argv[1]   |
        |   &argv[0]   |<-.
        +--------------+  |
        |   argv ptr   |--'   In the body of umain, accesses to argc
%esp -> |     argc     |      and argv reference these two values.
        +--------------+
If these values are on the stack when umain is called,
then umain will be able to access its arguments via the
int argc and char* argv[] parameters.
Warning: the diagram shows the memory at USTACKTOP since this
is where it will be mapped in the child's address space. However, be
careful! When the parent formats the arguments, it must do so at a
temporary address, since it can't (well, shouldn't) map over its own stack.
Similarly, take care when setting the pointers argv ptr and
&argv[0] .. &argv[n]. These pointers must hold the addresses the data
will have once the page is remapped into the child just below
USTACKTOP, not the temporary addresses the parent writes through.
Exercise 13. Finish init_stack.
Test your code using make run-spawninit.
This should print out something
like this, as the "parent environment" spawns off the init
program:
[00000000] new env 00001001
i am parent environment 00001001
[00001001] new env 00001002
[00001001] exiting gracefully
[00001001] free env 00001001
init: running
init: data seems okay
init: bss seems okay
init: args: 'init' 'one' 'two'
init: exiting
[00001002] exiting gracefully
[00001002] free env 00001002
Challenge! Implement Unix-like environment variables.
On the child side, examine the
entry path of the child environment under the start label.
It is written such that libmain() and
umain() both take arguments (int argc, char
*argv[]).
libmain() simply passes its arguments along to umain().
You'll notice that the entry path also takes
care of the case when a new environment is created by the kernel, in which
case no arguments are passed.
The code on the child side has been done for you;
you do not need to make any changes.
Technical Detail: Strictly speaking, only argc and the
argv ptr must be placed on the new env's stack. The
argv ptr must point to the &argv[0]
.. &argv[n] array, each element of which points to a string. As a
consequence, the &argv[0] .. &argv[n] array and the
"block of strings" can be located anywhere in the new env's address
space, not necessarily on the stack. In practice, we find it convenient to
store all of these values on the stack, as presented in this
exercise.
Finally, you'll fix something that may have been bothering
you for a while. Normally, the code segment in a Unix program is mapped as
read-only. This makes certain awful types of bug easier to find (it is
really unfortunate when a program's instructions change accidentally). But
so far, your user-level programs probably run with all data read/write,
including program instructions. You can fix that in spawn
pretty easily.
Exercise 14. Update
spawn so that program segments where (ph->p_flags &
ELF_PROG_FLAG_WRITE) == 0 are mapped read-only when
the child process executes. Use
make run-spawnreadonlytext to check your
work.
Hint: You can't allocate the pages read-only, or you
wouldn't be able to write to them. What can you do?
This completes the lab.