![[Kernel, courtesy IowaFarmer.com CornCam]](kernel.jpg) |
Lab 3: User Environments
Introduction
In this lab you will implement the basic kernel facilities
required to get a protected user-mode environment (i.e., "process") running.
You will enhance the JOS kernel
to set up the data structures to keep track of user environments,
create a single user environment,
load a program image into it,
and start it running.
You will also make the JOS kernel capable
of handling any system calls the user environment makes
and handling any other exceptions it causes.
Note:
In this lab, the terms environment and process are
interchangeable -- they have roughly the same meaning. We introduce the
term "environment" instead of the traditional term "process"
in order to stress the point that JOS environments do not provide
the same semantics as UNIX processes,
even though they are roughly comparable.
Getting Started
Download our reference code for lab 3 from
lab3.tar.gz and untar it, then merge
your changes for lab2 into the lab3 source directory, as you did for Lab 2.
Lab 3 contains a number of new source files,
which you should browse through as you merge them into your kernel:
| Directory | File | Description |
|---|---|---|
| inc/ | env.h | Public definitions for user-mode environments |
| | syscall.h | Public definitions for system calls from user environments to the kernel |
| | lib.h | Public definitions for the user-mode support library |
| kern/ | env.h | Kernel-private definitions for user-mode environments |
| | env.c | Kernel code implementing user-mode environments |
| | sched.h | Schedule multiple user environments |
| | syscall.h | Kernel-private definitions for system call handling |
| | syscall.c | System call implementation code |
| lib/ | Makefrag | Makefile fragment to build user-mode library, obj/lib/libuser.a |
| | entry.S | Assembly-language entrypoint for user environments |
| | libmain.c | User-mode library setup code called from entry.S |
| | syscall.c | User-mode system call stub functions |
| | console.c | User-mode implementations of putchar and getchar, providing console I/O |
| | exit.c | User-mode implementation of exit |
| | panic.c | User-mode implementation of panic |
| user/ | * | Various test programs to check lab 3 functionality |
In addition, a number of the source files we handed out for lab2
are modified in lab3.
To see the differences, you can type:
$ diff -ur lab2 lab3
Lab Requirements
This lab is divided into three parts.
As in lab 2,
you will need to do all of the regular exercises described in the lab
and write up brief answers
to the questions posed in the lab.
We encourage you to attempt at least one challenge problem,
but none is required for this lab.
(Also, the challenge problems for this lab aren't super exciting at the
moment.
More suggestions are welcome: send them to the class mailing list!
Or go back and implement another challenge from an earlier lab.)
If you do solve a challenge problem, provide
a short (e.g., one or two paragraph) description of what you did
to solve your chosen challenge problem.
Place the write-up in a file called answers.txt (plain text)
or answers.html (HTML format)
in the top level of your lab3 directory
before handing in your work.
Debugging tips
Bochs is a much more
hospitable debugging environment than a real processor.
Put it to work for you!
- The
vb command sets a breakpoint
at a particular CS:EIP address.
Since the kernel code segment selector is 8,
vb 8:0xf0101234
sets a breakpoint at the given kernel address.
Similarly,
since the user segment selector is 0x1b, vb 0x1b:0x80020
sets a breakpoint at the given user address.
Finally, note that passing all the gmake grade tests
does not mean your code is perfect. It may have subtle bugs that will
only be tickled by future labs.
In a perfect world, gmake grade would find all your bugs,
but no one builds operating systems
in a perfect world anyway.
Keep in mind that debugging an operating system is a
very holistic task -- there are abstraction boundaries, but you
can't necessarily place much trust in them since nothing is really
enforcing them. If you get all sorts of weird crashes
that don't seem to be explainable by a single bug in the layer you're
working on, it's likely that they're explainable by a single bug in
a different layer.
Hand-In Procedure
As before,
you can test your code against our test scripts
by running gmake grade .
When you are done,
run gmake tarball to tar up your source, and send it to me in
an email.
Part 1: User Environments
The new include file inc/env.h
contains basic definitions for user environments in JOS.
The kernel uses the Env data structure
to keep track of critical data pertaining to each user environment.
In this lab we will initially create only one environment,
but we will need to design the JOS kernel
to support multiple simultaneously active environments,
because in lab 4 we will take advantage of this functionality
by allowing a user environment to fork other environments.
As you can see in kern/env.c,
the kernel maintains three main global variables
pertaining to environments:
struct Env* envs = NULL; /* All environments */
struct Env* curenv = NULL; /* the current env */
static struct Env_list env_free_list; /* Free list */
Once JOS gets up and running,
the envs pointer points to an array of Env structures
representing all the environments in the system.
In our design,
the JOS kernel will support a maximum of NENV
simultaneously active environments,
although there will typically be far fewer running environments
at any given time.
(NENV is a constant #define'd in inc/env.h.)
Once it is allocated,
the envs array will contain
a single instance of the Env data structure
for each of the NENV possible environments.
The JOS kernel keeps all of the inactive Env structures
on the env_free_list .
This design allows efficient allocation and
deallocation of environments,
as they merely have to be added to or removed from the free list.
The kernel uses the curenv variable
to keep track of the currently executing environment at any given time.
During boot up, before the first environment is run,
curenv is initially set to NULL .
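The free-list design described above can be sketched in a few lines of ordinary C. This is only an illustration that runs outside the kernel: the names DemoEnv, demo_env_init, demo_env_alloc, demo_env_free and demo_env_index are hypothetical, NENV is shrunk to 8 for the demo, and a hand-rolled singly linked list stands in for the LIST macros from inc/queue.h that the real code uses.

```c
#include <assert.h>
#include <stddef.h>

#define NENV 8   /* assumption: tiny value for the demo; the real one is in inc/env.h */

enum { ENV_FREE, ENV_RUNNABLE };

struct DemoEnv {
    int status;
    struct DemoEnv *next_free;   /* stand-in for the env_link list entry */
};

static struct DemoEnv demo_envs[NENV];   /* plays the role of envs[] */
static struct DemoEnv *demo_free_list;   /* plays the role of env_free_list */

/* Mark every slot free and push it on the free list.
 * Walking the array backwards leaves demo_envs[0] at the head. */
void demo_env_init(void)
{
    demo_free_list = NULL;
    for (int i = NENV - 1; i >= 0; i--) {
        demo_envs[i].status = ENV_FREE;
        demo_envs[i].next_free = demo_free_list;
        demo_free_list = &demo_envs[i];
    }
}

/* Pop one slot in constant time; NULL means all NENV slots are in use. */
struct DemoEnv *demo_env_alloc(void)
{
    struct DemoEnv *e = demo_free_list;
    if (e == NULL)
        return NULL;
    demo_free_list = e->next_free;
    e->next_free = NULL;
    e->status = ENV_RUNNABLE;
    return e;
}

/* Push a slot back on the free list in constant time. */
void demo_env_free(struct DemoEnv *e)
{
    e->status = ENV_FREE;
    e->next_free = demo_free_list;
    demo_free_list = e;
}

/* Index of a slot within the array, for inspection. */
int demo_env_index(const struct DemoEnv *e)
{
    return (int)(e - demo_envs);
}
```

Neither allocation nor deallocation ever scans the array, which is the efficiency argument made above.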
Environment State
The Env structure
is defined in inc/env.h as follows
(although more fields will be added in future labs):
struct Env {
struct Trapframe env_tf; // Saved registers
LIST_ENTRY(Env) env_link; // Free list link pointers
envid_t env_id; // Unique environment identifier
envid_t env_parent_id; // env_id of this env's parent
unsigned env_status; // Status of the environment
// Address space
pde_t* env_pgdir; // Kernel virtual address of page dir
physaddr_t env_cr3; // Physical address of page dir
};
We now briefly describe the state kept by the kernel for each user
environment.
- env_tf
- Holds the current state of an environment's registers while that
environment is not running: i.e., when the kernel or a different
environment is running. The kernel saves the processor state into
env_tf when switching from user to kernel mode, so that the
environment can later be resumed where it left off. We first saw
struct Trapframe in Lab 2. (How did
we use it there?)
- env_link
- A pair of pointers allowing the Env to be placed on
the
env_free_list . See inc/queue.h for
details.
- env_id
- An integer value that uniquely identifies the environment currently
using this
Env structure (i.e., using this particular
slot in the envs array). After a user environment
terminates, the kernel may subsequently re-allocate the same
Env structure to a different environment, but the
env_id will probably be different. (It might not be
different if enough environments were created that the relevant
counter wrapped.) The Env structure for envid_t
e is located at envs[ENVX(e)] (unless
environment e was killed, and the slot was reused in
the meantime).
- env_parent_id
- The
env_id of the environment that created this environment.
In this way the environments form a "family tree,"
which will be useful for making security decisions
about which environments are allowed to do what to whom.
- env_status
- This variable holds one of the following values:
ENV_FREE
- The Env structure is inactive,
and therefore on the env_free_list.
ENV_RUNNABLE
- The Env structure
represents a currently active environment,
and the environment is waiting to run on the processor.
ENV_NOT_RUNNABLE
- The Env structure
represents a currently active environment,
but it is not currently ready to run:
for example, because it is waiting
for an interprocess communication (IPC)
from another environment.
- env_pgdir
- A virtual address pointer
to this environment's page directory.
- env_cr3
- The corresponding physical address
for this environment's page directory.
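To make the relationship between env_id and the slot index concrete, here is a minimal sketch. It assumes that NENV is 1024 and that the identifier is a generation counter placed above the ten index bits; that assumption is consistent with the "00000400" identifier printed later in this lab for the environment in slot 0, but check inc/env.h for the real definitions. LOG2NENV, demo_mkenvid and demo_id_is_current are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t envid_t;

#define LOG2NENV 10                          /* assumption: 1024 slots */
#define NENV     (1 << LOG2NENV)
#define ENVX(envid) ((envid) & (NENV - 1))   /* low bits select the slot in envs[] */

/* Hypothetical helper: combine a generation counter with a slot index. */
envid_t demo_mkenvid(int generation, int slot)
{
    return (envid_t)((generation << LOG2NENV) | slot);
}

/* A lookup has to compare the whole identifier, not just ENVX(id):
 * a stale id and the slot's current id can share the same index. */
int demo_id_is_current(envid_t wanted, envid_t stored_in_slot)
{
    return wanted == stored_in_slot;
}
```

For example, demo_mkenvid(1, 5) and demo_mkenvid(2, 5) both map to envs[5], yet they name different environments; this is why an identifier is more than an array index.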
Like a Unix process, a JOS environment couples the concepts of "thread", or
processor and stack context, and "address space", or memory context. The
thread is defined primarily by the saved registers (the env_tf
field), and the address space is defined by the page directory and page
tables pointed to by env_pgdir/env_cr3. To run
an environment, the kernel must set up the CPU with both the saved
registers and the appropriate address space.
In JOS,
individual environments do not have their own kernel stacks
as processes do in Linux and other conventional UNIXes.
Instead, all JOS kernel code runs on a single kernel stack,
and the kernel saves user-mode register state explicitly
in each struct Env's env_tf
rather than implicitly on the relevant process's kernel stack.
Allocating the Environments Array
In lab 2,
you allocated memory in i386_vm_init()
for the pages array,
which is a table the kernel uses to keep track of
which pages are free and which are not.
You will now need to modify i386_vm_init() further
to allocate a similar array of Env structures,
called envs.
Exercise 1.
Modify i386_vm_init() in kern/pmap.c
to allocate and map the envs array.
This array consists of
exactly NENV instances of the Env structure,
laid out consecutively in the kernel's virtual address space
starting at address UENVS
(defined in inc/pmap.h).
The physical pages that these virtual addresses map to
do not have to be contiguous,
since the kernel only ever uses virtual addresses
to access the envs array.
You should be able to allocate and map this array
in exactly the same way as you did for the pages array.
|
Creating and Running Environments
You will now write the code in kern/env.c
necessary to run a user environment.
Because we do not yet have a filesystem,
we will set up the kernel to load a static ELF executable image
that is embedded within the kernel itself.
Once you integrate our Lab 3 code with your Lab 2 solutions,
you will notice that our makefiles generate a number of binary images
in the obj/user/ directory.
Further, if you look at kern/Makefrag,
you will notice some magic that takes all of these binaries
and "links" them directly into the kernel executable
as if they were .o files.
The '-b binary' option on the linker command line
causes these files to be linked in as "raw" uninterpreted binary files
rather than as regular .o files produced by the compiler.
(As far as the linker is concerned,
these files do not have to be ELF images at all --
they could be anything, such as text files or pictures!)
If you look at obj/kern/kernel.sym after building the kernel,
you will notice that the linker has "magically" produced
a number of funny symbols with obtuse names like
_binary_obj_user_hello_start,
_binary_obj_user_hello_end, and
_binary_obj_user_hello_size.
The linker generated these symbol names
simply by mangling the file names of these binary files;
these magic symbols provide the regular kernel code with a way
to reference the embedded binary files.
In this lab, the kernel will start up and run one of those binary images.
The code to select a binary image is in kern/init.c .
The grade script links different binary images into your kernel, to test
different properties of your user environment handling. If you're not
running the grade script, the kernel normally runs the hello
program, defined in user/hello.c , which prints
hello, world!
in the old-school manner. You're free to run whatever binary you want, but
don't change the version inside #ifdef TEST .
In kern/env.h you will find some macros
that kern/init.c uses to load one of these binary images
into a user environment via env_create
and then run it via env_run.
However,
the critical functions to set up user environments are not complete;
you will need to fill them in.
Exercise 2 (Long!).
In the file env.c ,
finish coding the following functions:
- env_init():
- Initialize all of the Env structures
in the envs array
and add them to the env_free_list.
- env_setup_vm():
- Allocate a page directory for a new environment
and initialize the kernel portion
of the new environment's address space.
- load_icode():
- Parse an ELF binary image,
much like the boot loader already does,
and load its contents into the user address space
of a new environment.
- env_create():
- Allocate an environment with env_alloc
and call load_icode to load an ELF binary into it.
- env_run():
- Start a given environment running in user mode.
As you write these functions,
you might find printf's new %e converter
useful -- it prints a description corresponding to an error code.
For example,
r = -E_NO_MEM;
panic("env_alloc: %e", r);
will panic with the message "env_alloc: out of memory".
|
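The segment-loading step that load_icode() performs can be sketched as an ordinary C function operating on byte buffers. This is not the JOS implementation: it copies into a plain array standing in for the user address space rather than into freshly mapped pages, the struct and function names (DemoElfHdr, DemoProgHdr, demo_load_image, demo_build_image) are hypothetical, and the header layouts follow the generic ELF32 format rather than the lab's own ELF header file. The core idea is the same: for each loadable program header, copy p_filesz bytes from the image and zero the rest up to p_memsz.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PT_LOAD 1   /* program header type for a loadable segment */

/* ELF32 file header and program header, laid out as in the ELF specification. */
struct DemoElfHdr {
    uint8_t  e_ident[16];            /* starts with 0x7f 'E' 'L' 'F' */
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
};
struct DemoProgHdr {
    uint32_t p_type, p_offset, p_va, p_pa, p_filesz, p_memsz, p_flags, p_align;
};

/* Load every PT_LOAD segment of `image` into `mem`, which stands in for the
 * address range [base, base + mem_len). Returns 0 and stores the entry point
 * on success, -1 if the image is malformed or does not fit. */
int demo_load_image(const uint8_t *image, size_t image_len,
                    uint8_t *mem, uint32_t base, size_t mem_len,
                    uint32_t *entry)
{
    struct DemoElfHdr eh;
    if (image_len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);   /* memcpy avoids unaligned accesses */
    if (memcmp(eh.e_ident, "\x7f" "ELF", 4) != 0)
        return -1;

    for (uint32_t i = 0; i < eh.e_phnum; i++) {
        struct DemoProgHdr ph;
        size_t off = (size_t)eh.e_phoff + i * sizeof ph;
        if (off + sizeof ph > image_len)
            return -1;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type != DEMO_PT_LOAD)
            continue;                /* only loadable segments are copied */
        if (ph.p_filesz > ph.p_memsz ||
            (size_t)ph.p_offset + ph.p_filesz > image_len ||
            ph.p_va < base ||
            (size_t)(ph.p_va - base) + ph.p_memsz > mem_len)
            return -1;
        uint8_t *dst = mem + (ph.p_va - base);
        memcpy(dst, image + ph.p_offset, ph.p_filesz);            /* file-backed part */
        memset(dst + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);   /* zero-filled tail */
    }
    *entry = eh.e_entry;
    return 0;
}

/* Build a tiny one-segment image for the demo: 4 bytes in the file, 8 in memory.
 * The load address 0x800020 is an arbitrary value chosen for illustration. */
size_t demo_build_image(uint8_t *buf, size_t cap)
{
    static const uint8_t data[4] = { 0xde, 0xad, 0xbe, 0xef };
    struct DemoElfHdr eh;
    struct DemoProgHdr ph;
    size_t total = sizeof eh + sizeof ph + sizeof data;
    if (cap < total)
        return 0;
    memset(&eh, 0, sizeof eh);
    memset(&ph, 0, sizeof ph);
    memcpy(eh.e_ident, "\x7f" "ELF", 4);
    eh.e_entry = 0x800020;
    eh.e_phoff = sizeof eh;
    eh.e_phnum = 1;
    ph.p_type = DEMO_PT_LOAD;
    ph.p_offset = sizeof eh + sizeof ph;
    ph.p_va = 0x800020;
    ph.p_filesz = sizeof data;
    ph.p_memsz = 8;
    memcpy(buf, &eh, sizeof eh);
    memcpy(buf + sizeof eh, &ph, sizeof ph);
    memcpy(buf + sizeof eh + sizeof ph, data, sizeof data);
    return total;
}
```

In the kernel the destination is the new environment's address space, so the pages backing each segment must be allocated and mapped before anything is copied; the sketch deliberately leaves that part out.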
Once you are done you should compile your kernel and run it under Bochs.
Below is a call graph of the code up to the point where the user
code is invoked.
Make sure you understand the purpose of each step.
-
start (kern/entry.S )
-
i386_init
-
cons_init
-
i386_detect_memory
-
i386_vm_init
-
page_init
-
env_init
-
idt_init
-
env_create
-
env_run
At this point, Bochs will start running user/hello.c
in user mode!
To see how this happens,
set a Bochs breakpoint at env_pop_tf,
which should be the last function you hit before actually entering user mode.
Step through this function;
the processor should enter user mode after the iret instruction.
(How can you tell?)
You should then see the first instruction
in the user environment's executable,
which is the cmpl instruction at the label start
in lib/entry.S.
If you continue past this point, hello should run successfully
until it first hits an int $48 instruction,
which is what user-mode code executes
to make a system call.
(See lib/syscall.c to see how this works.)
Then, your trap code from the previous lab should activate
and kill the process!
(We've changed trap() to handle
uncaught user-mode exceptions by killing the offending environment.)
If you cannot get to this point,
then something is wrong with your address space setup
or program loading code;
go back and fix it before continuing.
If you run make grade at this point, you should pass the
divzero, breakpoint, softint, and
badsegment tests, and get 20 points.
Question:
- Did you have to do anything
to make the user/softint program behave correctly
(i.e., generate a general protection fault, as the grade script expects)?
Why is this the correct behavior?
What happens if the kernel actually allows softint's
int $14 instruction to invoke the kernel's page fault handler
(which is interrupt number 14)?
|
Part 2: User-Level Exceptions and System Calls
Now, we'll update the exception handling support you added in the last
lab, using it to provide important operating
system functionality.
The Breakpoint Exception
In the last lab, you turned the breakpoint exception, interrupt number 3
(T_BRKPT ), into a primitive debugging instruction that invokes
the JOS kernel monitor. The user-mode implementation of panic()
in lib/panic.c, for example, performs an int3 after
displaying its panic message. Make sure at this point that this
functionality works! The breakpoint user program tests it by
invoking an int3 instruction.
Challenge Note: If you implemented
the single-stepping challenge in Lab 2, you might want to verify that your
code works on user-level programs too. |
Question:
- Executing
int3 at user level might deliver a general
protection fault to the kernel, rather than a breakpoint exception,
depending on how you initialized the breakpoint entry in the IDT
(i.e., your call to SETGATE from
idt_init ). What change would you make to cause
user-level breakpoints to generate a GPF? Why does this
functionality exist? |
Page Faults
The page fault exception, interrupt number 14 (T_PGFLT),
is a particularly important one that we will exercise heavily
throughout this lab and the next.
When the processor takes a page fault,
it stores the linear address that caused the fault
in a special processor control register, CR2.
In trap.c
we have provided the beginnings of a special function,
page_fault_handler(),
to handle page fault exceptions.
Exercise 3.
Modify trap()
to dispatch page fault exceptions
to page_fault_handler().
You should now be able to get make grade
to succeed on the faultread, faultreadkernel,
faultwrite, and faultwritekernel tests.
If any of them don't work, figure out why and fix them.
|
You will further refine the kernel's page fault handling below,
as you implement system calls.
System Calls
User processes ask the kernel to do things for them by
invoking system calls. When the user process invokes a system call,
the processor enters kernel mode,
the processor and the kernel cooperate
to save the user process's state,
and the kernel executes appropriate code in order to carry out the system
call. When it's done, it resumes the user process.
The exact
details of how the user process gets the kernel's attention
and how it specifies which call it wants to execute vary
from system to system.
In the x86 kernel, we will use the int
instruction, which causes a processor interrupt.
In particular, we will use int $48
as the system call interrupt.
We have defined the constant
T_SYSCALL to 48 for you. You will have to
set up the interrupt descriptor to allow user processes to
cause that interrupt. Note that interrupt 48 cannot be
generated by hardware, so there is no ambiguity caused by
allowing user code to generate it.
In the x86 kernel, we will pass the system call number and
the system call arguments in registers. This way, we don't
need to grub around in the user environment's stack
or instruction stream. The
system call number will go in %eax, and the
arguments (up to five of them) will go in %edx,
%ecx, %ebx, %edi,
and %esi, respectively. The kernel passes the
return value back in %eax. The assembly code to
invoke a system call has been written for you, in
syscall() in lib/syscall.c. You
should read through it and make sure you understand what
is going on.
You may also find it helpful to read inc/syscall.h .
Exercise 4.
Add a handler in the kernel
for interrupt number T_SYSCALL .
You will have to edit kern/trapentry.S and
kern/trap.c 's idt_init() . You
also need to change trap() to handle the
system call interrupt by calling syscall()
(defined in kern/syscall.c)
with the appropriate arguments,
and then arranging for
the return value to be passed back to the user process
in %eax .
Finally, you need to implement syscall() in
kern/syscall.c ; it should dispatch to one of the
sys_ functions defined there.
See inc/syscall.h for system call numbers.
Make sure syscall() returns -E_INVAL
if the system call number is invalid.
Run the hello program under your kernel.
(Your kernel runs this program by default.)
It should print "hello, world" on the console
and then cause a page fault in user mode.
If this does not happen, it probably means
your system call handler isn't quite right.
If the kernel doesn't appear to be receiving a system call interrupt,
check your call to SETGATE: are the privileges right?
|
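The register convention and the dispatch step from Exercise 4 can be illustrated with a small simulation that runs as an ordinary program. Everything here is a stand-in: DemoRegs holds only the six registers the convention uses, the call numbers and the two demo_sys_ functions are invented for the example, and the numeric value of E_INVAL is an assumption (take the real numbers from the lab's headers).

```c
#include <assert.h>
#include <stdint.h>

#define E_INVAL 3   /* assumption: the real value comes from the lab's error header */

/* Hypothetical call numbers; the real ones are listed in inc/syscall.h. */
enum { DEMO_SYS_getenvid = 0, DEMO_SYS_add = 1 };

/* The subset of saved registers that the system call convention uses. */
struct DemoRegs {
    uint32_t eax, edx, ecx, ebx, edi, esi;
};

static int32_t demo_sys_getenvid(void) { return 0x400; }
static int32_t demo_sys_add(uint32_t a, uint32_t b) { return (int32_t)(a + b); }

/* Dispatch on the call number; unknown numbers yield -E_INVAL. */
int32_t demo_syscall(uint32_t num, uint32_t a1, uint32_t a2,
                     uint32_t a3, uint32_t a4, uint32_t a5)
{
    (void)a3; (void)a4; (void)a5;   /* unused by the two demo calls */
    switch (num) {
    case DEMO_SYS_getenvid: return demo_sys_getenvid();
    case DEMO_SYS_add:      return demo_sys_add(a1, a2);
    default:                return -E_INVAL;
    }
}

/* What the trap path does for the system call interrupt: the number comes
 * from %eax, the arguments from %edx, %ecx, %ebx, %edi, %esi, and the
 * result is written back into the saved %eax. */
void demo_handle_syscall_trap(struct DemoRegs *r)
{
    r->eax = (uint32_t)demo_syscall(r->eax, r->edx, r->ecx,
                                    r->ebx, r->edi, r->esi);
}
```

Writing the result into the saved copy of %eax, rather than the live register, matters because the saved state is what gets restored when the environment resumes.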
User-mode Environment Setup
Now, you'll fix the user-mode page fault in
user/hello.c .
At this point, this happens when the umain function tries to
access env->env_id .
The JOS library OS is supposed to set the global pointer env
to point at the current environment's struct Env , in the
read-only copy of the envs[] array you allocated in Part 1.
This global pointer lets the process efficiently access its state.
But currently the pointer is just null.
Exercise 5. JOS user programs
start running at the top of lib/entry.S . Trace through, find
the point where env should be set, and set it. Note that
lib/entry.S has already defined envs to point at
the UENVS mapping you set up in Part 1. Hint: You'll want to
use a system call.
This is the first point in the lab where you test the user-level
read-only mapping of envs[] at UENVS , so you may
want to check your code from Part 1 if you have problems here. And don't
forget that envid_t s aren't just indexes! |
At this point, user/hello should print "hello, world",
then "i am environment 00000400". It
then attempts to "exit" by calling sys_env_destroy() (see
lib/libmain.c and lib/exit.c). Since the kernel
currently only supports one user environment, it should report that
it has destroyed the only environment and then drop into the kernel
monitor.
Page faults and memory protection
In this section of the lab, you'll begin refining JOS's response to
user-level page fault exceptions, which happen when an application tries to
access an invalid address or an address for which it has no permissions.
Memory protection is a crucial operating system feature, since it can help
the OS ensure that bugs in one program cannot corrupt other programs or the
operating system itself.
On an invalid access, the processor stops the program at the instruction
causing the fault and then traps into the kernel with information about the
attempted operation. If the fault is fixable, the kernel can fix it and
let the program continue running. If the fault is not fixable, then the
program cannot continue, since it will never get past the instruction
causing the fault.
As an example of a fixable fault, consider an automatically extended stack.
In many systems the kernel allocates a single stack page, and then
if a program faults accessing pages further down the stack, the kernel
will allocate those pages automatically and let the program continue.
By doing this, the kernel only allocates the memory that the program
is going to use, but the program can work under the illusion that it
has an arbitrarily large stack.
System calls present an interesting problem for memory protection.
Most system call interfaces let user programs pass pointers to the
kernel. These pointers point at user buffers to be read or written.
The kernel then dereferences these pointers on behalf of the user
while carrying out the system call.
There are two problems with this:
- A page fault in the kernel
is taken a lot more seriously than a page fault in a user program.
If the kernel page faults, that's usually a kernel bug, and the
fault handler will panic the kernel
(and hence the whole system).
In a system call,
when the kernel is dereferencing pointers to the user's address space,
we need a way to remember that any page faults these dereferences cause
are actually on behalf of the user program.
- The kernel typically has more memory permissions than the user program.
The user program might ask the kernel to read from or write to a
location in kernel memory that the user program cannot access but that
the kernel can.
If the kernel is not careful,
a buggy or malicious user program can trick the kernel
into using its greater privilege in unintended ways,
possibly so as to destroy the integrity of the kernel completely.
For both of these reasons the kernel must be extremely careful when
handling pointers presented by user programs.
You will now need to implement solutions to these two problems
in your kernel.
To address the first problem,
you will use a global variable page_fault_mode
to let the fault handler know when the kernel is manipulating memory
on behalf of the user environment. If a fault happens then,
the user environment will be destroyed.
(Otherwise, if a fault happens, the kernel should panic.)
Exercise 6.
Change kern/trap.c 's page fault handler as follows.
If a page fault happens while in kernel mode, check the setting
of page_fault_mode and act accordingly.
The possible page fault modes are listed
in kern/trap.h .
If you destroy the current environment,
print a message explaining the fault in the following format:
printf("[%08x] PFM_KILL va %08x ip %08x\n",
curenv->env_id, fault_va, tf->tf_eip);
Hint: To determine whether a fault happened in user mode or
in kernel mode, check the low bits of the tf_cs .
Change kern/syscall.c to set the page fault mode
correctly when handling the user pointer in sys_cputs .
Make sure you reset the page fault mode when the code finishes
handling the user pointer.
Change kern/init.c to run user/buggyhello
instead of user/hello . Compile your kernel and boot it.
The environment should be destroyed,
and the kernel should not panic.
You should see:
[00000000] new env 00000400
[00000400] PFM_KILL va 00000001 ip f010263d
TRAP frame ...
[00000400] free env 00000400
Destroyed the only environment - nothing more to do!
(Your ip may be different
but should begin f01 .)
|
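The decision the page fault handler has to make in Exercise 6 can be written down as a small pure function. This is a sketch of the logic only, not the handler itself: the enum and function names are hypothetical, and DEMO_PFM_NONE is an assumed name for the "not touching user memory" mode (the actual modes are listed in kern/trap.h). The selector values 8 and 0x1b are the kernel and user code segment selectors mentioned in the debugging tips.

```c
#include <assert.h>
#include <stdint.h>

/* DEMO_PFM_NONE is an assumed name; see kern/trap.h for the real modes. */
enum demo_pf_mode   { DEMO_PFM_NONE, DEMO_PFM_KILL };
enum demo_pf_action { DEMO_ACT_USER_FAULT, DEMO_ACT_KILL_ENV, DEMO_ACT_PANIC };

/* The low two bits of a code segment selector hold the privilege level:
 * 0 for kernel code, 3 for user code. */
int demo_from_user_mode(uint16_t tf_cs)
{
    return (tf_cs & 3) == 3;
}

/* Decide what to do with a page fault, given the saved %cs and the
 * current page fault mode. */
enum demo_pf_action demo_classify_fault(uint16_t tf_cs, enum demo_pf_mode mode)
{
    if (demo_from_user_mode(tf_cs))
        return DEMO_ACT_USER_FAULT;   /* ordinary user-level fault */
    if (mode == DEMO_PFM_KILL)
        return DEMO_ACT_KILL_ENV;     /* kernel was touching user memory */
    return DEMO_ACT_PANIC;            /* genuine kernel bug */
}
```

The mode only helps if it is set just before the kernel touches the user pointer and reset immediately afterwards; a mode left at "kill" would hide real kernel bugs.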
The check you just added protects against buggy environments that pass
invalid pointers, but does not protect against evil environments that
pass pointers to valid kernel memory. user/evilhello
is one such program.
To address this second protection problem,
you will "sanitize" all user pointers
by using the TRUP macro ("TRanslate User Pointer")
defined in kern/pmap.h.
This macro will leave valid user pointers
as is, but will translate all other pointers to ULIM ,
which will always cause a page fault when accessed.
Exercise 7.
Change the definition of sys_cputs to protect itself
against malicious user environments by using TRUP .
Change kern/init.c to run user/evilhello .
Compile your kernel and boot it.
The environment should be destroyed,
and the kernel should not panic.
You should see:
[00000000] new env 00000400
[00000400] PFM_KILL va ef800000 ip f010263d
[00000400] free env 00000400
(Your ip may be different
but should begin f01 .)
|
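The effect of sanitizing a pointer can be shown with a plain function over 32-bit addresses. This is a sketch of the idea behind TRUP, not the macro from kern/pmap.h: demo_trup is a hypothetical name, the real macro works on pointer types, and its exact boundary handling may differ. The value used for ULIM here is taken from the faulting address ef800000 in the sample output above.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ULIM 0xef800000u   /* matches the faulting va in the sample output */

/* Leave addresses below ULIM untouched; redirect everything else to ULIM,
 * where an access from the sanitized pointer will page fault. */
uint32_t demo_trup(uint32_t va)
{
    return va < DEMO_ULIM ? va : DEMO_ULIM;
}
```

A kernel address such as f0100000 therefore becomes ef800000, which is why the expected output for user/evilhello reports a fault at that address rather than a successful read of kernel memory.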
Part 3: Creating User Environments and Cooperative Multitasking
Now, you'll implement some new JOS kernel system calls
to allow user-level environments to create
additional new environments.
You will also implement cooperative round-robin scheduling,
allowing the kernel to switch from one environment to another
when the current environment voluntarily relinquishes the CPU (or exits).
In the next lab you'll implement preemptive scheduling,
which allows the kernel to re-take control of the CPU from an environment
even if the environment does not cooperate.
Round-Robin Scheduling
Your first task in this part of the lab is to change the JOS kernel
so that it does not always just run the environment in envs[0],
but instead can alternate between multiple environments
in "round-robin" fashion.
Round-robin scheduling in JOS works as follows:
- The first environment, in envs[0],
will from now on always be a special idle environment,
which always runs the program user/idle.c.
The purpose of this program is simply to "waste time"
whenever the processor has nothing better to do -
it just perpetually attempts to give up the CPU
to another environment.
Read the code and comments in user/idle.c
for other useful details.
kern/init.c
will create this special idle environment in envs[0]
before creating the first "real" environment in envs[1].
- The function sched_yield() in the new kern/sched.c
is responsible for selecting a new environment to run.
It searches sequentially through the envs[] array
in circular fashion,
starting just after the previously running environment
(or at the beginning of the array
if there was no previously running environment),
picks the first environment it finds
with a status of ENV_RUNNABLE
(see inc/env.h),
and calls env_run() to jump into that environment.
However, sched_yield() is aware
that envs[0] is the special idle environment,
and never picks it unless
there are no other runnable environments.
- User environments call the
sys_yield()
system call
to invoke the kernel's sched_yield() function,
and thereby voluntarily give up the CPU to a different environment.
As you can see in user/idle.c,
the idle process does this routinely.
- Whenever the kernel switches from one environment to another
via env_run(),
it must save the old environment's register state
so that it can be restored properly later
when the first environment is eventually re-entered.
(Why? Where is the user environment's register state stored
on entry to a system call or trap handler in the kernel,
and where does this state need to be stored
in order to keep it safe for next time the environment runs?)
There's a panic()
in env_run() to point out where this needs to happen.
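The selection rule described in the list above can be sketched as a function over an array of status values. It is a simulation rather than the kernel's sched_yield(): demo_pick_next is a hypothetical name, it returns an index instead of calling env_run(), and it assumes the idle environment in slot 0 is always runnable.

```c
#include <assert.h>

enum { DEMO_ENV_FREE, DEMO_ENV_RUNNABLE, DEMO_ENV_NOT_RUNNABLE };

/* Pick the next environment to run.
 *   status[i] is the status of slot i; slot 0 is the idle environment.
 *   cur is the index of the previously running environment, or -1 if none.
 * The search starts just after cur, wraps around, and ends by considering
 * cur itself. Slot 0 is chosen only if no other slot is runnable. */
int demo_pick_next(const int *status, int nenv, int cur)
{
    int start = (cur < 0) ? 0 : cur;
    for (int step = 1; step <= nenv; step++) {
        int i = (start + step) % nenv;
        if (i == 0)
            continue;                 /* never prefer the idle environment */
        if (status[i] == DEMO_ENV_RUNNABLE)
            return i;
    }
    return 0;                         /* nothing else to run: go idle */
}
```

Because the search ends at cur, an environment that yields while it is the only runnable one is simply chosen again instead of handing the CPU to the idle environment.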
Exercise 8.
In kern/env.h , change the #define of
JOS_MULTIENV to 1 .
Then implement round-robin scheduling in sched_yield()
as described above,
and implement the crucial register state saving code
in env_run().
Modify kern/init.c to create two (or more!) environments
that all run the program user/yield.c.
You should see the environments
switch back and forth between each other
five times before terminating,
at which point the idle process runs
and invokes the JOS kernel debugger.
If this does not happen or the output looks wrong,
then fix your code before proceeding.
|
Question:
In your implementation of env_run() you should have
called lcr3() .
This loads the %cr3 register, and instantly changes the
addressing context used by the MMU. But virtual addresses, such as
e itself, have meaning relative to a given address context.
Why can the pointer e be dereferenced both before and after
the addressing switch?
|
Challenge!
Add a less trivial scheduling policy to the kernel,
such as a fixed-priority scheduler that allows each environment
to be assigned a priority
and ensures that higher-priority environments
are always chosen in preference to lower-priority environments.
If you're feeling really adventurous,
try implementing a Unix-style adjustable-priority scheduler
or even a lottery or stride scheduler.
(Look up "lottery scheduling" and "stride scheduling" in Google.)
Write a test or two
that verifies that your scheduling algorithm is working correctly
(i.e., the right environments get run in the right order).
|
Challenge!
The JOS kernel currently does not allow applications
to use the x86 processor's x87 floating-point unit (FPU),
MMX instructions, or Streaming SIMD Extensions (SSE).
Extend the Env structure
to provide a save area for the processor's floating point state,
and extend the context switching code
to save and restore this state properly
when switching from one environment to another.
The FXSAVE and FXRSTOR instructions may be useful,
but note that these are not in the old i386 user's manual
because they were introduced in more recent processors.
Write a user-level test program
that does something cool with floating-point.
|
System Calls for Environment Creation
Although your kernel is now capable of running and switching between
multiple user-level environments,
it is still limited to running environments
that the kernel initially set up.
You will now implement the necessary JOS system calls
to allow user environments to create and start
other new user environments.
Unix provides the fork() system call
as its process creation primitive.
Unix fork() copies
the entire address space of the calling process (the parent)
to create a new process (the child).
The only differences between the two observable from user space
are their process IDs and parent process IDs
(as returned by getpid and getppid ).
In the parent,
fork() returns the child's process ID,
while in the child, fork() returns 0.
The two processes do not share any memory: writes to one
process's memory do not appear in the other and vice versa.
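The Unix behavior just described can be observed with a short program for a POSIX system (this runs on a Unix host, not on JOS). The function name demo_fork_isolation is made up for the example; fork, waitpid and _exit are the standard POSIX calls.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 10 * (parent's copy of x) + (child's exit code), or -1 on error. */
int demo_fork_isolation(void)
{
    int x = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: fork() returned 0. This write changes only the child's copy. */
        x = 2;
        _exit(x);
    }
    /* Parent: fork() returned the child's process ID. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return 10 * x + WEXITSTATUS(status);
}
```

The function returns 12: the parent still sees x == 1 even though the child set its own copy to 2 and reported that value through its exit code.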
In JOS we will provide a different, much more primitive
set of system calls
for creating new user-mode environments.
With these system calls we will be able to implement
Unix-like fork() functionality entirely in user space,
in addition to other types of environment creation functionality.
The new system calls we will use in JOS are as follows:
- sys_exofork:
- This system call creates a new environment with an almost blank slate:
nothing is mapped in the user portion of its address space,
and it is not runnable.
The new environment will have the same register state as the
parent environment at the time of the
sys_exofork call.
In the parent, sys_exofork
will return the envid_t of the newly created
environment
(or a negative error code if the environment allocation failed).
In the child, however, it will return 0.
(Since the child starts out marked as not runnable,
sys_exofork will not actually return in the child
until the parent has explicitly allowed this
by marking the child runnable using....)
- sys_env_set_status:
- Sets the status of a specified environment
to ENV_RUNNABLE or ENV_NOT_RUNNABLE.
This system call is typically used
to mark a new environment ready to run,
once its address space and register state
have been fully initialized.
- sys_page_alloc:
- Allocates a page of physical memory
and maps it at a given virtual address
in a given environment's address space.
- sys_page_map:
- Copy a page mapping (not the contents of a page!)
from one environment's address space to another,
leaving a memory sharing arrangement in place
so that the new and the old mappings can both be used
to access the same page of physical memory.
- sys_page_unmap:
- Unmap a page mapped at a given virtual address
in a given environment.
In any of the system calls that accept environment IDs,
an envid_t value of 0 means "the current environment."
This convention is implemented by envid2env()
in kern/env.c.
We have provided a very primitive implementation
of Unix-like fork() functionality
in the test program user/dumbfork.c.
This test program uses the above system calls
to create and run a child environment
with a copy of its own address space.
The two environments
then switch back and forth using sys_yield
as in the previous exercise.
The parent exits after 10 iterations,
whereas the child exits after 20.
Exercise 9.
Implement the system calls described above
in kern/syscall.c.
You will need to use various functions
in kern/pmap.c and kern/env.c,
particularly envid2env().
Whenever you call envid2env(),
pass 1 in the checkperm parameter
to check permissions.
Be sure you check for any invalid system call arguments,
returning -E_INVAL in that case.
Test your JOS kernel with user/dumbfork
and make sure it works before proceeding.
|
Challenge!
Add the additional system calls necessary
to read all of the vital state of an existing environment
as well as set it up.
Then implement a user mode program that forks off a child process,
runs it for a while (e.g., a few iterations of sys_yield()),
then takes a complete snapshot or checkpoint
of the child process,
runs the child for a while longer,
and finally restores the child process to the state it was in
at the checkpoint
and continues it from there.
Thus, you are effectively "replaying"
the execution of the child process from an intermediate state.
Make the child process perform some interaction with the user
using sys_cgetc() or readline()
so that the user can view and mutate its internal state,
and verify that with your checkpoint/restart functionality
you can give the child process a case of selective amnesia,
making it "forget" everything that happened beyond a certain point.
|
This completes the lab.
Back to Advanced Operating Systems,
Fall 2004
|