
Advanced Operating Systems, Fall 2004

Lab 3: User Environments

Introduction

In this lab you will implement the basic kernel facilities required to get a protected user-mode environment (i.e., "process") running. You will enhance the JOS kernel to set up the data structures to keep track of user environments, create a single user environment, load a program image into it, and start it running. You will also make the JOS kernel capable of handling any system calls the user environment makes and handling any other exceptions it causes.

Note: In this lab, the terms environment and process are interchangeable -- they have roughly the same meaning. We introduce the term "environment" instead of the traditional term "process" in order to stress the point that JOS environments do not provide the same semantics as UNIX processes, even though they are roughly comparable.

Getting Started

Download our reference code for lab 3 from lab3.tar.gz and untar it, then merge your changes for lab2 into the lab3 source directory, as you did for Lab 2.

Lab 3 contains a number of new source files, which you should browse through as you merge them into your kernel:

`inc/`	`env.h`	Public definitions for user-mode environments
	`syscall.h`	Public definitions for system calls from user environments to the kernel
	`lib.h`	Public definitions for the user-mode support library
`kern/`	`env.h`	Kernel-private definitions for user-mode environments
	`env.c`	Kernel code implementing user-mode environments
	`sched.h`	Schedule multiple user environments
	`syscall.h`	Kernel-private definitions for system call handling
	`syscall.c`	System call implementation code
`lib/`	`Makefrag`	Makefile fragment to build user-mode library, `obj/lib/libuser.a`
	`entry.S`	Assembly-language entrypoint for user environments
	`libmain.c`	User-mode library setup code called from `entry.S`
	`syscall.c`	User-mode system call stub functions
	`console.c`	User-mode implementations of `putchar` and `getchar`, providing console I/O
	`exit.c`	User-mode implementation of `exit`
	`panic.c`	User-mode implementation of `panic`
`user/`	`*`	Various test programs to check lab 3 functionality

In addition, a number of the source files we handed out for lab2 are modified in lab3. To see the differences, you can type:

$ diff -ur lab2 lab3

Lab Requirements

This lab is divided into three parts. As in lab 2, you will need to do all of the regular exercises described in the lab and write up brief answers to the questions posed in the lab. Please attempt at least one challenge problem, although doing so is not required for this lab. (Also, the challenge problems for this lab aren't super exciting at the moment. More suggestions are welcome: send them to the class mailing list! Or go back and implement another challenge from an earlier lab.) If you do solve a challenge problem, provide a short (e.g., one or two paragraph) description of what you did to solve it. Place the write-up in a file called answers.txt (plain text) or answers.html (HTML format) in the top level of your lab3 directory before handing in your work.

Debugging tips

Bochs is a much more hospitable debugging environment than a real processor. Put it to work for you!

The vb command sets a breakpoint at a particular CS:EIP address. Since the kernel code segment selector is 8, vb 8:0xf0101234 sets a breakpoint at the given kernel address. Similarly, since the user segment selector is 0x1b, vb 0x1b:0x80020 sets a breakpoint at the given user address.

Finally, note that passing all the gmake grade tests does not mean your code is perfect. It may have subtle bugs that will only be tickled by future labs. In a perfect world, gmake grade would find all your bugs, but no one builds operating systems in a perfect world anyway. Keep in mind that debugging an operating system is a very holistic task -- there are abstraction boundaries, but you can't necessarily place much trust in them since nothing is really enforcing them. If you get all sorts of weird crashes that don't seem to be explainable by a single bug in the layer you're working on, it's likely that they're explainable by a single bug in a different layer.

Hand-In Procedure

As before, you can test your code against our test scripts by running gmake grade. When you are done, run gmake tarball to tar up your source, and send it to me in an email.

Part 1: User Environments

The new include file inc/env.h contains basic definitions for user environments in JOS. The kernel uses the Env data structure to keep track of critical data pertaining to each user environment. In this lab we will initially only actually create one environment, but we will need to design the JOS kernel to support multiple simultaneously active environments, because in lab 4 we will take advantage of this functionality by allowing a user environment to fork other environments.

As you can see in kern/env.c, the kernel maintains three main global variables pertaining to environments:

struct Env* envs = NULL;		/* All environments */
struct Env* curenv = NULL;	        /* the current env */
static struct Env_list env_free_list;	/* Free list */

Once JOS gets up and running, the envs pointer points to an array of Env structures representing all the environments in the system. In our design, the JOS kernel will support a maximum of NENV simultaneously active environments, although there will typically be far fewer running environments at any given time. (NENV is a constant #define'd in inc/env.h.) Once it is allocated, the envs array will contain a single instance of the Env data structure for each of the NENV possible environments.

The JOS kernel keeps all of the inactive Env structures on the env_free_list. This design allows efficient allocation and deallocation of environments, as they merely have to be added to or removed from the free list.

The kernel uses the curenv variable to keep track of the currently executing environment at any given time. During boot up, before the first environment is run, curenv is initially set to NULL.
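The free-list idea can be sketched in ordinary user-space C. This is only an illustration under stated assumptions: NENV is shrunk to 8, struct Env is reduced to two fields, the link is a single "next" pointer rather than the LIST_ENTRY pair from inc/queue.h, and the helper names free_list_init, env_slot_alloc, and env_slot_free are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define NENV 8   /* assumed small value for the demo; JOS defines NENV in inc/env.h */

enum { ENV_FREE, ENV_RUNNABLE, ENV_NOT_RUNNABLE };

struct Env {
    struct Env *env_link;   /* simplified: one "next" pointer, not a LIST_ENTRY */
    unsigned env_status;
};

struct Env envs[NENV];      /* one slot per possible environment */
struct Env *env_free_list;  /* head of the list of inactive slots */

/* Put every slot on the free list so that envs[0] is handed out first. */
void free_list_init(void)
{
    env_free_list = NULL;
    for (int i = NENV - 1; i >= 0; i--) {
        envs[i].env_status = ENV_FREE;
        envs[i].env_link = env_free_list;
        env_free_list = &envs[i];
    }
}

/* Pop one slot off the free list; returns NULL when all NENV slots are in use. */
struct Env *env_slot_alloc(void)
{
    struct Env *e = env_free_list;
    if (e == NULL)
        return NULL;
    env_free_list = e->env_link;
    e->env_link = NULL;
    e->env_status = ENV_RUNNABLE;
    return e;
}

/* Push a slot back on the free list. */
void env_slot_free(struct Env *e)
{
    assert(e != NULL && e->env_status != ENV_FREE);   /* catch double frees */
    e->env_status = ENV_FREE;
    e->env_link = env_free_list;
    env_free_list = e;
}
```

Both allocation and deallocation touch only the head of the list, which is why neither needs to scan the envs array.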

Environment State

The Env structure is defined in inc/env.h as follows (although more fields will be added in future labs):

struct Env {
        struct Trapframe env_tf;        // Saved registers
        LIST_ENTRY(Env) env_link;       // Free list link pointers
        envid_t env_id;                 // Unique environment identifier
        envid_t env_parent_id;          // env_id of this env's parent
        unsigned env_status;            // Status of the environment

        // Address space
        pde_t* env_pgdir;               // Kernel virtual address of page dir
        physaddr_t env_cr3;             // Physical address of page dir
};

We now briefly describe the state kept by the kernel for each user environment.

env_tf

Holds the current state of an environment's registers while that environment is not running: i.e., when the kernel or a different environment is running. The kernel saves the processor state into env_tf when switching from user to kernel mode, so that the environment can later be resumed where it left off. We first saw struct Trapframe in Lab 2. (How did we use it there?)

env_link

A pair of pointers allowing the Env to be placed on the env_free_list. See inc/queue.h for details.

env_id

An integer value that uniquely identifies the environment currently using this Env structure (i.e., using this particular slot in the envs array). After a user environment terminates, the kernel may subsequently re-allocate the same Env structure to a different environment, but the env_id will probably be different. (It might not be different if enough environments were created that the relevant counter wrapped.) The Env structure for envid_t e is located at envs[ENVX(e)] (unless environment e was killed, and the slot was reused in the meantime).

env_parent_id

The env_id of the environment that created this environment. In this way the environments form a "family tree," which will be useful for making security decisions about which environments are allowed to do what to whom.

env_status

This variable holds one of the following values:

ENV_FREE: The Env structure is inactive, and therefore on the env_free_list.
ENV_RUNNABLE: The Env structure represents a currently active environment, and the environment is waiting to run on the processor.
ENV_NOT_RUNNABLE: The Env structure represents a currently active environment, but it is not currently ready to run: for example, because it is waiting for an interprocess communication (IPC) from another environment.

env_pgdir

A virtual address pointer to this environment's page directory.

env_cr3

The corresponding physical address for this environment's page directory.
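To make the relationship between env_id and the slot index more concrete, here is a small sketch of one possible encoding. It assumes NENV is 1024 and that an id is a reuse counter placed above the slot index; the helper make_envid is hypothetical, and the real layout and the real ENVX macro are whatever inc/env.h defines. The layout was picked so that the first environment in slot 0 gets the id 00000400 that appears in the sample output later in this lab.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t envid_t;

#define LOG2NENV 10                    /* assumption: NENV = 1024 */
#define NENV     (1 << LOG2NENV)
#define ENVX(id) ((id) & (NENV - 1))   /* slot index = low LOG2NENV bits */

/*
 * Hypothetical helper: combine a per-slot reuse counter with the slot index.
 * The mask keeps the id non-negative, so the counter part eventually wraps.
 */
envid_t make_envid(uint32_t generation, uint32_t slot)
{
    assert(slot < NENV);
    return (envid_t)(((generation << LOG2NENV) | slot) & 0x7fffffffu);
}
```

Two environments that occupy the same slot at different times get different ids as long as the counter has not wrapped, yet ENVX maps both ids back to the same element of envs.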

Like a Unix process, a JOS environment couples the concepts of "thread", or processor and stack context, and "address space", or memory context. The thread is defined primarily by the saved registers (the env_tf field), and the address space is defined by the page directory and page tables pointed to by env_pgdir/env_cr3. To run an environment, the kernel must set up the CPU with both the saved registers and the appropriate address space.

In JOS, individual environments do not have their own kernel stacks as processes do in Linux and other conventional UNIXes. Instead, all JOS kernel code runs on a single kernel stack, and the kernel saves user-mode register state explicitly in each struct Env's env_tf rather than implicitly on the relevant process's kernel stack.

Allocating the Environments Array

In lab 2, you allocated memory in i386_vm_init() for the pages array, which is a table the kernel uses to keep track of which pages are free and which are not. You will now need to modify i386_vm_init() further to allocate a similar array of Env structures, called envs.

Exercise 1. Modify i386_vm_init() in kern/pmap.c to allocate and map the envs array. This array consists of exactly NENV instances of the Env structure, laid out consecutively in the kernel's virtual address space starting at address UENVS (defined in inc/pmap.h). The physical pages that these virtual addresses map to do not have to be contiguous, since the kernel only ever uses virtual addresses to access the envs array. You should be able to allocate and map this array in exactly the same way as you did for the pages array.
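As a rough illustration of the sizing involved, the following user-space sketch computes how many whole pages an array of NENV structures occupies. The value 1024 for NENV and the helper names are assumptions; the size of struct Env is passed in as a parameter because it depends on the fields your kernel defines.

```c
#include <assert.h>
#include <stddef.h>

#define PGSIZE 4096   /* x86 page size in bytes */
#define NENV   1024   /* assumed value; see inc/env.h */

/* Round n up to the next multiple of align. */
size_t round_up(size_t n, size_t align)
{
    assert(align > 0);
    return (n + align - 1) / align * align;
}

/* Number of pages needed to hold NENV structures of env_size bytes each. */
size_t envs_array_pages(size_t env_size)
{
    return round_up(NENV * env_size, PGSIZE) / PGSIZE;
}
```

For example, a 96-byte structure fills exactly 24 pages, while a 97-byte structure spills into a 25th page.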

Creating and Running Environments

You will now write the code in kern/env.c necessary to run a user environment. Because we do not yet have a filesystem, we will set up the kernel to load a static ELF executable image that is embedded within the kernel itself.

Once you integrate our Lab 3 code with your Lab 2 solutions, you will notice that our makefiles generate a number of binary images in the obj/user/ directory. Further, if you look at kern/Makefrag, you will notice some magic that takes all of these binaries and "links" them directly into the kernel executable as if they were .o files. The '-b binary' option on the linker command line causes these files to be linked in as "raw" uninterpreted binary files rather than as regular .o files produced by the compiler. (As far as the linker is concerned, these files do not have to be ELF images at all -- they could be anything, such as text files or pictures!) If you look at obj/kern/kernel.sym after building the kernel, you will notice that the linker has "magically" produced a number of funny symbols with obscure names like _binary_obj_user_hello_start, _binary_obj_user_hello_end, and _binary_obj_user_hello_size. The linker generated these symbol names simply by mangling the file names of these binary files; these magic symbols provide the regular kernel code with a way to reference the embedded binary files.

In this lab, the kernel will start up and run one of those binary images. The code to select a binary image is in kern/init.c. The grade script links different binary images into your kernel, to test different properties of your user environment handling. If you're not running the grade script, the kernel normally runs the hello program, defined in user/hello.c, which prints

hello, world!

in the old-school manner. You're free to run whatever binary you want, but don't change the version inside #ifdef TEST.

In kern/env.h you will find some macros that kern/init.c uses to load one of these binary images into a user environment via env_create and then run it via env_run. However, the critical functions to set up user environments are not complete; you will need to fill them in.

Exercise 2 (Long!). In the file env.c, finish coding the following functions:

env_init():: Initialize all of the Env structures in the envs array and add them to the env_free_list.
env_setup_vm():: Allocate a page directory for a new environment and initialize the kernel portion of the new environment's address space.
load_icode():: Parse an ELF binary image, much like the boot loader already does, and load its contents into the user address space of a new environment.
env_create():: Allocate an environment with env_alloc and call load_icode to load an ELF binary into it.
env_run():: Start a given environment running in user mode.

As you write these functions, you might find printf's new %e converter useful -- it prints a description corresponding to an error code. For example,

	r = -E_NO_MEM;
	panic("env_alloc: %e", r);

will panic with the message "env_alloc: out of memory".
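The lookup behind a converter like %e can be sketched as a table indexed by error number. Everything here except the "out of memory" text is an assumption made for the example: the numeric values of the error codes, the other description strings, and the function name error_description are not taken from the JOS sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>   /* strcmp, handy when checking the returned text */

/* Assumed numbering; the real codes are defined in the JOS headers. */
enum { E_UNSPECIFIED = 1, E_BAD_ENV, E_INVAL, E_NO_MEM, E_NO_FREE_ENV, MAXERROR };

static const char *const error_string[MAXERROR] = {
    [E_UNSPECIFIED] = "unspecified error",
    [E_BAD_ENV]     = "bad environment",
    [E_INVAL]       = "invalid parameter",
    [E_NO_MEM]      = "out of memory",
    [E_NO_FREE_ENV] = "out of environments",
};

/* Map an error code (given as E_x or as -E_x) to a description. */
const char *error_description(int r)
{
    int neg = r > 0 ? -r : r;   /* -|r|, which cannot overflow */
    if (neg <= -MAXERROR || error_string[-neg] == NULL)
        return "unknown error";
    return error_string[-neg];
}
```

Accepting both signs mirrors the usual kernel convention of returning negative error codes, as in r = -E_NO_MEM above.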

Once you are done you should compile your kernel and run it under Bochs. Below is a call graph of the code up to the point where the user code is invoked. Make sure you understand the purpose of each step.

start (kern/entry.S)
i386_init
	cons_init
	i386_detect_memory
	i386_vm_init
	page_init
	env_init
	idt_init
	env_create
	env_run
		env_pop_tf

At this point, Bochs will start running user/hello.c in user mode! To see how this happens, set a Bochs breakpoint at env_pop_tf, which should be the last function you hit before actually entering user mode. Step through this function; the processor should enter user mode after the iret instruction. (How can you tell?) You should then see the first instruction in the user environment's executable, which is the cmpl instruction at the label start in lib/entry.S. If you continue past this point, hello should run successfully until it first hits an int $48 instruction, which is what user-mode code executes to make a system call. (See lib/syscall.c to see how this works.) Then, your trap code from the previous lab should activate and kill the process! (We've changed trap() to handle uncaught user-mode exceptions by killing the offending environment.) If you cannot get to this point, then something is wrong with your address space setup or program loading code; go back and fix it before continuing.

If you run make grade at this point, you should pass the divzero, breakpoint, softint, and badsegment tests, and get 20 points.

Question:

Did you have to do anything to make the user/softint program behave correctly (i.e., generate a general protection fault, as the grade script expects)? Why is this the correct behavior? What happens if the kernel actually allows softint's int $14 instruction to invoke the kernel's page fault handler (which is interrupt number 14)?

Part 2: User-Level Exceptions and System Calls

Now, we'll update the exception handling support you added to the last lab, using it to provide important operating system functionality.

The Breakpoint Exception

In the last lab, you turned the breakpoint exception, interrupt number 3 (T_BRKPT), into a primitive debugging instruction that invokes the JOS kernel monitor. The user-mode implementation of panic() in lib/panic.c, for example, performs an int3 after displaying its panic message. Make sure at this point that this functionality works! The breakpoint user program tests it by invoking an int3 instruction.

Challenge Note: If you implemented the single-stepping challenge in Lab 2, you might want to verify that your code works on user-level programs too.

Question:

Executing int3 at user level might deliver a general protection fault to the kernel, rather than a breakpoint exception, depending on how you initialized the breakpoint entry in the IDT (i.e., your call to SETGATE from idt_init). What change would you make to cause user-level breakpoints to generate a GPF? Why does this functionality exist?

Page Faults

The page fault exception, interrupt number 14 (T_PGFLT), is a particularly important one that we will exercise heavily throughout this lab and the next. When the processor takes a page fault, it stores the linear address that caused the fault in a special processor control register, CR2. In trap.c we have provided the beginnings of a special function, page_fault_handler(), to handle page fault exceptions.

Exercise 3. Modify trap() to dispatch page fault exceptions to page_fault_handler(). You should now be able to get make grade to succeed on the faultread, faultreadkernel, faultwrite, and faultwritekernel tests. If any of them don't work, figure out why and fix them.

You will further refine the kernel's page fault handling below, as you implement system calls.

System Calls

User processes ask the kernel to do things for them by invoking system calls. When the user process invokes a system call, the processor enters kernel mode, the processor and the kernel cooperate to save the user process's state, and the kernel executes appropriate code in order to carry out the system call. When it's done, it resumes the user process.

The exact details of how the user process gets the kernel's attention and how it specifies which call it wants to execute vary from system to system. In the x86 kernel, we will use the int instruction, which causes a processor interrupt. In particular, we will use int $48 as the system call interrupt. We have defined the constant T_SYSCALL to 48 for you. You will have to set up the interrupt descriptor to allow user processes to cause that interrupt. Note that interrupt 48 cannot be generated by hardware, so there is no ambiguity caused by allowing user code to generate it.

In the x86 kernel, we will pass the system call number and the system call arguments in registers. This way, we don't need to grub around in the user environment's stack or instruction stream. The system call number will go in %eax, and the arguments (up to five of them) will go in %edx, %ecx, %ebx, %edi, and %esi, respectively. The kernel passes the return value back in %eax. The assembly code to invoke a system call has been written for you, in syscall() in lib/syscall.c. You should read through it and make sure you understand what is going on. You may also find it helpful to read inc/syscall.h.

Exercise 4. Add a handler in the kernel for interrupt number T_SYSCALL. You will have to edit kern/trapentry.S and kern/trap.c's idt_init(). You also need to change trap() to handle the system call interrupt by calling syscall() (defined in kern/syscall.c) with the appropriate arguments, and then arranging for the return value to be passed back to the user process in %eax.

Finally, you need to implement syscall() in kern/syscall.c; it should dispatch to one of the sys_ functions defined there. See inc/syscall.h for system call numbers. Make sure syscall() returns -E_INVAL if the system call number is invalid.

Run the hello program under your kernel. (Your kernel runs this program by default.) It should print "hello, world" on the console and then cause a page fault in user mode. If this does not happen, it probably means your system call handler isn't quite right. If the kernel doesn't appear to be receiving a system call interrupt, check your call to SETGATE: are the privileges right?

User-mode Environment Setup

Now, you'll fix the user-mode page fault in user/hello.c. At this point, this happens when the umain function tries to access env->env_id. The JOS library OS is supposed to set the global pointer env to point at the current environment's struct Env, in the read-only copy of the envs[] array you allocated in Part 1. This global pointer lets the process efficiently access its state. But currently the pointer is just null.

Exercise 5. JOS user programs start running at the top of lib/entry.S. Trace through, find the point where env should be set, and set it. Note that lib/entry.S has already defined envs to point at the UENVS mapping you set up in Part 1. Hint: You'll want to use a system call.

This is the first point in the lab where you test the user-level read-only mapping of envs[] at UENVS, so you may want to check your code from Part 1 if you have problems here. And don't forget that envid_ts aren't just indexes!

At this point, user/hello should print "hello, world", then "i am environment 00000400". It then attempts to "exit" by calling sys_env_destroy() (see lib/libmain.c and lib/exit.c). Since the kernel currently only supports one user environment, it should report that it has destroyed the only environment and then drop into the kernel monitor.

Page faults and memory protection

In this section of the lab, you'll begin refining JOS's response to user-level page fault exceptions, which happen when an application tries to access an invalid address or an address for which it has no permissions. Memory protection is a crucial operating system feature, since it can help the OS ensure that bugs in one program cannot corrupt other programs or the operating system itself.

On an invalid access, the processor stops the program at the instruction causing the fault and then traps into the kernel with information about the attempted operation. If the fault is fixable, the kernel can fix it and let the program continue running. If the fault is not fixable, then the program cannot continue, since it will never get past the instruction causing the fault.

As an example of a fixable fault, consider an automatically extended stack. In many systems the kernel allocates a single stack page, and then if a program faults accessing pages further down the stack, the kernel will allocate those pages automatically and let the program continue. By doing this, the kernel only allocates the memory that the program is going to use, but the program can work under the illusion that it has an arbitrarily large stack.

System calls present an interesting problem for memory protection. Most system call interfaces let user programs pass pointers to the kernel. These pointers point at user buffers to be read or written. The kernel then dereferences these pointers on behalf of the user while carrying out the system call. There are two problems with this:

A page fault in the kernel is taken a lot more seriously than a page fault in a user program. If the kernel page faults, that's usually a kernel bug, and the fault handler will panic the kernel (and hence the whole system). In a system call, when the kernel is dereferencing pointers to the user's address space, we need a way to remember that any page faults these dereferences cause are actually on behalf of the user program.
The kernel typically has more memory permissions than the user program. The user program might ask the kernel to read from or write to a location in kernel memory that the user program cannot access but that the kernel can. If the kernel is not careful, a buggy or malicious user program can trick the kernel into using its greater privilege in unintended ways, possibly so as to destroy the integrity of the kernel completely.

For both of these reasons the kernel must be extremely careful when handling pointers presented by user programs.

You will now need to implement solutions to these two problems in your kernel. To address the first problem, you will use a global variable page_fault_mode to let the fault handler know when the kernel is manipulating memory on behalf of the user environment. If a fault happens then, the user environment will be destroyed. (Otherwise, if a fault happens, the kernel should panic.)

Exercise 6. Change kern/trap.c's page fault handler as follows. If a page fault happens while in kernel mode, check the setting of page_fault_mode and act accordingly. The possible page fault modes are listed in kern/trap.h. If you destroy the current environment, print a message explaining the fault in the following format:

	printf("[%08x] PFM_KILL va %08x ip %08x\n", 
               curenv->env_id, fault_va, tf->tf_eip);

Hint: To determine whether a fault happened in user mode or in kernel mode, check the low bits of the tf_cs.
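The hint and the mode check can be sketched together as a pure decision function. The low two bits of a code segment selector hold the privilege level, so the kernel selector 8 and the user selector 0x1b mentioned in the debugging tips give 0 and 3 respectively. The name PFM_NONE and the fault_action values are assumptions made for this example; the actual modes are the ones listed in kern/trap.h.

```c
#include <assert.h>
#include <stdint.h>

enum fault_mode { PFM_NONE, PFM_KILL };   /* PFM_NONE is an assumed name */

/* Hypothetical outcomes, named only for this sketch. */
enum fault_action { ACT_USER_FAULT, ACT_DESTROY_ENV, ACT_PANIC };

/* Privilege level 3 in the low two bits of CS means the fault came from user mode. */
int fault_from_user(uint16_t tf_cs)
{
    return (tf_cs & 3) == 3;
}

/* Decide what a page fault should lead to, given the saved CS and the current mode. */
enum fault_action classify_fault(uint16_t tf_cs, enum fault_mode mode)
{
    assert(mode == PFM_NONE || mode == PFM_KILL);
    if (fault_from_user(tf_cs))
        return ACT_USER_FAULT;      /* ordinary user-level fault */
    if (mode == PFM_KILL)
        return ACT_DESTROY_ENV;     /* kernel was touching user memory on the env's behalf */
    return ACT_PANIC;               /* genuine kernel bug */
}
```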

Change kern/syscall.c to set the page fault mode correctly when handling the user pointer in sys_cputs. Make sure you reset the page fault mode when the code finishes handling the user pointer.

Change kern/init.c to run user/buggyhello instead of user/hello. Compile your kernel and boot it. The environment should be destroyed, and the kernel should not panic. You should see:

	[00000000] new env 00000400
	[00000400] PFM_KILL va 00000001 ip f010263d
	TRAP frame ...
	[00000400] free env 00000400
	Destroyed the only environment - nothing more to do!

(Your ip may be different but should begin f01.)

The check you just added protects against buggy environments that pass invalid pointers, but does not protect against evil environments that pass pointers to valid kernel memory. user/evilhello is one such program.

To address this second protection problem, you will "sanitize" all user pointers by using the TRUP macro ("TRanslate User Pointer") defined in kern/pmap.h. This macro will leave valid user pointers as is, but will translate all other pointers to ULIM, which will always cause a page fault when accessed.
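The effect of such a translation can be shown with a plain function over 32-bit addresses. This is a sketch of the idea rather than the macro from kern/pmap.h: the function name trup is made up, and the ULIM value 0xef800000 is inferred from the "va ef800000" line in the expected output below, so treat it as an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define ULIM 0xef800000u   /* assumed value, inferred from the expected output */

/* Leave addresses below ULIM alone; clamp everything else to ULIM. */
uint32_t trup(uint32_t va)
{
    uint32_t safe = va < ULIM ? va : ULIM;
    assert(safe <= ULIM);   /* the result never points above ULIM */
    return safe;
}
```

Clamping instead of rejecting means the kernel code that follows needs no extra error path: a bad pointer simply faults at ULIM, and the page fault mode from the previous exercise takes care of destroying the environment.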

Exercise 7. Change the definition of sys_cputs to protect itself against malicious user environments by using TRUP.

Change kern/init.c to run user/evilhello. Compile your kernel and boot it. The environment should be destroyed, and the kernel should not panic. You should see:

	[00000000] new env 00000400
	[00000400] PFM_KILL va ef800000 ip f010263d
	[00000400] free env 00000400

(Your ip may be different but should begin f01.)

Part 3: Creating User Environments and Cooperative Multitasking

Now, you'll implement some new JOS kernel system calls to allow user-level environments to create additional new environments. You will also implement cooperative round-robin scheduling, allowing the kernel to switch from one environment to another when the current environment voluntarily relinquishes the CPU (or exits). In the next lab you'll implement preemptive scheduling, which allows the kernel to re-take control of the CPU from an environment even if the environment does not cooperate.

Round-Robin Scheduling

Your first task in this lab is to change the JOS kernel so that it does not always just run the environment in envs[0], but instead can alternate between multiple environments in "round-robin" fashion. Round-robin scheduling in JOS works as follows:

The first environment, in envs[0], will from now on always be a special idle environment, which always runs the program user/idle.c. The purpose of this program is simply to "waste time" whenever the processor has nothing better to do - it just perpetually attempts to give up the CPU to another environment. Read the code and comments in user/idle.c for other useful details. kern/init.c will create this special idle environment in envs[0] before creating the first "real" environment in envs[1].
The function sched_yield() in the new kern/sched.c is responsible for selecting a new environment to run. It searches sequentially through the envs[] array in circular fashion, starting just after the previously running environment (or at the beginning of the array if there was no previously running environment), picks the first environment it finds with a status of ENV_RUNNABLE (see inc/env.h), and calls env_run() to jump into that environment. However, sched_yield() is aware that envs[0] is the special idle environment, and never picks it unless there are no other runnable environments.
User environments call the sys_yield() system call to invoke the kernel's sched_yield() function, and thereby voluntarily give up the CPU to a different environment. As you can see in user/idle.c, the idle process does this routinely.
Whenever the kernel switches from one environment to another via env_run(), it must save the old environment's register state so that it can be restored properly later when the first environment is eventually re-entered. (Why? Where is the user environment's register state stored on entry to a system call or trap handler in the kernel, and where does this state need to be stored in order to keep it safe for next time the environment runs?) There's a panic() in env_run() to point out where this needs to happen.
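The selection rule can be sketched as a function that returns the index of the slot to run next. It works on a bare array of status values rather than on real Env structures, NENV is shrunk to 4, and the name pick_next is hypothetical; the sketch also assumes the idle environment in slot 0 is always runnable.

```c
#include <assert.h>

#define NENV 4   /* small assumed value for the demo */

enum { ENV_FREE, ENV_RUNNABLE, ENV_NOT_RUNNABLE };

/*
 * status[] holds the env_status of each slot; prev is the index of the
 * previously running environment, or -1 if there was none.
 * Slot 0 is the idle environment and is chosen only as a last resort.
 */
int pick_next(const unsigned status[NENV], int prev)
{
    assert(prev >= -1 && prev < NENV);
    /* Visit every slot once, starting just after prev and wrapping around. */
    for (int step = 1; step <= NENV; step++) {
        int i = (prev + step) % NENV;
        if (i != 0 && status[i] == ENV_RUNNABLE)
            return i;
    }
    return 0;   /* nothing else is runnable: fall back to idle */
}
```

Because the loop runs for NENV steps, the previously running environment is itself considered last, so a lone runnable environment that yields is simply resumed.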

Exercise 8. In kern/env.h, change the #define of JOS_MULTIENV to 1. Then implement round-robin scheduling in sched_yield() as described above, and implement the crucial register state saving code in env_run().

Modify kern/init.c to create two (or more!) environments that all run the program user/yield.c. You should see the environments switch back and forth between each other five times before terminating, at which point the idle process runs and invokes the JOS kernel debugger. If this does not happen or the output looks wrong, then fix your code before proceeding.

Question:

In your implementation of env_run() you should have called lcr3(). This loads the %cr3 register, and instantly changes the addressing context used by the MMU. But virtual addresses, such as e itself, have meaning relative to a given address context. Why can the pointer e be dereferenced both before and after the addressing switch?

Challenge! Add a less trivial scheduling policy to the kernel, such as a fixed-priority scheduler that allows each environment to be assigned a priority and ensures that higher-priority environments are always chosen in preference to lower-priority environments. If you're feeling really adventurous, try implementing a Unix-style adjustable-priority scheduler or even a lottery or stride scheduler. (Look up "lottery scheduling" and "stride scheduling" in Google.)

Write a test or two that verifies that your scheduling algorithm is working correctly (i.e., the right environments get run in the right order).

Challenge! The JOS kernel currently does not allow applications to use the x86 processor's x87 floating-point unit (FPU), MMX instructions, or Streaming SIMD Extensions (SSE). Extend the Env structure to provide a save area for the processor's floating point state, and extend the context switching code to save and restore this state properly when switching from one environment to another. The FXSAVE and FXRSTOR instructions may be useful, but note that these are not in the old i386 user's manual because they were introduced in more recent processors. Write a user-level test program that does something cool with floating-point.

System Calls for Environment Creation

Although your kernel is now capable of running and switching between multiple user-level environments, it is still limited to running environments that the kernel initially set up. You will now implement the necessary JOS system calls to allow user environments to create and start other new user environments.

Unix provides the fork() system call as its process creation primitive. Unix fork() copies the entire address space of the calling process (the parent) to create a new process (the child). The only differences between the two observable from user space are their process IDs and parent process IDs (as returned by getpid and getppid). In the parent, fork() returns the child's process ID, while in the child, fork() returns 0. The two processes do not share any memory: writes to one process's memory do not appear in the other and vice versa.
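The fork() semantics described above can be observed on any POSIX system with a short program. This sketch is ordinary Unix user code, not JOS code, and the function name fork_isolation_demo is made up for the example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that modifies its own copy of a variable, wait for it,
 * and return the parent's copy. Returns -1 if anything goes wrong.
 */
int fork_isolation_demo(void)
{
    int value = 1;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                    /* child: fork() returned 0 */
        value = 42;                    /* changes only the child's copy */
        _exit(value == 42 ? 0 : 1);
    }
    assert(pid > 0);                   /* parent: fork() returned the child's pid */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return value;                      /* the child's write is not visible here */
}
```

The parent still sees the original value after the child exits, which is the "no shared memory" property that a user-space fork() built from the JOS system calls below has to reproduce by copying pages.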

In JOS we will provide a different, much more primitive set of system calls for creating new user-mode environments. With these system calls we will be able to implement Unix-like fork() functionality entirely in user space, in addition to other types of environment creation functionality. The new system calls we will use in JOS are as follows:

sys_exofork:: This system call creates a new environment with an almost blank slate: nothing is mapped in the user portion of its address space, and it is not runnable. The new environment will have the same register state as the parent environment at the time of the sys_exofork call. In the parent, sys_exofork will return the envid_t of the newly created environment (or a negative error code if the environment allocation failed). In the child, however, it will return 0. (Since the child starts out marked as not runnable, sys_exofork will not actually return in the child until the parent has explicitly allowed this by marking the child runnable using....)
sys_env_set_status:: Sets the status of a specified environment to ENV_RUNNABLE or ENV_NOT_RUNNABLE. This system call is typically used to mark a new environment ready to run, once its address space and register state have been fully initialized.
sys_page_alloc:: Allocates a page of physical memory and maps it at a given virtual address in a given environment's address space.
sys_page_map:: Copy a page mapping (not the contents of a page!) from one environment's address space to another, leaving a memory sharing arrangement in place so that the new and the old mappings can both be used to access the same page of physical memory.
sys_page_unmap:: Unmap a page mapped at a given virtual address in a given environment.

In any of the system calls that accept environment IDs, an envid_t value of 0 means "the current environment." This convention is implemented by envid2env() in kern/env.c.
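A lookup that follows this convention might look like the sketch below. It is a user-space simulation with several assumptions: NENV is shrunk to 8, struct Env keeps only three fields, the numeric value of E_BAD_ENV is made up, and the permission rule (the target must be the current environment or one of its direct children) is an assumed policy. The function is called lookup_env to make clear that it is not the real envid2env().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t envid_t;

#define NENV      8                     /* assumed small value for the demo */
#define ENVX(id)  ((id) & (NENV - 1))
#define E_BAD_ENV 2                     /* assumed numeric value */

enum { ENV_FREE, ENV_RUNNABLE, ENV_NOT_RUNNABLE };

struct Env {
    envid_t env_id;
    envid_t env_parent_id;
    unsigned env_status;
};

struct Env envs[NENV];
struct Env *curenv;

/* Translate an id into an Env pointer; id 0 means the current environment. */
int lookup_env(envid_t id, struct Env **out, int checkperm)
{
    assert(out != NULL);
    if (id == 0) {
        *out = curenv;
        return 0;
    }
    struct Env *e = &envs[ENVX(id)];
    /* The slot may be free, or reused by a newer environment with another id. */
    if (e->env_status == ENV_FREE || e->env_id != id) {
        *out = NULL;
        return -E_BAD_ENV;
    }
    /* Assumed policy: only the environment itself or its parent may act on it. */
    if (checkperm && e != curenv && e->env_parent_id != curenv->env_id) {
        *out = NULL;
        return -E_BAD_ENV;
    }
    *out = e;
    return 0;
}
```

Comparing the stored env_id against the requested id is what catches a stale id whose slot has since been reused.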

We have provided a very primitive implementation of Unix-like fork() functionality in the test program user/dumbfork.c. This test program uses the above system calls to create and run a child environment with a copy of its own address space. The two environments then switch back and forth using sys_yield as in the previous exercise. The parent exits after 10 iterations, whereas the child exits after 20.

Exercise 9. Implement the system calls described above in kern/syscall.c. You will need to use various functions in kern/pmap.c and kern/env.c, particularly envid2env(). Whenever you call envid2env(), pass 1 in the checkperm parameter to check permissions. Be sure you check for any invalid system call arguments, returning -E_INVAL in that case. Test your JOS kernel with user/dumbfork and make sure it works before proceeding.

Challenge! Add the additional system calls necessary to read all of the vital state of an existing environment as well as set it up. Then implement a user mode program that forks off a child process, runs it for a while (e.g., a few iterations of sys_yield()), then takes a complete snapshot or checkpoint of the child process, runs the child for a while longer, and finally restores the child process to the state it was in at the checkpoint and continues it from there. Thus, you are effectively "replaying" the execution of the child process from an intermediate state. Make the child process perform some interaction with the user using sys_cgetc() or readline() so that the user can view and mutate its internal state, and verify that with your checkpoint/restart functionality you can give the child process a case of selective amnesia, making it "forget" everything that happened beyond a certain point.

This completes the lab.
