CS 354 Fall 2020

Lab 2: Interrupt Handling in XINU and Trapped System Call Implementation (235 pts)

Due: 09/30/2020 (Wed.), 11:59 PM

1. Objectives

The objectives of this lab are to understand XINU's basic interrupt handling on x86 Galileo backends for synchronous interrupts, where the source of the interrupt is either an exception (also called a fault) or the software interrupt instruction int. We will then use the int instruction to code XINU system calls as trapped calls, following the classical method for implementing system calls in Linux and Windows on x86 computers.

2. Readings

  1. XINU set-up
  2. Chapters 3 and 4 from the XINU textbook.

When working on lab2:

For the written components of the problems below, please write your answers in a file, lab2answers.pdf, and put it under lab2/. You may use any word processing software as long as it can export content as pdf files using standard fonts. Written answers in any other format will not be accepted.

Please use a fresh copy of XINU, xinu-fall2020.tar.gz, with two changes: preserve the myhello() function from lab1 and remove all code related to xsh from main() (i.e., you are starting with an empty main()). Hence NPROC and the priority of the null process remain at their original values 100 and 0. Note that main() serves as an app for your own testing purposes. The TAs when evaluating your code will use their own main() to test your XINU kernel modifications.

3. Custom handling of divide-by-zero exception [70 pts]

The consequence of a process performing a divide-by-zero ALU operation is for the CPU to generate an interrupt to inform the operating system. By default, the operating system will terminate the process with the understanding that dividing by zero is a bug and continuing the process's computation is not meaningful. In this problem, we will customize the handling of the divide-by-zero exception by XINU on our x86 backends.

3.1 Triggering divide-by-zero exception through software interrupt

Review of IDT setup in XINU. x86 generates a divide-by-zero exception or fault, which is interrupt number 0, if an instruction tries to divide a number by zero. For example, C code

int main() { int x; x = x / 0; }
when translated into machine code will trigger divide-by-zero exception. As discussed in the lectures, the first entry in IDT set up by XINU during system initialization (the same goes for Linux and Windows) for interrupt number 0 will contain a function pointer to an interrupt handler that x86 will jump to. In XINU, this is assembly code at _Xint0 in intr.S (under system/). Before doing so, x86 will save EFLAGS, CS, and EIP by pushing their values onto the current process's run-time stack. For some exceptions an additional error code may be pushed after EIP. This is not the case for divide-by-zero. Focusing only on control flow, the legacy (i.e., existing) XINU _Xint0 code jumps to assembly code at Xtrap which then calls C function trap() in evec.c. trap() hangs and does not return to its caller. When there are bugs in your XINU code, this is where you are likely to end up.

Another way to generate interrupt number 0 is by executing the software interrupt instruction, int, with operand 0. That is, instead of the C statement x = x / 0 that is translated by gcc into machine code that actually performs division by 0, the instruction, int $0, is executed without performing actual division. One way is to write an assembly function that contains, int $0, and call the function from C function main() as in lab1. A second, and more direct, method is to embed the assembly code into C code using inline assembly asm(). For example,

int main() { asm("int $0"); }
will also result in jumping to _Xint0 after pushing EFLAGS, CS, EIP onto the stack. In XINU, IDT has been configured so that all interrupts, including exceptions, do not automatically disable interrupts and no switching of run-time stack is performed. Interrupt disabling is the programmer's responsibility.

The main difference between performing a divide-by-0 instruction versus software interrupt, int $0, is that instruction iret (after making sure ESP points to the address in main memory where EIP was pushed by x86) jumps back to the same instruction that caused the exception (i.e., divide by zero) in the former. In the latter, iret jumps to the instruction following int $0. Thus in the former, if _Xint0 executes iret then x86 jumps to the same instruction that divided by zero which immediately causes a second divide-by-zero exception. And a third, fourth, ... . When writing interrupt handling code, it is important to check where iret returns to.
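
The difference in return addresses can be pictured with a small host-side C model. This is only a sketch, not XINU code: the helper names are hypothetical and the addresses are made up. The one hardware fact it relies on is that int with an 8-bit operand (such as int $0) is encoded in two bytes.

```c
#include <stdint.h>

/* Hypothetical model, not XINU code: which EIP does x86 save? */

/* Fault (e.g., divide error): the saved EIP is the address of the
 * faulting instruction itself, so iret re-executes it. */
uint32_t saved_eip_fault(uint32_t instr_addr, uint32_t instr_len)
{
    (void)instr_len;            /* length does not matter for a fault */
    return instr_addr;
}

/* Software interrupt (int n): the saved EIP is the address of the
 * instruction that follows int n, so iret continues after it. */
uint32_t saved_eip_softint(uint32_t instr_addr, uint32_t instr_len)
{
    return instr_addr + instr_len;
}
```

For an instruction at the made-up address 0x1000, saved_eip_softint(0x1000, 2) yields 0x1002 for int $0, while saved_eip_fault(0x1000, 2) yields 0x1000, which is why an unmodified iret after a real divide by zero keeps re-raising the exception.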

Implementation of new handler _Xint0a for interrupt number 0. Instead of modifying the legacy _Xint0 which handled interrupt number 0, add a new handler _Xint0a which will be installed in IDT as the first entry in place of _Xint0. To do so, place the assembly code of _Xint0a starting with directive .globl _Xint0a above the code of _Xint0 in intr.S. Implement _Xint0a so that it calls interrupt handler, void divzero(void), coded in C in divzero.c which outputs a message (e.g., "you are dividing by zero") using kprintf().

You may disable interrupts in _Xint0a before calling divzero by executing cli, or alternatively by disabling interrupts inside divzero() by invoking disable(). The same applies when restoring interrupts. Try both and choose what works. Note that divzero() is a lower half kernel function per our discussion in the lectures. To install _Xint0a in place of _Xint0 in IDT, go toward the end of intr.S where defevec is initialized and replace _Xint0 with _Xint0a. defevec is a 1-D array of function pointers that is used by initevec() calling set_evec() in evec.c to initialize the entries of IDT so that they point to interrupt handlers in intr.S. Also remember from lab1 that new functions added to system/ that are called from other functions must be added to prototypes.h, and header file xinu.h must be included in all XINU code. Do not put _Xint0a in prototypes.h as it is not treated as a function by gcc.
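
The role of defevec can be pictured with a host-side C model: a table of function pointers indexed by interrupt number, where installing a handler means overwriting one slot. This is a sketch under assumptions, not the XINU implementation. All names ending in _model are hypothetical stand-ins; the real table lives in intr.S and is consumed by initevec() and set_evec().

```c
#define NVEC_MODEL 48                 /* XINU initializes 48 IDT entries */

typedef void (*handler_t)(void);

int last_handler = -1;                /* records which stand-in ran */

void xint0_model(void)  { last_handler = 0; }   /* stands in for _Xint0  */
void xint0a_model(void) { last_handler = 1; }   /* stands in for _Xint0a */

/* Slot 0 initially holds the legacy handler; other slots are empty. */
handler_t defevec_model[NVEC_MODEL] = { xint0_model };

/* Replace the handler for interrupt number inum; returns 0 on success. */
int install_model(int inum, handler_t h)
{
    if (inum < 0 || inum >= NVEC_MODEL)
        return -1;
    defevec_model[inum] = h;
    return 0;
}

/* Model of the CPU vectoring through the table. Returns the id of the
 * stand-in handler that ran, or -1 if the slot is empty or out of range. */
int raise_model(int inum)
{
    if (inum < 0 || inum >= NVEC_MODEL || !defevec_model[inum])
        return -1;
    last_handler = -1;
    defevec_model[inum]();
    return last_handler;
}
```

In the model, calling install_model(0, xint0a_model) plays the part of editing the first element of defevec: afterwards raise_model(0) reaches the new handler instead of the legacy one.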

Testing. Verify that your implementation of _Xint0a that handles interrupt number 0 works correctly by putting, asm("int $0"), in main() followed by kprintf(). If everything works correctly, int $0 will trigger interrupt number 0 which will push EFLAGS, CS, EIP onto the current process's stack and jump to _Xint0a. _Xint0a calls divzero() which prints a message, returns back to _Xint0a. Make sure interrupts are re-enabled (in _Xint0a or divzero()) and ESP points to the correct address before executing iret which pops the stored EIP, CS, EFLAGS values from the stack and restores their registers before jumping to the instruction following int $0 in main(). The TAs will use their own main() function to test your code.

3.2 Triggering divide-by-zero exception through divide by zero instruction

In this variation of 3.1, main() will actually perform a divide by zero operation, for example, through C statement, x = x / 0, as discussed above. Test your modified interrupt handling implementation of 3.1 with the new test app main(). Discuss in lab2answers.pdf what you find.

Modify the implementation of 3.1 for handling interrupt number 0 so that actually dividing by 0 to generate the interrupt does not result in an infinite loop. To do so, consider what you must do to make _Xint0a not jump back to the same instruction that generated the divide-by-zero exception in the first place. That is, as in 3.1, assume you want to jump to the instruction that follows the one that caused division by zero. Explain your method for achieving this goal in lab2answers.pdf.

Instead of modifying _Xint0a, add new assembly code _Xint0b above the code of _Xint0a in intr.S. Then modify the first element of 1-D array defevec in intr.S so that it points to _Xint0b. That is, you are installing _Xint0b in IDT as the handler for interrupt number 0. When evaluating your code, the TAs will modify the first entry of defevec in intr.S to change the configuration of IDT as needed. Verify that the modified main() that performs a division by zero operation resumes execution at the next instruction after your interrupt number 0 XINU kernel code, _Xint0b, handles the divide-by-zero exception. For example, calling kprintf() to output a message after performing a divide by 0 operation should now work.

Hint: When inducing x86 to jump to the next instruction following the divide-by-zero instruction that generated interrupt 0, use trial-and-error to determine what address works upon executing iret. Keep in mind that the Galileo backends are 32-bit machines where memory access following 4 byte boundaries reduces CPU cycles.

4. New XINU system call: uptimesec() [25 pts]

Implement a new XINU system call, uint32 uptimesec(void), in uptimesec.c under system/ that returns the number of seconds that have elapsed since XINU was bootstrapped on a backend. XINU keeps track of this quantity in global variable, uint32 clktime, declared in include/clock.h. clktime is maintained by XINU's clock interrupt handling code clkdisp in clkdisp.S which is installed as the 33rd entry of IDT by clkinit() during system initialization. clkinit() is called after the other entries of IDT (XINU uses 48 out of the 256 total) are initialized by initevec(). clkdisp calls C function clkhandler() in clkhandler.c which is the lower half kernel function that actually maintains clktime.

XINU system calls follow a pattern of first disabling interrupts by calling disable() and remembering the interrupt mask. After system call chores are completed, just before returning, interrupts are restored to their previous state by calling restore() with the saved mask, which re-enables interrupts if they were enabled on entry. Do the same when coding uptimesec(). It is a simple system call since it takes no arguments and returns an unsigned int value. Test uptimesec() by calling it from main() and verify that it works correctly. The TAs will use their own code to test uptimesec().
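
The disable()/restore() pattern can be sketched as follows. So that the sketch compiles on its own, the types, the clktime variable, and the disable()/restore() stubs below are stand-ins for what xinu.h provides in the real kernel, and intr_enabled is a hypothetical variable that models the CPU's interrupt-enable state. Only the body of uptimesec() is meant to resemble kernel code.

```c
/* Stand-ins for definitions that come from xinu.h in the real kernel. */
typedef unsigned int uint32;
typedef unsigned int intmask;

uint32 clktime = 0;        /* seconds since boot; clkhandler() updates it in XINU */
int intr_enabled = 1;      /* hypothetical: models the interrupt-enable state */

intmask disable(void)      /* stub: disable interrupts, return previous state */
{
    intmask old = (intmask)intr_enabled;
    intr_enabled = 0;
    return old;
}

void restore(intmask mask) /* stub: put back the saved state */
{
    intr_enabled = (int)mask;
}

/* Sketch of the system call: read clktime with interrupts disabled. */
uint32 uptimesec(void)
{
    intmask mask;
    uint32 secs;

    mask = disable();      /* remember the previous interrupt state */
    secs = clktime;        /* system call chore */
    restore(mask);         /* return to the state found on entry */
    return secs;
}
```

Because restore() puts back whatever state disable() found, uptimesec() leaves interrupts disabled if its caller had already disabled them.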

5. Trapped XINU system calls [140 pts]

As discussed in the lectures and evident from inspection of any XINU system call code, all XINU system calls are regular C function calls. That is, there is no special trap instruction that switches a process from user mode to kernel mode and jumps to a kernel function in the upper half of the kernel to carry out the requested service. A system call in XINU is not a wrapper function that acts as a gateway to a kernel function that actually performs the system call task. It is the actual kernel function. Isolation/protection is a core feature of general-purpose operating systems such as Linux and Windows which is necessary for reliability. It is essential for security. Not supporting isolation/protection, in computing environments where it makes sense, is the exception to the rule. In this problem, we will re-implement two XINU system calls -- legacy system call chprio() and new system call uptimesec() -- as trapped versions using a traditional technique followed by Linux and Windows for x86 machines.

5.1 Traditional system call trap using int instruction in Linux and Windows

The traditional approach followed by Linux and Windows on x86 to implement trapped system calls is to use software interrupt, i.e., int instruction, to transition from user mode to kernel mode and jump to kernel code in the upper half that performs the requested service in kernel mode. To return back to user mode after the task has been completed, the iret instruction is executed which jumps back to user code and continues where the process left off. For example, in Linux the fork() system call is a wrapper function (part of the C standard library) that performs a trap, and sys_fork() is the kernel function that calls other kernel functions to accomplish the requested task of spawning a child process. When the wrapper function fork() traps, i.e., executes the int instruction, it does not directly jump to sys_fork() but goes through an intermediary -- the system call dispatcher -- that determines which system call service is being requested and jumps to the relevant kernel function sys_fork(). The traditional method for system call trap using int/iret has been supplanted by sysenter/sysexit instructions in Intel x86 CPUs which are optimized and faster than the traditional method. We will use the traditional method which is more transparent and pedagogically useful.

5.2 XINU system call dispatcher _Xint38

From Problem 3.1, we know that the operand of int specifies the interrupt number. In Linux, interrupt number 128 is used to install the system call dispatcher in IDT. In Windows, interrupt number 46 is used. System calls are numbered 0, 1, 2, ... to identify them inside the kernel. Register EAX is used to pass the system call number from a system call wrapper function to system call dispatcher. In XINU, we will use interrupt number 38 to install a system call dispatcher. We will repurpose _Xint38 in intr.S by modifying its code to perform the chores of a system call dispatcher. Interrupt numbers 0, 1, ..., 31 are reserved in x86 for exceptions (i.e., faults), and XINU uses interrupt number 32 to service clock interrupts. We could use interrupt number 33 to install our system call dispatcher in IDT but will use 38 instead.

Arguments of a system call, if any, are communicated from system call wrapper function to system call dispatcher through registers and stack. uptimesec() does not have arguments. For chprio(), we will pass the second argument, pri16 newprio, from wrapper function through register EBX. The first argument, pid32 pid, will be passed by pushing its value onto the stack. Upon executing int $38 in the wrapper function, x86 will push EFLAGS, CS, EIP on the stack and jump to _Xint38. Wrapper functions will communicate to _Xint38 which system call is being called through EAX. For uptimesec(), 6 will be placed in EAX and system call number 7 will identify chprio(). Any other number passed in EAX will be considered invalid by the system call dispatcher which will place -1 in EAX. Wrapper functions will inspect EAX when iret from _Xint38 jumps back to the wrapper function to check if a system call failed.

After checking that the value in EAX passed by a system call wrapper function is valid, _Xint38 will push system call arguments (none for uptimesec() and two for chprio()) onto the stack and conditionally jump to uptimesec() or chprio(). Note that we will reuse existing XINU system calls which are regular C functions as internal kernel functions that carry out the requested task. For example, in Linux internal kernel function do_fork() carries out the actual work of spawning a child process. Since uptimesec() and chprio() are translated into machine code by gcc following CDECL, when calling them from _Xint38 you must follow CDECL as well so that caller/callee interaction works correctly. Moreover, uptimesec() and chprio() will return values in EAX per CDECL. Since the return values from uptimesec() and chprio() must be communicated back to their wrapper functions, _Xint38 must not modify EAX before executing iret. The wrapper functions will be coded so that they expect return values in EAX. Of course, wrapper functions themselves will return to their caller (e.g., app function main()) with return value stored in EAX.
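
The dispatcher's decision logic can be sketched as a host-side C model. This is not the assembly you will write in intr.S. It only mirrors the convention described above: system call number in EAX, newprio in EBX, pid on the stack, result or -1 back in EAX. Every name ending in _model is hypothetical, the process table is a four-entry toy, and uptimesec_model() returns a fixed value purely for demonstration.

```c
#include <stdint.h>

#define SYS_UPTIMESEC 6      /* system call numbers from the text */
#define SYS_CHPRIO    7
#define NPROC_MODEL   4      /* tiny process table for the model */

/* Only the registers the dispatcher convention uses. */
struct regs_model {
    int32_t eax;             /* in: system call number, out: return value */
    int32_t ebx;             /* in: second argument of chprio (newprio)   */
};

int16_t prio_model[NPROC_MODEL] = { 0, 20, 20, 20 };

uint32_t uptimesec_model(void) { return 123; }   /* fixed value for the demo */

/* Stand-in for chprio(): returns the old priority, or -1 for a bad pid. */
int16_t chprio_model(int32_t pid, int16_t newprio)
{
    int16_t old;

    if (pid < 0 || pid >= NPROC_MODEL)
        return -1;
    old = prio_model[pid];
    prio_model[pid] = newprio;
    return old;
}

/* Model of _Xint38: validate EAX, call the kernel function, leave the
 * result in EAX. stacked_pid models the pid pushed by the wrapper. */
void xint38_model(struct regs_model *r, int32_t stacked_pid)
{
    switch (r->eax) {
    case SYS_UPTIMESEC:
        r->eax = (int32_t)uptimesec_model();
        break;
    case SYS_CHPRIO:
        r->eax = chprio_model(stacked_pid, (int16_t)r->ebx);
        break;
    default:
        r->eax = -1;         /* invalid system call number */
        break;
    }
}
```

In the model, a wrapper would fill in eax (and ebx for chprio), call xint38_model() where the real code executes int $38, and then read eax to obtain the return value or detect the -1 failure indication.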

5.3 Three software layers for implementing trapped system calls

When implementing trapped system calls you need to handle three layers of software. First, system call wrapper functions that act as an API for app programmers to invoke kernel services. This will be discussed in 5.4. Second, system call dispatcher code (in our case _Xint38 in intr.S) to which system call wrapper functions trap. This was discussed in 5.2. Third, the internal kernel functions that actually perform requested kernel services. In our case, they are the existing XINU system call functions uptimesec() and chprio() that are called by _Xint38. The third layer is coded in C (uptimesec() and chprio()), the second layer (_Xint38) is coded in assembly, and the first layer discussed in 5.4 will be coded in a mix of C and assembly. The interfaces between the three software layers must be carefully implemented so that the resultant interaction works correctly.

5.4 System call wrapper functions: general design

In Linux, fork() is a system call wrapper function (part of the C standard library, e.g., glibc on our frontend lab machines) that acts as the system call API for app programmers and traps to the system call dispatcher which then calls internal kernel functions to carry out the work of spawning a child process. In XINU, you will implement two wrapper functions, uint32 xuptimesec(void), in xuptimesec.c, and pri16 xchprio(pid32, pri16), in xchprio.c, both under system/. An app programmer will call xuptimesec() to invoke the kernel service uptimesec(), and similarly for xchprio() to change the priority of a process. uptimesec() and chprio() which are regular C functions will be repurposed as internal kernel functions. For example, an app process in XINU running

int main() { pri16 x; x = xchprio(getpid(),5); }
will request of XINU to change its priority to 5 through system call API xchprio(), not internal kernel function chprio(). In the case of Linux, most of the work for creating a child process is carried out by internal kernel function do_fork() which is not meant to be called directly by a user process. Since Linux is open source, there is nothing that prevents an app programmer from copying the code of do_fork() into a function, say my_fork(), and calling my_fork() directly in user mode. It will not successfully complete since some operations performed by my_fork() (a copy of do_fork()) are privileged and being in user mode will not allow execution of those operations.

The job of a wrapper function is to do some preparatory work before executing the trap instruction to switch to kernel mode and eventually reach the relevant kernel function of the upper half where the requested service -- if valid -- is carried out in kernel mode. After returning from the trap back to user mode, final clean-up chores may need to be performed by a system call wrapper function including passing along a return value from the internal kernel function to the app function that called the system call wrapper function. For example, in the case of xuptimesec(), the number of seconds since last bootstrap of a backend will be returned by internal kernel function uptimesec() to its caller, the system call dispatcher _Xint38. _Xint38 before executing iret to return to xuptimesec() must ensure that the return value received from uptimesec() is communicated back to xuptimesec(). In our case, through register EAX.

5.5 System call wrapper function: xchprio()

We will consider two methods of implementing a system call wrapper function. The first method is coding an assembly function that is called from the wrapper function which is coded in C. These are the techniques practiced in lab1. The assembly function will then execute instruction, int $38, to trap to _Xint38. The second method, instead of calling a function coded in assembly, embeds assembly code (which includes trap instruction int $38) in the wrapper function's C code. To embed means that the assembly code is directly merged with the assembly code of the C wrapper function produced by gcc. That is, there is no function call needed from the wrapper function's translated assembly code to the assembly code containing the trap instruction. Thus, the second method, called inline assembly, is more efficient and faster since the overhead associated with making a function call is eliminated. Keep in mind the discussion in the lectures that function calls, even with stack management hardware support (e.g., ESP and EBP registers, push and pop instructions), are slower compared to the same task achieved without making a function call. Since system calls may be frequently called by app processes, making system calls efficient is important. Hence inline assembly, all else being equal, is preferred. Improved trap instruction sysenter uses hardware support to further reduce system call overhead. We will code xuptimesec() using inline assembly in 5.6. We will implement xchprio() by coding an assembly function that xchprio() calls which, then, causes a trap to _Xint38.

Code the assembly function, pri16 trapchprio(pid32, pri16), that wrapper function xchprio() calls in trapchprio.S. Since xchprio() is translated by gcc using CDECL, code trapchprio() so that it follows CDECL as well. This is what you practiced in lab1. Thus user process running main() calls xchprio() which calls trapchprio(). trapchprio() traps to _Xint38 which calls chprio(). Then the return journey from chprio() up the chain to xchprio() and its caller main(). At every step, make sure to be aware of what is happening to the stack and relevant registers, following CDECL so that you and gcc are on the same page.

5.6 System call wrapper function: xuptimesec()

We will use inline assembly to code xuptimesec() thereby reducing overhead. We encountered a simple form of inline assembly in 3.1 where we embedded assembly instruction, int $0, inside C function main() using asm("int $0") to generate a software interrupt for the divide-by-zero exception. Except in very simple cases that require no communication of values between C and embedded inline assembly code, we will need to use a richer framework for embedding assembly code in C code called extended inline assembly. Although xuptimesec() has no arguments, it returns a value of type unsigned int, that must be assigned to a C variable so that it can be utilized by the surrounding C code of xuptimesec(). In general, extended assembly may involve passing of arguments and execution of multiple instructions which require careful specification following extended asm() syntax so that the embedded assembly code does not conflict with the surrounding assembly code generated by gcc upon translating the C code of xuptimesec(). Potential conflicts include use of specific registers by the inline assembly code that overwrite, or are overwritten by, assembly code generated by gcc when translating C code.

In the case of extended inline assembly code inserted in xuptimesec() C code, before executing instruction, int $38, we will need to store the system call number 6 which identifies xuptimesec() in register EAX needed by _Xint38. After executing the trap, when _Xint38 returns to the next instruction of xuptimesec() by executing iret, we may need to copy the return value contained in EAX to a local C variable of xuptimesec() if further operations on the returned value are performed by xuptimesec(). If not, leaving the value untouched in EAX may suffice since the CDECL caller/callee interface between main() and xuptimesec() expects xuptimesec() to put its return value in EAX. In general, extended inline assembly follows the format

asm(assembler template
: output operands
: input operands
: clobbered registers);

The last component, clobbered registers, is a list of registers that the extended inline assembly code may be modifying. When embedding the extended inline assembly code, this information will be used to save/restore, or avoid if possible, these registers in the surrounding assembly code generated by gcc from the C code of xuptimesec(). For our purposes, it is best to start with simple examples with references (html) serving as a more comprehensive guide as needed. For example, assuming system call number for xuptimesec() is stored in local C variable, int sysnum, and the value returned by _Xint38 in EAX is to be copied to local C variable, int retval, asm()'s input operands would specify "a"(sysnum) meaning that the value in C variable sysnum should be stored in EAX. This comprises the input part. For output operands of asm(), we would specify "=a"(retval) meaning that the content of EAX is to be copied to C variable retval. Another simple test you can perform is to code Problem 3.4 of lab1 not as an assembly function addtwo() in addtwo.S but as extended inline assembly code embedded in main(). Coding of extended inline assembly for CS354 is minimal and requires no special expertise beyond the prerequisite background of CS354. Verify that xuptimesec() works correctly by calling it from main().
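
As a warm-up that involves no trap at all, the "a" and "=a" constraints can be tried on a plain addition. The function add_inline() below is a hypothetical example and is only assumed to be in the spirit of the lab1 exercise. It uses the __asm__ spelling of asm so that it also compiles under strict -std options, and it falls back to plain C when not built with gcc or clang for x86.

```c
/* Hypothetical warm-up: add two ints, doing the addition in EAX via
 * extended inline assembly when compiled by gcc/clang for x86. */
int add_inline(int a, int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int sum;

    __asm__("addl %2, %0"       /* EAX = EAX + b (AT&T syntax)             */
            : "=a"(sum)         /* output: EAX is copied to sum            */
            : "a"(a), "r"(b)    /* inputs: a in EAX, b in any register     */
            : "cc");            /* clobber: addl modifies condition flags  */
    return sum;
#else
    return a + b;               /* portable fallback on other targets */
#endif
}
```

The trap in xuptimesec() follows the same shape, with int $38 as the template, the system call number as the "a" input, and the return value as the "=a" output.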

5.7 Trapped system call implementation remains in kernel mode

At this point you may have noticed that trapped system calls xuptimesec() and xchprio() follow the general framework for implementing trapped system calls in Linux and Windows with one caveat: when trapping by executing, int $38, in system call wrapper functions to _Xint38 via x86's IDT, we did not switch stacks. Nor did we concern ourselves with the CPL bits of the CS register which specify whether the current process runs in kernel mode (00) or user mode (11). When discussing software support for isolation/protection we emphasized that a per-process kernel stack is needed to perform nested kernel function calls in kernel mode. For lab2 we are preserving XINU's basic architecture. Although XINU operates in protected mode (i.e., PE bit of CR0 register is set to 1 during boot loading), all processes created using system call create() start out in kernel mode, stay in kernel mode, and end in kernel mode. As discussed in the lectures, XINU's GDT is initialized with three entries for kernel code segment, kernel data segment, and kernel stack segment that are pointed to by CS, DS, and SS, respectively. We noted that the stack segment is not necessary as SS can be made to point to the same entry as DS which is how Linux and Windows configure GDT. The key difference of Linux and Windows is that their GDT contains two additional entries: one for user code segment and another for user data segment. Newly created processes start out with CS pointing to the user code segment of GDT and DS and SS pointing to the user data segment of GDT. In XINU, registers CS, DS, SS always point to the three entries whose DPL is set to 00 specifying they are kernel mode entries.
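
The CPL occupies the two low-order bits of the selector held in CS; bit 2 selects GDT versus LDT, and the remaining bits index the descriptor table. The host-side helpers below decode those fields. The function names are hypothetical, and the selector values mentioned afterwards are samples chosen only for illustration, not values read from XINU.

```c
#include <stdint.h>

/* Privilege level encoded in a segment selector: bits 0-1. */
unsigned cpl_from_selector(uint16_t sel)
{
    return sel & 0x3u;
}

/* Table indicator: bit 2 (0 = GDT, 1 = LDT). */
unsigned ti_from_selector(uint16_t sel)
{
    return (sel >> 2) & 0x1u;
}

/* Descriptor table index: bits 3-15. */
unsigned index_from_selector(uint16_t sel)
{
    return (unsigned)(sel >> 3);
}
```

A sample selector 0x08 decodes to index 1 in the GDT with privilege level 0 (kernel mode), whereas a sample selector 0x73 decodes to index 14 with privilege level 3 (user mode).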

Adding user mode code and data segment entries in XINU's GDT is not difficult. However, it requires acquiring background on x86 including hardware supported stack switching and hardware supported context manipulation. Our implementation of xuptimesec() and xchprio() in lab2 will perform system call trapping using x86's interrupt hardware support following the traditional method used by Linux and Windows. However, all XINU processes will start off in kernel mode and stay in kernel mode throughout their existence. You will benefit from understanding how trapped system calls are architected and implemented without the added complications of managing CS, DS, SS registers and dual stacks that depend on idiosyncratic features of x86 hardware support.

Bonus problem [25 pts]

Extend the trapped XINU system calls to include getpid(). Place the wrapper function xgetpid() in xgetpid.c and use extended inline assembly to trap to the system call dispatcher. Assign xgetpid() system call number 9. To prevent potential issues when extending XINU to include xgetpid(), save a copy that does not include xgetpid() and maintain the extended kernel including xgetpid() separately. You can use Git, RCS, or any number of version control tools to keep track of your code. Saving earlier versions in separate folders will work as well. Test that your implementation works correctly.

Note: The bonus problem provides an opportunity to earn extra credits that count toward the lab component of the course. It is purely optional.

Turn-in instructions

General instructions:

When implementing code in the labs, please maintain separate versions/copies of code so that mistakes such as unintentional overwriting or deletion of code are prevented. This is in addition to the efficiency that such organization provides. You may use any version control system such as Git or RCS. Please make sure that your code is protected from public access. For example, when using Git, use git that manages code locally instead of its on-line counterpart github. If you prefer not to use version control tools, you may just use manual copy to keep track of different versions required for development and testing. More vigilance and discipline may be required when doing so.

The TAs, when evaluating your code, will use their own test code (principally main()) to drive your XINU code. The code you put inside main() is for your own testing and will, in general, not be considered during evaluation.

If you are unsure what you need to submit in what format, consult the TA notes link. If it doesn't answer your question, ask during PSOs and office hours which are scheduled M-F.

Specific instructions:

1. Format for submitting written lab answers and kprintf() added for testing and debugging purposes in kernel code:

2. Before submitting your work, make sure to double-check the TA Notes to ensure that any additional requirements and instructions have been followed.

3. Electronic turn-in instructions:

        i) Go to the xinu-fall2020/compile directory and run "make clean".

        ii) Go to the directory where lab2 (containing xinu-fall2020/ and lab2answers.pdf) is a subdirectory.

                For example, if /homes/alice/cs354/lab2/xinu-fall2020 is your directory structure, go to /homes/alice/cs354

        iii) Type the following command

                turnin -c cs354 -p lab2 lab2

You can check/list the submitted files using

turnin -c cs354 -p lab2 -v

Please make sure to disable all debugging output before submitting your code.
