CS 354 Spring 2021

Lab 2: Interrupt Handling in x86 XINU and Trapped System Call Implementation (250 pts)

Due: 03/03/2021 (Wed.), 11:59 PM

1. Objectives

The objectives of this lab are to understand XINU's basic interrupt handling on x86 Galileo backends for synchronous interrupts, where the source of the interrupt is either an exception (also called fault) or the software interrupt instruction int. We will then utilize the int instruction to code XINU system calls as trapped calls following the classical method for implementing system calls in Linux and Windows on x86 computers.

2. Readings

  1. XINU set-up
  2. Chapters 3 and 4 from the XINU textbook.

When working on lab2:

For the written components of the problems below, please write your answers in a file, lab2ans.pdf, and put it under lab2/. You may use any word processing software as long as it is able to export content as pdf files using standard fonts. Written answers in any other format will not be accepted.

Please use a fresh copy of XINU, xinu-spring2021.tar.gz, except for preserving the helloworld() function from lab1 and removing all code related to xsh from main() (i.e., you are starting with an empty main()). As noted before, main() serves as an app for your own testing purposes. When evaluating your code, the TAs will use their own main() to test your XINU kernel modifications.

3. Custom handling of synchronous x86 invalid opcode exception [75 pts]

3.1 What is invalid opcode exception

Although gcc is unlikely to generate machine code that triggers an invalid opcode exception, which is a synchronous interrupt (i.e., "synchronous" in the sense that an instruction of the current process is the cause), writing assembly code or embedding assembly in C code can result in interrupt number 6 if one is not careful. One case is accessing x86 control registers CR0-CR8, some of which are reserved. For example, embedding

asm("movl %ebx, %cr1");

into C code that tries to copy the content of EBX to CR1 will result in interrupt number 6. The operation as a whole comprised of opcode and operand is considered invalid. Trying to use a general-purpose register such as %EAF which does not exist is detected by gcc. CR1 exists but is not used by x86, a condition that is not checked by gcc.

Modifying the content of control registers is part of the task performed by operating system kernels and associated system tools. For example, during bootloading x86's protected mode that separates kernel mode/user mode is enabled by setting the PE bit (first bit) of CR0 to 1. In the graduate version of CS354, CS503, the last lab assignment involves a 5-week virtual memory lab where paging -- a topic we will discuss in the second half of the course -- is enabled by students by setting the PG bit (last bit of CR0) to 1. Our XINU version runs with paging disabled, i.e., PG = 0.
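The bit positions mentioned above can be written down as masks. The following is a minimal sketch that only inspects a 32-bit value holding a copy of CR0; the macro and function names are hypothetical and are not part of XINU.

```c
#include <stdint.h>

/* Bit positions of two CR0 flags on 32-bit x86 (names are hypothetical). */
#define CR0_PE (1u << 0)   /* protection enable: bit 0, the "first bit" */
#define CR0_PG (1u << 31)  /* paging: bit 31, the "last bit" */

/* Returns 1 if the given copy of CR0 has protected mode enabled. */
static int cr0_protected_mode(uint32_t cr0)
{
    return (cr0 & CR0_PE) != 0;
}

/* Returns 1 if the given copy of CR0 has paging enabled. */
static int cr0_paging(uint32_t cr0)
{
    return (cr0 & CR0_PG) != 0;
}
```

For a value such as 0x00000001, cr0_protected_mode() reports 1 and cr0_paging() reports 0, which matches the configuration described for our XINU version (PE = 1, PG = 0).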

For most synchronous interrupts, when an operating system is informed of the interrupt by hardware, the default action is to terminate the offending process. For some interrupts, an operating system may provide processes the ability to deal with the interrupt without terminating the process. This will be the subject of lab5 in this course which corresponds to signal handling in Linux/UNIX. A similar feature also exists in Windows. In the default XINU, interrupt number 6 on our x86 Galileo backends results in termination of the current process. In Problem 3, we will modify how XINU deals with interrupt number 6 which involves understanding the interrupt handling interface between hardware and software, and coding software so that x86 does what we want it to do.

3.2 Modifying how XINU deals with synchronous interrupts

Two methods of triggering interrupt number 6. We will consider two methods for triggering the invalid opcode exception. The first method we have already encountered in 3.1, where the cause is an invalid operation, "movl %ebx, %cr1", in which the MOV instruction tries to access CR1. In the second method, we use the x86 instruction int, called software interrupt, to generate the invalid opcode exception. From C code we embed

asm("int $6");

which causes instruction int to generate interrupt number 6. A software interrupt is a type of synchronous interrupt. It is a useful tool when programming operating systems. We will study one of its uses in Problem 5.

Review of IDT setup in XINU. XINU's interface to x86's interrupts is through a data structure, IDT (interrupt descriptor table), through which hardware and software interface. IDT is set up by XINU (same goes for Linux and Windows running on x86), and x86 consults IDT to determine which kernel code to run when an interrupt is generated. The seventh entry of IDT (interrupt numbers start at 0) is configured to contain a function pointer to XINU's interrupt handling code, _Xint6, for dealing with interrupt number 6. That is, when interrupt number 6 is generated x86 will jump to _Xint6 whose code is located in system/intr.S. Inspecting intr.S you will find that in the default XINU _Xint6 will jump to Xtrap (also in intr.S) which then calls C function trap() located in system/evec.c. Xtrap, coded in assembly, passes two arguments to trap() following CDECL. The first argument is the interrupt number (_Xint6 knows it was interrupt number 6 since it is executing), and the second argument is the PID of the current process. trap() outputs the interrupt number, PID, and a number of other state variables which can be helpful when debugging run-time errors. You will get very familiar with trap(). Then the current process terminates and XINU's scheduler picks a ready process of highest priority to run next.

Review of x86 interrupt handling in XINU. Our goal is to modify _Xint6 so that it does not call Xtrap but returns control (i.e., jumps) back to the process that triggered the invalid opcode exception. In particular, we want _Xint6 to jump to the instruction of the current process that follows the "movl %ebx, %cr1" instruction which caused the synchronous interrupt. With Linux signal handlers, we would want _Xint6 to jump to user code -- i.e., a signal handler -- that the programmer has asked Linux to jump to when this particular interrupt occurs. Linux does not support signal handling for interrupt number 6, but that is unimportant for the purpose of this exercise. To jump back to the user code of the process that executed "movl %ebx, %cr1", we need to know the address to jump to, which requires hardware support. x86 provides this support by pushing onto the stack of the current process the content of the EFLAGS, CS, and EIP registers, in that order. For some interrupts, an error code may be pushed after EIP to convey additional information. The first register, EFLAGS, contains state information of the current process which must be saved onto the process's stack since kernel interrupt handling code (i.e., _Xint6) may corrupt the bits of EFLAGS. Before jumping back to the code of the interrupted process, the original content of EFLAGS must be restored. The same goes for CS which specifies whether the process was running in user mode/kernel mode before generating the interrupt. The most important piece is the program counter, EIP, which contains the address to jump back to. For some interrupts, the return address pushed onto the stack by x86 is the address of the instruction following the instruction that caused the interrupt. For other interrupts, the return address takes us back to the same instruction that triggered the interrupt. For the invalid opcode exception it is the latter.
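Since the stack grows toward lower addresses and EIP is pushed last, EIP ends up at the top of the stack when the handler starts running. A minimal sketch of that layout as a C struct, for an interrupt that pushes no error code and does not switch stacks; the struct name intr_frame is hypothetical and does not appear in XINU.

```c
#include <stddef.h>
#include <stdint.h>

/* Values x86 leaves on the stack on entry to a handler such as _Xint6,
 * listed from the lowest address (top of stack) upward.
 * Hypothetical name; assumes no error code and no stack switch. */
struct intr_frame {
    uint32_t eip;     /* saved return address, at 0(%esp) */
    uint32_t cs;      /* saved code segment selector, at 4(%esp) */
    uint32_t eflags;  /* saved flags, at 8(%esp) */
};

/* Byte offsets relative to ESP at handler entry. */
enum {
    FRAME_EIP_OFF    = offsetof(struct intr_frame, eip),
    FRAME_CS_OFF     = offsetof(struct intr_frame, cs),
    FRAME_EFLAGS_OFF = offsetof(struct intr_frame, eflags)
};
```

Any values the handler itself pushes afterwards (for example, saved registers) sit below this frame, so the offsets to EIP, CS, and EFLAGS grow accordingly.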

Returning from x86 interrupts. The reason for the bifurcation of the return address is that for some instructions repeating the instruction that caused the interrupt is the appropriate action to take. For example, in the case of a page fault mentioned above which, roughly, means information required to run the next instruction is not in main memory and must be fetched from disk (we will discuss the details later in the course under memory management), the instruction failed to execute and must be re-executed after the missing information becomes available. Hence in the case of a page fault exception (interrupt number 14), the return address EIP pushed by x86 is the address of the same instruction that caused the page fault in the first place. The invalid opcode exception behaves the same way. Thus if we reduce the code of _Xint6 to the single instruction, iret, which pops and restores the saved EIP, CS, EFLAGS from the stack, the process will just end up repeating the same instruction, "movl %ebx, %cr1", again, resulting in an infinite loop. Hence whatever code we put after this instruction will never be executed. Verify that this is indeed the case. First, check that legacy _Xint6 causes trap() to be called and state information to be output before terminating. Include this output in your write-up lab2ans.pdf under lab2/. Second, modify _Xint6 so that it executes iret only. Insert code after "movl %ebx, %cr1" in main() to verify that it is never reached. Add code to XINU's kernel to confirm that this change to _Xint6 is causing an infinite loop. Explain in lab2ans.pdf how you do that. For any XINU source code that you modified create a directory v1/ under lab2/ and place the file in v1/. For example, if you modified intr.S to test that an infinite loop occurs, place the modified file intr.S in v1/. By copying intr.S to system/ the TAs can verify correctness of your code.

Implementing new handler _Xint6i that returns to next instruction when executing "int $6". The problem of modifying _Xint6 so that it returns to the instruction following "movl %ebx, %cr1" and does not result in an infinite loop will be tackled in 3.3. Here we will make use of software interrupt, asm("int $6"), to generate interrupt number 6 which returns to the instruction following int $6 upon returning from the handler by executing iret. Instead of modifying legacy code _Xint6 which handled interrupt number 6, add a new handler _Xint6i which will be installed in IDT as the seventh entry in place of _Xint6. To do so, place the assembly code of _Xint6i starting with directive .globl _Xint6i above the code of _Xint6 in intr.S. Code _Xint6i so that it calls interrupt handler, void opcodeinv(int x), coded in C in opcodeinv.c which outputs x along with a suitable text message using kprintf(). Remember to include new function definitions in include/prototypes.h. Do not put _Xint6i in prototypes.h as it is not treated as a function by gcc.

You may disable interrupts in _Xint6i before calling opcodeinv by executing cli and, redundantly, by also disabling interrupts inside opcodeinv() by invoking disable(). The reverse applies when restoring interrupts. Verify that this works. Note that directly executing cli inside opcodeinv() upon entry (e.g., asm("cli")) and executing sti before returning from opcodeinv() will not work correctly. Explain in lab2ans.pdf why this is the case. Inspect the code of disable() and restore(), and explain in lab2ans.pdf why calling disable() and restore() inside opcodeinv() works. Ignore the fact that disabling interrupts inside opcodeinv() is unnecessary since _Xint6i has already done so before calling opcodeinv(). An additional item you need to consider is that you are calling a C function opcodeinv() from assembly code _Xint6i. Since you can't be sure what machine code gcc will translate opcodeinv() into -- e.g., will the assembly code of opcodeinv() utilize registers such as EAX, EBX? -- you need to follow the rules of CDECL and safeguard/restore state information that the caller is responsible for. That is, some registers the caller is responsible for safeguarding/restoring, other registers are the responsibility of the callee per CDECL. Make sure you explicitly take care of this issue. At the level of software engineering for operating systems, we do not leave anything to chance.

To install _Xint6i in place of _Xint6 in x86's IDT, go toward the end of intr.S where defevec is initialized and replace _Xint6 with _Xint6i. defevec is a 1-D array of function pointers that is used by initevec() calling set_evec() in evec.c to initialize the entries of IDT so that they point to interrupt handling code in intr.S. We are modifying XINU's default set-up of IDT. Verify that your implementation of _Xint6i that handles interrupt number 6 works correctly by putting, asm("int $6"), in main() followed by kprintf(). Since generating interrupt number 6 via software interrupt causes x86 to push the address of the instruction following "int $6" onto the stack, executing iret from _Xint6i will not jump back to int $6. That is, int $6 will generate interrupt number 6 which will push EFLAGS, CS, EIP (of the next instruction) onto the current process's stack and jump to _Xint6i. _Xint6i calls opcodeinv() which prints a message, returns back to _Xint6i. After returning from opcodeinv(), make sure that ESP points to the correct address in the stack before executing iret which atomically pops the stored EIP, CS, EFLAGS values from the stack and restores their registers before jumping to the instruction following "int $6" in main(). The TAs will use their own main() to test your code.
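The role of defevec as a table of handler addresses indexed by interrupt number can be modeled in plain C. The sketch below is only a user-space model: the handlers are ordinary C functions standing in for _Xint6 and _Xint6i, and the names vectors, install_handler, and fire are hypothetical. The real table lives in intr.S and is consumed by initevec() and set_evec().

```c
#include <stddef.h>

#define NVEC 48  /* number of IDT entries XINU initializes, per this handout */

typedef void (*handler_t)(void);

static int last_handler;  /* records which stand-in handler ran last */

static void xint6_legacy(void) { last_handler = 1; }  /* stand-in for _Xint6  */
static void xint6i_new(void)   { last_handler = 2; }  /* stand-in for _Xint6i */

static handler_t vectors[NVEC];  /* stand-in for defevec */

/* Install handler h for interrupt number inum; returns 0 on success, -1 otherwise. */
static int install_handler(int inum, handler_t h)
{
    if (inum < 0 || inum >= NVEC || h == NULL)
        return -1;
    vectors[inum] = h;
    return 0;
}

/* Model of the hardware consulting the table when interrupt inum is generated. */
static int fire(int inum)
{
    if (inum < 0 || inum >= NVEC || vectors[inum] == NULL)
        return -1;
    vectors[inum]();
    return 0;
}
```

Calling install_handler(6, xint6i_new) after install_handler(6, xint6_legacy) models the replacement of _Xint6 by _Xint6i: a subsequent fire(6) reaches the new handler only.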

3.3 Returning to the next instruction of "movl %ebx, %cr1"

Upon triggering invalid opcode exception via instruction "movl %ebx, %cr1", our goal is to make changes to _Xint6 before executing iret so that we jump to the instruction following it. Hence we do not enter into an infinite loop. Logically, this means that we will need to modify the EIP value that was pushed onto the stack by x86 by increasing its value so that it becomes the address of the instruction following "movl %ebx, %cr1". Unlike ARM processors that have fixed instruction lengths, doing so in x86 is more involved due to instructions being of variable length. Instead of trying to become an expert on x86 instruction format which is not necessary, use trial-and-error by adding different values to the return address on the stack to find an offset that works.
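The adjustment itself is a single addition to the saved EIP. A minimal sketch in C that operates on a simulated stack: the function name skip_instruction is hypothetical, and the offset is deliberately left as a parameter since finding a working value is part of the exercise. In intr.S the same effect would be achieved with an assembly instruction acting on the word at the top of the stack.

```c
#include <stdint.h>

/* esp points at the top of the interrupted process's stack on handler entry,
 * where x86 stored the saved EIP (interrupt 6 pushes no error code).
 * Adding offset makes iret resume that many bytes past the faulting
 * instruction's first byte. Hypothetical helper, for illustration only. */
static void skip_instruction(uint32_t *esp, uint32_t offset)
{
    esp[0] += offset;  /* only the saved EIP changes; CS and EFLAGS stay intact */
}
```

If the handler pushes anything before making the adjustment, the saved EIP is no longer at the top of the stack and the index must account for the extra words.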

When testing that your offset works or not, it is useful to add assignment statements right after asm("movl %ebx, %cr1") in main() and output their values by calling kprintf(). For example, you may initialize integer variable x = 0 before executing movl, then set x = 3 after executing movl. If your offset correctly identifies the address of the next instruction following "movl %ebx, %cr1" calling kprintf() thereafter should output 3 for x. An offset too small or large will result in unexpected behavior. Although we are using trial and error to identify the address of the next instruction, this is done in a well-defined and controlled manner which is sufficient for our purpose. If we wanted to write code that returned to the next instruction of any instruction that can generate an invalid opcode exception, we would build a table of all such instructions (opcode and operands) and their lengths, inspect the content of the return address pushed by x86 to determine (i.e., parse) the instruction and consult the table. For our purpose of understanding and implementing operating systems this is of tertiary relevance and not necessary.

Place this version of intr.S containing the modified _Xint6 in v2/ under lab2/. Explain in lab2ans.pdf what values you tried that worked, what values did not work, and how you performed testing to arrive at your answer.


4. New XINU system call: clkticks() [25 pts]

Implement a new XINU system call, uint32 clkticks(void), in system/clkticks.c that returns the number of clock interrupts that XINU has serviced since bootloading on a backend machine. Note that when cli disables external interrupts on x86, this includes clock interrupts. XINU's hardware clock that is used as its system timer is configured to generate an interrupt every 1 msec. This time interval is also referred to as the tick. When discussing clock management under device management later in the course, we will encounter kernels that do not have a fixed tick. XINU's clock interrupt handling software consists of two parts: clkdisp in clkdisp.S which is installed as the 33rd entry of IDT (interrupt number 32), and clkhandler() in clkhandler.c which is called by clkdisp. clkinit() inserts clkdisp as entry 32 of IDT during system initialization. clkinit() is called after the other entries of IDT (XINU uses 48 out of the 256 total) have been initialized by initevec(). To support clkticks(), create a new global variable, uint32 xclockticks, and declare it in the same file where global variable clktime is declared. clktime keeps track of XINU's uptime in units of seconds. Make changes to XINU kernel code as necessary to support clkticks(). Specify in lab2ans.pdf all the files that you modified or added to support the new system call.

When coding clkticks(), note that XINU system calls follow a template of first disabling interrupts by calling disable() and remembering the interrupt mask. After system call chores are completed, and just before returning, interrupts are restored to their previous state by calling restore(). Do the same when coding clkticks(). Test clkticks() by calling from main() along with system call sleepms() which sleeps for a given number of milliseconds. This is only an approximate method to gauge correctness. A more advanced method uses a very accurate clock supported by x86 called TSC (time stamp counter). XINU's getticks() function in system/getticks.c uses assembly instruction rdtsc (read TSC) to query this 64-bit counter. We will make do with utilizing sleepms().
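The disable()/restore() template described above can be sketched as follows. This is a user-space model under stated assumptions, not XINU code: disable() and restore() are simple stand-ins that track a single flag, intr_enabled and clock_tick() are hypothetical names, and where xclockticks is actually incremented in the kernel is left for you to decide.

```c
#include <stdint.h>

typedef uint32_t uint32;
typedef uint32_t intmask;

static int intr_enabled = 1;         /* stand-in for the interrupt flag of EFLAGS */
static volatile uint32 xclockticks;  /* counts clock interrupts since boot */

/* Stand-in for XINU's disable(): turn interrupts off, return the prior state. */
static intmask disable(void)
{
    intmask old = (intmask)intr_enabled;
    intr_enabled = 0;
    return old;
}

/* Stand-in for XINU's restore(): put back whatever state disable() reported. */
static void restore(intmask mask)
{
    intr_enabled = (int)mask;
}

/* Stand-in for the per-tick bookkeeping on the clock interrupt path. */
static void clock_tick(void)
{
    xclockticks++;
}

/* System call template: disable, do the work, restore, return. */
uint32 clkticks(void)
{
    intmask mask = disable();
    uint32 ticks = xclockticks;
    restore(mask);
    return ticks;
}
```

Note that restore() puts back the state that was in effect before the call, so clkticks() leaves interrupts disabled if its caller had them disabled.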


5. Trapped XINU system calls [150 pts]

As discussed in the lectures and evident by inspection of any XINU system call code, all XINU system calls are regular C function calls. That is, there is no special trap instruction that switches a process from user mode to kernel mode and jumps to a kernel function in the upper half of the kernel to carry out the requested service. A system call in XINU is not a wrapper function that acts as a gateway to a kernel function that actually performs the system call task. It is the actual kernel function. Isolation/protection is a core feature of commodity operating systems such as Linux and Windows which is necessary for reliability. It is essential for security. Not supporting isolation/protection, in computing environments where it makes sense, is the exception to the rule. In this problem, we will re-implement two XINU system calls -- resume() and yield() -- into trapped versions using a traditional technique followed by Linux and Windows for x86 machines. Recall resume() readies a process, for example, one that has been created and is in a suspended state. yield() is a way for a process to yield the CPU to another process by having XINU call its scheduler. If the only other ready process is the NULL process, yield() will not succeed and return immediately.

5.1 Traditional system call trap using int instruction in Linux and Windows

The traditional approach followed by Linux and Windows on x86 to implement trapped system calls is to use software interrupt, i.e., the int instruction, to transition from user mode to kernel mode and jump to kernel code in the upper half that performs the requested service in kernel mode. To return back to user mode after the task has been completed, the iret instruction is executed which jumps back to user code and continues where the process left off. For example, in Linux the fork() system call is a wrapper function (part of the C standard library) that performs a trap, and sys_fork() is the kernel function that calls other kernel functions to accomplish the requested task of spawning a child process. When the wrapper function fork() traps, i.e., executes the int instruction, it does not directly jump to sys_fork() but goes through an intermediary -- the system call dispatcher -- that determines which system call service is being requested and jumps to the relevant kernel function sys_fork(). The traditional method for system call trap using int/iret has been supplanted by the sysenter/sysexit instructions in Intel x86 CPUs, which are optimized and faster than the traditional method. We will use the traditional method which is more transparent and pedagogically useful.

5.2 XINU system call dispatcher _Xint46

From Problem 3.2, we know that the operand of int specifies the interrupt number. In Linux, interrupt number 128 is used to install the system call dispatcher in IDT. In Windows, interrupt number 46 is used. System calls are numbered 0, 1, 2, ... to identify them inside the kernel. In Linux and Windows, register EAX is used to pass the system call number from a system call wrapper function to the system call dispatcher. We will use EBX instead. In XINU, we will use interrupt number 46 to install a system call dispatcher. We will repurpose _Xint46 in intr.S by modifying its code to perform the chores of a system call dispatcher. Interrupt numbers 0, 1, ..., 31 are reserved in x86 for exceptions, and XINU uses interrupt number 32 to service clock interrupts. We could use interrupt number 33 to install our system call dispatcher in IDT but will use 46 instead.

Arguments of a system call, if any, are communicated from system call wrapper function to system call dispatcher through registers and stack. yield() does not have arguments. For resume(), we will pass its argument, pid32 pid, from wrapper function by pushing it on the stack. Upon executing "int $46" in the wrapper function, x86 will push EFLAGS, CS, EIP on the stack and jump to _Xint46. Wrapper functions will communicate to _Xint46 which system call is being called through EBX. For yield(), 9 will be placed in EBX and system call number 3 will identify resume(). Any other number passed in EBX will be considered invalid by the system call dispatcher which will put -1 in EAX. By CDECL wrapper functions coded in C will inspect EAX when iret from _Xint46 jumps back to the wrapper function to check if a system call failed.

After checking that the value in EBX passed by a system call wrapper function is valid, _Xint46 will push system call arguments (none for yield() and one for resume()) onto the stack and conditionally jump to yield() or resume(). We will reuse existing XINU system calls which are regular C functions as internal kernel functions that carry out the requested task. For example, in Linux internal kernel function do_fork() carries out the actual work of spawning a child process. Since yield() and resume() are translated into machine code by gcc following CDECL, when calling them from _Xint46 you must follow CDECL as well so that caller/callee interaction works correctly. Moreover, yield() and resume() will return values in EAX per CDECL. Since the return values from yield() and resume() must be communicated back to their wrapper functions, the intermediary _Xint46 must not modify EAX before executing iret. The wrapper functions coded in C are translated by gcc, hence will expect return values in EAX. Wrapper functions, also called APIs, in turn will return to their caller (e.g., app function main()) with return value stored in EAX. The main challenge lies in correctly interfacing C functions and assembly functions following CDECL caller/callee convention. This is called ABI (application binary interface) programming which we started practicing in lab1.
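The decision logic of the dispatcher can be modeled in C before writing it in assembly. The sketch below is a user-space model only: dispatch(), kernel_yield(), and kernel_resume() are hypothetical stand-ins, the return values of the stand-ins are placeholders, and the real _Xint46 must additionally manage the stack and registers per CDECL.

```c
#include <stdint.h>

#define SYSNO_RESUME 3   /* system call number for resume(), per this handout */
#define SYSNO_YIELD  9   /* system call number for yield(), per this handout  */

/* Stand-ins for the internal kernel functions; return values are placeholders. */
static int32_t kernel_yield(void)
{
    return 1;
}

static int32_t kernel_resume(int32_t pid)
{
    return (pid >= 0) ? 20 : -1;
}

/* Model of _Xint46: ebx carries the system call number, arg is the value the
 * wrapper pushed on the stack (unused by yield). The result models EAX at the
 * time iret is executed. */
static int32_t dispatch(int32_t ebx, int32_t arg)
{
    switch (ebx) {
    case SYSNO_YIELD:
        return kernel_yield();
    case SYSNO_RESUME:
        return kernel_resume(arg);
    default:
        return -1;  /* invalid system call number */
    }
}
```

In this model the return value of the selected kernel function passes through dispatch() unchanged, which corresponds to _Xint46 leaving EAX untouched before iret.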

5.3 Three software layers for implementing trapped system calls

When implementing trapped system calls you need to handle three layers of software. First, system call wrapper functions that act as an API for app programmers to invoke operating system services. For example, fork() which is part of the standard C library is a wrapper function. This will be elaborated in 5.4. Second, system call dispatcher code (in our case _Xint46 in intr.S) to which system call wrapper functions trap. This was discussed in 5.2. Third, the internal kernel functions that actually perform requested kernel services. In our case, they are the existing XINU system call functions yield() and resume() that are called by _Xint46. The third layer is coded in C (yield() and resume()), the second layer (_Xint46) is coded in assembly, and the first layer discussed in 5.4 is coded in a mix of C and assembly. The interfaces between the three software layers must be carefully implemented so that the resultant interaction works correctly.

5.4 System call wrapper functions: general design

In Linux, fork() is a system call wrapper function (part of glibc on our frontend lab machines) that acts as the system call API for app programmers that traps to the system call dispatcher which then calls internal kernel functions to carry out the work of spawning a child process. These are the same three layers as in our XINU set-up. In XINU, you will implement two wrapper functions, syscall xyield(void), in xyield.c, and pri16 xresume(pid32), in xresume.c, both in system/. An app programmer will call xyield() to invoke the kernel service yield() (e.g., similar to do_fork() in Linux), and analogously for xresume() to resume a suspended process. yield() and resume() which are regular C functions will be repurposed as internal kernel functions. For example, an app process in XINU with statement

x = xresume(5);

in its C code will ask XINU to unsuspend the process with PID 5. xresume() traps to the kernel's system call dispatcher _Xint46 which calls internal kernel function resume() to carry out the actual work of resuming the specified process. We noted that in Linux most of the work for creating a child process is carried out by internal kernel function do_fork(). Because successfully executing the machine code of do_fork() entails kernel mode privilege, do_fork() can be directly called by an app process (it is just code whether it contains privileged instructions or not) but doing so will fail since the app process running in user mode is trying to perform operations requiring kernel mode. Only by calling the wrapper function fork(), which traps the calling process from user mode to kernel mode, is the internal kernel function do_fork() successfully executed.

Hence even though Linux is open source and there is nothing preventing an app programmer from copying the code of do_fork() into a function, say mydo_fork(), and calling mydo_fork() from main(), it will fail to create a child process due to isolation/protection implemented by Linux. What an attacker gains is knowledge of how do_fork() is coded which may reveal bugs that can then be exploited. In the long run, the fact that Linux's code is open source has enabled its code to be examined by thousands of programmers which has resulted in fixes and improvements. Proprietary code, on the other hand, has been examined by a much smaller group of people, and its reliability and security rely heavily on attackers not understanding the code. This can be a risky assumption. Overall, there are always pros/cons.

Going back to our wrapper functions, their job is to do some preparatory work before executing the trap instruction to switch to kernel mode and eventually reach the pertinent internal kernel function of the upper half where the requested service is carried out in kernel mode. After returning from the trap back to user mode, final clean-up chores may need to be performed by a system call wrapper function including passing along a return value from the internal kernel function via the system call dispatcher to the app function that called the system call wrapper function. For example, in the case of xyield() the return value OK of internal kernel function yield() is returned to its caller, the system call dispatcher _Xint46. _Xint46 before executing iret makes sure that the value returned from yield() is returned to xyield(). Since we are following CDECL, this is done through register EAX.

5.5 System call wrapper function: xresume()

We will consider two methods of implementing a system call wrapper function. The first method is coding an assembly function that is called from the wrapper function which is coded in C. We started practicing these techniques in lab1. The assembly function will then execute instruction, int $46, to trap to _Xint46. The second method, instead of calling a function coded in assembly, embeds assembly code (which includes trap instruction "int $46") in the wrapper function's C code. To embed means that the assembly code is directly merged with the assembly code of the C wrapper function produced by gcc. That is, there is no function call needed from the wrapper function's translated assembly code to the assembly code containing the trap instruction. Thus, the second method, inline assembly, is more efficient and faster since the overhead associated with making a function call is eliminated.

Please keep in mind from the lectures that function calls, even with stack management hardware support (e.g., ESP and EBP registers, push and pop instructions), are slower compared to the same task achieved without making a function call. Since system calls may be frequently called by app processes, making system calls efficient is important. Hence inline assembly, all else being equal, is preferred. An improved trap instruction, sysenter, in x86 uses hardware support to further reduce system call overhead. We will code xyield() using inline assembly in 5.6. We will implement xresume() by coding an assembly function that xresume() calls which, then, causes a trap to _Xint46.

Code the assembly function, pri16 trapresume(pid32), that wrapper function xresume() calls in trapresume.S. Since xresume() is translated by gcc using CDECL, code trapresume() so that it abides by CDECL. Thus a user process running main() calls xresume() which calls trapresume(). When trapresume() executes "int $46" the transition, or trap, from user code to kernel code takes place. We cannot say user mode to kernel mode since in XINU all processes always run in kernel mode. XINU runs with protected mode enabled, it just chooses not to implement isolation/protection. Problem 5 is the main step toward supporting isolation/protection for x86 XINU without going all the way which entails additional hardware background and programming complications. trapresume() traps to _Xint46 which calls resume(). Then follows the return journey from resume() up the chain through _Xint46 and trapresume() to xresume() and its caller main(). At every step, make sure to be aware of what is happening to the stack and relevant registers per CDECL so that you and gcc are on the same page.

5.6 System call wrapper function: xyield()

We will use inline assembly to code xyield() thereby reducing overhead. We encountered a simple form of inline assembly in 3.2 where we embedded assembly instruction, "int $6", inside C function main() using asm("int $6") to generate a software interrupt for the invalid opcode exception. Except in very simple cases that require no communication of values between C and embedded inline assembly code, we will need to use a richer framework for embedding assembly code in C code called extended inline assembly. Although xyield() has no arguments, it returns a value of type unsigned int that must be assigned to a C variable so that it can be utilized by the surrounding C code of xyield(). In general, extended assembly may involve passing of arguments and execution of multiple instructions which require careful specification following extended asm() syntax so that the embedded assembly code does not conflict with the surrounding assembly code generated by gcc when translating the C code of xyield(). Potential conflicts include use of specific registers by the inline assembly code that overwrite, or are overwritten by, assembly code generated by gcc upon translating C code.

In the case of extended inline assembly code inserted in xyield() C code, before executing instruction, int $46, we will need to store the system call number 9 which identifies xyield() in register EBX needed by _Xint46. After executing the trap, when _Xint46 returns to the next instruction of xyield() by executing iret, we may need to copy the return value contained in EAX to a local C variable of xyield() if further operations on the returned value are performed by xyield(). If not, leaving the value untouched in EAX may suffice since the CDECL caller/callee interface between main() and xyield() expects xyield() to put its return value in EAX. In general, extended inline assembly follows the format

asm(assembler template
: output operands
: input operands
: clobbered registers);

The last component, clobbered registers, is a list of registers that the extended inline assembly code may be modifying. When embedding the extended inline assembly code, gcc uses this information to save/restore, or avoid if possible, these registers in the surrounding assembly code it generates from the C code of xyield(). For our purposes, it is best to start with simple examples with references (html) serving as a more comprehensive guide as needed. For example, assuming the system call number for xyield() is stored in local C variable, int sysno, and the value returned by _Xint46 in EAX is to be copied to local C variable, int retno, asm()'s input operands would specify "b"(sysno) meaning that the value in C variable sysno should be stored in EBX, the register from which _Xint46 reads the system call number. This comprises the input part. For output operands of asm(), we would specify "=a"(retno) meaning that the content of EAX is to be copied to C variable retno. Another simple test you can perform is to code Problem 3.4 of lab1 not as an assembly function addtwo() in addtwo.S but as extended inline assembly code embedded in main(). Coding of extended inline assembly for CS354 is minimal and requires no special expertise beyond the prerequisite background of CS354. Verify that xyield() works correctly by calling it from main().

5.7 Trapped system call implementation remains in kernel mode

As noted earlier, trapped system calls xyield() and xresume() follow the general framework for implementing trapped system calls in Linux and Windows with one caveat: when trapping by executing "int $46" in system call wrapper functions to _Xint46 via x86's IDT, we did not switch stacks. Nor did we concern ourselves with the CPL bits of the CS register, which specify whether the current process runs in kernel mode (00) or user mode (11). When discussing software support for isolation/protection we emphasized that a per-process kernel stack is needed to perform nested kernel function calls in kernel mode. For lab2 we are preserving XINU's run-time stack architecture. Although XINU operates in protected mode (i.e., the PE bit of the CR0 register is set to 1 during bootloading), all processes created using system call create() start out in kernel mode, stay in kernel mode, and end in kernel mode. As discussed in the lectures, XINU's GDT is initialized with three entries for kernel code segment, kernel data segment, and kernel stack segment that are pointed to by CS, DS, and SS, respectively. We noted that the stack segment is not necessary as SS can be made to point to the same entry as DS, which is how Linux and Windows configure the GDT. The key difference in Linux and Windows is that their GDT contains two additional entries: one for user code segment and another for user data segment. Newly created processes start out with CS pointing to the user code segment of GDT and DS and SS pointing to the user data segment of GDT. In XINU, registers CS, DS, SS always point to the three entries whose DPL is set to 00 specifying they are kernel mode entries.
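To make the CPL bits concrete, the sketch below decodes a 16-bit x86 segment selector into its fields: the low two bits hold the privilege level (the CPL when the selector is loaded in CS), bit 2 selects GDT or LDT, and the remaining bits index the descriptor table. The struct and function names are hypothetical, and the selector values mentioned afterwards are illustrative only, not values taken from XINU's GDT setup.

```c
#include <stdint.h>

/* Fields of an x86 segment selector (names are illustrative). */
struct selector_fields {
    unsigned index;  /* bits 15..3: entry number in the descriptor table */
    unsigned ti;     /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;    /* bits 1..0: privilege level (CPL when held in CS) */
};

/* Split a selector value into its three fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.ti    = (sel >> 2) & 0x1;
    f.rpl   = sel & 0x3;
    return f;
}
```

For instance, a selector of 0x08 decodes to GDT entry 1 with privilege level 0 (kernel mode), whereas 0x1B decodes to GDT entry 3 with privilege level 3 (user mode).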

Adding user mode code and data segment entries in XINU's GDT is not difficult. However, it requires acquiring additional x86 background including hardware-supported stack switching and context manipulation. Our implementation of xyield() and xresume() in lab2 will perform system call trapping using x86's interrupt hardware support following the traditional method used by Linux and Windows. However, all XINU processes will start off in kernel mode upon creation and stay in kernel mode throughout their existence. You will benefit from understanding how trapped system calls are designed and implemented without dealing with complications stemming from reconfiguring GDT, managing CS, DS, SS registers, and hardware-supported dual stacks in x86.


Bonus problem [25 pts]

Extend the trapped XINU system calls to include clkticks() from Problem 4. Call the wrapper function xclkticks() and use inline assembly to trap to system call dispatcher _Xint46. Set the system call number to 4. clkticks() will fill the role of the internal kernel function. Test that your implementation works correctly.
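The dispatch step can be pictured with a small host-side C model. This is only a sketch of the selection logic, not the dispatcher itself, which is assembly code reached through the IDT. The numbers 4 and 9 come from this handout; everything else is an assumption made for illustration: the model_ function names, the stubbed tick counter, the uint32_t return types, and returning SYSERR for an unknown number.

```c
#include <stdint.h>

/* System call numbers stated in the handout. */
#define SYSNO_CLKTICKS 4   /* bonus problem: xclkticks() */
#define SYSNO_YIELD    9   /* 5.6: xyield() */

/* Hypothetical stand-ins for the internal kernel functions. */
static uint32_t model_ticks = 1234;                  /* stubbed tick count */
static uint32_t model_clkticks(void) { return model_ticks; }
static uint32_t model_yield(void)    { return 1; }   /* 1 is XINU's OK */

/* Model of the dispatcher: select the kernel function by system call
 * number and hand back its result, as the real dispatcher does in EAX. */
static uint32_t dispatch_model(uint32_t sysno)
{
    switch (sysno) {
    case SYSNO_CLKTICKS: return model_clkticks();
    case SYSNO_YIELD:    return model_yield();
    default:             return (uint32_t)-1;        /* XINU's SYSERR; an assumption */
    }
}
```

In this model, adding a system call amounts to one new case in the switch, which mirrors the extra comparison and call that the bonus problem adds to the dispatcher.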

Note: The bonus problem provides an opportunity to earn extra credits that count toward the lab component of the course. It is purely optional.


Turn-in instructions

General instructions:

When implementing code in the labs, please maintain separate versions/copies of code so that mistakes such as unintentional overwriting or deletion of code are prevented. This is in addition to the efficiency that such organization provides. You may use any number of version control systems such as GIT and RCS. Please make sure that your code is protected from public access. For example, when using GIT, use git that manages code locally instead of its on-line counterpart github. If you prefer not to use version control tools, you may just use manual copy to keep track of different versions required for development and testing. More vigilance and discipline may be required when doing so.

The TAs, when evaluating your code, will use their own test code (principally main()) to drive your XINU code. The code you put inside main() is for your own testing and will, in general, not be considered during evaluation.

If you are unsure what you need to submit in what format, consult the TA notes link. If it doesn't answer your question, ask during PSOs and office hours which are scheduled M-F.

Specific instructions:

1. Format for submitting written lab answers and kprintf() added for testing and debugging purposes in kernel code:

2. Before submitting your work, make sure to double-check the TA Notes to ensure that any additional requirements and instructions have been followed.

3. Electronic turn-in instructions:

        i) Go to the xinu-spring2021/compile directory and run "make clean".

        ii) Go to the directory where lab2 (containing xinu-spring2021/ and lab2ans.pdf) is a subdirectory.

                For example, if /homes/alice/cs354/lab2/xinu-spring2021 is your directory structure, go to /homes/alice/cs354

        iii) Type the following command

                turnin -c cs354 -p lab2 lab2

You can check/list the submitted files using

turnin -c cs354 -p lab2 -v

Please make sure to disable all debugging output before submitting your code.

