CS 354 Spring 2021

Lab 5: Asynchronous Event Handling with Callback Function (280 pts)

Due: 04/21/2021 (Wed.), 11:59 PM

1. Objectives

Problem 5, lab4, considered changing the execution path of processes at run-time from the perspective of attackers that are aiming to compromise a computing system. In this problem, we will utilize and extend the techniques to allow a kernel to modify execution paths at run-time to support asynchronous handling of events which app programmers rely upon.

2. Readings

Read Chapters 8, 9, and 10 of the XINU textbook.

3. Enforcing execution detours at run-time (120 pts)

Please use a fresh copy of XINU, xinu-spring2021.tar.gz, with two changes: preserve the myhello() function from lab1, and remove all code related to xsh from main().

Problem 5, lab4, considered changing the execution path of processes at run-time from the perspective of attackers aiming to compromise a computing system. The technique of modifying return addresses to effect changes in code execution, called ROP (return oriented programming), will be enhanced before using it as a building block to implement asynchronous event handling support in XINU.

To do so, we will build on and extend the programming techniques utilized in Problem 5 of lab4. As before, a suspended victim process jumps from ctxsw() to malwareI() when it resumes. The extension is that after malwareI() completes its work, we will make the process jump to the original return address of ctxsw(). For example, if ctxsw() was called by resched(), the attacker allows the victim process to continue its normal execution, thus potentially hiding the fact that it was hijacked to execute malware code. In the corner case where the victim process runs for the first time, ctxsw() jumps to the app code, i.e., the function pointer specified as the first argument of create(). Thus a detour is injected into the victim's code execution without the victim process being aware of it. To facilitate the detour of the victim's execution path, we will consider the corner case separately from the case where ctxsw() was called by resched().

3.1 ctxsw() called by resched()

We will first consider the scenario where a victim process ran before and entered the suspended state. This implies that ctxsw() was called by resched(). Code an app, void wrongturnIIa(pid32 z), in wrongturnIIa.c that works similarly to wrongturnII() of lab4 but with a twist: before overwriting the return address of ctxsw(), wrongturnIIa() remembers the original return address of ctxsw() in a global variable, void *origretaddr, declared in wrongturnIIa.c. The modified return address points to void malwareIa(void) in malwareIa.c, which works similarly to malwareI() of lab4 but jumps back to the original return address of ctxsw() instead of calling exit() to terminate the victim process. Before malwareIa() executes ret, use extended in-line assembly to add code that pushes the original return address of ctxsw() stored in origretaddr onto the stack so that ESP points to this address. When malwareIa() executes ret, it will pop the original return address of ctxsw() and jump to it. That is, malwareIa() jumps to the same address in resched() that ctxsw() would have jumped to in the absence of the detour. The victim's stack must be left in the same state when jumping back to resched() from malwareIa() as in the case without the detour to ensure that the victim process continues executing correctly. To leave no footprint that can have side effects on correct execution of the victim process after returning to resched(), register values must be in their original state before jumping to resched(). Describe the logic behind your code in lab5ans.pdf.
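The bookkeeping described in 3.1 can be sketched in portable C on a simulated stack. This is only an illustration under stated assumptions, not the required solution: word_t, CTXSW_RET_OFFSET, origretaddr_sim, hijack_return() and push_and_ret() are hypothetical names, and the offset of 10 words assumes that the stock x86 ctxsw() saves EBP, EFLAGS and the eight pushal registers before recording the stack pointer. Confirm the offset against your own ctxsw.S.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word_t;   /* one stack word in this simulation */

/* Assumption: ctxsw() pushes EBP (1 word), EFLAGS (1 word) and the eight
 * pushal registers before saving ESP, so its return address sits 10 words
 * above the saved stack pointer. Check this against your ctxsw.S. */
#define CTXSW_RET_OFFSET 10

/* Stand-in for the global origretaddr named in the handout. */
word_t origretaddr_sim;

/* Models wrongturnIIa() in 3.1: remember the original return address of
 * ctxsw(), then overwrite that slot with the detour address. */
void hijack_return(word_t *saved_sp, word_t detour)
{
    assert(saved_sp != NULL);
    origretaddr_sim = saved_sp[CTXSW_RET_OFFSET];
    saved_sp[CTXSW_RET_OFFSET] = detour;
}

/* Models the last two steps of malwareIa(): push the remembered address,
 * then execute ret. On entry *sp_io is the stack pointer just after
 * ctxsw()'s ret popped the detour address. Returns the jump target and
 * writes the resulting stack pointer back through sp_io. */
word_t push_and_ret(word_t **sp_io)
{
    word_t *sp = *sp_io;
    *--sp = origretaddr_sim;    /* pushl origretaddr */
    word_t target = *sp++;      /* ret: pop the target into EIP */
    *sp_io = sp;
    return target;
}
```

In the simulation the stack pointer after push_and_ret() equals its value on entry, which mirrors the requirement that the victim's stack look the same as it would without the detour. A compiled malwareIa() also has a function prologue and epilogue, which this model ignores and which your in-line assembly has to account for.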

3.2 Victim process runs for the first time

To handle the corner case where the victim process runs for the first time, hence ctxsw() was not called by resched(), we will use a slightly different method from 3.1 to achieve the detour. Instead of remembering the original return address of ctxsw() and causing a jump to malwareIa(), wrongturnIIa() will rearrange the stack content of the victim process so that the address of malwareI() -- the original malwareI() code of lab4, not malwareIa() -- is inserted in the victim's stack above the original return address of ctxsw(), i.e., the function pointer provided in the first argument of create(). Thus the victim's stack grows by 4 bytes and the content above the original return address of ctxsw() is shifted by one word to make space for inserting the malwareI function pointer. We must also update the saved stack pointer in the process table entry of the victim process so that ctxsw() accesses the modified top of the stack of the victim process. When the victim process becomes current for the first time, ctxsw() will jump to malwareI(). When malwareI() executes ret, it will jump to the original return address of ctxsw(), which was preserved on the victim's stack. In the corner case, the detour is implemented through stack manipulation by wrongturnIIa(), without assistance from malwareI().
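The one-word shift described in 3.2 can be illustrated on a simulated downward-growing stack. The sketch below is hypothetical: insert_detour(), word_t and CTXSW_RET_OFFSET are illustrative names, and the 10-word offset again assumes that the frame built by create() matches what the stock ctxsw() pops (eight pushal registers, EFLAGS, EBP). It does not adjust any stack addresses stored inside the shifted words, which a real implementation may need to consider.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word_t;   /* one stack word in this simulation */

/* Assumption: 10 saved words lie between the saved stack pointer and the
 * return address of ctxsw(). Verify against create.c and ctxsw.S. */
#define CTXSW_RET_OFFSET 10

/* Grow the stack by one word (toward lower addresses), slide the saved
 * register words down by one slot, and place the detour address in the
 * slot that ctxsw()'s ret will pop. The original return address is left
 * untouched, one word further up, where the detour's own ret will find it.
 * Returns the new saved stack pointer, which the caller must store back
 * into the process table entry. */
word_t *insert_detour(word_t *saved_sp, word_t detour)
{
    assert(saved_sp != NULL);
    word_t *new_sp = saved_sp - 1;

    /* Copy in ascending order: each source slot is read before the
     * overlapping destination slot above it is overwritten. */
    for (size_t i = 0; i < CTXSW_RET_OFFSET; i++)
        new_sp[i] = saved_sp[i];

    new_sp[CTXSW_RET_OFFSET] = detour;
    return new_sp;
}
```

After the call, new_sp[CTXSW_RET_OFFSET] holds the detour address and new_sp[CTXSW_RET_OFFSET + 1] still holds the function pointer that was passed to create().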

Test and verify that 3.1 and 3.2 work correctly. The problems are an exercise in verifying your understanding of how control flow of executing programs is facilitated by run-time stacks. The techniques can be utilized by hackers to attack a computing system, for a kernel to mount a defense as in Problem 6 of lab4, or to support enhanced kernel services in the form of asynchronous event handling in Problem 4.

4. Asynchronous CPU usage exceeded event with callback function (160 pts)

4.1 Overall structure

As discussed in the lectures, asynchronous IPC with callback function entails a receiver process registering with the kernel a callback function that is to be executed in the receiver's context in user mode when it receives a message. That is, the receiver's code does not call its callback function directly but requests that the kernel arrange for its execution if a message arrives while preserving isolation/protection since the callback function is user code. In the meantime, the receiver process goes about executing its synchronous code. By default, all application code is synchronous code that is invoked through nested function calls (e.g., starting from main() in C code). An asynchronous callback function is not part of the nested function calls.

In this problem, we will consider generalized asynchronous event handling with callback function -- IPC is but one instance -- which is an important service provided by modern operating systems. In UNIX and Linux the service is called signal handling, which includes asynchronous IPC. Timers, pressing CTRL-C to terminate a process, etc. are among the events that apps may be coded to utilize and respond to. We will consider an asynchronous event supported by Linux called "CPU usage exceeded" (i.e., signal SIGXCPU) which is used to limit total CPU usage by a process. The default outcome, or disposition, of exceeding a set CPU usage threshold is termination of the process. Linux kernels give the app programmer the ability to determine what happens to a process by registering a callback function, called a signal handler in UNIX/Linux parlance, which is invoked by the kernel on behalf of the process when the event occurs. As with asynchronous IPC, the main technical issue is ensuring that the user's signal handler is executed in user mode when the process continues execution, to preserve isolation/protection. To do so, we will utilize the ROP programming techniques practiced in lab4 and extended in Problem 3 to facilitate a detour to a callback function when a "CPU usage exceeded" event occurs.

4.2 System call interface

A process registers a callback function with the kernel that is to be executed when the process's total CPU usage exceeds a specified value. This is done through system call

syscall registerxcpu(void (* fp)(void), uint32 cpulimit);
in registerxcpu.c where the first argument is a function pointer to the user callback function and cpulimit specifies a threshold on the process's CPU usage which must be strictly positive. Introduce a new process table field, bool8 prxcpureg, that is initialized to 0. It is set to 1 if registerxcpu() is called successfully. Add a new process table field, void *prxcpufp, where the function pointer fp is remembered. Add a third process table field to remember cpulimit. registerxcpu() returns SYSERR if cpulimit is not strictly positive or registerxcpu() has already been called. In production systems we must also check that the function pointer fp is valid by verifying that it falls in the code segment of user code. We will ignore this check for simplicity. registerxcpu() returns 0 if the system call was successful. XINU will not have a default disposition to terminate a process whose CPU usage has exceeded a threshold. Instead, if the prxcpureg flag is set, XINU will always run the user supplied callback function fp.
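The checks listed above can be sketched as follows. This is a minimal stand-alone model, not kernel code: the typedefs, NPROC, proctab and currpid are stand-ins for the XINU definitions, prxcpulimit is a hypothetical name for the third process table field (the handout does not name it), and demo_handler() is a placeholder callback. Real kernel code would also disable interrupts on entry and restore them on exit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-ins for XINU types so the sketch compiles on its own. */
typedef int32_t  syscall;
typedef int32_t  pid32;
typedef uint32_t uint32;
typedef uint8_t  bool8;
#define SYSERR (-1)
#define NPROC  8

struct procent {
    bool8  prxcpureg;    /* 1 once a callback has been registered */
    void  *prxcpufp;     /* registered callback function pointer */
    uint32 prxcpulimit;  /* hypothetical field name: CPU usage threshold */
};

struct procent proctab[NPROC];  /* zero-initialized, so prxcpureg == 0 */
pid32 currpid = 0;

syscall registerxcpu(void (*fp)(void), uint32 cpulimit)
{
    struct procent *prptr;

    assert(currpid >= 0 && currpid < NPROC);
    prptr = &proctab[currpid];

    /* cpulimit is unsigned, so "not strictly positive" means zero.
     * A second registration by the same process is also rejected. */
    if (cpulimit == 0 || prptr->prxcpureg)
        return SYSERR;

    prptr->prxcpufp    = (void *)fp;
    prptr->prxcpulimit = cpulimit;
    prptr->prxcpureg   = 1;
    return 0;
}

/* Placeholder callback used only to exercise the sketch. */
void demo_handler(void) { }
```

For example, registerxcpu(demo_handler, 0) returns SYSERR, a first call registerxcpu(demo_handler, 50) returns 0, and any later call by the same process returns SYSERR.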

4.3 Detecting CPU usage exceeded event

To monitor a process's CPU usage in units of 1 tick (1 msec), we will reuse the tick counter global variable, uint32 xclockticks, from Problem 4, lab2, and the CPU usage monitoring code of Problem 4, lab3. Thus port clkhandler() and resched() (and any other relevant code changes) from lab2 and lab3, removing all unnecessary code. For example, system calls totcputicks() and xcpuutil() are not needed. If your monitoring code from lab2 and lab3 had issues, get them fixed. Since both lab assignments are in the past, help provided in the PSOs/office hours will go beyond providing guidance and hints. If needed, you will also be provided with working code. Please keep in mind that integrating code that you did not write into your own code can be more difficult than fixing bugs in your own code.

With CPU usage monitoring code ported from lab2 and lab3, two checks must be performed: whether the current process has registered a CPU usage exceeded callback function, and whether CPU usage of the current process has exceeded threshold cpulimit. The asynchronous event is driven by XINU's 1 msec clock interrupt on our x86 backends, which is handled by clkhandler(). One simplification we benefit from is that for the "CPU usage exceeded" event the current process is the process that registered a callback function. In general, for events such as timers and message send, the process that registered a callback function may not be the current process. This is relevant since to assure isolation/protection a callback function must be run in user mode and in the context of the process that registered it. For "CPU usage exceeded" we only need to ensure that a callback function is run in user mode. In XINU, where processes always run in kernel mode, this means running the callback function at the point where the boundary between kernel code and user code is crossed.
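The two checks can be expressed as a single predicate that clkhandler() might evaluate on each tick. The sketch below is an assumption-laden model: xcpu_exceeded() and the field prxcpulimit are hypothetical names, and usedticks stands for whatever total your lab3 monitoring code computes for the current process. The handout does not say whether the event should fire once or on every tick after the threshold is passed, so that policy is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-ins for XINU types so the sketch compiles on its own. */
typedef uint32_t uint32;
typedef uint8_t  bool8;

struct procent {
    bool8  prxcpureg;    /* 1 if a callback has been registered */
    uint32 prxcpulimit;  /* hypothetical field name: CPU usage threshold */
};

/* Returns nonzero when the process has a registered callback and its CPU
 * usage, measured in ticks, is strictly greater than the threshold.
 * usedticks is assumed to come from the lab3 CPU usage monitoring code. */
int xcpu_exceeded(const struct procent *prptr, uint32 usedticks)
{
    assert(prptr != NULL);
    return prptr->prxcpureg && usedticks > prptr->prxcpulimit;
}
```

With a registered limit of 5 ticks, the predicate is false at 5 ticks and true at 6. It is always false for a process that never called registerxcpu().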

4.4 Preserving isolation/protection

As noted in 4.3, for the event "CPU usage exceeded" we do not need to consider the possibility that the process that registered a callback function is not the current process. When clkhandler() detects that the current process -- if it has registered a callback function -- has exceeded its CPU usage threshold, it must make arrangements so that when x86 execution crosses the boundary from kernel code to user code a detour to the callback function is made. In our case, the boundary is where clkdisp executes iret, which causes a jump to the interrupted user code. Instead of jumping to the original "return address" of clkdisp, we will remember this address and overwrite it with the address of a helper code, detourguide, in detourguide.S that facilitates the detour. detourguide is system code that does not need to be run in kernel mode. Hence, as with the start function inserted by gcc which calls main(), or the userret() function inserted by XINU, to which app code jumps after completion and which calls kill(), system code that is not written by the programmer but runs in user mode falls under user code. Who wrote it is not essential. In what mode it must run is.

The task of the helper code detourguide is to call the user's callback function. After the callback function returns, detourguide needs to arrange the stack so that when it executes the ret instruction a jump to the original return address of clkdisp is made. This involves utilizing the technique of 3.1, albeit coding the helper in assembly, which simplifies implementation. Describe in lab5ans.pdf the steps behind the work carried out by detourguide and how it compares to 3.1.
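The division of work between the kernel and detourguide can be modeled in plain C. This is a simulation under stated assumptions, not the assembly you are asked to write: struct iframe, struct xcpu_state, arm_detour(), detourguide_model() and demo_callback() are hypothetical names, and the frame layout (EIP, CS, EFLAGS from lowest address up) assumes an x86 interrupt taken without a privilege change, as is the case when all XINU code runs in kernel mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word_t;

/* Words the CPU pushes on an interrupt without a privilege change, listed
 * from lowest address up. iret pops them in this order. */
struct iframe {
    word_t eip;
    word_t cs;
    word_t eflags;
};

/* Hypothetical per-process state for the detour. */
struct xcpu_state {
    void (*callback)(void);  /* callback registered by the process */
    word_t saved_eip;        /* original "return address" of clkdisp */
};

/* Kernel side: remember where iret would have gone, then redirect it to
 * the helper so the callback runs after the kernel/user boundary. */
void arm_detour(struct iframe *f, struct xcpu_state *s, word_t helper_addr)
{
    assert(f != NULL && s != NULL);
    s->saved_eip = f->eip;
    f->eip = helper_addr;
}

/* Helper side: call the callback, then report the address to resume at.
 * The assembly version would push saved_eip and execute ret, and would
 * also need to keep registers and flags as the interrupted code left
 * them, since that code did not expect a function call. */
word_t detourguide_model(const struct xcpu_state *s)
{
    assert(s != NULL && s->callback != NULL);
    s->callback();
    return s->saved_eip;
}

/* Placeholder callback that counts how often it runs. */
int demo_hits;
void demo_callback(void) { demo_hits++; }
```

The model makes the parallel with 3.1 visible: arm_detour() plays the role of wrongturnIIa(), and detourguide_model() plays the role of malwareIa(), with the saved address kept per event instead of in a single global.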

4.5 Testing

Test and verify that your implementation works correctly. Describe in lab5ans.pdf the test cases you have considered to establish correctness.

Bonus problem [20 pts]

One of the benefits of virtual memory is supporting contiguous user memory in a process's virtual address space using fragmented physical memory in the kernel's physical address space to mitigate external fragmentation. Suppose an app process makes multiple vgetmem() system calls where vgetmem() returns a virtual address in the process's virtual address space. That is, vgetmem() is a counterpart of getmem() if XINU were to implement virtual memory. The process also makes multiple vfreemem() system calls, where vfreemem() is the counterpart to freemem() under virtual memory support. Despite external fragmentation seemingly having been solved using virtual memory support, why can an application still suffer from external fragmentation? Explain using an example. What are possible solutions? What are their pros/cons? Discuss your answer in lab5ans.pdf.

Note: The bonus problem provides an opportunity to earn extra credits that count toward the lab component of the course. They are purely optional. Two bonus problems are provided for those wishing to utilize them to reach the 50% maximum contributed by the lab assignments.

Turn-in instructions

General instructions:

When implementing code in the labs, please maintain separate versions/copies of code so that mistakes such as unintentional overwriting or deletion of code are prevented. This is in addition to the efficiency that such organization provides. You may use any of a number of version control systems such as GIT and RCS. Please make sure that your code is protected from public access. For example, when using GIT, use git, which manages code locally, instead of its on-line counterpart github. If you prefer not to use version control tools, you may just use manual copy to keep track of different versions required for development and testing. More vigilance and discipline may be required when doing so.

The TAs, when evaluating your code, will use their own test code (principally main()) to drive your XINU code. The code you put inside main() is for your own testing and will, in general, not be considered during evaluation.

If you are unsure what you need to submit in what format, consult the TA notes link. If it doesn't answer your question, ask during PSOs and office hours which are scheduled M-F.

Specific instructions:

1. Format for submitting written lab answers and kprintf() added for testing and debugging purposes in kernel code:

2. Before submitting your work, make sure to double-check the TA Notes to ensure that any additional requirements and instructions have been followed.

3. Electronic turn-in instructions:

        i) Go to the xinu-spring2021/compile directory and run "make clean".

        ii) Go to the directory where lab5 (containing xinu-spring2021/ and lab5ans.pdf) is a subdirectory.

                For example, if /homes/alice/cs354/lab5/xinu-spring2021 is your directory structure, go to /homes/alice/cs354

        iii) Type the following command

                turnin -c cs354 -p lab5 lab5

You can check/list the submitted files using

turnin -c cs354 -p lab5 -v

Please make sure to disable all debugging output before submitting your code.
