CS354 Fall 2000

Final Exam

Name:

Part 1. Answer True/False (T/F) (1 point each)

__T__ Spin-locks need kernel support. (To yield the CPU)
__F__ Semaphores should be initialized with numbers greater than 0. (They may be initialized with 0)
__F__ The number of page tables in a computer is equal to the number of CPUs.
__T__ All threads in a process share the same page table.
__F__ In segment-swap entire processes are swapped in and out from disk.
__F__ The page-table register is updated every time there is a context switch from one thread to another.
__T__ Page sizes are always a power of two.
__T__ In demand paging a virtual address is divided into page number and page offset.
__F__ A physical page that is replaced is always written back to the backing store. (Only if it has been modified)
__F__ All virtual memory is backed by swap space.
__F__ The flag PTHREAD_SCOPE_PROCESS is used to create a kernel thread.
__F__ a.out is an executable format used on Windows.
__F__ truss is a command to transfer files in UNIX systems.
__T/F__ The data section of a program is memory mapped readable, writable and executable. (In theory it is only readable and writable, but in practice it is also mapped executable)
__F__ Most of the processes in a computer are in ready state.
__F__ There is at most one running process in the computer.
__F__ Most of the CPU time is spent in kernel mode.
__F__ TLBs speed up the translation from physical to virtual addresses.
__T__ A small time quantum will cause a program to take more time.
__F__ The arguments of a system call are checked in user space.
__F__ In non-preemptive scheduling a program may have a context switch at any time.
__F__ A program that runs with preemptive scheduling runs faster than one in non-preemptive scheduling.
__T__ Most of the processes' CPU bursts are completed before the time quantum expires.
__F__ There is only one page-table register in every computer.
__T__ Programs that run with SJF have a smaller response time than FCFS.
__F__ read() is a system call but close() is not.
__F__ A pipe used for communication between a parent and a child process could be created by the child.
__T__ pthread_create() by default creates user-level threads.
__F__ In the bounded buffer problem solution using semaphores, the readBuffer() procedure waits on a semaphore that is initialized to the number of buffers available.
__T__ cond_wait() unlocks the mutex lock that is passed as a parameter.

Part 2. Short Questions (4 pts)

1. What are the advantages and disadvantages of using threads vs. processes?
Advantages:
Context switch among threads is faster
Thread creation is faster
Threads need fewer resources
Communication among threads may be done using shared memory instead of pipes.

Disadvantages:
Threads are less robust than processes. If one thread crashes it will crash the entire process.
Thread synchronization is necessary to avoid corruption of data structures in memory
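The first advantage and the shared-memory point can be illustrated with a short C sketch (demo() and bump() are illustrative names, not from the exam): an increment made by a thread is visible to the whole process, while an increment made by a forked child only changes the child's private copy.

```c
#include <pthread.h>
#include <sys/wait.h>
#include <unistd.h>

int counter = 0;

static void * bump( void * arg )
{
  (void) arg;
  counter++;              /* same address space: the parent sees this */
  return NULL;
}

/* Returns the parent's view of counter after a thread increment
   (shared address space) and a child-process increment (private copy). */
int demo( void )
{
  pthread_t t;
  pthread_create( &t, NULL, bump, NULL );
  pthread_join( t, NULL );        /* counter is now 1 */

  pid_t pid = fork();
  if ( pid == 0 ) {
    counter++;                    /* modifies only the child's copy */
    _exit( 0 );
  }
  waitpid( pid, NULL, 0 );

  return counter;                 /* still 1 in the parent */
}
```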

2. Why is it important to keep the critical sections that mutex locks protect small?
Critical sections should be small to allow higher parallelism on multi-processor machines. A long critical section forces threads to run one at a time, using only one processor, while the other processors remain idle.
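A minimal sketch of the point, assuming a hypothetical shared table protected by one pthread mutex (all names here are illustrative, not from the exam): the expensive computation is done outside the critical section so other processors can run it in parallel; the lock covers only the shared update.

```c
#include <pthread.h>

/* Hypothetical shared table; not part of the exam. */
pthread_mutex_t table_mutex = PTHREAD_MUTEX_INITIALIZER;
int table[ 256 ];
int table_len = 0;

/* Bad: the (imagined) expensive computation runs inside the
   critical section, so all callers are serialized. */
void insert_slow( int key )
{
  pthread_mutex_lock( &table_mutex );
  int value = key * key;          /* pretend this is expensive */
  table[ table_len++ ] = value;
  pthread_mutex_unlock( &table_mutex );
}

/* Better: compute outside the lock; the critical section only
   covers the update of the shared table. */
void insert_fast( int key )
{
  int value = key * key;          /* can run in parallel on other CPUs */
  pthread_mutex_lock( &table_mutex );
  table[ table_len++ ] = value;
  pthread_mutex_unlock( &table_mutex );
}
```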
3. Describe the problems involved in using malloc and free.
Premature frees - Freeing an object that is still in use. The object may be allocated again, overwriting the previous information.
Memory leaks - Objects that are no longer in use are not freed, making the memory usage of the program grow without limit.
Double free - An object is freed twice.
Free of non-heap objects - Occurs when an object that was not allocated with malloc/calloc/realloc is freed.
4. Write the code of the test_and_set operation, and the code of the spin_lock()/spin_unlock().
// test_and_set() is provided by an assembly instruction that
// is executed atomically by the CPU. This is the equivalent
// code in "C".
int test_and_set( int * v )
{
  int oldval = *v;
  *v = 1;
  return oldval;
}

void spin_lock(int * v )
{
  while ( test_and_set( v ) ) {
    thr_yield();
  }
}

void spin_unlock( int * v ) {
  *v = 0;
}
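For reference, the same scheme can be written portably with C11 atomics (a sketch, not part of the exam answer): atomic_flag_test_and_set() plays the role of the atomic test_and_set() instruction, and sched_yield() stands in for thr_yield().

```c
#include <sched.h>
#include <stdatomic.h>

atomic_flag lock_word = ATOMIC_FLAG_INIT;

void spin_lock_c11( void )
{
  /* atomic_flag_test_and_set() atomically sets the flag and
     returns its previous value, like test_and_set() above. */
  while ( atomic_flag_test_and_set( &lock_word ) ) {
    sched_yield();   /* yield the CPU instead of busy-waiting */
  }
}

void spin_unlock_c11( void )
{
  atomic_flag_clear( &lock_word );
}
```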

5. Write the steps for servicing a page-fault.

  0. An instruction that accesses or modifies a page causes the page fault.
  1. The CPU saves the return PC on the stack, as well as other registers.
  2. It jumps to the routine pointed to by the entry in the interrupt vector that corresponds to a page fault.
  3. If the page is invalid, send SEGV to the process and return.
  4. If the page is not resident, find a candidate physical page to replace, write the page to replace to disk if necessary, and load the new page.
     Update the page tables to reflect the change.
  5. If the page is copy-on-write, create a copy of the page in physical memory and update the page table of the process.
  6. If the page does not have enough permissions, send SEGV to the process and return.
  7. Clear the modified/accessed bits.
  8. Return from the interrupt.
  9. Retry the offending instruction.
  10. The program continues running as if nothing happened.

6. Explain how fork() benefits from using copy-on-write.

When a parent process calls fork() to create a child process, the OS does not create a duplicate of the parent's physical pages in the child process. Instead, it only duplicates the parent's page table in the child process. Both parent and child will use the same physical pages as long as they are not modified. If either parent or child modifies a page, the OS will create a copy of the page, update the page table of the modifying process to point to this new page, and retry the instruction. Only modified physical pages are copied. This saves physical memory as well as execution time, since only modified pages are copied. This is called lazy copy.

7. Explain how execvp() benefits from using virtual memory by memory mapping the text segment.

When a program is loaded after calling execvp(), the OS does not copy the executable file into physical memory. Instead, the OS memory maps the file, and when the CPU fetches instructions from memory it will generate page faults for the pages that are not already resident in physical memory. In this way, only the pages of the executable that are needed are loaded into physical memory. This saves physical memory as well as execution time. This is called lazy read.

8. The following three threads have access to several databases. Each database has its own mutex lock. a) Give an example of how these three threads can enter into a deadlock and b) explain how this can be prevented.

thread1()
{
    while (1) {
      mutex_lock( &mutex_db5);
      mutex_lock( &mutex_db2);
      mutex_lock( &mutex_db4);

      // Access db5, db2, db4

      mutex_unlock( &mutex_db5);
      mutex_unlock( &mutex_db2);
      mutex_unlock( &mutex_db4);
    }
}
 

thread2()
{
    while (1) {
      mutex_lock( &mutex_db1);
      mutex_lock( &mutex_db6);
      mutex_lock( &mutex_db5);

      // Access db1, db6, db5
 

      mutex_unlock( &mutex_db1);
      mutex_unlock( &mutex_db6);
      mutex_unlock( &mutex_db5);
    }
}
 

thread3()
{
    while (1) {
      mutex_lock( &mutex_db4);
      mutex_lock( &mutex_db7);
      mutex_lock( &mutex_db1);

      // Access db4, db7, db1

      mutex_unlock( &mutex_db4);
      mutex_unlock( &mutex_db7);
      mutex_unlock( &mutex_db1);
    }
}

a) How these threads can enter into a deadlock.

1. Assume that thread1() starts running. It locks mutex_db5 and then there is a context switch.
2. Thread2 starts running. It locks mutex_db1, mutex_db6 and then when it attempts to lock mutex_db5 it will block since mutex_db5 is already locked by thread1.
3. Thread3 starts running. It locks mutex_db4, mutex_db7, and when it attempts to lock mutex_db1 it will block since mutex_db1 is already locked by thread2.
4. Thread1 starts running. It locks mutex_db2 but when it attempts to lock mutex_db4 it will block since mutex_db4 is locked by thread3.
5. At this point thread1, thread2, and thread3 are blocked in a deadlock.

thread1 <---- mutex_db5 <----- thread2 <----- mutex_db1
    |                                             ^
    |                                             |
    |-----> mutex_db4 -----> thread3--------------|

b) Explain how this can be prevented.
Enforce an ordering in which the mutex locks are locked.
Thread 1 should lock mutex_db2, mutex_db4, mutex_db5, in this order.
Thread 2 should lock mutex_db1, mutex_db5, mutex_db6, in this order.
Thread 3 should lock mutex_db1, mutex_db4, mutex_db7, in this order.
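One mechanical way to enforce such an ordering is a helper that always acquires two locks in a fixed order, here address order, regardless of the order in which the caller names them (lock_pair()/unlock_pair() are hypothetical helpers, not from the exam):

```c
#include <pthread.h>

/* Always lock the mutex with the lower address first.  If every
   thread acquires pairs through this helper, no cycle of waiting
   threads can form. */
void lock_pair( pthread_mutex_t * a, pthread_mutex_t * b )
{
  if ( a > b ) { pthread_mutex_t * t = a; a = b; b = t; }
  pthread_mutex_lock( a );
  pthread_mutex_lock( b );
}

void unlock_pair( pthread_mutex_t * a, pthread_mutex_t * b )
{
  pthread_mutex_unlock( a );
  pthread_mutex_unlock( b );
}
```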
9. Assume the following 2-level page table. Translate the VM address 0x00801578 to its corresponding physical address. Assume a page size of 4KB and that the first 10 bits of the VM address index the level 1 page table, the next 10 bits index the second level, and the bits left are the page offset, similar to what was discussed in class. Assume a 32-bit word size. The third column in the level 2 page tables contains physical page numbers. (5 pts)

VM Address 0x00801578 is __0x58000578____ in Physical Memory

Offset = 0x578 (Last 12 bits)
Page Number = 0x00801 (Remaining first 20 bits)
Page Number in binary: 0000 0000 1000 0000 0001
Level 1 index = 0000 0000 10 (2)
Level 2 index = 00 0000 0001 (1)

Looking up entry 2 of the level 1 table, we find that the 2nd-level page table to use is at address 0x32000.

Looking up entry 1 of the level 2 table at address 0x32000, we find that the physical page number that corresponds to the VM address is 0x58000.

Putting together the physical page number and the offset, we obtain the physical address: 0x58000578.

Level 1
  0: 0x38000
  1: 0x36000
  2: 0x32000
  3: 0x40000

Level 2 at 0x38000
  0: 0x42000
  1: 0x50000
  2: 0x70000

Level 2 at 0x32000
  0: 0x56000
  1: 0x58000
  2: 0x72000

Level 2 at 0x36000
  0: 0x5a000
  1: 0x5c000
  2: 0x62000

Level 2 at 0x40000
  0: 0x89000
  1: 0x82000
  2: 0x86000
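The bit manipulation in the answer above can be checked with a short C sketch (the function names are illustrative; the lookup results 0x32000 and 0x58000 come from the tables):

```c
/* Split a 32-bit virtual address into the 10/10/12 fields used
   in the answer above. */
unsigned split_l1( unsigned vaddr )     { return (vaddr >> 22) & 0x3FF; }
unsigned split_l2( unsigned vaddr )     { return (vaddr >> 12) & 0x3FF; }
unsigned split_offset( unsigned vaddr ) { return vaddr & 0xFFF; }

/* Combine the physical page number found in the level 2 table
   with the page offset. */
unsigned physical( unsigned ppn, unsigned offset )
{
  return (ppn << 12) | offset;
}
```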

10. The following procedure insert() adds an item to the list pointed to by head. If insert() is used by two threads, a) give one sequence of steps that can corrupt the list. You may use the labels (1) to (4) at the left of each instruction to describe your solution. b) Add the mutex_lock()/mutex_unlock() calls to the code to solve the list corruption such that they cover the minimum number of steps that form the critical section.

struct List {
  int val;
  struct List * next;
};

struct List * head = NULL;

void insert( int val )
{
  (1) List * tmp = new List;

  (2) tmp->val = val;

  (3) tmp->next = head;

  (4) head = tmp;
}

a)

Thread 1: (1) (2) (3) (context switch)
Thread 2: (1) (2) (3) (4) (context switch)
Thread 1: (4) (context switch)
The element inserted by thread 2 has been lost: thread 1's step (4) overwrites head with a node whose next pointer was saved in (3), before thread 2's insertion.
b)
mutex_t m;

void insert( int val )
{
  List * tmp = new List;
  tmp->val = val;

  mutex_lock( &m );
  tmp->next = head;
  head = tmp;
  mutex_unlock( &m );
}
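The fix can be exercised with a pthread-based sketch (a translation of the answer to the POSIX API; worker() and count_list() are test helpers, not part of the exam answer). Two threads each insert 10000 nodes; with the critical section protected, the list always ends up with exactly 20000 nodes.

```cpp
#include <pthread.h>

struct List {
  int val;
  List * next;
};

List * head = nullptr;
pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

void insert( int val )
{
  List * tmp = new List;   // steps (1) and (2) need no lock:
  tmp->val = val;          // tmp is still private to this thread
  pthread_mutex_lock( &m );
  tmp->next = head;        // (3)
  head = tmp;              // (4)
  pthread_mutex_unlock( &m );
}

void * worker( void * )
{
  for ( int i = 0; i < 10000; i++ ) {
    insert( i );
  }
  return nullptr;
}

int count_list()
{
  int n = 0;
  for ( List * p = head; p != nullptr; p = p->next ) {
    n++;
  }
  return n;
}
```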

Part 3 Long Questions. (10 points each)

11. The bounded buffer problem that we covered in class has been used for writing and reading only one buffer at a time. Generalize the bounded buffer to implement the procedures writemany( int wsize, char * wdata) and readmany( int rsize, char * rdata), where writemany will write wsize bytes from wdata to common_buffer and will block if there is no space available in common_buffer. readmany will read rsize bytes from common_buffer and will block until there is enough data available. writemany/readmany will always write/read consecutive bytes. Feel free to define as many variables as you want. Use semaphores for synchronization. Multiple writer/reader threads may call writemany/readmany.
const int MAX_BUFF_SIZE=256;

char common_buffer[ MAX_BUFF_SIZE];

// Readers read starting at head
int head;

// Writers write after tail
int tail;

// Mutexes used to force read/write consecutive bytes
// in the buffer and not have interleaved readers/writers.
mutex_t write_mutex;
mutex_t read_mutex;

// Semaphore used by writers to wait when buffer is full
sema_t full_sem;

// Semaphore used by readers to wait when buffer is empty
sema_t empty_sem;

void initialize()
{
  head = 0;
  tail = 0;
  mutex_init( &write_mutex );
  mutex_init( &read_mutex );
  sema_init( &full_sem, MAX_BUFF_SIZE );
  sema_init( &empty_sem, 0 );
}

void writemany( int wsize, char * wdata){
  // write_mutex ensures that writers are not interleaved
  mutex_lock(&write_mutex);
  for ( int i = 0; i < wsize; i++ ) {
    // Wait if buffer is full
    sema_wait( &full_sem );

    // Add data to buffer
    common_buffer[ tail ] = wdata[ i ];
    tail = (tail + 1)%MAX_BUFF_SIZE;

    // Wake up any readers
    sema_post( &empty_sem );
  }
  mutex_unlock(&write_mutex);
  // Note: The write_mutex protects any
  // access/modification of the "tail" variable
  // by multiple threads
}

void readmany( int rsize, char * rdata) {
  // read_mutex ensures that readers are not interleaved
  mutex_lock(&read_mutex);
  for ( int i = 0; i < rsize; i++ ) {
    // Wait if buffer is empty
    sema_wait( &empty_sem );

    // Read data from buffer
    rdata[ i ] = common_buffer[ head ];
    head = (head + 1)%MAX_BUFF_SIZE;

    // Wake up any writers
    sema_post( &full_sem );
  }
  mutex_unlock(&read_mutex);
  // Note: The read_mutex also protects any
  // access/modification of the "head" variable
  // by multiple threads
}
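The same solution translated to the POSIX API (a sketch; the exam uses the Solaris mutex_t/sema_t calls, and here sem_wait()/sem_post() replace sema_wait()/sema_post()):

```c
#include <pthread.h>
#include <semaphore.h>

#define MAX_BUFF_SIZE 256

static char common_buffer[ MAX_BUFF_SIZE ];
static int head = 0;
static int tail = 0;

static pthread_mutex_t write_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t read_mutex  = PTHREAD_MUTEX_INITIALIZER;
static sem_t full_sem;    /* counts free slots;   starts at MAX_BUFF_SIZE */
static sem_t empty_sem;   /* counts filled slots; starts at 0 */

void bb_init( void )
{
  sem_init( &full_sem, 0, MAX_BUFF_SIZE );
  sem_init( &empty_sem, 0, 0 );
}

void writemany( int wsize, const char * wdata )
{
  pthread_mutex_lock( &write_mutex );    /* no interleaved writers */
  for ( int i = 0; i < wsize; i++ ) {
    sem_wait( &full_sem );               /* wait for a free slot */
    common_buffer[ tail ] = wdata[ i ];
    tail = (tail + 1) % MAX_BUFF_SIZE;
    sem_post( &empty_sem );              /* one more filled slot */
  }
  pthread_mutex_unlock( &write_mutex );
}

void readmany( int rsize, char * rdata )
{
  pthread_mutex_lock( &read_mutex );     /* no interleaved readers */
  for ( int i = 0; i < rsize; i++ ) {
    sem_wait( &empty_sem );              /* wait for data */
    rdata[ i ] = common_buffer[ head ];
    head = (head + 1) % MAX_BUFF_SIZE;
    sem_post( &full_sem );               /* one more free slot */
  }
  pthread_mutex_unlock( &read_mutex );
}
```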

12. Write the code for RPCClient::threadServer() from project 2. Try to be as precise as possible.
 

int 
RPCServer::threadServer( RPCServer * s )
{
  // See your project for details

  Wait until there is an available call buffer (wait on the _full semaphore).
  Remove one call buffer (i) from the available queue:
    lock mutex
    i = available;
    RPCCallBuffer * cb = &CallBuffers[ i ];
    available = cb->next;
    unlock mutex
  Copy rpcName and args into cb.
  Insert cb into the service queue:
    lock mutex
    cb->next = service;
    service = i;
    unlock mutex
  Wake up the server with sema_post on the _empty semaphore.
  Wait on the cb->done semaphore.
  After it returns, copy the results from cb into result.
  Add cb back to the available queue.
  Wake up any other client waiting on the _full semaphore.
}

13. Complete the class MReadNWriteLock that implements  a generalization of read/write locks. These locks either allow at most M readers or at most N writers but not both. For example, if one thread is holding the lock in write mode, reader threads will have to wait. If one thread is holding the lock in read mode, writer threads will have to wait. It is possible to have up to N writers holding the lock in write mode. If writer thread N+1 tries to write, it will have to wait. It is possible to have up to M reader threads. If thread M+1 tries to read, it will block until another reader thread unlocks the lock. Use condition variables.
 


class MReadNWriteLock {
  /* Add other fields here */

  // Current readers and writers
  int _nreaders;
  int _nwriters;

  // Maximum readers and writers
  int _maxreaders;
  int _maxwriters;

  // Common mutex
  mutex_t _mutex;

  // Condition variable to synchronize readers/writers
  cond_t _cond;

public:
  MReadNWriteLock( int m, int n );
  void readLock();
  void writeLock();
  void readUnlock();
  void writeUnlock();
};

MReadNWriteLock::MReadNWriteLock( int m, int n )
{
  _nreaders = 0;
  _nwriters = 0;
  _maxreaders = m;
  _maxwriters = n;
  mutex_init( &_mutex );
  cond_init( &_cond );
}

void MReadNWriteLock::readLock()
{
  mutex_lock( &_mutex );

  // Wait if there are writers or the maximum
  // number of readers has been reached
  while ( _nwriters > 0 || _nreaders >= _maxreaders ) {
    cond_wait( &_cond, &_mutex );
  }

  _nreaders++;

  mutex_unlock( &_mutex );
}

void MReadNWriteLock::writeLock()
{
  mutex_lock( &_mutex );

  // Wait if there are readers or the maximum
  // number of writers has been reached
  while ( _nreaders > 0 || _nwriters >= _maxwriters ) {
    cond_wait( &_cond, &_mutex );
  }

  _nwriters++;

  mutex_unlock( &_mutex );
}

void MReadNWriteLock::readUnlock()
{
  mutex_lock( &_mutex );

  _nreaders--;

  // Wake up any readers/writers waiting
  cond_broadcast( &_cond );

  mutex_unlock( &_mutex );
}

void MReadNWriteLock::writeUnlock()
{
  mutex_lock( &_mutex );

  _nwriters--;

  // Wake up any readers/writers waiting
  cond_broadcast( &_cond );

  mutex_unlock( &_mutex );
}
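For comparison, a POSIX sketch of the same class (pthread_cond_wait() takes the mutex lock explicitly as a parameter; the readers() accessor is only a test helper, not part of the exam interface):

```cpp
#include <pthread.h>

class MReadNWriteLock {
  int _nreaders, _nwriters;      // current readers and writers
  int _maxreaders, _maxwriters;  // maximum readers and writers
  pthread_mutex_t _mutex;
  pthread_cond_t _cond;
public:
  MReadNWriteLock( int m, int n )
    : _nreaders( 0 ), _nwriters( 0 ), _maxreaders( m ), _maxwriters( n )
  {
    pthread_mutex_init( &_mutex, nullptr );
    pthread_cond_init( &_cond, nullptr );
  }
  void readLock() {
    pthread_mutex_lock( &_mutex );
    while ( _nwriters > 0 || _nreaders >= _maxreaders ) {
      pthread_cond_wait( &_cond, &_mutex );
    }
    _nreaders++;
    pthread_mutex_unlock( &_mutex );
  }
  void writeLock() {
    pthread_mutex_lock( &_mutex );
    while ( _nreaders > 0 || _nwriters >= _maxwriters ) {
      pthread_cond_wait( &_cond, &_mutex );
    }
    _nwriters++;
    pthread_mutex_unlock( &_mutex );
  }
  void readUnlock() {
    pthread_mutex_lock( &_mutex );
    _nreaders--;
    pthread_cond_broadcast( &_cond );  // wake any waiting readers/writers
    pthread_mutex_unlock( &_mutex );
  }
  void writeUnlock() {
    pthread_mutex_lock( &_mutex );
    _nwriters--;
    pthread_cond_broadcast( &_cond );
    pthread_mutex_unlock( &_mutex );
  }
  int readers() { return _nreaders; }  // test helper only
};
```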