Overview of the EMIST Tool Suite from Purdue University




1 Introduction

In network simulators, it is easy to create a topology, assign tasks to nodes, and monitor every single packet. A basic testbed -- without any software support that mirrors some of these simulator capabilities -- is extremely limited in its usefulness, since it requires the experimenters to be experts in system-level programming. Achieving a level of control on physical testbed machines that is comparable to that provided by a simulator is a significant undertaking, requiring extensive utility development. Emulation testbeds, such as Emulab and DETER, provide topology creation capabilities, but an experimenter only acquires bare machines that form the desired topology, without any tools running on them.

A natural approach to describe tasks that must be performed on the testbed nodes is to use event scripts, much like events in an event-driven simulator. The Emulab software implements certain event types such as link failures; however, most of the interaction with the nodes must be performed via a secure shell (SSH) session. We have designed a flexible mechanism to control all test machines from a central location, since manually operating each computer is impractical, especially when timed events are involved. We have developed a utility, which we refer to as the Scriptable Event System (SES), that parses a script of timed events and executes it on the test machines. Our utility is capable of receiving callbacks, so that event synchronization can be achieved.

Instrumentation and measurement on a testbed also pose a significant challenge. The capability to log and correlate different types of activities and events in the test network is essential. Packet traces are important, but system statistics must be measured as well. We have developed a set of tools to log events on the test nodes on a per-second basis. Statistics such as CPU utilization, packets per second, and memory utilization are logged to the local disk for later inspection.

2 Tools

Our tools run on Linux, except for the Scriptable Event System, which can easily run on other systems such as FreeBSD.

2.1 Scriptable Event System (SES)

The SES is composed of a master server and zombies/clients.

MASTER

The master server must be started before the zombies can attach to it. The only customization currently supported by the master is the port on which it opens a socket (20666 is the default).

The master can reside either on users.emulab.net or on some experiment node as long as the control network is used to make sure that messages always reach the participating zombies.

Master control
exit - Exit the server and cleanly destroy all the threads.
list - Show all the available zombies.

t "pause" - Block processing of new commands for the given time period.
t name1..nameN "cmd" - Have zombies name1 through nameN perform a command cmd, in "t" seconds from now.
t name1..nameN "cmd!" - Same as the above, except the input of new commands is blocked until the "cmd" finishes on all of the specified zombies.
t name1..nameN "stop" - Stop all the tasks that are running on the specified zombies. Kills everything in the zombie process group except the zombie (i.e., any processes started from shell scripts will die as well).
^C - Stop all tasks for all the zombies and also purge all tasks from the input stream. Sometimes ^C has to be hit a few times to ensure that all tasks are purged and callbacks are terminated.
run cmd - Run a command and use its output as the input of the server. Thus, a PERL script can have loops and conditionals and just print master server commands which will get executed by the master server.

Note 1: "t" is always an offset relative to the current moment in time; however, events that pause input (pause, !) cause the current time to advance.

Example 1:

5 "pause"
1 node1 "ls > ls.out"


node1 will execute the command "ls > ls.out" 6 seconds after the script starts.

Example 2:

0 node2 "run_some_test!"
0 node3 "copy_measurements"


node3 will execute the command after the command on node2 finishes. If the command takes 500 seconds, then node3 will execute its command at the 500 second mark.
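
The timing rules in Note 1 and the two examples can be sketched in a few lines of Python. This is only an illustrative model, not the SES implementation: the function name schedule and the durations argument are hypothetical, the running time of a blocking command has to be supplied by the caller, and the sketch assumes that non-blocking commands do not advance the current time.

```python
import shlex


def schedule(script, durations=None):
    """Return (start_time, zombie, command) tuples for an SES-style script.

    durations is a hypothetical helper argument: it maps a blocking
    command (without the trailing "!") to how long it runs, in seconds.
    """
    durations = durations or {}
    now = 0.0  # the "current moment" that every offset is relative to
    plan = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        fields = shlex.split(line)  # keeps the quoted command as one field
        offset = float(fields[0])
        names, cmd = fields[1:-1], fields[-1]
        if cmd == "pause" and not names:
            now += offset  # a pause advances the current time
            continue
        start = now + offset
        blocking = cmd.endswith("!")
        if blocking:
            cmd = cmd[:-1]
        for name in names:
            plan.append((start, name, cmd))
        if blocking:
            # Input is blocked until the command finishes (assumed duration).
            now = start + durations.get(cmd, 0.0)
    return plan
```

Under these assumptions, Example 1 yields a start time of 6 seconds for node1, and Example 2 (with a 500 second duration supplied for run_some_test) yields a start time of 500 seconds for node3.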

Note 2: A node can schedule more than one task at a time. The tasks will be executed in the order they are encountered in the script.
0 node1 "ls > ls.out"
0 node1 "ls -la > la.out"


"ls > ls.out" will be executed before "ls -la > la.out"; the tasks will be executed back to back in the same scheduling round.

CLIENT

The client is similar to a regular shell except it does not support pipes or fancy things such as pattern matching. A client can start multiple tasks. It then periodically polls which ones are still active in order to maintain its task list. The client has to be provided with the name of the machine that runs the master server. Optionally, the port number can be specified if the master is not running at the default port.

For convenience, the user can use the "-nt" command argument to reduce the length of the zombie name reported to the server (e.g., node1.proj.group would be reported as node1 to the master server).

2.2 Measurement Tools

tmeas - This tool records a number of system-level statistics. The tool measures data on all of the interfaces that have a 10.x.x.x address. If a different address range is used, then MATCH_ADDR in tmeas.c can be modified. The tool expects the user to provide the name of the file where logging must occur. (Note: always log on local disk and do not use the NFS while the experiment is running.) It is possible to specify the duration of the run in seconds, or the tool can simply be killed (i.e., by sending a stop command via the SES).

The file fields are as follows:

timestamp,
bytes_per_sec,
pack_per_sec,
bytes_per_sec_up,
pack_per_sec_up,
memtotal,
memused,
uptime,
idletime,
established TCP connections,
half_open TCP connections,
TCPSlowStartRetrans count,
TCPAbortOnTimeout count,
errs on the device drivers,
drops on the device drivers
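
As an illustration of how such a log might be consumed, the sketch below parses one record into named fields and derives CPU utilization from two consecutive records. Several things here are assumptions rather than facts about tmeas: that each record is one line of numeric values separated by whitespace or commas in the order listed above, that uptime and idletime are cumulative seconds (as in Linux /proc/uptime) on a single-CPU node, and the Python field names, which are shortened forms of the descriptions above.

```python
# Field names follow the list above; the last six are hypothetical
# identifiers chosen for this sketch.
FIELDS = [
    "timestamp", "bytes_per_sec", "pack_per_sec", "bytes_per_sec_up",
    "pack_per_sec_up", "memtotal", "memused", "uptime", "idletime",
    "tcp_established", "tcp_half_open", "tcp_slow_start_retrans",
    "tcp_abort_on_timeout", "dev_errs", "dev_drops",
]


def parse_record(line):
    """Split one log line into a dict of floats keyed by field name."""
    values = line.replace(",", " ").split()
    if len(values) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(values)))
    return dict(zip(FIELDS, map(float, values)))


def cpu_utilization(prev, curr):
    """Fraction of the interval between two records spent non-idle.

    Returns None when no time has elapsed between the records.
    """
    elapsed = curr["uptime"] - prev["uptime"]
    if elapsed <= 0:
        return None
    idle = curr["idletime"] - prev["idletime"]
    # Clamp to [0, 1] to absorb rounding in the counters.
    return max(0.0, min(1.0, 1.0 - idle / elapsed))
```

If the actual file layout differs (for example, a different separator), only parse_record would need to change.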


cwnd_track - This is loosely based on tmeas. The purpose of the tool is pretty limited in its current form. The main goal is to poll TCP congestion window (Cwnd) values for a given IP address. If there is no connection to the provided IP address, the tool waits and logs nothing. Once the connection appears, the tool logs the value along with the time stamp.

2.3 Data Analysis Scripts

We have written a set of scripts that are helpful in analyzing data from BGP log files and the "tmeas" tool. Since "tmeas" collects a lot of system-dependent data per node in a single file, it is essential to be able to merge similar statistics for several nodes. The resulting merged file can be easily fed into gnuplot or other plotting tools.

The scripts are short and can be easily examined. A short overview is given below.

dataPlot.sh is the top-level script that needs to be executed. The user has to specify the directory that contains the measurement files. tmeas files need to be named tmeas.nodeX. The script dataPlot.sh can be modified to specify which nodes the user wants to merge and plot. If there are any BGP log files (dataPlot.sh assumes that *.log is a BGP log file), then those files will be aggregated and the total number of BGP update messages will be plotted as time progresses. The outputs of the scripts are gnuplot-generated png or ps files.
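
The idea of merging one statistic from several nodes into a single plottable file can be sketched as follows. This is not how dataPlot.sh is implemented; the function merge_statistic, its input layout (node name mapped to a list of (timestamp, value) pairs), and the placeholder used for missing samples are all assumptions made for illustration.

```python
def merge_statistic(series, missing="NaN"):
    """Merge per-node samples of one statistic into column-oriented rows.

    series maps a node name to a list of (timestamp, value) pairs.
    The result has one row per timestamp and one column per node.
    """
    nodes = sorted(series)
    lookup = {node: dict(series[node]) for node in nodes}
    timestamps = sorted({t for node in nodes for t in lookup[node]})
    rows = ["# timestamp " + " ".join(nodes)]  # header as a comment line
    for t in timestamps:
        # Nodes without a sample at this timestamp get the placeholder.
        cells = [str(lookup[node].get(t, missing)) for node in nodes]
        rows.append(" ".join([str(t)] + cells))
    return rows
```

Writing the returned rows to a file gives a whitespace-separated table in which each node occupies its own column, which is the kind of layout plotting tools generally accept.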

3 Example

The example script below demonstrates how to automate experiments with the SES. The user must first launch the master server while the experiment is swapping in, so that the clients can establish their connections. After all of the connections are established, the script can be executed. The following are the steps that the script below executes:

1. Start taking system measurements (node0, node2, node3, r1, r2)
2. Create a TCP sink at node2
3. Run tcpdump on node0 and node2
4. Create a TCP-targeted (square wave) attack in the direction of the receiver
5. Copy a 10 MB file to the TCP sink from node0 and log the output of the file transfer.
6. Wait until the file transfer is complete
7. Stop the attack and logging
8. Copy the dump files, file transfer results, and system measurements to a central location


The high level tasks above can be expressed as the following script. If one wishes to repeat the experiment many times or vary the attack parameters, a PERL script can be written to print the statements below and perform string substitution depending on the iteration. You can use the run command in SES for this.

0 node0 node2 node3 r1 r2 "./tmeas -f /usr/local/tmeas.out"
0 node2 "/usr/bin/ttcp -r > /dev/null"
0 node0 node2 "rm /usr/local/dump.dmp"
0 node0 node2 "sh /proj/DDoSImpact/exp/bell/scripts/dump.sh"
5 node1 "./flood node3 -U -s100 -S10.1.4.1 -W160-4500 -D80000"
9 node0 "/usr/bin/ttcp -v -t node2 < /usr/local/f10m >/usr/local/ttcp.out!"
0 node0 node1 node2 node3 r1 r2 "stop"
0 node0 node2 r1 r2 "killall tcpdump"
1 "pause"
0 node0 "cp /usr/local/dump.dmp /proj/DDoSImpact/exp/bell/data/dump.node0"
0 node2 "cp /usr/local/dump.dmp /proj/DDoSImpact/exp/bell/data/dump.node2"
0 node0 "cp /usr/local/ttcp.out /proj/DDoSImpact/exp/bell/data"
0 node0 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.node0"
0 node3 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.node3"
0 node1 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.node1"
0 node2 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.node2"
0 r1 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.r1"
0 r2 "cp /usr/local/tmeas.out /proj/DDoSImpact/exp/bell/data/tmeas.out.r2"
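
The generator approach described before the script could be sketched in Python instead of Perl as shown below. The sketch prints an abbreviated version of the script once per run, substituting the attack arguments and a per-run output name. The function experiment_script, the RUNS constant, and the ".runN" file suffix are hypothetical, and the flood arguments are copied unchanged from the script above and treated as an opaque string.

```python
RUNS = 3  # hypothetical number of repetitions
DATA = "/proj/DDoSImpact/exp/bell/data"
FLOOD_ARGS = "-U -s100 -S10.1.4.1 -W160-4500 -D80000"  # from the script above


def experiment_script(run_id, flood_args):
    """Return master-server commands for one run (abbreviated script)."""
    return [
        '0 node0 node2 node3 r1 r2 "./tmeas -f /usr/local/tmeas.out"',
        '0 node2 "/usr/bin/ttcp -r > /dev/null"',
        f'5 node1 "./flood node3 {flood_args}"',
        # The trailing "!" blocks input until the transfer completes.
        '9 node0 "/usr/bin/ttcp -v -t node2 < /usr/local/f10m >/usr/local/ttcp.out!"',
        '0 node0 node1 node2 node3 r1 r2 "stop"',
        '1 "pause"',
        # The ".runN" suffix is a hypothetical naming scheme.
        f'0 node0 "cp /usr/local/ttcp.out {DATA}/ttcp.out.run{run_id}"',
    ]


if __name__ == "__main__":
    for run_id in range(1, RUNS + 1):
        for line in experiment_script(run_id, FLOOD_ARGS):
            print(line)
```

Saved under a name of the user's choosing, such a generator would be started from the master with the run command, so that its printed output becomes the input of the server. To vary the attack instead of repeating it, a different argument string can be passed to experiment_script on each iteration.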

4 Software Link Monitor


Created by: Roman Chertov
Last updated by: Sonia Fahmy
June 2nd, 2005