CS 59000 Security Analytics: Python Tutorial

Instructor: Prof. Ninghui Li

TA: Wuwei Zhang

In this course we will use Python for the programming assignments. You can download it from https://www.python.org/downloads. We recommend Python 3. You must inform your TA if you use an earlier version (such as 2.7) in an assignment.

We expect you to meet the prerequisites of the course. You should know how to program in at least one language and be familiar with the data structures and algorithms taught in CS 251 and CS 381.

IDE: If you are looking for an IDE, we recommend PyCharm Community Edition: https://www.jetbrains.com/pycharm/download/. Please do not download the education version; it is almost identical to a plain text editor.

Python Syntax

Comments

We use # for single-line comments and triple-quoted strings (''' or """) for multi-line comments.

In [1]:
# Example 1: This is a comment
'''
This is how we use multi-line comments. 
print(1)
1 will not be printed
2 will not be printed
3 will be printed
'''
# print(2) this should not be printed
print(3) 
3
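
Strictly speaking, a triple-quoted string is a string literal rather than a comment: Python evaluates it and discards the value when it is not assigned to anything. The short sketch below (the function name add is made up for this illustration) shows one consequence: a triple-quoted string placed first in a function body is stored as the function's docstring, while a # comment leaves no trace at run time.

```python
def add(a, b):
    """Return the sum of a and b."""
    # This is a real comment; the interpreter ignores it entirely.
    return a + b

print(add.__doc__)   # the docstring is still available at run time
print(add(1, 2))
```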

Indentation

Leading whitespace (spaces and tabs) at the beginning of a logical line is used to compute the indentation level of the line, which in turn is used to determine the grouping of statements.

In [22]:
'''
Example 2: Indentation.
Notice that spaces and tabs are DIFFERENT. 4 consecutive spaces look identical to a single tab to a human, but 
they are different characters to Python. To make your life easier, we recommend you always enable auto-indentation if possible.
''' 
def thisIsAFunction():
    a = 1
    print(a)

thisIsAFunction() # What if we comment out this line?
1
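
To see how indentation determines grouping, consider the minimal sketch below (both function names are hypothetical). The two functions differ only in the indentation of one line, which decides whether that line runs on every iteration or once after the loop has finished.

```python
def inside_loop():
    items = []
    for i in range(3):
        items.append(i)
        items.append("tick")   # indented under the for: runs on every iteration
    return items

def after_loop():
    items = []
    for i in range(3):
        items.append(i)
    items.append("tick")       # aligned with the for: runs once, after the loop
    return items

print(inside_loop())
print(after_loop())
```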

If and Loop Statements

Like other programming languages, Python supports if statements and loops.

In [30]:
'''
Example 3: How to use if and loop statements
'''
def checkSign(i):
    if i > 0:  
        return "+"
    elif i == 0:  # elif stands for "else if".
        return "0"
    else:
        return "-"

def forLoop():
    print("This is the result from FOR loop:")
    for i in range(-10,10,2): 
        # A for loop with interval [-10,10) with increment 2. 
        print(str(i) +":" + checkSign(i))

def whileLoop():
    print("This is the result from WHILE loop:")
    i = -10
    while i < 10:
        print(str(i) +":" + checkSign(i))
        i += 2
        # Note: Python has no "++" operator; use += 1 instead.

    
forLoop()
whileLoop()
    
This is the result from FOR loop:
-10:-
-8:-
-6:-
-4:-
-2:-
0:0
2:+
4:+
6:+
8:+
This is the result from WHILE loop:
-10:-
-8:-
-6:-
-4:-
-2:-
0:0
2:+
4:+
6:+
8:+
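
As a supplementary sketch of the range(start, stop, step) call used in Example 3, the snippet below materializes a few ranges as lists so the half-open interval is easy to inspect. The variable names are illustrative, and enumerate is a standard built-in that the example above does not use.

```python
# range(start, stop, step) includes start and excludes stop
evens = list(range(-10, 10, 2))
print(evens)

# A negative step counts down; stop is still excluded
countdown = list(range(3, 0, -1))
print(countdown)

# enumerate pairs each item with its index, avoiding a manual counter
for index, value in enumerate(["a", "b", "c"]):
    print(index, value)
```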

Operators

Python writes logical operators as plain words (and, or, not) and allows comparisons to be chained. These features make code considerably more readable.

In [35]:
'''
    Example 4 Operators: How to use different operators in Python.
    In this example, we have two functions with the same functionalities: checking if an integer i is in a specific 
    range: (low, high).   
'''

def classicExpression(i, low, high):
    # Use "and" for &&, "or" for ||, and "not" for ! to improve readability
    if i < high and i > low: 
        print("yes")
    else:
        print ("no")

def newExpression(i, low, high):
    # A more readable expression. 
    if low < i < high:
        print ("yes")
    else:
        print ("no")

classicExpression(5,0,6)
newExpression(5,0,6)
    
yes
yes
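
A chained comparison is an ordinary boolean expression, so it can also be returned directly instead of being wrapped in if/else. The sketch below assumes a hypothetical helper named in_range and also shows the word-style operators in and not in, which Example 4 does not cover.

```python
def in_range(i, low, high):
    # The chained comparison is itself a bool, so it can be returned directly.
    return low < i < high

print(in_range(5, 0, 6))
print(in_range(6, 0, 6))   # both bounds are exclusive

vowels = "aeiou"
print("e" in vowels)       # membership test
print("z" not in vowels)   # negated membership test
print(not in_range(5, 0, 6) or in_range(1, 0, 2))
```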

Import Libraries

To import a library, use the import statement. You should always import libraries at the beginning of the file. You can use "import ... as ..." as a shorthand. The basic form of the import statement binds the module name in the local namespace to the module object, and then goes on to import the next identifier, if any. If the module name is followed by as, the name following as is used as the local name for the module.

In [34]:
# Example 5 
import matplotlib.pyplot as plt # Import a library for plotting figures
import sys # For system-specific parameters and functions

plt.xlim(0,2)   # Set the x-axis range to [0,2]. This is identical to matplotlib.pyplot.xlim(0,2)
a = sys.argv[1] # The first command-line argument is assigned to a (sys.argv[0] is the script name).
Out[34]:
(0, 2)
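
The name binding described above can be checked with a module from the standard library, so nothing needs to be installed. This is only a sketch; the from ... import ... form is an additional variant that the text above does not discuss.

```python
import math                  # binds the name "math" to the module object
import math as m             # binds the same module object to the name "m"
from math import sqrt        # binds only the function "sqrt"

print(math.sqrt(16))
print(m is math)             # both names refer to one module object
print(sqrt is math.sqrt)     # the imported function is the module's function
```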

Some Useful Libraries:

SciPy and NumPy are commonly used for scientific computing. If you are not using an IDE, you can download these libraries from the following URLs:

NumPy: https://www.numpy.org

NumPy is the fundamental package for scientific computing with Python. It contains among other things:

- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities
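
A minimal sketch of three of the features listed above, assuming NumPy is installed. The array values are arbitrary and chosen only for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])      # a 2-D array with 2 rows and 3 columns
print(a.shape)

# Broadcasting: the 1-D row is added to every row of a
row = np.array([10, 20, 30])
b = a + row
print(b)

# Linear algebra: solve the system M x = v
M = np.array([[2.0, 0.0],
              [0.0, 4.0]])
v = np.array([2.0, 8.0])
x = np.linalg.solve(M, v)
print(x)
```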

SciPy: https://www.scipy.org

SciPy contains additional routines needed in scientific work: for example, routines for computing integrals numerically, solving differential equations, optimization, and sparse matrices.
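
As a rough illustration of two of these routines, the sketch below integrates a function numerically and minimizes another. It assumes SciPy is installed; the functions being integrated and minimized are arbitrary examples.

```python
from scipy import integrate, optimize

# Numerically integrate x^2 over [0, 1]; the exact value is 1/3.
value, error_estimate = integrate.quad(lambda x: x ** 2, 0.0, 1.0)
print(value)

# Minimize (x - 2)^2; the true minimum is at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)
print(result.x)
```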

Matplotlib: http://matplotlib.org

Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. Matplotlib can be used in Python scripts, the Python and IPython shells, the Jupyter notebook, web application servers, and four graphical user interface toolkits.

In [59]:
"""
========
Barchart
========

A bar plot with errorbars and height labels on individual bars
"""
import numpy as np
import matplotlib.pyplot as plt

N = 5
men_means = (20, 35, 30, 35, 27)
men_std = (2, 3, 4, 1, 2)

ind = np.arange(N)  # the x locations for the groups
width = 0.35       # the width of the bars

fig, ax = plt.subplots()
rects1 = ax.bar(ind, men_means, width, color='r', yerr=men_std)

women_means = (25, 32, 34, 20, 25)
women_std = (3, 5, 2, 3, 3)
rects2 = ax.bar(ind + width, women_means, width, color='y', yerr=women_std)

# add some text for labels, title and axes ticks
ax.set_ylabel('Scores')
ax.set_title('Scores by group and gender')
ax.set_xticks(ind + width / 2)
ax.set_xticklabels(('G1', 'G2', 'G3', 'G4', 'G5'))

ax.legend((rects1[0], rects2[0]), ('Men', 'Women'))


def autolabel(rects):
    """
    Attach a text label above each bar displaying its height
    """
    for rect in rects:
        height = rect.get_height()
        ax.text(rect.get_x() + rect.get_width()/2., 1.05*height,
                '%d' % int(height),
                ha='center', va='bottom')

autolabel(rects1)
autolabel(rects2)

plt.show()

File IO and Object-Oriented Programming

In this section, we cover file IO and OOP. We assume you already know these concepts from at least one programming language, so we cover only the syntax. File IO is essential for the programming assignments and will be used throughout the semester. OOP is recommended but not required.

In [46]:
# Example 6: File IO
# Two commonly used data structures, [] and {}, are covered as well.

inputFile = "testfile.txt" 
wordList = [] # List data structure; it is similar to ArrayList in Java
wordDict = {} # Dictionary data structure; it is similar to a hash table
with open(inputFile) as f: # f is a file object
    lines = f.readlines()  # read the whole file into a list of lines
    for l in lines:  
        print(l)            # print each line (l still ends with "\n")
        for word in l.split(" "): # split each line on single spaces
            print ("   " + word)
            wordList.append(word) # append this word to the end of wordList
            if word in wordDict:
                wordDict[word] +=1
            else:
                wordDict[word] = 1

'''
    The with statement closes the file automatically when its block ends,
    so this explicit close() is redundant, though harmless.
  '''
f.close() 

outputFile = "outputfile.txt"
This is a file

   This
   is
   a
   file

Words should be splited correctly by space.

   Words
   should
   be
   splited
   correctly
   by
   space.

aaaa;;;bbb bbb!aa}

   aaaa;;;bbb
   bbb!aa}

words and Words are different.
   words
   and
   Words
   are
   different.
In [49]:
# Continuation of the above example
outputFile = "doesnotexist.txt"
with open (outputFile, 'w+') as f: # 'w+' creates the file if it does not exist and truncates it if it does
    for word in wordList:
        f.write(word + " ")
    f.write("\n")
    for word in wordDict:
        f.write(word + ":" + str(wordDict[word]) + "\n")
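
In Example 6, split(" ") leaves the trailing newline attached to the last word of each line, which is why blank lines appear in the printed output. A possible alternative, sketched below, uses split() with no argument (which splits on any run of whitespace) together with collections.Counter from the standard library. The temporary file and its contents are made up so that the sketch is self-contained.

```python
import os
import tempfile
from collections import Counter

# Write a small sample file so the sketch does not depend on testfile.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("This is a file\nwords and Words are different.\n")
    path = tmp.name

counts = Counter()
with open(path) as f:
    for line in f:                   # iterate over the file one line at a time
        counts.update(line.split())  # split() also drops the trailing newline

os.remove(path)                      # clean up the temporary file
print(counts["file"])
print(counts["words"], counts["Words"])
```

Counter is a dict subclass, so it can be written to a file with the same loop as wordDict.
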
In [58]:
# Example 7 Object Oriented Programming:

# A global variable listing all computer science courses at XY University
Courses = ["502","503","526","555","580","590SA"]

# Use the keyword class to define a class


class GraduateStudent: 
    
    # Class-level defaults; __init__ below rebinds each name per instance
    advisor = None
    year = 0
    courseTaken = {}
    # A simple constructor
    def __init__(self, advisor, year):
        self.advisor = advisor   # store the arguments instead of discarding them
        self.year = year
        self.courseTaken  = {}
        for c in Courses:
            self.courseTaken[c] = False
           
    def takeCourse(self, course):
        if course in Courses:
            self.courseTaken[course] = True
        else:
            print("wrong course number")
    
    def setAdvisor(self, advisor):
        self.advisor = advisor

def main():
    studentA = GraduateStudent(None, 1)
    studentA.takeCourse("590SA")
    studentA.takeCourse("590")
    studentA.takeCourse("502")
    print(studentA.courseTaken)

main()
wrong course number
{'502': True, '503': False, '526': False, '555': False, '580': False, '590SA': True}
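
Example 7 declares courseTaken = {} at class level and then assigns a fresh dictionary to self.courseTaken in the constructor. The sketch below, with two hypothetical classes, suggests why that second assignment matters: a mutable object stored only at class level is shared by every instance.

```python
class SharedBag:
    items = []                 # class attribute: one list shared by all instances

    def add(self, item):
        self.items.append(item)

class OwnBag:
    def __init__(self):
        self.items = []        # instance attribute: a fresh list per object

    def add(self, item):
        self.items.append(item)

a, b = SharedBag(), SharedBag()
a.add("x")
print(b.items)                 # b sees the item that was added through a

c, d = OwnBag(), OwnBag()
c.add("x")
print(d.items)                 # d is unaffected by changes made through c
```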

Additional Reading

If you would like more material to read, we recommend the following resources:

  1. Coursera Python Course: https://www.coursera.org/specializations/python
  2. http://mcsp.wartburg.edu/zelle/python/ppics2/index.html This is the website of the textbook we used for the undergraduate Python course. The slides are free to download.
  3. Official Tutorial: https://docs.python.org/3/tutorial/.