STAT59800-JN1/CS59000-030 Spring 2010 Tuesday-Thursday 3:00-4:15 REC 309
Professor Jennifer Neville
Lawson 2142D neville[at]cs.purdue.edu 496-9387
Office hours: By appointment (arrange by email)
Many modern data analysis problems involve large data sets of artificial, social, and biological networks that can be represented as graphs. In these settings, traditional IID assumptions are inappropriate; the analyses must take into account the structure of relationships between the data instances. As a result, there has been an increasing amount of research on techniques for incorporating network and graph structures into machine learning and statistics.
Network modeling is an active area of research in several domains. Statisticians have mostly concentrated on models of static networks, which focus on predicting the existence of edges between individual nodes, and do not attempt to model aggregate properties of the graph. In contrast, physicists have developed techniques to model global properties of large complex networks. Their models describe average statistics of the network and focus less on the individual links between particular nodes.
This course will provide an introduction to probabilistic methods for network analysis, paying special attention to model design and computational issues of learning and inference. We will survey statistical network modeling research in multiple communities, including statistics, computer science, and physics.
Classes will consist of instructor presentations, student presentations, and group discussions. Students will be required to (1) read, discuss, and present research papers, and (2) complete a semester-long class project. Potential projects include a survey paper on research in a subtopic of interest, an empirical investigation of the performance of graph generation algorithms, an analysis of real-world data to determine local and global network characteristics, or the design and implementation of a new network model or algorithm.
Mathematical maturity and an introductory Statistics course (e.g., STAT416/511/516).
Readings from the current research literature, 1-2 papers per class. See course schedule.
Kolaczyk, Eric D. (2009). Statistical Analysis of Network Data. Springer. (available for download from Purdue library.)
Easley, D. and J. Kleinberg (2010). Networks, Crowds, and Markets. Cambridge University Press. (available online.)
Jackson, M. (2008). Social and Economic Networks. Princeton University Press.