Julian James Stephen
    Purdue University
    Department of Computer Science
    West Lafayette, IN

    2016
  • J. Stephen, S. Savvides, V. Sundaram, M. Ardekani and P. Eugster
    STYX: Stream Processing with Trustworthy Cloud-based Execution
    ACM Symposium on Cloud Computing (SoCC 2016).
  • P. Eugster, C. Jayalath, K. Kogan, and J. Stephen
    Big Data Analytics beyond the Datacenter
    IEEE Computer, to appear.
  • W. Culhane, P. Eugster, C. Jayalath, K. Kogan, and J. Stephen
    Cloud Federation and Geo-distribution
    Cloud Computing Encyclopedia, Wiley, pp. 265-279, 2016.
    2015
  • J. Stephen, D. Gmach, R. Block, A. Madan and A. AuYoung
    Distributed Real-Time Event Analysis
    12th IEEE International Conference on Autonomic Computing (ICAC 2015).
    2014
  • J. Stephen, S. Savvides, R. Seidel, and P. Eugster
    Program Analysis for Secure Big Data Processing
    29th IEEE/ACM International Conference on Automated Software Engineering (ASE 2014).
  • J. Stephen, S. Savvides, R. Seidel, and P. Eugster
    Practical Confidentiality Preserving Big Data Analysis
    6th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '14).
  • C. Jayalath, J. Stephen, and P. Eugster
    Universal Cross-Cloud Communication
    IEEE Transactions on Cloud Computing (TCC), 2(2): 103-116, April-June 2014.
  • C. Jayalath, J. Stephen, and P. Eugster
    From the Clouds to the Atmosphere: Running MapReduce across Datacenters
    IEEE Transactions on Computers (TC), 63(1): 74-87, 2014.
    2013
  • J. Stephen and P. Eugster
    Assured Cloud-based Data Analysis with ClusterBFT
    14th ACM/IFIP/USENIX International Middleware Conference (Middleware 2013).
  • C. Jayalath, J. Stephen, and P. Eugster
    Atmosphere: A Universal Cross-Cloud Communication Infrastructure
    14th ACM/IFIP/USENIX International Middleware Conference (Middleware 2013).

Education

  • Ph.D. in Computer Science (in progress). Purdue University, IN
  • MS in Computer Science (2015). Purdue University, IN
  • BTech in Computer Science (2004). Mar Athanasius College of Engineering, India

Employment

  • IBM Thomas J. Watson Research Staff - Security & Privacy, Jan 2017 (to join)
  • Purdue University. Research Assistant, Teaching Assistant for CS 180, 2010-2012
  • HP Labs. Research Assistant Intern, Summer 2013 and Summer 2014
  • Oracle. Applications Engineer 2006 - 2010
  • Infosys. Software Engineer 2004 - 2006

Awards

  • Maurice H. Halstead Memorial award for outstanding research in software engineering
    Department of Computer Science, Purdue University. Spring ’15
  • Raymond Boyce Graduate Teaching award
    Purdue University, Fall ’15
  • Spot award, selected by peers
    Oracle, Jan ’08
  • Employee award
    Oracle, Qtr 2 FY06
  • Spot award for inducing positive turnaround on client critical module
    Infosys, ’05

Secure Big Data Processing

  • Background and motivation: Homomorphic encryption schemes allow specific operations to be performed directly on encrypted data. Although fully homomorphic encryption is still too costly for building practical systems, partially homomorphic encryption has been leveraged to build database systems that support SQL queries (CryptDB). While such advances in cryptographic techniques make it possible to compute directly on encrypted data, programmer-friendly and efficient ways of writing distributed data analysis jobs over large data sets are still missing.
  • Crypsis: In this project we explore data flow analysis and program transformations for Pig Latin that automatically enable the execution of standard Pig Latin scripts on encrypted data. We avoid fully homomorphic encryption because of its prohibitively high cost; instead, we rely on partially homomorphic encryption and a minimal set of computations done by the client. Depending on the user program, we generate multiple encryptions of the same field to support different operations, allow computations to finish on the client side, and perform re-encryptions. On average, our system incurs a 3 times overhead compared to the same data analysis job on plaintext.
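
    To make "operations directly on encrypted data" concrete, here is a minimal sketch of an additively homomorphic scheme (Paillier), in which multiplying two ciphertexts yields an encryption of the sum of their plaintexts. This is an illustration only: the choice of Paillier, the toy primes, and all function names are assumptions made for this example, not a description of the Crypsis implementation. The key size is far too small for real use.

```python
import math
import secrets


def keygen(p=1789, q=1861):
    """Toy Paillier key generation (assumed small primes, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                   # standard simple choice of generator
    mu = pow(lam, -1, n)        # modular inverse, needs Python 3.8+
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n under the public key."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    l_value = (pow(c, lam, n * n) - 1) // n
    return (l_value * mu) % n


def add_encrypted(pub, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    n, _ = pub
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    pub, priv = keygen()
    c_sum = add_encrypted(pub, encrypt(pub, 1200), encrypt(pub, 345))
    # The party holding only `pub` never sees 1200, 345, or their sum.
    print(decrypt(pub, priv, c_sum))
```

    A scheme like this supports a SUM over an encrypted field but not, for example, comparisons or multiplication of two encrypted values, which is one reason a system might keep several encryptions of the same field for different operations.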

Distributed Real-Time Event Analysis

    Security Information and Event Management (SIEM) systems perform complex event processing over a large number of event streams at a high rate. As event streams increase in volume and event processing becomes more complex, traditional approaches such as scaling up to more powerful systems quickly become ineffective. We designed and implemented DRES, a distributed, rule-based event evaluation system that can easily scale to process a large volume of non-trivial events. DRES intelligently forwards events across a cluster of nodes to evaluate complex correlation and aggregation rules. This approach enables DRES to work with any rules engine implementation. Our evaluation shows that DRES scales linearly to more than 16 nodes, successfully processing more than half a million events per second.
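
    One generic way to spread rule evaluation over a cluster is to partition events by a correlation key, so that all events a rule must see together reach the same node. The sketch below illustrates that idea only; it is not the DRES forwarding logic, and the event fields (`src_ip`, `type`), the sample rule, and the function names are all hypothetical.

```python
import hashlib
from collections import defaultdict


def node_for(key, num_nodes):
    """Map a correlation key to a node index with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def route(events, num_nodes, key_field="src_ip"):
    """Group events by destination node (hypothetical key field)."""
    per_node = defaultdict(list)
    for event in events:
        per_node[node_for(event[key_field], num_nodes)].append(event)
    return per_node


def failed_login_alerts(events, threshold):
    """Example aggregation rule: sources with >= threshold failed logins."""
    counts = defaultdict(int)
    for event in events:
        if event["type"] == "login_failed":
            counts[event["src_ip"]] += 1
    return {src for src, count in counts.items() if count >= threshold}


if __name__ == "__main__":
    events = (
        [{"src_ip": "10.0.0.1", "type": "login_failed"}] * 3
        + [{"src_ip": "10.0.0.2", "type": "login_failed"}]
        + [{"src_ip": "10.0.0.3", "type": "login_ok"}]
    )
    alerts = set()
    for node_events in route(events, num_nodes=4).values():
        # Each node evaluates the rule locally on its own partition.
        alerts |= failed_login_alerts(node_events, threshold=3)
    print(alerts)
```

    Because every event for a given source lands on one node, the union of the per-node results matches what a single rules engine would report over the full stream, while each node sees only a fraction of the traffic.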