===============================================================================
Trio (Stanford) - Jennifer Widom
===============================================================================

* Lineage supports uncertainty
  - representation, correlation, efficiency
  - goal for both internal and external

* Current Trio Model
  - more general than discrete or sets
  - "maybe annotations" (tuple level)
  - doesn't require confidence values
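
A minimal sketch of how "maybe" annotations and per-tuple alternatives
induce a set of possible worlds, without any confidence values. The
representation below (a list of alternatives plus a boolean maybe flag),
the function name, and the sample data are illustrative assumptions, not
Trio's actual data structures or API.

```python
from itertools import product

# Hypothetical mini-representation: each x-tuple is (alternatives, maybe),
# where `maybe` marks a tuple that may be absent altogether.
def possible_worlds(xrelation):
    choices = []
    for alternatives, maybe in xrelation:
        options = list(alternatives)
        if maybe:
            options.append(None)  # None stands for "tuple absent"
        choices.append(options)
    # One world per way of picking a single option for every x-tuple.
    for combo in product(*choices):
        yield [t for t in combo if t is not None]

# Made-up example data: one tuple with two alternatives, one maybe-tuple.
sightings = [
    ([("Amy", "Honda"), ("Amy", "Toyota")], False),
    ([("Betty", "Acura")], True),
]
worlds = list(possible_worlds(sightings))
for world in worlds:
    print(world)
```

Two alternatives times one optional tuple gives four worlds here; the
enumeration is exponential in general and is meant only to show the
semantics.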

* ULDBs: Interesting Questions
  - Extraneous alternatives (minimality)
  - Extraction (drop tables -> errors)

* Future work (research questions)
  - Uncertainty/lineage in Schemas
  - Optimizer Issues and statistics
    - alternatives per tuple?
    - cluster by attribute or verticalize?

===============================================================================
MystiQ (Washington) - Dan Suciu
===============================================================================

* Tables interpreted as events
  - configuration file (metadata)
  - point probabilities, guaranteed ranking

* Safe plans vs. Monte Carlo simulations
  - heuristics, statistics (like group size)
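
A rough sketch of the Monte Carlo side of this trade-off: estimating the
probability of an answer whose lineage is a DNF formula over independent
tuple events. This is naive sampling, not MystiQ's estimator, and the
brute-force exact routine is only a stand-in for the exact value a safe
plan would compute (it is not a safe plan). All names and numbers are
assumptions for illustration.

```python
import random
from itertools import product

def holds(dnf, world):
    # A DNF formula is a list of clauses; a clause is a tuple of event names.
    return any(all(world[e] for e in clause) for clause in dnf)

def exact_probability(dnf, probs):
    # Brute force over all truth assignments (exponential; tiny inputs only).
    names = sorted(probs)
    total = 0.0
    for values in product([False, True], repeat=len(names)):
        world = dict(zip(names, values))
        weight = 1.0
        for n in names:
            weight *= probs[n] if world[n] else 1.0 - probs[n]
        if holds(dnf, world):
            total += weight
    return total

def monte_carlo_probability(dnf, probs, samples=20000, seed=0):
    # Sample independent tuple events and count how often the formula holds.
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        world = {n: rng.random() < p for n, p in probs.items()}
        if holds(dnf, world):
            hits += 1
    return hits / samples

# Made-up lineage: (r1 AND s1) OR (r1 AND s2)
probs = {"r1": 0.5, "s1": 0.4, "s2": 0.6}
lineage = [("r1", "s1"), ("r1", "s2")]
exact = exact_probability(lineage, probs)
estimate = monte_carlo_probability(lineage, probs)
print(exact, estimate)
```

The exact value here is 0.5 * (1 - 0.6 * 0.4) = 0.38; the sampled
estimate should land close to it, with error shrinking as the number of
samples grows.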

* Applications of ProbDB
  - Fuzzy object matching: IMDB + AMZN
  - Information extraction (e.g. in IR)
  - Cleaning of sensor data

===============================================================================
Administrivia
===============================================================================

* TODO
  - proceedings page (Purdue)
    - email key participants for slides
    - see http://www-db.stanford.edu/sdt/
  - mailing list (Stanford)
  - sharing data sets
    - IMDB with prob values (Washington)
    - Yahoo! product data (Stanford)
    - Enron and Blogger (IBM)

* Results
  1. Mailing List
  2. Web Page
  3. Shared Data Sets
  4. Meet regularly

* Key Questions
  - approaches: thresholding versus ranking (e.g. top-k)
  - prob values: where they come from, how to use them
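
To make the two approaches concrete, here is a small sketch contrasting
a probability threshold with top-k ranking over the same answer set. The
answer tuples, probabilities, and function names are made up for
illustration.

```python
import heapq

def threshold(answers, tau):
    # Keep every answer whose probability is at least tau.
    return [(t, p) for t, p in answers if p >= tau]

def top_k(answers, k):
    # Keep the k most probable answers, regardless of absolute value.
    return heapq.nlargest(k, answers, key=lambda a: a[1])

# Hypothetical query answers as (tuple id, probability) pairs.
answers = [("m1", 0.9), ("m2", 0.35), ("m3", 0.6), ("m4", 0.1)]
above = threshold(answers, 0.5)
best = top_k(answers, 3)
print(above)
print(best)
```

Thresholding depends on the absolute probability values being
meaningful, while ranking only depends on their relative order, which
ties into the question of where the values come from.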

===============================================================================
Avatar (IBM Almaden) - Sriram Raghavan
===============================================================================

* Information Extraction (IE)
  - i.e. how UPDBs may solve our problems
  - Classical IE: accuracy and scalability

* Avatar IES for IE, like DBMS for data
  - declarative IE (for interactions)
  - annotation store (tags for patterns)
  - simple interface (for average user)

* Two problems (annotators make mistakes)
  - High precision -> recall problem
  - derived annotator anomaly (beta thresholds)

* Applications of UPDBs
  - Precisions for annotator rules
  - We're asking the "reverse" question

* Approach: Computing Probability Models
  - Simplified with Naive-Bayes assumption
  - input uncertainties low, outputs high
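
The notes do not record the actual model, so the following is only a
generic illustration of what a Naive-Bayes assumption buys: if several
annotator rules fire on the same span and are assumed conditionally
independent given whether the annotation is correct, their evidence
combines by simple multiplication. The function name, prior, and
likelihood values are all assumptions.

```python
def naive_bayes_posterior(prior, likelihoods):
    # likelihoods: list of pairs
    #   (P(rule fires | annotation correct),
    #    P(rule fires | annotation incorrect))
    # Conditional independence lets us multiply per-rule terms.
    p_correct = prior
    p_incorrect = 1.0 - prior
    for given_correct, given_incorrect in likelihoods:
        p_correct *= given_correct
        p_incorrect *= given_incorrect
    return p_correct / (p_correct + p_incorrect)

# Two hypothetical rules fired on the same span.
posterior = naive_bayes_posterior(0.5, [(0.8, 0.2), (0.7, 0.3)])
print(round(posterior, 4))
```

With these made-up numbers the posterior is 0.28 / 0.31, roughly 0.90,
which is higher than either rule would support on its own.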

===============================================================================
Prob Databases (Maryland) - Amol, Lise, Prithviraj
===============================================================================

* Model-based Views (MauveDB)
  - Step 1: Process data with stat model
  - Present user with inference views
  - Efficiently update in stream context

* Statistical Relational Learning
  - apply traditional models to relational data
    - answers "where do [prob] numbers come from?"
  - both attribute and structural uncertainty
    - introduce aggregates for complex relationships
    - building conditional probability trees
      - same tricks for existence uncertainty
  - Probabilistic DBs (compare and contrast)

* Arbitrarily correlated data
  - Probabilistic graphical models (from ML)
    - Boolean valued random variables
    - Factor functions over random vars
  - Similar to carrying event expressions
  - Query evaluation as inference problem
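
A minimal sketch of "query evaluation as inference": Boolean random
variables for tuple existence, factor functions over them, and a query
probability computed by brute-force enumeration. The variable names,
factor values, and helper functions are assumptions for illustration; a
real system would use a proper inference algorithm instead of
enumerating every assignment.

```python
from itertools import product

def joint_weights(variables, factors):
    # Enumerate every assignment and multiply the factor values for it.
    for values in product([False, True], repeat=len(variables)):
        world = dict(zip(variables, values))
        weight = 1.0
        for scope, fn in factors:
            weight *= fn(*(world[v] for v in scope))
        yield world, weight

def query_probability(variables, factors, predicate):
    # P(predicate) = weight of satisfying worlds / total weight.
    total = 0.0
    satisfied = 0.0
    for world, weight in joint_weights(variables, factors):
        total += weight
        if predicate(world):
            satisfied += weight
    return satisfied / total

# Hypothetical model: two tuple-existence variables that tend to agree.
variables = ["t1", "t2"]
factors = [
    (("t1",), lambda x: 0.6 if x else 0.4),
    (("t2",), lambda x: 0.5),
    (("t1", "t2"), lambda a, b: 2.0 if a == b else 1.0),
]
p_both = query_probability(variables, factors,
                           lambda w: w["t1"] and w["t2"])
print(p_both)
```

The unnormalized weights are 0.4, 0.2, 0.3, and 0.6, so the probability
that both tuples exist is 0.6 / 1.5 = 0.4. Dropping the pairwise factor
(i.e. assuming independence) would give 0.3 instead.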

===============================================================================
Orion (Purdue) - Sunil Prabhakar
===============================================================================

* Initial focus on attribute uncertainty
  - what about moving objects / predictions?
  - brief survey of applications (e.g. sensors)

* Independence between attributes of same tuple
  - what about aggregates? subqueries? arithmetic?
  - joining uncertain attributes? (w/o threshold)

* Several highlights of prototype
  - Quality metrics of probabilistic results
  - Index structures (e.g. PTI) with thresholds
  - Comparing uncertain attributes: resolution

* {a,b} JOIN {a,b} -> {a,b} | {a,b}
  Problem assuming attributes are independent
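
One reading of this problem, sketched under assumptions: when an
uncertain attribute with possible values {a, b} is joined with itself,
treating the two sides as independent undercounts the match
probability, because both sides are really the same random variable.
The distributions and function names below are illustrative only and
are not taken from the Orion prototype.

```python
def join_match_independent(dist_x, dist_y):
    # P(x == y) when x and y are treated as independent variables.
    return sum(p * dist_y.get(v, 0.0) for v, p in dist_x.items())

def self_join_match(dist):
    # The same variable always equals itself, whatever value it takes.
    return sum(dist.values())

# Hypothetical uncertain attribute with two equally likely values.
attr = {"a": 0.5, "b": 0.5}
wrong = join_match_independent(attr, attr)
right = self_join_match(attr)
print(wrong, right)
```

Under the independence assumption the self-join matches with
probability 0.5, although the tuple trivially joins with itself with
probability 1.0.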

===============================================================================
HeisenData (IR Berkeley) - Minos Garofalakis
===============================================================================

* PDM for the Digital (Smart) Home
  - many sensors and actuators
  - example app: people tracking

* Requirements
  1. Handle uncertainty and correlation
  2. Share knowledge across applications
  3. Real-time & retrospective reasoning

* Existing Approaches
  - insufficient granularity
  - assume tuple independence

* Current Apps only use DBMS as a store
  - SELECT * -> crunch -> INSERT *

* HeisenData Project Goals
  - Query evidence or model directly
  - Basis for ML app development

* Data Model: Possible Worlds
  - Evidence Data + Dependencies
  - Prob_model(World | Evidence)

* Hierarchical graphical models
  - Inherit base model, e.g. spatial dependencies
  - Use it to complement evidence (e.g. missed readings)