Another problem is ensuring that we actually make available what we intend to. We need to specify what information users are allowed to obtain from the published data -- in effect, how correct the published view of the data must be. This can draw on work in replicated and mobile databases, which asks similar questions: how current must a copy be, and how correct must a copy be?
Specifying the amount of error allowed in the data (as done with quasi-copies [ABGM90], for example) would allow us to introduce selective error that limits the effectiveness of data mining.
The issues are different -- we are not concerned with cost measures for updating the copy, for example -- but some of the same specification mechanisms apply. Percentage error bounds could be useful. Update counts (``how many updates are allowed before my copy is refreshed'') may be less appropriate, but careful selection of which data is left ``out of date'' could be used to prevent mining.
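As a minimal sketch of the percentage-error-bound idea (the function name and parameters are hypothetical, not from [ABGM90]), one could perturb each published value by a bounded random relative error, so the published view is guaranteed to be within the specified tolerance of the true data while fine-grained mining is degraded:

```python
import random

def publish_with_error_bound(values, max_pct_error, seed=None):
    """Return a perturbed copy of `values` in which each entry is
    shifted by a uniformly random relative error within
    +/- max_pct_error.  The published view stays within the stated
    error bound of the true data; the noise limits precise mining."""
    rng = random.Random(seed)
    published = []
    for v in values:
        # Relative error drawn uniformly from [-max_pct_error, +max_pct_error]
        err = rng.uniform(-max_pct_error, max_pct_error)
        published.append(v * (1 + err))
    return published

true_data = [100.0, 250.0, 80.0, 310.0]
view = publish_with_error_bound(true_data, max_pct_error=0.05, seed=42)

# Every published value is within 5% of the corresponding true value.
assert all(abs(p - t) <= 0.05 * abs(t) + 1e-9
           for p, t in zip(view, true_data))
```

Richer specifications (per-attribute bounds, or deliberately stale values for selected records) would follow the same pattern: the data owner states the allowed divergence, and the publishing mechanism enforces it.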