Keywords: Information Quality, Probability Distribution, Random Measure, Error Rate, Lempel-Ziv
In Search of an Accuracy Metric (Data and Information Quality Metrics)
Craig Fisher, Eitel Lauria, Carolyn Matheus, SUNY Albany
Practitioners and researchers often refer to the error rates or accuracy percentages of databases. The former is the number of cells in error divided by the total number of cells; the latter is the number of correct cells divided by the total number of cells. However, databases may have similar error rates (or accuracy percentages) yet differ drastically in the severity of their accuracy problems. A simple percentage does not indicate whether the errors are systematic, such as one record with 20 fields in error, or randomly distributed, such as 20 errors scattered throughout the database. The difference is rooted in the degree of randomness, or complexity. We expand the accuracy metric to include a complexity (randomness) measure and a probability distribution value. The proposed randomness check is based on the Lempel-Ziv (LZ) complexity measure. The main candidate for the probability distribution parameter is Poisson's lambda. The newly described metric allows management to distinguish between databases that have similar accuracy measures and error rates but differ drastically in the level of complexity of their quality problems.
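The contrast described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the database is flattened row by row into a string of cell flags ("1" for a cell in error, "0" for a correct cell) and that complexity is counted with a straightforward Lempel-Ziv (1976) phrase parsing. The table shape (50 records of 20 fields), the function names `error_rate` and `lz_complexity`, and the random seed are all hypothetical choices made for illustration.

```python
import random


def error_rate(flags: str) -> float:
    """Number of cells in error divided by the total number of cells."""
    return flags.count("1") / len(flags)


def lz_complexity(flags: str) -> int:
    """Count phrases in a Lempel-Ziv (1976) style parsing of the string.

    Each new phrase is the shortest substring starting at the current
    position that has not already occurred earlier in the string.
    """
    i, phrases, n = 0, 0, len(flags)
    while i < n:
        length = 1
        # Grow the candidate phrase while it can be copied from earlier text.
        while i + length <= n and flags[i:i + length] in flags[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases


# Hypothetical table: 50 records x 20 fields = 1000 cells, 20 cells in error.
n_records, n_fields, n_errors = 50, 20, 20
n_cells = n_records * n_fields

# Systematic case: every field of one record (record index 10) is in error.
clustered = "0" * 200 + "1" * n_fields + "0" * (n_cells - 200 - n_fields)

# Random case: the same number of errors scattered over the whole table.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
positions = set(rng.sample(range(n_cells), n_errors))
scattered = "".join("1" if k in positions else "0" for k in range(n_cells))

print("error rate (clustered):", error_rate(clustered))
print("error rate (scattered):", error_rate(scattered))
print("LZ complexity (clustered):", lz_complexity(clustered))
print("LZ complexity (scattered):", lz_complexity(scattered))
```

Under these assumptions both strings have the same error rate, while the scattered string parses into more phrases than the clustered one, which is the kind of difference a single percentage cannot show. How the complexity value would be normalized or combined with a Poisson parameter is not specified in the abstract, so the sketch leaves that out.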
Full publication available at: mitiq.mit.edu