Extending Naive Bayes Classifier with Hierarchy Feature Level Information for Record Linkage

Zhou, Yun; Howroyd, John; Danicic, Sebastian; and Bishop, Mark (J. M.). 2015. 'Extending Naive Bayes Classifier with Hierarchy Feature Level Information for Record Linkage'. In: AMBN 2015: the second workshop on Advanced Methodologies for Bayesian Network. Yokohama, Japan. [Conference or Workshop Item]

Probabilistic record linkage has been well investigated in recent years. The Fellegi-Sunter probabilistic record linkage method and its enhanced version are commonly used; they calculate match and non-match weights for each pair of corresponding fields of record-pairs. Bayesian network classifiers, namely the naive Bayes classifier and TAN, have also been successfully used here. Very recently, an extended version of TAN (called ETAN) has been developed and proved superior in classification accuracy to conventional TAN. However, no previous work has applied ETAN to record linkage or investigated the benefits of using naturally existing hierarchy feature level information. In this work, we extend the naive Bayes classifier with such information. Finally, we apply all the methods to four datasets and estimate the F1 scores.
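
As a rough illustration of the per-field match and non-match weights mentioned above, the following minimal sketch computes standard Fellegi-Sunter log-ratio weights and sums them into a score for a record-pair. It is not the paper's implementation: the function names, the three fields, and the m/u probabilities are all hypothetical values chosen for illustration. Under the usual conditional-independence assumption, this sum is the same log-likelihood ratio a naive Bayes classifier would use.

```python
import math

def field_weights(m, u):
    """Return (agreement_weight, disagreement_weight) for one field.

    m: P(field agrees | record-pair is a match)
    u: P(field agrees | record-pair is a non-match)
    """
    agree = math.log2(m / u)
    disagree = math.log2((1 - m) / (1 - u))
    return agree, disagree

def pair_score(agreements, m_probs, u_probs):
    """Sum the per-field weights for one record-pair.

    agreements: one boolean per field, True if the two records agree on it.
    """
    total = 0.0
    for agrees, m, u in zip(agreements, m_probs, u_probs):
        agree_w, disagree_w = field_weights(m, u)
        total += agree_w if agrees else disagree_w
    return total

# Hypothetical m/u probabilities for three fields (e.g. name, date, postcode).
M_PROBS = [0.9, 0.8, 0.95]
U_PROBS = [0.1, 0.2, 0.05]

if __name__ == "__main__":
    print(pair_score([True, True, True], M_PROBS, U_PROBS))
    print(pair_score([True, False, False], M_PROBS, U_PROBS))
```

A pair whose score exceeds a chosen upper threshold would be declared a match, and one below a lower threshold a non-match; the thresholds themselves are not specified here.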

