This document presents a novel technique for detecting duplicates in XML data using XPath and Bayesian networks, addressing the challenges that hierarchical structure poses for comparing records. The algorithm achieves high accuracy in identifying fuzzy duplicates and relies on a new pruning strategy, improving upon existing methods. Experimental results on various XML datasets indicate that the approach is effective at improving data quality by eliminating duplicate records.
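
To make the general idea concrete, the following is a minimal sketch, not the algorithm described in this document. It assumes that each record is compared along a fixed list of paths (the limited XPath subset supported by Python's `xml.etree.ElementTree`), that a plain string-similarity ratio stands in for the per-node probabilities a Bayesian network would supply, and that the per-path values are simply multiplied. The names `text_similarity`, `duplicate_score`, `PATHS`, the sample data, and the threshold of 0.8 are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher


def text_similarity(a, b):
    """String similarity in [0, 1]; a stand-in for a learned leaf probability."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def duplicate_score(elem_a, elem_b, paths, threshold):
    """Multiply per-path similarities, pruning once the threshold is out of reach.

    Returns the combined score, or None if the comparison was pruned.
    """
    score = 1.0
    for path in paths:
        a = elem_a.findtext(path, default="").strip()
        b = elem_b.findtext(path, default="").strip()
        score *= text_similarity(a, b)
        # Every remaining factor is at most 1, so the running product is an
        # upper bound on the final score; below the threshold we can stop.
        if score < threshold:
            return None
    return score


# Hypothetical sample data: the first two records differ by one typo.
DOC = """
<movies>
  <movie><title>The Matrix</title><year>1999</year>
    <cast><actor>Keanu Reeves</actor></cast></movie>
  <movie><title>The Matrx</title><year>1999</year>
    <cast><actor>Keanu Reeves</actor></cast></movie>
  <movie><title>Casablanca</title><year>1942</year>
    <cast><actor>Humphrey Bogart</actor></cast></movie>
</movies>
"""

PATHS = ["title", "year", "cast/actor"]  # assumed comparison paths
movies = ET.fromstring(DOC).findall("./movie")

near_duplicate = duplicate_score(movies[0], movies[1], PATHS, threshold=0.8)
unrelated = duplicate_score(movies[0], movies[2], PATHS, threshold=0.8)
print(near_duplicate, unrelated)
```

In this sketch the first pair survives all three paths, while the unrelated pair is pruned after the title comparison, so the year and cast paths are never evaluated. A real Bayesian-network formulation would replace the plain product with conditional probabilities over the element hierarchy; the early-exit bound shown here only illustrates why pruning can skip work without changing which pairs pass the threshold.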