Publication date: 1 June 2021

Measuring the Structural Similarity of Semistructured Documents Using Entropy
We propose a technique for measuring the structural similarity
of semistructured documents based on entropy. After extracting the
structural information from two documents we use either Ziv-Lempel
encoding or Ziv-Merhav crossparsing to determine the entropy and
consequently the similarity between the documents. To the best of
our knowledge, this is the first linear-time approach for evaluating
structural similarity. In an experimental evaluation we
demonstrate that the results of our algorithm in terms of clustering
quality are on a par with or even better than existing approaches.
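
To make the idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the structural information is the sequence of opening and closing tag names in an XML document, uses the phrase count of an LZ78-style (Ziv-Lempel) incremental parse as a stand-in for entropy, and combines the counts in a compression-distance style formula. The function names `tag_sequence`, `lz78_phrases` and `structural_distance`, and the distance formula itself, are illustrative assumptions; the abstract does not specify them.

```python
import xml.etree.ElementTree as ET


def tag_sequence(xml_text):
    """Flatten a document into its opening/closing tag names, ignoring text content."""
    seq = []

    def walk(node):
        seq.append(node.tag)          # opening tag
        for child in node:
            walk(child)
        seq.append("/" + node.tag)    # closing tag

    walk(ET.fromstring(xml_text))
    return seq


def lz78_phrases(seq):
    """Count the phrases of an LZ78-style incremental parse, using a trie of dicts.

    Each symbol is handled with one dictionary lookup, so the parse takes
    (expected) linear time in the length of the sequence.
    """
    root = {}
    node = root
    count = 0
    for sym in seq:
        if sym in node:
            node = node[sym]          # keep extending the current phrase
        else:
            node[sym] = {}            # new phrase ends here
            count += 1
            node = root
    if node is not root:              # unfinished phrase at the end of the input
        count += 1
    return count


def structural_distance(xml_a, xml_b):
    """Assumed compression-style distance: lower means more similar structure."""
    sa, sb = tag_sequence(xml_a), tag_sequence(xml_b)
    ca, cb = lz78_phrases(sa), lz78_phrases(sb)
    cab = lz78_phrases(sa + sb)       # parse of the concatenated tag sequences
    return (cab - min(ca, cb)) / max(ca, cb)


if __name__ == "__main__":
    doc1 = "<a><b/><b/><b/></a>"
    doc2 = "<x><y/><z/></x>"
    print(structural_distance(doc1, doc1))  # same structure
    print(structural_distance(doc1, doc2))  # unrelated structure
```

On very short documents such as these the phrase counts are coarse, so the distances are only indicative; the approach is meant for documents long enough for the parse to pick up repeated structure.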
Date | 02/03/2007 |
---|---|
State | Concluded |