Freitag, 20. November 2015

Testing fuzzy approximation with Test::Deep

Maybe the title of this post is misleading, but I will explain what I mean.

Last year I wrote a series of LCS (longest common subsequence) modules for speed and rock solid quality. Then I had a need for smarter alignment and added similarity to the algorithm, now released as LCS::Similar on CPAN.

The hardest part in developing LCS and friends was sorting out bogus algorithms from a collection of nearly 100 publications with pseudocode and sometimes code. It needed some time to get all cornercases as testcases.

But how to test an algorithm which returns one of maybe more than one optimal ("longest") solutions? With a method allLCS() returning all solutions and comparing the single solution against them.

Here is what the output looks like:


my $cases = {
  'case_01' => {
    'allLCS' => [
      [ [ 0, 0 ], [ 2, 2 ] ],
      [ [ 1, 0 ], [ 2, 2 ] ]
    ],
    'LCS' =>
      [ [ 0, 0 ], [ 2, 2 ] ],
  },
  'case_02' => {
    'allLCS' => [
      [ [ 1, 1 ], [ 3, 2 ], [ 4, 3 ], [ 6, 4 ] ],
      [ [ 1, 1 ], [ 3, 2 ], [ 5, 3 ], [ 6, 4 ] ],
      [ [ 2, 0 ], [ 3, 2 ], [ 4, 3 ], [ 6, 4 ] ],
      [ [ 2, 0 ], [ 3, 2 ], [ 5, 3 ], [ 6, 4 ] ],
      [ [ 2, 0 ], [ 4, 1 ], [ 5, 3 ], [ 6, 4 ] ]
    ],
    'LCS' =>
      [ [ 1, 1 ], [ 3, 2 ], [ 5, 3 ], [ 6, 4 ] ]
  },
};

The result of LCS is valid if it is in the results of allLCS.

At first I wrote the comparison myself but Test::Deep already provides it:


  use Test::More;
  use Test::Deep

  use LCS::Tiny;

  cmp_deeply(
    LCS::Tiny->LCS(\@a,\@b),
    any(@{$object->allLCS(\@a,\@b)} ),
    "LCS::Tiny->LCS $a, $b"
  );

  done_testing;

That's how we can test approximation. But if the comparison in LCS changes from eq to a similarity function returning values between 0 and 1, than additional pairs matched with similarity appear in the result. For a comparison we need any of allLCS is a subset of LCS.

This was the first approach which works not reliable:


use Test::Deep;

# THIS DOES NOT WORK
sub cmp_any_superset {
    my ($got, $sets) = @_;

    for my $set (@$sets) {
      my $ok = cmp_deeply(
        $got, supersetof(@$set)
      );
      return $ok if $ok;
    }
};

Trying around and reading the documentation of Test::Deep again and again, nearly giving up and write it myself, I gave one possible interpretation of the docs a chance:


  cmp_deeply(
    $lcs,
    any(
      $lcs,
      supersetof( @{$all_lcs} )
    ),
    "Example $example, Threshold $threshold"
  );

It works.