The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.

NAME

Array::OverlapFinder - Find/remove overlapping items among ordered sequences

VERSION

This document describes version 0.005 of Array::OverlapFinder (from Perl distribution Array-OverlapFinder), released on 2020-01-02.

SYNOPSIS

 use Array::OverlapFinder qw(
     find_overlap
     combine_overlap
 );

 # sequence is array of strings (compared with 'eq' operator; if you have array
 # of records/structures, you can encode each record as JSON or using Data::Dmp,
 # for example)
 my @seq1 = qw(1 2 3 4 5 6);
 my @seq2 = qw(4 5 6 7 8 9);
 my @seq3 = qw(8 9 10 11);

 my @overlap_items                   = find_overlap(\@seq1, \@seq2);                           # => (4,5,6)
 my @all_overlap_items               = find_overlap(\@seq1, \@seq2, \@seq3);                   # => ([4,5,6], [8,9])
 my ($overlap_items_12, $index2_at_seq1, $overlap_items_13, $index3_at_seq1b) =
                                       find_overlap({detail=>1}, \@seq1, \@seq2, \@seq3);      # => ([4,5,6], 3, [8,9], 7)

 my @combined_seq = combine_overlap(\@seq1, \@seq2, \@seq3);                                   # => (1,2,3,4,5,6,7,8,9,10,11)
 my ($combined_seq, $overlap_items_12, $index2_at_seq1, $overlap_items_13, $index3_at_seq1b) =
                    combine_overlap({detail=>1}, \@seq1, \@seq2, \@seq3);
                                                                                               # => ([1,2,3,4,5,6,7,8,9,10,11], [4,5,6], 3, [8,9], 7)

DESCRIPTION

Assuming you have two ordered sequences of items that might or might not overlap, where the first sequence contains "earlier" items and the second contains possibly "later" items, the functions in this module can find the overlapping items for you or remove them combining the two sequence into one:

 # condition A, no overlaps
 sequence1: 1 2 3 4 5 6
 sequence2:              8 9 10
 overlap  :
 combined : 1 2 3 4 5 6  8 9 10

 # condition B, overlaps
 sequence1: 1 2 3 4 5 6
 sequence2:       4 5 6 7 8 9
 overlap  :       4 5 6
 combined : 1 2 3 4 5 6 7 8 9

 # condition C, overlaps
 sequence1: 1 2 3 4 5 6
 sequence2:       4 5
 overlap  : 4 5
 combined : 1 2 3 4 5 6

 # condition D, overlaps
 sequence1: 1 2 3 4 5 6
 sequence2:       4 5 6
 overlap  :       4 5 6
 combined : 1 2 3 4 5 6

 # condition E, overlaps (identical)
 sequence1: 1 2 3 4 5 6
 sequence2: 1 2 3 4 5 6
 overlap  : 1 2 3 4 5 6
 combined : 1 2 3 4 5 6

 # condition F, overlaps
 sequence1: 1 2 3 4 5 6
 sequence2: 1 2 3 4 5 6 7 8
 overlap  : 1 2 3 4 5 6
 combined : 1 2 3 4 5 6 7 8

 # condition G1, overlaps in the middle of second sequence will be assumed as non-overlapping
 sequence1: 1 2 3 4 5 6
 sequence2:   2 3 4 x x 5 6
 overlap  :
 combined : 1 2 3 4 5 6 2 3 4 x x 5 6

 # condition G2, multiple overlaps will be assumed as non-overlapping
 sequence1: 1 2 3 4 5 6
 sequence2: 2 3 4 x x 5 6 y y
 overlap  :
 combined : 1 2 3 4 5 6 2 3 4 x x 5 6 y y

The functions can accept more than two sequences to find/remove overlapping items in.

Use-cases: forming a non-overlapping sequence of items from repeated downloads of RSS feed or "recent" page.

FUNCTIONS

All functions are not exported by default, but exportable.

find_overlap

Usage:

 find_overlap([ \%opts , ] \@seq1, \@seq2, ...)

combine_overlap

Usage:

 combine_overlap([ \%opts , ] \@seq1, \@seq2, ...)

HOMEPAGE

Please visit the project's homepage at https://metacpan.org/release/Array-OverlapFinder.

SOURCE

Source repository is at https://github.com/perlancar/perl-Array-OverlapFinder.

BUGS

Please report any bugs or feature requests on the bugtracker website https://github.com/perlancar/perl-Array-OverlapFinder/issues

When submitting a bug or request, please include a test-file or a patch to an existing test-file that illustrates the bug or desired feature.

SEE ALSO

nauniq from App::nauniq can also sometimes be used, if you know the items in the sequence are unique.

Algorithm::Diff

Text::OverlapFinder has a similar name, but the two modules are not that related.

AUTHOR

perlancar <perlancar@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2021, 2020 by perlancar@cpan.org.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.