A pattern matching problem
Developing software involves solving an indefinite number of small pattern matching problems. Recently I had to find a simple way to match a specific pattern of heartbeats in a parsed ECG signal.
The problem can be reduced to this: Return the number of occurrences of a pattern, say [4,5,6]
, in a longer sequence, such as [1,2,3,4,5,6,4,5,6,4,5,6,4,5,6]
.
I’m sure this problem has been solved a millions times over in computing history. But, besides regular expressions (which deals with text strings), I didn’t find a straightforward solution in Python’s standard library.
Looking for a solution, I came across Stack Overflow (SO) where someone else, having the same problem, posted it as a question. The accepted answer outlined an algorithm, but didn’t provide any code.
Therefore, I came up with my own solution, which I posted on SO, and repeat here (Python code):
import itertools
def pattern_match(pattern, sequence):
"""Count the number of times that pattern
occurs in the sequence.
"""
pattern = tuple(pattern)
k = len(pattern)
# create k iterators for the sequence
i = itertools.tee(sequence, k)
# advance the iterators
for j in range(k):
for _ in range(j):
next(i[j])
count = 0
for q in zip(*i):
if pattern == q:
count += 1
return count
Here is a Gist solving the example problem above, which was posed in the SO question.
If you are curious, the answer is 4.