Improving on the
brute force
string macthing algorithm
- Consider the following
state in a
string matching
operation:
T = abadaabaccabacabaabb
P = abacab
^ ^
| | P[j] != T[s+j]
s=0 j=3
|
-
The brute force
string matching algorithm will
try the next character position:
T = abadaabaccabacabaabb
P = abacab
^
|
s=s+1 <--- Can we do better than +1 ?
j=0
|
|
Using a
partially matched prefix
to
slide the pattern
- Consider the following
state in a
string matching
operation:
T = ?????????a??????????
P = aaabxxyz
^ ^
| | First mismatch occurs here
s j
|
- We can know for a
fact that
....
T = ?????????a??????????
P = aaabxxyz
^ ^
| |
s j
|
|
Using a
partially matched prefix
to
slide the pattern
- Consider the following
state in a
string matching
operation:
T = ?????????a??????????
P = aaabxxyz
^ ^
| | First mismatch occurs here
s j
|
-
All characters in
the prefix
before the
mismatched character
are
same:
T = ?????aaaba??????????
P = aaabxxyz
^ ^
| |
s j
|
|
Using a
partially matched prefix
to
slide the pattern
- Consider the following
state in a
string matching
operation:
T = ?????????a??????????
P = aaabxxyz
^ ^
| | First mismatch occurs here
s j
|
-
The prefix
aaab
will cause
a
shift of
1 position
to fail:
T = ?????aaaba??????????
P = aaabxxyz
^
| because aab != aaa
s=s+1
|
|
Using a
partially matched prefix
to
slide the pattern
- Consider the following
state in a
string matching
operation:
T = ?????????a??????????
P = aaabxxyz
^ ^
| | First mismatch occurs here
s j
|
-
The prefix
aaab
will cause
a
shift of
2 positions
to fail:
T = ?????aaaba??????????
P = aaabxxyz
^
| because ab != aa
s=s+2
|
|
Using a
partially matched prefix
to
slide the pattern
- Consider the following
state in a
string matching
operation:
T = ?????????a??????????
P = aaabxxyz
^ ^
| | First mismatch occurs here
s j
|
-
The prefix
aaab
will cause
a
shift of
3 positions
to fail:
T = ?????aaaba??????????
P = aaabxxyz
^
| because b != a
s=s+3
|
|
Using a
partially matched prefix
to
slide the pattern
- Consider the following
state in a
string matching
operation:
T = ?????????a??????????
P = aaabxxyz
^ ^
| | First mismatch occurs here
s j
|
- The earliest possible
match
must
start here:
T = ?????aaaba??????????
P = aaabxxyz
^
| The prefix aaab will cause a shift of +4 !
s=s+4
|
|
Example 2 on using
partially matched prefix
to
slide the pattern
- Consider the following
state in a
string matching
operation:
T = ????????????X??????????
P = aaaaabaaxaaxyz
^ ^
| | First mismatch occurs here
s j
|
- We can know for a
fact that
....
T = ????????????X??????????
P = aaaaabaaxaaxyz
^ ^
| |
s j
|
|
Example 2 on using
partially matched prefix
to
slide the pattern
- Consider the following
state in a
string matching
operation:
T = ????????????X??????????
P = aaaaabaaxaaxyz
^ ^
| | First mismatch occurs here
s j
|
-
All characters in
the prefix
before the
mismatched character
are
same:
T = ????aaaaabaaX??????????
P = aaaaabaaxaaxyz
^ ^
| |
s j
|
|
Example 2 on using
partially matched prefix
to
slide the pattern
We can know for sure that
no match can
occur at
these positions:
T = ????aaaaabaaX??????????
P = aaaaabaaxaaxyz
T = ????aaaaabaaX??????????
P = aaaaabaaxaaxyz
T = ????aaaaabaaX??????????
P = aaaaabaaxaaxyz
T = ????aaaaabaaX??????????
P = aaaaabaaxaaxyz
T = ????aaaaabaaX??????????
P = aaaaabaaxaaxyz
|
The earliest possible
match
starts here:
T = ????aaaaabaaX??????????
P = aaaaabaaxaaxyz
|
❮
❯