dk.brics.string.stringoperations
Class Replace6
java.lang.Object
dk.brics.string.stringoperations.Operation
dk.brics.string.stringoperations.UnaryOperation
dk.brics.string.stringoperations.Replace6
public class Replace6
- extends UnaryOperation
Automaton operation for String.replace(CharSequence,CharSequence)
where both arguments are known.
When two occurrences of the search string overlaps, the replace() operation replaces the leftmost occurrence,
though the current implementation approximates it as though either occurrence is chosen at random.
For example, the operation "babab".replace("bab", "X")
will always produce "Xab", but will be approximated as {"Xab","baX"}.
Replace6
public Replace6(String searchFor,
String replaceBy)
- Automaton operation for
String.replace(CharSequence,CharSequence)
where both arguments are known.
- Parameters:
searchFor
- known value of the first argumentreplaceBy
- known value of the second argument
charsetTransfer
public CharSet charsetTransfer(CharSet a)
- Description copied from class:
UnaryOperation
- Transfer function for character set analysis.
- Specified by:
charsetTransfer
in class UnaryOperation
getPriority
public int getPriority()
- Description copied from class:
Operation
- Returns priority of this operation.
When approximating operation loops in grammars, operations with
high priority are considered first.
- Specified by:
getPriority
in class Operation
op
public Automaton op(Automaton a)
- Automaton operation.
Constructs a new automaton as a copy of a, with these modifications.
For every state s0, if there exists a path s0,...,sn reading the
search string S, the following modifications are made:
- A replacement path r0,...,rm is created, reading the replacement string.
- A ghost path g0,...,gn-1 is created, reading the search string, except for its last character.
- These epsilon transitions are created:
(s0,r0),
(s0,g0),
(rm,sn)
- For i=1,...,n-1 this constrained epsilon transition is created:
(gi, si, [Si]), where Si is the i-th character in the search string (0-based).
- For i=0,...,n-1, gi is made an accept state if si is an accept state.
- r0 and rm are made accept states if s0 and sn are accept states, respectively.
- The character S0 is removed from the transition (s0,s1).
The strategy above assumes that the input automaton is deterministic, and that
the search string is not empty. If the search string is empty, a different strategy is used, which
inserts the replacement string once between every state.
A constrained epsilon transition is like a normal epsilon transition, except the constrained character cannot be read
immediately after following the transition. If it occurs afterwards, the automaton must move to the dead reject state instead of
following the transitions at the current state. If the automaton follows more than one constrained epsilon transition
consecutively, all their constrained characters become constrained.
The intuition behind the strategy is this: If the search string occurs at some position, the replacement string
can now be read from that position instead. If only a prefix of the search string did occur, the automaton can
follow the ghost path a bit, and then jump back into the original path when a character differs from the search
string. Removing the first character from the first transition ensures that the automaton can no longer read the
whole search string at that position.
- Specified by:
op
in class UnaryOperation
- Parameters:
a
- input automaton, should not be modified
- Returns:
- output automaton
toString
public String toString()
- Description copied from class:
Operation
- Returns name of this operation.
- Specified by:
toString
in class Operation
Copyright © 2003-2009 Anders Møller, Aske Simon Christensen, Asger Feldthaus.