Multiple string updates


#1

I’m doing some file manipulation and struggling with map, nested for-comprehensions, etc., and would appreciate a solution.

I have a list of strings, called lines, such as those taken from an arbitrary file.
I also have a list of strings to be replaced, called replacements.

eg.

val lines = List("abcd", "abcde", "abcdefgh")
val replacements = List("b", "d", "gh")

I then want to map over each string in lines, and call something like

 l.replace(stringFromReplacements, "")

for each string in replacements.

i.e. my new list of lines will become

List("ac", "ace", "acef")

I’ve had a look at a similar Stack Overflow question, but it uses a prebuilt replacements Map. Mine is dynamic.

Something like

for {line <- lines
   var tmpline = line
   str <- replacements
   tmpline = tmpline.replace(str, "")
} yield tmpline

but this is not valid.


#2

I see two possible solutions:

(1) Replace multiple times on each line, like line.replace("b", "").replace("d", "").replace("gh", "")

(2) Use a regular expression with alternatives. I’m a bit rusty on those, but I think it is something like line.replaceAll("b|d|gh", "") (replaceAll takes a regex; replace does not).
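
Both options can be sketched roughly as follows, using the example data from #1. The object name ReplaceOptions is only for illustration. Since the replacements list is dynamic, the regex here is built from it at runtime, and Pattern.quote is used so that any regex metacharacters in the strings are matched literally. Note that a single regex pass is not always equivalent to replacing one string after another (removing one string can create a new occurrence of another), although the two agree on this example.

```scala
import java.util.regex.Pattern

object ReplaceOptions {
  val lines = List("abcd", "abcde", "abcdefgh")
  val replacements = List("b", "d", "gh")

  // Option (1): chained literal replacements, hard-coded for this example.
  val chained: List[String] =
    lines.map(_.replace("b", "").replace("d", "").replace("gh", ""))

  // Option (2): one regex with alternatives, built from the dynamic list.
  // Pattern.quote escapes regex metacharacters in each replacement string.
  val pattern: String = replacements.map(s => Pattern.quote(s)).mkString("|")
  val viaRegex: List[String] = lines.map(_.replaceAll(pattern, ""))
}
```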


#3

For your problem, a for-comprehension is not sufficient, because the result of a replacement is needed as input for the next replacement. For this you need a fold.
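
To illustrate why, here is a small sketch (the object name CartesianDemo is made up) of what a valid for-comprehension with two generators actually computes. Each line is paired with each replacement independently, so nothing is carried over from one replacement to the next.

```scala
object CartesianDemo {
  val lines = List("abcd", "abcde", "abcdefgh")
  val replacements = List("b", "d", "gh")

  // Two generators: every (line, str) pair produces its own result,
  // so we get lines.size * replacements.size strings, each with only
  // ONE of the replacements applied.
  val result: List[String] =
    for {
      line <- lines
      str  <- replacements
    } yield line.replace(str, "")
}
```

With the example data this gives nine strings rather than three, starting with "acd", "abc", "abcd" for the first line.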

A fold iterates through your data structure, starting from an initial value. It applies a function to the current result and the current element, then applies the function again to that result and the next element, and so on.
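
As a rough illustration of that description, a left fold over a List could be written by hand like this. myFoldLeft is a hypothetical helper for explanation only; in practice you would call the standard library's foldLeft.

```scala
object FoldSketch {
  // Hypothetical re-implementation of a left fold, for illustration.
  // acc is the current result; it starts as the initial value.
  @annotation.tailrec
  def myFoldLeft[A, B](xs: List[A], acc: B)(f: (B, A) => B): B =
    xs match {
      case Nil          => acc                               // nothing left: return the result
      case head :: tail => myFoldLeft(tail, f(acc, head))(f) // combine, then continue
    }
}
```

For three elements a, b, c and start value z, this computes f(f(f(z, a), b), c).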

So to replace everything in replacements for a given line:

replacements.foldLeft(line)((curLine, str) => curLine.replaceAll(str, ""))

We fold over all the replacements, passing the line as the initial value. The function removes all occurrences of the first replacement, and the resulting line is passed to the function again with the next replacement. Note that replaceAll treats str as a regular expression; if the replacements should be taken literally, use replace instead.
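
If you want to see the intermediate lines, scanLeft takes the same arguments as foldLeft but keeps every intermediate result. A small sketch with the longest example line (the object name ScanDemo is made up, and replace is used here so the strings are taken literally):

```scala
object ScanDemo {
  val replacements = List("b", "d", "gh")

  // Same function as in the fold, but every intermediate line is kept.
  val steps: List[String] =
    replacements.scanLeft("abcdefgh")((curLine, str) => curLine.replace(str, ""))
  // steps == List("abcdefgh", "acdefgh", "acefgh", "acef")

  // The last step is exactly what foldLeft returns.
  val folded: String =
    replacements.foldLeft("abcdefgh")((curLine, str) => curLine.replace(str, ""))
}
```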

for { line <- lines }
    yield replacements.foldLeft(line)((curLine, str) => curLine.replaceAll(str, ""))

or the equivalent shorthand notation:

for {line <- lines} yield replacements.foldLeft(line)(_.replaceAll(_, ""))
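
Put together as a self-contained sketch, with map in place of the for-comprehension (the two are equivalent here). The names RemoveAll and removeAll are only for illustration. This version uses replace rather than replaceAll, on the assumption that the replacement strings are meant literally; replaceAll would interpret them as regular expressions, which matters as soon as one of them contains a character like "." or "(".

```scala
object RemoveAll {
  // Hypothetical helper: removes every string in replacements from every line.
  // String.replace treats its first argument literally, not as a regex.
  def removeAll(lines: List[String], replacements: List[String]): List[String] =
    lines.map { line =>
      replacements.foldLeft(line)((curLine, str) => curLine.replace(str, ""))
    }

  def main(args: Array[String]): Unit = {
    val lines = List("abcd", "abcde", "abcdefgh")
    val replacements = List("b", "d", "gh")
    println(removeAll(lines, replacements)) // List(ac, ace, acef)
  }
}
```

To see the difference between the two methods: "a.b".replace(".", "") gives "ab", while "a.b".replaceAll(".", "") gives the empty string, because "." as a regex matches any character.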

#4

Great - many thanks.
Yes, I could probably do it with regexes, but the fold is what I was after. It’s now clearer to me when to get a fold involved.