Hi,
I'm trying to read some replacement rules from a file (separated by "|") and use them on a given Text...
$inputText = [System.IO.File]::ReadAllText("input.txt")
$regex = Get-Content "changes.txt" -ReadCount 0
foreach ($expression in $regex) {
    if ($expression -eq 'EOF') { break }
    $parts = $expression.Split("|")
    if ($parts.Count -eq 2) {
        $t = $parts[0]
        $u = $parts[1]
        $inputText = $inputText -creplace $t, $u
    }
}
$inputText | Out-File "output.txt" -Encoding ascii
This works fine for most rules, but doesn't work for matching patterns at end of a line.
e.g. \\DE`r`n|\\DE_02`r`n
If I try using strings for the replacement, the expected replacements are done:
$inputText = $InputText -creplace "\\DE`r`n", "\\DE_02`r`n"
How come it works with "strings", but not with variables? (The variables $t and $u contain exactly the same strings...)
Which replacement rule can be used for my needs?
Thanks in advance!
Hello,
Your examples do not work for me; maybe I am using the wrong input data. I assume you are trying to replace \\DE with \\DE_02 in the source file (without changing \\server\name\DE to \\server\name\DE_02), with the replacement rules supplied in the changes file.
I have taken the time to explain the script in more detail since you seem a bit lost; I hope you'll appreciate it.
First of all, there is the -replace (-split) operator and its case-sensitive variant -creplace (-csplit). These take a regular expression as their argument, so you have to realize that | and \ have special meaning in regular expressions.
In your example, $InputText -creplace "\\DE`r`n", "\\DE_02`r`n" does not mean "replace \\DE<end of line> with \\DE_02<end of line>" but "replace \DE`r`n with \\DE_02`r`n". The first \ in the first expression is taken as an escape character, causing the second one to be interpreted as a literal backslash.
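To make the difference between the literal and the rule from the file concrete, here is a minimal sketch. The sample text and variable names are made up for illustration, and the single-quoted string only stands in for a line read from changes.txt: in a double-quoted literal PowerShell expands `r`n to CR+LF before the regex engine sees it, while text read from a file keeps the backticks as ordinary characters.

```powershell
# Double-quoted literal: `r`n is expanded to CR+LF by PowerShell itself.
$fromLiteral = "\\DE`r`n"

# Single-quoted string, standing in for a rule read from changes.txt:
# the backticks, r and n stay as literal characters.
$fromFile = '\\DE`r`n'

# Hypothetical sample text with \DE at the end of the first line.
$sample = "C:\DE`r`nnext line"

# The literal pattern matches \DE followed by a real CR+LF.
$withLiteral = $sample -creplace $fromLiteral, "\DE_02`r`n"

# The file-style pattern looks for backtick characters, so nothing is replaced.
$withFileRule = $sample -creplace $fromFile, '\DE_02'

$fromLiteral.Length   # 6
$fromFile.Length      # 8
```

So the two "same" strings are not the same once PowerShell has parsed the literal, which would explain why the rule from the file never matches.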
This behaviour can be suppressed with the "simplematch" option for -split and with the [regex]::Escape("<expression>") method for the -replace operator.
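A small sketch of both techniques, assuming a hypothetical rule line and sample text (neither comes from your files):

```powershell
# Hypothetical rule line in the find|replacewith format.
$rule = '\\DE|\\DE_02'

# SimpleMatch makes -split treat "|" as a plain character, not regex alternation.
$find, $replaceWith = $rule -split '|', 2, 'SimpleMatch'

# [regex]::Escape turns the search text into a pattern that matches it literally.
$pattern = [regex]::Escape($find)    # \\\\DE

$text = 'see \\DE here'

# Escaped: both backslashes are matched and replaced.
$escaped = $text -creplace $pattern, $replaceWith    # see \\DE_02 here

# Unescaped: "\\DE" as a regex matches only ONE backslash followed by DE,
# so the first backslash of the text is left over in front of the replacement.
$unescaped = $text -creplace $find, $replaceWith     # see \\\DE_02 here
```

The replacement string is not a regular expression, so the backslashes in it are inserted as they are.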
(Btw what is the reason you use the -readCount 0 ?)
Here is code that takes the input file input.txt and replaces strings according to the rules given in the changes.txt file, in the format
find|replacewith<rest of line is omitted>
#set paths for the input, output and changes files
$changeFile = 'C:\Temp\regex_file\changes.txt'
$changes = Get-Content $changeFile
<#content:
\\DE|\\DE_02
\\FE|\\FE_02
test
#>
$InputFile = 'C:\Temp\regex_file\input.txt'
$input = Get-Content $InputFile
<#content:
\\DE
\\DE\name
\\DE\file\DE\
\\FE
\\FE\test\FE
#>
$outputFile = 'C:\Temp\regex_file\output.txt'

#process the data
<#
for each line in changes files do a split by "|" and produce maximum of 3 results for each line
(do 3-1=2 splits, reducing amount of work done if there is more places to split),
simplematch - match the "|" as is, not using special regex characters
If the count of results from the split is less than 2 then proceed to the next line (go to next item in foreach)
do a replace on the $input variable doing a "simpleMatch" - \\ are treated as normal characters
#>
foreach ($change in $changes) {
    $change = $change -split "|",3,"simplematch"
    if ($change.Count -lt 2) {continue}
    $input = $input -creplace [regex]::Escape($change[0]), $change[1]
}
$input | out-file $outputFile -enc ascii
Thank you a lot for your quick and detailed reply =)
Unfortunately your solution will not work for me. I'll try to explain a little better, what needs to be done:
I need to replace \DE<end of line>
but must not replace \DEUTSCHLAND or \DE\WHATEVER
Here's my Test-File:
At line 23 I inserted \DE for testing. This must not be replaced.
At line 1954 there is \DE at the end of a line. This should be replaced by \DE_02.
By the way, you're right, I am a bit lost... I don't know powershell very well, and I didn't write the script. Therefore I unfortunately don't know, why there's a -readCount 0
I hope it's a little clearer now, what I need to do and you can help me solve the problem.
Thanks for your time!
Ok, that is no problem:)
This one should work:
$changeFile = 'C:\Temp\regex_file\changes.txt'
$changes = Get-Content $changeFile
$InputFile = 'C:\Temp\regex_file\input.txt'
$input = Get-Content $InputFile
$outputFile = 'C:\Temp\regex_file\output.txt'
foreach ($change in $changes) {
    $change = $change -split "|",3,"simplematch"
    if ($change.Count -lt 2) {continue}
    $input = $input -creplace $change[0], $change[1]
}
$input | out-file $outputFile -enc ascii
The change to the code is at this line:
$input = $input -creplace $change[0], $change[1]
It no longer escapes the string provided, so it is now treated as a regular expression.
In regular expressions, $ marks the end of a line. \ also has a special meaning, so a literal backslash has to be escaped with another \.
The \s* means zero or more (*) whitespace (\s) characters; there are often extra spaces or tabs at the end of a line.
You need to provide this in the changes file: \\DE\s*$|\DE_02. The first part is a regular expression; the second is just a normal string.
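As a quick check of the rule \\DE\s*$|\DE_02, here is a sketch on a few made-up lines (the paths are hypothetical, not taken from your test file). Because Get-Content returns one string per line, $ matches at the end of each line.

```powershell
# Hypothetical sample lines, one string per line as Get-Content would return them.
$lines = @(
    'C:\DATA\DE',             # \DE at end of line: replaced
    'C:\DATA\DE   ',          # trailing spaces are covered by \s*: replaced
    'C:\DATA\DEUTSCHLAND',    # \DE followed by more letters: untouched
    'C:\DATA\DE\WHATEVER'     # \DE followed by a path: untouched
)

# Single quotes keep PowerShell from interpreting $ or backticks in the pattern.
$result = $lines -creplace '\\DE\s*$', '\DE_02'

$result
# C:\DATA\DE_02
# C:\DATA\DE_02
# C:\DATA\DEUTSCHLAND
# C:\DATA\DE\WHATEVER
```

Note that the trailing whitespace matched by \s* is part of the match, so it is removed along with the replacement.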
PS: If you need to apply just one rule you can greatly simplify it to
$(Get-Content 'C:\Temp\regex_file\input.txt') -creplace '\\DE\s*$','\DE_02' | out-file 'C:\Temp\regex_file\output.txt' -enc ascii
Thank you so much!!!
Works perfect =)
You're welcome :)
nohandle, you are brilliant!
Thanks :)