Extract text and numeric values from a text file


Top 500 Contributor
Posts 8
PLeskov Posted: 04-30-2012 12:00 PM

 

I’m new to PowerShell regular expressions and have a question about how to write one that extracts text and numeric values from .txt log files. Here is a sample of the log file.

Note: There are no blank lines between the lines in the original file:

"4/30/2012 10:47:03 AM","1","About","SyncBackPro V6.0.12.0 Log",0,"O"
"4/30/2012 10:47:03 AM","1","Profile Name","Test Backup - 1",0,"O"
"4/30/2012 10:47:03 AM","1","Type","Mirror Right",0,"O"
"4/30/2012 10:47:10 AM","1","Deleted","2",0,"O"
"4/30/2012 10:47:10 AM","1","Skipped","4",0,"O"
"4/30/2012 10:47:10 AM","1","Copied","2",0,"O"
"4/30/2012 10:47:10 AM","1","Copied/Moved","0.07KBytes",0,"O"
"4/30/2012 10:47:10 AM","1","Scan Started","4/30/2012 10:47:04 AM",0,"O"
"4/30/2012 10:47:10 AM","1","Scan Finished","4/30/2012 10:47:04 AM",0,"O"
"4/30/2012 10:47:10 AM","1","Profile Start Time","4/30/2012 10:47:03 AM",0,"O"
"4/30/2012 10:47:10 AM","1","Profile End Time","4/30/2012 10:47:10 AM",0,"O"
"4/30/2012 10:47:10 AM","1","Result","Success",0,"O"

 

So far I have been able to extract just the lines that matter and append them to a single file:

Get-Content $file | Select-String -pattern "Profile Name", "Deleted", "Skipped", "Copied", "Copied/Moved", "Profile End Time", "Result" | Out-File $logpath -encoding ASCII -append

   }      ## part of the script's function.

To make the summary report more readable, I need to extract just the values that follow those predefined text patterns. The patterns don’t change: "Profile Name", "Deleted", "Skipped", "Copied", "Copied/Moved", "Profile End Time", "Result". The numeric values that follow "Deleted", "Skipped", "Copied" or "Copied/Moved" could be any number in the 0 - 50,000 range.

Please help!

Thanks,

Paul Leskov

Top 25 Contributor
Posts 287
Top Contributor

Hi,

You don't need regular expressions for that.

foreach ($line in (Get-Content $file)) {
    $line.split(",")[3]
}
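Note that splitting on "," keeps the surrounding double quotes in each field. As a minimal sketch of an alternative, the lines can be parsed as CSV with ConvertFrom-Csv, which strips the quotes. The column names in $header are hypothetical (the log has no header row), and $sample stands in for Get-Content $file:

```powershell
# Hypothetical column names; the log file itself has no header row.
$header = 'Time', 'Level', 'Name', 'Value', 'Code', 'Flag'

# Two lines copied from the sample log; in a real script these would come from Get-Content $file.
$sample = @(
    '"4/30/2012 10:47:03 AM","1","Profile Name","Test Backup - 1",0,"O"',
    '"4/30/2012 10:47:10 AM","1","Deleted","2",0,"O"'
)

# Each line becomes an object with one property per column, quotes removed.
$rows = $sample | ConvertFrom-Csv -Header $header

# Emit only the fourth column, without quotes.
$rows | ForEach-Object { $_.Value }
```

With the two sample lines this emits Test Backup - 1 and 2.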

 

Top 500 Contributor
Posts 8

Hi Felipe,

Your solution works, but it solves only part of the problem. I still need to filter the data and extract only those specific values. I have 59 log files, some of them with a couple of thousand lines. My goal is to extract only the valuable backup data and output it to a single file formatted as a table. Something like this:

 

Profile Name      Deleted  Skipped  Copied  Copied/Moved  Profile End Time       Result

Test Backup       5        3        44      35.8MB        4/30/2012 1:41:13 PM   Success
Mirror Noon       356      288      25      56.5MB        4/30/2012 1:45:13 PM   Success
Mirror Exchange
Mirror Daily
....

Thanks,

Paul
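A minimal sketch of the one-row-per-log-file table described above, assuming each file holds one set of summary lines. The function name Get-BackupSummary, the column names in $header, and the $logDir and $logpath variables in the usage comment are hypothetical; the cmdlets are standard PowerShell:

```powershell
# Entries to keep, in the order the table columns should appear.
$wanted = 'Profile Name', 'Deleted', 'Skipped', 'Copied', 'Copied/Moved', 'Profile End Time', 'Result'

# Hypothetical column names; the log file has no header row.
$header = 'Time', 'Level', 'Name', 'Value', 'Code', 'Flag'

# Hypothetical helper: turns the lines of one log file into one object (one table row).
function Get-BackupSummary {
    param([string[]]$Lines)

    $row = New-Object PSObject
    foreach ($entry in ($Lines | ConvertFrom-Csv -Header $header)) {
        # Exact name comparison, so "Copied" does not also pick up "Copied/Moved".
        if ($wanted -contains $entry.Name) {
            $row | Add-Member -MemberType NoteProperty -Name $entry.Name -Value $entry.Value -Force
        }
    }
    # Fix the column order; entries missing from the log come out empty.
    $row | Select-Object $wanted
}

# Lines copied from the sample log in the first post.
$sample = @(
    '"4/30/2012 10:47:03 AM","1","Profile Name","Test Backup - 1",0,"O"',
    '"4/30/2012 10:47:03 AM","1","Type","Mirror Right",0,"O"',
    '"4/30/2012 10:47:10 AM","1","Deleted","2",0,"O"',
    '"4/30/2012 10:47:10 AM","1","Copied/Moved","0.07KBytes",0,"O"',
    '"4/30/2012 10:47:10 AM","1","Result","Success",0,"O"'
)
$summary = Get-BackupSummary $sample
$summary | Format-Table -AutoSize

# Possible usage over a folder of logs ($logDir and $logpath are placeholders):
# Get-ChildItem $logDir -Filter *.txt |
#     ForEach-Object { Get-BackupSummary (Get-Content $_.FullName) } |
#     Format-Table -AutoSize | Out-File $logpath -Width 200
```

Building one object per file, rather than one hash table per line, is what lets Format-Table print the entries as columns.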

 

 

Top 25 Contributor
Posts 287
Top Contributor

Hi,

I can't do anything more elaborate right now because I'm away from a computer, but you can add an if condition to exclude the properties you don't want.

From what I can see, there aren't many.

 

Top 25 Contributor
Posts 287
Top Contributor

You can do something like:

foreach ($line in (Get-Content "C:\test.txt")) {
    if ($line -notmatch "About" -and $line -notmatch "Type" -and $line -notmatch "Scan Started" -and $line -notmatch "Scan Finished" -and $line -notmatch "Profile Start Time") {
        $table = @{$line.split(",")[2]=$line.split(",")[3]}
        $table
    }
}

Top 500 Contributor
Posts 8

Thanks!

 With a slight adjustment it works great!

foreach ($line in (Get-Content $logfile)) {
    if ($line -match "Profile Name" -or $line -match "Deleted" -or $line -match "Skipped" -or $line -match "Copied" -or $line -match "Copied/Moved" -or $line -match "Profile End Time" -or $line -match "Result") {
        $table = @{$line.split(",")[2]=$line.split(",")[3]}
        $table
    }
}

Now, how do I output only the values of the hash table to a text file and omit the names? I also need the list of values converted to a table. That way, when the values from the rest of the log files are added, the output will be much more readable. Any ideas on how to do that?

Thanks!

Paul 

 

 

 

Top 500 Contributor
Posts 8

Well, answering my own question: I came across some functions that convert hash tables to objects and do all kinds of formatting. I'm still having a problem using a foreach loop with the hash table, though.

foreach ($line in (Get-Content "C:\Scripts\Logs\Test\Test Backup One_Log_Page1.txt")) {
    if ($line -match "Profile Name" -or $line -match "Deleted" -or $line -match "Skipped" -or $line -match "Copied" `
        -or $line -match "Copied/Moved" -or $line -match "Profile End Time" -or $line -match "Result") {
        # Even if I pipe this to Out-File "C:\Scripts\Logs\Test\Test.txt", I still get only one value.
        $TableTemp = @{$line.split(",")[2]=$line.split(",")[3]}
        $TableTemp
    }
}

When I assign the results to $TableTemp = @{$line.split(",")[2]=$line.split(",")[3]}, all the values are displayed fine inside the loop. When I read the hash table afterwards, it has only one value. I've read that those are just reference values and that you can't work with them directly. How do I save the results of the foreach loop in the hash table so they can be accessed later? Any input would be appreciated!

Paul

Top 25 Contributor
Posts 287
Top Contributor

Hi,

I'm getting all the values. If we use the file in your first post as an example, I get this output:

Name                           Value
----                           -----
"Profile Name"                 "Test Backup - 1"
"Deleted"                      "2"
"Skipped"                      "4"
"Copied"                       "2"
"Copied/Moved"                 "0.07KBytes"
"Profile End Time"             "4/30/2012 10:47:10 AM"
"Result"                       "Success"

If you want to get each iteration then use:

$TableTemp += @{$line.split(",")[2]=$line.split(",")[3]}

or if outputting to a file:

Out-File -append "C:\Scripts\Logs\Test\Test.txt"
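The "only one value" symptom is consistent with $TableTemp = @{...} creating a brand-new one-entry hash table on every pass, so only the last entry survives the loop. A minimal sketch of one way to keep everything, assuming the hash table is created once before the loop and filled by key; $lines stands in for Get-Content on a log file, and the Trim calls (an addition, not part of the posts above) drop the surrounding quotes:

```powershell
# Lines copied from the sample log; in a real script these would come from Get-Content.
$lines = @(
    '"4/30/2012 10:47:03 AM","1","Profile Name","Test Backup - 1",0,"O"',
    '"4/30/2012 10:47:10 AM","1","Deleted","2",0,"O"',
    '"4/30/2012 10:47:10 AM","1","Result","Success",0,"O"'
)

# Create the hash table once, outside the loop.
$TableTemp = @{}

foreach ($line in $lines) {
    # Splitting on "," is only safe while no value contains a comma.
    $fields = $line.Split(',')
    $name   = $fields[2].Trim('"')
    $value  = $fields[3].Trim('"')

    # Indexed assignment adds the key, or overwrites it if it already exists.
    $TableTemp[$name] = $value
}

# All entries are still available after the loop.
$TableTemp['Deleted']
$TableTemp['Result']
```

To write only the values and omit the names, $TableTemp.Values can be piped to Out-File. Keep in mind that a plain hash table does not preserve insertion order.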

Copyright 2012 PowerShell.com. All rights reserved.