Tips & Tricks Using Compare-Object

Compare-Object is a very powerful Cmdlet that can compare different result sets. The funny thing is: when you try Compare-Object in simple scenarios, everything works fine, but once you put it to work in production environments, it often fails. This article explains why, what to watch out for, and how to configure Compare-Object correctly to make it work for you.

Using Compare-Object To Find New Processes

PowerShell can easily compare result sets for you and filter out only those items that have changed. Let's say you'd like to know which processes have been started after a given point in time. How can you do that?

The workhorse doing the comparison is called Compare-Object. This Cmdlet takes two result sets and automatically analyzes them. It then outputs only those items that are present in just one of the two result sets. To find out which processes were started after a given point in time, you first create a base result set of all currently running processes like this:

PS> $shot1 = Get-Process

Now, whenever you want to see what has changed in your environment, create a second snapshot and compare both. Let's start notepad.exe really quick, then create another snapshot and compare both:

PS> notepad
PS> $shot2 = Get-Process
PS> Compare-Object $shot1 $shot2

InputObject                                                SideIndicator
-----------                                                -------------
System.Diagnostics.Process (notepad)                       =>
System.Diagnostics.Process (WmiPrvSE)                      =>

Compare-Object returns only those processes that exist in either $shot1 or $shot2, and the SideIndicator property tells you in which result set the object was present. A SideIndicator "=>" indicates that processes existed in only the second result set so you know these must be new processes. Interestingly enough, when I launched notepad, Windows also launched a process called WmiPrvSE.
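
The same comparison can be tried on plain data, which makes the meaning of the two SideIndicator values easy to see. The sketch below is only an illustration: the process names are made up, and -ReferenceObject and -DifferenceObject are simply the named forms of the two positional arguments used above.

```powershell
# Hypothetical snapshots: plain strings standing in for process names.
$before = 'explorer', 'powershell', 'calc'
$after  = 'explorer', 'powershell', 'notepad'

# -ReferenceObject is the baseline, -DifferenceObject the later snapshot.
$diff = Compare-Object -ReferenceObject $before -DifferenceObject $after

# '=>' marks items found only in the later snapshot (started since then),
# '<=' marks items found only in the baseline (ended since then).
$started = @($diff | Where-Object { $_.SideIndicator -eq '=>' } | ForEach-Object { $_.InputObject })
$ended   = @($diff | Where-Object { $_.SideIndicator -eq '<=' } | ForEach-Object { $_.InputObject })

"Started: $started"
"Ended:   $ended"
```

Wrapping the results in @() keeps them arrays even when only one item (or none) comes back, which makes later checks on .Count predictable.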

Finding New Files And Folders

Wow, that's easy! Why not use this to find new files and folders added to a folder you'd like to monitor? To do that, you'd first generate a snapshot of that folder, then wait for content to be changed, and finally create another snapshot and compare contents. Let's take a look:

PS> $shot1 = Dir $home
PS> Set-Content $home\testfile1.txt "A new file"
PS> $shot2 = Dir $home
PS> Compare-Object $shot1 $shot2

InputObject                                                SideIndicator
-----------                                                -------------
testfile1.txt                                              =>

It worked. It will not always work like a charm, though. There is an important caveat you need to know about: SyncWindow.

Adjusting SyncWindow

Whenever Compare-Object compares object sets, it uses a SyncWindow to resync both lists when there is no match. The default SyncWindow setting in PowerShell V1 is 5, so whenever the result sets have too many consecutive differences, the result will not be what you expect. Here is an example:

PS> $shot1 = 1..10
PS> $shot2 = 10..1
PS> Compare-Object $shot1 $shot2

PS> $shot1 = 1..15
PS> $shot2 = 15..1
PS> Compare-Object $shot1 $shot2

                                     InputObject SideIndicator
                                     ----------- -------------
                                              15 =>
                                               1 <=
                                              14 =>
                                               2 <=
                                               2 =>
                                               1 =>
                                              14 <=
                                              15 <=

In the first part, Compare-Object compares two lists of numbers. Both lists contain the same numbers but in reverse order. The result is nothing, and that is correct since both sets contain the same numbers.

In the second part, there are 15 numbers in each set. This time, Compare-Object returns a bunch of nonsense information, claiming for example that the number 15 is present only in $shot2 ("=>") and then, further down, only in $shot1 ("<="). Why?

The default SyncWindow is 5, so whenever there is no match, Compare-Object uses a delta of +/- 5 items to find the next matching item. When there are 10 elements in a set, a SyncWindow of 5 is sufficient to resync both lists (plus/minus 5 results in a maximum of 10 allowable consecutive differences). When there are 15 elements, SyncWindow would need to be at least 7 (the first comparison would be number 1 of $shot1 against number 15 of $shot2; with a SyncWindow of 7, Compare-Object would move 14 elements in $shot2 to find a match and would indeed find the matching number 1).

Fortunately, you can change the SyncWindow property using the parameter -syncWindow:

PS> $shot1 = 1..15
PS> $shot2 = 15..1
PS> Compare-Object $shot1 $shot2 -syncWindow 7
PS> Compare-Object $shot1 $shot2 -syncWindow 6

                                      InputObject SideIndicator
                                      ----------- -------------
                                               15 =>
                                                1 <=
                                                1 =>
                                               15 <=

Now, what would the SyncWindow need to be with an array of 16 or 25 elements in reverse order? Easy: take the array size, divide it by two, round down, and there you go. For an array of 16 elements, the minimum SyncWindow needs to be 8, and for an array of 25 elements it needs to be 12.

PS> $shot1 = 1..5
PS> $shot2 = 5..1
PS> Compare-Object $shot1 $shot2
PS> $shot1 = 1..10
PS> $shot2 = 10..1
PS> Compare-Object $shot1 $shot2
PS> $shot1 = 1..15
PS> $shot2 = 15..1
PS> Compare-Object $shot1 $shot2

                                       InputObject SideIndicator
                                       ----------- -------------
                                                15 =>
                                                 1 <=
                                                14 =>
                                                 2 <=
                                                 2 =>
                                                 1 =>
                                                14 <=
                                                15 <=
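
The rule of thumb above (half the array size, rounded down) can be checked by experiment. The following is a minimal sketch for the worst case used in this article, a list compared against its own reverse. The variable and property names are arbitrary, the [pscustomobject] syntax assumes PowerShell 3.0 or later, and the window is always passed explicitly so the result does not depend on the default of the PowerShell version in use.

```powershell
$results = foreach ($size in 15, 16, 25) {
    $shot1 = 1..$size
    $shot2 = $size..1

    # Rule of thumb: half the array size, rounded down.
    $window = [int][Math]::Floor($size / 2.0)

    # @() forces an array so that .Count also works for an empty result.
    $enough   = @(Compare-Object $shot1 $shot2 -SyncWindow $window)
    $tooSmall = @(Compare-Object $shot1 $shot2 -SyncWindow ($window - 1))

    [pscustomobject]@{
        Size            = $size
        SyncWindow      = $window
        DiffsWithWindow = $enough.Count    # differences reported with the computed window
        DiffsOneSmaller = $tooSmall.Count  # differences reported with a window one smaller
    }
}

$results | Format-Table -AutoSize
```

With the computed window, no differences should be reported for any of the three sizes, while a window that is one smaller brings the false differences back.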


Here are a couple of things to note regarding SyncWindow:

  • When SyncWindow is too low, Compare-Object returns false information and reports objects twice, once for each result set
  • The default SyncWindow setting of 5 is sufficient only when you expect very small changes in your result sets
  • To make sure you catch all matches, you would have to set SyncWindow to half of the number of expected differences. You can also set SyncWindow to a very large number like 1000 as catch-all. This however may cause long delays and a lot of memory consumption
  • In PowerShell V2, the default SyncWindow setting has been raised as a consequence of this: it is [Int32]::MaxValue, which means Compare-Object examines the entire collection unless you specify a smaller value
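
Since the default differs between PowerShell versions, a script can avoid surprises by always passing -SyncWindow itself. One possible approach, sketched here as an assumption rather than an established recipe, is to derive the window from the size of the larger result set. Compare-Snapshot is a hypothetical helper name, not a built-in Cmdlet.

```powershell
# Hypothetical helper: compares two result sets with a window that is
# always large enough, regardless of the default of the version in use.
function Compare-Snapshot {
    param(
        [object[]] $Reference,
        [object[]] $Difference
    )

    # A window as large as the bigger set covers every possible offset.
    $window = [Math]::Max($Reference.Count, $Difference.Count)
    Compare-Object -ReferenceObject $Reference -DifferenceObject $Difference -SyncWindow $window
}

# Same elements in reverse order: no differences should be reported.
$diff = @(Compare-Snapshot -Reference (1..100) -Difference (100..1))
"Differences found: $($diff.Count)"
```

The trade-off is the one mentioned in the list above: the larger the window, the more items Compare-Object has to hold back while it searches for a match.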

Picking Properties

Remember, everything in PowerShell is represented as an object, and objects have properties. If you don't specify any properties, Compare-Object picks the information to use for comparison automatically. This may not be what you want. Have a look:

PS> $shot1 = Dir $home
PS> Add-Content $home\testfile1.txt "Another line"
PS> $shot2 = Dir $home
PS> Compare-Object $shot1 $shot2

PS> Compare-Object $shot1 $shot2 -property Name, Length

Name                               Length SideIndicator
----                               ------ -------------
testfile1.txt                          26 =>
testfile1.txt                          12 <=

This is actually a little modification to the example script earlier. I create a folder snapshot, then I add a line to the file I created in the earlier example. Next, I create another snapshot and compare both. The result is: nothing. Why?

Because I did not pick an object property. So Compare-Object simply looked at the file name, and since I added a line to an existing file, no new file name was created.

To monitor file changes, I need to explicitly tell Compare-Object to compare both the Name and the Length property. Once I do that, I get back two results. The SideIndicator tells me that the file testfile1.txt was 12 bytes in the initial snapshot and is now 26 bytes.

Working With Results

The results delivered by Compare-Object are custom objects returning the properties you selected as well as the SideIndicator property. You can filter and return only selected information. For example, if you'd like to filter the result to show only new elements (SideIndicator equals "=>"), use Where-Object like this:

PS> Compare-Object $shot1 $shot2 -property Name, Length | Where-Object { $_.SideIndicator -eq '=>' }

Name                            Length SideIndicator
----                            ------ -------------
testfile1.txt                       26 =>
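
Because each result carries a SideIndicator, the output can also be post-processed to tell changed items from added or removed ones: a name that shows up on both sides was changed, a name on one side only was added or removed. The following is a rough sketch with made-up snapshot data. The [pscustomobject] syntax assumes PowerShell 3.0 or later, and the Status labels are arbitrary.

```powershell
# Hypothetical snapshots: the kind of Name/Length pairs that
# "Dir | Select-Object Name, Length" would produce.
$shot1 = @(
    [pscustomobject]@{ Name = 'testfile1.txt'; Length = 12 }
    [pscustomobject]@{ Name = 'old.log';       Length = 300 }
)
$shot2 = @(
    [pscustomobject]@{ Name = 'testfile1.txt'; Length = 26 }
    [pscustomobject]@{ Name = 'new.txt';       Length = 5 }
)

$report = Compare-Object $shot1 $shot2 -Property Name, Length |
    Group-Object -Property Name |
    ForEach-Object {
        $name  = $_.Name
        $sides = @($_.Group | ForEach-Object { $_.SideIndicator })

        if ($sides.Count -eq 2) {
            $status = 'Changed'   # present on both sides with different Length
        } elseif ($sides[0] -eq '=>') {
            $status = 'Added'     # only in the second snapshot
        } else {
            $status = 'Removed'   # only in the first snapshot
        }

        [pscustomobject]@{ Name = $name; Status = $status }
    }

$report | Format-Table -AutoSize
```

Grouping by Name works here because Name identifies an item and Length is the only other compared property, so a changed file always produces exactly two rows.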

You could also use the -passThru parameter to actually return the original objects compared by Compare-Object. When you do this, the SideIndicator property is appended to the original object and you can still use it for filtering:

PS> Compare-Object $shot1 $shot2 -property name, length | fl *


name          : testfile.txt
length        : 51
SideIndicator : =>

name          : testfile.txt
length        : 39
SideIndicator : <=



PS> Compare-Object $shot1 $shot2 -property name, length -passThru


    Directory: Microsoft.PowerShell.Core\FileSystem::C:\Users\Tobias


Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---          1/9/2009  11:38 AM         51 testfile.txt
-a---          1/9/2009  11:34 AM         39 testfile.txt


PS> Compare-Object $shot1 $shot2 -property name, length -passThru | fl *


PSPath            : Microsoft.PowerShell.Core\FileSystem::C:\Users\Tobias\testfile.txt
PSParentPath      : Microsoft.PowerShell.Core\FileSystem::C:\Users\Tobias
PSChildName       : testfile.txt
PSDrive           : C
PSProvider        : Microsoft.PowerShell.Core\FileSystem
PSIsContainer     : False
SideIndicator     : =>
Mode              : -a---
Name              : testfile.txt
Length            : 51
DirectoryName     : C:\Users\Tobias
Directory         : C:\Users\Tobias
IsReadOnly        : False
Exists            : True
FullName          : C:\Users\Tobias\testfile.txt
Extension         : .txt
CreationTime      : 1/9/2009 11:31:05 AM
CreationTimeUtc   : 1/9/2009 10:31:05 AM
LastAccessTime    : 1/9/2009 11:31:05 AM
LastAccessTimeUtc : 1/9/2009 10:31:05 AM
LastWriteTime     : 1/9/2009 11:38:17 AM
LastWriteTimeUtc  : 1/9/2009 10:38:17 AM
Attributes        : Archive

PSPath            : Microsoft.PowerShell.Core\FileSystem::C:\Users\Tobias\testfile.txt
PSParentPath      : Microsoft.PowerShell.Core\FileSystem::C:\Users\Tobias
PSChildName       : testfile.txt
PSDrive           : C
PSProvider        : Microsoft.PowerShell.Core\FileSystem
PSIsContainer     : False
SideIndicator     : <=
Mode              : -a---
Name              : testfile.txt
Length            : 39
DirectoryName     : C:\Users\Tobias
Directory         : C:\Users\Tobias
IsReadOnly        : False
Exists            : True
FullName          : C:\Users\Tobias\testfile.txt
Extension         : .txt
CreationTime      : 1/9/2009 11:31:05 AM
CreationTimeUtc   : 1/9/2009 10:31:05 AM
LastAccessTime    : 1/9/2009 11:31:05 AM
LastAccessTimeUtc : 1/9/2009 10:31:05 AM
LastWriteTime     : 1/9/2009 11:34:58 AM
LastWriteTimeUtc  : 1/9/2009 10:34:58 AM
Attributes        : Archive



PS> Compare-Object $shot1 $shot2 -property name, length -passThru | Where-Object { $_.SideIndicator -eq '=>' }


    Directory: Microsoft.PowerShell.Core\FileSystem::C:\Users\Tobias


Mode                LastWriteTime     Length Name
----                -------------     ------ ----
-a---          1/9/2009  11:38 AM         51 testfile.txt
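
Getting the original objects back is useful when you want to act on them rather than just list them. As a hedged illustration, the sketch below copies files that exist in one folder but not in another. It works in two throwaway folders under the temp path (the folder and file names are invented for the example) and compares by Name only.

```powershell
# Two throwaway folders so the example does not touch real data.
$root   = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$source = Join-Path $root 'source'
$backup = Join-Path $root 'backup'
New-Item -ItemType Directory -Path $source, $backup | Out-Null

# a.txt exists in both folders, b.txt only in the source folder.
Set-Content -Path (Join-Path $source 'a.txt') -Value 'first file'
Set-Content -Path (Join-Path $backup 'a.txt') -Value 'first file'
Set-Content -Path (Join-Path $source 'b.txt') -Value 'second file'

$shot1 = Get-ChildItem -Path $backup
$shot2 = Get-ChildItem -Path $source

# -PassThru returns the original file objects, so they can be piped on.
Compare-Object $shot1 $shot2 -Property Name -PassThru |
    Where-Object { $_.SideIndicator -eq '=>' } |
    Copy-Item -Destination $backup

$copied = Test-Path -Path (Join-Path $backup 'b.txt')
"b.txt copied: $copied"

# Clean up the throwaway folders.
Remove-Item -Path $root -Recurse -Force
```

Both folders contain at least one file before the snapshots are taken; Compare-Object rejects an empty (null) result set, so a real script would need to handle an empty folder separately.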

Persisting Comparison Information

All comparisons so far have taken place in memory because we created the result sets on the fly. What if I'd like to compare folder content against a predefined base set?

You can easily do that. Simply export the result sets to XML, then reload them and do the comparison. There are three important rules when you do that:

  • Use Select-Object to select only the object properties you really need for your comparison prior to exporting objects to XML or else the XML will be very large
  • The result sets you compare both need to be written to XML and re-imported. Do not compare an imported XML result set against a live result set because object types are different
  • Specify the properties you want to compare when you use Compare-Object

Here is an example of persisting result sets. First, I create a snapshot of my $home drive and export it as XML. Since I am interested in new files and changed files, I only export Name and Length.

Next, whenever I am in need, I can import the base folder set and compare it against the current folder content. To do that, I export the current folder content as XML, too, and reimport it so that both result sets are of the same object type:

PS> Dir $home | Select-Object Name, Length | Export-Clixml $home\baseline.xml
PS> Add-Content $home\testfile.txt "Hello World"
PS> $shot1 = Import-Clixml $home\baseline.xml
PS> Dir $home | Select-Object Name, Length | Export-Clixml $home\temp.xml
PS> $shot2 = Import-Clixml $home\temp.xml
PS> Compare-Object $shot1 $shot2 -property Name, Length

Name                             Length SideIndicator
----                             ------ -------------
baseline.xml                     58150 =>
temp.xml                         45056 =>
testfile.txt                        39 =>
baseline.xml                      8192 <=
temp.xml                         58152 <=
testfile.txt                        26 <=

Note that in this example, since I stored the XML files in the same folder I monitored, they also show up in the result set.
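
To keep the XML files out of the result, they can be stored somewhere other than the monitored folder. The sketch below repeats the steps above in throwaway folders under the temp path; all folder and file names in it are assumptions made for the example.

```powershell
# Throwaway folders: one to monitor, one to hold the XML snapshots.
$root    = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
$monitor = Join-Path $root 'monitored'
$store   = Join-Path $root 'snapshots'
New-Item -ItemType Directory -Path $monitor, $store | Out-Null

$file = Join-Path $monitor 'testfile.txt'
Set-Content -Path $file -Value 'Hello'

# Baseline: export only the properties needed for the comparison.
Get-ChildItem -Path $monitor | Select-Object Name, Length |
    Export-Clixml -Path (Join-Path $store 'baseline.xml')

# Change the file, then take a second snapshot the same way.
Add-Content -Path $file -Value 'Hello World'
Get-ChildItem -Path $monitor | Select-Object Name, Length |
    Export-Clixml -Path (Join-Path $store 'current.xml')

# Both sides are re-imported from XML so the object types match.
$shot1 = Import-Clixml -Path (Join-Path $store 'baseline.xml')
$shot2 = Import-Clixml -Path (Join-Path $store 'current.xml')
$changes = @(Compare-Object $shot1 $shot2 -Property Name, Length)

$changes | Format-Table -AutoSize

# Clean up the throwaway folders.
Remove-Item -Path $root -Recurse -Force
```

Only testfile.txt should appear in the result now, once for each snapshot, because the XML files no longer live in the folder being compared.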

Summary

Compare-Object is a great way of comparing result sets. You just need to be careful to make sure:

  • you are comparing the same object types (do not mix imported XML data with live data)
  • the syncWindow is large enough to cover the number of expected differences
  • you specify the properties you really want to compare

Use the SideIndicator property to filter the results so you only get the changes in one of the result sets. And use the -passThru parameter to get the real objects.

Cheerio

-Tobias


Posted Jan 09 2009, 01:47 AM by Tobias Weltner
