Compare Two Directories with PowerShell
Have you ever been asked to compare two or more directories on different systems to make sure they are the same? This seems to come up all too often in my job, and I'm afraid to find out why . . . Most people compare directories by looking at the files and their modified dates or byte sizes, and this works very well. I also know that there are a lot of tools and applications that already do this, but they either cost money or are not approved by my company. So the path of least resistance is to write my own, using a SHA-1 hash as the basis of comparison.
A little review. A hash is a fixed-length value computed from the contents of a document. The same content always produces the same hash, and it is extremely unlikely (though not strictly impossible) for two different documents to produce the same one. This property can be used to check whether two files have identical contents. Hashes are used every day. For example, the passwords for the applications we use online are not stored in the clear but rather hashed (or at least they should be).
So our first step in this quick and dirty script is to write a function that returns the hash of any given file. I found a lot of this code via Google and by converting some C# code over to PowerShell. The function opens the file, computes a hash of the given type (SHA-256 by default; we will ask for SHA-1 when we call it later), and then strips the "-" separators out of the resulting hex string.
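To make the idea concrete, here is a minimal sketch that hashes three short strings with SHA-1. Get-StringHash is a hypothetical helper name invented for this illustration; it is not part of the script that follows.

```powershell
# Get-StringHash is a hypothetical helper used only for this illustration.
# It returns the SHA-1 hash of a string as an uppercase hex string.
function Get-StringHash {
    param ([string] $Text)
    $sha1 = [System.Security.Cryptography.SHA1]::Create()
    try {
        $bytes = [System.Text.Encoding]::UTF8.GetBytes($Text)
        $hash = $sha1.ComputeHash($bytes)
        # BitConverter yields "AB-CD-EF..."; drop the dashes.
        return [System.BitConverter]::ToString($hash).Replace("-", "")
    }
    finally {
        $sha1.Dispose()
    }
}

$first  = Get-StringHash "The quick brown fox"
$second = Get-StringHash "The quick brown fox"
$third  = Get-StringHash "The quick brown fix"

$first -eq $second   # True: same input, same hash
$first -eq $third    # False: one changed letter gives a different hash
```

The same reasoning carries over to files: if two files hash to the same value, their contents can be treated as identical for the purposes of this script.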
function get-hash {
param (
[string] $file = $(throw 'a filename is required'),
[string] $algorithm = "sha256"
)
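$fileStream = [system.io.file]::openread((resolve-path $file))
$hasher = [System.Security.Cryptography.HashAlgorithm]::create($algorithm)
# Close the stream even if hashing fails
try { $hash = $hasher.ComputeHash($fileStream) }
finally { $fileStream.Close() }
# BitConverter returns "AB-CD-EF..."; remove the dashes (Replace needs both arguments)
return ( ([system.bitconverter]::tostring($hash)).Replace("-", "") )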
}
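If you are on PowerShell 4.0 or later, the built-in Get-FileHash cmdlet does roughly the same job as the get-hash function above. The following is only a sketch of that alternative, not what the rest of this post uses; the temporary file exists purely to keep the example self-contained.

```powershell
# Create a throwaway file so the example can run anywhere.
$tempFile = [System.IO.Path]::GetTempFileName()
try {
    [System.IO.File]::WriteAllText($tempFile, "abc")

    # Get-FileHash returns an object with Algorithm, Hash, and Path properties.
    $result = Get-FileHash -LiteralPath $tempFile -Algorithm SHA1
    $sha1Hex = $result.Hash
    $sha1Hex
}
finally {
    # Clean up the temporary file.
    Remove-Item -LiteralPath $tempFile
}
```

The Hash property is already a plain hex string, so no dash stripping is needed.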
Now that we know how to get a hash on one file, we'll have to get the hashes of all files in a directory and its sub-directories. We'll return them as an array of objects which can be used by compare-object later on. Notice the call to get-hash with an algorithm of sha1. The filter PsIsContainer -eq $false skips over the directory entries themselves, so only files are hashed.
function get-DirHash(){
begin {
$ErrorActionPreference = "silentlycontinue"
}
process {
dir -Recurse $_ | where { $_.PsIsContainer -eq $false } | select Name,DirectoryName,@{Name="SHA1 Hash"; Expression={get-hash $_.FullName -algorithm "sha1"}}
}
end {
}
}
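Now the easy part. We build the object array for each of the two directories and then call compare-object to do the comparison on the Name and SHA1 Hash properties. The -includeEqual flag lists the files that match as well as the ones that differ. The -SyncWindow parameter can also be used to limit how many adjacent objects compare-object inspects while looking for a match.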
Compare-Object $($src | get-DirHash) $($dst | get-DirHash) -property @("Name","SHA1 Hash") -includeEqual
This gives a complete script of:
param (
[string] $src,
[string] $dst
)
function get-DirHash(){
begin {
$ErrorActionPreference = "silentlycontinue"
}
process {
dir -Recurse $_ | where { $_.PsIsContainer -eq $false } | select Name,DirectoryName,@{Name="SHA1 Hash"; Expression={get-hash $_.FullName -algorithm "sha1" }}
}
end {
}
}
function get-hash {
param(
[string] $file = $(throw 'a filename is required'),
[string] $algorithm = 'sha256'
)
$fileStream = [system.io.file]::openread((resolve-path $file))
$hasher = [System.Security.Cryptography.HashAlgorithm]::create($algorithm)
try { $hash = $hasher.ComputeHash($fileStream) }
finally { $fileStream.Close() }
return ( ([system.bitconverter]::tostring($hash)).Replace("-", "") )
}
Compare-Object $($src | get-DirHash) $($dst | get-DirHash) -property @("Name","SHA1 Hash") -includeEqual
The output looks something like the following:
C:\Data\E148884\Code\PowerShell>powershell.exe .\compare_directories.ps1 c:\temp\REF c:\temp\DST
Name SHA1 Hash SideIndicator
---- --------- -------------
deploy-sharepoint.bat 85B5967389E5FA4BDA698AA2FA9889685B7B1CA0 ==
gacutil.exe ACBB2E6BEC6DBA8BA3E7E743A5CDC22D57CC6AD4 ==
deploy-metastorm.bat F83BFB947F8970B76115E4E0B4E5CB20DE54719 <=
deploy-metastorm.bat 3BB7DB50F3FEA06976C7076DD0D30585B0F702DB =>
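To read the SideIndicator column: "==" means the name and hash matched in both directories, "<=" means the row was found only in the first (reference) set, and "=>" means it was found only in the second (difference) set. A file whose contents differ therefore shows up twice, once on each side. The sketch below reproduces this with made-up in-memory objects (the file names and hash values are placeholders, not real hashes) and filters the result down to just the mismatches.

```powershell
# Hypothetical sample data standing in for the output of get-DirHash.
$reference = @(
    [pscustomobject]@{ Name = "a.txt"; "SHA1 Hash" = "AAAA" }
    [pscustomobject]@{ Name = "b.txt"; "SHA1 Hash" = "BBBB" }
)
$difference = @(
    [pscustomobject]@{ Name = "a.txt"; "SHA1 Hash" = "AAAA" }
    [pscustomobject]@{ Name = "b.txt"; "SHA1 Hash" = "CCCC" }
)

$results = Compare-Object $reference $difference -Property @("Name", "SHA1 Hash") -IncludeEqual

# Keep only the rows that are not marked as equal.
$mismatches = @($results | Where-Object { $_.SideIndicator -ne "==" })
$mismatches | Format-Table Name, "SHA1 Hash", SideIndicator -AutoSize
```

Here a.txt is reported as equal and dropped by the filter, while b.txt is listed once per side because its hash differs.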
And there you go. Another quick script that is very useful to administrators.
Great script. One suggestion, put it in an HTML code block. Thanks though this has proved a life saver.