Fastest way to find a value in big arrays


Top 500 Contributor
Posts 5
Trondhindenes Posted: 11-16-2010 10:06 AM

Hi all,

I'm struggling with a script we are writing for a customer. Each night, we get a CSV file from an Oracle database containing users that we create as contacts in Active Directory.

So, I read in the CSV and the existing contacts from AD, and compare them to find the users that don't already exist in AD. Those are the ones we create.

Now, if this were a few hundred users it would be fine, but the file currently contains 52000 users, and my script just isn't fast enough. Right now, it handles about 5000 users per 30 minutes or so.

So, I'm looking for ways to make array lookups as efficient as possible. Each user has a unique number (membernumber), so I was thinking that sorting both tables on this number and making the lookup stop at the first match would speed things up (as in, don't keep looking through all 52000 lines if you found what you needed in line 3).

I know this is general, but any pointers on how to do efficient lookups/comparisons for large amounts of data would be so helpful.

Any pointers appreciated.

 

regards,

Trond Hindenes

Top 10 Contributor
Posts 597
Microsoft MVP
Top Contributor

Can you get the AD users as a CSV as well? Then you could use Compare-Object. Make sure both CSVs use memberID as the property name (column header), and sort both on it first:

$a = Import-Csv listAD.csv | Sort-Object memberID
$b = Import-Csv list2.csv | Sort-Object memberID
Compare-Object $a $b -Property memberID

This may be a fast way, but you need to look at the -SyncWindow parameter as well to make sure it detects all differences.

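
To show what the comparison gives back, here is a minimal sketch. The two in-memory lists and their IDs are made up for illustration and stand in for the imported CSVs. Compare-Object adds a SideIndicator property to each result, and '=>' marks entries that appear only in the second (difference) list, which here would be the users not yet in AD.

```powershell
# Hypothetical sample data standing in for the two imported CSV files
$ad = @(
    New-Object PSObject -Property @{ memberID = '1001' }
    New-Object PSObject -Property @{ memberID = '1002' }
)
$oracle = @(
    New-Object PSObject -Property @{ memberID = '1001' }
    New-Object PSObject -Property @{ memberID = '1002' }
    New-Object PSObject -Property @{ memberID = '1003' }
)

# '=>' means the entry exists only in -DifferenceObject (the Oracle export)
$missing = Compare-Object -ReferenceObject $ad -DifferenceObject $oracle -Property memberID |
    Where-Object { $_.SideIndicator -eq '=>' }

# List the member numbers that still need a contact
$missing | ForEach-Object { $_.memberID }
```

Filtering on '<=' instead would list entries that exist only in the AD list.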
Top 10 Contributor
Posts 597
Microsoft MVP
Top Contributor

Here is another superfast way:

[System.Collections.ArrayList]$list = 1..300000   # this would be your large list of memberids

$list.Contains($idcompare)   # this would lookup an id in your arraylist and return true/false
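
One caveat worth knowing: ArrayList.Contains checks the items one after another, so each lookup still takes time proportional to the length of the list. If that turns out to be too slow for tens of thousands of lookups, a hashtable keyed on the member number is a common alternative, since ContainsKey is a hash lookup rather than a scan. A minimal sketch, with made-up IDs and a hypothetical variable name $existingIds:

```powershell
# Hypothetical IDs; in the real script these would come from the existing AD contacts
$existingIds = @{}
foreach ($id in '1001', '1002', '1003') {
    $existingIds[$id] = $true   # the value is irrelevant, only the key matters
}

$idcompare = '1002'
$found = $existingIds.ContainsKey($idcompare)    # hash lookup, returns true/false
$notFound = $existingIds.ContainsKey('9999')     # an ID that was never added
```

Building the hashtable costs one pass over the data, after which every lookup is cheap regardless of how many IDs it holds.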

Top 500 Contributor
Posts 5

Ah, I didn't know about Compare-Object, I'll check that out! Thanks!

Top 500 Contributor
Posts 5

Just wanted to let you know, Compare-Object is EXACTLY what I need. This is a script that runs each night, and the changes from night to night are small (0.1-1% maybe), so I'll pipe everything through a compare first (against last night's run) and then have a much smaller amount of data to perform AD manipulations with.

Again, thanks so much. Saved my day!
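
For anyone finding this later, the night-to-night compare described above could look roughly like the sketch below. The CSV contents, the name column, and the variable names are invented for illustration, and the loop body is only a placeholder for the real AD work. -PassThru makes Compare-Object return the original rows instead of just the compared property.

```powershell
# Hypothetical contents of last night's and tonight's export
$lastNight = @'
memberID,name
1001,Ann
1002,Bob
'@ | ConvertFrom-Csv

$tonight = @'
memberID,name
1001,Ann
1002,Bob
1003,Cid
'@ | ConvertFrom-Csv

# Keep only rows that are new tonight; -PassThru preserves all columns of each row
$new = Compare-Object $lastNight $tonight -Property memberID -PassThru |
    Where-Object { $_.SideIndicator -eq '=>' }

foreach ($row in $new) {
    # Placeholder for the actual contact creation in AD
    "Would create contact for $($row.memberID) ($($row.name))"
}
```

Note that comparing on memberID alone only finds added or removed users; a changed name on an existing memberID would not show up.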

Top 10 Contributor
Posts 597
Microsoft MVP
Top Contributor

pleasure... glad it works for you so well!

Copyright 2012 PowerShell.com. All rights reserved.