Getting PowerShell Team Blog Topic Headers

In a previous tip, you learned how to use regular expressions (RegEx) to scrape information from web pages. It comes down to finding the right "anchors": fixed pieces of text that mark the start and the end of what you are looking for. The following code reads all PowerShell team blog headers:

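# Start anchor '<span></span>', end anchor '</a></h4>'; the lazy group (.*?) captures the text in between
$regex = [RegEx]'<span></span>(.*?)</a></h4>'

# Download the raw HTML of the blog's front page
$url = 'http://blogs.msdn.com/b/powershell/'
$wc = New-Object System.Net.WebClient
$content = $wc.DownloadString($url)

# Emit the first capture group (the header text) of every match
$regex.Matches($content) | ForEach-Object { $_.Groups[1].Value }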
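
To see how the anchors behave without going online, here is a minimal sketch that runs the same pattern against an inline HTML fragment. The fragment, the post titles, and the variable names $sample, $oneLine, $titles, and $greedy are made up for illustration and are not the blog's actual markup. It also contrasts the lazy (.*?) from the tip with a greedy (.*), which shows why the lazy form matters when several headers sit on one line.

```powershell
# Hypothetical HTML fragment standing in for downloaded page content (assumed markup)
$sample = @'
<h4><a href="/post1"><span></span>First Post Title</a></h4>
<h4><a href="/post2"><span></span>Second Post Title</a></h4>
'@

# Same pattern as in the tip: lazy capture between the two anchors
$regex = [RegEx]'<span></span>(.*?)</a></h4>'
$titles = $regex.Matches($sample) | ForEach-Object { $_.Groups[1].Value }
$titles

# Two hypothetical headers on a single line
$oneLine = '<span></span>A</a></h4><span></span>B</a></h4>'

# Lazy: stops at the first end anchor, so each header is matched separately
$lazy = $regex.Matches($oneLine) | ForEach-Object { $_.Groups[1].Value }

# Greedy: runs on to the last end anchor, so both headers collapse into one match
$greedyRegex = [RegEx]'<span></span>(.*)</a></h4>'
$greedy = $greedyRegex.Matches($oneLine) | ForEach-Object { $_.Groups[1].Value }
```

With this sample, $titles holds the two made-up titles and $lazy holds A and B, while $greedy is a single string that swallows the markup between the two headers. If the real page changes its markup, the anchors in the pattern need to be adjusted accordingly.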

Posted Oct 07 2010, 08:00 AM by ps1
Copyright 2011 PowerShell.com. All rights reserved.