11-27-2008
Downloads: 370
File size: 1.3kB
Views: 1,365
Extracting Popular Names from Website
$decade = Read-Host 'Enter decade (1880 - 2000)'

# Download the raw HTML of the decade page
Write-Progress "Connecting Web" "www.ssa.gov"
$wc = New-Object System.Net.WebClient
$nl = $wc.DownloadString("http://www.ssa.gov/OACT/babynames/decades/names$($decade)s.html")

# Capture the content of every table cell whose attribute ends in ="15%"
Write-Progress "Analyzing Data" "extracting..."
$r = [regex]'="15%">(.*?)</td>'
$m = $r.Matches($nl)

$list = @()
$sex = "male"

# Each record spans three consecutive matches: name, count, percent
foreach ($i in 0..($m.Count - 1)) {
    $record = '' | Select-Object Name, Count, Percent, Sex
    $record.Name = $m[$i].Groups[1].Value
    if (!($i % 60)) {
        Write-Progress "Finding Names ($($i/3))" $record.Name -PercentComplete ($i * 100 / $m.Count)
    }

    # Advance the loop enumerator manually to consume the next two matches
    [void] $foreach.MoveNext()
    $record.Count = [int]($m[$foreach.Current].Groups[1].Value)
    [void] $foreach.MoveNext()
    $record.Percent = "{0:p4}" -f (([double]$m[$foreach.Current].Groups[1].Value) / 100)

    # Records alternate between male and female
    $record.Sex = $sex
    if ($sex -eq 'male') { $sex = 'female' } else { $sex = 'male' }
    $list += $record
}

$list | Select-Object -First 5
'#' * 40
$list | Sort-Object Count -Descending | Where-Object { $_.Sex -eq 'male' } | Select-Object -First 5

This script demonstrates how to extract raw HTML data from a website and turn it into rich PowerShell objects. It asks for a decade (1880 - 2000), downloads the matching page from www.ssa.gov, and then shows the most popular male and female names for that decade. It also demonstrates how to use Write-Progress to display status messages and a progress bar. The script is based on original work by MOW (http://thepowershellguy.com/blogs/posh/archive/2007/02/13/hey-powershell-how-popular-is-this-baby-name.aspx).
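
The extraction step can be tried without a network connection. The following is a minimal sketch, not the original author's code: it applies the same regular expression to a small, made-up HTML fragment and builds one object per group of three cells. The sample markup, the placeholder names NameA and NameB, and their numbers are assumptions for illustration only; the real page may be laid out differently. It also assumes PowerShell 3.0 or later for [pscustomobject], and it strips thousands separators before converting the count.

```powershell
# Hypothetical sample rows; the real page layout and values may differ.
$html = @'
<tr><td>1</td><td width="15%">NameA</td><td width="15%">1,000</td><td width="15%">2.5</td>
<td width="15%">NameB</td><td width="15%">900</td><td width="15%">2.0</td></tr>
'@

# Same pattern as the script: capture the content of each "15%" cell
$cells = [regex]::Matches($html, '="15%">(.*?)</td>')

$sex = 'male'
# Step through the matches three at a time: name, count, percent
$list = for ($i = 0; ($i + 2) -lt $cells.Count; $i += 3) {
    [pscustomobject]@{
        Name    = $cells[$i].Groups[1].Value
        Count   = [int]($cells[$i + 1].Groups[1].Value -replace ',', '')
        Percent = ([double]$cells[$i + 2].Groups[1].Value) / 100
        Sex     = $sex
    }
    # Records alternate between male and female
    $sex = if ($sex -eq 'male') { 'female' } else { 'male' }
}

$list
```

Stepping the index by three makes the name/count/percent grouping explicit, where the script above advances the foreach enumerator by hand to get the same grouping. Percent stays numeric here so the objects can still be sorted on it; format it with "{0:p4}" only when displaying.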