XML Part 1: Playing with RSS Feeds and XML Content

A lot of data these days is wrapped as XML, and up until now, handling XML data wasn't a piece of cake. PowerShell makes handling XML a lot easier. This is the first part of a little series about XML and PowerShell. We start with accessing XML documents and reading data.

Getting XML Data

Let's first look at how PowerShell can get its hands on XML data. One way is to create a new, empty XML document:

$xml = New-Object XML

Next, you can use your new XML document to load XML data, either from a local file or from the Internet. Use the Load() method. For example, let's load the www.powershell.com RSS ticker:

PS> $a = New-Object XML
PS> $a.Load("http://powershell.com/cs/blogs/MainFeed.aspx")
PS> $a

xml                                   xml-stylesheet                       rss
---                                   --------------                       ---
                                                                           rss
PS>

As long as you have access to the Internet, these lines will download the RSS news ticker into $a as an XML document, and when you output $a, you see a number of properties.
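If the Internet connection is not there, Load() throws an exception. Here is a minimal sketch of how you could wrap the call, assuming PowerShell 2.0 or later (for try/catch). Get-XmlDocument is a made-up helper name for illustration, and the little sample file only exists so the example runs offline:

```powershell
# Hypothetical helper (not part of the article): loads XML from a file path
# or URL and returns $null with a warning when loading fails.
function Get-XmlDocument {
    param([Parameter(Mandatory = $true)][string] $Source)

    $doc = New-Object System.Xml.XmlDocument
    try {
        # Load() accepts a local path as well as an http:// URL
        $doc.Load($Source)
        return $doc
    }
    catch {
        Write-Warning "Could not load XML from '$Source': $($_.Exception.Message)"
        return $null
    }
}

# Offline stand-in for a feed, written to the temp folder
$tempDir = [System.IO.Path]::GetTempPath()
$sample  = Join-Path $tempDir 'sample-feed.xml'
'<rss version="2.0"><channel><title>Sample</title></channel></rss>' |
    Set-Content -Path $sample -Encoding UTF8

$feed = Get-XmlDocument -Source $sample
$feed.rss.version

# A source that does not exist produces a warning and $null
$missing = Get-XmlDocument -Source (Join-Path $tempDir 'does-not-exist.xml')
```

Note that the sketch uses full paths. Load() is a .NET method, so a relative path may be resolved against the process working directory rather than your current PowerShell location.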

Browsing XML Data

What exactly is their meaning? To find out, let's first save the downloaded RSS ticker to a file using the built-in Save() method. Next, we take a look at the first lines in that XML document by reading the XML file using Get-Content and selecting only the first 3 lines using Select-Object -First:

PS> $a.save("$home\rssticker.xml")
PS> Get-Content $home\rssticker.xml | Select-Object -first 3
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://powershell.com/cs/utility/FeedStylesheets/rss.xsl" media="screen"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">

As it turns out, the PowerShell xml document in $a returns the top XML nodes: xml, xml-stylesheet and rss. Diving into nested information inside an XML document is very easy because it works in pretty much the same way as with objects. So if you'd like to see all the nodes below the root node rss, you use the rss property:

PS> $a.rss


version : 2.0
dc      : http://purl.org/dc/elements/1.1/
slash   : http://purl.org/rss/1.0/modules/slash/
wfw     : http://wellformedweb.org/CommentAPI/
itunes  : http://www.itunes.com/dtds/podcast-1.0.dtd
channel : channel

PS>
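The list above mixes two different things: attributes and child elements. If you want to tell them apart, a small sketch like this one may help. It uses a made-up miniature feed instead of the real RSS ticker so it runs without Internet access:

```powershell
# Miniature stand-in for the real feed (sample data, not the actual ticker)
$doc = [xml] @'
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Sample feed</title>
  </channel>
</rss>
'@

$rss = $doc.rss

# Attributes (including namespace declarations) live in the Attributes collection
$rss.Attributes | ForEach-Object { "attribute: $($_.Name) = $($_.Value)" }

# Child elements live in the ChildNodes collection
$rss.ChildNodes | ForEach-Object { "child element: $($_.Name)" }

# GetAttribute() reads a single attribute by name
$rss.GetAttribute('version')
```

PowerShell shows both kinds as plain properties, which is convenient for browsing. The Attributes and ChildNodes collections are there when you need to know which is which.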

Again, if you looked at the raw XML data, you would see that the rss node carries exactly these pieces: version, dc, slash, wfw, itunes and channel. Most of these are attributes with pieces of information (version plus the namespace declarations dc, slash, wfw and itunes). Channel, however, is a child node that acts as a parent to a bunch of children of its own. To look at those children, again use the property name:

PS> $a.rss.channel
format-default : The member "Item" is already present.

Property Name Conflict - And A Simple Workaround

Bang! You get a red error message. You cannot list the children inside the channel parent node. This is a bug, or rather, a conceptual weakness in design.

If you look at the raw XML data, you will see that the channel node contains a bunch of item nodes, and these item nodes are the ones with the interesting stuff: the feed messages. Unfortunately, the XML node object already has its own Item property. The moment PowerShell tries to add a property for the item children, you get the error message. An object cannot have two properties with the same name.

And here is the workaround: work in "blind" mode and just assume the child elements are named "item" (or whatever the error message indicates):

PS> $a.rss.channel.item


title       : Group Policy Cmdlets in Windows 7
link        : http://powershell.com/cs/blogs/windows-powershell-team/archive/2009/01/16/group-policy-cmdlets-in
              -windows-7.aspx
pubDate     : Sat, 17 Jan 2009 04:46:08 GMT
guid        : guid
creator     : Windows PowerShell Blog
comments    : 0
description : Lilia Gutnik has posted a blog entry HERE showing an example of the Windows 7 Group Policy cmdlet
              s.&#160; Check it out. &#160; Experiment!&#160; Enjoy!&#160; Engage! Jeffrey Snover [MSFT] Window
              s Management Partner Architect Visit the Windows PowerShell Team blog at:&#160;&#160;&#160; http:
              //blogs.msdn.com/PowerShell Visit the Windows PowerShell ScriptCenter at:&#160; http://www.micros
              oft.com/technet Read More......(<a href="http://powershell.com/cs/blogs/windows-powershell-team/a
              rchive/2009/01/16/group-policy-cmdlets-in-windows-7.aspx">read more</a>)<img src="http://powershe
              ll.com/cs/aggbug.aspx?PostID=899" width="1" height="1">

title       : Jeffrey Snover and Bruce Payette on the PowerScripting Podcast
link        : http://powershell.com/cs/blogs/under-the-stairs/archive/2009/01/16/jeffrey-snover-and-bruce-payet
              te-on-the-powerscripting-podcast.aspx
pubDate     : Fri, 16 Jan 2009 14:35:00 GMT
guid        : guid
creator     : Under The Stairs
comments    : 0
description : I love downloading podcasts to my Zune and listening to them as I travel. I've got a bit of a bac
              klog, but one I've just downloaded and will be listening to shortly (possibly tomorrow as I head
              from Milan back to London) is the PowerScripting Podcast Read More......(<a href="http://powershe
              ll.com/cs/blogs/under-the-stairs/archive/2009/01/16/jeffrey-snover-and-bruce-payette-on-the-power
              scripting-podcast.aspx">read more</a>)<img src="http://powershell.com/cs/aggbug.aspx?PostID=898"
              width="1" height="1">
category    : {category, category}

title       : How To Make Your Own Module Repository
link        : http://powershell.com/cs/blogs/windows-powershell-team/archive/2009/01/16/how-to-make-your-own-mo
              dule-repository.aspx
pubDate     : Fri, 16 Jan 2009 08:52:13 GMT
guid        : guid
creator     : Windows PowerShell Blog
comments    : 0
description : Andy Schneider (from Get-PowerShell.com ) recently asked me how he could make sure that everyone
              at Avanade could get a consistent set of modules. I run into a somewhat similar problem here at M
              icrosoft, where I want to take scripts I&#39;ve built to work with internal applications and make
               them easy for people to use, even if they&#39;re not already using PowerShell. I don&#39;t want
              the scripts to Read More......(<a href="http://powershell.com/cs/blogs/windows-powershell-team/ar
              chive/2009/01/16/how-to-make-your-own-module-repository.aspx">read more</a>)<img src="http://powe
              rshell.com/cs/aggbug.aspx?PostID=895" width="1" height="1">
category    : {category, category}


(...)

Now you get to the good stuff! These are the headlines you're after.
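If you'd rather not guess the child names, here is a sketch of two alternatives that sidestep the formatting problem: listing the child element names yourself, and picking the item elements with an XPath expression via SelectNodes(). The miniature feed and the example.com links are made up so the example runs offline:

```powershell
# Miniature stand-in for the real feed (sample data)
$doc = [xml] @'
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>
'@

$channel = $doc.rss.channel

# List the distinct child element names without displaying the node itself
$channel.ChildNodes | ForEach-Object { $_.Name } | Sort-Object -Unique

# Select the item elements explicitly with an XPath expression
$items = $channel.SelectNodes('item')
$items | ForEach-Object { $_.title }
```

Assigning the channel node to a variable is fine. As far as I can tell from the message text, the error only shows up when PowerShell tries to format the node for display.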

Using The PowerShell Pipeline

You just learned that you can access any RSS feed in the world by loading it into a new and empty XML document.

Since RSS feeds are XML documents and adhere to a defined format, you can always list all RSS messages by looking at the RSS.Channel.Item property. This returns all raw data contained in the individual RSS items. From here, simply use PowerShell Pipeline tricks to filter what you are looking for.

First, I'd like to filter the information. I only want to see the blog entry title and the link. Easy: use Format-Table:

PS> $a = New-Object XML
PS> $a.Load("http://powershell.com/cs/blogs/MainFeed.aspx")
PS> $a.rss.channel.item | Format-Table Title, Link

title                                                   link
-----                                                   ----
Group Policy Cmdlets in Windows 7                       http://powershell.com/cs/blogs/windows-powershell-te...
Jeffrey Snover and Bruce Payette on the PowerScripti... http://powershell.com/cs/blogs/under-the-stairs/arch...
How To Make Your Own Module Repository                  http://powershell.com/cs/blogs/windows-powershell-te...
Please Join Me for a Power Scripting Podcast Tonight... http://powershell.com/cs/blogs/windows-powershell-te...
Running PowerShell Scripts via Email                    http://powershell.com/cs/blogs/under-the-stairs/arch...
Date and Time in PowerShell (and WMI)                   http://powershell.com/cs/blogs/under-the-stairs/arch...
Date and Time in PowerShell                             http://powershell.com/cs/blogs/powershell-scripts/ar...
Windows 7 Troubleshooting                               http://powershell.com/cs/blogs/windows-powershell-te...
Podcast Discussing WSMAN 1/14/2008                      http://powershell.com/cs/blogs/windows-powershell-te...
Get-UpTime.ps1                                          http://powershell.com/cs/blogs/powershell-scripts/ar...
Interactive remoting in CTP3                            http://powershell.com/cs/blogs/windows-powershell-te...
How to copy colorized script from PowerShell ISE        http://powershell.com/cs/blogs/windows-powershell-te...
Please Give Us Feedback                                 http://powershell.com/cs/blogs/windows-powershell-te...
Blogging in 2008                                        http://powershell.com/cs/blogs/under-the-stairs/arch...
V2 Blog Entries                                         http://powershell.com/cs/blogs/windows-powershell-te...
Copy console screen to system clipboard                 http://powershell.com/cs/blogs/windows-powershell-te...
Get-Screensaver.ps1                                     http://powershell.com/cs/blogs/powershell-scripts/ar...
Colorized capture of console screen in HTML and RTF.    http://powershell.com/cs/blogs/windows-powershell-te...
Finding a URL For File Transfer Cmdlets                 http://powershell.com/cs/blogs/windows-powershell-te...
Transferring (Large) Files Using BITs                   http://powershell.com/cs/blogs/windows-powershell-te...
Test-PSCmdlet                                           http://powershell.com/cs/blogs/windows-powershell-te...
Capture console screen                                  http://powershell.com/cs/blogs/windows-powershell-te...
Get-Hash2.ps1                                           http://powershell.com/cs/blogs/powershell-scripts/ar...
PowerShell as Inventory Tool                            http://powershell.com/cs/blogs/windows-powershell-te...
Windows 7 Beta Has Arrived   But Not For Everyone       http://powershell.com/cs/blogs/under-the-stairs/arch...

Next, I only want the top 5 entries, and I do not want the link to be cut off. So I insert a Select-Object -First (remember, the Format cmdlets always have to be the last element in your pipeline), and I add a -wrap parameter to Format-Table:

PS> $a.rss.channel.item | Select-Object -first 5 | Format-Table Title, Link -wrap

title                                                   link
-----                                                   ----
Group Policy Cmdlets in Windows 7                       http://powershell.com/cs/blogs/windows-powershell-team/
                                                        archive/2009/01/16/group-policy-cmdlets-in-windows-7.as
                                                        px
Jeffrey Snover and Bruce Payette on the PowerScripting  http://powershell.com/cs/blogs/under-the-stairs/archive
Podcast                                                 /2009/01/16/jeffrey-snover-and-bruce-payette-on-the-pow
                                                        erscripting-podcast.aspx
How To Make Your Own Module Repository                  http://powershell.com/cs/blogs/windows-powershell-team/
                                                        archive/2009/01/16/how-to-make-your-own-module-reposito
                                                        ry.aspx
Please Join Me for a Power Scripting Podcast Tonight @  http://powershell.com/cs/blogs/windows-powershell-team/
9PM EST (6PM PST)                                       archive/2009/01/15/please-join-me-for-a-power-scripting
                                                        -podcast-tonight-9pm-est-6pm-pst.aspx
Running PowerShell Scripts via Email                    http://powershell.com/cs/blogs/under-the-stairs/archive
                                                        /2009/01/15/running-powershell-scripts-via-email.aspx

Can I filter based on topic, too? Sure thing. If you are interested in Windows 7, the upcoming new Windows client, and you'd like to see only blogs about Windows 7, then add a Where-Object into your pipeline.

Inside of it, the $_ placeholder represents the current blog entry as it travels through the pipeline, and you can then check whether the title property contains one or more keywords.

PS> $a.rss.channel.item | Where-Object { $_.Title -like '*Windows 7*' } | Format-Table Title, Description

title                                                   description
-----                                                   -----------
Group Policy Cmdlets in Windows 7                       Lilia Gutnik has posted a blog entry HERE showing an...
Windows 7 Troubleshooting                               Windows 7 has a cool new extensible troubleshooting ...
Windows 7 Beta Has Arrived   But Not For Everyone       The Windows 7 and Windows Server 2008 R2 beta versio...
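The -like example checks for a single keyword. To check for several keywords at once, one option is the -match operator with a regular expression. This is just a sketch against a few titles borrowed from the listing above, wrapped in a hand-made miniature feed:

```powershell
# Miniature stand-in feed using titles from the listing above
$doc = [xml] @'
<rss version="2.0"><channel>
  <item><title>Group Policy Cmdlets in Windows 7</title></item>
  <item><title>Get-UpTime.ps1</title></item>
  <item><title>Podcast Discussing WSMAN</title></item>
  <item><title>Windows 7 Troubleshooting</title></item>
</channel></rss>
'@

# -match takes a regular expression; | separates alternative keywords
$keywords = 'Windows 7|Podcast'
$hits = $doc.rss.channel.item | Where-Object { $_.title -match $keywords }
$hits | Select-Object title
```

Like -like, the -match operator ignores case by default. Keep in mind that characters such as . or ( have a special meaning in regular expressions and may need escaping.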

Maybe you'd like to output this as a nice report. Here is the thing to remember: Never use Format-... cmdlets when you plan to send results anywhere other than the console. Use Select-Object instead. To create a simple HTML report, this is what I'd do:

PS> $a.rss.channel.item | Where-Object { $_.Title -like '*Windows 7*' } | Select-Object Title, Description | 
ConvertTo-Html | Out-File $home\report.htm; & "$home\report.htm"

This is what the resulting report looks like:

title | description
Group Policy Cmdlets in Windows 7 | Lilia Gutnik has posted a blog entry HERE showing an example of the Windows 7 Group Policy cmdlets.&#160; Check it out. &#160; Experiment!&#160; Enjoy!&#160; Engage! Jeffrey Snover [MSFT] Windows Management Partner Architect Visit the Windows PowerShell Team blog at:&#160;&#160;&#160; http://blogs.msdn.com/PowerShell Visit the Windows PowerShell ScriptCenter at:&#160; http://www.microsoft.com/technet Read More......(<a href="http://powershell.com/cs/blogs/windows-powershell-team/archive/2009/01/16/group-policy-cmdlets-in-windows-7.aspx">read more</a>)<img src="http://powershell.com/cs/aggbug.aspx?PostID=899" width="1" height="1">
Windows 7 Troubleshooting | Windows 7 has a cool new extensible troubleshooting framework which is entirely based on PowerShell scripts.&#160; Rafael Rivera has written a very good step-by-step guide for how to author a Win7 Troubleshooting Pack HERE . Check it out. Experiment!&#160; Enjoy!&#160; Engage! Jeffrey Snover [MSFT] Windows Management Partner Architect Visit the Windows PowerShell Team blog at:&#160;&#160;&#160; http Read More......(<a href="http://powershell.com/cs/blogs/windows-powershell-team/archive/2009/01/14/windows-7-troubleshooting.aspx">read more</a>)<img src="http://powershell.com/cs/aggbug.aspx?PostID=880" width="1" height="1">
Windows 7 Beta Has Arrived – But Not For Everyone | The Windows 7 and Windows Server 2008 R2 beta versions were released this week. I got the ISOs myself during the week, and finished off today loading R2, Win7 Ultimate and WIn7 Home Premium as VMware virtual machines. But it looks like Microsoft has totally Read More......(<a href="http://powershell.com/cs/blogs/under-the-stairs/archive/2009/01/10/windows-7-beta-has-arrived-but-not-for-everyone.aspx">read more</a>)<img src="http://powershell.com/cs/aggbug.aspx?PostID=848" width="1" height="1">
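Why avoid Format-... cmdlets before ConvertTo-Html? A quick way to see the reason is to look at what each cmdlet actually emits. The sketch below uses a made-up two-item feed. Format-Table hands over formatting instructions meant for the console, while Select-Object hands over objects that still carry the title and description properties:

```powershell
# Miniature stand-in feed (sample data)
$doc = [xml] @'
<rss version="2.0"><channel>
  <item><title>Windows 7 Troubleshooting</title><description>Sample text A</description></item>
  <item><title>Get-UpTime.ps1</title><description>Sample text B</description></item>
</channel></rss>
'@
$items = $doc.rss.channel.item

$formatted = $items | Format-Table title, description
$selected  = $items | Select-Object title, description

# Format-Table emits internal formatting objects, not your data
$formatted | ForEach-Object { $_.GetType().Name } | Sort-Object -Unique

# Select-Object emits objects that still have the properties you picked
$selected | ForEach-Object { $_.title }
```

Feed the formatting objects into ConvertTo-Html and the report describes the formatting instructions rather than your blog entries.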

Admittedly, the HTML report does not really look very stylish, but you could change that as well:

PS> $head = '<style> BODY{font-family:Verdana; background-color:lightblue;} TABLE{border-width: 1px;border-style: solid;border-color: black;border-collapse: collapse;} TH{font-size:1.3em; border-width: 1px;padding: 2px;border-style: solid;border-color: black;background-color:#FFCCCC} TD{border-width: 1px;padding: 2px;border-style: solid;border-color: black;background-color:yellow}</style>'
PS> $title = "My Report"
PS> $body = "<H1>New Windows 7 + PowerShell Blogs</H1>"
PS> $a.rss.channel.item | Where-Object { $_.Title -like '*Windows 7*' } | Select-Object Title, Description | 
ConvertTo-Html -title $title -head $head -body $body    | Out-File $home\report.htm; & "$home\report.htm"

Here is more brainfood on colorizing HTML reports: http://powershell.com/cs/blogs/tips/archive/2009/01/05/outputting-html-reports.aspx
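A long style string is easier to maintain as a here-string. The following is only a sketch with stand-in data: the single row is made up, the CSS uses the border shorthand instead of the separate border properties, and the report goes to the temp folder. One assumption worth flagging: as far as I can tell from the ConvertTo-Html documentation, the -Title parameter is ignored once -Head is specified, so the sketch puts the title element into the head text itself:

```powershell
# Head section as a here-string; the <title> element is included here because
# -Title appears to be ignored when -Head is used (assumption based on the docs)
$head = @'
<title>My Report</title>
<style>
BODY { font-family: Verdana; background-color: lightblue; }
TABLE { border: 1px solid black; border-collapse: collapse; }
TH { font-size: 1.3em; border: 1px solid black; padding: 2px; background-color: #FFCCCC; }
TD { border: 1px solid black; padding: 2px; background-color: yellow; }
</style>
'@

# Stand-in row; in the article the rows come from $a.rss.channel.item
$rows = @(
    New-Object PSObject -Property @{ Title = 'Windows 7 Troubleshooting'; Description = 'Sample text' }
)

$html = $rows | Select-Object Title, Description |
    ConvertTo-Html -Head $head -Body '<H1>New Windows 7 + PowerShell Blogs</H1>'

$report = Join-Path ([System.IO.Path]::GetTempPath()) 'report.htm'
$html | Out-File $report
```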

Convert Text to XML

To wrap up this first part of our XML series, let's finally look at two more ways to read XML data into PowerShell. At the beginning of this article, we loaded the XML from the Internet into an empty XML object. From here, you can save the XML to a file. Using the same Load() method, you can also load XML data from an XML file:

PS> $a = New-Object XML
PS> $a.Load("http://powershell.com/cs/blogs/MainFeed.aspx")
PS> $a.Save("$home\myxml.xml")
PS>
PS> $b = New-Object XML
PS> $b.Load("$home\myxml.xml")
PS> $b

xml                                   xml-stylesheet                       rss
---                                   --------------                       ---
                                                                           rss

So the second approach is to load a file-based XML into an empty XML object. The third approach uses type conversion. You read in XML data as plain text, then convert this into the XML data type.

Remember two things here: First, type conversion works by writing the type (in square brackets) in front of the data you want to convert. Second, use parentheses around Get-Content because you do not want to convert the Get-Content cmdlet itself, of course, but rather its result.

PS> $c = [xml] (Get-Content $home\myxml.xml)
PS> $c

xml                                   xml-stylesheet                       rss
---                                   --------------                       ---
                                                                           rss
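The same type conversion works on any text, not just on text read from a file. Here is a small sketch that converts a here-string, then writes it to a temp file and reads it back. The sample XML is made up for illustration:

```powershell
# XML as plain text in a here-string (sample data)
$text = @'
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
  </channel>
</rss>
'@

# Convert the text directly
$c = [xml] $text
$c.rss.channel.title

# Get-Content returns an array of lines; joining them keeps the text in one piece
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-text.xml'
$text | Set-Content -Path $path -Encoding UTF8
$d = [xml] ((Get-Content $path) -join "`n")
$d.rss.channel.title
```

One design note: with the type conversion, Get-Content decides how the file's characters are decoded, whereas Load() reads the file itself. For files with unusual encodings, Load() may therefore be the safer choice.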

Next Steps...

Next time, we will create our own XML documents, look at some more advanced data analysis, and update and change XML data in an XML file. Make sure to check back next week! By the way, you could now automate that, too. Simply use the RSS ticker in the examples above to check for new blog entries on powershell.com!
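Here is a rough sketch of what such a check could look like. It assumes the pubDate values follow the RFC 1123 layout shown in the listings above (other feeds may differ), and it uses a miniature stand-in feed with two entries from the article plus a made-up "last check" time:

```powershell
# Miniature stand-in feed with two entries taken from the listing above
$doc = [xml] @'
<rss version="2.0"><channel>
  <item><title>Group Policy Cmdlets in Windows 7</title><pubDate>Sat, 17 Jan 2009 04:46:08 GMT</pubDate></item>
  <item><title>How To Make Your Own Module Repository</title><pubDate>Fri, 16 Jan 2009 08:52:13 GMT</pubDate></item>
</channel></rss>
'@

$culture = [System.Globalization.CultureInfo]::InvariantCulture

# Hypothetical time of the previous check; 'r' is the RFC 1123 format specifier
$lastCheck = [DateTimeOffset]::ParseExact('Fri, 16 Jan 2009 12:00:00 GMT', 'r', $culture)

# Keep only entries published after the previous check
$new = $doc.rss.channel.item | Where-Object {
    [DateTimeOffset]::ParseExact($_.pubDate, 'r', $culture) -gt $lastCheck
}
$new | Select-Object title, pubDate
```

To run this against the live feed, you would load the XML with Load() as shown earlier and store the time of each check somewhere, for example in a small text file.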

Cheers and a great and relaxing weekend to you,
and don't forget to check out PowerShell Plus! It's great and simply one of the best ways to learn PowerShell!

-Tobias

MVP Windows PowerShell 


Posted Jan 17 2009, 10:37 AM by Tobias Weltner

Comments


Richard wrote re: XML Part 1: Playing with RSS Feeds and XML Content
on 07-21-2009 8:29 AM

I was also playing with RSS feeds in PowerShell (look at my post on www.vbscripts.nl/blogie), but there are a few things I couldn't figure out, as my post says. Maybe one of you has an idea? I can't get the script to work with WordPress blogs. Has anyone fixed this problem? Thanks...

Copyright 2012 PowerShell.com. All rights reserved.