Raw information used to be stored in comma-separated lists or .ini files, but for some years the XML standard has prevailed. XML is an acronym for Extensible Markup Language and is a descriptive language for any structured information. In the past, handling XML was difficult, but PowerShell now has excellent XML support. With its help, you can comfortably wrap data in XML as well as read existing XML files.
XML Structure
XML uses tags to uniquely identify pieces of information. A tag is a name enclosed in a pair of angle brackets, like the tags used in HTML documents on a Web site. Typically, a piece of information is delimited by a start tag and an end tag. In the end tag, the name is preceded by "/". The result is known as a node, which in this case is called Name:
<Name>Tobias Weltner</Name>
In addition, nodes possess attributes, or information relating to the node itself. This information is in the introductory tag:
<staff branch="Hanover" Type="sales">...</staff>
If a node is empty, the start and end tags can be collapsed into a single tag. The closing symbol "/" then moves to the end of that tag. If the branch office in Hanover doesn't have any staff currently working in the sales department, the tag will look like this:
<staff branch="Hanover" Type="sales"/>
Usually, though, nodes aren't empty and they contain further information, which in turn is included in tags. This allows reproduction of information structures that can be nested as deeply as you like. The following XML structure describes two staff members of the Hanover branch office who are working in the sales department.
<staff branch="Hanover" Type="sales">
<employee>
<Name>Tobias Weltner</Name>
<function>management</function>
<age>39</age>
</employee>
<employee>
<Name>Cofi Heidecke</Name>
<function>security</function>
<age>4</age>
</employee>
</staff>
So that XML files can be recognized as such, they usually begin with a header, which in a very simple case might look like this:
<?xml version="1.0" ?>
This header declares that the subsequent XML conforms to the specifications of XML version 1.0. A reference to what is known as a "schema" can also be included in the document. Specifically, a schema has the form of an XSD (XML Schema Definition) file and describes what the valid structure of the XML file should be to fulfill a certain purpose. In the previous example, the schema could specify that there must always be a node called "staff" as part of staff information, which in turn could include as many sub-nodes named "employee" as required. The schema would also specify that name and function must be defined for each staff member.
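As a rough illustration of what such a schema might contain, the following sketch embeds a small XSD in a here-string and checks two documents against it with the .NET classes XmlSchemaSet and XmlDocument. The schema rules, the function name Test-StaffXml, and the sample documents are assumptions made up for this example; PowerShell 2.0 or later is assumed.

```powershell
# Hypothetical schema: <staff> holds any number of <employee> nodes,
# and each employee needs a Name and a function; age is optional.
$xsd = @'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="staff">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="employee" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="Name" type="xs:string"/>
              <xs:element name="function" type="xs:string"/>
              <xs:element name="age" type="xs:int" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="branch" type="xs:string"/>
      <xs:attribute name="Type" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
'@

function Test-StaffXml {
    param([string]$XmlText)

    # Load the schema; it declares no target namespace, hence the empty string.
    $schemaReader = [System.Xml.XmlReader]::Create((New-Object System.IO.StringReader $xsd))
    $schemas = New-Object System.Xml.Schema.XmlSchemaSet
    [void]$schemas.Add('', $schemaReader)

    $doc = New-Object System.Xml.XmlDocument
    $doc.LoadXml($XmlText)
    $doc.Schemas = $schemas

    # Collect every validation message instead of stopping at the first one.
    $problems = New-Object System.Collections.ArrayList
    $handler = [System.Xml.Schema.ValidationEventHandler] ({
        param($origin, $info)
        [void]$problems.Add($info.Message)
    }.GetNewClosure())
    $doc.Validate($handler)

    # The document is valid when the validator reported nothing.
    $problems.Count -eq 0
}

$good = '<staff branch="Hanover" Type="sales"><employee><Name>Tobias Weltner</Name><function>management</function><age>39</age></employee></staff>'
$bad  = '<staff branch="Hanover" Type="sales"><employee><Name>Cofi Heidecke</Name></employee></staff>'

Test-StaffXml $good
Test-StaffXml $bad
```

The first call should report that the document is valid. The second should not, because the employee node lacks the function node that this sample schema requires.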
Because XML files consist of plain text, you can easily create them using any editor or directly from within PowerShell. Let's save the previous staff list as an xml file:
$xml = @'
<?xml version="1.0" standalone="yes"?>
<staff branch="Hanover" Type="sales">
<employee>
<Name>Tobias Weltner</Name>
<function>management</function>
<age>39</age>
</employee>
<employee>
<Name>Cofi Heidecke</Name>
<function>security</function>
<age>4</age>
</employee>
</staff>
'@
$xml | Out-File employee.xml
Loading and Processing XML Files
If you want to process XML files as actual XML and not as text, the text contents must be converted into the XML type. The type conversion covered in Chapter 6 performs this task in just one line:
$xmldata = [xml](Get-Content employee.xml)
Use Get-Content to read the text from the previously saved XML file and [xml] to convert that text into a genuine XML object. You could just as easily have converted the XML text stored in the $xml variable directly:
$xmldata = [xml]$xml
However, conversion works only if the specified XML is also valid and contains no syntactic errors. You'll get an error when trying to convert if the structure of your XML is faulty.
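The following sketch shows one way to guard against such conversion errors. It assumes PowerShell 2.0 or later (for try/catch); the function name ConvertTo-XmlOrNull and the broken sample text are made up for this illustration.

```powershell
# Deliberately broken: the <Name> tag is never closed.
$broken = '<staff><employee><Name>Tobias Weltner</employee></staff>'

function ConvertTo-XmlOrNull {
    param([string]$Text)
    try {
        # The cast throws a terminating error if the text is not well-formed.
        [xml]$Text
    }
    catch {
        # Report the parser message and return nothing.
        Write-Warning "Not well-formed XML: $($_.Exception.Message)"
    }
}

$result = ConvertTo-XmlOrNull $broken
if ($null -eq $result) { 'Conversion failed, see warning above.' }
```

Returning nothing on failure lets the caller decide how to continue, instead of ending the script with an error.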
$xmldata now contains the structure of the XML document. From now on, it is very easy to retrieve single pieces of information because the XML object exposes each node and attribute as a property. You can get a staff list like this:
$xmldata.staff.employee
Name           function   age
----           --------   ---
Tobias Weltner management 39
Cofi Heidecke  security   4
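One detail worth keeping in mind when working with these properties: values read from XML arrive as text. The sketch below, which uses its own copy of the staff list so that it runs on its own, casts age to a number before comparing it. The variable names are chosen for this example only.

```powershell
$doc = [xml]@'
<staff branch="Hanover" Type="sales">
  <employee><Name>Tobias Weltner</Name><function>management</function><age>39</age></employee>
  <employee><Name>Cofi Heidecke</Name><function>security</function><age>4</age></employee>
</staff>
'@

# Node values are strings, so cast to [int] before a numeric comparison.
# Without the cast, "4" and "18" would be compared as text.
$minors = $doc.staff.employee | Where-Object { [int]$_.age -lt 18 }
$minors | ForEach-Object { $_.Name }

# Attributes of a node are reachable with the same dot notation.
$doc.staff.branch
```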
Accessing Single Nodes and Modifying Data
If a node in your XML is unique, you can access it by typing a dot as in the previous example. Often, however, XML documents contain many similar nodes (known as siblings) just as the last example includes individual employees. For example, you could use the PowerShell pipeline if you'd like to access a particular employee to modify his data:
$xmldata.staff.employee |
Where-Object { $_.Name -match "Tobias Weltner" }
Name           function   age
----           --------   ---
Tobias Weltner management 39
$employee = $xmldata.staff.employee |
Where-Object { $_.Name -match "Tobias Weltner" }
$employee.function = "vacation"
$xmldata.staff.employee
Name           function age
----           -------- ---
Tobias Weltner vacation 39
Cofi Heidecke  security 4
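Changes like this exist only in memory until they are written back to a file. A minimal sketch of how that might be done with the Save() method of the XML document follows; the file name employee-demo.xml in the temp folder is an assumption chosen so that the example doesn't overwrite anything.

```powershell
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'employee-demo.xml'

$doc = [xml]'<staff branch="Hanover" Type="sales"><employee><Name>Tobias Weltner</Name><function>management</function><age>39</age></employee></staff>'

# Change the node in memory.
$employee = $doc.staff.employee | Where-Object { $_.Name -eq 'Tobias Weltner' }
$employee.function = 'vacation'

# Save() is a .NET method, so pass a full path rather than a relative one.
$doc.Save($path)

# Reload the file to confirm that the change was written.
$reloaded = [xml](Get-Content $path)
$reloaded.staff.employee.function
```

A full path is used because .NET methods resolve relative paths against the working directory of the process, which may differ from the current PowerShell location.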
Using SelectNodes() to Choose Nodes
The SelectNodes() method, which supports the XPath query language, also allows you to select nodes. XPath specifies the "path name" to a node:
$xmldata = [xml](Get-Content employee.xml)
$xmldata.SelectNodes("staff/employee")
Name           function   age
----           --------   ---
Tobias Weltner management 39
Cofi Heidecke  security   4
The result looks just like the direct access to properties in the preceding example. However, XPath also supports filter conditions (predicates) enclosed in square brackets. The next statement retrieves just the first employee node:
$xmldata.SelectNodes("staff/employee[1]")
Name           function   age
----           --------   ---
Tobias Weltner management 39
If you'd like, you can get a list of all employees who are under the age of 18:
$xmldata.SelectNodes("staff/employee[age<18]")
Name          function age
----          -------- ---
Cofi Heidecke security 4
In a similar way, the query language will also retrieve the last employee on the list. Position specifications are also possible:
$xmldata.SelectNodes("staff/employee[last()]")
$xmldata.SelectNodes("staff/employee[position()>1]")
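A few more query variations can be sketched in the same way. The example builds its own copy of the staff list so that it runs on its own; SelectSingleNode() is a standard method of .NET XML nodes, and the variable names are chosen for this illustration.

```powershell
$doc = [xml]@'
<staff branch="Hanover" Type="sales">
  <employee><Name>Tobias Weltner</Name><function>management</function><age>39</age></employee>
  <employee><Name>Cofi Heidecke</Name><function>security</function><age>4</age></employee>
</staff>
'@

# SelectSingleNode() returns the first match (or nothing) instead of a list.
$byName = $doc.SelectSingleNode("staff/employee[Name='Cofi Heidecke']")
$byName.function

# Conditions inside the brackets can be combined with "and" / "or".
$adultManagers = $doc.SelectNodes("staff/employee[age>=18 and function='management']")
$adultManagers.Count

# A condition on the parent node can test an attribute (@branch).
$hanover = $doc.SelectNodes("staff[@branch='Hanover']/employee")
$hanover.Count
```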
Alternatively, you can use what is known as the XPathNavigator, which you get by converting the XML text through several types in succession:
$xpath = [System.XML.XPath.XPathDocument]`
[System.IO.TextReader][System.IO.StringReader]`
(Get-Content employee.xml | out-string)
$navigator = $xpath.CreateNavigator()
$query = "/staff[@branch='Hanover']/employee[last()]/Name"
$navigator.Select($query) | Format-Table Value
Value
-----
Cofi Heidecke
$query = "/staff[@branch='Hanover']/employee[Name!='Tobias Weltner']"
$navigator.Select($query) | Format-Table Value
Value
-----
Cofi Heideckesecurity4
Accessing Attributes
Attributes are information defined in an XML tag. If you'd like to see the attributes of a node, use get_Attributes():
$xmldata.staff.get_Attributes()
#text
-----
Hanover
sales
Use GetAttribute() if you'd like to query a particular attribute:
$xmldata.staff.GetAttribute("branch")
Hanover
Use SetAttribute() to specify new attributes or modify (overwrite) existing ones:
$xmldata.staff.SetAttribute("branch", "New York")
$xmldata.staff.GetAttribute("branch")
New York
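The following sketch rounds this out with a few related calls: listing attributes as name/value pairs, testing whether an attribute exists, and removing one. HasAttribute() and RemoveAttribute() are standard methods of .NET XML elements; the attribute name region is a made-up example, and PowerShell 2.0 or later is assumed for the property syntax.

```powershell
$doc = [xml]'<staff branch="Hanover" Type="sales"/>'
$staff = $doc.DocumentElement

# List every attribute as a name=value pair.
$pairs = $staff.Attributes | ForEach-Object { '{0}={1}' -f $_.Name, $_.Value }
$pairs

# GetAttribute() returns an empty string for a missing attribute,
# so HasAttribute() is the safer way to test for existence.
$staff.HasAttribute('region')

# RemoveAttribute() deletes an attribute again.
$staff.RemoveAttribute('Type')
$staff.Attributes.Count
```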
Adding New Nodes
If you'd like to add new employees to the employee list, first use CreateElement() to create an employee element and then define its inner structure. Afterwards, insert the element at the desired location in the XML structure:
$xmldata = [xml](Get-Content employee.xml)
$newemployee = $xmldata.CreateElement("employee")
$newemployee.set_InnerXML( `
"<Name>Bernd Seiler</Name><function>expert</function>")
$xmldata.staff.AppendChild($newemployee)
$xmldata.staff.employee
Name           function   age
----           --------   ---
Tobias Weltner management 39
Cofi Heidecke  security   4
Bernd Seiler   expert
$xmldata.get_InnerXml()
<?xml version="1.0" standalone="yes"?><staff branch="Hanover" Type="sales">
<employee><Name>Tobias Weltner</Name><function>management</function>
<age>39</age></employee><employee><Name>Cofi Heidecke</Name>
<function>security</function><age>4</age></employee><employee>
<Name>Bernd Seiler</Name><function>expert</function></employee></staff>
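Writing the inner XML as a string works well for fixed text, but values containing characters such as & or < would have to be escaped by hand. As an alternative sketch, each child node can be created separately and filled through its InnerText property, which does the escaping for you. The function name Add-Employee, its parameters, and the sample name are assumptions for this example.

```powershell
$doc = [xml]'<staff branch="Hanover" Type="sales"></staff>'

function Add-Employee {
    param([xml]$Document, [string]$Name, [string]$Role)

    $employee = $Document.CreateElement('employee')

    # InnerText escapes special characters such as & and < automatically.
    $nameNode = $Document.CreateElement('Name')
    $nameNode.InnerText = $Name
    [void]$employee.AppendChild($nameNode)

    $roleNode = $Document.CreateElement('function')
    $roleNode.InnerText = $Role
    [void]$employee.AppendChild($roleNode)

    # AppendChild() returns the added node; [void] suppresses that output.
    [void]$Document.DocumentElement.AppendChild($employee)
}

Add-Employee $doc 'Seiler & Sons' 'expert'
$doc.DocumentElement.InnerXml
```

The ampersand in the name ends up as &amp;amp; in the XML text, so the document stays well-formed.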
Exploring the Extended Type System
The PowerShell Extended Type System (ETS) ensures that objects can be converted into meaningful text; moreover, it can pass additional properties and methods to objects. The precise instructions for these operations are laid down in XML files having the .ps1xml file extension.
The XML Data of the Extended Type System
Whenever PowerShell has to convert an object into text, it searches through several of its own internal records to find one that describes the object and its conversion. These records are stored in XML files whose names end with .format.ps1xml. The files are located in the PowerShell root directory $pshome:
Dir $pshome\*.format.ps1xml
Mode LastWriteTime Length Name
---- ------------- ------ ----
-a--- 4/13/2007 19:40 22120 Certificate.format.ps1xml
-a--- 4/13/2007 19:40 60703 DotNetTypes.format.ps1xml
-a--- 4/13/2007 19:40 19730 FileSystem.format.ps1xml
-a--- 4/13/2007 19:40 250197 Help.format.ps1xml
-a--- 4/13/2007 19:40 65283 PowerShellCore.format.ps1xml
-a--- 4/13/2007 19:40 13394 PowerShellTrace.format.ps1xml
-a--- 4/13/2007 19:40 13540 Registry.format.ps1xml
All these files define a multitude of Views, which you can examine using PowerShell XML support.
[xml]$file = Get-Content "$pshome\dotnettypes.format.ps1xml"
$file.Configuration.ViewDefinitions.View
Name ViewSelectedBy TableControl
---- -------------- ------------
System.Reflection.Assembly ViewSelectedBy TableControl
System.Reflection.AssemblyName ViewSelectedBy TableControl
System.Globalization.CultureInfo ViewSelectedBy TableControl
System.Diagnostics.FileVersionInfo ViewSelectedBy TableControl
System.Diagnostics.EventLogEntry ViewSelectedBy TableControl
System.Diagnostics.EventLog ViewSelectedBy TableControl
System.Version ViewSelectedBy TableControl
System.Drawing.Printing.PrintDo... ViewSelectedBy TableControl
Dictionary ViewSelectedBy TableControl
ProcessModule ViewSelectedBy TableControl
process ViewSelectedBy TableControl
PSSnapInInfo ViewSelectedBy
PSSnapInInfo ViewSelectedBy TableControl
Priority ViewSelectedBy TableControl
StartTime ViewSelectedBy TableControl
service ViewSelectedBy TableControl
(...)
Finding Predefined Views
Predefined views are highly interesting because you can use the -view parameter to make extensive adjustments and modifications of results given by formatting cmdlets like Format-Table or Format-List.
Get-Process | Format-Table -view Priority
Get-Process | Format-Table -view StartTime
Unfortunately, nothing tells you that the predefined views Priority and StartTime, or any other views, exist. You have to look in the relevant XML files yourself. The overview above shows that every View node contains the child nodes Name, ViewSelectedBy, and TableControl. But the raw XML data of a view may look confusing at first:
$xmldata = $file.Configuration.ViewDefinitions.View |
Select-Object -first 1
$xmldata.get_OuterXML()
A little re-formatting results in text that's easier to read:
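# Put every tag and every piece of text on its own line, indented by nesting depth
$xmldata.get_OuterXML().Replace("<", "`t<").Replace(">", ">`t").Split("`t") |
Where-Object { $_ } |
ForEach-Object {$x=0}{
# A closing tag moves one level back out before it is printed
if ($_.StartsWith("</")) { $x-- }
("  " * $x) + $_
# An opening tag (but not a self-closing one like <Tag />) moves one level in
if ($_.StartsWith("<") -and -not $_.StartsWith("</") -and -not $_.EndsWith("/>")) { $x++ } }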
<View>
<Name>
System.Reflection.Assembly
</Name>
<ViewSelectedBy>
<TypeName>
System.Reflection.Assembly
</TypeName>
</ViewSelectedBy>
<TableControl>
<TableHeaders>
<TableColumnHeader>
<Label>
GAC
</Label>
<Width>
6
</Width>
</TableColumnHeader>
<TableColumnHeader>
<Label>
Version
</Label>
<Width>
14
</Width>
</TableColumnHeader>
<TableColumnHeader />
</TableHeaders>
<TableRowEntries>
<TableRowEntry>
<TableColumnItems>
<TableColumnItem>
<PropertyName>
GlobalAssemblyCache
</PropertyName>
</TableColumnItem>
<TableColumnItem>
<PropertyName>
ImageRuntimeVersion
</PropertyName>
</TableColumnItem>
<TableColumnItem>
<PropertyName>
Location
</PropertyName>
</TableColumnItem>
</TableColumnItems>
</TableRowEntry>
</TableRowEntries>
</TableControl>
</View>
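If string manipulation feels fragile, the indentation can also be left to .NET. The sketch below uses the XmlTextWriter class with its Formatting property set to Indented. The function name Format-XmlNode and the small sample document are assumptions for this example; the function should accept any XML node, including a View node taken from a format file.

```powershell
function Format-XmlNode {
    param([System.Xml.XmlNode]$Node, [int]$Indent = 2)

    $text   = New-Object System.IO.StringWriter
    $writer = New-Object System.Xml.XmlTextWriter $text
    $writer.Formatting  = [System.Xml.Formatting]::Indented
    $writer.Indentation = $Indent

    # WriteTo() serializes the node together with all of its child nodes.
    $Node.WriteTo($writer)
    $writer.Flush()
    $text.ToString()
}

# Small stand-in for a view definition.
$sample = [xml]'<View><Name>Demo</Name><TableControl><TableHeaders /></TableControl></View>'
$formatted = Format-XmlNode $sample.DocumentElement
$formatted
```

Unlike the string-based approach, the writer keeps a node and its text on one line and handles empty nodes without special treatment.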
Each view consists of a Name, a .NET type in ViewSelectedBy for which the view is valid, and the TableControl node specifying how the object is supposed to be converted into text. If you want to output all the views specified in the XML file in columns, just use Format-Table and select the properties that you want to show in the summary:
[xml]$file = Get-Content "$pshome\dotnettypes.format.ps1xml"
$file.Configuration.ViewDefinitions.View |
Format-Table Name, {$_.ViewSelectedBy.TypeName}
Name $_.ViewSelectedBy.TypeName
---- --------------------------
System.Reflection.Assembly System.Reflection.Assembly
System.Reflection.AssemblyName System.Reflection.AssemblyName
System.Globalization.CultureInfo System.Globalization.CultureInfo
System.Diagnostics.FileVersionInfo System.Diagnostics.FileVersionInfo
System.Diagnostics.EventLogEntry System.Diagnostics.EventLogEntry
System.Diagnostics.EventLog System.Diagnostics.EventLog
System.Version System.Version
System.Drawing.Printing.PrintDocument System.Drawing.Printing.PrintDocument
Dictionary System.Collections.DictionaryEntry
ProcessModule System.Diagnostics.ProcessModule
process {System.Diagnostics.Process, Deserialized.Sy...
PSSnapInInfo System.Management.Automation.PSSnapInInfo
PSSnapInInfo System.Management.Automation.PSSnapInInfo
Priority System.Diagnostics.Process
StartTime System.Diagnostics.Process
service System.ServiceProcess.ServiceController
System.Diagnostics.FileVersionInfo System.Diagnostics.FileVersionInfo
System.Diagnostics.EventLogEntry System.Diagnostics.EventLogEntry
System.Diagnostics.EventLog System.Diagnostics.EventLog
System.TimeSpan System.TimeSpan
System.TimeSpan System.TimeSpan
System.TimeSpan System.TimeSpan
System.AppDomain System.AppDomain
System.ServiceProcess.ServiceController System.ServiceProcess.ServiceController
System.Reflection.Assembly System.Reflection.Assembly
System.Collections.DictionaryEntry System.Collections.DictionaryEntry
process System.Diagnostics.Process
DateTime System.DateTime
System.Security.AccessControl.ObjectSecurity System.Security.AccessControl.ObjectSecurity
System.Security.AccessControl.ObjectSecurity System.Security.AccessControl.ObjectSecurity
System.Management.ManagementClass System.Management.ManagementClass
Here you see all of the views defined in this XML file. The object types for which the views are defined are in the second column. The Priority and StartTime views, which we just used, are also on the list. After a look at the second column, it should be clear that the views are intended for System.Diagnostics.Process objects, precisely the objects that Get-Process retrieves:
(Get-Process | Select-Object -first 1).GetType().FullName
System.Diagnostics.Process
Surprisingly, some names appear more than once. The reason is that, along with the TableControl node in the last example, there are other nodes that convert objects: ListControl, WideControl, and CustomControl. Only one node of this kind is allowed for each view. These nodes weren't displayed in the first overview because PowerShell bases the columns it shows for unknown objects on the first record, and the first view happened to contain a TableControl.
You are now in a position to extract all required information from the XML file. First, sort the views by ViewSelectedBy.TypeName, and then group them by this criterion. You can filter out all the views that exist only once for a particular object type. You need only those object types for which several views exist, because only then is it worthwhile to use the -view parameter to select one.
[xml]$file = Get-Content "$pshome\dotnettypes.format.ps1xml"
$file.Configuration.ViewDefinitions.View |
Sort-Object {$_.ViewSelectedBy.TypeName} |
Group-Object {$_.ViewSelectedBy.TypeName} |
Where-Object { $_.Count -gt 1} |
ForEach-Object { $_.Group} |
Format-Table Name, {$_.ViewSelectedBy.TypeName}, `
@{expression={if ($_.TableControl) { "Table" } elseif `
($_.ListControl) { "List" } elseif ($_.WideControl) { "Wide" } `
elseif ($_.CustomControl) { "Custom" }};label="Type"} -wrap
If you're wondering about the formatting of these lines, take another look at Chapter 5, which covered formatting. What's important about formatting cmdlets like Format-Table is that they let you specify object properties or scriptblocks as columns. A scriptblock is required whenever what you want to display in a column is not a direct property of the object but a subordinate one. Because you aren't interested in the direct property ViewSelectedBy but rather in its sub-property TypeName, the column has to be defined as a scriptblock. The third column is also a scriptblock. Because the text of such a long scriptblock would otherwise be used as the column heading, a formatting hash table is applied here so that you can choose the column heading yourself.
The result is an edited list that provides the names of all the views in the first column. The object type to which each view applies is in the second column. The third column shows whether a view is meant for Format-Table, Format-List, Format-Wide, or Format-Custom.
Name $_.ViewSelectedBy.TypeName Type
---- -------------------------- ----
Dictionary System.Collections.DictionaryEntry Table
System.Collections. System.Collections.DictionaryEntry List
DictionaryEntry
System.Diagnostics. System.Diagnostics.EventLog Table
EventLog
System.Diagnostics. System.Diagnostics.EventLog List
EventLog
System.Diagnostics. System.Diagnostics.EventLogEntry List
EventLogEntry
System.Diagnostics. System.Diagnostics.EventLogEntry Table
EventLogEntry
System.Diagnostics. System.Diagnostics.FileVersionInfo Table
FileVersionInfo
System.Diagnostics. System.Diagnostics.FileVersionInfo List
FileVersionInfo
Priority System.Diagnostics.Process Table
process System.Diagnostics.Process Wide
StartTime System.Diagnostics.Process Table
PSSnapInInfo System.Management.Automation. List
PSSnapInInfo
PSSnapInInfo System.Management.Automation. Table
PSSnapInInfo
System.Reflection. System.Reflection.Assembly Table
Assembly
System.Reflection. System.Reflection.Assembly List
Assembly
System.Security. System.Security.AccessControl. List
AccessControl. ObjectSecurity
ObjectSecurity
System.Security. System.Security.AccessControl. Table
AccessControl. ObjectSecurity
ObjectSecurity
service System.ServiceProcess. Table
ServiceController
System.ServiceProcess. System.ServiceProcess. List
ServiceController ServiceController
System.TimeSpan System.TimeSpan List
System.TimeSpan System.TimeSpan Wide
System.TimeSpan System.TimeSpan Table
Remember that there are several XML files containing formatting information. You'll only get a full overview of them when you generate a list for all formatting XML files.
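A sketch of such a list across all formatting files might look like this. The function name Get-FormatView and its Folder parameter are assumptions for this example. The function relies on the .format.ps1xml files being present in the folder, as they are in the installation described here; if a folder contains no such files, the function simply returns nothing.

```powershell
function Get-FormatView {
    # $Folder defaults to the PowerShell root directory.
    param([string]$Folder = $PSHOME)

    Get-ChildItem (Join-Path $Folder '*.format.ps1xml') | ForEach-Object {
        $fileName = $_.Name
        $content  = [xml](Get-Content $_.FullName)

        # Skip files without view definitions, then build one row per view.
        $content.Configuration.ViewDefinitions.View |
            Where-Object { $_ } |
            Select-Object @{Name='File'; Expression={ $fileName }},
                Name,
                @{Name='TypeName'; Expression={ $_.ViewSelectedBy.TypeName }},
                @{Name='Type'; Expression={
                    if     ($_.TableControl)  { 'Table' }
                    elseif ($_.ListControl)   { 'List' }
                    elseif ($_.WideControl)   { 'Wide' }
                    elseif ($_.CustomControl) { 'Custom' }
                }}
    }
}

Get-FormatView | Sort-Object File, Name | Format-Table -AutoSize
```

Because the function returns plain objects with named properties, the result can be sorted, grouped, or filtered further before it is formatted.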
Posted Mar 30 2009, 08:05 AM by ps1