9.2 - PMA Metadata Mapping PowerShell API
To assist with writing powerful scripts and custom functions, Proventeq Migration Accelerator provides some built-in variables and helper module instances.
9.2.1 - Pre-defined variables
The following variables are available to all scripts/custom functions:
- MigrationItem
  - SI - The source representation of the item.
    - ItemName - The name of the item as shown in the user interface.
    - MimeType - The file extension of the item.
    - Uri - The logical path of the item.
    - AlternateUri - The path to the binary if the item is a file.
    - GroupId - The document identifier that is common across all document versions.
    - VersionLabel - The major version.
    - SubVersionLabel - The minor version.
    - EntityCategory - 0 for containers, 1 for files and links.
    - EntityType - File, Link or Folder.
    - ContentType - The content type classification of the item.
    - ItemSize - The size of the binary in bytes.
    - Level - The depth of the item in the folder tree.
    - HasUniqueSecurity - Whether the item has security information assigned.
  - SIC - The source metadata exposed as an XML document.
- Content - The binary content of the item in the source repository, as a simple byte array. It is typically used for transforming XML or HTML content. For example, if the source item is an XML file, Content holds the UTF-8 encoded bytes of the XML. Note: this variable is populated only during the processing stage. It is null in script editing mode.
- ContentType (string) - The content type captured for the migration item during the Discovery stage.
- Log - A logging object of type log4net.ILog. It can be used to write messages to the application log with statements such as $Log.DebugFormat("Doing x for item with path {0}", $MigrationItem.SI.Uri)
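The variables above can be combined in a mapping script along the lines of the following minimal sketch. The stand-in objects at the top are hypothetical and exist only so that the snippet also runs outside PMA; inside PMA the real $MigrationItem, $Content and $Log are already defined. The sample values and the "name (vX.Y)" naming rule are assumptions made for illustration, not a PMA convention.

```powershell
# Stand-ins (hypothetical): only used when the script runs outside PMA.
if ($null -eq $MigrationItem) {
    $MigrationItem = [pscustomobject]@{
        SI = [pscustomobject]@{
            ItemName        = 'report.docx'
            Uri             = '/library/finance/report.docx'
            VersionLabel    = '2'
            SubVersionLabel = '1'
            EntityCategory  = 1
        }
    }
}
if ($null -eq $Log) {
    $Log = [pscustomobject]@{}
    $Log | Add-Member -MemberType ScriptMethod -Name DebugFormat -Value {
        param([string]$format, $arg0)
        Write-Verbose ($format -f $arg0)
    }
}

$si = $MigrationItem.SI
$Log.DebugFormat('Processing item with path {0}', $si.Uri)

# EntityCategory is 0 for containers and 1 for files and links.
if ($si.EntityCategory -eq 1) {
    # Illustrative naming rule (assumption): append major.minor version to the name.
    $versionedName = '{0} (v{1}.{2})' -f $si.ItemName, $si.VersionLabel, $si.SubVersionLabel
} else {
    $versionedName = $si.ItemName
}

# Content is null in script editing mode, so guard before decoding it.
if ($null -ne $Content) {
    $text = [System.Text.Encoding]::UTF8.GetString($Content)
}

$versionedName
```

Guarding on a null Content keeps the same script usable both in the script editor and during the processing stage.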
9.2.2 - XML Helper Module
The XML Helper Module can be used to extract and transform source content which is in XML format.
| Sr.No. | Function API | Description | Example Usage (PowerShell) |
|---|---|---|---|
| 1 | GetValueByXPath(byte[] byteArray, string xPath, string nsName, string nsUri) | Loads the given XML byte array and returns the value of the node at the given path, using the given namespace name and URI. | $XmlHelper.GetValueByXPath($xmlBytes, "/Persons/Person/Age", "pvq", "http://www.pvq.com") |
| 2 | MetaDictionaryToXElement(IDictionary<string, string> metaDict) | Converts the given dictionary object to XML. | $XmlHelper.MetaDictionaryToXElement($keyValueDictionary) |
| 3 | XElementToMetaDictionary<T>(XElement elemMeta) | Converts the given XML to a dictionary object. | $XmlHelper.XElementToMetaDictionary('<?xml version="1.0" ?><node></node>') |
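As an illustration of GetValueByXPath, the sketch below reads a single value from namespaced XML. The stand-in for $XmlHelper is hypothetical: it is built on System.Xml, assumes that nsName is registered as the prefix for nsUri, and assumes that the method returns the inner text of the first matching node. The real helper may behave differently. The sample XML is made up; during processing, $Content would normally supply the bytes.

```powershell
# Stand-in (hypothetical): approximates GetValueByXPath with System.Xml so the
# sketch runs outside PMA. Inside PMA, $XmlHelper is already defined.
if ($null -eq $XmlHelper) {
    $XmlHelper = [pscustomobject]@{}
    $XmlHelper | Add-Member -MemberType ScriptMethod -Name GetValueByXPath -Value {
        param([byte[]]$byteArray, [string]$xPath, [string]$nsName, [string]$nsUri)
        $doc = New-Object System.Xml.XmlDocument
        $doc.LoadXml([System.Text.Encoding]::UTF8.GetString($byteArray))
        # Map the prefix used in the XPath to the namespace URI.
        $ns = New-Object System.Xml.XmlNamespaceManager($doc.NameTable)
        $ns.AddNamespace($nsName, $nsUri)
        $node = $doc.SelectSingleNode($xPath, $ns)
        if ($null -ne $node) { $node.InnerText }
    }
}

# Sample bytes (assumption); in a real mapping script, pass $Content instead.
$sampleXml = '<Persons xmlns="http://www.pvq.com"><Person><Age>42</Age></Person></Persons>'
$xmlBytes = [System.Text.Encoding]::UTF8.GetBytes($sampleXml)

# Every element is in the pvq namespace, so every step of the path is prefixed.
$age = $XmlHelper.GetValueByXPath($xmlBytes, '/pvq:Persons/pvq:Person/pvq:Age', 'pvq', 'http://www.pvq.com')
$age
```

When the source XML declares a default namespace, unprefixed XPath steps generally do not match, which is why the path in this sketch uses the prefix on each element.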
9.2.3 - HTML Helper API
The HTML Helper API can be used to extract and transform source HTML content. It can be used to:
- Retrieve specific HTML content from source HTML based on complex XPath queries. This helps in extracting only the relevant HTML from source.
- Cleanse HTML by removing unwanted nodes and/or attributes. This is particularly useful for removing embedded styles within HTML.
- Transform HTML by replacing nodes. This can be used for achieving standards compliance, e.g. AA.
An instance of the HTML Helper module is available to metadata mapping scripts and functions as the variable $HtmlHelper.
The methods available on the HTML Helper are listed below:
| Sr.No. | Function API | Description | Example Usage |
|---|---|---|---|
| 1 | DecodeHTML(string htmlContent) | Converts a string that has been HTML-encoded for HTTP transmission into a decoded string. | $HtmlHelper.DecodeHTML("<html>xxx</html>") |
| 2 | LoadHtmlBytes(byte[] content) | Loads HTML bytes into the HTMLDocument object. | $HtmlHelper.LoadHtmlBytes($bytes) |
| 3 | LoadHtml(string html) | Loads an HTML string into the HTMLDocument object. | $HtmlHelper.LoadHtml("<html>xxx</html>") |
| 4 | LoadUrl(string url) | Loads HTML content from the given URL into the HTMLDocument object. | $HtmlHelper.LoadUrl($url) |
| 5 | GetOuterHTML(string nodeXpath) | Returns the outer HTML string for the given node path in the HTMLDocument object loaded by one of the Load methods. | $HtmlHelper.GetOuterHTML("//tr[@class = 'someClass1' or @class = 'someClass2']") |
| 6 | GetValueByXPath(byte[] content, string xPath) | Returns the inner HTML value for the given node path in the given HTML content. | $HtmlHelper.GetValueByXPath($bytes, "//tr[@class = 'someClass1' or @class = 'someClass2']") |
| 7 | GetInnerHtml(string nodeXpath) | Returns the inner HTML value for the given node path in the HTMLDocument object loaded by one of the Load methods. | $HtmlHelper.GetInnerHtml("//tr[@class = 'someClass1' or @class = 'someClass2']") |
| 8 | GetContentBetweenComments(string startComment, string endComment) | Returns the content between the given start and end comments in the HTMLDocument object loaded by one of the Load methods. | $HtmlHelper.GetContentBetweenComments("startComment", "endComment") |
| 9 | RemoveNodesByXpath(string nodeXpath) | Removes the nodes matching the given path from the HTMLDocument object loaded by one of the Load methods. | $HtmlHelper.RemoveNodesByXpath("//tr[@class = 'someClass1' or @class = 'someClass2']") |
| 10 | RemoveAtrributesByXpath(string attributeName, string attributeXpath) | Removes the given attribute from the HTMLDocument object loaded by one of the Load methods. | $HtmlHelper.RemoveAtrributesByXpath("Name", "//tr") |
| 11 | ReplaceNodes(string nodeXpath, string replacementNodeName) | Replaces the nodes matching the given path with nodes of the new node name in the HTMLDocument object loaded by one of the Load methods. | $HtmlHelper.ReplaceNodes("//tr", "//td") |
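The cleansing use case described for the HTML Helper (removing embedded styles) could look roughly like the sketch below. The stand-in for $HtmlHelper is hypothetical: it mimics four of the listed methods with System.Xml, handles only well-formed markup, and assumes that attributeXpath selects the elements carrying the attribute to remove. Inside PMA the real $HtmlHelper is already defined and its behaviour may differ. The sample HTML is made up.

```powershell
# Stand-in (hypothetical): mimics a few HtmlHelper methods with System.Xml so the
# sketch runs outside PMA. Unlike a real HTML parser, it needs well-formed markup.
if ($null -eq $HtmlHelper) {
    $HtmlHelper = [pscustomobject]@{ Doc = New-Object System.Xml.XmlDocument }
    $HtmlHelper | Add-Member -MemberType ScriptMethod -Name LoadHtml -Value {
        param([string]$html)
        $this.Doc.LoadXml($html)
    }
    $HtmlHelper | Add-Member -MemberType ScriptMethod -Name RemoveNodesByXpath -Value {
        param([string]$nodeXpath)
        # @() takes a snapshot so nodes can be removed while looping.
        foreach ($node in @($this.Doc.SelectNodes($nodeXpath))) {
            [void]$node.ParentNode.RemoveChild($node)
        }
    }
    $HtmlHelper | Add-Member -MemberType ScriptMethod -Name RemoveAtrributesByXpath -Value {
        param([string]$attributeName, [string]$attributeXpath)
        foreach ($node in @($this.Doc.SelectNodes($attributeXpath))) {
            $node.RemoveAttribute($attributeName)
        }
    }
    $HtmlHelper | Add-Member -MemberType ScriptMethod -Name GetInnerHtml -Value {
        param([string]$nodeXpath)
        $this.Doc.SelectSingleNode($nodeXpath).InnerXml
    }
}

# Sample HTML (assumption); during processing, LoadHtmlBytes($Content) could be used instead.
$html = '<html><head><style>p { color: red; }</style></head><body><p style="color: red;">Hello</p></body></html>'

# Assign to $null in case the real methods return a value that should not reach the output.
$null = $HtmlHelper.LoadHtml($html)
$null = $HtmlHelper.RemoveNodesByXpath('//style')                    # drop embedded style blocks
$null = $HtmlHelper.RemoveAtrributesByXpath('style', '//*[@style]')  # drop inline style attributes

$cleanBody = $HtmlHelper.GetInnerHtml('//body')
$cleanBody
```

Loading once and then chaining the remove calls before reading the result follows the pattern implied by the table, where the Get and Remove methods operate on the document loaded by an earlier Load call.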