9.2 - PMA Metadata Mapping PowerShell API

To assist with writing powerful scripts and custom functions, Proventeq Migration Accelerator (PMA) provides a set of built-in variables and helper module instances.

 

9.2.1 - Pre-defined variables

The following variables are available to all scripts/custom functions:

  • MigrationItem
    • SI - The source representation of the item
      • ItemName - The name of the item as shown in the user interface.
      • MimeType - The file extension of the item.
      • Uri - The logical path of the item.
      • AlternateUri - The path to the binary if the item is a file.
      • GroupId - The document identifier that is common across all document versions.
      • VersionLabel - The major version.
      • SubVersionLabel - The minor version.
      • EntityCategory - 0 for containers, 1 for files and links.
      • EntityType - File, Link or Folder.
      • ContentType - The content type classification of the item.
      • ItemSize - The size of the binary in bytes.
      • Level - The depth of the item in the folder tree.
      • HasUniqueSecurity - Whether the item has security information assigned.
    • SIC - The source metadata exposed as an XML document.
  • Content - The actual binary content of the item in the source repository, as a simple byte array. This is typically used for transforming XML or HTML content. For example, if the source item is an XML file, Content holds the UTF-8 encoded bytes of the XML. Note: This variable is only populated during the processing stage; it is null in script editing mode.

  • ContentType (string) - The ContentType captured for the migration item during the Discovery stage.
  • Log - A logging object of type log4net.ILog. Use it to write messages to the application log with a statement such as $Log.DebugFormat("Doing x for item with path {0}", $MigrationItem.SI.Uri);
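As a minimal sketch of how the variables listed above might be combined in a script: the stand-in object at the top is purely hypothetical and exists only so the snippet can run outside PMA, the property types (strings, an integer size) are assumptions, and how PMA consumes the final emitted value is not covered in this section.

```powershell
# Stand-in so the sketch also runs outside PMA. Inside PMA, $MigrationItem is
# supplied by the tool and this block is skipped. All values are made up.
if ($null -eq $MigrationItem) {
    $MigrationItem = [pscustomobject]@{
        SI = [pscustomobject]@{
            ItemName        = 'report.xml'
            Uri             = '/Docs/2023/report.xml'
            EntityType      = 'File'
            VersionLabel    = '2'
            SubVersionLabel = '1'
            ItemSize        = 2048
        }
    }
}

$si = $MigrationItem.SI

# Combine the major and minor labels into a single "major.minor" string.
$version = '{0}.{1}' -f $si.VersionLabel, $si.SubVersionLabel

# $Content is null in script editing mode, so guard before decoding it.
$text = if ($null -ne $Content) { [System.Text.Encoding]::UTF8.GetString($Content) } else { '' }

# $Log only exists inside PMA; skip logging when it is absent.
if ($null -ne $Log) {
    $Log.DebugFormat('Item {0} at {1} is version {2}', $si.ItemName, $si.Uri, $version)
}

# Emit the computed value.
$version
```

Guarding $Content and $Log keeps the same script usable both in script editing mode and during processing.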

 

9.2.2 - XML Helper Module

The XML Helper Module can be used to extract and transform source content which is in XML format.

 

| Sr.No. | Function API | Description | Example Usage (PowerShell) |
|---|---|---|---|
| 1 | `GetValueByXPath(byte[] byteArray, string xPath, string nsName, string nsUri)` | Loads the given XML byte array and returns the value of the node at the given XPath, using the given namespace name and URI. | `$XmlHelper.GetValueByXPath($xmlBytes, "/Persons/Person/Age", "pvq", "http://www.pvq.com")` |
| 2 | `MetaDictionaryToXElement(IDictionary<string, string> metaDict)` | Converts the given dictionary object to XML. | `$XmlHelper.MetaDictionaryToXElement($keyValueDictionary)` |
| 3 | `XElementToMetaDictionary<T>(XElement elemMeta)` | Converts the given XML to a dictionary object. | `$XmlHelper.XElementToMetaDictionary('<?xml version="1.0"?><node></node>')` |
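The following sketch illustrates the kind of lookup GetValueByXPath performs, using a made-up XML sample. Inside PMA it calls $XmlHelper as listed above; outside PMA it falls back to the standard System.Xml classes so the snippet can still be run. The fallback is only an approximation for illustration, not the helper's actual implementation, and the exact return value of the helper (for example, inner text versus markup) is an assumption.

```powershell
# Made-up sample document whose elements live in the "pvq" namespace.
$xml = '<pvq:Persons xmlns:pvq="http://www.pvq.com"><pvq:Person><pvq:Age>42</pvq:Age></pvq:Person></pvq:Persons>'
$xmlBytes = [System.Text.Encoding]::UTF8.GetBytes($xml)
$xPath = '/pvq:Persons/pvq:Person/pvq:Age'

if ($null -ne $XmlHelper) {
    # Inside PMA: use the helper with the namespace prefix and URI.
    $age = $XmlHelper.GetValueByXPath($xmlBytes, $xPath, 'pvq', 'http://www.pvq.com')
} else {
    # Outside PMA: approximate the same lookup with System.Xml.
    $doc = New-Object System.Xml.XmlDocument
    $doc.LoadXml([System.Text.Encoding]::UTF8.GetString($xmlBytes))
    $ns = New-Object System.Xml.XmlNamespaceManager($doc.NameTable)
    $ns.AddNamespace('pvq', 'http://www.pvq.com')
    $age = $doc.SelectSingleNode($xPath, $ns).InnerText
}

# MetaDictionaryToXElement asks for IDictionary<string, string>, so build a
# typed Dictionary rather than a PowerShell hashtable.
$meta = New-Object 'System.Collections.Generic.Dictionary[string,string]'
$meta.Add('Title', 'Annual report')
$meta.Add('Author', 'J. Smith')
if ($null -ne $XmlHelper) {
    $metaXml = $XmlHelper.MetaDictionaryToXElement($meta)
}

$age
```

During processing, the $Content variable could be passed in place of $xmlBytes when the source item is an XML file.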

 

9.2.3 - HTML Helper API

The HTML Helper API can be used to extract and transform source HTML content. It can be used to:

  • Retrieve specific HTML content from source HTML based on complex XPath queries. This helps in extracting only the relevant HTML from source.
  • Cleanse HTML by removing unwanted nodes and/or attributes.
    This is particularly useful for removing embedded styles within HTML.
  • Transform HTML by replacing nodes. This can help achieve standards compliance, e.g. AA.

An instance of the HTML Helper module is available to metadata mapping scripts and functions through the variable 'HtmlHelper'.

The methods available on the HTML Helper are listed below:

 

| Sr.No. | Function API | Description | Example Usage |
|---|---|---|---|
| 1 | `DecodeHTML(string htmlContent)` | Converts a string that has been HTML-encoded for HTTP transmission into a decoded string. | `$HtmlHelper.DecodeHTML("<html>xxx</html>")` |
| 2 | `LoadHtmlBytes(byte[] content)` | Loads HTML bytes into the HTMLDocument object. | `$HtmlHelper.LoadHtmlBytes($bytes)` |
| 3 | `LoadHtml(string html)` | Loads an HTML string into the HTMLDocument object. | `$HtmlHelper.LoadHtml("<html>xxx</html>")` |
| 4 | `LoadUrl(string url)` | Loads HTML content from the given URL into the HTMLDocument object. | `$HtmlHelper.LoadUrl($url)` |
| 5 | `GetOuterHTML(string nodeXpath)` | Returns the outer HTML string for the given node path in the HTMLDocument object loaded by one of the Load methods. | `$HtmlHelper.GetOuterHTML("//tr[@class = 'someClass1' or @class = 'someClass2']")` |
| 6 | `GetValueByXPath(byte[] content, string xPath)` | Returns the inner HTML value for the given node path in the given HTML content. | `$HtmlHelper.GetValueByXPath($bytes, "//tr[@class = 'someClass1' or @class = 'someClass2']")` |
| 7 | `GetInnerHtml(string nodeXpath)` | Returns the inner HTML value for the given node path in the HTMLDocument object loaded by one of the Load methods. | `$HtmlHelper.GetInnerHtml("//tr[@class = 'someClass1' or @class = 'someClass2']")` |
| 8 | `GetContentBetweenComments(string startComment, string endComment)` | Returns the content between the given start and end comments in the HTMLDocument object loaded by one of the Load methods. | `$HtmlHelper.GetContentBetweenComments("startComment", "endComment")` |
| 9 | `RemoveNodesByXpath(string nodeXpath)` | Removes the nodes matching the given path from the HTMLDocument object loaded by one of the Load methods. | `$HtmlHelper.RemoveNodesByXpath("//tr[@class = 'someClass1' or @class = 'someClass2']")` |
| 10 | `RemoveAtrributesByXpath(string attributeName, string attributeXpath)` | Removes the given attribute from the nodes matching the given path in the HTMLDocument object loaded by one of the Load methods. | `$HtmlHelper.RemoveAtrributesByXpath("Name", "//tr")` |
| 11 | `ReplaceNodes(string nodeXpath, string replacementNodeName)` | Replaces the nodes matching the given path with nodes of the new name in the HTMLDocument object loaded by one of the Load methods. | `$HtmlHelper.ReplaceNodes("//tr", "//td")` |
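The methods listed above are typically chained: load a document, cleanse it, then read the result back. The sketch below shows one such sequence on a made-up HTML fragment. Inside PMA it uses $HtmlHelper; outside PMA it falls back to System.Xml, which works here only because the sample happens to be well-formed XML. The fallback is an illustration of the intended effect, not the helper's implementation, and the assumption that the helper keeps the loaded document between calls is inferred from the descriptions above.

```powershell
# Made-up fragment: one unwanted header row plus an embedded style to strip.
$html = '<html><body><table>' +
        '<tr class="someClass1" style="color:red"><td>Header</td></tr>' +
        '<tr style="color:red"><td>Body</td></tr>' +
        '</table></body></html>'

if ($null -ne $HtmlHelper) {
    # Inside PMA: load once, then cleanse and read back from the same document.
    # [void] discards any return values (whether these methods return anything
    # is not stated in the method list, so this is a precaution).
    [void]$HtmlHelper.LoadHtml($html)
    [void]$HtmlHelper.RemoveNodesByXpath("//tr[@class = 'someClass1']")
    [void]$HtmlHelper.RemoveAtrributesByXpath('style', '//tr')
    $result = $HtmlHelper.GetOuterHTML('//table')
} else {
    # Outside PMA: approximate the same steps with System.Xml.
    $doc = New-Object System.Xml.XmlDocument
    $doc.LoadXml($html)
    # @() materialises the node list before the document is modified.
    foreach ($node in @($doc.SelectNodes("//tr[@class = 'someClass1']"))) {
        [void]$node.ParentNode.RemoveChild($node)
    }
    foreach ($node in @($doc.SelectNodes('//tr[@style]'))) {
        $node.RemoveAttribute('style')
    }
    $result = $doc.SelectSingleNode('//table').OuterXml
}

$result
```

During processing, $HtmlHelper.LoadHtmlBytes($Content) could replace the LoadHtml call when the source item itself is an HTML file.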

 

 
