14.2. Creating Lookup tables for Custom Functions
Often target metadata may need to be supplemented with additional data from a CSV file. For example, a table of information for each employee where a lookup using the EmployeeID from the source item is then used to return the user’s department, line manager, grade level etc. from a .CSV file.
In such cases, a Target metadata property can be mapped to a PowerShell script which can lookup in a CSV file for relevant information. If the lookup file(.CSV) were to occur every time an item is processed, this approach would severely impact migration performance, especially when using large lookup tables.
To resolve such issues, PMA has the ability for a .CSV file to be loaded only once per migration thread instead of for each item.
Note: This feature is only supported on non SaaS installations of PCS.
- Design CSV file
The CSV file must contain a unique Item ID column (effectively a column that acts as a key), and columns for all the Metadata associated with the item.
EmployeeID, FirstName, Surname, DOB
0001, Jane, Potter, 16/2/1962
0002, Sarah, Mahoney, 28/4/1995
0003, Rosemary, Sharkey, 3/3/1975
Example EmployeeDatabase.CSV
EmployeeID is the key and there are three columns FirstName, Surname and DOB
- Creating new PowerShell module
A new PowerShell module file (.PSM1 file) needs to be created which will be executed once by each thread basis to load the data.
e.g. Datatable_Load.psm1
- Use PowerShell to import CSV.
Edit the new PowerShell script Datatable_Load.psm1 to create a script to load the .CSV file.
Syntax to create a global variable and import the CSV
$global:<VariableToHoldCSV> = Import-CSV -path ‘<My csv path>’
e.g.
$global:Employee_Metadata = Import-CSV -path ‘c:\datamigration\EmployeeDatabase.CSV’
When the script is executed the Employee_Metadata global variable will contain the contents of the CSV file.
- Add reference to new PowerShell module
PMA needs to be configured to load this new PowerShell module.
Open Notepad in Administrator mode and open the main Proventeq PowerShell Module
Proventeq.MigrationAccelerator.PowerShell.Module.psd1 (typically located in C:\Program Files\Proventeq\Proventeq Migration Accelerator\PowerShell)
Update the NestedModule property
of Proventeq.MigrationAccelerator.PowerShell.Module.psd1 file to include this new module (e.g., Datatable_Load.psm1). Typically
NestedModules = @("Proventeq.MigrationAccelerator.SharePointModule.psm1","Proventeq.MigrationAccelerator.PowerShellModule.psm1","c:\datamigration\Datatable_Load.psm1")
After this change, starting PMA will execute the PowerShell script.
When the threads start, they will execute the module Datatable_Load.psm1 which will execute the above PowerShell to load the CSV into a global variable DOB_Metadata for the Custom Function to use.
14.2.1. Using the Imported Data
The PowerShell script executed by the Custom Function can now use the data as follows.
$EmployeeDatabase = $global:Employee_Metadata | ?{$_.EmployeeID –eq $EmployeeID}
Return $EmployeeDatabase.DOB
In the example above, a variable $EmployeeID is used to find a matching entry by looking up the .EmployeeID field in the Employee_Metadata global variable. The script then returns the .DOB field for that record.
Testing PowerShell Functions
It is recommended to test any changes made to the PowerShell module prior to launching PMA. The following steps can be executed to validate the changes.
Open PowerShell console and import the Proventeq PowerShell module
"C:\Program Files\Proventeq\Proventeq Content Suite \PowerShell\Proventeq.MigrationAccelerator.Module.psd1" -Force
In the example above you can see the EmployeeID variable defined as a string and below that the Script to execute. On the right is Test Data window where the value 0002 is passed to the function and it correctly returns in the Text View the DOB of 28/4/1995 based upon the values in the CSV EmployeeDatabase.CSV which is held in Employee_Metadata global variable.
Controlling how parameters are passed to the Custom Function
In the properties of the function, specify how the parameter value will be passed to the customer function. In the example below, the EmployeeID value comes from the Name metadata field.
14.2.2. BOX API Consumption
Below are details of how PMA consumes BOX API resources. This can give an indication of how much BOX API license consumption may occur during a migration. It’s essential to discuss BOX API limits with your BOX Account Manager.
- Discovery
1 API call per folder to retrieve a list of subfolders and files. 1 API call per file to discover the sharing permissions (if security discovery is enabled and the file has been shared with other users). 1 API call per file to discover the version history (if version discovery is enabled and the file has previous versions). A small number of calls per discovery run for setup (get user ID mappings, discover the initial folder / discovery filter entries, etc).
- Migration
1 API call per file version to download the file to the migration server.
- Misc
A small number of calls when adding Box as a source connection to populate the folder selector.
Note: Some of these API methods are paginated at 1000 items, so it takes 1 call per 1000 entries. For example, if a folder contains 2500 files, it would take 3 API calls to get the full list of items.
Comments