13.2. Discovery Filters
Note: CSV based filters are only supported when running in non-SaaS mode.
13.2.1. Discovery Filter for ShareFile
A filter can be applied to the discovery. Using a filter can speed up the discovery process, reduce the size of the databases created by discovery, and ensure that the database does not contain any items that do not have to be migrated.
Filters allow a list of folder locations to be specified in a CSV file, so that only these folders (and their child folders) are captured in the discovery process. Anything not discovered will not be processed for migration.
Discovery Filter Syntax:
[{"ObjectType":"CSV","Filter":"<FilePath>"}]
<FilePath> - File path of the CSV containing the hierarchy path and app URI. Each backslash in the file path must be doubled (‘\\’).
E.g.
[{"ObjectType":"CSV","Filter":"F:\\Proventeq\\Configuration\\Product\\ShareFile\\Discovery Filter.csv"}]
Example Discovery Filter for ShareFile
The AppURI can be found on the details screen of the selected folder in the ShareFile administration portal.
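The doubled backslashes in the filter are ordinary JSON string escaping, so the filter string can be generated rather than typed by hand. The following is a minimal sketch, assuming the filter is prepared with a Python script; the function name build_csv_filter is hypothetical and not part of the product.

```python
import json


def build_csv_filter(csv_path: str) -> str:
    """Return a discovery filter string that points at a CSV file."""
    # json.dumps escapes every backslash in the path as "\\", which is the
    # doubled form the filter syntax requires.
    return json.dumps(
        [{"ObjectType": "CSV", "Filter": csv_path}],
        separators=(",", ":"),
    )


# The raw string below is written with single backslashes.
print(build_csv_filter(
    r"F:\Proventeq\Configuration\Product\ShareFile\Discovery Filter.csv"
))
```

The printed value can be compared against the example filter shown above before it is pasted into the discovery configuration.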
13.2.2. Discovery Filter for FileShare
A filter can be applied to the discovery. Using a filter can speed up the discovery process, reduce the size of the databases created by discovery, and ensure that the database does not contain any items that do not have to be migrated. Anything not discovered will not be processed for migration.
The following filters are available to limit the discovery:
- Capture only items in folders listed in a specified CSV file
- Capture items created before a specified date
- Capture items created after a specified date
- Capture only items with specified file extensions
- Capture all items except those with specified file extensions
Syntax:
[{"FilterType":"<FilterType>","ObjectType":"<File or Folder>","Filter":"<Filter>"}]
Supported Filter Types
FilterType | Description | Object Type Supported |
CSVInclude | Discovers the folders listed in the CSV | Folder |
DateBefore | Discovers files created before the given date | File |
DateAfter | Discovers files created after the given date | File |
ExtensionInclude | Discovers only files whose extension is listed in the CSV | File |
ExtensionExclude | Discovers all files except those whose extension is listed in the CSV | File |
<ObjectType> - Folder or File depending upon FilterType.
<Filter> - The path to a CSV file or a date, depending upon FilterType.
Example Limiting File Share discovery using CSVInclude
- [{"FilterType":"CSVInclude","ObjectType":"Folder","Filter":"F:\\Proventeq\\Configuration\\Product\\ShareFile\\Discovery Filter.csv"}]
Note: Each backslash in a file path must be doubled (‘\\’), e.g. C:\\Proventeq\\MimeType.csv.
Example format for CSVInclude file.
Example Limiting File Share discovery using DateBefore
[{"FilterType":"DateBefore","ObjectType":"File","Filter":"7/5/2021"}]
For date filters, the date specified must be in the correct format as per the regional settings. This filter applies to the created date of the file.
Note: The date filter is not supported in delta discovery or processing.
Limiting Discovery using File Extensions - Using ExtensionInclude or ExtensionExclude
If the filter is an ExtensionInclude or ExtensionExclude list, a CSV file of relevant file extensions needs to be created. This is a simple list with each extension on a new line with the header “Type” e.g.
Type
.pdf
.txt
Extensions must begin with “.”
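The extension list described above is simple enough to generate with a script. The following is a minimal sketch, assuming Python is available; the function name extension_csv is hypothetical, and the check only enforces the rule stated above that every extension begins with ".".

```python
import csv
import io


def extension_csv(extensions):
    """Return CSV text with a 'Type' header and one extension per line."""
    for ext in extensions:
        if not ext.startswith("."):
            raise ValueError(f"extension must begin with '.': {ext!r}")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Type"])          # mandatory header
    for ext in extensions:
        writer.writerow([ext])         # one extension per row
    return buffer.getvalue()


print(extension_csv([".pdf", ".txt"]))
```

The returned text can be saved to the file referenced by the ExtensionInclude or ExtensionExclude filter.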
Using multiple filter criteria to control Discovery
Several filter criteria can be used together if required. The criteria act as an ‘AND’, so items must meet all criteria to be discovered.
[{"FilterType":"ExtensionInclude","ObjectType":"File","Filter":"C:\\Proventeq\\MimeType.csv"},{"FilterType":"DateAfter","ObjectType":"File","Filter":"7/5/2021"}]
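The ‘AND’ behaviour of combined criteria can be pictured as follows. This is an illustrative sketch only, not the product's implementation: the function name, the strict date comparison and the sample dates are all assumptions made for the example.

```python
from datetime import date


def meets_all_criteria(extension, created, included_extensions, created_after):
    """Return True only when every criterion is satisfied (logical AND)."""
    extension_ok = extension in included_extensions  # ExtensionInclude
    date_ok = created > created_after                # DateAfter (strictness assumed)
    return extension_ok and date_ok


included = {".pdf", ".txt"}
cutoff = date(2021, 1, 1)  # illustrative cut-off date

print(meets_all_criteria(".pdf", date(2021, 6, 1), included, cutoff))   # True
print(meets_all_criteria(".pdf", date(2020, 6, 1), included, cutoff))   # False
print(meets_all_criteria(".docx", date(2021, 6, 1), included, cutoff))  # False
```

An item that satisfies only one of the two criteria is left out, which is why adding criteria can only narrow the discovery.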
13.2.3. Discovery Filter for Documentum
Use the discovery filter options below to filter cabinets, documents or folders while performing discovery. Three types of discovery filter are supported:
- REST Service based Filter
- DQL Query based Filter
- CSV based Filter
Note: The Documentum DB connection type only supports the CSV based filter. The Documentum API connection type only supports the REST Service and DQL Query based filters.
REST Service Filter
The REST service filter can be used to discover folders or documents by iterating through the given cabinet and folder hierarchy. The filtered documents or folders are therefore discovered along with their cabinet or folder structure.
Filter Format: [{"FilterType":"REST","ObjectType":<ObjectType>,"Filter":"<Filter Criteria>"}]
Object Types:
The following Object Types are supported (Use numeric values):
- RootFolder = 2
- SubFolder = 3
- Item = 4
Filter:
The following functions are supported in the filter
- contains
- starts-with
- between
Rest Filter Examples:
Filter Criteria template for RootFolder with Contains function:
[{"FilterType":"REST","ObjectType":2,"Filter":"contains(object_name, 'Di') or contains(object_name, 'A')"}]
By using this filter expression, the request only returns root folders whose object_name contains Di or A.
Filter Criteria template for Items with starts-with function:
The starts-with function checks the starting string of a property.
[{"FilterType":"REST","ObjectType":4,"Filter":"starts-with(object_name, 'D')"}]
By using this filter expression, the request only returns Items whose object_name starts with D.
Filter Criteria template for Items with between function:
The between function returns items between a specified range. Check the format of the date provided in the filter.
[{"FilterType":"REST","ObjectType":4,"Filter":"between(r_modify_date, date('<date1>'), date('<date2>'))"}]
e.g:
[{"FilterType":"REST","ObjectType":4,"Filter":"between(r_modify_date, date('2014-03-02'), date('2017-09-10'))"}]
By using this filter expression, the request only returns objects whose r_modify_date is between 2014-03-02 and 2017-09-10.
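The REST filter examples above all share one shape, so they can be assembled from their parts. The following is a minimal sketch, assuming the filter string is prepared with Python; the helper names rest_filter and between_dates are hypothetical and not part of the product.

```python
import json

# Numeric object types listed above.
ROOT_FOLDER, SUB_FOLDER, ITEM = 2, 3, 4


def between_dates(prop: str, start: str, end: str) -> str:
    """Build a between(...) criterion for a date property."""
    return f"between({prop}, date('{start}'), date('{end}'))"


def rest_filter(object_type: int, criteria: str) -> str:
    """Wrap a filter criterion in the REST discovery filter format."""
    return json.dumps(
        [{"FilterType": "REST", "ObjectType": object_type, "Filter": criteria}],
        separators=(",", ":"),
    )


print(rest_filter(ITEM, between_dates("r_modify_date", "2014-03-02", "2017-09-10")))
```

Passing the object type as an integer keeps it unquoted in the output, matching the numeric values used in the examples above.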
DQL Query Filter
The DQL filter can only be used if connecting to Documentum API.
The DQL filter can be used to filter documents across the repository, or within a particular cabinet or folder, in a flat structure, i.e. without iterating through the cabinet and folder structure. The discovery result therefore contains only the filtered documents (similar to a Documentum search), without any cabinets or folders and their hierarchy. The DQL filter is not supported at the level of virtual source containers such as public or private cabinets.
Filter Format: [{"FilterType":"DQL","ObjectType":"<ObjectType>","Filter":"<Condition1> AND (<Condition2> OR <Condition3>)"}]
Here the FilterType is DQL, the ObjectType is the content or object type in Documentum, and the Filter is a suitable DQL condition that selects the specific set of documents to discover.
Filter Criteria template for Items within a specific Public/User Cabinet:
This filter returns all the files under the given public/user cabinet ID.
[{"FilterType":"DQL","ObjectType":"dm_document","Filter":"i_cabinet_id = '<Cabinet_ID>'"}]
Filter Criteria template for a specific document and its versions:
This filter returns the given document with all of its versions.
[{"FilterType":"DQL","ObjectType":"dm_document","Filter":"i_chronicle_id ='<Chronicle_ID>'"}]
Filter Criteria template for Items between dates:
These filters return all the files whose creation date or modified date, respectively, falls between Date1 and Date2.
[{"FilterType":"DQL","ObjectType":"dm_document","Filter":"<Creation Date> > date('<date1>') and <Creation Date> < date('<date2>')"}]
[{"FilterType":"DQL","ObjectType":"dm_document","Filter":"<Modified Date> > date('<date1>') and <Modified Date> < date('<date2>')"}]
Filter Criteria template for a specific document name:
This filter returns the files with the specified file name.
[{"FilterType":"DQL","ObjectType":"dm_document","Filter":"object_name ='<File Name>'"}]
Filter Criteria template for Items containing repeating attribute values:
This filter returns the files where a repeating attribute, such as tags, has the value <Tag1> or <Tag2>.
[{"FilterType":"DQL","ObjectType":"dm_document","Filter":"any <Repeating_Attribute> IN ('<Tag1>','<Tag2>') "}]
Eg:
[{"FilterType":"DQL","ObjectType":"dm_document","Filter":"any keywords IN ('KeywordA','KeywordB') "}]
Filter Criteria template for Items containing given name:
These filters return the files, and their versions, where <attribute> starts with, ends with, or contains <FileName>, depending on the position of the percent (%) symbol.
- Returns files where <attribute> starts with the given name.
[{"FilterType":"DQL","ObjectType":"dm_document","Filter":"<attribute> like '<FileName>%' "}]
- Returns files where <attribute> ends with the given name.
[{"FilterType":"DQL","ObjectType":"dm_document","Filter":"<attribute> like '%<FileName>'"}]
- Returns files where <attribute> contains the given name.
[{"FilterType":"DQL","ObjectType":"dm_document","Filter":"<attribute> like '%<FileName>%'"}]
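The three LIKE variants above differ only in where the % symbol is placed. The following is a minimal sketch, assuming the filter is prepared with Python; the function name dql_like_filter and the mode names "starts", "ends" and "contains" are hypothetical.

```python
import json


def dql_like_filter(attribute: str, text: str, mode: str) -> str:
    """Build a DQL filter matching text at the start, end, or anywhere."""
    patterns = {
        "starts": f"{text}%",      # value begins with text
        "ends": f"%{text}",        # value ends with text
        "contains": f"%{text}%",   # value contains text
    }
    # Note: text containing a single quote is not escaped in this sketch.
    condition = f"{attribute} like '{patterns[mode]}'"
    return json.dumps(
        [{"FilterType": "DQL", "ObjectType": "dm_document", "Filter": condition}],
        separators=(",", ":"),
    )


print(dql_like_filter("object_name", "Invoice", "starts"))
print(dql_like_filter("object_name", "Invoice", "ends"))
print(dql_like_filter("object_name", "Invoice", "contains"))
```

The attribute object_name and the value Invoice are sample inputs only; any attribute valid for the chosen object type could be used.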
CSV Based Filter
A CSV based filter can be used to restrict the discovery based upon Documentum Cabinet ID for DocumentumDB connections. It will return all folders and items contained within the specified cabinets.
Filter Format: [{"FilterType":"CSV","ObjectType":"dm_cabinet","Filter":"<FilterCriteria>"}]
FilterCriteria: One or more cabinet IDs (i_cabinet_id), separated by commas.
Example: Filter Criteria template for two cabinets:
[{"FilterType":"CSV","ObjectType":"dm_cabinet","Filter":"0c0007c580009548,0c0003f270008498"}]
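When the list of cabinets is long, the comma-separated value can be produced from a list of IDs. The following is a minimal sketch, assuming Python; the function name cabinet_filter is hypothetical.

```python
import json


def cabinet_filter(cabinet_ids):
    """Build a CSV based filter from a list of i_cabinet_id values."""
    if not cabinet_ids:
        raise ValueError("at least one cabinet ID is required")
    return json.dumps(
        [{
            "FilterType": "CSV",
            "ObjectType": "dm_cabinet",
            "Filter": ",".join(cabinet_ids),  # IDs separated by commas
        }],
        separators=(",", ":"),
    )


print(cabinet_filter(["0c0007c580009548", "0c0003f270008498"]))
```

With the two IDs from the example above, the output is the same filter string.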
Flatten discovery using CSV based Filter
Flattened discovery using a CSVFile based filter discovers the documents underneath the folders listed in a CSV file. Each folder is identified by its r_object_id and, optionally, its parent ID.
Filter Format: [{"FilterType":"CSVFile","ObjectType":<ObjectType>,"FlatDiscovery":<true or false>,"Filter":"<Path of the file>"}]
Filter: The path of the CSV file containing the r_object_id of each folder and, optionally, its parent ID, separated by a comma.
Examples:
- Filter Criteria template:
[{"FilterType":"CSVFile","ObjectType":4,"FlatDiscovery":true,"Filter":"C:\\Documents\\Flat_discovery_build\\flat_discovery -modified.csv"}]
The r_object_id should be provided in column A and its parent folder ID in column B. Note that each backslash (\) must be doubled in file paths.
13.2.4. Discovery Filter for OpenText
Filters can be applied for OpenText systems using Oracle or Microsoft SQL databases. Filtering can be applied to folders and documents.
The following tables and fields can be used within filters.
Table Alias | Field |
A for DTreeCore | DATAID, NAME, USERID, CreatedBy, ModifiedBy, CreateDate, ModifyDate, SubType, VersionNum |
V for DVersData | VersionID, DocID, Version, VersionName, Owner, VerCDate, VerMDate, FILENAME, FileType, MimeType |
K for KUAF | CreatedBy |
K2 for KUAF | ModifiedBy |
VK for KUAF | Owner |
Below are the Sub Types available in OpenText for different entities which can be used in filters:
- 0 - Folder
- 144 - Documents
- 749 - Email
- 751 - Email Folder
- 136 - Compound Document
- 298 - Collection
Examples:
Filter based on Name from DTREECORE Table:
To discover files where the A.Name property begins with specified value.
Product | Example Query |
SQL | (A.Name like '<Name>%' AND A.SubType in (144)) OR (A.SubType in (0,751,298,136)) |
Oracle | (A.Name like '<Name>%' AND A.SubType in (144)) OR (A.SubType in (0,751,298,136)) |
Or, to apply the name filter to all of the listed sub types:
(A.Name like '<Name>%' AND A.SubType in (0,144,751,298,136))
As in standard SQL query syntax, use % as a wildcard. For example, use %<Name> to find records that end with the specified value, or %<Name>% to find records that contain the specified value.
Filter criteria based on CreateDate field from DTREECore Table:
To discover files with a created date later than the specified date. Note: The date should be in ISO format.
Product | Example Query |
SQL | (A.CreateDate > '2021-01-01' AND A.SubType = 144) OR (A.SubType in (0,751,298,136)) |
Oracle | (A.SubType in (0, 144) AND A.CreateDate > TO_DATE('2021-01-01', 'yyyy-mm-dd')) OR (A.SubType in (0,751,298,136)) |
Filter criteria based on VerMDate field from DVersData Table:
To discover versions of documents with a modified date later than the specified date.
Product | Example Query |
SQL | (V.VerMDate>'2021-05-06' AND A.SubType in (144)) OR (A.SubType in (0)) |
13.2.5. Discovery Filter for FileNet P8
The filter capabilities depend upon whether the FileNet P8 connection is configured to use API or Database.
API Filter
Filter Format:
[{"ObjectType":"2","Filter":"<Filter Criteria>"}]
The ObjectType determines the type of objects that the filter applies to. The only supported value is “2”, which represents files.
Examples:
• Filter criteria based on Document Title:
[{"ObjectType":"2","Filter":"(d.DocumentTitle like '<DocumentTitle>%')"}]
• Filter criteria based on File Creator:
[{"ObjectType":"2","Filter":"(d.Creator like '<FileCreatorName>%')"}]
As in standard SQL query syntax, use % as a wildcard. For example, use %<FileCreatorName> to find records that end with the specified value, or %<FileCreatorName>% to find records that contain the specified value.
• Filter criteria based on Date Modified:
To discover the files whose last modified date is greater than or equal to the given <DateModified>. The date should be in ISO format.
[{"ObjectType":"2","Filter":"d.DateLastModified >= <DateModified>"}]
• Filter criteria based on Class or Content Type:
To discover files based on Class or Content Type.
[{"ObjectType":"2","Filter":"(IsOfClass(d, <ContentType>))"}]
Example: to discover items with a Content Type/Class of “Communication” or “Policy”:
E.g. [{"ObjectType":"2","Filter":"(IsOfClass(d, Communication) OR IsOfClass(d, Policy))"}]
• Filter criteria based on Metadata:
To discover the files based on metadata properties.
[{"ObjectType":"2","Filter":"(<PropertyName>='<PropertyValue>')"}]
E.g. [{"ObjectType":"2","Filter":"(BankName='Indian Bank' OR BankName='Overseas Bank')"}]
Database Filter
Note: The Database filter requires the table alias and the database column/field name. For example, to filter on the DocumentTitle column, use the table alias “d” and the column name “u1708_documenttitle”.
Filter Format: [{"ObjectType":<ObjectType>,"Filter":"<Filter Criteria>"}]
Table Alias | Fields |
D for DocVersion | object_id, mime_type, version_series_id, major_version_number, minor_version_number, create_date, creator, modify_date, modify_user, compound_document_state, u1708_documenttitle and custom property fields |
CR for Current DocVersion | object_id, mime_type, version_series_id, major_version_number, minor_version_number, create_date, creator, modify_date, modify_user, compound_document_state, u1708_documenttitle and custom property fields |
CD for ClassDefinition | symbolic_name |
D queries against all versions; CR queries against the latest version of documents only.
Object Type:
The following Object Type is supported (Use numeric value):
• File = 2
Examples:
• Filter criteria based on Document Title:
To discover the files whose title starts with the given <DocumentTitle>.
[{"ObjectType":"2","Filter":"(D.u1708_documenttitle like '<DocumentTitle>%')","Uri":"","IsRecursive":true}]
As in standard SQL query syntax, use % as a wildcard. For example, use %<DocumentTitle> to find records that end with the specified value, or %<DocumentTitle>% to find records that contain the specified value.
• Filter criteria based on File Modifier:
To discover the files whose modifier name starts with the given <FilemodifierName>.
[{"ObjectType":"2","Filter":"(D.modify_user like '<FilemodifierName>%')","Uri":"","IsRecursive":true}]
• Filter criteria based on Date Last Modified:
To discover documents where the latest version has a modified date later than the specified date.
[{"ObjectType":"2","Filter":"CR.modify_date>'<ModifiedDate>'","Uri":"","IsRecursive":true}]
The date format depends on the FileNet datetime format.
E.g. [{"ObjectType":"2","Filter":"CR.modify_date>'2020-06-01'","Uri":"","IsRecursive":true}]
• Filter criteria based on Class or Content Type:
To discover the files based on Class or Content Type.
[{"ObjectType":"2","Filter":"(CD.symbolic_name='<ContentType>')","Uri":"","IsRecursive":true}]
E.g. [{"ObjectType":"2","Filter":"(CD.symbolic_name='Communication' OR CD.symbolic_name='Policy')","Uri":"","IsRecursive":true}]
• Filter criteria based on Metadata:
To discover the files based on metadata properties.
[{"ObjectType":"2","Filter":"(D.<PropertyName> ='<PropertyValue>')", "Uri":"", "IsRecursive":true}]
E.g. [{"ObjectType":"2","Filter":"(D.u06a8_bankname='Indian Bank' OR D.u06a8_bankname='Overseas Bank')","Uri":"","IsRecursive":true}]
13.2.6. Discovery Filter for iManage
Filters can be used to limit discovery to workspaces, containers (folders) and documents by allowing queries against the RootContainer, Container and Item tables respectively.
Filter Type | Description | Filter Query Syntax |
Root Container (Workspace) | To filter a particular workspace | {"RootContainer" : "Root.PRJ_NAME ='<Workspace Name>'"} |
Root Container (Workspace) | To filter multiple workspaces | {"RootContainer" : "Root.PRJ_NAME in ('<Workspace Name>', '<Workspace Name2>')"} |
Container | To filter a specific folder | {"RootContainer" : "Root.PRJ_NAME ='<Workspace Name>'","Container" : "(Container.ItemName = '<Folder Name>' AND Container.EntityType ='Folder')"} |
Item | To filter a specific item inside a specific folder | {"RootContainer" : "Root.PRJ_NAME ='<Workspace Name>'","Container" : "(Container.ItemName = '<Folder Name>' AND Container.EntityType ='Folder')","Item" : "Item.DoNum = '<Document Number>'"} |
Item | To filter items by date created | {"Item" : "Item.EntryWhen > '<Date>'"} |
Item | To filter items by date modified | {"Item" : "Item.EditWhen > '<Date>'"} |
The <Workspace Name>, <Folder Name>, <Document Number> and <Date> placeholders should be replaced with the respective values. Standard SQL operators can be used in these filters.
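The iManage filter is a single JSON object whose keys select the table being queried. The following is a minimal sketch, assuming the filter is prepared with Python; the function name imanage_filter and its parameters are hypothetical, and values containing a single quote are not escaped.

```python
import json


def imanage_filter(workspace=None, folder=None, doc_number=None):
    """Build an iManage filter from optional workspace, folder and document."""
    parts = {}
    if workspace:
        parts["RootContainer"] = f"Root.PRJ_NAME ='{workspace}'"
    if folder:
        parts["Container"] = (
            f"(Container.ItemName = '{folder}' AND Container.EntityType ='Folder')"
        )
    if doc_number:
        parts["Item"] = f"Item.DoNum = '{doc_number}'"
    return json.dumps(parts)


# Sample values only; replace with a real workspace and folder name.
print(imanage_filter(workspace="Acme", folder="Contracts"))
```

Only the keys for the supplied arguments are emitted, so the same helper covers the workspace, folder and item rows of the table above.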
13.2.7. Discovery Filter for M-Files
Filters can be used to limit discovery to specific classes.
Filter Criteria template for classes:
Filter Syntax:
[{
"FilterType":"ClassList",
"Filter":"C:\\mfilesclasslist\\discoveryfilter.csv"
}]
Example File: discoveryfilter.csv. Note: ClassName is a mandatory column heading.
ClassName
Contracts
Filter Criteria template for M-Files document IDs:
[{"FilterType":"FileList","Filter":"C:\\MigrationData\\discoveryfilter.csv"}]
Example File: discoveryfilter.csv. Note: FileId is a mandatory column heading.
FileId,Type
1,
2,
44444,
13.2.8. Discovery Filter for Alfresco
Filters can be used to limit discovery to specific sites.
Filter Criteria template for sites:
Filter Syntax:
[{"ObjectType":"Site","Filter":"<SiteName>"}]
Example - Discover all folders and documents within the “Insurance” site.
[{"ObjectType":"Site","Filter":"Insurance"}]
Note: To filter multiple sites a comma separated list can be provided. E.g.
[{"ObjectType":"Site","Filter":"Insurance,HR"}]
13.2.9. Discovery Filter for Oracle UCM
Filtering is only supported for items not contained within folders, i.e. items with no parent folder. Filtering on Contribution Folders and Framework Folders is not supported.
Filter Syntax:
dDocName='<DocumentName>'
dDocType='<DocumentType>'
Examples
dDocName = 'PROV_PROV_090007C280000583'
dDocType = 'Document'
13.2.10. Discovery Filter for Hummingbird/EDocs
You can filter documents and folders based on a variety of criteria such as Matter, SubMatter, DOCNUMBER, etc. Use the table aliases below in the Discovery Filter.
DP : Profile Table (DOCSADM.Profile)
DV : Versions Table (DOCSADM.Versions)
The filter must contain a SQL filter clause such as those below:
Dynamic Filter | Description |
DP.E_DIVPROJ = '<ID>' | Returns the folders and documents for the ID provided in the Discovery filter. |
DP.CREATION_DATE BETWEEN '<Date1>' AND '<Date2>' | Returns the documents created between Date1 and Date2. |
DP.CREATION_DATE > '<Date>' | Returns the documents created after the given date. |
DV.DOCNUMBER in ('DOCNUM1', 'DOCNUM2') | Returns the folders and documents whose document numbers are provided in the Discovery filter. |
13.2.11. Discovery Filter for BOX
A filter can be applied to the discovery. Using a filter can speed up the discovery process, reduce the size of the databases created by discovery, and ensure that the database does not contain any items that do not have to be migrated. Anything not discovered will not be processed for migration.
Discovery Filter Syntax:
It is required to provide a double backward slash ‘\\’ in any file paths (e.g. C:\\Proventeq\\.csv).
Inclusion Filter - Include only specified users/folders:
[{"ObjectType":"CSV","Filter":"path\\to\\infil.csv"}]
The inclusion filter requires a URL of a folder to include in the discovery (FolderURL), and the username or login email of the owner (UserName, this is required due to how Box handles permissions and service accounts).
A user's content root (the "All Files" folder) can be included by providing the user content root URL (e.g. https://app.box.com/folder/0 ) with the user's info, as in the example below.
FolderURL,UserName
https://app.box.com/folder/158718117193,Corp Admin
https://app.box.com/folder/158718322420,dev@acmecorp.com
https://app.box.com/folder/0,other_user@acmecorp.com
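The inclusion CSV can be generated from a list of folder and owner pairs. The following is a minimal sketch, assuming Python; the function name inclusion_csv is hypothetical, and the URL check is an assumption based only on the shape of the example URLs above.

```python
import csv
import io
import re

# Assumed URL shape, taken from the examples above.
FOLDER_URL = re.compile(r"^https://app\.box\.com/folder/\d+$")


def inclusion_csv(rows):
    """Return inclusion filter CSV text from (folder_url, user_name) pairs."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["FolderURL", "UserName"])
    for folder_url, user_name in rows:
        if not FOLDER_URL.match(folder_url):
            raise ValueError(f"unexpected folder URL: {folder_url!r}")
        if not user_name:
            raise ValueError(f"owner is required for {folder_url!r}")
        writer.writerow([folder_url, user_name])
    return buffer.getvalue()


print(inclusion_csv([
    ("https://app.box.com/folder/158718117193", "Corp Admin"),
    ("https://app.box.com/folder/0", "other_user@acmecorp.com"),
]))
```

Rejecting rows without an owner reflects the requirement above that the inclusion filter needs both the folder URL and the user.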
Exclusion Filter - Exclude only specified users/folders:
[{"ObjectType":"CSV","FilterType":"CSVExclude","Filter":"path\\to\\exfil.csv"}]
The exclusion filter only needs the URLs of the folders to exclude. No user information needs to be provided. The exclusion filter does not currently support excluding user root content folders.
FolderURL
https://app.box.com/folder/159258028738
https://app.box.com/folder/159259572184