CSV Map Connector
The Map Connector uses ArcScript to transform XML files into arbitrary flat file formats.
Map Connectors have two modes of operation: the visual Designer and the scripting-focused Code View. The Designer supports visually establishing simple relationships between document types. More complicated translations require the Code View, which uses the ArcScript scripting language to create a mapping template between XML and a flat file structure.
To begin, upload a sample Source File and Destination File in the Settings tab. If the structure of both sample files allows for designer-driven mapping, the visual designer will be populated with the source and destination fields. The application will also attempt to populate the code view window with as much of the destination structure as it can.
This section contains all of the configurable connector properties.
Settings related to the core operation of the connector.
- Connector Id The static name of the connector. All connector-specific files are held in a folder by the same name within the Data Directory.
- Connector Description An optional field to provide free-form description of the connector and its role in the flow.
- Source File A file that represents the XML structure of input documents. Any files processed by the connector should have a matching XML structure. To support designer-driven mapping, this file must be a simple XML or CSV structure.
- Destination File A file that represents the structure of output documents. To use designer-driven mapping, this file must be a simple XML or CSV structure.
Defines the mapping relationship between input and output files. Designer view can map simple XML and CSV structures, and Code view can map more complicated XML structures. Please see Using the Designer and Using Code View for more details.
Settings related to the automatic processing of files by the connector.
- Send Whether messages arriving at the connector will automatically be processed.
Settings that determine the folder on disk that files will be processed from, and where they will be placed after processing.
- Input Folder (Send) The connector can process files placed in this folder. If Send Automation is enabled, the connector will automatically poll this location for files to process.
- Output Folder (Receive) After the connector finishes processing a file, the result will be placed in this folder. If the connector is connected to another connector in the flow, files will not remain here and will instead be passed along to the Input/Send folder for the connected connector.
- Processed Folder (Sent) After processing a file, the connector will place a copy of the processed file in this folder if Save to Sent Folder is enabled. This copy of the file will not be passed along to the next connector in the flow.
Settings related to the allocation of resources to the connector.
- Max Workers The maximum number of worker threads that will be consumed from the threadpool to process files on this connector. If set, overrides the default setting from the Profile tab.
- Max Files The maximum number of files that will be processed by the connector each time worker threads are assigned to the connector. If set, overrides the default setting from the Profile tab.
Settings not included in the previous categories.
- Send Filter A glob pattern filter that determines which files in the Send directory should be processed by the connector. A pattern preceded by a minus sign excludes matching files (for example, -*.tmp).
Multiple patterns can be specified, comma-delimited, with later filters taking priority.
- Local File Scheme A filemask for determining local file names as they are downloaded by the connector. The following macros may be used to reference contextual information:
%ConnectorId%, %Filename%, %FilenameNoExt%, %Ext%, %ShortDate%, %LongDate%, %RegexFilename:%, %DateFormat:%.
As an example: %FilenameNoExt%_%ShortDate%%Ext%
- Parent Connector If set to a connector of the same type, this connector will inherit all settings from the Parent Connector unless directly overridden in the existing connector configuration.
- Log Subfolder Scheme By default, logs for transactions processed by the connector will be stored in the Logs subfolder for the connector. For connectors that process many transactions, it may be desirable to further divide the logs based on the datetime they were generated. When this setting is set to Daily, logs generated on the same day will be grouped in a subfolder; when this setting is set to Weekly, logs generated in the same week will be grouped in a subfolder; and so on.
- Log Messages Whether the log entry for a processed file will include a copy of the file itself.
- Save to Sent Folder Whether files processed by the connector should be copied to the Sent folder for the connector.
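The Send Filter rules described above (comma-delimited glob patterns, a leading minus sign for exclusion, later patterns taking priority) can be illustrated with a small Python sketch. This is not the connector's implementation: the function name `passes_send_filter` is hypothetical, and the handling of files that match no pattern is an assumption that the documentation does not spell out.

```python
from fnmatch import fnmatchcase


def passes_send_filter(filename: str, filter_spec: str) -> bool:
    """Return True if `filename` would be processed under `filter_spec`.

    Assumptions (not confirmed by the connector documentation):
    - a file that matches no pattern is processed only when the filter
      contains no inclusion patterns at all;
    - matching is case-sensitive, shell-style globbing.
    """
    patterns = [p.strip() for p in filter_spec.split(",") if p.strip()]
    has_inclusion = any(not p.startswith("-") for p in patterns)
    decision = not has_inclusion  # default for files that match nothing
    for pattern in patterns:  # later patterns override earlier ones
        exclude = pattern.startswith("-")
        glob = pattern[1:] if exclude else pattern
        if fnmatchcase(filename, glob):
            decision = not exclude
    return decision


if __name__ == "__main__":
    for name in ("orders.xml", "draft1.xml", "notes.txt"):
        print(name, passes_send_filter(name, "*.xml,-draft*"))
```

Reversing the order of the patterns changes the result for draft1.xml, because the last matching pattern decides.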
Settings for specific use cases.
- Other Settings Allows configuration of hidden connector settings in a semicolon-separated list, such as setting1=value1;setting2=value2. Normal connector use cases and functionality should not require use of these settings.
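To make the Other Settings format concrete, the following Python sketch splits a semicolon-separated list of name=value pairs into a dictionary. The function name `parse_other_settings` is hypothetical, and the treatment of whitespace and empty entries is an assumption, not documented connector behavior.

```python
def parse_other_settings(raw: str) -> dict:
    """Parse 'name=value;name=value' into a dict of strings.

    Assumption: surrounding whitespace is ignored and empty entries
    (for example a trailing semicolon) are skipped.
    """
    settings = {}
    for pair in raw.split(";"):
        if not pair.strip():
            continue
        # Split on the first '=' only, so values may contain '='.
        name, _, value = pair.partition("=")
        settings[name.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    print(parse_other_settings("setting1=value1;setting2=value2"))
```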
Using the Designer
The visual designer is only available for CSV and simple XML files. Below is an example of a simple XML document as the Source File:
<actor table="actor">
  <actor_id key="true">124455</actor_id>
  <first_name>Bob</first_name>
  <last_name>Smith</last_name>
</actor>
This example Source File might be paired with the following CSV document as the Destination File:
id,first name,last name,time
124455,Bob,Smith,2017-07-18T17:38:53-04:00
When these files are configured as the Source and Destination files, the designer displays a row for each of the Source Fields in the Source File (actor_id, first_name, and last_name). For each field in the Source, select the appropriate Destination column from the dropdown menu. The application automatically attempts to match the Source and Destination fields based on the names of the columns. In the above example, the actor_id row would automatically be matched with the id column.
Using Code View
Code view provides the ability to generate the mapping template manually using ArcScript. This allows for much more granular control over the possible input and output formats. The code view defines how the destination file looks once rendered, with ArcScript elements dynamically filling in the template with values from the source document. ArcScript elements all start with an 'arc' prefix, for example: <arc:set>. Any content in the code view that is not ArcScript will be included as part of the output file.
Navigating XML Input
ArcScript supports navigating complicated XML structures to parse out values from the Source File. The xmlDOMSearch operation takes an xpath as input and loads the XML structure at the given path. This operation loops for each instance of the xpath found in the source document; to load the entire document and avoid looping, provide the root element of the XML source as input to xmlDOMSearch. For example:
<arc:call op="xmlDOMSearch?xpath=/root">
  <!-- Inside this operation call the parsed XML elements can be accessed -->
</arc:call>
Once the XML document is loaded, the xpath formatter supports reading out values from the XML at the specified xpath. This formatter accepts absolute xpaths as well as xpaths relative to the path loaded by xmlDOMSearch. The xpath formatter, like all formatters, can only be used within square brackets. For example:
<arc:call op="xmlDOMSearch?xpath=/root/author">
[xpath('name')]
</arc:call>
Take the following simple XML as an example input to the above script:
<root>
  <author>
    <name>Stephen King</name>
  </author>
  <author>
    <name>Kurt Vonnegut</name>
  </author>
</root>
In this example, the xmlDOMSearch operation will loop over each ‘author’ element within the ‘root.’ Within each ‘author’ loop, the value from the ‘name’ element is read as content in the output file. The output file in this case would look like this:
Stephen King
Kurt Vonnegut
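For readers more familiar with general-purpose languages, the loop-and-read behavior described above can be approximated in Python. This is only an analogy and not how ArcScript is implemented: `findall("author")` stands in for xmlDOMSearch looping over /root/author, and `findtext("name")` stands in for the relative [xpath('name')] lookup. The function name `author_names` is hypothetical.

```python
import xml.etree.ElementTree as ET

SOURCE = """<root>
  <author><name>Stephen King</name></author>
  <author><name>Kurt Vonnegut</name></author>
</root>"""


def author_names(xml_text: str) -> list:
    """Collect the <name> text of every <author> under the root element."""
    root = ET.fromstring(xml_text)
    # One iteration per <author>, mirroring one pass through the arc:call body.
    return [author.findtext("name") for author in root.findall("author")]


if __name__ == "__main__":
    # Each name is written on its own line, like the rendered template.
    print("\n".join(author_names(SOURCE)))
```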
Templating Output Data
The code view serves as a template of the output file, and ArcScript fills in the values to the defined template. As a simple example, if data should be formatted as comma-separated values, these commas can be included as part of the code view to provide structure to the ArcScript functions. CSV files can therefore be generated from more complicated XML structures than the Designer mode would allow.
Take for example the following XML, which has nested elements that prevent the use of Designer mode:
<actor table="actor">
  <actor_id key="true">12445</actor_id>
  <name>
    <first_name>Viggo</first_name>
    <last_name>Mortensen</last_name>
  </name>
  <actor_id key="true">12522</actor_id>
  <name>
    <first_name>Gal</first_name>
    <last_name>Gadot</last_name>
  </name>
</actor>
The nested elements can easily be retrieved by passing the appropriate paths to ArcScript’s xpath formatter. These formatters can be placed within commas and after header names to provide the desired CSV structure, and an additional csvescape formatter ensures that the values are properly quoted:
id,first name,last name
<arc:call op="xmlDOMSearch?xpath=actor">
[xpath('actor_id') | csvescape],[xpath('name/first_name') | csvescape],[xpath('name/last_name') | csvescape]
</arc:call>
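To show why escaping matters when values are placed between literal commas, the sketch below uses Python's standard csv module. It assumes that csvescape follows the usual CSV quoting conventions (quote a value that contains a comma or a double quote, and double any embedded quotes); the exact rules of the formatter are not stated here, and the helper name `csv_row` is hypothetical.

```python
import csv
import io


def csv_row(values) -> str:
    """Render one CSV row, quoting only the values that need it."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(values)
    # Drop the line terminator that the writer appends to each row.
    return buffer.getvalue().rstrip("\r\n")


if __name__ == "__main__":
    print(csv_row(["12445", "Viggo", "Mortensen"]))
    # A comma or quote inside a value would otherwise shift the columns.
    print(csv_row(["1", "Smith, Jr.", 'Bob "Bobby"']))
```

Without quoting, the value Smith, Jr. would be read back as two separate columns.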
A wide range of text files can be generated with this mix of constant templating data and ArcScript calls.
Additional Scripting Features
Since full ArcScript is available within code view, it may be desirable to leverage ArcScript Operations within the template. For example, if the source file only included the Id of an item within the database, but the actual name of the item is needed, the dbQuery operation could be used to look up the name for the corresponding Id.
ArcScript also supports conditional logic within a mapping template. The arc:if keyword is one of many keywords available to assist in performing conditional logic within templates. For example, if the source file contained information about customers within QuickBooks, it may be desirable to perform different business logic for customers with an outstanding balance as opposed to customers that have paid in full. A simple example of this use case might look like the following:
id,first name,last name,paid in full
<arc:call op="xmlDOMSearch?xpath=Customer">
<!-- Reset the flag for each customer, then clear it if a balance is outstanding -->
<arc:set item="Customer" attr="paidinfull" value="true" />
<arc:if exp="[xpath('balance')] > 0">
  <arc:set item="Customer" attr="paidinfull" value="false" />
</arc:if>
[xpath('customer_id') | csvescape],[xpath('first_name') | csvescape],[xpath('last_name') | csvescape],[Customer.paidinfull]
</arc:call>
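The intended per-customer logic can also be expressed as a short Python sketch, which may help when checking what a template like the one above should produce. The sample XML, its `Customers` wrapper element, and the function name `customers_to_csv` are hypothetical. The sketch mirrors the business rule only (a balance greater than zero means not paid in full) and is not a description of how ArcScript evaluates the template.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample input; the element names follow the template above.
SOURCE = """<Customers>
  <Customer>
    <customer_id>1</customer_id>
    <first_name>Ada</first_name>
    <last_name>Lovelace</last_name>
    <balance>0</balance>
  </Customer>
  <Customer>
    <customer_id>2</customer_id>
    <first_name>Grace</first_name>
    <last_name>Hopper</last_name>
    <balance>125.50</balance>
  </Customer>
</Customers>"""


def customers_to_csv(xml_text: str) -> str:
    """Write one CSV row per <Customer>, flagging outstanding balances."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "first name", "last name", "paid in full"])
    for customer in root.findall("Customer"):
        # A missing <balance> element is treated as zero (an assumption).
        balance = float(customer.findtext("balance", default="0"))
        paid_in_full = "false" if balance > 0 else "true"
        writer.writerow([
            customer.findtext("customer_id"),
            customer.findtext("first_name"),
            customer.findtext("last_name"),
            paid_in_full,
        ])
    return out.getvalue()


if __name__ == "__main__":
    print(customers_to_csv(SOURCE), end="")
```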