S3 Connector

Version 21.0.7884


The S3 Connector integrates with Amazon’s S3 (Simple Storage Service) and other S3-like services (Google Storage, Wasabi, etc).

Overview

Each S3 Connector can automatically upload to and download from a single configured S3 Bucket. An Amazon Account (or Google Storage account, Wasabi account, etc) with the appropriate credentials is required. Upload and download paths can be specified within the Bucket, and files can be filtered by filename before download.

Connector Configuration

This section contains all of the configurable connector properties.

Settings Tab

Host Configuration

Settings related to the remote connection target.

  • Connector Id The static name of the connector. All connector-specific files are held in a folder by the same name within the Data Directory.
  • Connector Description An optional field to provide free-form description of the connector and its role in the flow.
  • Bucket Name The S3 Bucket that should be polled or uploaded to.

Amazon Account Settings

Settings related to the Amazon Account with permission to access the configured Bucket Name.

  • Access Key The Access Key account credential acquired from Amazon (or Google, Wasabi, etc).
  • Secret Key The Secret Key account credential acquired from Amazon (or Google, Wasabi, etc).
  • Region The Region where the specified Bucket Name is stored.

SSL Settings

Settings related to the SSL negotiation with the S3 server.

  • Use SSL when connecting with Amazon servers Whether SSL negotiation is enabled.
  • Server Public Certificate The public key certificate to trust when connecting to the S3 server. Can be set to Any Certificate to implicitly trust the server.

Upload

Settings related to the path within the specified bucket where files will be uploaded.

  • Remote Path The path within the Bucket where files will be uploaded.
  • Overwrite remote files Whether files that already exist in the specified Bucket should be overwritten during upload.

Download

Settings related to the path within the specified bucket from which files will be downloaded.

  • Remote Path The path within the Bucket from which files will be downloaded. Multiple paths can be specified in a comma-delimited list.
  • File Filter A glob pattern that determines which files within the Remote Path should be downloaded. Multiple patterns may be specified in a comma-delimited list.

Automation Tab

Automation Settings

Settings related to the automatic processing of files by the connector.

  • Upload Whether files arriving at the connector will automatically be uploaded.
  • Retry Interval The amount of time before a failed upload is retried.
  • Retry Maximum Attempts The maximum number of times a failed upload will be retried.
  • Download Whether the connector should automatically poll the remote download path for files to download.
  • Download Interval The interval between automatic download attempts.
  • Minutes The number of minutes to wait before downloading. Only applicable when Download Interval is set to Minute.
  • Minutes Past the Hour The minutes offset for an hourly schedule. Only applicable when Download Interval is set to Hourly. For example, if this value is set to 5, the automation service will download at 1:05, 2:05, 3:05, etc.
  • Time The time within a given day that the download should occur. Only applicable when Download Interval is set to Daily, Weekly, or Monthly.
  • Day The day on which the download should occur. Only applicable when Download Interval is set to Weekly or Monthly.
  • Cron Expression An arbitrary string representing a cron expression that determines when the download should occur. Only applicable when Download Interval is set to Advanced.
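
The Hourly schedule described above can be illustrated with a short Python sketch. It is not the connector's scheduler; the function name next_hourly_run is hypothetical, and the sketch only reproduces the documented behavior that a Minutes Past the Hour value of 5 yields downloads at 1:05, 2:05, 3:05, and so on.

```python
from datetime import datetime, timedelta


def next_hourly_run(now: datetime, minutes_past: int) -> datetime:
    """Return the first time strictly after `now` that is `minutes_past` minutes past an hour.

    Hypothetical helper: it mirrors the documented Hourly behavior only.
    """
    if not 0 <= minutes_past <= 59:
        raise ValueError("minutes_past must be between 0 and 59")
    # Candidate run time within the current hour.
    candidate = now.replace(minute=minutes_past, second=0, microsecond=0)
    # If that moment has already passed (or is right now), move to the next hour.
    if candidate <= now:
        candidate += timedelta(hours=1)
    return candidate


if __name__ == "__main__":
    print(next_hourly_run(datetime(2024, 1, 1, 1, 0), 5))   # 2024-01-01 01:05:00
    print(next_hourly_run(datetime(2024, 1, 1, 1, 5), 5))   # 2024-01-01 02:05:00
```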

Advanced Tab

Local Folders

Settings that determine the folder on disk that files will be sent/uploaded from, and the folder that they will be received/downloaded to.

  • Input Folder (Send) The connector can send/upload files placed in this folder. If Upload automation is enabled, the connector will automatically poll this location for files to process.
  • Output Folder (Receive) The connector will place received/downloaded files in this folder. If the connector is connected to another connector in the flow, files will not remain here and will instead be passed along to the Input/Send folder for the connected connector.
  • Processed Folder (Sent) After processing a file, the connector will place a copy of sent/uploaded files in this folder if Save to Sent Folder is enabled.

Performance

Settings related to the allocation of resources to the connector.

  • Max Workers The maximum number of worker threads that will be consumed from the threadpool to process files on this connector. If set, overrides the default setting from the Profile tab.
  • Max Files The maximum number of files that will be processed by the connector each time worker threads are assigned to the connector. If set, overrides the default setting from the Profile tab.

Other Settings

Settings not included in the previous categories.

  • Access Policy The access policy set on objects after they are uploaded to the S3 server.
  • Enable Size Comparison Whether to cache downloaded file names and sizes; if True then files will only be downloaded if they have not been downloaded before or have changed in size.
  • Enable Timestamp Comparison Whether to cache downloaded file names and last-modified timestamps; if True then files will only be downloaded if they have not been downloaded before or have been modified since they were downloaded.
  • Encryption Password If set, object data will be encrypted on the client side before upload, and downloaded objects will be automatically decrypted.
  • UseServerSideEncryption Whether to request that the S3 server encrypt object data on the server side.
  • Send Filter A glob pattern filter to determine which files in the Send folder will be uploaded by the connector (e.g. *.txt). Negative patterns may be used to indicate files that should not be processed by the connector (e.g. -*.tmp). Multiple patterns may be separated by commas, with later filters taking priority except when an exact match is found.
  • Local File Scheme A filemask for determining local file names as they are downloaded by the connector. The following macros may be used to reference contextual information:
    %ConnectorId%, %Filename%, %FilenameNoExt%, %Ext%, %ShortDate%, %LongDate%, %RegexFilename:%, %DateFormat:%.
    As an example: %FilenameNoExt%_%ShortDate%%Ext%
  • Log Level The verbosity of logs generated by the connector. When requesting support, it is recommended to set this to Debug.
  • Parent Connector The connector from which settings should be inherited, unless explicitly overwritten within the existing connector configuration. Must be set to a connector of the same type as the current connector.
  • Recurse Subdirectories Whether to download files in subfolders of the target remote path.
  • Use Virtual Hosting Whether to use hosted-style or path-style when referencing the Bucket endpoint.
  • Log Subfolder Scheme By default, logs for transactions processed by the connector will be stored in the Logs subfolder for the connector. For connectors that process many transactions, it may be desirable to further divide the logs based on the datetime they were generated. When this setting is set to Daily, logs generated on the same day will be grouped in a subfolder; when this setting is set to Weekly, logs generated in the same week will be grouped in a subfolder; and so on.

  • Log Messages Whether the log entry for a processed file will include a copy of the file itself.
  • Save to Sent Folder Whether files processed by the connector should be copied to the Sent folder for the connector.
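
To make the Local File Scheme macros more concrete, the following Python sketch expands a scheme such as %FilenameNoExt%_%ShortDate%%Ext%. It is an illustration only, not the connector's implementation. The function name expand_scheme is hypothetical, the yyyy-MM-dd rendering of %ShortDate% is an assumption, and %Ext% is assumed to include the leading dot because the documented example places no dot before it. Macros the sketch does not know (for example %RegexFilename:% and %DateFormat:%) are left untouched.

```python
import os
import re
from datetime import date


def expand_scheme(scheme: str, filename: str, connector_id: str, today: date) -> str:
    """Expand a subset of the Local File Scheme macros (illustrative only)."""
    stem, ext = os.path.splitext(filename)
    values = {
        "ConnectorId": connector_id,
        "Filename": filename,
        "FilenameNoExt": stem,
        "Ext": ext,  # assumption: includes the leading dot
        "ShortDate": today.strftime("%Y-%m-%d"),  # assumption: the real format may differ
    }
    # Replace each %Name% token in a single pass; unknown tokens are kept as-is.
    return re.sub(r"%(\w+)%", lambda m: values.get(m.group(1), m.group(0)), scheme)


if __name__ == "__main__":
    print(expand_scheme("%FilenameNoExt%_%ShortDate%%Ext%", "invoice.xml", "S3_Inbound", date(2024, 3, 5)))
    # invoice_2024-03-05.xml
```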

Establishing a Connection

The requirements for establishing an S3 connection are simple:

  • Amazon account credentials (or Google, Wasabi, etc)
    • Access Key
    • Secret Key
  • An S3 Bucket that can be accessed by the above account

For Amazon S3 specifically, the Access Key and Secret Key can be obtained from Amazon through the security credentials settings of the Amazon account.
Optionally, the connection with S3 servers can be secured by SSL by enabling the Use SSL when connecting with Amazon servers option.
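
As a rough illustration of how the Bucket Name, Region, SSL option, and Use Virtual Hosting setting combine, the Python sketch below builds an object URL in the two addressing styles that Amazon S3 documents. The function name bucket_url and the sample bucket and key are hypothetical, and other S3-like services (Google Storage, Wasabi, and so on) use different host names, so this should be read as a sketch for Amazon endpoints only.

```python
def bucket_url(bucket: str, region: str, key: str,
               use_ssl: bool = True, virtual_hosting: bool = True) -> str:
    """Build an Amazon S3 object URL in hosted style or path style (illustrative only)."""
    scheme = "https" if use_ssl else "http"
    host = f"s3.{region}.amazonaws.com"
    if virtual_hosting:
        # Hosted style: the bucket name is part of the host name.
        return f"{scheme}://{bucket}.{host}/{key}"
    # Path style: the bucket name is the first path segment.
    return f"{scheme}://{host}/{bucket}/{key}"


if __name__ == "__main__":
    print(bucket_url("example-bucket", "us-east-1", "inbound/a.txt"))
    # https://example-bucket.s3.us-east-1.amazonaws.com/inbound/a.txt
    print(bucket_url("example-bucket", "us-east-1", "inbound/a.txt", virtual_hosting=False))
    # https://s3.us-east-1.amazonaws.com/example-bucket/inbound/a.txt
```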

Uploading

Uploading to Remote Folders

The Remote Path setting within the Upload section specifies the path within the Bucket to which files are uploaded. This allows for the logical separation of files into virtual folders within the same Bucket.

Note that S3 servers do not maintain a real folder structure, and ArcESB uses application logic to present a pseudo folder structure. Slashes in the Remote Path (/, \) are interpreted as representing a folder hierarchy. This allows for uploading to or downloading from ‘subfolders’ within the Bucket based on the slashes in the path.
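
The pseudo folder structure can be pictured with a small Python sketch. S3 stores a flat set of object keys, and a folder view can be derived by splitting keys on slashes under a given prefix. This is not the application's actual logic; the function name list_folder and the sample keys are hypothetical.

```python
def list_folder(keys, remote_path):
    """Derive the 'subfolders' and files directly under remote_path from flat object keys."""
    # Treat both slash styles as folder separators and normalize to a trailing "/".
    prefix = remote_path.replace("\\", "/").strip("/")
    prefix = prefix + "/" if prefix else ""
    folders, files = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the key is the folder marker itself
        head, sep, _ = rest.partition("/")
        if sep:
            folders.add(head)   # more path segments follow: a virtual subfolder
        else:
            files.append(head)  # no further slash: a file in this folder
    return sorted(folders), sorted(files)


if __name__ == "__main__":
    keys = ["inbound/a.txt", "inbound/2024/b.txt", "outbound/c.txt", "root.txt"]
    print(list_folder(keys, "inbound"))  # (['2024'], ['a.txt'])
```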

Upload Automation

The S3 Connector supports automatic upload via the Automation tab in the connector configuration panel. When Upload automation is enabled, files that reach the Input folder for the connector will be automatically uploaded to the specified Bucket Name at the specified Remote Path.

If a file fails to upload, the application will attempt to send it again after the Retry Interval has elapsed. This process will continue until the Retry Maximum Attempts has been reached, after which the connector will raise an error.
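
The retry behavior can be sketched in Python as follows. This is a simplified model, not the connector's code. The function name upload_with_retry and its parameters are hypothetical, and the sketch assumes that Retry Maximum Attempts counts retries made after the initial attempt.

```python
import time


def upload_with_retry(upload, retry_max_attempts, retry_interval_seconds, sleep=time.sleep):
    """Call upload(); on failure, wait and retry up to retry_max_attempts times."""
    retries = 0
    while True:
        try:
            return upload()
        except Exception as exc:
            if retries >= retry_max_attempts:
                # Retries exhausted: surface an error, as the connector is described to do.
                raise RuntimeError(f"upload failed after {retries} retries") from exc
            retries += 1
            sleep(retry_interval_seconds)  # wait for the Retry Interval before trying again
```

The sleep parameter is injectable so the loop can be exercised without actually waiting.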

Downloading

Downloading from Remote Folders

The Remote Path setting within the Download section specifies the path within the Bucket from which files are downloaded. This allows for the logical separation of files into virtual folders within the same Bucket.

The File Filter setting provides a way to only download specific filenames within the specified path.
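
A comma-delimited glob filter of this kind can be approximated in Python with the standard fnmatch module. The sketch below is only an illustration: the function name matches_file_filter is hypothetical, and it assumes that a file qualifies when it matches any one pattern, that matching is case-sensitive, and that an empty filter matches every file.

```python
from fnmatch import fnmatchcase


def matches_file_filter(filename: str, file_filter: str) -> bool:
    """Return True if filename matches any glob pattern in a comma-delimited filter."""
    patterns = [p.strip() for p in file_filter.split(",") if p.strip()]
    if not patterns:
        return True  # assumption: no filter means every file qualifies
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


if __name__ == "__main__":
    for name in ["report.csv", "notes.txt", "image.png"]:
        print(name, matches_file_filter(name, "*.csv, *.txt"))
```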

Note that S3 servers do not maintain a real folder structure, and Arc uses application logic to present a pseudo folder structure. Slashes in the Remote Path (/, \) are interpreted as representing a folder hierarchy. This allows for uploading to or downloading from ‘subfolders’ within the Bucket based on the slashes in the path.

Download Automation

The S3 Connector supports automatic download via the Automation tab in the connector configuration panel. When Download automation is enabled, the connector will automatically poll the remote Bucket according to the specified Download Interval.
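
When polling repeatedly, the Enable Size Comparison and Enable Timestamp Comparison settings on the Advanced tab determine whether a previously downloaded file is fetched again. The Python sketch below models that decision under stated assumptions. The function name should_download and the in-memory dictionary cache are hypothetical, and the sketch assumes that every file is downloaded on each poll when neither comparison is enabled.

```python
from datetime import datetime


def should_download(cache, name, size, last_modified,
                    compare_size=True, compare_timestamp=True):
    """Decide whether a remote file should be downloaded on this poll.

    cache maps a file name to the (size, last_modified) seen at the last download.
    """
    if not (compare_size or compare_timestamp):
        return True  # assumption: with no comparison enabled, nothing is skipped
    seen = cache.get(name)
    if seen is None:
        return True  # never downloaded before
    seen_size, seen_modified = seen
    if compare_size and size != seen_size:
        return True  # changed in size
    if compare_timestamp and last_modified > seen_modified:
        return True  # modified since it was downloaded
    return False


if __name__ == "__main__":
    cache = {"a.txt": (100, datetime(2024, 1, 1))}
    print(should_download(cache, "a.txt", 100, datetime(2024, 1, 1)))  # False
    print(should_download(cache, "a.txt", 120, datetime(2024, 1, 1)))  # True
```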