S3 Writer

Writes to Amazon S3 or Dell EMC ECS Enterprise Object Storage.

See Port Requirements for information on firewall settings.

S3 Writer properties

Columns: property, type, default value, notes

Access Key ID

String

Specify an AWS access key ID (created on the AWS Security Credentials page) for a user with "Write objects" permission on the bucket.

When Striim is running in Amazon EC2 and there is an IAM role with that permission associated with the VM, leave accesskeyid and secretaccesskey blank to use the IAM role.

For Dell EMC ECS, specify the S3 Access Key string from the All Credentials page.

Bucket Name

String

The S3 bucket name. If you specify the Region property and the bucket does not already exist, S3 Writer will create it. Otherwise, you must create the bucket manually before running S3 Writer.

See Setting output names and rollover / upload policies for advanced options. To use dynamic bucket names, you must specify a value for the Region property.

Note the limitations in Amazon's Rules for Bucket Naming.

Client Configuration

String

Optionally, specify one or more of the following property-value pairs, separated by commas.

If you access S3 through a proxy server, specify it here using the syntax ProxyHost=<IP address>,ProxyPort=<port number>,ProxyUserName=<user name>,ProxyPassword=<password>. Omit the user name and password if not required by your proxy server.

Specify any of the following to override Amazon's defaults:

  • ConnectionTimeout=<timeout in milliseconds>: how long to wait to establish the HTTP connection, default is 50000

  • MaxErrorRetry=<number of retries>: the number of times to retry failed requests (for example, 5xx errors), default is 3

  • SocketErrorSizeHints=<size in bytes>: TCP buffer size, default is 2000000

See http://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/section-client-configuration.html for more information about these settings.

For Dell EMC ECS, specify endpointConfiguration= followed by the S3 End Point string from the All Credentials page.
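As a rough illustration of the proxy and override settings described above, a target might be declared along the lines of the following sketch. It assumes the property is written as clientconfiguration in TQL (following the lowercase, no-space pattern of bucketname and accesskeyid in the sample application on this page). The target name, proxy address, port, and override values are placeholders, not recommendations.

```tql
-- Sketch only: clientconfiguration is an assumed TQL property name.
-- 192.0.2.10 and 8080 are placeholder proxy settings; no proxy user name
-- or password is given, as for a proxy that does not require them.
CREATE TARGET S3ProxyTarget USING S3Writer (
  bucketname:'mybucket',
  objectname:'myfile.json',
  accesskeyid:'********************',
  secretaccesskey:'******************************',
  clientconfiguration:'ProxyHost=192.0.2.10,ProxyPort=8080,ConnectionTimeout=60000,MaxErrorRetry=5'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;
```

All pairs go in a single comma-separated string, so proxy settings and overrides can be combined as shown.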

Compression Type

String

Set to gzip when the input is in gzip format. Otherwise, leave blank.

Folder Name

String

Optionally, specify a folder within the specified bucket. If it does not exist, it will be created.

See Setting output names and rollover / upload policies for advanced options.

Object Name

String

The base name of the files to be written. See Setting output names and rollover / upload policies.

Object Tags

String

Optionally, specify one or more object tags (see Object Tagging) to be associated with the file as key-value pairs <tag name>=<value> separated by commas. Values may include field, metadata, and/or userdata values (see Setting output names and rollover / upload policies) and/or environment variables (specified as $<variable name>).
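The tag syntax above might be used as in the following sketch, which assumes the TQL property name is objecttags. The tag names, the target name, and the environment variable DEPLOY_ENV are hypothetical.

```tql
-- Sketch only: objecttags is an assumed TQL property name.
-- "app" is a static tag; "env" takes its value from the hypothetical
-- environment variable DEPLOY_ENV.
CREATE TARGET S3TaggedTarget USING S3Writer (
  bucketname:'mybucket',
  objectname:'myfile.json',
  accesskeyid:'********************',
  secretaccesskey:'******************************',
  objecttags:'app=PosApp,env=$DEPLOY_ENV'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;
```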

Parallel Threads

Integer

See Creating multiple writer instances.

Partition Key

String

If you enable ParallelThreads, specify a field to be used to partition the events among the threads. Events will be distributed among multiple S3 folders based on this field's values.

If the input stream is of any type except WAEvent, specify the name of one of its fields.

If the input stream is of the WAEvent type, specify a field in the METADATA map (see WAEvent contents for change data) using the syntax @METADATA(<field name>), or a field in the USERDATA map (see Adding user-defined data to WAEvent streams), using the syntax @USERDATA(<field name>). If appropriate, you may concatenate multiple METADATA and/or USERDATA fields.
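A minimal sketch of the two cases follows. It assumes the TQL property names are parallelthreads and partitionkey and that the thread count can be given as a bare integer. MerchantId is a field of the typed stream in the sample application on this page; the thread count of 4 and the METADATA field TableName are illustrative assumptions.

```tql
-- Sketch only: property names and value forms are assumptions.
-- Typed input stream: partition on one of its fields.
CREATE TARGET S3ParallelTarget USING S3Writer (
  bucketname:'mybucket',
  objectname:'myfile.json',
  accesskeyid:'********************',
  secretaccesskey:'******************************',
  parallelthreads:4,
  partitionkey:'MerchantId'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;

-- For a WAEvent input stream, the key would instead reference a map entry,
-- for example: partitionkey:'@METADATA(TableName)'
```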

Region

String

Optionally, specify an AWS region, for example, us-west-1. This is required to use dynamic bucket names (see Setting output names and rollover / upload policies).

Rollover on DDL

Boolean

True

Has effect only when the input stream is the output stream of a CDC reader source. With the default value of True, rolls over to a new file when a DDL event is received. Set to False to keep writing to the same file.

Secret Access Key

encrypted password

Specify the AWS secret access key for the specified access key.

For Dell EMC ECS, specify the S3 Secret Key 1 string from the All Credentials page.

Upload Policy

String

eventcount:10000, interval:5m

The upload policy may include eventcount, interval, and/or filesize (see Setting output names and rollover / upload policies for syntax). Cached data is written to S3 every time any of the specified values is exceeded. With the default value, data will be written every five minutes, or sooner if the cache reaches 10,000 events. When the app is undeployed, all remaining data is written to S3.

When uploading configurations to a bucket protected by Object Lock, specify AWSS3ObjectLockEnabled=true in the request.

This adapter has a choice of formatters. See Supported writer-formatter combinations for more information.

S3 Writer sample application

CREATE APPLICATION testS3;

CREATE SOURCE PosSource USING FileReader ( 
  wildcard: 'PosDataPreview.csv',
  directory: 'Samples/PosApp/appData',
  positionByEOF:false )
PARSE USING DSVParser (
  header:Yes,
  trimquote:false ) 
OUTPUT TO PosSource_Stream;

CREATE CQ PosSource_Stream_CQ 
INSERT INTO PosSource_TransformedStream
SELECT TO_STRING(data[1]) AS MerchantId,
  TO_DATE(data[4]) AS DateTime,
  TO_DOUBLE(data[7]) AS AuthAmount,
  TO_STRING(data[9]) AS Zip
FROM PosSource_Stream;

CREATE TARGET testS3target USING S3Writer (
  bucketname:'mybucket',
  objectname:'myfile.json',
  accesskeyid:'********************',
  secretaccesskey:'******************************',
  foldername:'myfolder')
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;

END APPLICATION testS3;

Because the test data set contains fewer than 10,000 events and the application uses the default upload policy, the data will be uploaded to S3 after five minutes or when you undeploy the application, whichever comes first.
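To see output sooner while testing with a small data set, one option is to lower the upload thresholds. The following variation of the target is a sketch that assumes the TQL property name is uploadpolicy. The values 1000 and 1m are illustrative only.

```tql
-- Sketch only: uploadpolicy is an assumed TQL property name.
-- Uploads when either threshold is exceeded: 1,000 cached events or one minute.
CREATE TARGET testS3target USING S3Writer (
  bucketname:'mybucket',
  objectname:'myfile.json',
  accesskeyid:'********************',
  secretaccesskey:'******************************',
  foldername:'myfolder',
  uploadpolicy:'eventcount:1000,interval:1m'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;
```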