Azure Event Hub Writer

Writes to an existing Azure event hub, which is equivalent to a Kafka topic.

Azure Event Hubs is similar to Kafka, compatible with many Kafka tools, and uses some of the same architectural elements, such as consumer groups and partitions. AzureEventHubWriter is generally similar to Kafka Writer in sync mode and its output formats are the same.

When Striim is deployed on a network with both a firewall and a proxy, open port 443. If there is a firewall but no proxy, open port 5671 and perhaps also 5672. See Connections and sessions for information on firewall settings.

Azure Event Hub Writer properties

property

type

default value

notes

Batch Policy

String

Size:1000000, Interval:30s

The batch policy may include a size, an interval, or both. Cached data is written to the target whenever either of the specified values is exceeded. With the default setting, data will be written every 30 seconds, or sooner if the cache contains 1,000,000 bytes. When the application is stopped, any remaining data in the buffer is discarded.

Connection Retry

String

Retries:0, RetryBackOff:1m

With the default Retries:0, retry is disabled. To enable retries, set a positive value for Retries and in RetryBackOff specify the interval between retries in minutes (#m) or seconds (#s). For example, with the setting Retries:3, RetryBackOff:30s, if the first connection attempt is unsuccessful, in 30 seconds Striim will try again. If the second attempt is unsuccessful, in 30 seconds Striim will try again. If the third attempt is unsuccessful, the adapter will fail and log an exception. Negative values are not supported.

Consumer Group

String

If E1P is true, specify an Event Hub consumer group for Striim to use for tracking which events have been written.

E1P

Boolean

false

With the default value, after recovery (see Recovering applications) there may be some duplicate events. Set to true to ensure that there are no duplicates ("exactly once processing"). If recovery is not enabled for the application, this setting will have no effect.

When this property is set to true, the target event hub must be empty the first time the application is started, and other applications must not write to the event hub.

When set to true, AzureEventHubWriter will use approximately 42 MB of memory per partition, so if the hub has 32 partitions, it will use 1.3 GB.

Event Hub Config

String

If Striim is connecting with Azure through a proxy server, provide the connection details in the format ProxyIP=<IP address>, ProxyPort=<port>, ProxyUsername=<user name>, ProxyPassword=<password>, for example, EventHubConfig='ProxyIP=192.0.2.100, ProxyPort=8080, ProxyUsername=myuser, ProxyPassword=passwd'.

Event Hub Name

String

the name of the event hub, which must exist when the application is started and have between two and 32 partitions

Event Hub Namespace

String

the namespace of the specified event hub

Operation Timeout

Integer

1m

amount of time Striim will wait for Azure to respond to requests (reading, writing, or closing connections) before the application will fail

Partition Key

String

The name of a field in the input stream whose values determine how events will be distributed among multiple partitions. Events with the same partition key field value will be written to the same partition.

If the input stream is of any type except WAEvent, specify the name of one of its fields.

If the input stream is of the WAEvent type, specify a field in the METADATA map (see WAEvent contents for change data) using the syntax @METADATA(<field name>), or a field in the USERDATA map (see Adding user-defined data to WAEvent streams), using the syntax @USERDATA(<field name>). If appropriate, you may concatenate multiple METADATA and/or USERDATA fields.

SAS Key

String

the primary key associated with the SAS policy

SAS Policy Name

String

an Azure SAS policy to authenticate connections (see Shared Access Authorization Policies)

For samples of the output, see:

If E1P is set to true, the records will contain information Striim can use to ensure no duplicate records are written during recovery (see Recovering applications).

Azure Event Hub Writer sample application that writes data to an event hub

The following sample application will write data from PosDataPreview.csv to an event hub.

CREATE SOURCE PosSource USING FileReader (
  directory:'Samples/PosApp/AppData',
  wildcard:'PosDataPreview.csv',
  positionByEOF:false
)
PARSE USING DSVParser (
  header:yes
)
OUTPUT TO RawStream;

CREATE CQ CsvToPosData
INSERT INTO PosDataStream
SELECT TO_STRING(data[1]) as merchantId,
  TO_DATEF(data[4],'yyyyMMddHHmmss') as dateTime,
  TO_DOUBLE(data[7]) as amount,
  TO_STRING(data[9]) as zip
FROM RawStream;

CREATE TARGET EventHubTarget USING AzureEventHubWriter (
  EventHubNamespace:'myeventhub-ns',
  EventHubName:'PosAppData',
  SASTokenName:'RootManageSharedAccessKey',
  SASToken:'******',
  PartitionKey:'merchantId'
)
FORMAT USING DSVFormatter ()
INPUT FROM PosDataStream;
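
As a hedged sketch, the same target could override the batching and retry defaults described in the properties table. The property names BatchPolicy and ConnectionRetry are assumptions, formed by removing the spaces from the names in the table in the same way the sample writes EventHubName and PartitionKey. The target name EventHubTargetTuned and the values shown are illustrative only. With these settings, data would be written every 10 seconds, or sooner if the cache contains 500,000 bytes, and a failed connection would be retried using the Retries:3, RetryBackOff:30s setting discussed above.

```
CREATE TARGET EventHubTargetTuned USING AzureEventHubWriter (
  EventHubNamespace:'myeventhub-ns',
  EventHubName:'PosAppData',
  SASTokenName:'RootManageSharedAccessKey',
  SASToken:'******',
  PartitionKey:'merchantId',
  BatchPolicy:'Size:500000, Interval:10s',
  ConnectionRetry:'Retries:3, RetryBackOff:30s'
)
FORMAT USING DSVFormatter ()
INPUT FROM PosDataStream;
```

A shorter interval reduces how much buffered data can be discarded when the application is stopped, at the cost of more frequent writes.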

Azure Event Hub Writer sample application that replicates data to an event hub

The following sample application will replicate data from two Oracle tables to two partitions in an event hub.

CREATE SOURCE OracleSource USING OracleReader (
  Username:'myname',
  Password:'******',
  ConnectionURL: 'localhost:1521:XE',
  Tables:'QATEST.EMP;QATEST.DEPT'
) 
OUTPUT TO sourceStream;

CREATE TARGET EventHubTarget USING AzureEventHubWriter (
  EventHubNamespace:'myeventhub-ns',
  EventHubName:'OracleData',
  SASTokenName:'RootManageSharedAccessKey',
  SASToken:'******',
  PartitionKey:'@metadata(TableName)',
  E1P:'True',
  ConsumerGroup:'testconsumergroup'
)
FORMAT USING DSVFormatter()
INPUT FROM sourceStream;
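
If Striim reaches Azure through a proxy server, the target might also carry the proxy details from the Event Hub Config property. The following is a minimal sketch, not a verified configuration: it assumes the property is written as EventHubConfig with the colon syntax used by the other properties in this sample, reuses the placeholder proxy values from the properties table, and uses the illustrative target name EventHubProxyTarget.

```
CREATE TARGET EventHubProxyTarget USING AzureEventHubWriter (
  EventHubNamespace:'myeventhub-ns',
  EventHubName:'OracleData',
  SASTokenName:'RootManageSharedAccessKey',
  SASToken:'******',
  PartitionKey:'@metadata(TableName)',
  E1P:'True',
  ConsumerGroup:'testconsumergroup',
  EventHubConfig:'ProxyIP=192.0.2.100, ProxyPort=8080, ProxyUsername=myuser, ProxyPassword=passwd'
)
FORMAT USING DSVFormatter()
INPUT FROM sourceStream;
```

As noted at the top of this page, a deployment with both a firewall and a proxy needs port 443 open.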