Cassandra Cosmos DB Writer

Writes to Cosmos DB using the Azure Cosmos DB Cassandra API. This allows you to write to Cosmos DB as if it were Cassandra.

Note

If the writer exceeds the number of Request Units (RUs) per second provisioned for your Cosmos DB instance (see Request Units in Azure Cosmos DB), the application may halt. The Azure Cosmos DB Capacity Calculator can give you an estimate of the appropriate number of RUs to provision.

You may need more RUs during initial load than for continuing replication.

See Optimize your Azure Cosmos DB application using rate limiting for more information.

Notes:

  • Add a Baltimore root certificate to Striim's Java environment following the instructions in To add a root certificate to the cacerts store.

  • Target tables must have primary keys.

  • Primary keys cannot be updated.

  • During recovery (see Recovering applications), events with primary keys that already exist in the target will be updated with the new values.

  • When the input stream of a Cassandra Cosmos DB Writer target is the output of a SQL CDC source, Compression must be enabled in the source.

Data type support and correspondence are the same as for Database Writer (see Database Writer data type support and correspondence).

Cassandra Cosmos DB Writer properties

property

type

default value

notes

Account Endpoint

String

Contact Point from the Azure Cosmos DB account's Connection String page

Account Key

encrypted password

Primary Password from the Azure Cosmos DB account's Connection String page's Read-write Keys tab

Checkpoint Table

String

CHKPOINT

To support recovery (see Recovering applications), a checkpoint table must be created in the target keyspace using the following DDL:

CREATE TABLE chkpoint (
  id varchar PRIMARY KEY,
  sourceposition blob,
  pendingddl int,
  ddl ascii);

If necessary you may use a different table name, in which case change the value of this property.

Column Name Escape Sequence

String

 

When the input stream of the target is the output of a DatabaseReader, IncrementalBatchReader, or SQL CDC source, you may use this property to specify which characters Striim will use to escape column names that contain special characters or are on the List of reserved keywords. You may specify two characters to be added at the start and end of the name (for example, [] ), or one character to be added at both the start and end (for example, ").

Connection Retry

String

retryInterval=30, maxRetries=3

With the default setting, if a connection attempt is unsuccessful, the adapter will try again in 30 seconds (retryInterval). If the second attempt is unsuccessful, in 30 seconds it will try a third time (maxRetries). If that is unsuccessful, the adapter will fail and log an exception. Negative values are not supported.

Consistency Level

String

ONE

How many replicas need to respond to the coordinator in order to consider the operation a success. Supported values are ONE, TWO, THREE, ANY, ALL, EACH QUORUM, and LOCAL QUORUM. For more information, see Consistency levels and Azure Cosmos DB APIs.

Excluded Tables

String

If Tables uses a wildcard, data from any tables specified here will be omitted. Multiple table names (separated by semicolons) and wildcards may be used exactly as for Tables.

Flush Policy

String

EventCount:1000, Interval:60

If data is not flushed properly with the default setting, you may use this property to specify how many events Striim will accumulate before writing and/or the maximum number of seconds that will elapse between writes. For example:

  • flushpolicy:'eventcount:5000'

  • flushpolicy:'interval:10s'

  • flushpolicy:'interval:10s, eventcount:5000'

Note that changing this setting may significantly degrade performance.

With a setting of 'eventcount:1', each event will be written immediately. This can be useful during development, debugging, testing, and troubleshooting.
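As a sketch of where this setting goes, the fragment below places a flush policy inside a target definition modeled on the sample application. The property name FlushPolicy (the display name written without a space, following the pattern of AccountEndpoint in the sample), the target name DebugCassandraTarget, and the masked account key are assumptions for illustration only.

```tql
-- Hypothetical development target: write each event as soon as it arrives.
-- All names and values are placeholders, not taken from a real deployment.
CREATE TARGET DebugCassandraTarget USING CassandraCosmosDBWriter (
  AccountEndpoint: 'myCosmosDBAccount.cassandra.cosmos.azure.com',
  AccountKey: '********',
  Keyspace: 'myKeyspace',
  Tables: 'myKeyspace.MyTable1,myKeyspace.MyTable2',
  FlushPolicy: 'eventcount:1'
)
INPUT FROM FilteredDataStream;
```

Because flushing on every event may degrade performance, a setting like this is probably best limited to development and troubleshooting.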

Ignorable Exception Code

String

By default, if the Cassandra API returns an error, the application will terminate. To ignore an error and continue, specify a portion of its error message. This property is not case-sensitive.

When the input stream is the output of a SQL CDC source, and primary keys will be updated in the source, set this to primary key to ignore primary key errors and continue.

Ignored exceptions will be written to the application's exception store (see CREATE EXCEPTIONSTORE).
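The primary key case described above might look like the following sketch. The property name IgnorableExceptionCode (the display name without spaces), the target name CdcCassandraTarget, and the input stream name SqlCdcStream are assumptions, and the account values are placeholders.

```tql
-- Hypothetical target fed by a SQL CDC source whose primary keys may be updated.
-- 'primary key' is matched against error messages (not case-sensitive);
-- matching errors are ignored and written to the exception store.
CREATE TARGET CdcCassandraTarget USING CassandraCosmosDBWriter (
  AccountEndpoint: 'myCosmosDBAccount.cassandra.cosmos.azure.com',
  AccountKey: '********',
  Keyspace: 'myKeyspace',
  Tables: 'myKeyspace.MyTable1,myKeyspace.MyTable2',
  IgnorableExceptionCode: 'primary key'
)
INPUT FROM SqlCdcStream;
```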

Keyspace

String

The Cassandra keyspace containing the specified tables

Load Balancing Policy

String

TokenAwarePolicy(RoundRobinPolicy())

See Specifying load balancing policies for more information.

Overload Policy

String

retryInterval=10, maxRetries=3

With the default setting, if Cassandra Cosmos DB Writer exceeds the number of Request Units per second provisioned for your Cosmos DB instance (see Request Units in Azure Cosmos DB) and the Cassandra API reports an overload error, the adapter will try again in ten seconds (retryInterval). If the second attempt is unsuccessful, in ten seconds it will try a third time (maxRetries). If that is unsuccessful, the adapter will fail and log an exception. Negative values are not supported.
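For a workload that occasionally exceeds its provisioned RUs, a longer and more patient overload policy could be sketched as below. The property name OverloadPolicy (the display name without a space), the target name PatientCassandraTarget, and the values 20 and 5 are assumptions chosen for illustration. If the pattern described for the defaults carries over, this would wait 20 seconds between attempts and fail after the fifth attempt.

```tql
-- Hypothetical target with a more tolerant overload policy.
-- The retry values are illustrative, not recommendations.
CREATE TARGET PatientCassandraTarget USING CassandraCosmosDBWriter (
  AccountEndpoint: 'myCosmosDBAccount.cassandra.cosmos.azure.com',
  AccountKey: '********',
  Keyspace: 'myKeyspace',
  Tables: 'myKeyspace.MyTable1,myKeyspace.MyTable2',
  OverloadPolicy: 'retryInterval=20, maxRetries=5'
)
INPUT FROM FilteredDataStream;
```

Retrying only delays the failure if the provisioned RUs are consistently too low, so provisioning more RUs is likely the better fix for sustained overload.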

Parallel Threads

Integer

See Creating multiple writer instances.

Port

String

10350

Port from the Azure Cosmos DB account's Connection String page

Tables

String

Cassandra table names must be lowercase. The tables must exist in Cassandra. Since columns in Cassandra tables are not usually created in the same order they are specified in the CREATE TABLE statement, when the input stream of the Cassandra Cosmos DB Writer target is the output of a DatabaseReader or CDC source, the ColumnMap option is usually required (see Mapping columns) and wildcards are not supported. You may omit ColumnMap if you verify that the Cassandra columns are in the same order as the source columns.

Cassandra Cosmos DB Writer sample application

CREATE TARGET CassandraTarget USING CassandraCosmosDBWriter (
  AccountEndpoint: 'myCosmosDBAccount.cassandra.cosmos.azure.com',
  AccountKey: '**************************************************************************************==',
  Keyspace: 'myKeyspace',
  Tables: 'myKeyspace.MyTable1,myKeyspace.MyTable2'
)
INPUT FROM FilteredDataStream;