Azure Synapse Writer

Writes to Azure Synapse (formerly Azure SQL Data Warehouse).

Prerequisites:

  • Deploy an Azure Synapse instance.

  • Deploy an Azure Blob Storage or Azure Data Lake Storage instance to be used for staging the data. See Best practices for loading data into a dedicated SQL pool in Azure Synapse Analytics.

  • Optionally, connect the Azure Synapse instance and the Azure Blob Storage or Azure Data Lake Storage Gen2 instance with an Azure Virtual Network (VNet). See Impact of using VNet Service Endpoints with Azure storage, particularly the prerequisites and the instructions for creating a database master key.

  • Create an Azure Synapse login for use by Striim.

  • Create an Azure Synapse database scoped credential with the storage account name as the IDENTITY and the storage account access key as the SECRET. For example:

    CREATE MASTER KEY ENCRYPTION BY PASSWORD='<password>';
    CREATE DATABASE SCOPED CREDENTIAL AppCred WITH IDENTITY = '<storage account name>',
    SECRET = '<access key>'; 

    You can view scoped credentials with the command:

    SELECT * FROM sys.database_scoped_credentials;
  • If using MERGE mode:

Azure Synapse Writer properties

Each property is listed below with its type and default value (if any) in parentheses, followed by notes.

Account Access Key (String)

The account access key for the storage account, from Storage accounts > <account name> > Access keys.

Account Name (String)

The storage account name.

CDDL Action (String, default: Process)

See Handling schema evolution.

If TRUNCATE commands may be entered in the source and you do not want to delete events in the target, precede the writer with a CQ whose select statement is SELECT * FROM <input stream name> WHERE META(x, OperationName).toString() != 'Truncate'; (replacing <input stream name> with the name of the writer's input stream). Note that there will be no record in the target that the affected events were deleted.
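For example, such a CQ might look like the following sketch (the stream names are illustrative):

CREATE CQ FilterTruncatesCQ
INSERT INTO FilteredStream
SELECT * FROM SourceStream x
WHERE META(x, OperationName).toString() != 'Truncate';

The writer would then take INPUT FROM FilteredStream rather than from SourceStream.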

Client Configuration (String)

If using a proxy, specify ProxyHost=<host name or IP address>,ProxyPort=<port number>.
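For example (the host and port are placeholders, and the property name is written without spaces as in the TQL sample application below):

ClientConfiguration: 'ProxyHost=192.0.2.10,ProxyPort=8080'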

Column Delimiter (String, default: |)

If the data to be written may contain the default column delimiter (ASCII / UTF-8 124), specify a different delimiter that will never appear in the data.

Connection Retry Policy (String, default: initialRetryDelay=10s, retryDelayMultiplier=2, maxRetryDelay=1m, maxAttempts=10, totalTimeout=10m)

With the default setting, if a connection attempt is unsuccessful, the adapter will try again in 10 seconds (initialRetryDelay=10s). If the second attempt is unsuccessful, in 20 seconds it will try a third time (initialRetryDelay=10s multiplied by retryDelayMultiplier=2). If that fails, the adapter will try again in 40 seconds (the previous retry interval of 20 seconds multiplied by 2). If connection attempts continue to fail, the adapter will try again every 60 seconds (maxRetryDelay=1m) until a total of 10 connection attempts have been made (maxAttempts=10), after which the adapter will halt and log an exception.

The adapter will halt when either maxAttempts or totalTimeout is reached.

initialRetryDelay, maxRetryDelay, and totalTimeout may be specified in milliseconds (ms), seconds (s, the default), or minutes (m).

If retryDelayMultiplier is set to 1, connection will be attempted at the fixed interval set by initialRetryDelay.

To disable connection retry, set maxAttempts=0.

Negative values are not supported.
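For example, to retry at a fixed 30-second interval for up to five attempts, the policy could be set along these lines (a sketch; the values are illustrative):

ConnectionRetryPolicy: 'initialRetryDelay=30s, retryDelayMultiplier=1, maxRetryDelay=30s, maxAttempts=5, totalTimeout=5m'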

Connection URL (String)

The JDBC connection URL for Azure Synapse, in the format jdbc:sqlserver://<fully qualified server name>:<port>;database=<SQL pool name>, for example, jdbc:sqlserver://mysqldw.database.windows.net:1433;database=mydb

Alternatively, you may use Active Directory authentication (see Supporting Active Directory authentication for Azure).

Excluded Tables (String)

If Tables uses a wildcard, data from any tables specified here will be omitted. Multiple table names (separated by semicolons) and wildcards may be used exactly as for Tables.

Ignorable Exception Code (String)

Set to TABLE_NOT_FOUND to prevent the application from terminating when Striim tries to write to a table that does not exist in the target. See Handling "table not found" errors for more information.

Ignored exceptions will be written to the application's exception store (see CREATE EXCEPTIONSTORE).

Merge API (String, default: STRIIM_MERGE)

Mode (String, default: MERGE)

With the default value of MERGE, inserts and deletes in the source are handled as inserts and deletes in the target. With this setting:

  • Since Synapse does not have primary keys, you may include the keycolumns option in the Tables property to specify a column in the target table that will contain a unique identifier for each row: for example, Tables:'SCOTT.EMP,mydb.mydataset.employee keycolumns(emp_num)'.

  • You may use wildcards for the source table provided all the tables have the key columns: for example, Tables:'DEMO.%,mydataset.% KeyColumns(...)'.

  • If you do not specify keycolumns, Striim will concatenate all column values and use that as a unique identifier.

Set to APPENDONLY to handle all operations as inserts. With this setting:

  • Updates and deletes from DatabaseReader, IncrementalBatchReader, and SQL CDC sources are handled as inserts in the target.

  • Primary key updates result in two records in the target, one with the previous value and one with the new value. If the Tables setting has a ColumnMap that includes @METADATA(OperationName), the operation name for the first event will be DELETE and for the second INSERT.
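For example, a target along these lines (a sketch; the table names are illustrative and the target table is assumed to have an optype column) writes every source operation as an insert and records the operation name:

CREATE TARGET AppendOnlySynapseTarget USING AzureSQLDWHWriter (
  Username: 'striim',
  Password: '********',
  ConnectionURL: 'jdbc:sqlserver://testserver.database.windows.net:1433;database=mydb',
  Mode: 'APPENDONLY',
  Tables: 'MYSCHEMA.EMP,dbo.EMP_AUDIT ColumnMap(optype=@METADATA(OperationName))',
  AccountName: 'mystorageaccount',
  AccountAccessKey: '********'
)
INPUT FROM OracleCDCStream;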

Optimized Merge (Boolean, default: False)

Not supported when CDDL Action is Process.

Set to True only when Mode is MERGE and the target's input stream is the output of an HP NonStop reader, MySQL Reader, or Oracle Reader source and the source events will include partial records. For example, with Oracle Reader, when supplemental logging has not been enabled for all columns, partial records are sent for updates. When the source events will always include full records, leave this set to False.

Also set to True when the source is Oracle Reader and the source table includes BLOB or CLOB columns.

Parallel Threads (Integer)

See Creating multiple writer instances. Not supported when Mode is MERGE.

Password (encrypted password)

The password for the specified user. See Encrypted passwords.

Storage Access Driver Type (String, default: WASBS)

Set to ABFS if you are using an Azure Data Lake Storage instance for staging the data, or if you are using a general-purpose blob storage instance connected to Synapse using VNet or across a firewall. (See The Azure Blob Filesystem driver (ABFS) for more information.)

Leave at the default setting WASBS if using a general-purpose V1 or V2 blob storage account without VNet or a firewall.

Tables (String)

The name(s) of the table(s) to write to. The table(s) must exist in the DBMS and the user specified in Username must have insert permission.

When the target's input stream is a user-defined event, specify a single table.

If the source table has no primary key, you may use the KeyColumns option to define a unique identifier for each row in the target table: for example, Tables:'sourcedb.emp,mydb.mySchema.emp KeyColumns(emp)'. The target table must be specified with a three-part name. If necessary to ensure uniqueness, specify multiple columns with the syntax KeyColumns(<column 1>,<column 2>,...). You may use wildcards for the source table, provided all the tables have the key columns: for example, Tables:'sourcedb.%,mydb.myschema.% KeyColumns(...)'. If the source has no primary key and KeyColumns is not specified, the concatenated value of all source fields is used as the primary key in the target.

When the input stream of the target is the output of a DatabaseReader, IncrementalBatchReader, or SQL CDC source (that is, when replicating data from one database to another), it can write to multiple tables. In this case, specify the names of both the source and target tables. You may use the % wildcard only for tables, not for schemas or databases. If the reader uses three-part names, you must use them here as well. Note that Oracle CDB/PDB source table names must be specified in two parts when the source is Database Reader or Incremental Batch Reader (schema.%,schema.%) but in three parts when the source is Oracle Reader or OJet (database.schema.%,schema.%). Note that SQL Server source table names must be specified in three parts when the source is Database Reader or Incremental Batch Reader (database.schema.%,schema.%) but in two parts when the source is MS SQL Reader or MS Jet (schema.%,schema.%). Examples:

source.emp,target.emp
source.db1,target.db1;source.db2,target.db2
source.%,target.%
source.mydatabase.emp%,target.mydb.%
source1.%,target1.%;source2.%,target2.%

MySQL and Oracle names are case-sensitive; SQL Server names are not. Specify names as <schema name>.<table name> for MySQL and Oracle and as <database name>.<schema name>.<table name> for SQL Server.

See Mapping columns for additional options.

Upload Policy (String, default: eventcount:10000, interval:5m)

The upload policy may include eventcount and/or interval (see Setting output names and rollover / upload policies for syntax). Cached data is written to the storage account every time any of the specified values is exceeded. With the default value, data will be written every five minutes or sooner if the cache contains 10,000 events. When the app is undeployed, all remaining data is written to the storage account.
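For example, to stage smaller batches more frequently, the policy could be set along these lines (the values are illustrative):

UploadPolicy: 'eventcount:5000,interval:1m'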

Username (String)

The user name Striim will use to log in to the Azure Synapse instance specified in Connection URL.

Azure Synapse Writer sample application

The following sample application would read from Oracle using IncrementalBatchReader and write to Azure Synapse.

CREATE  SOURCE ibr2azdw_Source USING IncrementalBatchReader  ( 
  Username: 'striim',
  Password: '********',
  ConnectionURL: '192.0.2.1:1521:orcl',
  Tables: 'MYSCHEMA.TABLE1',
  CheckColumn: 'MYSCHEMA.TABLE1=UUID',
  StartPosition: 'MYSCHEMA.TABLE1=1234'
) 
OUTPUT TO ibr2azdw_Source_Stream ;

CREATE  TARGET ibr2azdw_AzureSynapseTarget1 USING AzureSQLDWHWriter  ( 
  Username: 'striim',
  Password: '********',
  ConnectionURL: 'jdbc:sqlserver://testserver.database.windows.net:1433;database=rlsdwdb',
  Tables: 'MYSCHEMA.TABLE1,dbo.TABLE1',
  AccountName: 'mystorageaccount',
  AccountAccessKey: '********'
) 
INPUT FROM ibr2azdw_Source_Stream;
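If the source table has no primary key, the target's Tables property could add KeyColumns so that the default MERGE mode has a unique identifier for each row. A sketch, with an illustrative UUID column and a three-part target table name as required for KeyColumns:

CREATE TARGET ibr2azdw_AzureSynapseTarget2 USING AzureSQLDWHWriter (
  Username: 'striim',
  Password: '********',
  ConnectionURL: 'jdbc:sqlserver://testserver.database.windows.net:1433;database=rlsdwdb',
  Tables: 'MYSCHEMA.TABLE1,rlsdwdb.dbo.TABLE1 KeyColumns(UUID)',
  AccountName: 'mystorageaccount',
  AccountAccessKey: '********'
)
INPUT FROM ibr2azdw_Source_Stream;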

Azure Synapse data type support and correspondence

TQL type                      Azure Synapse type

java.lang.Byte                tinyint
java.lang.Double              float
java.lang.Float               float
java.lang.Integer             int
java.lang.Long                bigint
java.lang.Short               smallint
java.lang.String              char, nchar, nvarchar, varchar
org.joda.time.DateTime        datetime, datetime2, datetimeoffset

When the input of an Azure Synapse target is the output of a MySQL source (DatabaseReader, IncrementalBatchReader, or MySQLReader):

MySQL type                    Azure Synapse type

bigint                        bigint, numeric
bigint unsigned               bigint
binary                        binary
char                          nchar
date                          date
datetime                      datetime, datetime2, datetimeoffset
decimal                       decimal
decimal unsigned              decimal
double                        money, smallmoney
float                         float, real
int                           int
int unsigned                  int
longblob                      varbinary
longtext                      varchar
mediumblob                    binary
mediumint                     int
mediumint unsigned            int
mediumtext                    varchar
numeric unsigned              int
smallint                      smallint
smallint unsigned             smallint
text                          varchar
time                          time
tinyblob                      binary
tinyint                       bit (if only one digit), tinyint
tinyint unsigned              tinyint
tinytext                      varchar
varbinary                     varbinary
varchar                       nvarchar, varchar
year                          varchar

When the input of an Azure Synapse target is the output of an Oracle source (DatabaseReader, IncrementalBatchReader, or OracleReader):

Oracle type                       Azure Synapse type

binary_double                     float
binary_float                      real
blob                              binary, varbinary
char                              char
clob                              nvarchar
date                              date
float                             float
nchar                             nchar
nclob                             varchar
number(1)                         bit
number(10,4)                      smallmoney
number(10)                        int
number(19,4)                      money
number(19)                        bigint
number(3)                         tinyint
number(5)                         char, smallint
timestamp                         datetime, datetime2, datetimeoffset
timestamp with local timezone     datetimeoffset
timestamp with timezone           datetimeoffset
varchar2                          varchar
varchar2(30)                      time
xmltype                           varchar

When the input of an Azure Synapse target is the output of a SQL Server source (DatabaseReader, IncrementalBatchReader, or MSSQLReader):

SQL Server type               Azure Synapse type

bigint                        bigint
binary                        binary
bit                           bit, char
date                          date
datetime                      datetime
datetime2                     datetime2
datetimeoffset                datetimeoffset
decimal                       decimal
float                         float
image                         varbinary
int                           int
money                         money
nchar                         nchar
ntext                         varchar
numeric                       numeric
nvarchar                      nvarchar
real                          real
smalldatetime                 smalldatetime
smallint                      smallint
smallmoney                    smallmoney
text                          varchar
time                          time
tinyint                       tinyint
varbinary                     varbinary
varchar                       varchar
xml                           varchar