Handling schema evolution
For the CDC sources listed below, Striim can capture certain changes to the DDL in the source tables. When CDDL Capture is enabled in a source, you must choose which of the following actions you want to happen when it encounters a DDL change:
Halt in source: DDL changes are never expected in the source. Striim will halt the application so you can investigate the issue. Use this option when you want to prohibit changes to the source schema.
Ignore in source: target is schemaless and does not have the CDDL Action property (for example, FileWriter with JSON Formatter, or MongoDB).
Process in source, Halt in target: the application has one or more targets that support the Process action and one or more that have the CDDL Action property but do not support the Process action. In this case, set the CDDL Action property to Halt for the targets that do not support Process. Schema changes in the source will be replicated to the targets that support Process and the application will halt for you to deal with the others manually. If recovery is enabled for the application, after restart the DDL operation will be sent again.
DDL changes to tables specified in Excluded Tables will not trigger Halt.
Process in source, Ignore in target: the application has multiple targets that have the CDDL Action property and you do not want to replicate changes to this one.
Process in source, Process in target: replicate changes to the target, keeping the target schema in sync with the source automatically without interrupting operation of the application. This is supported only for the targets listed below..
When an unsupported DDL operation or unsupported data type is encountered in a DDL change, the application will halt for you to troubleshoot the problem.
Quiesce in source: target does not support schema evolution, so when a DDL change is detected, Striim will write all the events received prior to the DDL operation to the target, then quiesce the application. Then you can update the target schema manually. If recovery is enabled for the application, after restart the DDL operation will not be sent again.
Always select Process in the source when Using the Confluent or Hortonworks schema registry.
Supported CDC sources
MySQL
Oracle Database 18c and earlier
Oracle GoldenGate 12.1 or later for Oracle Database only
PostgreSQL (see PostgreSQL setup for schema evolution)
SQL Server (with MSJet only)
Targets with the CDDL Action property
Azure Synapse
Databricks Writer: supports only CREATE TABLE, ADD COLUMN, and DROP TABLE
Google BigQuery
Google Cloud Spanner
MySQL (via Database Writer)
Oracle Database (via Database Writer)
PosgtgreSQL (via Database Writer)
SAP Hana (via Database Writer - does not support Process)
Snowflake
SQL Server (via Database Writer)
Sybase (via Database Writer - does not support Process)
Supported DDL operations
CREATE TABLE (default column values are not supported)
ALTER TABLE ... ADD COLUMN (default column values are not supported)
with Oracle Database, adding NOT NULL constraints is not supported (known issues DEV-24666, DEV-25424)
ALTER TABLE ... MODIFY COLUMN: The modification must be compatible with existing data, for example, you could change
short
tolong
, orvarchar(20)
tovarchar(30)
. Default column values are not supported.not supported with BigQuery or Databricks targets
If a ColumnMap is specified (see Mapping columns), the mapped target column will be modified.
with Oracle Database, adding NOT NULL constraints is not supported (known issues DEV-24666, DEV-25424)
Snowflake Writer: see limitations described in ALTER TABLE … ALTER COLUMN
not supported with SQL Server sources (known issue DEV-26386)
ALTER TABLE ... ADD PRIMARY KEY
not supported with BigQuery or Databricks targets
supported with GoldenGate sources only when table has no primary key (known issues DEV-26575)
not supported with SQL Server sources
ALTER TABLE ... ADD CONSTRAINT ... PRIMARY KEY
not supported with BigQuery or Databricks targets
With GoldenGate sources, use the syntax
ALTER TABLE <name> ADD CONSTRAINT <name> PRIMARY KEY (id)
. (The syntaxALTER TABLE MODIFY ID NUMBER NOT NULL PRIMARY KEY
will not work.)not supported with SQL Server sources
ALTER TABLE ... ADD CONSTRAINT ... UNIQUE
not supported with BigQuery or Databricks targets
supported with GoldenGate sources only when table has a primary key column and the constraint is not added to that column (known issues DEV-26575)
not supported with SQL Server sources (known issue DEV-26386)
ALTER TABLE ... DROP COLUMN:
not supported with BigQuery or Databricks targets
If a ColumnMap is specified for the column (see Mapping columns), the application will halt. ALTER and RECOMPILE the application to remove the ColumnMap for the column, drop the column from the target table, and restart the application.
with Oracle Database, adding NOT NULL constraints is not supported (known issues DEV-24666, DEV-25424)
DROP TABLE
TRUNCATE TABLE
supported when the source is OJet and the target is Database Writer writing to MariaDB, MySQL, Oracle, PostgreSQL, or SQL Server
Data type support and mappings
See Data type support & mapping for schema conversion & evolution.
Process in the source is supported only when all source table data types are supported. Process in the target is supported only when all source types are mapped to target types.
Monitoring schema evolution
The MON command includes the following metrics for schema evolution:
number of DDL operations, by table
last captured / applied DDL statement
time of last captured / applied DDL
ignored DDL count
Notes and limitations
Schema evolution is not supported when using Bidirectional replication.
Striim can capture only those DDL changes made after schema evolution is enabled.
The first time you start an application with CDDL Capture enabled, the CDC reader will take a snapshot of the source database's table metadata from the. It is essential that there are no DDL changes made to the database until startup completes. Otherwise, the schema captured in the snapshot will be out of date, which will eventually cause the application to terminate.
When the Tables property in the reader uses a wildcard, the first time the application is started Striim must fetch the metadata for all tables in the schema. If there are many tables in the schema, this may take a significant amount of time.
After an application with both recovery and schema evolution enabled is restarted, Striim will automatically use the correct schema for the restart position
If the application halts due to an unsupported DDL change, an unsupported column data type, or a Parser Exception, you may add the table causing the halt to the Excluded Tables list and restart the application.
Limitations
Renaming tables is not supported.
TRUNCATE TABLE is not supported. Use
DELETE FROM <table name>;
or some other method for deleting all rows from a table.
See also Schema evolution known issues and limitations.
Sample WAEvents for DDL operations when schema evolution is enabled
DDL command | example | resulting WAEvent |
---|---|---|
CREATE TABLE | CREATE TABLE PRODUCT.CUSTOMER ( c_custkey BIGINT not null, c_name VARCHAR(25) not null, c_address VARCHAR(40) not null, c_nationkey INTEGER not null, c_phone CHAR(15) not null, c_acctbal DOUBLE PRECISION,, c_mktsegment CHAR(10) not null ); | WAEvent{ data: ["CREATE TABLE PRODUCT.CUSTOMER …”] metadata:{ "OperationName": "Create", "TableName": "PRODUCT.CUSTOMER", "SchemaName": "PRODUCT", "OperationType": "DDL", "CDDLMetadata": “<Info about DDL>” } }; |
ALTER TABLE ADD COLUMN | ALTER TABLE PRODUCT.CUSTOMER ADD c_comment VARCHAR(117) not null; | WAEvent{ data: ["ALTER TABLE PRODUCT.CUSTOMER ADD c_comment VARCHAR(117) not null;"] metadata:{ "OperationName": "AlterColumns", "OperationSubName": "AddColumn", "TableName": "PRODUCT.CUSTOMER", "SchemaName": "PRODUCT", "OperationType": "DDL", "CDDLMetadata": “<Info about DDL>” } }; |
ALTER TABLE MODIFY COLUMN | ALTER TABLE PRODUCT.CUSTOMER ALTER COLUMN c_address TYPE VARCHAR(200); | WAEvent{ data: ["ALTER TABLE PRODUCT.CUSTOMER ALTER COLUMN c_address TYPE VARCHAR(200);"] metadata:{ "OperationName": "AlterColumns", "OperationSubName": "AlterColumn", "TableName": " PRODUCT.CUSTOMER", "SchemaName": "PRODUCT", "OperationType": "DDL", "CDDLMetadata": “<Info about DDL>” } }; |
ALTER TABLE DROP COLUMN | ALTER TABLE PRODUCT.CUSTOMER DROP COLUMN c_acctbal; | WAEvent{ data: ["ALTER TABLE PRODUCT.CUSTOMER DROP COLUMN c_acctbal;"] metadata:{ "OperationName": "AlterColumns", "OperationSubName": "DropColumn", "TableName": "PRODUCT.CUSTOMER", "SchemaName": "PRODUCT", "OperationType": "DDL", "CDDLMetadata": “<Info about DDL>” } }; |
DROP TABLE | Drop Table PRODUCT.CUSTOMER; | WAEvent{ data: ["DROP TABLE PRODUCT.CUSTOMER"] metadata:{ "OperationName": "Drop", "TableName": "HR.EMP", "SchemaName": "HR", "OperationType": "DDL", "CDDLMetadata": “<Info about DDL>” } }; |
Schema evolution known issues and limitations
The following are known issues in this release related to schema evolution. Additional known issues are flagged by "DEV-#####"in Handling schema evolution.
All sources
Columns with a data type that has a scale set to a negative value (for example, number(1, -17)
) are not supported.
Azure Synapse Writer
If using Optimized Merge mode, CREATE TABLE will cause the application to halt (DEV-29689).
Adding a NOT NULL constraint on a column that already has a UNIQUE constraint is not supported. (DEV-26158)
BigQuery Writer
When Streaming Upload is True and Mode is APPENDONLY, it may take BigQuery up to 90 minutes to apply the DDL update and be ready to accept more data. Do not insert, update, or delete records in the source before BigQuery is ready or the Striim application will halt.
MSJet
If a table is dropped and a table of the same name is created, the application may terminate. (DEV-26417)
The application will terminate if the database contains tables with names that vary only by case, for example,
id
andID
, even if those tables are not among those read by MSJet. (DEV-26872)CDDL Capture is not supported when using a downstream server (DEV-31641).
MySQL Reader
Adding a UNQUE constraint syntax with a system generated constraint name (for example,
ADD CONSTRAINT cs_01ec19f5f75caa91a1160eca1 UNIQUE (created_att)
is not supported. (DEV-26678)CDDL Capture is not supported when using a downstream server (DEV-31641).
OJet
CDDL Capture is not supported when using a downstream server (DEV-31641).
Columns of type ROWID are not supported.
Invisible, virtual, and unused columns are not supported.
Oracle Reader
Columns of type INTERVAL DAY(x) TO SECOND(y) are not supported. (DEV-24624).
Invisible, virtual, and unused columns are not supported.
PostgreSQL Reader
To capture DDL changes when the command has more than 1024 characters (for example, a CREATE TABLE statement with many columns), you must raise PostgreSQL's
track_activity_query_size
parameter from its default value of 1024. (DEV-24650)Creating a table with a column of type
serial
or adding a column of typeserial
is not supported.