
Using the schema conversion utility

When copying or continuously replicating tabular data from one database to another, Striim usually requires that the target contain corresponding tables. The schema conversion utility analyzes the source tables and generates one or more DDL scripts that create compatible target tables. The generated DDL includes primary keys, foreign keys, UNIQUE constraints, and NOT NULL constraints. Other objects, such as indexes and CHECK constraints, must be created in the target manually.

In this release, the supported DBMSes and data warehouses are:

  • BigQuery (target only)

  • Databricks (target only)

    Note: if you are using Databricks' Unity Catalog, after generating the scripts, edit their create table statements to add your catalog name before the database name. For example, if your catalog is named myunitycatalog, change create table hr.emp to create table myunitycatalog.hr.emp.

  • MariaDB

  • MySQL

  • Oracle Database

  • PostgreSQL

  • Snowflake (target only)

  • Spanner (target only)

  • SQL Server
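
The Unity Catalog edit described in the Databricks note above can be scripted rather than done by hand. A sketch using sed; the catalog name myunitycatalog, the sample statement, and the file names are illustrative, and you should review the result before running it:

```shell
# Sample line as emitted by the utility (illustrative):
printf 'create table hr.emp (id int64);\n' > converted_tables.sql

# Prepend the catalog name to every "create table <database>.<table>"
# statement, writing the result to a new file.
sed 's/create table /create table myunitycatalog./g' \
  converted_tables.sql > converted_tables_unity.sql

cat converted_tables_unity.sql   # -> create table myunitycatalog.hr.emp (id int64);
```

This relies on the generated scripts using lowercase create table consistently; if yours differ, adjust the pattern accordingly.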

For source database versions supported, see MySQL Reader properties, Oracle Reader properties, PostgreSQL Reader properties, or MS SQL Reader properties.

The conversion utility is striim/bin/schemaConversionUtility.sh (on Windows, schemaConversionUtility.bat). Its syntax is:

schemaConversionUtility.sh
  -s="<source database type>"
  -d="<source database connection URL>" 
  -u="<source database user name>"
  -p="<source database password>"
  -b="<source tables>" 
  -e="<excluded tables>"
  -t="<target database type>"
  [ -f="<exclude foreign keys>" ]
  [ -c="<SSL configuration string>" ]
  [ -fp="<path to startUp.properties>" ]

You may also use the -h option to display help text, including verbose versions of the options.

Specify the options as follows:

  • -s: mariadb, mysql, oracle, postgres, or sqlserver

  • -d: see discussion of the Connection URL property in MySQL Reader properties, Oracle Reader properties, PostgreSQL Reader properties, or MS SQL Reader properties

  • -u: a user name for the source specified in -d that has the roles, permissions, or privileges necessary to read the DDL for the tables specified in -b, specifically:

    • MariaDB / MySQL: SELECT ON *.*

    • Oracle: CONNECT, SELECT ANY TABLE

    • PostgreSQL: CONNECT ON DATABASE <database name>, USAGE ON SCHEMA <schema name>

    • SQL Server: VIEW ANY DEFINITION

  • -p: the password for the user specified in -u (special characters such as $ may need to be escaped per the requirements of your operating system or shell)

  • -b: the source tables to convert

    The % wildcard is supported for table names only, for example, myschema.% or myschema.prefix%. See also discussion of the Tables property in MySQL Reader properties, Oracle Reader properties, PostgreSQL Reader properties, or MS SQL Reader properties.

    Three-part names are not supported.

  • -e: see discussion of the Excluded Tables property in MySQL Reader properties, Oracle Reader properties, PostgreSQL Reader properties, or MS SQL Reader properties

    -e is subject to the same limitations as -b.

  • -t: bigquery, deltalake (for Databricks), mariadb, mysql, oracle, postgres, snowflake, spanner (requires 3.10.1.1 or later), or sqlserver

  • -f: set to true to exclude foreign keys from the generated DDL. If you do not specify -f="true", foreign keys are included (see example below).

  • -c: if -s is oracle and the source database uses SSL, see discussion of the SSL Config property in Oracle Reader properties

  • -fp: specify the path to startUp.properties if it is not in its default location, striim/conf
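
Because the shell interprets characters such as $ before the utility ever sees them, it is usually safest to wrap a -p value containing such characters in single quotes. A minimal sketch of the difference; the password pa$sword and the variable sword are hypothetical:

```shell
sword=xyz   # imagine an unrelated variable happens to be set in your environment

# In double quotes the shell substitutes $sword, silently mangling the password;
# in single quotes the value is passed through literally.
printf '%s\n' "pa$sword"   # shell expands $sword -> prints "paxyz"
printf '%s\n' 'pa$sword'   # literal              -> prints "pa$sword"
```

So for such a password you would pass -p='pa$sword' rather than -p="pa$sword"; passwords without shell metacharacters are fine in double quotes as shown in the syntax above.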

For example, to generate DDL for BigQuery tables from the PostgreSQL source tables discussed in Running the CDC demo apps:

striim/bin/schemaConversionUtility.sh -s="postgres" -d="jdbc:postgresql://localhost:5432/webaction"
  -u="striim" -p="striim" -b="public.%" -t="bigquery"

The output will look something like:

SCHEMA CONVERSION RESULTS - 
Schema name - public
Number of compatible tables - 4
Number of tables compatible with Striim Intelligence - 3
Number of incompatible tables - 0
The resultant output SQL files, and the report of the schema conversion are located at the folder: ... 

The output folder will contain three files: conversion_report.txt, converted_tables.sql, and converted_tables_with_striim_intelligence.sql, containing the following:

SCHEMA CONVERSION REPORT
Generated at: 2020-07-01 12:44:46-0700

Topology: postgres -> bigquery

Table Conversion Score:
    - Tables converted successfully: 57.14%
    - Tables converted successfully with Striim Intelligence: 42.86%
    - Tables not converted: 0.0%

Table list - Successful Conversion (4)
    - public.nation
    - public.regiontarget
    - public.nationtarget
    - public.region

Table list - Successful Converted with Striim Intelligence (3)
    - public.customertarget
        - numeric : c_acctbal
    - public.chkpoint
        - bytea : sourceposition
    - public.customer
        - numeric : c_acctbal

Table list - Failed Conversion (0)

Foreign key list - Successful Conversion (0)

Foreign key list - Failed Conversion (0)

The schema conversion report tells you that four tables converted successfully, three were converted using Striim Intelligence (best-guess data type mapping), and none failed.

converted_tables.sql contains a script to create the four successfully converted tables:

-- SCHEMA NAME - public
-- TABLE NAME - public.nation
CREATE TABLE `public`.`nation`(`n_nationkey` int64 NOT NULL,`n_name` string,
`n_regionkey` int64 NOT NULL,`n_comment` string);
-- TABLE NAME - public.regiontarget
CREATE TABLE `public`.`regiontarget`(`r_regionkey` int64 NOT NULL,
`r_name` string,`r_comment` string);
-- TABLE NAME - public.nationtarget
CREATE TABLE `public`.`nationtarget`(`n_nationkey` int64 NOT NULL,`n_name` string,
`n_regionkey` int64 NOT NULL,`n_comment` string);
-- TABLE NAME - public.region
CREATE TABLE `public`.`region`(`r_regionkey` int64 NOT NULL,`r_name` string,
`r_comment` string);

converted_tables_with_striim_intelligence.sql contains a script to create the converted tables, along with comments detailing data type mappings that may not be appropriate. (The chkpoint table is used by Database Writer to store information used in recovery, so you should not create it in the target.)

-- SCHEMA NAME - public
-- TABLE NAME - public.customertarget
-- Target data type string might not be the best fit for the incoming data type numeric of the column c_acctbal.
CREATE TABLE `public`.`customertarget`(`c_custkey` int64 NOT NULL,`c_name` string,
`c_address` string,`c_nationkey` int64 NOT NULL,`c_phone` string,`c_acctbal` string,
`c_mktsegment` string,`c_comment` string);
-- TABLE NAME - public.chkpoint
-- Target data type bytes might not be the best fit for the incoming data type bytea of the column sourceposition.
CREATE TABLE `public`.`chkpoint`(`id` string NOT NULL,`sourceposition` bytes,
`pendingddl` numeric,`ddl` string);
-- TABLE NAME - public.customer
-- Target data type string might not be the best fit for the incoming data type numeric of the column c_acctbal.
CREATE TABLE `public`.`customer`(`c_custkey` int64 NOT NULL,`c_name` string,`c_address` string,
`c_nationkey` int64 NOT NULL,`c_phone` string,`c_acctbal` string,`c_mktsegment` string,
`c_comment` string);
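
Since the chkpoint table should not be re-created in the target, you may want to strip its statement from the script before running it. A sketch using awk, operating on a small stand-in input file (the file contents and names here are illustrative):

```shell
# Build a small stand-in for the generated script (illustrative content):
cat > converted_tables_with_striim_intelligence.sql <<'EOF'
-- TABLE NAME - public.customer
CREATE TABLE `public`.`customer`(`c_custkey` int64 NOT NULL,
`c_name` string);
-- TABLE NAME - public.chkpoint
CREATE TABLE `public`.`chkpoint`(`id` string NOT NULL,
`ddl` string);
-- TABLE NAME - public.region
CREATE TABLE `public`.`region`(`r_regionkey` int64 NOT NULL);
EOF

# Skip everything from the chkpoint comment through the statement's
# terminating semicolon; print all other lines unchanged.
awk '/TABLE NAME - public\.chkpoint/{skip=1}
     skip==0{print}
     skip==1 && /;[[:space:]]*$/{skip=0}' \
  converted_tables_with_striim_intelligence.sql > script_without_chkpoint.sql
```

This assumes each CREATE TABLE statement ends with a semicolon at the end of a line, as in the generated scripts shown above; verify the result before running it against the target.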

When Striim is unable to convert one or more tables, the output will also include conversion_failed_tables.sql. For example:

-- SCHEMA NAME - SCOTT
-- TABLE NAME - SCOTT.OFFICE_LOCATIONS
-- Source data type SDO_GEOMETRY of the column COORDS is not supported, defaulting to SDO_GEOMETRY_STRIIM_UNKNOWN
CREATE TABLE "SCOTT"."OFFICE_LOCATIONS"("LOCATIONID" numeric(38,0) NOT NULL,
  "COORDS" SDO_GEOMETRY_STRIIM_UNKNOWN,"CITY" VARCHAR(100),"ZIPCODE" VARCHAR(10),
  "COUNTRY" VARCHAR(100),PRIMARY KEY("LOCATIONID"));

Modify the commands in this script as necessary to resolve the problems.
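
One common fix is to substitute a supported target type for the placeholder. A sketch using sed; the sample line is a stand-in for the generated script, and the replacement type VARCHAR(4000) is only an example — choose whatever type actually fits your data:

```shell
# Stand-in for part of the failed-conversion script (illustrative content):
printf '"COORDS" SDO_GEOMETRY_STRIIM_UNKNOWN,"CITY" VARCHAR(100),\n' > conversion_failed_tables.sql

# Substitute a target type of your choosing for the placeholder;
# review each affected column case by case.
sed 's/SDO_GEOMETRY_STRIIM_UNKNOWN/VARCHAR(4000)/g' \
  conversion_failed_tables.sql > conversion_fixed_tables.sql

cat conversion_fixed_tables.sql   # -> "COORDS" VARCHAR(4000),"CITY" VARCHAR(100),
```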

When the source table definitions include foreign keys, the output will also include converted_foreign_keys.sql. For example, with these Oracle source tables:

CREATE TABLE SCOTT.EMP (
    EMPNO NUMBER(4) CONSTRAINT PK_EMP PRIMARY KEY,
    ENAME VARCHAR2(10),
    JOB VARCHAR2(9),
    MGR NUMBER(4),
    HIREDATE DATE,
    HIREDMONTHS INTERVAL YEAR TO MONTH,
    SAL NUMBER(7,2),
    COMM NUMBER(7,2),
    DEPTNO NUMBER(2) CONSTRAINT FK_DEPTNO REFERENCES SCOTT.DEPT(DEPTNO)
);
CREATE TABLE SCOTT.EMP_LAPTOP (
  LAPTOPID      INTEGER CONSTRAINT PK_EMP_LAPTOP PRIMARY KEY,
  EMPID         INTEGER   CONSTRAINT FK_EMP_LAPTOP REFERENCES SCOTT.EMP (EMPNO),
  OBJECTNAME    VARCHAR2(200),
  DATEPROVIDED  TIMESTAMP
);

converted_foreign_keys.sql will include something like:

-- SCHEMA NAME - SCOTT
-- CONSTRAINT NAME - FK_DEPTNO  PARENT TABLE - SCOTT.EMP  REFERENCE TABLE - SCOTT.DEPT
ALTER TABLE SCOTT.EMP ADD CONSTRAINT FK_DEPTNO FOREIGN KEY(DEPTNO)
  REFERENCES SCOTT.DEPT(DEPTNO) ON UPDATE NO ACTION ON DELETE NO ACTION;
-- CONSTRAINT NAME - FK_EMP_LAPTOP  PARENT TABLE - SCOTT.EMP_LAPTOP  REFERENCE TABLE - SCOTT.EMP
ALTER TABLE SCOTT.EMP_LAPTOP ADD CONSTRAINT FK_EMP_LAPTOP FOREIGN KEY(EMPID)
  REFERENCES SCOTT.EMP(EMPNO) ON UPDATE NO ACTION ON DELETE NO ACTION;

This script should be run after your initial load is complete, so that the foreign key constraints do not fail while the target tables are still being populated.