Using the schema conversion utility
When copying or continuously replicating tabular data from one database to another, Striim usually requires that the target contain corresponding tables. The schema conversion utility analyzes the source tables and generates one or more DDL scripts to create compatible target tables. This DDL includes primary keys, foreign keys, UNIQUE constraints, and NOT NULL constraints. You must create other objects, such as indexes and CHECK constraints, in the target manually.
In this release, the supported DBMSes and data warehouses are:
BigQuery (target only)
Databricks (target only)
Note: if you are using Databricks' Unity Catalog, after generating the scripts, edit their create table statements to add your catalog name before the database name. For example, if your catalog is named myunitycatalog, change create table hr.emp to create table myunitycatalog.hr.emp (see the example after this list).
MariaDB
MySQL
Oracle Database
PostgreSQL
Snowflake (target only)
Spanner (target only)
SQL Server
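To illustrate the Unity Catalog note above, the edit looks like this (myunitycatalog and hr.emp are the example names from the note; the column list is elided):
-- statement as generated by the utility (column list elided)
create table hr.emp (...);
-- after editing to prepend the Unity Catalog name
create table myunitycatalog.hr.emp (...);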
For source database versions supported, see MySQL Reader properties, Oracle Reader properties, PostgreSQL Reader properties, or MS SQL Reader properties.
The conversion utility is striim/bin/schemaConversionUtility.sh or, in Windows, schemaConversionUtility.bat. Its syntax is:
schemaConversionUtility.sh -s="<source database type>" -d="<source database connection URL>" -u="<source database user name>" -p="<source database password>" -b="<source tables>" -e="<excluded tables>" -t="<target database type>" [ -f="<exclude foreign keys>" ] [ -c="<SSL configuration string>" ] [ -fp="<path to startUp.properties>" ]
You may also use the -h option to display help text, including verbose versions of the options.
Specify the options as follows:
-s: mariadb, mysql, oracle, postgres, or sqlserver
-d: see discussion of the Connection URL property in MySQL Reader properties, Oracle Reader properties, PostgreSQL Reader properties, or MS SQL Reader properties
-u: a user name for the source specified in -d that has the roles, permissions, or privileges necessary to read the DDL for the tables specified in -b, specifically:
MariaDB / MySQL: SELECT ON *.*
Oracle: CONNECT, SELECT ANY TABLE
PostgreSQL: CONNECT ON DATABASE <database name>, USAGE ON SCHEMA <schema name>
SQL Server: VIEW ANY DEFINITION
-p: the password for the user specified in -u
-b: the source tables to convert. The % wildcard is supported for table names only, for example, myschema.% or myschema.prefix%. See also discussion of the Tables property in MySQL Reader properties, Oracle Reader properties, PostgreSQL Reader properties, or MS SQL Reader properties. Three-part names are not supported.
-e: see discussion of the Excluded Tables property in MySQL Reader properties, Oracle Reader properties, PostgreSQL Reader properties, or MS SQL Reader properties. -e is subject to the same limitations as -b.
-t: bigquery, deltalake (for Databricks), mariadb, mysql, oracle, postgres, snowflake, spanner (requires 3.10.1.1 or later), or sqlserver
-f: set to true to exclude foreign keys from the generated DDL. If you do not specify -f="true", foreign keys are included (see example below).
-c: if -s is oracle and the source database uses SSL, see discussion of the SSL Config property in Oracle Reader properties
-fp: specify the path to startUp.properties if it is not in its default location, striim/conf
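As an illustration, here is a hypothetical invocation that combines several of these options; the connection URL, credentials, and table names below are placeholders, not values from this guide:
striim/bin/schemaConversionUtility.sh -s="oracle" -d="jdbc:oracle:thin:@localhost:1521:orcl" -u="scott" -p="******" -b="SCOTT.%" -e="SCOTT.TEMP%" -t="postgres" -f="true"
Because -f="true" is specified, the generated DDL will omit foreign keys.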
For example, to generate DDL for BigQuery tables from the PostgreSQL source tables discussed in Running the CDC demo apps:
striim/bin/schemaConversionUtility.sh -s="postgres" -d="jdbc:postgresql://localhost:5432/webaction" -u="striim" -p="striim" -b="public.%" -t="bigquery"
The output will look something like:
SCHEMA CONVERSION RESULTS
Schema name - public
Number of compatible tables - 4
Number of tables compatible with Striim Intelligence - 3
Number of incompatible tables - 0
The resultant output SQL files, and the report of the schema conversion are located at the folder: ...
The output folder will contain three files: conversion_report.txt, converted_tables.sql, and converted_tables_with_striim_intelligence.sql, containing the following:
SCHEMA CONVERSION REPORT
Generated at: 2020-07-01 12:44:46-0700
Topology: postgres -> bigquery

Table Conversion Score:
- Tables converted successfully: 57.14%
- Tables converted successfully with Striim Intelligence: 42.86%
- Tables not converted: 0.0%

Table list - Successful Conversion (4)
- public.nation
- public.regiontarget
- public.nationtarget
- public.region

Table list - Successful Converted with Striim Intelligence (3)
- public.customertarget - numeric : c_acctbal
- public.chkpoint - bytea : sourceposition
- public.customer - numeric : c_acctbal

Table list - Failed Conversion (0)
Foreign key list - Successful Conversion (0)
Foreign key list - Failed Conversion (0)
The schema conversion report tells you that four tables converted successfully, three were converted using "Striim intelligence" (best-guess data type mapping), and none failed.
converted_tables.sql contains a script to create the four successfully converted tables:
-- SCHEMA NAME - public
-- TABLE NAME - public.nation
CREATE TABLE `public`.`nation`(`n_nationkey` int64 NOT NULL,`n_name` string,`n_regionkey` int64 NOT NULL,`n_comment` string);
-- TABLE NAME - public.regiontarget
CREATE TABLE `public`.`regiontarget`(`r_regionkey` int64 NOT NULL,`r_name` string,`r_comment` string);
-- TABLE NAME - public.nationtarget
CREATE TABLE `public`.`nationtarget`(`n_nationkey` int64 NOT NULL,`n_name` string,`n_regionkey` int64 NOT NULL,`n_comment` string);
-- TABLE NAME - public.region
CREATE TABLE `public`.`region`(`r_regionkey` int64 NOT NULL,`r_name` string,`r_comment` string);
converted_tables_with_striim_intelligence.sql contains a script to create the converted tables along with comments detailing data type mappings that may not be appropriate. (The chkpoint table is used by Database Writer to store information used in recovery, so you should not create it in BigQuery.)
-- SCHEMA NAME - public
-- TABLE NAME - public.customertarget
-- Target data type string might not be the best fit for the incoming data type numeric of the column c_acctbal.
CREATE TABLE `public`.`customertarget`(`c_custkey` int64 NOT NULL,`c_name` string,`c_address` string,`c_nationkey` int64 NOT NULL,`c_phone` string,`c_acctbal` string,`c_mktsegment` string,`c_comment` string);
-- TABLE NAME - public.chkpoint
-- Target data type bytes might not be the best fit for the incoming data type bytea of the column sourceposition.
CREATE TABLE `public`.`chkpoint`(`id` string NOT NULL,`sourceposition` bytes,`pendingddl` numeric,`ddl` string);
-- TABLE NAME - public.customer
-- Target data type string might not be the best fit for the incoming data type numeric of the column c_acctbal.
CREATE TABLE `public`.`customer`(`c_custkey` int64 NOT NULL,`c_name` string,`c_address` string,`c_nationkey` int64 NOT NULL,`c_phone` string,`c_acctbal` string,`c_mktsegment` string,`c_comment` string);
When Striim is unable to convert one or more tables, the output will also include conversion_failed_tables.sql. For example:
-- SCHEMA NAME - SCOTT
-- TABLE NAME - SCOTT.OFFICE_LOCATIONS
-- Source data type SDO_GEOMETRY of the column COORDS is not supported, defaulting to SDO_GEOMETRY_STRIIM_UNKNOWN
CREATE TABLE "SCOTT"."OFFICE_LOCATIONS"("LOCATIONID" numeric(38,0) NOT NULL,
"COORDS" SDO_GEOMETRY_STRIIM_UNKNOWN,"CITY" VARCHAR(100),"ZIPCODE" VARCHAR(10),
"COUNTRY" VARCHAR(100),PRIMARY KEY("LOCATIONID"));
Modify the commands in this script as necessary to resolve the problems.
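For instance, one way to repair the failed table above is to replace the placeholder type with a type your target supports. This is only a sketch: it assumes you are willing to store the geometry as text, and the VARCHAR length is an arbitrary choice:
-- COORDS changed from SDO_GEOMETRY_STRIIM_UNKNOWN to a text column (hypothetical mapping)
CREATE TABLE "SCOTT"."OFFICE_LOCATIONS"("LOCATIONID" numeric(38,0) NOT NULL,
"COORDS" VARCHAR(4000),"CITY" VARCHAR(100),"ZIPCODE" VARCHAR(10),
"COUNTRY" VARCHAR(100),PRIMARY KEY("LOCATIONID"));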
When the source table definitions include foreign keys, the output will also include converted_foreign_keys.sql. For example, with these Oracle source tables:
CREATE TABLE SCOTT.EMP (
  EMPNO NUMBER(4) CONSTRAINT PK_EMP PRIMARY KEY,
  ENAME VARCHAR2(10),
  JOB VARCHAR2(9),
  MGR NUMBER(4),
  HIREDATE DATE,
  HIREDMONTHS INTERVAL YEAR TO MONTH,
  SAL NUMBER(7,2),
  COMM NUMBER(7,2),
  DEPTNO NUMBER(2) CONSTRAINT FK_DEPTNO REFERENCES SCOTT.DEPT(DEPTNO)
);
CREATE TABLE SCOTT.EMP_LAPTOP (
  LAPTOPID INTEGER CONSTRAINT PK_EMP_LAPTOP PRIMARY KEY,
  EMPID INTEGER CONSTRAINT FK_EMP_LAPTOP REFERENCES SCOTT.EMP (EMPNO),
  OBJECTNAME VARCHAR2(200),
  DATEPROVIDED TIMESTAMP
);
converted_foreign_keys.sql will include something like:
-- SCHEMA NAME - SCOTT
-- CONSTRAINT NAME - FK_DEPTNO
-- PARENT TABLE - SCOTT.EMP
-- REFERENCE TABLE - SCOTT.DEPT
ALTER TABLE SCOTT.EMP ADD CONSTRAINT FK_DEPTNO FOREIGN KEY(DEPTNO)
REFERENCES SCOTT.DEPT(DEPTNO) ON UPDATE NO ACTION ON DELETE NO ACTION;
-- CONSTRAINT NAME - FK_EMP_LAPTOP
-- PARENT TABLE - SCOTT.EMP_LAPTOP
-- REFERENCE TABLE - SCOTT.EMP
ALTER TABLE SCOTT.EMP_LAPTOP ADD CONSTRAINT FK_EMP_LAPTOP FOREIGN KEY(EMPID)
REFERENCES SCOTT.EMP(EMPNO) ON UPDATE NO ACTION ON DELETE NO ACTION;
This script should be run after your initial load is complete.
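For example, if your target database were PostgreSQL, you could apply the script with psql once the initial load has finished (the host, database, and user names here are placeholders):
psql -h <target host> -d <target database> -U <target user> -f converted_foreign_keys.sql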