Building pipelines with HubSpot Reader
You can read from HubSpot using the HubSpot Reader and write to any target supported by Striim. Typically, you will set up pipelines in two phases—initial load, followed by continuous incremental replication—as explained in Pipelines.
For initial load, use HubSpot Reader in Initial Load mode to create a point-in-time copy of selected objects at the target.
After initial load has completed, start continuous replication: read records in the same objects that are created or changed after the initial load began, and write those changes to the target. For near-real-time continuous replication of new source data, use the HubSpot Reader in Incremental mode.
You can use automated pipeline wizards to build HubSpot pipelines that Striim manages end-to-end, or you can manually create separate applications for initial load and continuous incremental replication and manage the lifecycle yourself. Since the HubSpot Reader supports both initial load and continuous replication, you can handle both with a single application in Automated mode.
Using an automated pipeline wizard: if you want to build near-real-time pipelines from HubSpot and write to a supported target, we recommend using an automated pipeline wizard with a HubSpot Reader source. These wizards perform the following steps automatically:
Create two applications: one for initial load and the other for continuous incremental replication (polling).
Create a schema and tables in the target that match the objects to be synced from HubSpot.
Run the initial-load application to copy existing data from HubSpot to the target.
When initial load completes, run the incremental application to replicate new or changed records using the configured incremental marker and polling interval.
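As a rough sketch, the incremental (polling) application the wizard generates reads along these lines in TQL. The object name, marker field, and interval below are illustrative values, not defaults, and the Mode value 'IncrementalLoad' is an assumption for an incremental-only source (the full sample later on this page uses Mode: 'Automated', which covers both phases):

```tql
-- Illustrative sketch of a wizard-generated incremental (polling) source.
-- 'Contacts', 'UpdatedAt', and '120s' are example values.
CREATE SOURCE hubspot_IncrementalSource USING Global.HubSpotReader (
  Mode: 'IncrementalLoad',            -- assumed name for incremental-only mode
  Tables: 'Contacts',
  IncrementalLoadMarker: 'UpdatedAt', -- the configured incremental marker
  PollingInterval: '120s',            -- the configured polling interval
  AuthMode: 'PrivateAppToken',
  PrivateAppToken: '********'
)
OUTPUT TO hubspot_IncrementalStream;
```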
Not using an automated pipeline wizard: if your use case or policies do not allow using an automated pipeline, create separate applications for initial load and continuous replication:
Before performing initial load, identify a stable, monotonically increasing watermark field for each object (for example, a last-modified timestamp such as Last_Modified_Date, when available).
Create a schema and tables in the target and perform initial load: use a wizard with a HubSpot Reader source. (Alternatively, you may pre-create target tables using native or third-party utilities.)
Perform an initial load when the schema and tables already exist in the target: use a wizard with a HubSpot Reader source configured for full export.
Switch from initial load to continuous replication: provide the last successful watermark value from the initial load as the starting offset.
Replicate new data: use a wizard with a HubSpot Reader source configured for incremental polling (set the incremental marker and polling interval). Configure the target for upsert semantics using appropriate key columns.
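The handoff in the last two steps amounts to starting the incremental source from the last watermark observed during initial load. A minimal sketch in TQL follows; the literal timestamp, the use of StartPosition to carry the watermark, and the Mode value are illustrative assumptions (the full sample later on this page shows the actual property set):

```tql
-- Illustrative: resume incremental replication from the initial load's
-- last successful watermark. The timestamp value is an example only.
CREATE SOURCE hubspot_ResumeSource USING Global.HubSpotReader (
  Mode: 'IncrementalLoad',                -- assumed incremental-only mode name
  Tables: 'Companies',
  IncrementalLoadMarker: 'UpdatedAt',
  StartPosition: '2024-05-01T00:00:00Z',  -- assumed: watermark from initial load
  PollingInterval: '120s',
  AuthMode: 'PrivateAppToken',
  PrivateAppToken: '********'
)
OUTPUT TO hubspot_ResumeStream;
```

Configuring the target writer for upsert semantics (rather than append-only) on the objects' key columns ensures that records re-read around the watermark boundary do not create duplicates.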
Alternatively, instead of using wizards, you can create applications using Flow Designer, TQL, or Striim’s REST API.
Prerequisite: the initial setup and configuration you do in HubSpot are described in the Initial setup section.
Create a HubSpot Reader application using the Flow Designer
This procedure outlines how to use Striim’s Flow Designer to build and configure data pipelines. Flow Designer enables you to visually create applications with minimal or no coding.
Go to the Apps page in the Striim UI and click Start from scratch.
Provide the Name and Namespace for your app. The namespace helps organize related apps.
In the component section, expand Sources, and enter a keyword such as HubSpot Reader in the search field to filter available sources.
Select the desired source (HubSpot).
In the properties panel, provide the properties for the reader (for example, Mode, Tables, Polling interval, and authentication). If you created a connection profile, set Use connection profile to True and select the profile by name. Click Save to complete the properties configuration.
Drag and drop processors, enrichers, and targets to complete your pipeline logic.
Deploy and start the application to begin data flow.
Create a HubSpot Reader application using TQL
The following sample TQL uses Striim’s HubSpot Reader to read from a HubSpot object and write to Snowflake.
CREATE APPLICATION hubspot;

CREATE FLOW hubspot_SourceFlow;

CREATE OR REPLACE SOURCE hubspot_Source USING Global.HubSpotReader (
  StartPosition: '%=0',
  ThreadPoolCount: '0',
  Tables: 'Companies',
  FetchSize: '100',
  PollingInterval: '120s',
  AuthMode: 'PrivateAppToken',
  PrivateAppToken: 'example',
  ClientId: '',
  PrivateAppToken_encrypted: 'true',
  IncrementalLoadMarker: 'UpdatedAt',
  adapterName: 'HubSpotReader',
  MigrateSchema: true,
  Mode: 'Automated'
)
OUTPUT TO hubspot_OutputStream;

END FLOW hubspot_SourceFlow;

CREATE OR REPLACE TARGET Snowflake_hubspot_Target USING Global.SnowflakeWriter (
  streamingUpload: 'false',
  password_encrypted: 'true',
  tables: '%,TEST.%',
  appendOnly: 'true',
  CDDLAction: 'Process',
  azureContainerName: 'striim-snowflake-container',
  StreamingConfiguration: 'MaxParallelRequests=5, MaxRequestSizeInMB=5, MaxRecordsPerRequest=10000',
  authenticationType: 'Password',
  optimizedMerge: 'false',
  connectionUrl: 'jdbc:snowflake://striim_partner.east-us-2.azure.snowflakecomputing.com/?db=DEMO_DB',
  columnDelimiter: '|',
  s3Region: 'us-west-1',
  username: 'user',
  password: 'example',
  uploadPolicy: 'eventcount:1000,interval:60s',
  externalStageType: 'Local',
  adapterName: 'SnowflakeWriter',
  fileFormatOptions: 'null_if = \"\"',
  s3BucketName: 'striim-snowflake-bucket'
)
INPUT FROM hubspot_OutputStream;

END APPLICATION hubspot;