Alok Pareek, EVP of Products at Striim, and Codin Pora, Director of Partner Technology at Striim, provide a demo of the Striim platform at Google Cloud Next SF, April 2019. Alok goes into detail about how Google Cloud users can move real-time data from a variety of sources into their Google Cloud Spanner environment using the Striim platform.
Unedited Transcript:
So with that, I’d like to invite Alok and Codin up to the stage to give us a demo of Spanner. Their company, Striim, is a strategic partner of ours that basically does replication and migration of data into Google Cloud. Thank you. Thank you.
Thank you, Tobias. So today I’m going to show a demonstration. You have these wonderful endpoints on the Google Cloud. How do you actually use them? How do you actually move your data into them? And I’m going to talk about in this demo how we move real-time data from your applications from an on-premise Oracle database into Cloud Spanner. So before I get into the demo, just a little bit about Striim. Striim is the next-generation platform that helps in three solution categories. These are cloud adoption, hybrid cloud data integration, and in-memory stream processing. Today I’m going to be focusing on the cloud adoption, specifically, how do we move data into Spanner? So with that, we’re going to jump into the demo.
Okay. So what you see on the screen is the landing page. And I’m gonna keep this going pretty fast. We’re going to step into the apps part of the demo. That’s where the data pipelines are defined that help you move the data from on premise to Spanner. In this case, what you are seeing, there are two pipelines. One of them is meant to do an initial load or an instantiation of your existing data onto Cloud Spanner tables. And the other one is meant to catch it up. So while you are actually moving the data, you might have very large tables, for example, or massive volumes. So how do you actually go ahead and not lose any data? And all of the consistency things that we heard about from Tobias earlier.
It’s important that while you are moving the data, you also don’t have disruption to your applications and to your business. So let’s step into the pipeline here. So this is a very simple pipeline. It actually has a simple flow. You have at the top a data source, which is in this case Oracle, and it’s running on premise. So we connect into this Oracle database. It has a line items table. We’re going to show you a movement of about a hundred thousand records. And also there’s an orders table where we’re going to show you the delta processing. The way this application is constructed is by using these components on the left side of the UI in the flow designer: you drag and drop one of these things and you push them into the pipeline.
And that’s how you actually construct your data flow. And once we actually go, we can also step into the Spanner target definition, and this is your service account and the connectivity and the config for your Spanner. We’re gonna next deploy this application or the pipeline, and once we deploy it, this is where you can sort of see that I can actually run this within the Striim platform. This can be run either on premise or on the Google Cloud. We want to probably show, Codin, that there’s nothing available yet in the tables on the Spanner side. So let’s go ahead and execute a query against the line items table. And in this case you’re seeing that there are zero records there, and you can take my word that there are a hundred thousand records on the Oracle side.
In the interest of time we’ll assume that, and let’s go ahead and run the application. And as soon as we run the application you can see that in the preview in the lower part of your screen, you can actually see the records running live. This is while we are uploading the data and applying them into Cloud Spanner. You can see that we have completed 100,000 records and it was pretty fast. This morning I’d done a million records so I was holding my breath there, but that was pretty fast as well. So now you can see that the data part is completed. I mentioned to you that there’s a second phase here. That’s the change data capture phase. So this is while you’re actually executing this query. Of course, this query is consistent as of a specific snapshot.
On Oracle, there’s also DML activity against your application. So how do we actually take this data? This is the second pipeline now, so we can step into pipeline number two. Codin has already deployed it, and in this case we use a special reader that actually operates against the redo logs of the Oracle database and actually monitors them. So it doesn’t actually have any impact on the production system per se, impact as in, like, it’s at least not doing any query impact there. We grab the data from the redo logs and then we are going to reapply that as DML, as inserts, updates and so forth on the Cloud Spanner system. So let’s go ahead and run this application. We are going to generate some DML using a data generator.
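[Editor's sketch] The idea of reading changes from a log and reapplying them as inserts, updates and deletes on the target can be illustrated roughly as follows. This is a minimal sketch, not Striim's reader or the Cloud Spanner API: the target is just an in-memory dictionary keyed by primary key, and `ChangeEvent` and `apply_changes` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record, as it might be read from a database log."""
    op: str                      # "INSERT", "UPDATE" or "DELETE"
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # changed column values; None for DELETE


def apply_changes(target: Dict[int, dict],
                  events: Iterable[ChangeEvent]) -> Dict[int, dict]:
    """Replay change events, in order, against an in-memory stand-in for the target table."""
    for ev in events:
        if ev.op == "INSERT":
            target[ev.key] = dict(ev.row)
        elif ev.op == "UPDATE":
            # Assumption: an update carries only the changed columns,
            # so merge them into whatever row is already on the target.
            target[ev.key] = {**target.get(ev.key, {}), **ev.row}
        elif ev.op == "DELETE":
            target.pop(ev.key, None)
        else:
            raise ValueError(f"unknown operation: {ev.op}")
    return target


orders = apply_changes({}, [
    ChangeEvent("INSERT", 1, {"status": "NEW", "qty": 2}),
    ChangeEvent("UPDATE", 1, {"status": "SHIPPED"}),
    ChangeEvent("INSERT", 2, {"status": "NEW", "qty": 1}),
    ChangeEvent("DELETE", 2),
])
```

Applying events strictly in log order matters here: replaying the update before the insert, or the delete before the insert, would leave the target in a different state than the source.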
And let’s go ahead and run the generator and you’ll see that there’s a number of inserts, updates and deletes against the orders table. And now let’s switch over to the Cloud Spanner system and query the orders table here. As you can see, there’s data in the orders table. This was also something that was just propagated. So this is sort of like the two-phase, very fast demo of how you get data from your on-prem databases into Cloud Spanner. And of course this can work against other databases that we support as well. And this is available in the Google Cloud. So with that, I’m gonna hand the control back to Tobias.
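[Editor's sketch] The two-phase approach described in the demo, an initial load that is consistent as of a snapshot followed by a catch-up phase that applies later changes, can be sketched as below. This is an illustration under stated assumptions, not how Striim implements it: each log entry is assumed to carry a monotonically increasing sequence number (the hypothetical `seq` field), updates are assumed to carry the full new row, and `initial_load` and `catch_up` are made-up names.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical log entry: (seq, op, key, row), with row=None for deletes.
LogEntry = Tuple[int, str, int, Optional[dict]]


def initial_load(snapshot: Dict[int, dict]) -> Dict[int, dict]:
    """Phase 1: copy every row that existed in the source as of the snapshot."""
    return {key: dict(row) for key, row in snapshot.items()}


def catch_up(target: Dict[int, dict], log: List[LogEntry], snapshot_seq: int) -> int:
    """Phase 2: apply only changes made after the snapshot; return the last applied seq."""
    last_applied = snapshot_seq
    for seq, op, key, row in sorted(log, key=lambda entry: entry[0]):
        if seq <= snapshot_seq:
            continue  # already reflected in the snapshot, so skip to avoid double-applying
        if op == "DELETE":
            target.pop(key, None)
        else:
            # INSERT and UPDATE both write the full row image (an assumption of this sketch).
            target[key] = dict(row)
        last_applied = seq
    return last_applied


# Snapshot taken at seq 10; the source keeps changing while the load runs.
snapshot_rows = {1: {"qty": 5}}
change_log: List[LogEntry] = [
    (9, "INSERT", 1, {"qty": 5}),    # before the snapshot, already loaded
    (11, "UPDATE", 1, {"qty": 7}),
    (12, "INSERT", 2, {"qty": 1}),
    (13, "DELETE", 2, None),
]

target_table = initial_load(snapshot_rows)
last_seq = catch_up(target_table, change_log, snapshot_seq=10)
```

Filtering on the snapshot position is what lets the source stay online during the load: nothing committed after the snapshot is lost, and nothing already in the snapshot is applied twice.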