Hi everyone. My name is Michael Gualtieri, principal analyst at Forrester Research. And I want to talk to you today about streaming analytics, sometimes synonymously known as real-time analytics.
And really what this means is analyzing data, finding insights, and acting on those insights in real time. Okay, so I’d like to start this out from a business perspective: what are the priorities that we’re seeing from our survey data? You can see here, we surveyed 3,000 global data and analytics decision makers to ask them what they’re seeing in their organizations as top priorities. And you can see that improving the experience of their customers is number one. Okay. Closely followed by improving their products and services. Of course you also have the likely suspects of reducing costs there, but there’s also the ability to innovate and addressing rising customer expectations. So you can see that there’s a theme there: improving the experience of customers. And the reason for that is clear. Companies want to gain new customers, retain them, increase loyalty.
And this is really getting back to something that we’ve always had, which is personal relationships with customers. Way back when, someone walked into that corner store and that shopkeeper knew you and knew what you needed, and knew personal things about you, and that created a really intense, individualized customer experience. We moved quickly to mass production. We sort of got away from that in the era of mass production. We tried to get back to it with CRM, customer relationship management, by segmenting customers. And the reason why we’re talking about big data and analytics now is that we can use the data and the connectivity of customers to create hyper-personal, real-time digital experiences that are equal to those personal relationship experiences. And in some cases better, because it can scale through the use of analytics. So this whole trend, okay, at Forrester we call the age of the customer.
I just net it out. I call it celebrity, because really what it means is that customers want and increasingly expect to be treated like celebrities. And how do you create celebrities? Well, we know everything about them, don’t we? So if George Clooney walked into the Cheesecake Factory, they wouldn’t just give him any old menu. When you walk in there, we all get the same menu. Why? Why can’t it be more individualized? For George Clooney, they would prepare a menu, they’d understand what he wants. And imagine, too, if he had a wearable device and it was even connected, so their menuing system determined, yeah, he ran four miles, and therefore it would display the Kahlua cheesecake. So these are the types of individualized experiences we’re trying to create. Okay. But the only way we can do that is with analytics, and it’s not plain old traditional analytics.
There are more types of analytics that every organization needs in order to deliver these types of experiences. Okay. There’s descriptive analytics, and I think of that as traditional BI, and that’s where most firms have invested, collectively literally billions of dollars over the years, to create reports and dashboards. But look at these three other types of what we call advanced analytics: predictive analytics, streaming analytics, and prescriptive analytics. These types of analytics are where firms must invest if they’re going to be able to create these real-time personalized experiences: predictive analytics to create models about individuals, their behaviors, their characteristics, what they need and want in the moment; streaming or real-time analytics to detect that context, what’s happening right now; and then prescriptive analytics to adapt, to figure out what to actually do about the context that was just detected. And this whole idea of advanced analytics is surging right now.
If you look at this data, where we compare adoption of different analytics technologies between 2014 and 2015, whoa. I mean, it’s unbelievable. Look at the momentum. I’ve circled just this lower half of the graph: predictive, location. Streaming analytics went from 24% to 42%. So there’s an enormous amount of adoption and momentum on these advanced analytics topics. And that’s actually good for those of you who aren’t using advanced analytics yet, because with only 42% adoption, that still means that most of your competitors haven’t started. So this is a really good time to get into these advanced analytics. Now, analytics is nothing without data. And we often talk about big data, but all data is born fast. There’s always a transaction that occurs. There’s always a sensor. It’s always born fast. Some say that data is the new oil, and I believe that’s too limiting.
It’s more like the sun. It’s virtually limitless. It’s being created constantly, and we have to capture it. And most industries have much more data. There’s all the transactional data from all of their business applications, and many enterprises have a portfolio of hundreds of business applications that all originate data in a moment. There’s usage and behavior data from mobile and web, social media, log data, IoT device data, and other forms of data that you can buy and sell. So there’s plenty of data out there. Industries and enterprises have plenty of data from both internal and external sources. And you can see here, we asked companies to estimate the size of the data stored within their company. And you can see there’s a big range, 100 to 500 terabytes, greater than 500 terabytes, but there’s plenty of data out there for analysis.
And we also asked, what percentage of that data is available from internal and external sources? So there’s plenty of data, and there’s plenty of data from both internal and external sources to analyze. Okay. And all of that data is born fast. You mostly think of data ending up in a database, ending up in a data warehouse, ending up in a file system, but when it originates, it’s all the result of some moment in time, some event that is occurring. Okay. All of that data is born fast, so don’t think of the data as sort of historical anymore. You have to think of it as real time, right? The problem is all data is born fast, but the analytics is usually done much later.
The data is generated in an instant and then ultimately it’s filed away. Why do we have to wait a day, an hour, a month, a week to analyze that data? Why don’t we analyze it as soon as it is born, as soon as that data is originated? When we can do that, we can capture what I call perishable insights, and perishable insights can have exponentially more value than traditional after-the-fact analytics. If you think about the word perishable, what does that actually mean? There are some analytics that are only good right now. When George Clooney walks into that restaurant, there’s a certain set of analytics and context that comes together, and you can only act on it right then. You can only display that Kahlua cheesecake then.
And there are many other applications for perishable insights. For example, fraud detection. Sure, we could capture all the credit card transactions, all the electronic payment transactions, store them, and analyze them tonight or later. But you want to stop the fraud right now. That’s a perishable insight. There are many financial companies that are analyzing the social media feed for real-time insights on what stocks may move at the moment. Right? It doesn’t help to know that Mr. or Mrs. X said something about a certain stock if you can’t act on it until tomorrow; you have to act on it now. Again, a perishable insight. Increasingly there are wearable devices, like the one this baby is wearing, that monitors the baby’s position and temperature. So it’s a much more sophisticated baby monitor. You don’t want to just collect this information and analyze it later.
If something’s wrong, you want to know about it right away. Again, a perishable insight. And even automobiles are getting smarter. A typical automobile might have 250 sensors in it. And one auto manufacturer is working on this: if the car detects tire slippage, it can then broadcast to the cars behind it, two miles behind, a warning of slippery conditions. Again, a perishable insight, and location analytics is very big there as well. What if these girls were shopping in an outlet mall? What offer, if any, should you make to them when they’re in the proximity of the store? That’s a perishable insight that has to do with the location. So there are many, many examples of perishable insights. The final example I’ll give you is Pandora or Spotify. There are sensors, an accelerometer. What if the Pandora app could detect when someone’s jogging and then play music in their stream?
Again, a perishable insight, because you can only act on it when they’re running. So perishable insights are great, and we need to develop applications that can take advantage of perishable insights, but there’s an age-old problem with this, and there are technical challenges. In survey after survey that we’ve done, we’ve asked: what are the technical challenges impeding you from processing and analyzing more data? I’m not even asking necessarily about real time or predictive, and these are the answers. And none of these will surprise you. Difficulty integrating data from multiple sources, always a problem. Data volume increasingly large. Creating the data models. Preparing the data. Too many formats, and data that is difficult to access from multiple sources. So that raises a question: how can we solve this? How can we overcome some of these technical challenges?
And streaming analytics provides two benefits to overcome some of those challenges. The biggest one is that you can actually prepare and process data in stream. You don’t have to wait. You can process it immediately. So let’s look at these two concepts that are very familiar these days: the concept of a data lake or a data warehouse versus the concept of a real-time stream. Both can ingest data in real time. You know, you can have connectors that just load data into the Hadoop file system. Streams and lakes can certainly both accommodate multiple data sources and formats. But lakes typically store that data in the data warehouse or Hadoop file system first. Streams, on the other hand, can analyze that data in the stream, and then the data can be routed to a data warehouse or file system later for additional analytics.
But the key here, and this is where the key difference is, is that data can be analyzed in stream. In lakes, you typically run batch analytics. Hadoop and Spark are batch platforms, and those batch analytics can run monthly, weekly, daily, or hourly for historical insights. Streaming analytics, as real-time analytics, runs continuously, with sub-second or multiple-second immediate insights. The analytics you’re doing in the lake are after-the-fact analytics, and those can be used to adjust strategy, tactics, or some sort of future actions. Streaming insights can be used immediately to take actions. They are used to find those perishable insights and to take action on them in real time. So when you’re thinking data lakes, also think streams. And I’m not saying here that you just need streams, but I’m saying a lot of the things that you think about doing in a lake, you can actually do in real time in a stream.
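To make the batch-versus-continuous distinction concrete, here is a minimal Python sketch. It is only an illustration: the names `batch_average` and `StreamingAverage` are invented for this example and do not come from the talk or from any particular platform. The batch function needs all the stored data before it can answer, while the streaming version has an up-to-date answer after every single event.

```python
def batch_average(stored_values):
    """Lake-style: runs later, over everything that has already been stored."""
    return sum(stored_values) / len(stored_values)


class StreamingAverage:
    """Stream-style: keeps a running answer that is updated on every event."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental mean: no need to keep or rescan earlier events.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


values = [10.0, 20.0, 30.0]

stream = StreamingAverage()
running = [stream.update(v) for v in values]  # an answer after each event

after_the_fact = batch_average(values)        # one answer, once data is stored
```

Both approaches end at the same number; the difference is when the answer becomes available.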
You need data lakes and you need streams simultaneously. Let’s go a little bit deeper on what we mean by streaming analytics. And again, I say streaming synonymously with real-time analytics. So streaming analytics can detect and then act on those perishable insights. And the definition that we use is: streaming analytics filter, transform, aggregate, enrich, and analyze a high throughput of data from multiple data sources, identify interesting patterns and context from IoT, from whatever data is flowing in, to detect those situations, and then automate immediate actions in real time. And a lot of people get tripped up on this whole term real-time, because what does that actually mean? Well, it means different things in different businesses. So if a customer walks into a shopping mall, real-time can mean a few seconds. A shopper clicking on an online ad, that can mean maybe a hundred milliseconds. A stock price rises.
That could be microseconds. So real-time just means business time. You need streaming platforms to be incredibly fast to handle the data. Now, thinking in streams and data lakes is very good. There are two core capabilities that streaming analytics offers, and this is very important, especially the left side. Well, they’re both important, but the left side might surprise you, because it’s about real-time ETL. The typical analytics process is to acquire these sources and then to do ETL. And even if you load the data directly into Hadoop and do some ETL, you’re still doing it later. Why wait? Why not do it in real time? Streaming analytics has the capabilities to ingest, filter, transform, normalize, and link, so you can do in-stream ETL. And then if you look at the right side, these are the core analytics capabilities: to correlate multiple streams, location and geofencing, time windows, temporal pattern detection, and basically business logic and rules execution.
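As a rough sketch of what in-stream ETL can look like, the Python below chains generator stages so that each event is filtered, normalized, and linked to reference data as it arrives, before anything is stored. The event fields, the `PRODUCT_CATEGORIES` lookup table, and all function names are assumptions made up for this illustration; a real streaming platform would provide its own operators for these steps.

```python
from typing import Iterable, Iterator

# Hypothetical reference data used to link each event to a product category.
PRODUCT_CATEGORIES = {"h-100": "helmet", "g-200": "gloves"}


def ingest(raw_events: Iterable[dict]) -> Iterator[dict]:
    """Hand events downstream one at a time, as they arrive."""
    for event in raw_events:
        yield event


def keep_views(events: Iterable[dict]) -> Iterator[dict]:
    """Filter: drop everything that is not a product view."""
    for event in events:
        if event.get("type") == "view":
            yield event


def normalize(events: Iterable[dict]) -> Iterator[dict]:
    """Transform and normalize: consistent casing and numeric timestamps."""
    for event in events:
        yield {
            "user": str(event["user"]).strip().lower(),
            "product": str(event["product"]).lower(),
            "ts": float(event["ts"]),
        }


def link(events: Iterable[dict]) -> Iterator[dict]:
    """Link: attach the product category from the reference table."""
    for event in events:
        category = PRODUCT_CATEGORIES.get(event["product"], "unknown")
        yield {**event, "category": category}


def in_stream_etl(raw_events: Iterable[dict]) -> Iterator[dict]:
    """Compose the stages; each event flows through all of them lazily."""
    return link(normalize(keep_views(ingest(raw_events))))


raw = [
    {"type": "view", "user": " Alice ", "product": "H-100", "ts": "1.5"},
    {"type": "heartbeat", "user": "alice", "ts": "2.0"},
    {"type": "view", "user": "BOB", "product": "G-200", "ts": 3},
]
cleaned = list(in_stream_etl(raw))
```

Because the stages are generators, nothing waits for a full batch: the first cleaned event is available as soon as the first raw event has been read.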
And then action interfaces, to take action on those insights. So these are the core features of streaming analytics. Let’s look at how this might be used in a real-life situation. Let’s just pretend that we’re an online retailer and we want to sell more motorcycle helmets and we want to optimize profits. So we can use four of the key streaming analytics capabilities: temporal pattern detection, time windows, business logic, and action interfaces. So look at the left, at the temporal pattern detection. Suppose one of the rules, the analytics we want, is: when has the user viewed at least three products, including at least one helmet? So they could view two helmets, or one helmet, gloves, and other safety products. And once we detect that pattern in real time, then we want to display the most profitable motorcycle helmets, because we think that this analytic tells us that they’re interested.
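The pattern rule above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular product implements it: the class name `HelmetInterestDetector` is invented, the 60-second window is an assumption borrowed from the one-minute time frame mentioned later in the talk, and events are assumed to arrive in timestamp order for each user.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60.0  # assumed window length
MIN_PRODUCTS = 3       # "at least three products"


class HelmetInterestDetector:
    """Detects: user viewed >= 3 distinct products, at least one a helmet,
    all within the trailing window."""

    def __init__(self):
        # user -> deque of (timestamp, product_id, category), oldest first
        self._recent = defaultdict(deque)

    def on_view(self, user, product, category, ts):
        """Record one view; return True if the pattern now matches."""
        views = self._recent[user]
        views.append((ts, product, category))

        # Evict views that have fallen out of the time window.
        while views and ts - views[0][0] > WINDOW_SECONDS:
            views.popleft()

        distinct_products = {p for _, p, _ in views}
        has_helmet = any(c == "helmet" for _, _, c in views)
        return len(distinct_products) >= MIN_PRODUCTS and has_helmet


detector = HelmetInterestDetector()
first = detector.on_view("alice", "gloves-1", "gloves", 0.0)
second = detector.on_view("alice", "helmet-1", "helmet", 10.0)
third = detector.on_view("alice", "helmet-2", "helmet", 20.0)
```

When `on_view` returns `True`, the application would trigger its action, such as showing the most profitable helmets. The state is kept per user, which is why this kind of rule is stateful.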
Okay, that’s interesting. But now let’s add something in addition to this, which is a time window: what is the real-time daily sales total of the motorcycle helmets? Now, the reason why we want this is because we might want to adjust the price up and down. Because you remember, our first goal is to sell more motorcycle helmets. Our second goal is to maximize profits. So the rule is: if sales are trending lower than usual and this customer is price sensitive, then dynamically lower the price. Likewise, we can raise the price. So all of these rules and these analytics are expressed, executed, and delivered in the stream, and streaming analytics gives you incredible results: analytics and rules that can all happen in real time for multiple users at once. Try doing that with plain old SQL on real-time data.
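Here is one way the time window and the pricing rule might be sketched in Python. Everything named here is hypothetical: `RollingSalesTotal`, `adjust_price`, the 5% step, and the `usual_total` baseline are assumptions for illustration. In particular, the talk only says the price can also be raised; the condition used below for raising it (sales running above the usual total) is a guess.

```python
from collections import deque


class RollingSalesTotal:
    """Running sales total over a trailing time window (default: 24 hours)."""

    def __init__(self, window_seconds=24 * 60 * 60):
        self.window_seconds = window_seconds
        self._sales = deque()  # (timestamp, amount), oldest first
        self.total = 0.0

    def add_sale(self, ts, amount):
        """Add one sale, drop sales older than the window, return the total."""
        self._sales.append((ts, amount))
        self.total += amount
        while self._sales and ts - self._sales[0][0] >= self.window_seconds:
            _, old_amount = self._sales.popleft()
            self.total -= old_amount
        return self.total


def adjust_price(base_price, daily_total, usual_total, price_sensitive, step=0.05):
    """Business rule from the talk, with an assumed step size."""
    if daily_total < usual_total and price_sensitive:
        return round(base_price * (1 - step), 2)  # sales are low: lower the price
    if daily_total > usual_total:
        return round(base_price * (1 + step), 2)  # assumed rule for raising it
    return base_price


window = RollingSalesTotal(window_seconds=100)  # short window for the demo
window.add_sale(0, 100.0)
window.add_sale(50, 200.0)
current_total = window.add_sale(120, 50.0)      # the sale at ts=0 has expired

offer = adjust_price(200.0, current_total, usual_total=1000.0, price_sensitive=True)
```

The total is maintained incrementally on every sale, so the pricing rule always sees a current figure without rescanning stored transactions.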
If you’re really good at SQL, you can probably implement what I just showed you on the prior slide. But if you really start to think about it, you can’t do this in real time without doing polling, and that’s not scalable. So if you’re interested, just take that exercise on yourself and see what you come up with. Ultimately you’re going to need a continuous analytics-to-action architecture, and streaming is a key component that makes that happen. So what are the requirements? What are the things that you need to have in a streaming analytics platform, one that’s going to give you this real-time capability? Well, look at the left. You can see at the top, these are the real-time platforms, going down to the batch platforms. Streaming analytics is the most real time of all platforms, and it’s often supported by an in-memory data and compute grid, a clustered database. And you still need general-purpose data processing to do things like predictive modeling after the fact. That’s often done these days on Hadoop and Spark, but also data warehouses.
So, the individual requirements you need. First, all streaming analytics platforms are absolutely in-memory platforms. We don’t say in-memory streaming analytics platform, because the analytics has to be in memory to get that sort of latency for complex analytical operations. And many of them are stateful because of the time windows, so it has to be an in-memory platform. And memory, by the way, is roughly 58,000 times faster than disk. It has to be a scale-out architecture to handle any amount of born-fast data. Going back to the motorcycle helmet example, we just had rules about one person: they click on three safety products within a one-minute time frame. Imagine if there are thousands, tens of thousands, or millions of rules and customers that you have to access at that time. So it has to be able to scale out to handle large amounts of data, like IoT, but also large numbers of users.
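One common way to scale out stateful, in-memory processing is to partition the stream by key, so that all events for a given customer land on the same worker and that worker keeps the customer’s state in memory. The Python below is a single-process sketch of that idea only; `partition_for`, `Partition`, and `ScaleOutRouter` are hypothetical names, and a real platform would place partitions on separate machines and handle rebalancing and failure.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Stable hash of the key, so the same user always maps to one partition."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class Partition:
    """Stands in for one worker holding in-memory state for its users."""

    def __init__(self):
        self.click_counts = {}

    def on_click(self, user):
        self.click_counts[user] = self.click_counts.get(user, 0) + 1
        return self.click_counts[user]


class ScaleOutRouter:
    """Routes each event to the partition that owns that user's state."""

    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def route(self, user):
        index = partition_for(user, len(self.partitions))
        return index, self.partitions[index].on_click(user)


router = ScaleOutRouter(num_partitions=4)
first_index, first_count = router.route("alice")
second_index, second_count = router.route("alice")
router.route("bob")

total_clicks = sum(sum(p.click_counts.values()) for p in router.partitions)
```

Adding partitions spreads users, and their window state, across more memory and CPU, which is the point of a scale-out design.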
It also has to provide high-level streaming operators to build complex applications very fast. There are both open source and commercial platforms that have emerged. The open source platforms often require lots of programming. Commercial platforms often provide high-level streaming operators. So it’s not just about the enterprise qualities of scalability and performance; it’s also about the speed and ease with which you can develop the application. And of course it has to connect seamlessly into existing architectures, not just to data sources and sources of events, but also for the action interfaces. As I said before, streams replace a lot of what’s done in the lake, but they don’t replace everything that’s done in the lake.
So the streaming platform also has to feed batch clusters to process the historical analytics and build predictive models after the fact, which are later going to support the real-time analytics. And finally, since we’re talking about insights to action here, the platform has to have rules. You have to be able to embed rules and then call out to other applications to initiate actions. If we bring it back to the business priority, you have to develop real-time, hyper-personalized customer experiences. And the only way to do that is to have specific types of analytics and to think about the perishable insights within your organization. Start with streaming, because with real-time platforms you can see there’s still an enormous amount of momentum on that and on all the other advanced analytics topics. So only the real-time enterprise is going to be able to compete and win in the age of the customer.
So make sure, in addition to your business intelligence, that you also have the three forms of advanced analytics: predictive, streaming, and prescriptive. Okay. So thank you very much.