Multiple or Single Input Sink

Category: sql server streaminsight


Plob109 on Thu, 14 Feb 2013 20:42:31


My current application takes bursty data from multiple (100+) TCP streams, puts it into a single concurrent list, and then decodes and analyzes the packet contents to provide various real-time dashboard KPIs. I also aggregate the KPIs and insert them into SQL Server for historical analysis.

I would like to investigate changing the solution to use StreamInsight, in the hope of managing the incoming data more efficiently and making the dashboard reporting more flexible. My question is how to go about doing this:

Should I create an input sink for each TCP socket, or continue to manage a list that joins all the TCP data and feeds a single sink?

As well as averaging and reporting KPIs across all the TCP sockets to provide a network view of performance, I want the ability to drill down and query individual TCP streams. What would be a good approach to achieve this?

Thanks for any pointers you can provide!


TXPower125 on Thu, 14 Feb 2013 22:03:31

What does your data look like? This is important to know so you can create a proper payload class for use in StreamInsight. At a minimum, it will contain some kind of TCP stream identifier that identifies which event belongs to which TCP stream.

1. Are all these TCP streams connecting to the same socket? If you are using the Rx-based approach in StreamInsight 2.1, inputs are sources while sinks are outputs. You would need to create an input source for each socket you want to accept data on.

2. You could create a WCF sink that allows your dashboard(s) to subscribe to the data they want to display. By default, a dashboard can subscribe to the aggregate data. To drill down into a specific TCP stream, it can subscribe to that stream's events, and the WCF sink handles filtering for the relevant events.
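The subscription idea in point 2 can be sketched as a small topic-based publish/subscribe hub. This is only a language-neutral illustration in Python, not WCF or StreamInsight code; the `DashboardHub` class, the `"aggregate"` topic name, and the use of a sensor id as a drill-down topic are all assumptions made for the example.

```python
from collections import defaultdict


class DashboardHub:
    """Hypothetical output-side hub: dashboards subscribe to a topic and
    only receive the events published on that topic."""

    def __init__(self):
        # topic -> list of callbacks; a topic is "aggregate" or a sensor id
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        # Filtering happens here: subscribers of other topics see nothing.
        for callback in self._subscribers[topic]:
            callback(event)


hub = DashboardHub()
network_view, drill_down = [], []
hub.subscribe("aggregate", network_view.append)  # default dashboard view
hub.subscribe(17, drill_down.append)             # drill down into sensor 17

hub.publish("aggregate", {"avg_power": 3.2})
hub.publish(17, {"power": 4.1})
hub.publish(18, {"power": 2.9})                  # nobody subscribed to 18
```

With this shape, the dashboard that only wants the network view never sees per-sensor traffic, and a drill-down costs one extra subscription.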

Plob109 on Fri, 15 Feb 2013 04:03:28

I have multiple sensors, each with its own socket, sending data back to the server. The data is nothing more than a byte array containing the sensor identifier and a payload, which I then have to decode to get status information on sensor conditions.
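For illustration, decoding such a byte array into a typed event that carries the sensor identifier might look like the sketch below. The wire layout, the measurement codes, and the names `SensorReading` and `decode` are hypothetical, since the actual packet format is not given in this thread; stamping the event time on receipt is also an assumption.

```python
import struct
from dataclasses import dataclass

# Hypothetical wire layout (not from the thread): big-endian
# uint16 sensor id, uint8 measurement code, float32 value.
_PACKET = struct.Struct(">HBf")

# Hypothetical measurement codes, for illustration only.
MEASUREMENTS = {1: "power", 2: "light"}


@dataclass(frozen=True)
class SensorReading:
    sensor_id: int     # identifies which sensor/TCP stream produced the event
    measurement: str   # e.g. "power" or "light"
    value: float
    timestamp: float   # seconds; assumed to be stamped by the server on receipt


def decode(packet: bytes, received_at: float) -> SensorReading:
    """Turn one raw packet into a typed event."""
    sensor_id, code, value = _PACKET.unpack(packet)
    return SensorReading(sensor_id, MEASUREMENTS[code], value, received_at)
```

Keeping the sensor identifier on every event is what later makes both the network-wide aggregates and the per-sensor drill-down possible from the same stream.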

As an example, I display (amongst other metrics) the average power level across all the sensors over the past minute, but I also want to monitor the individual sensors and trigger when any sensor's power level exceeds a certain threshold over a 30-second window.
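The two requirements above can be sketched with plain windowed grouping. This is a conceptual Python illustration, not StreamInsight query syntax. It assumes tumbling (non-overlapping) windows, readings given as `(sensor_id, timestamp_seconds, power)` tuples, and that "exceeds a threshold over a 30-second window" means the window average exceeds it; all function names are hypothetical.

```python
from collections import defaultdict


def tumbling_average(readings, width, key=lambda r: None):
    """Average the values of (sensor_id, timestamp, value) tuples per
    (window_start, key), using fixed windows of `width` seconds."""
    sums = defaultdict(lambda: [0.0, 0])  # (window_start, key) -> [total, count]
    for reading in readings:
        _, timestamp, value = reading
        window_start = int(timestamp // width) * width
        bucket = sums[(window_start, key(reading))]
        bucket[0] += value
        bucket[1] += 1
    return {k: total / count for k, (total, count) in sums.items()}


def network_average(readings):
    """Average power across all sensors per one-minute window."""
    return tumbling_average(readings, 60)


def sensor_alerts(readings, threshold):
    """(window_start, sensor_id) pairs whose 30-second average exceeds threshold."""
    per_sensor = tumbling_average(readings, 30, key=lambda r: r[0])
    return sorted(k for k, avg in per_sensor.items() if avg > threshold)


readings = [(1, 0, 10.0), (2, 10, 20.0), (1, 40, 30.0), (2, 70, 50.0)]
```

Both views come from the same event stream; the only differences are the window width and whether the grouping key includes the sensor id.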

Hope this makes sense.

TXPower125 on Fri, 15 Feb 2013 04:27:08

Yep makes perfect sense. This is the kind of thing that StreamInsight is built to do. Sounds like a cool project.

DevBiker on Fri, 15 Feb 2013 15:02:57

Does each sensor/socket combination have the same deserialization requirements? Can each one be placed into the same schema (I'm assuming yes for this)? While StreamInsight can certainly handle 100+ inputs, I would be cautious from a manageability perspective: writing a query over 100+ inputs that you then union and have to synchronize in time could be quite challenging. And while my initial, knee-jerk instinct is to do one input per socket, synchronizing the CTIs (current time increments) across all of those sensors, which run at different event rates, could get ugly. If you do go down the path of multiple sockets into one input, you'll have to think long and hard about your AdvanceTimePolicy and what kind of delay you'll need to make sure that you handle any late-arriving events from sensors.
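To make the role of that delay concrete, here is a minimal conceptual sketch in Python. It is not the StreamInsight API: the `DelayedWatermark` class is hypothetical and only models the idea that application time trails the newest event by a fixed delay, so that events from slower sensors can still be accepted until time has advanced past them.

```python
class DelayedWatermark:
    """Tracks a 'current time' that trails the newest event by `delay` seconds."""

    def __init__(self, delay):
        self.delay = delay
        self.watermark = float("-inf")  # nothing is late before the first event

    def offer(self, timestamp):
        """Return True if the event is accepted, False if it is older than
        the current watermark (i.e. it arrived too late)."""
        if timestamp < self.watermark:
            return False
        # Time only moves forward, and always stays `delay` behind the newest event.
        self.watermark = max(self.watermark, timestamp - self.delay)
        return True


clock = DelayedWatermark(delay=5)
results = [clock.offer(t) for t in (100, 97, 94, 110, 100)]
```

A larger delay tolerates slower or burstier sensors but postpones window results by the same amount, which is the trade-off to weigh when merging many sockets into one input.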

Make sense?

Plob109 on Fri, 15 Feb 2013 15:50:58

Every byte array will be decoded in the same manner. Different measurements are reported in each packet at differing rates, so a sensor may report power measurements at one point in time, light levels at another, and perhaps nothing for a period of time. My dashboard needs to show various user-defined measurements (such as average power or maximum light level), so at times there may be nothing to report for the current time (indicating a potential issue).

I only stumbled on StreamInsight in the past week. From a high-level read it seemed to fit the design need, but I now have to dig into the nuts and bolts and wasn't clear on how I should approach the problem. Good to hear that StreamInsight should be a good solution for this task, though.

Thanks for your advice