Rethinking Product Analytics Architecture

Rethinking Product Analytics Architecture

Sinthuja Ragendran
Hi,

After my recent work with metrics and other monitoring systems, I have been wondering whether our model of sending every event to the analytics server and calculating everything there is the right one.

IMHO, the majority of product analytics use cases are statistics calculations. In APIM, for example, we calculate API statistics, subscription statistics, and so on. To do this, we send an event for every request, response, and fault to the analytics server, and the analytics server performs the actual statistics calculation on those events. In high-traffic scenarios, a large number of events therefore has to be published to the analytics server, and we run into issues such as "Event Queue Full" in the gateway.

My proposal: what if we calculate the statistics on the local node itself (similar to edge analytics, by running Siddhi within the products) and send only the per-minute statistics summary to the analytics server, which then does the global (across all nodes) calculation? This way, traffic to the analytics server would drop drastically, since only the final calculated value for each group-by combination is reported. The analytics server can then focus on global summarization and other global monitoring aspects.
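
To make the two-level idea concrete, here is a minimal sketch in Python rather than Siddhi. The group-by key (API name plus minute), the function names, and the sample events are all assumptions for illustration, not the actual APIM event schema. Each node collapses its raw events into per-minute counts, and the analytics server only merges those summaries.

```python
from collections import Counter

# Hypothetical group-by key: (api_name, minute_start_in_epoch_seconds).

def summarize_locally(events):
    """Collapse one node's raw (api_name, epoch_seconds) events into per-key counts."""
    summary = Counter()
    for api_name, epoch_seconds in events:
        minute_start = epoch_seconds - (epoch_seconds % 60)
        summary[(api_name, minute_start)] += 1
    return summary

def merge_globally(node_summaries):
    """Analytics-server side: add up the per-node summaries across the cluster."""
    total = Counter()
    for summary in node_summaries:
        total.update(summary)  # Counter.update adds counts instead of replacing them
    return total

# Two gateway nodes with made-up traffic.
node_a = summarize_locally([("orders", 120), ("orders", 125), ("users", 130)])
node_b = summarize_locally([("orders", 150), ("users", 185)])
global_counts = merge_globally([node_a, node_b])
```

Counts can be merged this way because a sum of sums is still a sum. Statistics that are not additive (percentiles, for example) would need a different summary format.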

Thanks,
Sinthuja.

--
Sinthuja Rajendran
Senior Technical Lead
WSO2, Inc.:http://wso2.com

Blog: http://sinthu-rajan.blogspot.com/
Mobile: +94774273955



_______________________________________________
Architecture mailing list
[hidden email]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
Re: Rethinking Product Analytics Architecture

Rukshan Premathunga
Hi Sinthuja,

A per-minute summary on the local node is fine and will reduce the traffic to the analyzer. But can we guarantee that the Siddhi apps will not slow down other functionality (gateway requests)?
This can be tried out in the Ballerina-based gateway, since streams are already available there. But for C4-based products this will not be easy, right? We would have to release the products with Siddhi features.

Thanks and Regards

On Tue, Jun 19, 2018 at 5:04 PM, Sinthuja Rajendran <[hidden email]> wrote:


--
Rukshan Chathuranga.
Software Engineer.
WSO2, Inc.
+94711822074

Re: Rethinking Product Analytics Architecture

Fazlan Nazeem
In reply to this post by Sinthuja Ragendran
Hi Sinthuja,

There is an ongoing effort to combine the request, response, and execution time event streams into a single stream and publish a single event to the Stream Processor instead of three. This is targeted for the Q3 release and can reduce the traffic to the analytics server by a factor of three.

Mail Subject: Moving APIM Analytics to SP
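
As a rough illustration of what combining the three streams means, the sketch below folds three per-request records into one publishable event. The field names (`request_id`, `api_name`, `response_code`, `execution_time_ms`) and the `combine` helper are hypothetical; the real stream definitions are in the thread referenced above.

```python
from dataclasses import dataclass

@dataclass
class CombinedEvent:
    # Hypothetical fields; not the actual APIM stream definition.
    request_id: str
    api_name: str
    response_code: int
    execution_time_ms: int

def combine(request, response, execution):
    """Fold the three per-request records (plain dicts) into a single event."""
    ids = {request["request_id"], response["request_id"], execution["request_id"]}
    if len(ids) != 1:
        raise ValueError("records belong to different requests")
    return CombinedEvent(
        request_id=request["request_id"],
        api_name=request["api_name"],
        response_code=response["response_code"],
        execution_time_ms=execution["execution_time_ms"],
    )

# One event is published instead of three.
event = combine(
    {"request_id": "r-1", "api_name": "orders"},
    {"request_id": "r-1", "response_code": 200},
    {"request_id": "r-1", "execution_time_ms": 42},
)
```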

--
Thanks & Regards,

Fazlan Nazeem
Senior Software Engineer
WSO2 Inc
Mobile : +94772338839

Re: Rethinking Product Analytics Architecture

Sinthuja Ragendran
In reply to this post by Rukshan Premathunga
Hi Rukshan,

On Wed, Jun 20, 2018 at 9:40 AM Rukshan Premathunga <[hidden email]> wrote:
Hi Sinthuja,

A per-minute summary on the local node is fine and will reduce the traffic to the analyzer. But can we guarantee that the Siddhi apps will not slow down other functionality (gateway requests)?

IMHO it should not slow down. :)
Additionally, we would simply be calculating a time batch window with sum aggregations and group-bys, so I believe it shouldn't have a great impact on the gateway.

This can be tried out in the Ballerina-based gateway, since streams are already available there.

Well, for the Ballerina-based gateway, IMHO we should use the Ballerina observability (metrics and tracing) APIs to calculate the statistics. We are also thinking about merging Ballerina observability with streams, but that is not yet functional to that level.

But for C4-based products this will not be easy, right? We would have to release the products with Siddhi features.

We simply need the Siddhi library, not the other SP-related features. As we promote Siddhi as an edge analytics library for IoT devices as well, I believe it will not have a major impact.

Anyhow, IMO it's worth putting in some effort to see. IMHO, we are simply moving some parts of the existing queries to the gateway. Since we are already doing the APIM analytics SP migration, why not try this as well? It shouldn't be very hard to try. :) Of course, only if we have enough bandwidth within the release timelines.
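
For readers unfamiliar with the term, a time batch (tumbling) window with a sum aggregation and a group-by does roughly the following. This is a plain-Python sketch, not Siddhi code; the class and method names are made up, and it assumes events arrive in timestamp order and that the window is flushed when the next window's first event arrives (a real engine would flush on a timer).

```python
from collections import defaultdict

class MinuteBatchWindow:
    """Tumbling window that sums values per group-by key."""

    def __init__(self, window_seconds=60):
        self.window_seconds = window_seconds
        self.window_start = None
        self.sums = defaultdict(int)

    def add(self, timestamp, key, value=1):
        """Add one event. Returns (window_start, sums) for the window that just
        closed if this event opens a new window, otherwise None."""
        start = timestamp - (timestamp % self.window_seconds)
        flushed = None
        if self.window_start is not None and start != self.window_start:
            flushed = (self.window_start, dict(self.sums))
            self.sums.clear()
        self.window_start = start
        self.sums[key] += value
        return flushed

    def flush(self):
        """Emit whatever is in the current window, for example on shutdown."""
        if self.window_start is None:
            return None
        result = (self.window_start, dict(self.sums))
        self.sums.clear()
        self.window_start = None
        return result

window = MinuteBatchWindow()
window.add(0, "orders")
window.add(30, "orders")
window.add(59, "users")
first_minute = window.add(61, "orders")  # opens the next window, closes the first
second_minute = window.flush()
```

The per-event work is one dictionary update, which is the basis for expecting a small impact on the gateway; memory grows with the number of distinct group-by keys per window, not with the request rate.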

Thanks,
Sinthuja. 

Re: Rethinking Product Analytics Architecture

Sinthuja Ragendran
In reply to this post by Fazlan Nazeem
Hi Fazlan,

Yes, that would reduce the events by a factor of three, but in the new approach I think we would still need to send at least one event per gateway request to the analytics server. That means we need to scale up the analytics nodes based on the TPS (load) of the APIM server as well. Basically, our requirement is to calculate request/response/fault counts for different bucket values (group by). With the proposed model, if we calculate the statistics summary locally, we don't need to scale the analytics nodes based on APIM TPS, because only one event is pushed per group-by bucket per minute.
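
A back-of-the-envelope comparison of the two models, as a small sketch. All numbers are hypothetical (1,000 TPS across the cluster, 4 gateway nodes, 200 active group-by buckets per node per minute), and the function names are made up for this example.

```python
def events_per_minute_per_request(cluster_tps, events_per_request=1):
    """Per-request publishing: volume grows linearly with TPS."""
    return cluster_tps * 60 * events_per_request

def events_per_minute_summarized(gateway_nodes, active_buckets_per_node):
    """Local summaries: one event per active group-by bucket per node per minute."""
    return gateway_nodes * active_buckets_per_node

# Hypothetical load figures, for illustration only.
per_request = events_per_minute_per_request(1000)   # 1000 * 60 = 60000
summarized = events_per_minute_summarized(4, 200)    # 4 * 200 = 800
```

Under these assumed numbers the analytics server would receive 800 events per minute instead of 60,000. The summarized volume depends on the number of nodes and the group-by cardinality rather than on TPS, so a very high-cardinality group-by would reduce the benefit.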

Thanks,
Sinthuja

On Wed, Jun 20, 2018 at 10:23 AM Fazlan Nazeem <[hidden email]> wrote:

Re: Rethinking Product Analytics Architecture

Chamila De Alwis
IIUC, doing per-minute calculations at the GWs themselves would reduce the granularity of the statistics. However, that shouldn't be a problem, as most scenarios do not involve real-time analysis at the level of seconds.

I'm curious about the load this would add on the GW, however. Especially in scenarios that involve high concurrency and large message sizes, we may need to see whether this design would impact capacity planning.

Also, I suppose this would not be an optional feature. If we consider a cluster of GWs, we can't have different GWs publishing different types of calculations. If we try to make this an optional setting, blue-green deployments would not be possible.

Regards,
Chamila de Alwis
Committer and PMC Member - Apache Stratos
Associate Technical Lead | WSO2 
+94 77 220 7163
Blog: https://medium.com/@chamilad




On Tue, Jun 19, 2018 at 10:49 PM Sinthuja Rajendran <[hidden email]> wrote: