[MB4] Some design decisions

Riyafa Abdul Hameed-2
Hi,

Some modifications have been proposed to the current MB architecture to improve performance in MB4 and to support new features in JMS 2.0.

PERSISTENT and NON_PERSISTENT messages

The current implementation of MB has a flow similar to the following:

Inline image 1

That is, a message coming into MB first passes through a Disruptor, where it is preprocessed, subjected to flow control and persisted in the database. It is then added to the queue in FIFO order (only the metadata of each incoming message is added to the queue). Afterwards it is transferred from the FIFO queue to the Disruptor for post-processing (the content of the message is also added at this point). JMS has two types of messages, PERSISTENT and NON_PERSISTENT. Previous versions of MB handled both types in the same manner.
NON_PERSISTENT messages are supposed to be faster than PERSISTENT messages. Hence in MB4 we plan to implement two separate pipelines for PERSISTENT and NON_PERSISTENT messages:
Inline image 1

In this scenario only the PERSISTENT messages will be persisted in the DB, while the NON_PERSISTENT messages will pass only through a preprocessing step, the queue and a post-processing step before reaching the subscriber. The preprocessing and post-processing code will be reused between the PERSISTENT and NON_PERSISTENT cases. There will be two queues representing the same scenario, one for each pipeline. Since the NON_PERSISTENT case involves few IO operations, we expect it to be faster than the PERSISTENT case, but with a weaker guarantee that messages reach the subscriber. This leads to the following design decisions:
  • The JMS spec allows these messages to be dropped due to failures on the MB side. Hence it has been decided that a queue size for NON_PERSISTENT messages can be configured in MB, and when the queue is filled with more data than it can hold, messages will be dropped (possibly the earliest ones). This design decision was based on ActiveMQ[1]. Another option is to configure the queue depth instead, but since message sizes may vary, the MB buffer could still end up overflowing.
  • Further, it has to be decided whether we need flow control for NON_PERSISTENT messages. By flow control we mean that if publishers are publishing messages faster than MB can handle them, message acceptance is blocked for all publishers. In the previous architecture this flow control was available for both NON_PERSISTENT and PERSISTENT messages, since both were handled through the same message flow.
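
The first decision above can be sketched roughly as follows. This is only an illustration under assumptions: the class name, the byte-based limit and the choice to evict the earliest messages are hypothetical and not taken from the MB code base.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical in-memory queue for NON_PERSISTENT messages, bounded by total payload bytes. */
class BoundedNonPersistentQueue {
    private final Deque<byte[]> messages = new ArrayDeque<>();
    private final long maxBytes;   // assumed configuration value
    private long usedBytes = 0;
    private long dropped = 0;

    BoundedNonPersistentQueue(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Adds a message, evicting the earliest messages until the new one fits. */
    synchronized boolean offer(byte[] payload) {
        if (payload.length > maxBytes) {
            dropped++;             // can never fit, so the new message itself is dropped
            return false;
        }
        while (usedBytes + payload.length > maxBytes) {
            byte[] oldest = messages.pollFirst();
            usedBytes -= oldest.length;
            dropped++;
        }
        messages.addLast(payload);
        usedBytes += payload.length;
        return true;
    }

    /** Returns the earliest remaining message, or null if the queue is empty. */
    synchronized byte[] poll() {
        byte[] head = messages.pollFirst();
        if (head != null) {
            usedBytes -= head.length;
        }
        return head;
    }

    synchronized int size() { return messages.size(); }

    synchronized long droppedCount() { return dropped; }
}
```

Bounding by bytes rather than by message count avoids the overflow concern mentioned for the queue-depth option, at the cost of tracking payload sizes.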

Delivery delay (JMS 2.0)

Further, JMS 2.0 supports a new feature called delivery delay, which can be specified by a client. A message’s delivery time is the earliest time when a JMS provider may deliver the message to a consumer. Delivery delay would be implemented using a single internal staging queue. Messages that have a nonzero delivery delay are placed on this queue with a header that indicates the delivery delay and information about the target queue. A component monitors the messages on the staging queue. When a message's delivery delay completes, the message is taken off the staging queue and placed on the target queue[2]:

Inline image 2

Here the design decision is whether to use the Java DelayQueue[3] for this purpose.
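
A minimal sketch of how a staging queue could sit on top of java.util.concurrent.DelayQueue, assuming that option is chosen. DelayedMessage and StagingQueue are hypothetical names used only for illustration, and messages are reduced to a target queue name and a string body.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper that keeps a message until its delivery time. */
class DelayedMessage implements Delayed {
    final String targetQueue;
    final String body;
    private final long deliveryTimeNanos;

    DelayedMessage(String targetQueue, String body, long delayMillis) {
        this.targetQueue = targetQueue;
        this.body = body;
        this.deliveryTimeNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(deliveryTimeNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        // Messages with the earliest delivery time sort first.
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
}

/** Hypothetical staging queue; a monitor thread would loop on takeNext(). */
class StagingQueue {
    private final DelayQueue<DelayedMessage> staging = new DelayQueue<>();

    void stage(DelayedMessage message) {
        staging.put(message);
    }

    /** Non-blocking: returns null if no message's delay has elapsed yet. */
    DelayedMessage pollExpired() {
        return staging.poll();
    }

    /** Blocks until the next message is due for delivery to its target queue. */
    DelayedMessage takeNext() {
        try {
            return staging.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a delayed message", e);
        }
    }
}
```

Note that DelayQueue is in-memory only, so delayed PERSISTENT messages would still need to be stored elsewhere to survive a broker restart.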

Slow/fast producers and consumers

It is also essential to cater to slow and fast producers and consumers. Fast publishers are handled by the incoming flow-control system already in place in the architecture. Slow consumers can cause problems on non-durable topics: they can force the broker to keep old messages in RAM, and once RAM fills up the broker is forced to slow down producers, which in turn slows down the fast consumers. The plan is to allow configuring the maximum number of matched messages the broker will keep around for a consumer. Once this maximum is reached, older messages are discarded as new messages come in. This keeps RAM available for current messages and lets the broker keep sending messages to a slow consumer while discarding old ones[4].
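
As a rough sketch of that per-consumer limit (the class and method names are hypothetical, and messages are plain strings for brevity):

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical per-consumer buffer of matched messages on a non-durable topic. */
class PendingMessageBuffer {
    private final Deque<String> pending = new ArrayDeque<>();
    private final int maxPending;   // assumed configuration value
    private long discarded = 0;

    PendingMessageBuffer(int maxPending) {
        if (maxPending < 1) {
            throw new IllegalArgumentException("maxPending must be at least 1");
        }
        this.maxPending = maxPending;
    }

    /** Called when a new message matches this consumer; discards the oldest one if full. */
    synchronized void onMatched(String message) {
        if (pending.size() == maxPending) {
            pending.pollFirst();
            discarded++;
        }
        pending.addLast(message);
    }

    /** Returns the oldest message still pending for this consumer, or null. */
    synchronized String nextToDeliver() {
        return pending.pollFirst();
    }

    synchronized long discardedCount() { return discarded; }
}
```

With this shape a slow consumer only ever holds maxPending messages in RAM, so it cannot force the broker to throttle producers.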

JMS message selectors

A JMS message selector allows a client to specify, by message header, the messages it’s interested in. Only messages whose headers and properties match the selector are delivered.

Inline image 4


Each queue or topic can have multiple subscribers forming several filter groups. Currently messages are delivered to the subscribers in a round-robin manner (in the case of queues and shared durable topics). The following are the suggestions for implementing selectors in MB:

  1. Use a single cursor. When serving each subscriber, go to the beginning of the queue and move through it, filtering for that subscriber.
  • Pros: This is the simplest way to get selectors working and is quite easy to implement.
  • Cons: This is quite inefficient, because the queue has to be traversed from the beginning for each filter group (it would become a performance bottleneck for larger queues and larger numbers of subscribers).

  2. Use multiple cursors. In this case each filter group has its own cursor. When its turn comes, the cursor moves on from the point where it stopped.
  • Pros: This is efficient and should give better performance when selectors are used.
  • Cons: This complicates the implementation, as each cursor needs to be updated whenever a message is removed from the queue.
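
The multiple-cursor option could look roughly like the sketch below. It is single-threaded and keeps messages as strings in an ArrayList purely for illustration; MultiCursorQueue and its methods are hypothetical names, and a selector is modelled as a Predicate rather than a parsed JMS selector.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical queue in which every filter group keeps its own cursor. */
class MultiCursorQueue {
    private final List<String> messages = new ArrayList<>();
    private final Map<String, Predicate<String>> selectors = new HashMap<>();
    private final Map<String, Integer> cursors = new HashMap<>();

    void addFilterGroup(String group, Predicate<String> selector) {
        selectors.put(group, selector);
        cursors.put(group, 0);
    }

    void publish(String message) {
        messages.add(message);
    }

    /** Resumes the group's scan where it stopped; returns null if nothing matches. */
    String next(String group) {
        Predicate<String> selector = selectors.get(group);
        if (selector == null) {
            throw new IllegalArgumentException("unknown filter group: " + group);
        }
        int i = cursors.get(group);
        while (i < messages.size()) {
            String candidate = messages.get(i);
            if (selector.test(candidate)) {
                messages.remove(i);
                shiftCursorsAfterRemoval(i);
                cursors.put(group, i);   // the following message now sits at index i
                return candidate;
            }
            i++;
        }
        cursors.put(group, i);
        return null;
    }

    /** The cost named under "Cons": every cursor past the removed slot must move back. */
    private void shiftCursorsAfterRemoval(int removedIndex) {
        cursors.replaceAll((g, pos) -> pos > removedIndex ? pos - 1 : pos);
    }
}
```

Each call to next() only scans messages the group has not yet looked at, which is where the gain over the single-cursor option comes from.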



_______________________________________________
Architecture mailing list
[hidden email]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
Re: [MB4] Some design decisions

Riyafa Abdul Hameed-2
Hi All,

The issue with the current selector implementation is that when no subscriber's selector matches a given message, the message is routed to the DLC. This is done to avoid the following issue:
When you have a queue and consumers filtering it with a very restrictive selector, you may get into a situation where you cannot read more data from paging until you consume messages from the queue.

Example: one consumer uses the selector color = 'red', but the only red message sits 1 million messages behind blue ones. You won't be able to consume the red message until the blue ones are consumed.
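
The blocking effect in this example can be reproduced with a small sketch. It assumes a fixed-size in-memory buffer that is refilled from paged storage only when it has room; PagedQueue is a hypothetical class, not the Artemis or MB implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical queue with a bounded in-memory window over paged-out messages. */
class PagedQueue {
    private final Deque<String> paged;                      // messages still on disk
    private final List<String> buffer = new ArrayList<>();  // in-memory window
    private final int bufferSize;

    PagedQueue(List<String> allMessages, int bufferSize) {
        this.paged = new ArrayDeque<>(allMessages);
        this.bufferSize = bufferSize;
    }

    /** Pages messages in only while the in-memory buffer has room. */
    private void fill() {
        while (buffer.size() < bufferSize && !paged.isEmpty()) {
            buffer.add(paged.pollFirst());
        }
    }

    /** Returns the first buffered message matching the selector, or null. */
    String consume(Predicate<String> selector) {
        fill();
        Iterator<String> it = buffer.iterator();
        while (it.hasNext()) {
            String message = it.next();
            if (selector.test(message)) {
                it.remove();
                return message;
            }
        }
        return null;   // buffer is full of non-matching messages: the consumer is stuck
    }
}
```

With messages blue, blue, blue, red and a buffer of two, a consumer selecting red receives nothing until two blue messages have been consumed by someone else.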


So there are a few proposals for handling this:
1) Have no buffers
2) Configurable buffers. If buffers are configured, avoid moving messages to the DLC and keep them in the buffer until the messages in the buffer are consumed, as in Artemis and ActiveMQ [1][2]
3) Move messages to a database when no subscriber matches, and when a new subscriber comes in bring all the earlier messages back into the queue (a sliding-buffer-like implementation)
4) Multiple cursors with a sliding buffer
5) Multiple queues, one for each filter (more complex and does not guarantee ordered delivery)

We plan to go ahead with option 4 and see if that works. This decision is based on the fact that a buffer is essential for performance reasons, and a single cursor as in option 3 can starve certain consumers.
Any other suggestions would be highly valued.

[1] https://activemq.apache.org/artemis/docs/1.5.3/paging.html
[2] http://trenaman.blogspot.co.uk/2009/01/message-selectors-and-activemq.html
Thank you.
Yours sincerely,
Riyafa


On Mon, May 22, 2017 at 3:34 PM, Riyafa Abdul Hameed <[hidden email]> wrote:
[...]

--
Riyafa Abdul Hameed
Software Engineer, WSO2 Lanka (Pvt) Ltd

Email: [hidden email]
Website: https://riyafa.wordpress.com/

  
