Request based pub/sub architecture


Pub/sub makes it easy to get updates on specific topics as soon as they happen.

A pub/sub system need to keep track of all subscribers (S) and their keys (K) that they subscribe to.

When a value (V) is updated the subscribers are directly notified.

A data structure like the following are required.

Subscribers: [ K -> [S] ]

When a value is published ...

K -> V => K -> V'

The new value V' is sent to subscribers [S] if K exist in Subscribers.

This is very powerful and fast if you have seldom updates. But there might be issues.

Request based change query

The proposed alternative is to let clients keep the state of the subscriptions and request changes on demand based on local state instead.

The system consist of the following

The idea is to retrieve all values that have changed its values for a set of keys.

This means that you don't need to subscribe to get updates. If you have a set of keys and corresponding hash values you are able to get all keys that have changed on demand request.

Keys (K), values (V) and hashes (H) of values are actually stored on the server.

Values: [ K -> V ]
Hashes: [ K -> H ]

When a value is published ...

K -> V => K -> V'

The new value is stored and a new hash of the value is calculated.

Changes are only determined on request from clients.

A request is able to determine which keys (K2) with values changed for a set of keys (K1) and its locally stored hashes.

[(K, H)] -> [K]

All hash values for requested keys need to looked up.

Each hash is compared and ones that differ is returned.

The advantages are the following:

The disadvantages are the following:

All keys must be sorted to be efficient.

Pub/sub with changes queued for clients

Yet another middle ground is keep subscriptions i.e. a list of keys and hashes for each client and update result on update instead of on request.

A subscription for a client (C) is a set of keys and its hashes from last update request.

Subsription: C -> [(K, H)]

When a value is updated and the changed key and hash is stored for each client that subscribes to it.

Each key has a list of clients that are to be noted. Most often numer of clients are low.

Subscribers of key: K -> [C]

Updates are stored as a list of keys and hashes for updated keys.

Updates: C -> [(K, H)]

In this variant a request for changes keys will only return the pre-generated list of changed keys for the specific client.

If most keys only have one or two clients this should be quite efficient. A value update only require one extra lookup of clients interested in the key update.

This solution makes change requests very efficient. So if change requests happen more often than value changes this may be more efficient.