6/23/2017

Master Data Management for Distributed Applications







Requirement Criteria

1. Self Load Balancing.

There are 100 products, and they need to be processed by the nodes without any product being processed twice.

Example:


There are two nodes that must process all the products, 50 products each at a time. So Node A will process 50 products while Node B processes the remaining 50.
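The split above can be sketched with a simple modulo assignment. The `ProductPartitioner` class and its node numbering are illustrative stand-ins, not part of the actual system (in the real architecture the distributed map handles the sharing):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: products are spread over cluster nodes by a simple
// modulo on the product id, so no product is processed by two nodes.
public class ProductPartitioner {

    // Assigns each product id (1..productCount) to one of nodeCount nodes.
    public static Map<Integer, List<Integer>> assign(int productCount, int nodeCount) {
        Map<Integer, List<Integer>> byNode = new HashMap<>();
        for (int node = 0; node < nodeCount; node++) {
            byNode.put(node, new ArrayList<>());
        }
        for (int productId = 1; productId <= productCount; productId++) {
            byNode.get(productId % nodeCount).add(productId);
        }
        return byNode;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> byNode = assign(100, 2);
        // Node A (index 0) and Node B (index 1) each own 50 products, no overlap.
        System.out.println("Node A: " + byNode.get(0).size() + " products");
        System.out.println("Node B: " + byNode.get(1).size() + " products");
    }
}
```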


2. High Availability and Fault Tolerance

If one node goes down, the remaining nodes should process all the products.

Example:

Let's say one of the two nodes crashes; the remaining node then needs to process all 100 products until the crashed node recovers.
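The failover can be sketched by re-running the same kind of assignment with the new live-node count. `FailoverRebalance` and its `owned` helper are illustrative, not the project's actual code:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative failover sketch: when the live-node count drops, the surviving
// node re-runs the assignment and picks up the crashed node's products.
public class FailoverRebalance {

    // Returns the product ids (1..productCount) owned by nodeIndex
    // when liveNodes nodes are up.
    public static List<Integer> owned(int productCount, int liveNodes, int nodeIndex) {
        List<Integer> mine = new ArrayList<>();
        for (int productId = 1; productId <= productCount; productId++) {
            if (productId % liveNodes == nodeIndex) {
                mine.add(productId);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        // Two nodes up: this node owns 50 products.
        System.out.println(owned(100, 2, 0).size()); // 50
        // One node crashed: the survivor now owns all 100.
        System.out.println(owned(100, 1, 0).size()); // 100
    }
}
```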

3. A local (L1) cache should be maintained by each node.

By doing this we avoid unnecessary round trips to the database for master data reads. At the same time, each node needs only a few database connections to fetch master data from the centralized DB. This is a microservices-friendly approach when you use a centralized DB for master data management.
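The round-trip saving can be shown with a minimal read-through cache. `L1Cache` and its `dbLoader` are hypothetical names for illustration; in the real architecture Hibernate's Session plays this role:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical read-through L1 cache: the first read of a key hits the
// "database" loader, later reads are served locally.
public class L1Cache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> dbLoader;
    private int dbHits = 0;

    public L1Cache(Function<K, V> dbLoader) {
        this.dbLoader = dbLoader;
    }

    public V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            dbHits++;               // a round trip to the central DB
            return dbLoader.apply(k);
        });
    }

    public int dbHits() {
        return dbHits;
    }

    public static void main(String[] args) {
        L1Cache<Integer, String> cache = new L1Cache<>(id -> "product-" + id);
        cache.get(1);
        cache.get(1);
        cache.get(1);
        // Three reads, but only one DB round trip.
        System.out.println(cache.dbHits()); // 1
    }
}
```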

4. Once master data changes, the change should be reflected (synced) in each local cache.

Each node should process with the latest data.
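One common way to keep an L1 cache in sync is to evict the stale entry when a change notification arrives, so the next read reloads the latest row from the DB. The sketch below is a stdlib simulation under that assumption; `SyncedCache` and `onMasterDataChanged` are hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch of keeping a node's local cache in sync: when a change
// notification for a key arrives, the stale entry is evicted so the next
// read reloads the latest row from the DB.
public class SyncedCache {
    private final Map<String, String> local = new HashMap<>();
    private final Function<String, String> dbLoader;

    public SyncedCache(Function<String, String> dbLoader) {
        this.dbLoader = dbLoader;
    }

    public String get(String businessKey) {
        return local.computeIfAbsent(businessKey, dbLoader);
    }

    // Called when the node learns that master data changed for this key.
    public void onMasterDataChanged(String businessKey) {
        local.remove(businessKey); // evict stale copy; reload on next read
    }

    public static void main(String[] args) {
        String[] dbValue = {"v1"}; // stand-in for the central DB
        SyncedCache cache = new SyncedCache(key -> dbValue[0]);
        System.out.println(cache.get("SKU-1")); // v1, loaded from "DB"
        dbValue[0] = "v2";
        cache.onMasterDataChanged("SKU-1");
        System.out.println(cache.get("SKU-1")); // v2, reloaded after eviction
    }
}
```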

5. The distributed map should hold only the bare minimum information in its cache.

This helps maintain the performance and stability of the distributed map.
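Concretely, "bare minimum" means the shared map carries only a business key and a primary key, never whole entity graphs. In this sketch a `ConcurrentHashMap` stands in for the real Hazelcast `IMap`, and the key/field names are illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of rule 5: the shared map keeps only businessKey -> primaryKey.
// ConcurrentHashMap stands in for the Hazelcast distributed map here.
public class MinimalSharedMap {

    // e.g. "SKU-1001" -> 42L : a few bytes per product
    private final Map<String, Long> productKeys = new ConcurrentHashMap<>();

    public void publishChange(String businessKey, long primaryKey) {
        productKeys.put(businessKey, primaryKey);
    }

    // A node uses the primary key to load the full row from the DB into its
    // own L1 cache; the heavy entity never travels through the shared map.
    public Long primaryKeyOf(String businessKey) {
        return productKeys.get(businessKey);
    }
}
```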


Architecture Explained

1. We use Hazelcast as a distributed in-memory cache.

Using a Hazelcast distributed map, we share all the products among the nodes.

2. We also use Hazelcast for application clustering, so each node acts as a Hazelcast cluster node. All cluster communication and cluster management is done by Hazelcast itself.
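A minimal `hazelcast.xml` along these lines would be enough for such a cluster; the map name, discovery mode, and backup count below are assumptions for illustration, not taken from the project:

```xml
<hazelcast xmlns="http://www.hazelcast.com/schema/config">
    <network>
        <join>
            <!-- nodes on the same network discover each other and form the cluster -->
            <multicast enabled="true"/>
            <tcp-ip enabled="false"/>
        </join>
    </network>
    <map name="products">
        <!-- keep each entry on its owning node plus one backup copy -->
        <backup-count>1</backup-count>
    </map>
</hazelcast>
```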

3. We use Hibernate as the ORM tool.

The Hibernate Session acts as the local cache, so we basically just use its API for L1 cache management.
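The useful property of the Session's first-level cache is that loading the same primary key twice within one session returns the same instance with a single DB load. Below is a stdlib simulation of that behavior; `SessionLikeCache` and `Product` are illustrative stand-ins, not Hibernate API:

```java
import java.util.HashMap;
import java.util.Map;

// Simulation of Hibernate's Session-scoped first-level cache: within one
// "session", loading the same primary key twice returns the same instance.
public class SessionLikeCache {

    public static class Product {
        final long id;
        Product(long id) { this.id = id; }
    }

    private final Map<Long, Product> firstLevel = new HashMap<>();

    public Product get(long id) {
        // First call "loads from the DB"; repeats return the cached instance,
        // which is what the Session does for us in the real architecture.
        return firstLevel.computeIfAbsent(id, Product::new);
    }

    public static void main(String[] args) {
        SessionLikeCache session = new SessionLikeCache();
        Product a = session.get(42L);
        Product b = session.get(42L);
        System.out.println(a == b); // true: one load, one shared instance
    }
}
```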

4. Here Kafka acts as the data pipeline for the whole architecture.

So once master data is changed via the Node.JS admin panel, the change is sent to Kafka as JSON.


The node with DB write permission enabled will update the database with the necessary changes and, at the same time, update the distributed map with the business key and primary key.
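The write-enabled node's two-step handling can be sketched as follows. Both maps are stdlib stand-ins (one for the central DB, one for the Hazelcast distributed map), and `WriteNode`, `handleChange`, and the sample keys are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of step 4: the single write-enabled node applies a
// master-data change to the DB and then publishes (businessKey, primaryKey)
// to the shared map so the other nodes can pick it up.
public class WriteNode {

    private final Map<Long, String> database = new HashMap<>();       // pk -> row
    private final Map<String, Long> distributedMap = new HashMap<>(); // businessKey -> pk

    // e.g. handleChange("SKU-1001", 42L, "{\"price\": 9.99}")
    public void handleChange(String businessKey, long primaryKey, String payload) {
        database.put(primaryKey, payload);           // 1. write the change to the DB
        distributedMap.put(businessKey, primaryKey); // 2. announce it via the map
    }

    public Long announcedPk(String businessKey) { return distributedMap.get(businessKey); }

    public String row(long pk) { return database.get(pk); }
}
```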


5. Each Hazelcast cluster node acts as a Kafka consumer, and each cluster forms its own consumer group.

Hazelcast Cluster == Kafka Consumer Group

So one of the cluster members with DB write permission disabled will consume the same JSON and update the relevant distributed map with the master data record's business key and primary key. The rest of the members in the same consumer group (cluster) will then learn of the change and update their L1 caches after reading the changes from the DB.
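The fan-out in step 5 can be sketched end to end with stdlib stand-ins: the consuming member writes the (businessKey, primaryKey) pair into the shared map, and every member registered as a listener refreshes its own L1 cache from the DB. In the real system the listener role would be played by a Hazelcast map entry listener; all names here are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// End-to-end sketch of step 5: one member publishes a change into the shared
// map; every registered member is notified and can refresh its L1 cache.
public class ClusterSyncSketch {

    interface MapListener { void entryUpdated(String businessKey, long primaryKey); }

    private final Map<String, Long> sharedMap = new HashMap<>();
    private final List<MapListener> members = new ArrayList<>();

    public void addMember(MapListener m) { members.add(m); }

    // Invoked by whichever cluster member consumed the Kafka message.
    public void publish(String businessKey, long primaryKey) {
        sharedMap.put(businessKey, primaryKey);
        for (MapListener m : members) m.entryUpdated(businessKey, primaryKey);
    }

    public static void main(String[] args) {
        Map<Long, String> db = new HashMap<>(); // stand-in for the central DB
        db.put(42L, "latest row for SKU-1001");

        ClusterSyncSketch cluster = new ClusterSyncSketch();
        Map<String, String> l1OfSomeMember = new HashMap<>();
        // Each member refreshes its L1 from the DB using the published pk.
        cluster.addMember((key, pk) -> l1OfSomeMember.put(key, db.get(pk)));

        cluster.publish("SKU-1001", 42L);
        System.out.println(l1OfSomeMember.get("SKU-1001"));
    }
}
```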