Enable cluster (HA/DR)
Using Orbit Config Manager ➔ Panel ➔ wizard or Orbit Config Manager ➔ Replication ➔ settings, it is possible to create a new cluster or add the instance to an existing cluster.
You can always check the cluster health in the Orbit Config Manager ➔ Replication ➔ status.
The senhasegura architecture operates with two information bases: one, where the configurations are stored, and the other containing the logs and videos generated.
Configuring High Availability
To configure a High Availability cluster, all senhasegura instances must meet the following prerequisites:
- The activation license must be applied to all instances;
- All instances must be visible to each other through the network;
- TCP ports 22, 443, 3306, 4444, 4567, 4568, 9300 and UDP port 4567 must be open between all instances;
- The backup drive must be visible to all instances;
- Only one instance should be active and running the asynchronous services;
- An instance will be chosen as Primary after the cluster creation process;
- Back up the data and take a snapshot of the instance as a fallback.
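As an illustration of the port prerequisite, the following is a minimal sketch of a reachability check that could be run from one instance against another before creating the cluster. It uses only the Python standard library; the function names are hypothetical and are not part of senhasegura. It covers the TCP ports listed above only; UDP port 4567 cannot be verified with a simple connection attempt and is left out.

```python
import socket

# TCP ports listed in the prerequisites above.
CLUSTER_TCP_PORTS = (22, 443, 3306, 4444, 4567, 4568, 9300)


def check_tcp_ports(host, ports=CLUSTER_TCP_PORTS, timeout=3.0):
    """Return {port: bool} telling whether a TCP connection to host:port succeeded."""
    results = {}
    for port in ports:
        try:
            # create_connection completes the TCP handshake or raises OSError.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


def unreachable_ports(host, ports=CLUSTER_TCP_PORTS, timeout=3.0):
    """Return the sorted list of ports that could not be reached on host."""
    status = check_tcp_ports(host, ports, timeout)
    return sorted(port for port, reachable in status.items() if not reachable)
```

Calling `unreachable_ports("<IP of another member>")` from each instance returns an empty list when every listed TCP port accepts connections. A port may also show as unreachable simply because its service is not running yet, so treat the result as a hint rather than proof of a firewall problem.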
Once the prerequisites are met, the rest of the operation can be performed through the Orbit Web interface.
Each instance has its database restarted while it is being configured. It is extremely important that you wait for the instance to finish restarting before you start configuring the next cluster member. Not waiting can lead the cluster into a split-brain scenario, in which it cannot determine which instance holds the most up-to-date information. This scenario, although reversible, will increase the maintenance time.
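The waiting step can be sketched as a polling loop. This is only an illustration: `wait_until` and `tcp_port_open` are hypothetical helpers, and using the database port (3306, from the port list above) as a readiness signal is an assumption. An open port does not prove that the node has rejoined the cluster, so the Replication status screen remains the reference.

```python
import socket
import time


def tcp_port_open(host, port=3306, timeout=3.0):
    """Rough readiness probe (assumption): True if host:port accepts a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until(probe, timeout=600.0, interval=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or `timeout` seconds have elapsed.

    Returns True if the probe succeeded, False if the deadline passed first.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

For example, `wait_until(lambda: tcp_port_open("<instance IP>"))` blocks for up to ten minutes and returns `False` if the port never opens, which is a signal to investigate before touching the next member.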
The application will restart the database service if you have chosen to configure the cluster.
To perform the configuration, go to Orbit Config Manager ➔ Replication ➔ Settings, or use the wizard at Orbit Config Manager ➔ Panel ➔ wizard.
Change the mode of operation from Standalone to Cluster;
Enable the flag Enable replication
Add the network IP of the first cluster member in the Primary member field;
Add the network IP of other members in the following fields;
Attention! The order of IPs registered as primary and other members must be replicated in all other instances!
If members are in different datacenters, activate the flag Members are in different datacenters;
In this case, also set the latency in the Latency between nodes field and enter the unique numeric identifier in the Network segment field;
Add in the Recovery screen display message field a message that will be displayed to all users in case of cluster failure;
Activate file synchronization with the Files flag if you want the files generated by one instance to be forwarded to the other instances using rsync;
Click Save to finish the operation.
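Because the member order must be identical on every instance, it can help to write down the planned values and compare them before typing them into each Settings screen. The sketch below is a hypothetical planning aid, not a senhasegura API; the instance names and the 192.0.2.x documentation addresses are placeholders.

```python
def find_order_mismatches(planned_members):
    """Compare the ordered member IP lists planned for each instance.

    planned_members maps an instance name to the IP list to be entered on it,
    Primary member first. Returns the names whose list differs from the list
    of the first instance in the mapping.
    """
    names = list(planned_members)
    if not names:
        return []
    reference = list(planned_members[names[0]])
    return [name for name in names[1:] if list(planned_members[name]) != reference]


# Example plan: node-c lists the same IPs in a different order.
plan = {
    "node-a": ["192.0.2.10", "192.0.2.11", "192.0.2.12"],
    "node-b": ["192.0.2.10", "192.0.2.11", "192.0.2.12"],
    "node-c": ["192.0.2.11", "192.0.2.10", "192.0.2.12"],
}
mismatches = find_order_mismatches(plan)
```

An empty result means every instance would receive the same ordered list; any name returned should be corrected before saving.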
View replication status
The Orbit Config Manager ➔ Replication ➔ Status screen shows the status and operation of the instances through the generated logs, including any errors that occur.
It presents the following information about the instance:
Replication Status: Disabled/Enabled
Instance Type: Primary/Secondary
Operating mode: On/off
Last Sync: 12/18/2022 16:00:01
Authentication token: 85d4d3f2b2e3840179fa8327551690b6
If you have a remote server, you will have information about it as well.
View Elasticsearch cluster status
This screen, located at Orbit Config Manager ➔ Replication ➔ Elasticsearch, shows the status of the Elasticsearch cluster service:
Data search cluster
Status: shows information about the instance's health.
Uuid: Universally unique identifier for the instance.
Size: the number of instances.
Master: is an identifier of the master instance in the cluster.
Version: shows which version this instance is in.
Shards: the number of shards, that is, the subsets of data that together form all the data in the cluster.
State Uuid: Identifier of the state of the instance in the cluster.
Timed out: indicates whether the instance's request to the cluster timed out before a response was received.
Index: is a unique identifier of the cluster instance's index.
Name: is the name of the cluster application.
Ephemeral: is an ephemeral identifier of the cluster instance.
Address: Shows the IP and Port of the instance.
Name: shows the names of the logs that were generated by the instance.
Uuid Total: is a unique identifier referring to that generated log.
Size: is the size of the log that was generated by the system.
Health: This shows whether the log is working correctly.
Status: shows information about the instance's health.
Hot Spare - Automatic instance switching
senhasegura instances can be remotely activated and inactivated through HTTP requests, which can be issued from your load balancer. This control allows an instance that is under maintenance, or unavailable for some other reason, to be excluded from the load balancer's redirection.
To configure IPs allowed to perform such query and operation, you must register the IP list in the Remote system activation field at the Orbit Config Manager ➔ Settings ➔ Recovery menu.
- On this screen, switch on the flag Allow system remote activation
- Fill in the list of IPs allowed to execute automated requests in the field Allowed origin IPs to perform system remote activation
- Save the configuration by clicking the Save button
You should perform this operation on every cluster member.
From this point on, these IPs can request the monitoring and administration URL:
GET /flow/orbit/mntr
The response will be the current instance state. It can be:
- HTTP 200: Application is enabled and available for user operation
- HTTP 403: Application is enabled but unavailable for user operation
- HTTP 451: Activation license is expired
- HTTP 503: Application is disabled
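A minimal sketch of how an allowed origin could query the monitoring URL and interpret the answer, using only the Python standard library. The path and the four status codes come from the text above; the function names and the base URL are hypothetical, and TLS certificate handling is not addressed here.

```python
import urllib.error
import urllib.request

MONITOR_PATH = "/flow/orbit/mntr"

# Meanings of the documented response codes.
STATUS_MEANINGS = {
    200: "enabled and available for user operation",
    403: "enabled but unavailable for user operation",
    451: "activation license is expired",
    503: "disabled",
}


def monitor_url(base_url):
    """Build the monitoring URL from a base such as https://<instance address>."""
    return base_url.rstrip("/") + MONITOR_PATH


def describe_status(code):
    """Translate a status code into the documented instance state."""
    return STATUS_MEANINGS.get(code, "unexpected status code")


def query_instance_status(base_url, timeout=5.0):
    """Send GET /flow/orbit/mntr and return the HTTP status code."""
    request = urllib.request.Request(monitor_url(base_url), method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx answers; the code is the information we want.
        return error.code
```

The request must originate from one of the IPs registered in the allowed list; connection failures (for example, an unreachable host) are deliberately left to propagate as exceptions so they are not confused with a documented state.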
Thus, in a practical case, if the administrator inactivates the application on an instance, it starts responding HTTP 403 to the load balancer, which in turn stops forwarding traffic to that instance. Likewise, if an instance loses communication with the other cluster members and, as a consequence, its database becomes unavailable, it responds HTTP 503 to the load balancer, which stops forwarding traffic to it.
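The load balancer's side of this behavior can be sketched as a simple filter: only instances whose last health check answered HTTP 200 stay in the pool. This is an illustrative model with hypothetical instance names, not the configuration of any particular load balancer.

```python
def split_pool(status_by_instance):
    """Split instances into those that keep receiving traffic and those excluded.

    status_by_instance maps an instance name to the status code of its last
    monitoring check. Only HTTP 200 keeps an instance in the pool.
    """
    in_pool = sorted(name for name, code in status_by_instance.items() if code == 200)
    excluded = {name: code for name, code in status_by_instance.items() if code != 200}
    return in_pool, excluded


# Example: node-b was inactivated by the administrator, node-c lost its database.
in_pool, excluded = split_pool({"node-a": 200, "node-b": 403, "node-c": 503})
```

Keeping the excluded instances together with their status codes makes it easier to tell an intentional inactivation from a failure when reviewing the pool.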
Automatic instance activation and inactivation
Another useful control is to allow an external system to decide which instances should be activated and deactivated automatically. Imagine a scenario where the load on an entire network must be redirected to a contingency data center. The target datacenter instance must be active and ready to receive the requests, and the old production instance must lose its role as Primary.
It is possible to switch the instances' roles through the activation/inactivation URL.
Activation: makes the instance available to users, as long as the activation license is valid. If executed successfully, the instance that previously held the Primary role in the cluster loses it, and this instance becomes the new Primary. The other instances are not automatically deactivated.
Inactivation: makes the instance unavailable to users. If this instance is the Primary, it is inactivated without any other cluster member being elected as the new Primary. This action also does not activate other instances that are inactive.
Always be in control of which instances are active and inactive in the cluster. Do not run the risk of accidentally inactivating all instances, causing an interruption in users' operations.
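The activation rules and the warning above can be captured in a small in-memory model that an external orchestration script might use to check its plan before sending any request. This is a toy sketch, not senhasegura code: the class, its methods and the instance names are hypothetical, and leaving the Primary label unchanged after an inactivation is an assumption based on the statement that no new Primary is elected.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ClusterModel:
    """Toy model of which instances are active and which one is Primary."""
    members: Set[str]
    active: Set[str] = field(default_factory=set)
    primary: Optional[str] = None

    def _require_member(self, name):
        if name not in self.members:
            raise ValueError(f"unknown instance: {name}")

    def activate(self, name, license_valid=True):
        # Activation only succeeds while the activation license is valid.
        self._require_member(name)
        if not license_valid:
            raise RuntimeError("activation license is not valid")
        # The activated instance takes the Primary role; others are untouched.
        self.active.add(name)
        self.primary = name

    def inactivate(self, name):
        # No other member is promoted or activated (assumption: label unchanged).
        self._require_member(name)
        self.active.discard(name)

    def would_cause_outage(self, name):
        """True if inactivating `name` would leave no active instance."""
        return self.active <= {name}


# Failover plan: activate the contingency site first, then inactivate production.
cluster = ClusterModel(members={"dc1", "dc2"}, active={"dc1"}, primary="dc1")
unsafe_before = cluster.would_cause_outage("dc1")
cluster.activate("dc2")
unsafe_after = cluster.would_cause_outage("dc1")
```

Checking `would_cause_outage` before each inactivation enforces the ordering in the example: the contingency instance is activated first, so inactivating the old production instance never leaves users without an active instance.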