Clustering issues #318

Jotschi · 2018-02-28T10:46:06Z

Re-Sync errors:

https://www.prjhub.com/#/issues/9914
A bogus delta sync is executed and fails when new files (e.g. new verticle types) have been added.
The issue should not affect clustering since a full sync is executed as a fallback mechanism.
The issue will be fixed with the next OrientDB release (2.2.34).
Syncing a restarted node fails with a "cluster not found" error: orientechnologies/orientdb#8127
The sync issue will be fixed with OrientDB 2.2.34.
https://www.prjhub.com/#/issues/9860
Delta sync was failing in some cases. The issue has been fixed with OrientDB 2.2.33 (already included in Gentics Mesh).
https://www.prjhub.com/#/issues/9947
In some cases the "master" election did not finish. OrientDB switched from one instance to another.
The issue will be fixed with OrientDB 2.2.34.
The issue is also covered by Gentics Mesh clustering tests.
https://www.prjhub.com/#/issues/9950
When recovering from a split-brain situation, usually one instance in the cluster is forced to back up its data so that the other node in the cluster becomes the new "source of truth". In some cases the wrong backup was used for these nodes. Additionally, there seems to be a Gentics Mesh bug which causes the OrientDB database to be modified before the instance joins the cluster. This is a problem because it alters the version of the database, so OrientDB may treat it as a recent change and choose that database as "the latest" one.
Somewhat odd workaround: Delete the data/graphdb folder before joining a cluster. A full sync will then be executed, and no complex mechanism is involved in electing the "latest" database.
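
The workaround above could be scripted roughly as follows. This is a minimal sketch and not part of Gentics Mesh: the function name `reset_graphdb` and the `mesh_home` parameter are hypothetical, only the `data/graphdb` path comes from the description above, and the instance is assumed to be stopped before the script runs.

```python
import shutil
from pathlib import Path


def reset_graphdb(mesh_home: str) -> bool:
    """Delete <mesh_home>/data/graphdb so the node joins without a local database.

    Hypothetical helper; assumes the Gentics Mesh instance is stopped.
    Returns True if a folder was removed, False if there was nothing to delete.
    """
    graphdb = Path(mesh_home) / "data" / "graphdb"
    if not graphdb.is_dir():
        # Nothing to remove: the node already has no local graph database.
        return False
    # Remove the whole folder; the rest of the data directory is left untouched.
    shutil.rmtree(graphdb)
    return True
```

Since the folder is deleted outright, it may be worth copying it elsewhere first if the local data could still be needed.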

cschockaert · 2018-04-03T08:13:15Z

Can you explain what the issues on prjhub are?
I cannot see the details.

Jotschi · 2018-04-03T09:21:19Z

@cschockaert I added some information and new issues.

clems159 · 2018-04-30T12:21:53Z

Hi again, so does 0.19.0 fix all these issues?

Jotschi · 2018-04-30T13:04:57Z

@clems159 The last issue is still open. The other issues have been resolved.

clems159 · 2018-04-30T13:11:16Z

OK, so if I remember correctly, when we tried master / master, the nodes were stuck in an infinite loop trying to determine which one is the master, so the instance never restarted.

Is this associated with the split-brain situation?

Jotschi · 2018-04-30T13:27:15Z

@clems159 When setting up a mesh cluster it is important to only initialize one Gentics Mesh instance.
https://getmesh.io/docs/beta/clustering.html#_setup
The other one must be empty in order to join the cluster. Maybe you started a cluster with two instances and OrientDB got stuck in a replication loop.
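
A pre-flight check for the rule above might look like this. It is only a sketch: `is_initialized` and `safe_to_cluster` are hypothetical helpers, and treating a non-empty `data/graphdb` folder as "initialized" is an assumption based on the folder mentioned earlier in this thread, not documented Gentics Mesh behaviour.

```python
from pathlib import Path
from typing import List


def is_initialized(mesh_home: str) -> bool:
    """Assumption: a node counts as initialized if data/graphdb exists and is non-empty."""
    graphdb = Path(mesh_home) / "data" / "graphdb"
    return graphdb.is_dir() and any(graphdb.iterdir())


def safe_to_cluster(mesh_homes: List[str]) -> bool:
    """Return True only if exactly one of the given nodes is initialized."""
    return sum(is_initialized(home) for home in mesh_homes) == 1
```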

I don't think that the issue you observed is related to the split brain issue. Could you perhaps try again and provide logs if you encounter issues?

Jotschi · 2018-10-18T14:16:20Z

The last issue was fixed by an OrientDB upgrade some time ago.

cschockaert · 2018-10-18T15:19:01Z

Agreed.
We are working with the latest getmesh version in 1 master / x replicas mode and have not encountered this situation again.

Jotschi added bug f/clustering labels Mar 2, 2018

Jotschi added this to the 1.0.0 milestone Mar 2, 2018

Jotschi closed this as completed Oct 18, 2018

rhoxhaj mentioned this issue Jun 11, 2021

SUP-11538/fix for the update issue in the S3 field #1213

Merged
