Article number
000003553
Affected Versions
All
Source Hypervisor
All
Target Hypervisor
All

An Unresolvable Error Has Occurred With VPG X. The VPG Must Be Deleted.


Root Cause

When reviewing the Recovery VRA logs, the following messages appear shortly before the time the alert was raised:

2020-01-13 16:11:20.266 (ZVM),VRAs/<VRA_NAME>/zlog166558.txt.gz,2020-01-13 16:11:20.266,ERR: Error in func SfsStream::checkMdEntryValidity - stream<MIRROR-ID,0> Possible corrupt meta-data. Invalid metadata type type=0xb757f31a [UNKNOWN] aborting mirror traversal. SFS Item: type 0xb757f31a [UNKNOWN]:

2020-01-13 16:11:20.266 (ZVM),VRAs/<VRA_NAME>/zlog166558.txt.gz,2020-01-13 16:11:20.266,ERR: Error in func SfsStreamBundle::recover - bundle<MIRROR-ID> Recover returned RC_CORRUPTED_OBJECT. Most likely meta-data corruption in journal.


These errors indicate corruption within the journal storage itself.
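To confirm this root cause, the Recovery VRA logs can be scanned for the two signatures shown above. The sketch below is a minimal, hedged example; the log directory layout and the `zlog*.txt.gz` file-name pattern are assumptions based on the excerpts above, so adjust them to your environment:

```python
import gzip
import re
from pathlib import Path

# Signatures taken from the log excerpts in the Root Cause section.
SIGNATURES = [
    re.compile(r"SfsStream::checkMdEntryValidity.*Possible corrupt meta-data"),
    re.compile(r"SfsStreamBundle::recover.*RC_CORRUPTED_OBJECT"),
]

def matches_signature(line: str) -> bool:
    """Return True if a log line contains one of the corruption signatures."""
    return any(sig.search(line) for sig in SIGNATURES)

def scan_vra_logs(log_dir: str) -> list:
    """Scan gzipped VRA logs (zlog*.txt.gz; naming is an assumption) for matches."""
    hits = []
    for path in sorted(Path(log_dir).glob("zlog*.txt.gz")):
        with gzip.open(path, "rt", errors="replace") as fh:
            for line in fh:
                if matches_signature(line):
                    hits.append("{}: {}".format(path.name, line.strip()))
    return hits
```

If `scan_vra_logs` returns any hits for the affected VPG's mirror, the journal corruption described above is the likely cause.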

Symptoms

An administrator may find a VPG is in a red error status with the following error:

An Unresolvable Error Has Occurred With VPG X. The VPG Must Be Deleted.

While in this state, no action other than Delete can be taken on the VPG.

Solution

Workaround

The affected VPG(s) must be recreated. To avoid a full initial sync, follow the Preseeding Volumes KB and recreate the VPG(s) with preseeded disks.

NOTE: Preseeding to AWS is not supported. Manual preseeding to Azure is also not supported; it must be done via the Import option in the Zerto Diagnostics Utility.

Permanent Fix

It is highly recommended that the administrator engage their storage team/vendor to review the datastore that hosted the affected journal disk(s) and resolve the underlying cause of the corruption to prevent a recurrence. Creating a new datastore to use for the journal could also help avoid this problem in the future.

Please note: when storage teams/vendors check for datastore issues, a deeper review is encouraged, as this issue has been observed to result from a variety of storage-related components, including physical storage array issues, outdated HBA firmware and drivers on hosts, and SAN connectivity issues.