VRA is Disconnected From ZVM and Peer VRAs, yet Replication Appears to be Healthy
An administrator may notice alerts that a VRA is disconnected from ZVM and peer VRAs, yet replication seems to be OK.
The VraMain process does not have sufficient resources to perform its duties. This typically occurs for one or more of the following reasons:
Only 1 vCPU is configured for the VRA, and it is overloaded.
The maximum of 2 vCPU is configured for the VRA, but the CPU is not fully reserved.
The RAM configured for the VRA is not fully reserved.
Backend storage latency is high enough that IOs to the backend are held in VRA memory until that memory fills up.
If one or more of the above occurs, the VraMain process is eventually killed, as seen in the screenshot below.
Two alerts are seen in the GUI: "ZVM is not connected to VRA" and "Connection between local VRA and VRA with IP is down".
When viewing the VRA via console or SSH session the following can be seen:
To work around this issue, reboot the VRA.
To resolve the issue, check or perform the following as necessary:
Ensure the VRA has 2 vCPU configured and fully reserved.
Ensure all RAM configured for the VRA (at least 3 GB, no more than 16 GB) is fully reserved.
Check for backend storage latency around the time the VRA first became unresponsive.
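
The configuration checks above can be sketched as a small script. This is only an illustration, not a Zerto or VMware tool: the VraConfig class, its field names, and the check_vra_config function are hypothetical. The values are assumed to have been read manually from the VRA's virtual machine settings in the hypervisor. Storage latency is not covered here, because it has to be reviewed in the storage or hypervisor performance data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VraConfig:
    """Hypothetical snapshot of a VRA's VM settings (field names are illustrative)."""
    num_vcpu: int               # vCPUs configured for the VRA
    cpu_fully_reserved: bool    # True if the CPU allocation is fully reserved
    memory_gb: float            # RAM configured for the VRA, in GB
    memory_reserved_gb: float   # RAM reservation for the VRA, in GB


def check_vra_config(cfg: VraConfig) -> List[str]:
    """Return a list of findings; an empty list means the checks passed."""
    findings = []
    # Resolution step: 2 vCPU configured and fully reserved.
    if cfg.num_vcpu != 2:
        findings.append(f"VRA has {cfg.num_vcpu} vCPU; configure 2 vCPU.")
    if not cfg.cpu_fully_reserved:
        findings.append("vCPU is not fully reserved.")
    # Resolution step: RAM between 3 GB and 16 GB, fully reserved.
    if not 3 <= cfg.memory_gb <= 16:
        findings.append(f"RAM is {cfg.memory_gb} GB; expected 3 GB to 16 GB.")
    if cfg.memory_reserved_gb < cfg.memory_gb:
        findings.append(
            f"Only {cfg.memory_reserved_gb} GB of {cfg.memory_gb} GB RAM is reserved."
        )
    return findings


if __name__ == "__main__":
    # Example values only; replace with the settings of the affected VRA.
    for finding in check_vra_config(VraConfig(1, False, 4, 2)):
        print(finding)
```

An empty result only indicates that the VRA's CPU and RAM settings match the recommendations above; backend storage latency still needs to be checked separately.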