Failure Description
A customer's core production storage consists of two HDS G200 arrays in a GAD (Global-Active Device) active-active configuration.
A power failure in the server room took all equipment down. After power was restored, a single G200 was brought back online in an emergency
to resume service (an operation carried out by a service provider),
and the customer does not know which G200 is currently serving hosts.
The customer now needs to restore the GAD active-active configuration.
Failure Analysis
1. GAD Architecture
2. Items to check
Check the current network environment against the GAD architecture:
Check the links between Primary and Secondary storage.
Check the links from Primary and Secondary storage to the External storage.
Check the links from the hosts to Primary and Secondary storage.
Confirm from the storage switches and from the host side that all of the above links are healthy.
3. Confirm which storage system is currently serving hosts
Enable port monitoring on the Primary storage and check the traffic statistics.
Enable port monitoring on the Secondary storage and check the traffic statistics.
Conclusion: the storage system currently serving hosts is Primary.
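The decision in this step can be sketched as a small shell helper. It is only an illustration: the function name serving_side and the two I/O figures are hypothetical values read by hand from the port monitors, not the output of any Hitachi tool. The helper expects whole numbers and deliberately reports "ambiguous" when both sides, or neither side, show host traffic, because in that case the direction of a later resynchronization cannot be decided safely.

```shell
#!/bin/sh
# Hypothetical helper: decide which array serves hosts from two
# manually observed port I/O figures (integers, e.g. IOPS).
serving_side() {
  primary_io="$1"
  secondary_io="$2"
  if [ "$primary_io" -gt 0 ] && [ "$secondary_io" -eq 0 ]; then
    echo primary
  elif [ "$secondary_io" -gt 0 ] && [ "$primary_io" -eq 0 ]; then
    echo secondary
  else
    # Traffic on both sides or on neither: stop and investigate.
    echo ambiguous
    return 1
  fi
}

# Example with made-up figures: traffic only on the Primary ports.
serving_side 1500 0
```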
4. GAD Status
Log in to the CCI management terminal and check the current status of GAD by executing the following command:
pairdisplay -g ORA_GAD -fxce -IH100
The P-VOL on the Primary storage is in the SMPL state, and the S-VOL on the Secondary storage is in the PSUE state.
SMPL: The volume is not paired.
PSUE: The pair was suspended due to a failure.
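With eight pairs in the group, it can help to tally the states instead of reading them line by line. The following is a minimal sketch, assuming the pairdisplay output is piped into it and that each status appears as its own whitespace-separated token. The function name summarize_pair_states is hypothetical, and the two sample lines are simplified placeholders, not real pairdisplay output.

```shell
#!/bin/sh
# Hypothetical helper: count pair status keywords found on stdin.
summarize_pair_states() {
  awk '
    {
      for (i = 1; i <= NF; i++) {
        tok = $i
        sub(/,$/, "", tok)   # tolerate a trailing comma on a token
        if (tok ~ /^(SMPL|PAIR|COPY|PSUS|PSUE|SSUS|SSWS)$/) count[tok]++
      }
    }
    END {
      for (s in count) printf "%s=%d\n", s, count[s]
    }
  ' | sort
}

# Simplified placeholder lines (not real pairdisplay output).
printf '%s\n' \
  'ORA_GAD ORA_GAD_00 P-VOL SMPL' \
  'ORA_GAD ORA_GAD_00 S-VOL PSUE' | summarize_pair_states
```

In practice the input would come from the pairdisplay command shown above, for example by piping its output into summarize_pair_states.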
Troubleshooting
Based on the analysis above, the links are healthy, Primary is confirmed to be the storage system serving hosts, and the GAD environment is currently up and running,
so the conditions for restoring GAD active-active operation are met.
1. Forcibly delete the pairs on the Secondary side.
Delete ORA_GAD_00 through ORA_GAD_07 in turn; for ORA_GAD_00 the command is:
pairsplit -g ORA_GAD -d ORA_GAD_00 -RF -IH200
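Repeating the command for all eight pairs can be sketched as a loop. The version below is a dry run under stated assumptions: it only prints the pairsplit commands for ORA_GAD_00 through ORA_GAD_07 so they can be reviewed before anything is executed. The function name gen_force_delete_cmds is hypothetical; the command line itself is copied from the step above.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the forced-delete command
# for each pair <group>_00 .. <group>_07 on the given CCI instance.
gen_force_delete_cmds() {
  group="$1"
  inst="$2"
  i=0
  while [ "$i" -le 7 ]; do
    printf 'pairsplit -g %s -d %s_%02d -RF -IH%s\n' \
      "$group" "$group" "$i" "$inst"
    i=$((i + 1))
  done
}

# Review the generated commands first.
gen_force_delete_cmds ORA_GAD 200
```

Once the printed list has been checked against the Secondary side, the lines can be executed one at a time on the CCI management terminal.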
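2. Check the P-VOL and S-VOL status
The P-VOL LDEV is displayed normally and the S-VOL VIR_LDEV is ffff (no virtual LDEV ID is assigned, which is expected after a forced delete); both volumes are in the expected state.
3. Check the pair status
Both the P-VOL and the S-VOL are now in the SMPL state.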
4. Re-create the pairs to restore active-active operation
paircreate -g ORA_GAD -f never -vl -jq 0 -IH100
Note: The paircreate command must be executed against the Primary storage, i.e. the storage system that is currently serving hosts.
Running it from the other side synchronizes in the wrong direction and causes data loss.
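Because -vl makes the volume of the local CCI instance the P-VOL, the instance number decides the copy direction. A guard along the following lines can make that check explicit. This is a sketch only: the function name build_paircreate_cmd and its arguments are hypothetical, the serving instance must be determined beforehand from the port traffic, and the function prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical guard: emit the paircreate command only if the target
# CCI instance is the one in front of the array serving hosts.
build_paircreate_cmd() {
  serving_inst="$1"   # instance of the array that carries host I/O
  target_inst="$2"    # instance the command is about to be run against
  group="$3"
  if [ "$serving_inst" != "$target_inst" ]; then
    echo "refusing: instance $target_inst is not the side serving hosts" >&2
    return 1
  fi
  printf 'paircreate -g %s -f never -vl -jq 0 -IH%s\n' "$group" "$target_inst"
}

# Primary (instance 100) serves hosts, so the command is printed.
build_paircreate_cmd 100 100 ORA_GAD
```

Calling it with mismatched instances, for example build_paircreate_cmd 100 200 ORA_GAD, prints a refusal on stderr and returns a non-zero status instead of producing a command.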
Check the status:
The output shows that data synchronization has started and that the pair status returns to the normal PAIR state.
5. Check the host links
After synchronization completes, rescan the storage on the hosts and verify that the number of paths has doubled.
Lessons Learned
When a GAD failure occurs, do not operate blindly. First clarify the current network architecture and confirm which storage system is serving hosts.
Establish the current pair state, and be careful whenever a resynchronization or rebuild is required:
never synchronize in the reverse direction, or data will be corrupted and lost.