General Database Error Recovery Process

The general process to recover from a database inconsistency is:

  • Determine the type of database inconsistency and its severity

  • Collect all logs and data related to the inconsistency

  • Report the database inconsistency to DirX Directory support, supplying the collected logs and data

  • Choose a recovery strategy according to the type and severity of the inconsistency

  • Preserve all data related to the inconsistency

The next sections describe these steps in more detail.

Determining the Type of Database Inconsistency

When a database inconsistency is detected, the first step is to determine its type:

  • If the database inconsistency is detected by the dbamverify command, its type can usually be determined by examining the dbamverify command-line output and the generated log files. The chapter “Error-Specific Database Recovery Procedures” describes how to analyze the dbamverify output for the main types of database inconsistencies that can occur and how to recover from these errors.

  • If the inconsistency is detected via DSA malfunction, it can be difficult to determine the type of database inconsistency that has caused the DSA problem. The chapter “General Methods for Database Recovery” describes the main types of recovery procedures you can use when you can’t determine the type of database inconsistency.

Determining the Database Inconsistency Severity

The second step is to determine the severity of the inconsistency:

  • Check whether a consistent database exists in the system: a supplier if the inconsistency is on a consumer, or a consumer if the inconsistency is on the supplier.

  • Collect information about the available backups: check whether a recent LDIF dump or a verified backup archive is available and when it was created.

These checks are important because the availability of a consistent database and verified backups determines the extent of the data loss during the recovery.

Collecting Data for Analysis

The files listed below should be collected and provided with all database inconsistency reports to DirX Directory support:

  • dbamverify

      • $DIRX_INST_PATH/tools/log/LOG<dbamverify_PID>.<sequence_number> of the erroneous dbamverify execution, decoded with dirxdumplog

      • The information written to the stdout and stderr output of dbamverify

  • DSA

      • DSA_EXC<DSA_PID>.<sequence_number> (on Linux) and DSA_LOG<DSA_PID>.<sequence_number> written between the last error-free backup and the detection of the inconsistency

      • DSA_LOG should be decoded using the dirxdumplog tool

      • $DIRX_INST_PATH/server/log/fatalDSA_<PID>_<helper_number>

      • $DIRX_INST_PATH/server/log/schema<DSA_PID>.txt

  • Watchdog

      • SRV_EXC<Watchdog_PID>.<sequence_number> (on Linux) and SRV_LOG<Watchdog_PID>.<sequence_number>

      • SRV_LOG should be decoded using the dirxdumplog tool

      • $DIRX_INST_PATH/server/log/fatalSRV_<PID>_<helper_number>

  • Audit

      • DSA audit created between the last error-free backup and the detection of the inconsistency

      • If DSA audits are not available, then LDAP audit logs of all the LDAP servers from the same time period

      • All audit files should be decoded with dirxauddecode -v -v

Additional error-specific log files and other output may also need to be provided. See the chapter “Error-Specific Database Recovery Procedures” for details on the files to be provided for the different error types.
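As an illustration of the collection step, the following Python sketch searches an installation directory for files whose names match the prefixes listed above. It is a minimal sketch, not a DirX Directory tool: the function name find_candidate_logs and the PATTERNS table are hypothetical, and because the list above does not state where the DSA_EXC, DSA_LOG, SRV_EXC and SRV_LOG files are stored, the sketch assumes they can be found by searching recursively below the installation path. It does not decode anything; decoding with dirxdumplog and dirxauddecode remains a separate manual step.

```python
from pathlib import Path

# File name patterns derived from the list above. The PID and sequence number
# parts are placeholders in the manual, so they are matched with wildcards.
# ASSUMPTION: files without a stated directory are searched recursively ("**/").
PATTERNS = {
    "dbamverify": ["tools/log/LOG*"],
    "DSA": [
        "**/DSA_EXC*",
        "**/DSA_LOG*",
        "server/log/fatalDSA_*",
        "server/log/schema*.txt",
    ],
    "Watchdog": [
        "**/SRV_EXC*",
        "**/SRV_LOG*",
        "server/log/fatalSRV_*",
    ],
}


def find_candidate_logs(inst_path):
    """Return a dict mapping each component to a sorted list of matching files."""
    root = Path(inst_path)
    found = {}
    for component, patterns in PATTERNS.items():
        matches = set()
        for pattern in patterns:
            # Keep regular files only; directories with matching names are skipped.
            matches.update(p for p in root.glob(pattern) if p.is_file())
        found[component] = sorted(matches)
    return found
```

The result is only a candidate list. You still need to narrow it to the files written between the last error-free backup and the detection of the inconsistency, and to the dbamverify execution that reported the error.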

Reporting the Inconsistency to DirX Directory Support

As database inconsistencies are usually caused by a software problem in the DSA or some external circumstances, we recommend opening a ticket as soon as possible to report a database inconsistency. Repairing a database inconsistency with one of the methods described here without fixing the root cause of the problem means that sooner or later, the database may become inconsistent again.

Provide at least the following information in all tickets opened for database inconsistencies:

  • The DirX Directory version (the output of the dirxdsa -V command)

  • The platform in use:

      • Windows or Linux

      • Virtual or physical machine

      • Virtualization platform

  • The output of the dirxadm sob show all and dirxadm lob show all operations executed on the supplier node

  • The node(s) on which the inconsistency was observed and the node(s) that are error free

  • The type and age of the available backups

  • When the dirxbackup archive files were verified and the dbamverify options that were used for verification

  • The log and data files you collected in the previous step
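The checklist above can be kept as a small record that reports which items are still missing before the ticket is opened. The following Python sketch is only an illustration: the class name InconsistencyTicket and all field names are hypothetical and do not correspond to any DirX Directory interface. It assumes that the virtualization platform is required only for virtual machines and that the list of error-free nodes may legitimately be empty.

```python
from dataclasses import dataclass, field, fields


@dataclass
class InconsistencyTicket:
    """Hypothetical checklist of the information requested for a ticket."""
    dirx_version: str = ""              # output of "dirxdsa -V"
    operating_system: str = ""          # "Windows" or "Linux"
    machine_type: str = ""              # "virtual" or "physical"
    virtualization_platform: str = ""   # assumed optional for physical machines
    sob_show_all: str = ""              # output of "dirxadm sob show all" on the supplier
    lob_show_all: str = ""              # output of "dirxadm lob show all" on the supplier
    affected_nodes: list = field(default_factory=list)
    error_free_nodes: list = field(default_factory=list)
    backup_types_and_ages: str = ""
    backup_verification: str = ""       # when archives were verified, dbamverify options used
    collected_files: list = field(default_factory=list)

    def missing(self):
        """Return the names of the required fields that are still empty."""
        optional = {"error_free_nodes"}  # every node may be inconsistent
        if self.machine_type.lower() == "physical":
            optional.add("virtualization_platform")
        return [
            f.name
            for f in fields(self)
            if f.name not in optional and not getattr(self, f.name)
        ]
```

For example, InconsistencyTicket(machine_type="physical").missing() lists every required item except the virtualization platform, which makes gaps visible before the report is sent.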

Choosing a Database Recovery Strategy

After sending a report to DirX Directory support about the problem, the next step is to determine the recovery procedure to use.

If you’ve determined the type of inconsistency and there is a specific recovery procedure for it (see the chapter “Error-Specific Database Recovery Procedures”), follow that procedure.

If you have not been able to determine the type of inconsistency, use the following guidelines to determine the best recovery strategy for your scenario:

  • Is the problem causing a severe system outage? A solution to a database inconsistency is highly dependent on what type of problem it is. Even if all nodes in a system report an inconsistent state, a database problem can sometimes be solved without having to restore the database from a backup. Thus, if the problem does not cause a severe system outage, consider contacting DirX Directory support and waiting for a possible solution that does not involve data loss.

  • Is there a consistent database in the system? A consistent database on one of the nodes in the system allows you to use the “Restore by Initiating a Total Update” method described in the chapter “General Methods for Database Recovery” and minimize or avoid data loss.

  • Is there a recent LDIF dump or a verified backup archive? When the system has no consistent database to use for recovery, a backup must be used instead. If an LDIF dump exists, you can use the “Restore Using an LDIF Dump” method described in the chapter “General Methods for Database Recovery” to recover the database. If a verified dirxbackup archive file exists, you can use the “Restore Using a Binary Backup” method described in the same chapter. When both are available, we generally recommend the “Restore Using an LDIF Dump” method because it completely rebuilds the internal structure of the database.

If the system does not have a node with a consistent database and there is no LDIF dump or verified backup archive of the database to use for recovery, the database may be lost.
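The guidelines in this section can be read as an ordered series of checks. The following Python sketch encodes one possible reading of that order; it is an illustration under stated assumptions, not an official decision procedure. The function name, the parameter names and the strict priority order (outage check first, then consistent node, then LDIF dump, then binary backup) are assumptions made for this example. The returned strings reuse the method names from the chapter “General Methods for Database Recovery”.

```python
def choose_recovery_strategy(
    type_known,
    specific_procedure_exists,
    severe_outage,
    consistent_node_exists,
    ldif_dump_available,
    verified_backup_available,
):
    """Return a suggested recovery strategy as text (illustrative only)."""
    # A known error type with a documented procedure takes precedence.
    if type_known and specific_procedure_exists:
        return "Follow the error-specific recovery procedure"
    # Without a severe outage, waiting for support may avoid data loss.
    if not severe_outage:
        return "Contact DirX Directory support and wait for a possible solution"
    # A consistent node minimizes or avoids data loss.
    if consistent_node_exists:
        return "Restore by Initiating a Total Update"
    # The LDIF dump is preferred because it rebuilds the internal structure.
    if ldif_dump_available:
        return "Restore Using an LDIF Dump"
    if verified_backup_available:
        return "Restore Using a Binary Backup"
    # No consistent node and no usable backup.
    return "No recovery source available; the database may be lost"
```

In practice the decision also depends on the age of the backups and on advice from DirX Directory support, which a simple function like this cannot capture.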

Preserving Relevant Data

The section “Collecting Data for Analysis” describes the log files that are usually necessary for investigating a database inconsistency. However, more log files may be necessary in a later phase of the investigation. To make sure that all the data is available for investigating the inconsistency, we recommend backing up all the log files, audit files, files in the tmp directory, and database backups. The backup should contain at least the files modified between the creation of the last error-free database and the detection of the inconsistency.

In some cases, the root cause of the inconsistency originates from an earlier problem or operation, and it may become necessary to obtain log files that were created before the last consistent backup; for example, the log of the purge operations of the last month in the case of a tree inconsistency. Thus, if an automatic cleanup mechanism is active, pause it for the duration of the analysis to preserve all the log files that may become necessary.
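One way to preserve the files modified in the relevant time window is to copy them into a single archive. The following Python sketch shows the idea using only the standard library. The function name archive_modified_files is hypothetical, and the directories to pass in (for example the log, audit and tmp directories of your installation) are site-specific assumptions that this section does not spell out. The sketch selects files by modification time only, so files created earlier and not modified since are not included.

```python
import os
import tarfile
from datetime import datetime


def archive_modified_files(directories, start, end, archive_path):
    """Archive files modified between start and end (datetime objects).

    Returns the list of archived file paths.
    """
    start_ts, end_ts = start.timestamp(), end.timestamp()
    archive_abs = os.path.abspath(archive_path)
    archived = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for directory in directories:
            for dirpath, _dirnames, filenames in os.walk(directory):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    # Never add the archive to itself.
                    if os.path.abspath(path) == archive_abs:
                        continue
                    if start_ts <= os.path.getmtime(path) <= end_ts:
                        tar.add(path)
                        archived.append(path)
    return archived
```

Because earlier log files may become necessary later in the analysis, choose a start time well before the last error-free backup when storage permits, and write the archive to a location that the automatic cleanup does not touch.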