    ocfs2: race between umount and unfinished remastering during recovery · bba1cb17
    Tariq Saeed authored
    Orabug: 19074140
    When umount is issued during recovery on the new master that has not
    finished remastering locks, it triggers BUG() in
    dlm_send_mig_lockres_msg().  Here is the situation:
     1) node A has a lock on resource X mastered by node B.
     2) node B dies ->  node A sets recovering flag for res X
     3) Node C becomes the new master for resources owned by the
        dead node and is remastering locks of the dead node but
        has not finished the remastering process yet.
     4) umount is issued on node C.
     5) During processing of umount, ignoring unfinished recovery,
        node C attempts to migrate resource X to node A.
     6) node A finds res X in DLM_LOCK_RES_RECOVERING state, considers
        it a logic error and sends back -EFAULT.
     7) node C asserts BUG() upon seeing the -EFAULT response from node A.
    The fix is to delay migrating res X until remastering has finished, at
    which point the recovering flag will have been cleared on both A and C.
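
The idea can be illustrated with a minimal, self-contained sketch. It is not the actual patch: the struct, the flag value, and every sketch_* name below are hypothetical stand-ins, and only the overall shape (check the recovering flag before attempting migration, and defer if it is set) is taken from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag value for this sketch; not the kernel's definition. */
#define SKETCH_RES_RECOVERING 0x1u

/* Hypothetical, stripped-down stand-in for a lock resource. */
struct sketch_lockres {
    unsigned int state;
};

/*
 * Receiver side (node A in step 6): a migration request for a resource
 * that is still marked as recovering is treated as a logic error.
 */
static int sketch_handle_migrate(const struct sketch_lockres *res)
{
    if (res->state & SKETCH_RES_RECOVERING)
        return -EFAULT;
    return 0;
}

/*
 * Sender side (node C): the check described by the fix. A resource that
 * is still being remastered is reported as not migratable for now.
 */
static bool sketch_is_migratable(const struct sketch_lockres *res)
{
    return !(res->state & SKETCH_RES_RECOVERING);
}

/*
 * Umount path on node C. Returns 1 if the resource was migrated and 0 if
 * migration was deferred so the caller can retry after remastering.
 */
static int sketch_umount_migrate(const struct sketch_lockres *local,
                                 const struct sketch_lockres *remote)
{
    int ret;

    if (!sketch_is_migratable(local))
        return 0;

    ret = sketch_handle_migrate(remote);
    /* Stands in for the BUG() that fired on -EFAULT before the fix. */
    assert(ret != -EFAULT);
    (void)ret;
    return 1;
}
```

With the flag set on both nodes, sketch_umount_migrate() defers instead of sending a request that the receiver would reject. Once both flags are cleared, the same call proceeds with the migration.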
    Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
    Cc: Mark Fasheh <mfasheh@suse.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dlmmaster.c 95.4 KB