7/24/2011

07-24-11 - A cond_var that's actually atomic - part 2

Last time I noted that I thought the mutex for the waiter list was not needed and an idea on how to remove it. In fact it is easy to remove in exactly that manner, so I have done so.

First of all, almost everywhere in the cond_var (such as signal_unlock) , we know that the external mutex is held while we mess with the waiter list. So it is protected from races by the external mutex. There is one spot where we have a problem :

The key section is in unlock_wait_lock :


    void unlock_wait_lock(guard & g)
    {
        HANDLE unlock_h;
        
        HANDLE wait_handle = alloc_event();

        {//!
        m_waiter_mutex.lock($);
        
            // get next lock waiter and take my node out of the ownership chain :
            unlock_h = unlock_internal(g,NULL);

            // (*1)

            // after unlock, node is now mine to reuse :
            mcs_node * node = &(g.m_node);
            node->m_next($).store( 0 , std::mo_relaxed );
            node->m_event($).store( wait_handle , std::mo_relaxed );
        
            // put on end of list :
            if ( m_waiters($) == NULL )
            {
                m_waiters($) = node;
            }
            else
            {
                // dumb way of doing FIFO but whatever for this sketch
                mcs_node * parent = m_waiters($);
                while ( parent->m_next($).load(std::mo_relaxed) != NULL )
                    parent = parent->m_next($).load(std::mo_relaxed);
                parent->m_next($).store(node,std::mo_relaxed);
            }

            // (*2)         

        m_waiter_mutex.unlock($);
        }//!

        if ( unlock_h )
        {
            //SetEvent(unlock_h);
            SignalObjectAndWait(unlock_h,wait_handle,INFINITE,FALSE);
        }
        else
        {   
            WaitForSingleObject(wait_handle,INFINITE);
        }
        
        // I am woken and now own the lock
        // my node has been filled with the guy I should unlock
        
        free_event(wait_handle);
        
        RL_ASSERT( g.m_node.m_event($).load(std::mo_relaxed) == wait_handle );
        g.m_node.m_event($).store(0,std::mo_relaxed);
        
    }

The problem is at (*1) , the mutex may become unlocked. If there was a waiter for the mutex, it is not actually unlocked, we just retreive unlock_h, but we haven't set the event yet, so ownership is not yet transfered. The problem is if there was no waiter, then we set tail to NULL and someone else can jump in here, and do a signal, but we aren't in the waiter list yet, so we miss it. The waiter mutex fixes this.

Your first idea might be - move the unlock() call after the waiter list maintenance. But, you can't do that because my mcs_node in the guard "g" which I have on the stack is being used in the lock chain, and I wish to repurpose it to use it in the waiter list.

So, one simple solution would be just to have the stack guard hold two nodes. One for the lock chain and one for the waiter chain. Then we don't have to repurpose the node, and we can do the unlock after building the waiter list (move the unlock down to *2). That is a perfectly acceptible and easy solution. It does make your stack object twice and big, but it's still small (a node is two pointers, so it would be 4 pointers instead). (you might also need a flag to tell unlock() whether "me" is the lock node or the wait node).

But we can do it without the extra node, using the dummy owner idea I posted last time :


        {//!
            mcs_node dummy;

            // remove our node from the chain, but don't free the mutex
            //  if no waiter, transfer ownership to dummy       
            unlock_h = unlock_internal(&(g.m_node),&dummy);

            // after unlock, stack node is now mine to reuse :
            mcs_node * node = &(g.m_node);
            node->m_next($).store( 0 , std::mo_relaxed );
            node->m_event($).store( wait_handle , std::mo_relaxed );
        
            // put on end of list :
            if ( m_waiters($) == NULL )
            {
                m_waiters($) = node;
            }
            else
            {
                // dumb way of doing FIFO but whatever for this sketch
                mcs_node * parent = m_waiters($);
                while ( parent->m_next($).load(std::mo_relaxed) != NULL )
                    parent = parent->m_next($).load(std::mo_relaxed);
                parent->m_next($).store(node,std::mo_relaxed);
            }
            
            if ( unlock_h == 0 )
            {
                // ownership was transfered to dummy, now give it to the
                //  successor to dummy in case someone set dummy->next
                unlock_h = unlock_internal(&dummy,NULL);                
            }
        }//!

(parts outside of ! are the same).

Anyway, full code is here :

at pastebin

.. and that completes the proof of concept.

(BTW don't be a jackass and tell me the FIFO walk to the tail is horrible. Yeah, I know, obviously you should keep a tail pointer, but for purposes of this sketch it's irrelevant.)

No comments:

old rants