C#多线程编程中的锁系统（四）：自旋锁

程序员文章站 2023-11-17 19:06:46

目录一：基础二：自旋锁示例三：spinlock 四：继续spinlock 五：总结一：基础内核锁：基于内核对象构造的锁机制，就是通常说的内核构造模式...

目录
一：基础

二：自旋锁示例

三：spinlock

四：继续spinlock

五：总结

一：基础

内核锁：基于内核对象构造的锁机制，就是通常说的内核构造模式。

优点：cpu利用最大化。它发现资源被锁住，请求就排队等候。线程切换到别处干活，直到接受到可用信号，线程再切回来继续处理请求。

缺点：托管代码->用户模式代码->内核代码损耗、线程上下文切换损耗。

在锁的时间比较短时，系统频繁忙于休眠、切换，是个很大的性能损耗。

自旋锁：原子操作+自循环。通常说的用户构造模式。线程不休眠，一直循环尝试对资源访问，直到可用。

优点：完美解决内核锁的缺点。

缺点：长时间一直循环会导致cpu的白白浪费，高并发竞争下、cpu的消耗特别严重。

混合锁：内核锁+自旋锁。混合锁是先自旋锁一段时间或自旋多少次，再转成内核锁。

优点：内核锁和自旋锁的折中方案，利用前二者优点，避免出现极端情况（自旋时间过长，内核锁时间过短）。

缺点：自旋多少时间、自旋多少次，这些策略很难把控。

ps：操作系统或net框架，这块算法策略做的已经非常优了，有些api函数也提供了时间及次数可配置项，让开发者根据需求自行判断。

二:自旋锁示例

来看下我们自己简单实现的自旋锁：

复制代码代码如下:

int signal = 0;

            var li = new list<int>();

            parallel.for(0, 1000 * 10000, r =>

            {

                while (interlocked.exchange(ref signal, 1) != 0)//加自旋锁

                {

                    //黑魔法

                }

                li.add(r);

                interlocked.exchange(ref signal, 0);  //释放锁

            });

            console.writeline(li.count);

            //输出：10000000

上面就是自旋锁：interlocked.exchange+while

1：定义signal 0可用，1不可用。

2：parallel模拟并发竞争，原子更改signal状态。后续线程自旋访问signal，是否可用。

3：a线程使用完后，更改signal为0。剩余线程竞争访问资源，b线程胜利后，更改signal为1，失败线程继续自旋，直到可用。

三：spinlock

spinlock是net4.0后系统帮我们实现的自旋锁，内部做了优化。

简单看下实例：

复制代码代码如下:

 var li = new list<int>();

            var sl = new spinlock();

            parallel.for(0, 1000 * 10000, r =>

            {

                bool gotlock = false;     //释放成功

                sl.enter(ref gotlock);    //进入锁

                li.add(r);

                if (gotlock) sl.exit();  //释放

            });

            console.writeline(li.count);

            //输出：10000000

四：继续spinlock

new spinlock(false) 这个构造函数主要用来帮我们检查死锁用，true是开启。

开启状态下，如果发生死锁会直接抛异常的。

贴了一部分源码(已折叠)，我们来看下：

复制代码代码如下:

public void enter(ref bool locktaken)
        {
            if (locktaken)
            {
                locktaken = false;
                throw new system.argumentexception(environment.getresourcestring("spinlock_tryreliableenter_argumentexception"));
            }

            // fast path to acquire the lock if the lock is released
            // if the thread tracking enabled set the new owner to the current thread id
            // id not, set the anonymous bit lock
            int observedowner = m_owner;
            int newowner = 0;
            bool threadtrackingenabled = (m_owner & lock_id_disable_mask) == 0;
            if (threadtrackingenabled)
            {
                if (observedowner == lock_unowned)
                    newowner = thread.currentthread.managedthreadid;
            }
            else if ((observedowner & lock_anonymous_owned) == lock_unowned)
            {
                newowner = observedowner | lock_anonymous_owned; // set the lock bit
            }
            if (newowner != 0)
            {
#if !feature_coreclr
                thread.begincriticalregion();
#endif

#if pfx_legacy_3_5
                if (interlocked.compareexchange(ref m_owner, newowner, observedowner) == observedowner)
                {
                    locktaken = true;
                    return;
                }
#else
                if (interlocked.compareexchange(ref m_owner, newowner, observedowner, ref locktaken) == observedowner)
                {
                    // fast path succeeded
                    return;
                }
#endif
#if !feature_coreclr
                thread.endcriticalregion();
#endif
            }
            //fast path failed, try slow path
            continuetryenter(timeout.infinite, ref locktaken);
        }
private void continuetryenter(int millisecondstimeout, ref bool locktaken)
        {
            long startticks = 0;
            if (millisecondstimeout != timeout.infinite && millisecondstimeout != 0)
            {
                startticks = datetime.utcnow.ticks;
            }

#if !feature_pal && !feature_coreclr   // pal doesn't support eventing, and we don't compile cds providers for coreclr
            if (cdssyncetwbclprovider.log.isenabled())
            {
                cdssyncetwbclprovider.log.spinlock_fastpathfailed(m_owner);
            }
#endif

            if (isthreadownertrackingenabled)
            {
                // slow path for enabled thread tracking mode
                continuetryenterwiththreadtracking(millisecondstimeout, startticks, ref locktaken);
                return;
            }

            // then thread tracking is disabled
            // in this case there are three ways to acquire the lock
            // 1- the first way the thread either tries to get the lock if it's free or updates the waiters, if the turn >= the processors count then go to 3 else go to 2
            // 2- in this step the waiter threads spins and tries to acquire the lock, the number of spin iterations and spin count is dependent on the thread turn
            // the late the thread arrives the more it spins and less frequent it check the lock avilability
            // also the spins count is increaes each iteration
            // if the spins iterations finished and failed to acquire the lock, go to step 3
            // 3- this is the yielding step, there are two ways of yielding thread.yield and sleep(1)
            // if the timeout is expired in after step 1, we need to decrement the waiters count before returning

            int observedowner;

            //***step 1, take the lock or update the waiters

            // try to acquire the lock directly if possoble or update the waiters count
            spinwait spinner = new spinwait();
            while (true)
            {
                observedowner = m_owner;
                if ((observedowner & lock_anonymous_owned) == lock_unowned)
                {
#if !feature_coreclr
                    thread.begincriticalregion();
#endif

#if pfx_legacy_3_5
                    if (interlocked.compareexchange(ref m_owner, observedowner | 1, observedowner) == observedowner)
                    {
                        locktaken = true;
                        return;
                    }
#else
                    if (interlocked.compareexchange(ref m_owner, observedowner | 1, observedowner, ref locktaken) == observedowner)
                    {
                        return;
                    }
#endif

#if !feature_coreclr
                    thread.endcriticalregion();
#endif
                }
                else //failed to acquire the lock,then try to update the waiters. if the waiters count reached the maximum, jsut break the loop to avoid overflow
                    if ((observedowner & waiters_mask) == maximum_waiters || interlocked.compareexchange(ref m_owner, observedowner + 2, observedowner) == observedowner)
                        break;

                spinner.spinonce();
            }

            // check the timeout.
            if (millisecondstimeout == 0 ||
                (millisecondstimeout != timeout.infinite &&
                timeoutexpired(startticks, millisecondstimeout)))
            {
                decrementwaiters();
                return;
            }

            //***step 2. spinning
            //lock acquired failed and waiters updated
            int turn = ((observedowner + 2) & waiters_mask) / 2;
            int processorcount = platformhelper.processorcount;
            if (turn < processorcount)
            {
                int processfactor = 1;
                for (int i = 1; i <= turn * spinning_factor; i++)
                {
                    thread.spinwait((turn + i) * spinning_factor * processfactor);
                    if (processfactor < processorcount)
                        processfactor++;
                    observedowner = m_owner;
                    if ((observedowner & lock_anonymous_owned) == lock_unowned)
                    {
#if !feature_coreclr
                        thread.begincriticalregion();
#endif

                        int newowner = (observedowner & waiters_mask) == 0 ? // gets the number of waiters, if zero
                            observedowner | 1 // don't decrement it. just set the lock bit, it is zzero because a previous call of exit(false) ehich corrupted the waiters
                            : (observedowner - 2) | 1; // otherwise decrement the waiters and set the lock bit
                        contract.assert((newowner & waiters_mask) >= 0);
#if pfx_legacy_3_5
                        if (interlocked.compareexchange(ref m_owner, newowner, observedowner) == observedowner)
                        {
                            locktaken = true;
                            return;
                        }
#else
                        if (interlocked.compareexchange(ref m_owner, newowner, observedowner, ref locktaken) == observedowner)
                        {
                            return;
                        }
#endif

#if !feature_coreclr
                        thread.endcriticalregion();
#endif
                    }
                }
            }

            // check the timeout.
            if (millisecondstimeout != timeout.infinite && timeoutexpired(startticks, millisecondstimeout))
            {
                decrementwaiters();
                return;
            }

            //*** step 3, yielding
            //sleep(1) every 50 yields
            int yieldsofar = 0;
            while (true)
            {
                observedowner = m_owner;
                if ((observedowner & lock_anonymous_owned) == lock_unowned)
                {
#if !feature_coreclr
                    thread.begincriticalregion();
#endif
                    int newowner = (observedowner & waiters_mask) == 0 ? // gets the number of waiters, if zero
                           observedowner | 1 // don't decrement it. just set the lock bit, it is zzero because a previous call of exit(false) ehich corrupted the waiters
                           : (observedowner - 2) | 1; // otherwise decrement the waiters and set the lock bit
                    contract.assert((newowner & waiters_mask) >= 0);
#if pfx_legacy_3_5
                    if (interlocked.compareexchange(ref m_owner, newowner, observedowner) == observedowner)
                    {
                        locktaken = true;
                        return;
                    }
#else
                    if (interlocked.compareexchange(ref m_owner, newowner, observedowner, ref locktaken) == observedowner)
                    {
                        return;
                    }
#endif

#if !feature_coreclr
                    thread.endcriticalregion();
#endif
                }

                if (yieldsofar % sleep_one_frequency == 0)
                {
                    thread.sleep(1);
                }
                else if (yieldsofar % sleep_zero_frequency == 0)
                {
                    thread.sleep(0);
                }
                else
                {
#if pfx_legacy_3_5
                    platform.yield();
#else
                    thread.yield();
#endif
                }

                if (yieldsofar % timeout_check_frequency == 0)
                {
                    //check the timeout.
                    if (millisecondstimeout != timeout.infinite && timeoutexpired(startticks, millisecondstimeout))
                    {
                        decrementwaiters();
                        return;
                    }
                }

                yieldsofar++;
            }
        }

        /// <summary>
        /// decrements the waiters, in case of the timeout is expired
        /// </summary>
        private void decrementwaiters()
        {
            spinwait spinner = new spinwait();
            while (true)
            {
                int observedowner = m_owner;
                if ((observedowner & waiters_mask) == 0) return; // don't decrement the waiters if it's corrupted by previous call of exit(false)
                if (interlocked.compareexchange(ref m_owner, observedowner - 2, observedowner) == observedowner)
                {
                    contract.assert(!isthreadownertrackingenabled); // make sure the waiters never be negative which will cause the thread tracking bit to be flipped
                    break;
                }
                spinner.spinonce();
            }

        }

从代码中发现spinlock并不是我们简单的实现那样一直自旋，其内部做了很多优化。

1：内部使用了interlocked.compareexchange保持原子操作， m_owner 0可用，1不可用。

2：第一次获得锁失败后，继续调用continuetryenter，continuetryenter有三种获得锁的情况。

3：continuetryenter函数第一种获得锁的方式。使用了while+spinwait，后续再讲。

4：第一种方式达到最大等待者数量后，命中走第二种。继续自旋 turn * 100次。100这个值是处理器核数(4, 8 ,16)下最好的。

5：第二种如果还不能获得锁，走第三种。这种就有点混合构造的意味了，如下：

复制代码代码如下:

if (yieldsofar % 40 == 0) 

                    thread.sleep(1);

                else if (yieldsofar % 10 == 0)

                    thread.sleep(0);

                else

                    thread.yield();

thread.sleep(1) ：终止当前线程，放弃剩下时间片休眠1毫秒。退出跟其他线程抢占cpu。当然这个一般会更多，系统无法保证这么细的时间粒度。

thread.sleep(0)：终止当前线程，放弃剩下时间片。但立马还会跟其他线程抢cpu，能不能抢到跟线程优先级有关。

thread.yeild()：结束当前线程。让出cpu给其他准备好的线程。其他线程ok后或没有准备好的线程，继续执行。跟优先级无关。

thread.yeild()还会返回个bool值，是否让出成功。

从源码中，我们可以学到不少编程技巧。比如我们也可以使用自旋+thread.yeild() 或 while+thread.yeild() 等组合。

五：总结

本章谈了自旋锁的基础+楼主的经验。 spinlock类源码这块，只粗浅理解了下，并没有深究。

测了下spinlock和自己实现的自旋锁性能对比（并行添加1000w list<int>()），spinlock是单纯的自旋锁性能2倍以上。

还测了下lock的性能，是系统spinlock性能的3倍以上。可见lock内部自旋的效率更高，clr暂没开源，所以看不到clr具体实现的代码。

上一篇：理解C#中的Lambda表达式

下一篇： C#中的高阶函数介绍

C#多线程编程中的锁系统（四）：自旋锁

C#多线程编程中的锁系统（三）

C#多线程编程中的锁系统基本用法