Understanding yield from Several Angles: Usage, Internals, and Its Role in Functional-Style C#

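If you have read the Dapper source, you will have noticed that many of its internal methods use the yield keyword. So what is yield actually for? Can it be removed, and how much difference does removing it make? Let's start with a trimmed-down version of Dapper's Query method so you can see it for yourself.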


private static IEnumerable<T> QueryImpl<T>(this IDbConnection cnn, CommandDefinition command, Type effectiveType)
{
    object param = command.Parameters;
    var identity = new Identity(command.CommandText, command.CommandType, cnn, effectiveType, param?.GetType());
    var info = GetCacheInfo(identity, param, command.AddToCache);

    IDbCommand cmd = null;
    IDataReader reader = null;

    bool wasClosed = cnn.State == ConnectionState.Closed;
    try
    {
        // ... (trimmed in this excerpt: command setup, reader creation,
        //      and the lookups that produce func and convertToType)

        while (reader.Read())
        {
            object val = func(reader);
            if (val == null || val is T)
            {
                yield return (T)val;
            }
            else
            {
                yield return (T)Convert.ChangeType(val, convertToType, CultureInfo.InvariantCulture);
            }
        }
    }
    finally
    {
        // ... (trimmed in this excerpt: reader, command, and connection cleanup)
    }
}
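To get a feel for why a reader loop like this is written with yield, here is a minimal sketch. It is not Dapper code: ReadRows, FirstTwo, and Log are hypothetical names, and a TextReader stands in for the data reader. It illustrates two things: rows are produced one at a time, and a finally block inside an iterator still runs when the caller stops early.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class StreamingDemo
{
    // Records events so the order of execution can be inspected afterwards.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical stand-in for a data reader loop: yields one line at a time.
    public static IEnumerable<string> ReadRows(TextReader reader)
    {
        try
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                Log.Add("read " + line);
                yield return line;
            }
        }
        finally
        {
            // Runs when the enumerator is disposed, even if the caller stops early.
            Log.Add("cleanup");
        }
    }

    // Consumes only the first two rows, then leaves the loop.
    public static List<string> FirstTwo(string text)
    {
        var seen = new List<string>();
        foreach (var row in ReadRows(new StringReader(text)))
        {
            seen.Add(row);
            if (seen.Count == 2) break; // foreach disposes the enumerator here
        }
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", FirstTwo("a\nb\nc"))); // a,b
        Console.WriteLine(string.Join(" | ", Log));               // read a | read b | cleanup
    }
}
```

The third line is never read, because the iterator only advances when the caller asks for the next element.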


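I: Exploring yield
1. Guessing from the skeleton
The skeleton is simple: the method returns IEnumerable<T>, and every return is prefixed with yield. The puzzling part is that although the return type is IEnumerable<T>, nothing in the method body implements that interface. So the first impression is that yield is doing something special: since the code runs, the compiler must be generating a class that implements IEnumerable<T> for you behind the scenes.
2. What MSDN says
A guess is not enough, so let's check the authoritative source, the MSDN documentation: https://docs.microsoft.com/zh-cn/dotnet/csharp/language-reference/keywords/yield
When you use the yield contextual keyword in a statement, you indicate that the method, operator, or get accessor in which it appears is an iterator. Using yield to define an iterator removes the need for an explicit extra class (the class that holds the state for an enumeration, see IEnumerator for an example) when you implement the IEnumerable and IEnumerator pattern for a custom collection type.
If you have never used yield, this sentence is probably baffling. Only after running into the problem in real business code do you appreciate how satisfying yield is.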
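To make the documentation's wording concrete, here is a minimal sketch of an iterator method and of roughly what foreach does with it. CountTo and SumManually are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorBasics
{
    // An iterator method: no collection class is written by hand.
    public static IEnumerable<int> CountTo(int max)
    {
        for (int i = 1; i <= max; i++)
        {
            yield return i;
        }
    }

    // Roughly what a foreach loop expands to: get an enumerator,
    // call MoveNext until it returns false, read Current each time.
    public static int SumManually(IEnumerable<int> source)
    {
        int sum = 0;
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            while (e.MoveNext())
            {
                sum += e.Current;
            }
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumManually(IteratorBasics.CountTo(4))); // 10
    }
}
```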
3. Starting from the IL
To make the internals easy to explore, here is about the simplest example possible.

 public static void Main(string[] args)
 {
     var list = GetList(new int[] { 1, 2, 3, 4, 5 });
 }

 public static IEnumerable<int> GetList(int[] nums)
 {
     foreach (var num in nums)
     {
         yield return num;
     }
 }


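Yes, it really is that simple. Next, let's decompile it with ILSpy and lift the veil.

Two things in the decompiled output stand out.
<1> An extra class called d__1 appears out of nowhere
Curiosity made me look at what this class contains (its full compiler-generated name is <GetList>d__1). Since there is a lot of IL, I have trimmed it. The IL below shows that the class does implement IEnumerable<int>, and IEnumerator<int> as well. If you know the Iterator design pattern, MoveNext and Current should look very familiar.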

.class nested private auto ansi sealed beforefieldinit '<GetList>d__1'
	extends [mscorlib]System.Object
	implements class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>,
	    [mscorlib]System.Collections.IEnumerable,
	    class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>,
	    [mscorlib]System.IDisposable,
	    [mscorlib]System.Collections.IEnumerator
{
	.method private final hidebysig newslot virtual 
		instance bool MoveNext () cil managed 
	{
		...
	} // end of method '<GetList>d__1'::MoveNext

	.method private final hidebysig specialname newslot virtual 
		instance int32 'System.Collections.Generic.IEnumerator<System.Int32>.get_Current' () cil managed 
	{
		...
	} // end of method '<GetList>d__1'::'System.Collections.Generic.IEnumerator<System.Int32>.get_Current'

	.method private final hidebysig specialname newslot virtual 
		instance object System.Collections.IEnumerator.get_Current () cil managed 
	{
		...
	} // end of method '<GetList>d__1'::System.Collections.IEnumerator.get_Current

	.method private final hidebysig newslot virtual 
		instance class [mscorlib]System.Collections.Generic.IEnumerator`1<int32> 'System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator' () cil managed 
	{
		...
	} // end of method '<GetList>d__1'::'System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator'

} // end of class <GetList>d__1


<2> What does the body of GetList turn into?

.method public hidebysig static 
	class [mscorlib]System.Collections.Generic.IEnumerable`1<int32> GetList (
		int32[] nums
	) cil managed 
{
	// (no C# code)
	IL_0000: ldc.i4.s -2
	IL_0002: newobj instance void ConsoleApp2.Program/'<GetList>d__1'::.ctor(int32)
	IL_0007: dup
	IL_0008: ldarg.0
	IL_0009: stfld int32[] ConsoleApp2.Program/'<GetList>d__1'::'<>3__nums'
	IL_000e: ret
} // end of method Program::GetList


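As you can see, the method creates a new instance of the generated d__1 class, passing -2 to the constructor as the initial state, then stores the nums argument in the <>3__nums field (ldarg.0 followed by stfld), and finally returns that iterator object. None of the original loop body runs here. This explains why GetList can declare IEnumerable<int> as its return type without you ever writing a class that implements it.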
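One observable consequence of this IL is that calling the method only constructs the iterator object; the body runs later, during enumeration. The following sketch reuses GetList from the example above and passes null to make the timing visible. Probe is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredDemo
{
    public static IEnumerable<int> GetList(int[] nums)
    {
        foreach (var num in nums) // dereferences nums only once enumeration starts
        {
            yield return num;
        }
    }

    // Reports where the failure caused by a null argument actually surfaces.
    public static string Probe()
    {
        IEnumerable<int> seq = GetList(null); // no exception: the body has not run yet
        string result = "call returned";
        try
        {
            foreach (var n in seq) { }
        }
        catch (NullReferenceException)
        {
            result += ", enumeration threw";
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Probe()); // call returned, enumeration threw
    }
}
```

This is also why library code usually validates arguments in a non-iterator wrapper method before handing off to the iterator.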
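4. Translating it back to C#
You may be thinking: what good is all this? I can't read IL, and it would be much nicer to see it as C#. Fortunately, writing it back as C# is not too hard.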
using System;
using System.Collections;
using System.Collections.Generic;

namespace ConsoleApp2
{
    // Hand-written approximation of the compiler-generated state machine.
    class GetListEnumerable : IEnumerable<int>, IEnumerator<int>
    {
        private int state;      // -2: not yet enumerated, 0: start, 1: after a yield, -1: running/finished
        private int current;
        private int threadID;
        public int[] nums;
        public int[] s1_nums;   // copy of the array being iterated
        public int s2;          // current index into s1_nums
        public int num53;       // the foreach loop variable

        public GetListEnumerable(int state)
        {
            this.state = state;
            this.threadID = Environment.CurrentManagedThreadId;
        }

        public int Current => current;

        public IEnumerator<int> GetEnumerator()
        {
            GetListEnumerable rangeEnumerable;

            // Reuse this instance for the first enumeration on the creating thread;
            // otherwise hand out a fresh one.
            if (state == -2 && threadID == Environment.CurrentManagedThreadId)
            {
                state = 0;
                rangeEnumerable = this;
            }
            else
            {
                rangeEnumerable = new GetListEnumerable(0);
            }

            rangeEnumerable.nums = nums;
            return rangeEnumerable;
        }

        public bool MoveNext()
        {
            switch (state)
            {
                case 0:
                    // First call: set up the loop.
                    state = -1;
                    s1_nums = nums;
                    s2 = 0;
                    break;
                case 1:
                    // Resuming after a yield return: advance the index.
                    state = -1;
                    s2++;
                    break;
                default:
                    return false;
            }

            // Checking the bound in both cases also handles an empty array.
            if (s2 < s1_nums.Length)
            {
                num53 = s1_nums[s2];
                current = num53;
                state = 1;
                return true;
            }

            s1_nums = null;
            return false;
        }

        object IEnumerator.Current => Current;
        public void Dispose() { }
        public void Reset() { }
        IEnumerator IEnumerable.GetEnumerator() { return this.GetEnumerator(); }
    }
}


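GetList can now be written another way: simply return new GetListEnumerable(-2) { nums = nums }.

By now I think you should understand yield thoroughly; if not, that is my failure (┬＿┬)…
II: What is yield actually good for?
From my own few years of development experience (I don't want to make myself sound too old (┬＿┬)), there are two benefits.
1. You don't yet know which collection should hold the data
What does that mean? The same set of data can be held in a List, a SortedList, a HashSet, or even a Dictionary. When you define a method you have to commit to a return collection type up front, but is that really the right collection? At that point you don't know. If that is still unclear, here is an example: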
    using System.Collections.Generic;
    using System.Linq;

    public static class Program
    {
        public static void Main(string[] args)
        {
            // In the end I want a HashSet, because I need efficient de-duplication.
            var hashSet1 = new HashSet<int>(GetList(new int[] { 1, 2, 3, 4, 5 }));
            var hashSet2 = new HashSet<int>(GetList2(new int[] { 1, 2, 3, 4, 5 }));
        }

        // The decision to use a List is fixed at coding time.
        public static List<int> GetList(int[] nums)
        {
            return nums.Where(num => num % 2 == 0).ToList();
        }

        // At coding time the target collection is still undecided:
        // it might be a HashSet, a SortedList, who knows?
        public static IEnumerable<int> GetList2(int[] nums)
        {
            foreach (var num in nums)
            {
                if (num % 2 == 0) yield return num;
            }
        }
    }

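Read the comments in the code first. What I really want is a HashSet, and hashSet2 skips an intermediate step compared with hashSet1: no temporary List is allocated and filled only to be copied again, which saves one allocation and one extra pass over the data.
hashSet1 is really int[] -> List -> HashSet.
hashSet2 is really int[] -> HashSet.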
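The same point can be sketched the other way around: one iterator, with the caller choosing the collection at the last moment. Evens is a hypothetical name, and the input array is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollectionChoiceDemo
{
    // The producer does not commit to any concrete collection type.
    public static IEnumerable<int> Evens(int[] nums)
    {
        foreach (var num in nums)
        {
            if (num % 2 == 0) yield return num;
        }
    }

    public static void Main()
    {
        var nums = new[] { 4, 2, 4, 1, 6 };

        var set = new HashSet<int>(Evens(nums));      // de-duplicated
        var sorted = new SortedSet<int>(Evens(nums)); // de-duplicated and ordered
        var list = Evens(nums).ToList();              // keeps duplicates and source order
        var squares = Evens(nums).Distinct().ToDictionary(n => n, n => n * n);

        Console.WriteLine(set.Count);                 // 3
        Console.WriteLine(string.Join(",", sorted));  // 2,4,6
        Console.WriteLine(string.Join(",", list));    // 4,2,4,6
        Console.WriteLine(squares[6]);                // 36
    }
}
```

Each consumer enumerates the iterator directly, so none of them pays for a collection it did not ask for.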
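2. Filtering and shaping conditions can be stacked without limit
What does this mean? Sometimes the call stack is deep, and you cannot filter the whole collection once and for all at the bottom layer. Instead, each method appends its own filtering or shaping step. See the sample code below.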

    using System.Collections.Generic;
    using System.Linq;

    public static class Program
    {
        public static void Main(string[] args)
        {
            var nums = M1(true).ToList();
        }

        // Top layer: adds ordering.
        public static IEnumerable<int> M1(bool desc)
        {
            return desc ? M2(2).OrderByDescending(m => m) : M2(2).OrderBy(m => m);
        }

        // Middle layer: adds a modulus filter.
        public static IEnumerable<int> M2(int mod)
        {
            return M3(0, 10).Where(m => m % mod == 0);
        }

        // Bottom layer: adds a range filter over the source data.
        public static IEnumerable<int> M3(int start, int end)
        {
            var nums = new int[] { 1, 2, 3, 4, 5 };
            return nums.Where(i => i > start && i < end);
        }
    }


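M1, M2, and M3 do exactly this, and the final ToList materializes the result in one go. Because no intermediate collection is built between the layers, the approach is both flexible and cheap. One caveat: Where streams elements one at a time, whereas OrderBy and OrderByDescending have to buffer their whole input once enumeration starts, because sorting needs every element.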
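To see the stacking at work, here is a small sketch with two hand-written iterators that record when they run. Source, OnlyEven, and Log are hypothetical names. Nothing executes while the pipeline is being built, and once ToList starts pulling, the layers interleave element by element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PipelineDemo
{
    // Records the order in which the layers do their work.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 4; i++)
        {
            Log.Add("source " + i);
            yield return i;
        }
    }

    public static IEnumerable<int> OnlyEven(IEnumerable<int> input)
    {
        foreach (var n in input)
        {
            if (n % 2 == 0)
            {
                Log.Add("even " + n);
                yield return n;
            }
        }
    }

    public static void Main()
    {
        var query = OnlyEven(Source());  // builds the pipeline, runs nothing
        Console.WriteLine(Log.Count);    // 0

        var result = query.ToList();     // pulls elements through both layers
        Console.WriteLine(string.Join(",", result)); // 2,4
        Console.WriteLine(string.Join(" | ", Log));
        // source 1 | source 2 | even 2 | source 3 | source 4 | even 4
    }
}
```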
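III: Summary
In my view, functional programming is becoming a mainstream direction, and many of the newer C# features exist to make a functional style more convenient. yield is one of the pillars of that style in C#. To back up that claim, have a look at the extension methods in Enumerable, for example: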
 static IEnumerable<TSource> TakeWhileIterator<TSource>(IEnumerable<TSource> source, Func<TSource, bool> predicate) {
     foreach (TSource element in source) {
         if (!predicate(element)) break;
         yield return element;
     }
 }

 static IEnumerable<TSource> WhereIterator<TSource>(IEnumerable<TSource> source, Func<TSource, int, bool> predicate) {
     int index = -1;
     foreach (TSource element in source) {
         checked { index++; }
         if (predicate(element, index)) yield return element;
     }
 }
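Following the same pattern, you can write your own operators. The sketch below is a hypothetical EveryOther extension, which is not part of System.Linq. It uses a non-iterator wrapper that validates its argument at call time and then delegates to a private iterator, so a null source fails immediately rather than at the first MoveNext.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MyEnumerableExtensions
{
    // Hypothetical operator: keeps the 1st, 3rd, 5th, ... element.
    public static IEnumerable<T> EveryOther<T>(this IEnumerable<T> source)
    {
        // Runs at call time, because this method contains no yield.
        if (source == null) throw new ArgumentNullException(nameof(source));
        return EveryOtherIterator(source);
    }

    private static IEnumerable<T> EveryOtherIterator<T>(IEnumerable<T> source)
    {
        bool take = true;
        foreach (T element in source)
        {
            if (take) yield return element;
            take = !take;
        }
    }
}

public static class EveryOtherDemo
{
    public static void Main()
    {
        // Composes with the built-in operators like any other sequence method.
        var result = Enumerable.Range(1, 7).Where(n => n != 4).EveryOther().ToList();
        Console.WriteLine(string.Join(",", result)); // 1,3,6
    }
}
```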


That's all for this article. I hope it helps.
