Parallel multithreaded summation in C++

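The code comes from the book on C++ concurrent programming (C++ Concurrency in Action); the comments and explanations are my own understanding.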

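#include <algorithm>//std::min
#include <functional>//std::ref
#include <iostream>
#include <iterator>//std::distance, std::advance
#include <numeric>//provides std::accumulate
#include <thread>
#include <vector>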

template<typename Iterator, typename T>
struct accumulate_block
{
	void operator()(Iterator first, Iterator last, T& result)
	{
		result = std::accumulate(first, last, result);
	}
};
template<typename Iterator, typename T>
T parallel_accumulate(Iterator first, Iterator last, T init)
{
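	unsigned long const length = std::distance(first, last);//number of elements between first and last
	if (!length)
		return init;
	unsigned long const min_per_thread = 25;
	unsigned long const max_threads = (length + min_per_thread - 1) / min_per_thread;//ceiling division; for 10 elements this gives 1
	unsigned long const hardware_threads = std::thread::hardware_concurrency();//hint for the number of concurrent threads supported; returns 0 if the value is not well defined or not computable; 16 on my machine
	unsigned long const num_threads = std::min(hardware_threads != 0 ? hardware_threads : 2, max_threads);//16 is nonzero, so the ternary yields 16; min(16, 1) returns 1, so num_threads is 1
	unsigned long const block_size = length / num_threads;//10 / 1, so block_size is 10
	std::vector<T> results(num_threads);//1 slot, value-initialized
	std::vector<std::thread> threads(num_threads - 1);//vector of thread objects, sized 1 - 1 = 0
	Iterator block_start = first;//the first block starts at first
	for (unsigned long i = 0; i < (num_threads - 1); ++i)//1 - 1 = 0 iterations, so with 10 elements the loop is skipped
	{
		Iterator block_end = block_start;//end iterator of this block
		std::advance(block_end, block_size);//std::advance moves the iterator given as its first argument by the number of elements given as its second; here it partitions the range, so the elements between block_start and block_end are the ones this thread handles
		threads[i] = std::thread(accumulate_block<Iterator, T>(), block_start, block_end, std::ref(results[i]));//a function object plus its three arguments; constructing the temporary std::thread starts the thread
		block_start = block_end;//once the thread is launched, the next block starts where this one ended
	}
	accumulate_block<Iterator, T>()(block_start, last, results[num_threads - 1]);//with 10 elements everything is computed here, giving 55; with more data the final block is summed here, so the main thread also does part of the work
	for (auto& entry : threads)
		entry.join();

	return std::accumulate(results.begin(), results.end(), init);//std::accumulate with init as the starting value: each block's sum was stored in results, and this step adds them together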
}

int main() {
	int n[33]{ 10,9,8,7,6,5,4,3,2,1,1,10,25,65,98,32,31,14,15,15,18,18,9,5,9,8,8,8,8,9,6,6,55 };
	std::cout << parallel_accumulate(n, n + 33, 0) << std::endl;
}
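//On a closer look it is quite simple: follow the threads, results, and how the start and end iterators move.
//The std::advance call, the std::thread construction and the block_start update inside the loop carry the most detailed comments.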

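It looks like a lot at first glance, but don't be intimidated: with a little multithreading experience it is quick to follow. Let's go through it step by step.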

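1. First, the headers: <iostream> for the standard input/output streams, <thread> for the thread management class, and <numeric> for std::accumulate, so we reinvent fewer wheels.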

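2. The accumulate_block class simply overloads operator() so it can act as a functor, which is what each thread invokes. It is not the main point.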
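To make the role of the functor concrete, here is a minimal sketch of handing one to a thread. The names sum_block and sum_in_thread are hypothetical and not part of the original listing, and the element type is fixed to int for brevity. The point it illustrates is that std::thread copies its arguments, so the result variable has to be wrapped in std::ref for the worker to write into the caller's variable.

```cpp
#include <functional>  // std::ref
#include <numeric>     // std::accumulate
#include <thread>
#include <vector>

// Hypothetical functor, analogous to accumulate_block but fixed to int.
struct sum_block
{
    void operator()(const int* first, const int* last, int& result) const
    {
        result = std::accumulate(first, last, result);
    }
};

// Hypothetical helper: sums `data` on a single worker thread.
int sum_in_thread(const std::vector<int>& data)
{
    int result = 0;
    // std::thread copies its arguments; std::ref passes `result` by
    // reference so the worker writes into this variable, not into a copy.
    std::thread worker(sum_block(), data.data(), data.data() + data.size(),
                       std::ref(result));
    worker.join();  // wait for the worker before reading `result`
    return result;
}
```

Calling sum_in_thread on a vector holding 1, 2, 3, 4 should give 10. On most toolchains the program needs to be linked with thread support (for example -pthread with GCC or Clang).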

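3. parallel_accumulate is the function template we actually call, and where the design lives. It takes a begin iterator, an end iterator and an initial value; all of these types are template parameters.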

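4. When I first wrote this I used a 10-element array as the example (which is why the code comments use those numbers), but that shows very little because no extra thread is created, so here we use 33 elements. First, length is initialized from std::distance, which returns the number of elements between the two iterators. If the range is empty (length is 0), the function returns init immediately.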

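5. min_per_thread is set to 25, and max_threads is the largest sensible number of threads for the input; here (33 + 25 - 1) / 25 gives 2. hardware_threads comes from the static member function std::thread::hardware_concurrency(), which returns a hint for the number of threads the environment can run concurrently (or 0 if that cannot be determined).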

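6. num_threads is the key step. It is a nested expression, so read it from the inside out: the ternary operator yields the hardware thread count, 16 on my machine (or 2 as a fallback when the count is reported as 0); max_threads is 2; std::min takes the smaller of the two, so num_threads is 2.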
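The arithmetic in steps 5 and 6 can be checked in isolation with a small sketch. The function choose_num_threads is a hypothetical helper, not part of the original listing; it takes the hardware thread count as a parameter instead of querying it, so the results are reproducible on any machine. It assumes length is greater than 0, since the original returns early for an empty range.

```cpp
#include <algorithm>  // std::min

// Hypothetical helper mirroring steps 5 and 6 of the walkthrough.
// Assumes length > 0 (the original returns init before reaching this point).
unsigned long choose_num_threads(unsigned long length,
                                 unsigned long hardware_threads,
                                 unsigned long min_per_thread = 25)
{
    // Ceiling division: the most threads worth using for this many elements.
    unsigned long const max_threads =
        (length + min_per_thread - 1) / min_per_thread;
    // Same fallback as the listing when the hardware count is unknown (0).
    unsigned long const fallback = 2;
    return std::min(hardware_threads != 0 ? hardware_threads : fallback,
                    max_threads);
}
```

With 16 hardware threads, 10 elements give 1 thread, 33 elements give 2, and 1000 elements give 16 (capped by the hardware). With a reported hardware count of 0, 1000 elements give 2.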

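7. block_size is the number of elements each spawned thread processes. Here it is 33 / 2, which integer division truncates to 16; the final block, handled by the main thread, picks up the remaining 17.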
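Because the division truncates, the blocks are not all the same size. The following sketch makes the split explicit; block_sizes is a hypothetical helper (not in the original listing) and assumes num_threads is at least 1.

```cpp
#include <vector>

// Hypothetical helper: the size of each block under the listing's scheme.
// Assumes num_threads >= 1.
std::vector<unsigned long> block_sizes(unsigned long length,
                                       unsigned long num_threads)
{
    unsigned long const block_size = length / num_threads;  // truncates
    // Every spawned thread gets exactly block_size elements...
    std::vector<unsigned long> sizes(num_threads, block_size);
    // ...and the last block (the main thread's) takes whatever is left.
    sizes.back() = length - block_size * (num_threads - 1);
    return sizes;
}
```

For 33 elements and 2 threads this yields blocks of 16 and 17. For 100 elements and 16 threads it yields fifteen blocks of 6 and a final block of 10, so the main thread can end up with noticeably more work than the others.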

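8. std::vector<std::thread> threads(num_threads - 1): the size here is 2 - 1, so one thread is created. The main thread is itself a thread and does part of the computation.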

9. Iterator block_start = first; sets the start iterator to the first argument passed in.

10. Enter the loop. It runs num_threads - 1 times; the condition is i < 2 - 1, so it runs once and creates exactly one thread.

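11. Iterator block_end = block_start;//end iterator
std::advance(block_end, block_size);

The end iterator starts out equal to the start iterator; the library function std::advance then moves it forward by block_size, the per-thread element count computed earlier.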

12. Start the thread, then set the start iterator to the previous end iterator, and repeat.

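13. After the loop, the main thread calls the accumulate_block functor through a temporary object, passing the last slot of results by reference, to sum the final block.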

14. The for loop calls join() on every thread, which blocks until that thread has finished.
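The join step matters for two reasons: a std::thread that is still joinable when it is destroyed calls std::terminate, and the values written by the workers are only safe to read once the workers have been joined. The sketch below illustrates the pattern with a hypothetical function run_and_join; the lambda workers and the slot values are assumptions made for the illustration, not part of the original listing.

```cpp
#include <thread>
#include <vector>

// Hypothetical helper: starts `count` workers, each writing i + 1 into its
// own slot, joins them all, and only then reads the slots.
unsigned long run_and_join(unsigned long count)
{
    std::vector<unsigned long> slots(count, 0);
    std::vector<std::thread> workers;
    workers.reserve(count);
    for (unsigned long i = 0; i < count; ++i)
        // Each worker touches a different element, so there is no data race.
        workers.emplace_back([&slots, i] { slots[i] = i + 1; });
    for (auto& w : workers)
        w.join();  // blocks until that worker has finished
    unsigned long total = 0;
    for (unsigned long v : slots)  // safe: every worker has been joined
        total += v;
    return total;
}
```

run_and_join(4) returns 1 + 2 + 3 + 4 = 10. Removing the join loop would make the program terminate when the vector of threads is destroyed.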

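15. Finally, the per-block sums stored in the results container are added together, starting from init, which gives the sum of the whole sequence.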
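One detail worth spelling out is that init is added exactly once, in this final step; the per-block slots start at zero. The sketch below reproduces the combination sequentially (no threads) so the arithmetic is easy to check. sum_in_two_blocks is a hypothetical helper, not part of the original listing, and assumes split is no larger than data.size().

```cpp
#include <cstddef>  // std::ptrdiff_t, std::size_t
#include <numeric>  // std::accumulate
#include <vector>

// Hypothetical helper: sums `data` as two blocks split at index `split`,
// then combines the partial sums the way the final step does.
// Assumes split <= data.size().
int sum_in_two_blocks(const std::vector<int>& data, std::size_t split, int init)
{
    auto const mid = data.begin() + static_cast<std::ptrdiff_t>(split);
    std::vector<int> results(2);  // value-initialized, so both slots are 0
    results[0] = std::accumulate(data.begin(), mid, results[0]);
    results[1] = std::accumulate(mid, data.end(), results[1]);
    // init enters the sum only here, so it is counted once.
    return std::accumulate(results.begin(), results.end(), init);
}
```

For the 33-element array in main, split at 16, the two partial sums are 286 and 242, so the total with init 0 is 528; with init 100 it is 628.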

For the rest, work through the code yourself and try to follow the comments.
