Because Linux uses copy-on-write when subprocesses are spawned via os.fork, the data is not physically copied into each child. To demonstrate:
import multiprocessing as mp
import numpy as np
import logging

logger = mp.log_to_stderr(logging.WARNING)

def free_memory():
    """Sum the MemFree, Buffers and Cached fields of /proc/meminfo, in kB."""
    total = 0
    with open('/proc/meminfo', 'r') as f:
        for line in f:
            line = line.strip()
            if any(line.startswith(field)
                   for field in ('MemFree', 'Buffers', 'Cached')):
                field, amount, unit = line.split()
                amount = int(amount)
                if unit != 'kB':
                    raise ValueError(
                        'Unknown unit {u!r} in /proc/meminfo'.format(u=unit))
                total += amount
    return total

def worker(i):
    x = data[i, :].sum()    # Exercise access to data
    logger.warning('Free memory: {m}'.format(m=free_memory()))

def main():
    procs = [mp.Process(target=worker, args=(i,)) for i in range(4)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()

logger.warning('Initial free: {m}'.format(m=free_memory()))
N = 15000
data = np.ones((N, N))
logger.warning('After allocating data: {m}'.format(m=free_memory()))

if __name__ == '__main__':
    main()

This produced:
[WARNING/MainProcess] Initial free: 2522340
[WARNING/MainProcess] After allocating data: 763248
[WARNING/Process-1] Free memory: 760852
[WARNING/Process-2] Free memory: 757652
[WARNING/Process-3] Free memory: 757264
[WARNING/Process-4] Free memory: 756760
This shows that initially there was roughly 2.5 GB of free memory. After allocating a 15000x15000 array of float64, 763248 kB were free. This roughly makes sense, since 15000**2 * 8 bytes = 1.8 GB, and the drop in free memory, 2.5 GB - 0.763248 GB, is also roughly 1.8 GB.
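The arithmetic can be checked directly. A quick sanity calculation, using only the numbers reported in the log output above:

```python
# Size of the 15000x15000 float64 array, converted to kB
# (the unit that /proc/meminfo reports).
array_kb = 15000 ** 2 * 8 / 1024       # 1757812.5 kB, i.e. ~1.8 GB

# Observed drop in free memory: initial free minus free after allocation.
drop_kb = 2522340 - 763248             # 1759092 kB

# The two figures agree to within about 1.3 MB.
print(array_kb, drop_kb, drop_kb - array_kb)
```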
Now, after each process is spawned, free memory is again reported to be ~750 MB. There is no significant decrease in free memory, so I conclude the system must be using copy-on-write.
Conclusion: if you do not need to modify the data, then defining it at the global level of the __main__ module is a convenient and (on Linux, at least) memory-friendly way to share it among subprocesses.