[Python Advanced] Multiprocessing with pool.map()
Example 1 (the simplest):

import time
import multiprocessing
from multiprocessing.pool import Pool

def numsCheng(i):
    return i * 2

if __name__ == '__main__':
    time1 = time.time()
    nums_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    max_processes = multiprocessing.cpu_count()
    print(f"Max number of processes: {max_processes}")
    pool = Pool(processes=max_processes)  # use all available processes
    result = pool.map(numsCheng, nums_list)
    pool.close()  # close the pool; no new tasks are accepted
    pool.join()   # the main process blocks until all children exit
    print(result)
    time2 = time.time()
    print("Elapsed time:", time2 - time1)

Output:

[2, 4, 6, 8, 10, 12, 14, 16, 18]
Elapsed time: 0.21639275550842285
Example 2 (my own modified version):

import os
import re
import matplotlib.pyplot as plt
from PIL import Image
from multiprocessing import Pool, freeze_support, cpu_count

if __name__ == '__main__':  # these two lines prevent the error shown below
    freeze_support()
    base_dir = r"C:\\Users\jie\Desktop\轨迹_大模型任务\数据集level1"
    for sub_dir1 in os.listdir(base_dir):
        if sub_dir1 == "量测场景":
            multi_tuple = []  # holds n tuples, to be passed to map
            sub_dir1_path = os.path.join(base_dir, sub_dir1)  # the ./数据集level1/量测场景/ directory
            for filename in os.listdir(sub_dir1_path):
                file_path = os.path.join(base_dir, sub_dir1, filename)
                print(file_path)  # C:\\Users\jie\Desktop\轨迹_大模型任务\数据集level1\关联表\关联结果-0.csv
                # extract the matching part: map each measurement file to its association table
                match = re.search(r"-.*\.csv$", filename)  # -0.csv
                connect_name = match.group()
                # extract the number part
                match = re.search(r"\d+", filename)
                number = match.group()
                scene_dir = 'new_data/场景' + number
                middle_dir = '场景' + number
                pic_cj_dir = scene_dir + "/predict_picture"
                if not os.path.exists(pic_cj_dir):
                    os.makedirs(pic_cj_dir)
                csv_cj_dir = scene_dir + "/predict_csv"
                if not os.path.exists(csv_cj_dir):
                    os.makedirs(csv_cj_dir)
                connect_name = "关联结果" + connect_name
                connect_path = os.path.join(base_dir, '关联表', connect_name)
                print("Writing:", middle_dir)
                multi_tuple.append((file_path, connect_path, middle_dir))
                # getMaxLineData(file_path, connect_path, middle_dir)  # middle_dir is 场景i
            num_processes = cpu_count()  # 8
            print("---->>>>>>>> Using multiprocessing,", num_processes, "CPUs <<<<<<<--------")
            pool = Pool(processes=num_processes)
            # getMaxLineData is defined elsewhere; data near line 1048
            result = pool.map(getMaxLineData, multi_tuple)
            pool.close()  # close the pool; no new tasks are accepted
            pool.join()   # the main process blocks until all children exit
An error was raised partway through, shown below:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable
Solution:
Adding the following code fixes it:

if __name__ == '__main__':
    freeze_support()
as_completed is a function from Python's concurrent.futures module. It collects the results of concurrent tasks asynchronously, yielding Future objects in the order the tasks actually complete rather than the order they were submitted.

Core functionality and syntax
as_completed(fs, timeout=None) takes an iterable fs of Future objects and returns a generator. As tasks finish, the generator yields the corresponding Future objects in completion order. If timeout is given and expires, a TimeoutError is raised.
Use cases and advantages
Non-blocking processing: results of finished tasks can be handled immediately, without waiting for every task to complete.
Dynamic responsiveness: suited to workloads where task durations vary widely (e.g. I/O-bound operations), since short tasks are handled first.
Exception handling: future.result() surfaces each task's exception individually, so one failure does not abort the whole batch.

Example code (the original snippet omitted the input list; [2, 3, 5] is reconstructed from the sample output below):

from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def task(n):
    time.sleep(n)
    return n * n

with ThreadPoolExecutor() as executor:
    futures = {executor.submit(task, num): num for num in [2, 3, 5]}
    for future in as_completed(futures):
        num = futures[future]
        try:
            result = future.result()
            print(f"Task {num} completed with result: {result}")
        except Exception as e:
            print(f"Task {num} generated an exception: {e}")

Possible output:

Task 2 completed with result: 4
Task 3 completed with result: 9
Task 5 completed with result: 25
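The contrast with executor.map is worth seeing side by side: map yields results in submission order, while as_completed yields futures as they finish. A small sketch (the short sleeps stand in for tasks of different durations):

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def slow_square(n):
    time.sleep(n * 0.1)  # longer n, longer task
    return n * n

nums = [3, 1, 2]
with ThreadPoolExecutor() as executor:
    # executor.map returns results in submission order...
    in_order = list(executor.map(slow_square, nums))
    # ...while as_completed yields futures in completion order
    futures = [executor.submit(slow_square, n) for n in nums]
    by_completion = [f.result() for f in as_completed(futures)]

print(in_order)       # [9, 1, 4]  -- submission order
print(by_completion)  # typically [1, 4, 9] -- fastest tasks first
```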
Reference
https://blog.csdn.net/qq_27390023/article/details/147346585