Python 3.6: can multiprocessing + coroutines boost crawler efficiency?

2022-09-20 13:57:22

In the previous post, asyncio + aiohttp coroutines collected data asynchronously, far more efficiently than multithreading or multiprocessing. Can we optimize further and double the throughput by combining multiprocessing with coroutines?
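As a quick recap of why the coroutine approach from the previous post is fast, here is a minimal sketch using plain asyncio. The network call is simulated with `asyncio.sleep` (a real crawler would issue the request with aiohttp, as in that post); the URLs are made up for illustration. Twenty 0.1-second "fetches" overlap and finish together in roughly 0.1 s instead of 2 s:

```python
import asyncio
import time

async def fetch(url):
    # Placeholder for a real aiohttp request; the 0.1 s sleep stands in
    # for network latency and yields control to the other coroutines.
    await asyncio.sleep(0.1)
    return url

async def crawl(urls):
    # Schedule every fetch at once; gather waits for all of them.
    return await asyncio.gather(*(fetch(u) for u in urls))

urls = ['http://example.com/%d' % i for i in range(20)]
loop = asyncio.new_event_loop()
start = time.perf_counter()
results = loop.run_until_complete(crawl(urls))
elapsed = time.perf_counter() - start
loop.close()
print(len(results))  # 20 pages in about 0.1 s, not 20 * 0.1 s
```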


Code:

(Following L瑜's article <python中多进程+协程的使用以及为什么要用它> ("using multiprocessing + coroutines in Python, and why"), I wrote the test below; the async part uses asyncio instead of gevent.)

```python
import asyncio
import threading
import os
from multiprocessing import Process
# import aiohttp  # a real crawler would fetch pages with this

async def hello(name):
    print('hello {} {} ********** {}'.format(
        name, os.getpid(), threading.current_thread()))
    # await asyncio.sleep(int(name))
    await asyncio.sleep(1)  # stands in for the I/O wait of a request
    print('end: {} {}'.format(name, os.getpid()))

def process_start(*namelist):
    # Entry point of each child process: one event loop drives
    # all the coroutines for this chunk of names.
    tasks = []
    loop = asyncio.get_event_loop()
    for name in namelist:
        tasks.append(asyncio.ensure_future(hello(name)))
    loop.run_until_complete(asyncio.wait(tasks))

def task_start(namelist):
    # Cut the name list into chunks of 10 and hand each chunk
    # to its own process.
    i = 0
    lst = []
    flag = 10
    while namelist:
        i += 1
        lst.append(namelist.pop())
        if i == flag:
            p = Process(target=process_start, args=lst)
            p.start()
            # p.join()
            lst = []
            i = 0
    # Flush the final, shorter chunk. (The original tested
    # `namelist != []`, which is always empty once the loop exits.)
    if lst:
        p = Process(target=process_start, args=lst)
        p.start()
        # p.join()

if __name__ == '__main__':
    namelist = list('0123456789' * 10)
    task_start(namelist)
    # Pure-coroutine version for comparison (no multiprocessing):
    # loop = asyncio.get_event_loop()
    # tasks = []
    # namelist = list('0123456789' * 10)
    # for i in namelist:
    #     tasks.append(asyncio.ensure_future(hello(i)))
    # loop.run_until_complete(asyncio.wait(tasks))
```
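The chunk-per-`Process` pattern above can also be written with `concurrent.futures.ProcessPoolExecutor`, which collects results and joins the workers for you. This is only a sketch under the same assumptions (chunks of 10 names, a 0.1 s simulated wait in place of the real HTTP request); each worker builds its own event loop, since a loop cannot be shared across process boundaries:

```python
import asyncio
import os
from concurrent.futures import ProcessPoolExecutor

async def hello(name):
    # Simulated I/O wait in place of a real HTTP request
    await asyncio.sleep(0.1)
    return '{}:{}'.format(name, os.getpid())

async def run_all(chunk):
    # gather is called inside the running loop, so every
    # coroutine attaches to the right one
    return await asyncio.gather(*(hello(n) for n in chunk))

def run_chunk(chunk):
    # Entry point of each worker process: build a private event
    # loop, drive the chunk's coroutines, then clean up.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(run_all(chunk))
    finally:
        loop.close()

def chunks(seq, size):
    # Split seq into consecutive pieces of at most `size` items
    return [seq[i:i + size] for i in range(0, len(seq), size)]

if __name__ == '__main__':
    namelist = list('0123456789' * 10)
    with ProcessPoolExecutor() as pool:
        results = [r for part in pool.map(run_chunk, chunks(namelist, 10))
                   for r in part]
    print(len(results))  # 100
```

The executor caps the number of live processes at the CPU count by default, whereas the original starts all ten `Process` objects at once; on a small machine the pooled version avoids oversubscription.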
If anyone has a multi-core machine, please give it a run; on my single-core box the speedup doesn't show.

  • Author: Aries8842
  • Original link: https://blog.csdn.net/Aries8842/article/details/78957348