performance - Python multiprocessing efficiency


I use a timer function to compare the time cost of multi_core_1 and multi_core_2.

multi_core_1

results = p.map_async(deal, urls) 

multi_core_2

for url in urls:
    results = p.map_async(deal, url)

code

#!/usr/bin/env python
# -*- coding:utf-8 -*-

import time
import logging
from functools import wraps
from multiprocessing.dummy import Pool, Queue, Manager, freeze_support
import requests

urls = [
    'http://www.baidu.com',
    'http://home.baidu.com/',
    # ... 100 urls
]


def timer(func):
    # Log how long the wrapped function takes to run.
    @wraps(func)
    def wrapper(*args, **kwargs):
        t = time.time()
        result = func(*args, **kwargs)
        logging.warning('%s cost %s' % (func.__name__, (time.time() - t)))
        return result
    return wrapper


def deal(url):
    return requests.get(url).status_code


@timer
def multi_core_1():
    freeze_support()
    p = Pool(8)
    results = p.map_async(deal, urls)
    p.close()
    p.join()


@timer
def multi_core_2():
    freeze_support()
    p = Pool(8)
    for url in urls:
        results = p.map_async(deal, url)
    p.close()
    p.join()


if __name__ == '__main__':
    multi_core_1()
    multi_core_2()

result

> python test.py
WARNING:root:multi_core_1 cost 1.3149404525756836
WARNING:root:multi_core_2 cost 0.2142746448516845

question

So I wonder: how can multi_core_2() be faster than multi_core_1()?

In the second function you're using map_async wrong.

map_async takes a function to apply and an iterable of arguments to apply it to.

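When you pass a string as the iterable, it treats each character in the string as an element. So for each url in the list, it tries to apply the deal function to each character individually ('h', 't', 't', etc.). Each of those calls fails outright in requests.get, and because the code never calls .get() on the result, the exception is never raised in the main thread. Since the calls fail, nothing has to load a page, which is why it's faster; it's broken code that doesn't work, though.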
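To see the character-splitting without touching the network, here is a minimal sketch. fake_deal and the example.com-style URLs are hypothetical stand-ins, not part of the question's code; the function only echoes its argument so you can inspect what map_async actually hands to it.

```python
from multiprocessing.dummy import Pool  # thread-based pool, as in the question


def fake_deal(url):
    """Hypothetical stand-in for deal(): returns its argument, no network."""
    return url


with Pool(4) as p:
    # Intended use: the iterable is the list, so each element is a whole URL.
    whole = p.map_async(fake_deal, ['http://a.example', 'http://b.example']).get()
    # Mistake from multi_core_2: the iterable is a single string, so each
    # element handed to the function is one character.
    chars = p.map_async(fake_deal, 'http').get()

print(whole)  # ['http://a.example', 'http://b.example']
print(chars)  # ['h', 't', 't', 'p']
```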

You're also reassigning results on each loop iteration, so it is overwritten for each new url and ends up holding only the (failed) outcome for the last url string.
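A rough sketch of both points: an error inside a map_async worker stays hidden until .get() is called, and keeping one handle per URL avoids overwriting. failing_deal and ok_deal are hypothetical stand-ins (the first simulates the kind of error requests.get would raise on a single character, the second returns a made-up status code), and apply_async is used here as one possible way to submit URLs one at a time.

```python
from multiprocessing.dummy import Pool  # thread-based pool, as in the question


def failing_deal(ch):
    """Hypothetical stand-in that always fails, simulating a bad URL."""
    raise ValueError('invalid URL %r' % ch)


def ok_deal(url):
    """Hypothetical stand-in that returns a fake status code."""
    return 200


urls = ['http://a.example', 'http://b.example', 'http://c.example']

with Pool(4) as p:
    # Nothing is raised here; the error is stored inside the AsyncResult.
    pending = p.map_async(failing_deal, 'http')
    pending.wait()
    silent_failure = not pending.successful()

    # Only .get() re-raises the worker's exception in the caller.
    try:
        pending.get()
        raised = False
    except ValueError:
        raised = True

    # Keep one handle per URL instead of overwriting a single name.
    handles = [p.apply_async(ok_deal, (url,)) for url in urls]
    codes = [h.get() for h in handles]

print(silent_failure, raised, codes)
```

If the goal is simply one result per URL, passing the whole list to a single map_async call, as multi_core_1 does, is the simpler option.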

Before checking a function's performance, make sure the function works as intended.
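One way to follow that advice is to have the timed function return its results and check them before trusting the timing. The sketch below assumes a hypothetical slow_deal that sleeps instead of making a request, placeholder URLs, and a timed decorator that differs from the question's timer in that it returns the elapsed time rather than logging it.

```python
import time
from functools import wraps
from multiprocessing.dummy import Pool  # thread-based pool, as in the question


def timed(func):
    """Return (result, seconds) so the output can be checked first."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        return result, time.perf_counter() - start
    return wrapper


def slow_deal(url):
    """Hypothetical stand-in for deal(): sleeps instead of fetching the URL."""
    time.sleep(0.05)
    return 200


@timed
def fetch_all(urls):
    with Pool(8) as p:
        # .get() blocks until every task is done and re-raises worker errors.
        return p.map_async(slow_deal, urls).get()


urls = ['http://example.com/%d' % i for i in range(16)]  # placeholder URLs
codes, elapsed = fetch_all(urls)

# Check correctness first; only then is the elapsed time meaningful.
assert len(codes) == len(urls) and all(code == 200 for code in codes)
print('fetched %d urls in %.3fs' % (len(codes), elapsed))
```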

