# Playwright

A web automation testing tool maintained by Microsoft, similar to Selenium. It can also be used for web scraping.

`codegen` opens the given URL in a browser and records your interactions as code:

```bat
D:\Python\Python-env\playwright\Scripts\playwright.exe codegen https://platform.openai.com/usage
```

## Waiting

`timeout=0` disables the timeout, so the call waits indefinitely.

```python
clicked_selector = '[role="dialog"]'
await page.wait_for_selector(clicked_selector, timeout=0)
```

### Waiting for an element to disappear

```python
clicked_selector = '[role="dialog"]'
await page.wait_for_selector(clicked_selector, timeout=0, state='hidden')
```

### Waiting for the page to finish loading

```python
await page.wait_for_load_state("networkidle")
```

Other states are available as well (`"load"`, the default, and `"domcontentloaded"`). See [python+playwright 学习-79 设置全局导航超时和全局查找元素超时](https://www.cnblogs.com/yoyoketang/p/17663469.html), which covers global navigation and element lookup timeouts.

## Screenshots

`full_page=True` captures the whole scrollable page. The DevTools console is not included.

```python
import time

page.screenshot(path=f"D:\\{time.time()}.png", full_page=True)
```

## File upload

```python
async with page.expect_file_chooser() as file_info:
    await page.get_by_text("Upload file").click()
file_chooser = await file_info.value
# A relative path works ...
await file_chooser.set_files("myfile.pdf")
# ... and so does an absolute path (raw string, single backslashes)
await file_chooser.set_files(r"D:\testfile.pdf")
```

[python+playwright 学习-21.文件上传-优雅处理](https://www.cnblogs.com/yoyoketang/p/17175183.html) (handling file uploads cleanly)

## File download

```python
async with page.expect_download() as download_info:
    await page.get_by_role("button").click()
# Wait until the download has actually started
download = await download_info.value
# Continue with other operations
```

## Starting on a whole minute

```python
import time

def sleep_to_next_time(target_min, current_time=None):
    """
    Sleep until second 00 of the minute that is `target_min` minutes ahead.

    target_min: number of minutes ahead of the current minute.
    current_time: a time.struct_time; defaults to the local time at call time.
    """
    # Read the clock at call time, not once when the function is defined
    if current_time is None:
        current_time = time.localtime()
    # The target is second 00 of the target minute
    current_minute = current_time.tm_min
    next_minute = (current_minute + target_min) % 60
    if next_minute < current_minute:
        # TODO: the target minute wraps past the hour (for example, it is
        # minute 59 and the target is 1 minute ahead). The calculation below
        # would then yield a negative wait and break time.sleep().
        raise SystemExit("please set another target time.")
    next_time = time.struct_time(
        (current_time.tm_year, current_time.tm_mon,
         current_time.tm_mday, current_time.tm_hour,
         next_minute, 0, current_time.tm_wday,
         current_time.tm_yday, current_time.tm_isdst))
    wait_time = time.mktime(next_time) - time.mktime(current_time)
    if wait_time > 0:
        print(f"sleep for {wait_time}s.")
        time.sleep(wait_time)
        print("sleep is end.")
    else:
        print(f"can't sleep for {wait_time}s.")

sleep_to_next_time(2)
```

## Headless mode

No browser window is shown while the script runs.

```python
from playwright.sync_api import sync_playwright

TEST_URL = ""  # fill in the URL to test

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context()
    page = context.new_page()
    # Open the page
    page.goto(TEST_URL)
```

## Incognito mode

This is the default behavior: every new browser context starts without saved state, so you have to log in again on each run.

## Getting cookies

```python
from playwright.sync_api import sync_playwright

# Reference: https://www.cnblogs.com/yoyoketang/p/17662694.html

TEST_URL = ""  # fill in the URL to test

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context()
    page = context.new_page()
    # Open the page
    page.goto(TEST_URL)
    # Print all cookies in the context
    print(context.cookies())
    # Print only the cookies that apply to the current page
    print(context.cookies(page.url))
    # page.context is the same context, so this prints all cookies again
    print(page.context.cookies())
```

## Using the locally installed browser

```python
from getpass import getuser
from playwright.sync_api import sync_playwright

# Open the locally installed browser
# https://blog.csdn.net/weixin_59938810/article/details/124350045

# User data directory:
# C:\Users\admin\AppData\Local\Microsoft\Edge\User Data
# C:\\Users\\{getuser()}\\AppData\\Local\\Google\\Chrome\\User Data
__USER_DATA_DIR_PATH__ = f"C:\\Users\\{getuser()}\\AppData\\Local\\Microsoft\\Edge\\User Data"

# Browser executable:
# C:\Users\admin\AppData\Local\Microsoft\Edge\Application\msedge.exe
# C:\Program Files\Google\Chrome\Application\chrome.exe
__EXECUTABLE_PATH__ = r"C:\Users\admin\AppData\Local\Microsoft\Edge\Application\msedge.exe"

playwright = sync_playwright().start()
# launch_persistent_context returns a browser context, not a Browser
context = playwright.chromium.launch_persistent_context(
    # Local user data (profile) directory
    user_data_dir=__USER_DATA_DIR_PATH__,
    # Path to the local browser executable (Edge here, Chrome also works)
    executable_path=__EXECUTABLE_PATH__,
    # Must be enabled to download files; recent Playwright releases
    # default to True, older ones to False
    accept_downloads=True,
    # Show the browser window (not headless)
    headless=False,
    bypass_csp=True,
    slow_mo=10,
    # Avoid automation detection
    args=['--disable-blink-features=AutomationControlled']
)
page = context.new_page()
page.goto("https://platform.openai.com/usage")
```

## Async tips

### Launching several browser instances at once

To test with several browsers at the same time on one machine, create a batch script `startChromes.bat`:

```bat
@echo off
setlocal enabledelayedexpansion
set chromePath="C:\Program Files\Google\Chrome\Application"
if "%1"=="" (
    echo number of Chrome instances not set, defaulting to 1
    set /a num_browsers=1
) else (
    set /a num_browsers=%1
)
set current_port=12345
for /l %%i in (1, 1, %num_browsers%) do (
    start "" %chromePath%\chrome.exe --remote-debugging-port=!current_port! --incognito --start-maximized --user-data-dir="C:\selenium\chrome%%i" --auto-open-devtools-for-tabs --new-window
    set /a current_port+=1
)
set /a num_browsers=%num_browsers%
python -m main %num_browsers%
cd ..
@REM exit /b
```

`--auto-open-devtools-for-tabs` opens DevTools automatically for every tab.

### async & await

Save the following as `main.py`; the batch script above starts it with `python -m main`.

```python
from playwright.async_api import async_playwright
import asyncio
import random
import sys
import time

# First remote debugging port; keep it in sync with startChromes.bat
START_PORT = 12345
# Starting URL for the test
TEST_URL = "https://cn.bing.com/"
# Number of concurrent browsers, one browser instance per unit of concurrency
BROWSERS_NUM = int(sys.argv[1])

async def do_preparations(page, browser_index):
    # Sleep for a random 0-5 seconds (seeded per browser, so reproducible)
    random.seed(browser_index)
    await asyncio.sleep(random.uniform(0, 5))
    await page.goto(TEST_URL)
    await page.get_by_role("button").click()
    await page.screenshot(path=f"D:\\{browser_index}_{time.time()}.png", full_page=True)

async def run():
    playwright = await async_playwright().start()
    # Attach to the Chrome instances already started by the batch script
    browsers = [await playwright.chromium.connect_over_cdp(f'http://localhost:{START_PORT+i}/') for i in range(BROWSERS_NUM)]
    contexts = [await browser.new_context() for browser in browsers]
    pages = await asyncio.gather(*(context.new_page() for context in contexts))
    preparations_tasks = [do_preparations(page, browser_index) for browser_index, page in enumerate(pages)]
    # Run the preparation steps in all browsers concurrently
    await asyncio.gather(*preparations_tasks)

asyncio.run(run())
```

## References

[上海-悠悠 博客园 标签:python+playwright](https://www.cnblogs.com/yoyoketang/tag/python%2Bplaywright/): a very systematic walkthrough of many details and pitfalls.

[python+playwright 学习-3.页面操作 Action](https://www.cnblogs.com/yoyoketang/p/17140617.html): a systematic summary of common page actions.

[python+playwright 学习-43 Pyinstaller 打包生成独立的可执行文件。](https://www.cnblogs.com/yoyoketang/p/17274719.html): packaging into a standalone executable with Pyinstaller, with a focus on how to bundle the browser.
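## Supplement: whole-minute wait across the hour boundary

The `sleep_to_next_time` function earlier in these notes carries a TODO: when the target minute wraps past the hour (for example at minute 59 with a target 1 minute ahead), it gives up. One possible way around that is to work on epoch seconds instead of `struct_time` fields. The sketch below is only an illustration, not part of the original notes: the names `seconds_until_minute_boundary` and `sleep_until_minute_boundary` are hypothetical, and it assumes the local time zone offset is a whole number of minutes, so that epoch seconds divisible by 60 fall on second 00.

```python
import math
import time


def seconds_until_minute_boundary(minutes_ahead, now=None):
    """Return the seconds from `now` until second 00 of the minute that is
    `minutes_ahead` minutes after the current minute.

    `now` is a Unix timestamp; it defaults to the current time.
    """
    if minutes_ahead < 1:
        raise ValueError("minutes_ahead must be at least 1")
    if now is None:
        now = time.time()
    # Start of the current minute, in epoch seconds
    current_minute_start = math.floor(now / 60) * 60
    # Plain addition carries over hour and day boundaries on its own
    target = current_minute_start + minutes_ahead * 60
    return target - now


def sleep_until_minute_boundary(minutes_ahead):
    """Sleep until the computed minute boundary (hypothetical helper)."""
    wait = seconds_until_minute_boundary(minutes_ahead)
    print(f"sleep for {wait:.3f}s.")
    time.sleep(wait)
```

Calling `sleep_until_minute_boundary(2)` would play the same role as `sleep_to_next_time(2)`. Because the target is computed by adding seconds rather than by taking the minute modulo 60, the wait is always positive and no special case for minute 59 is needed.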