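# Playwright
A web automation and testing tool maintained by Microsoft, similar to Selenium. It can also be used for web scraping.
The `codegen` command opens a browser, records your actions, and generates the corresponding script:
```bat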
D:\Python\Python-env\playwright\Scripts\playwright.exe codegen https://platform.openai.com/usage
```
## Waiting
`timeout=0` disables the timeout, so the call waits indefinitely.
```python
dialog_selector = '[role="dialog"]'
await page.wait_for_selector(dialog_selector, timeout=0)
```
### Waiting for an element to disappear
```python
dialog_selector = '[role="dialog"]'
await page.wait_for_selector(dialog_selector, timeout=0, state="hidden")
```
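
The two waits above are often chained: wait for a dialog to appear, then wait for it to go away. A minimal sketch, assuming an async Playwright `page` is passed in; the helper name `wait_for_dialog_cycle` and the 30-second default are illustrative choices, not part of Playwright.

```python
async def wait_for_dialog_cycle(page, selector='[role="dialog"]', timeout_ms=30_000):
    """Wait until `selector` becomes visible, then until it is hidden again.

    `page` is expected to be a playwright.async_api.Page. Passing
    timeout_ms=0 disables the timeout for both waits.
    """
    # state="visible" is the default; it is spelled out here for clarity
    await page.wait_for_selector(selector, state="visible", timeout=timeout_ms)
    # state="hidden" resolves once the element is invisible or detached
    await page.wait_for_selector(selector, state="hidden", timeout=timeout_ms)
```

Call it as `await wait_for_dialog_cycle(page)` right after the action that opens the dialog.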
### Waiting for the page to finish loading
```python
page.wait_for_load_state("networkidle")
```
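Other states (`load`, `domcontentloaded`) can be used as well. See also [python+playwright 学习-79 设置全局导航超时和全局查找元素超时](https://www.cnblogs.com/yoyoketang/p/17663469.html) (setting global navigation and element-lookup timeouts).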
## Screenshots
`full_page=True` captures the entire scrollable page; the DevTools console is not included.
```python
import time
page.screenshot(path=f"D:\\{time.time()}.png", full_page=True)
```
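
`time.time()` returns a float, so the file names above look like `1700000000.123.png`. A sketch of a helper that builds a sortable, human-readable name instead; `screenshot_path`, its `prefix` argument, and the naming scheme are hypothetical, not part of Playwright.

```python
from datetime import datetime
from pathlib import Path


def screenshot_path(directory, prefix="shot", now=None):
    """Return directory/prefix_YYYYmmdd_HHMMSS.png (hypothetical naming scheme)."""
    now = now or datetime.now()
    return Path(directory) / f"{prefix}_{now:%Y%m%d_%H%M%S}.png"


# Usage with a sync Playwright page (assumed to exist already):
# page.screenshot(path=screenshot_path("D:\\"), full_page=True)
```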
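## File upload
```python
async with page.expect_file_chooser() as file_info:
    # Clicking the element opens the file chooser
    await page.get_by_text("Upload file").click()
file_chooser = await file_info.value
# A relative path is resolved against the current working directory
await file_chooser.set_files("myfile.pdf")
# An absolute path works too: await file_chooser.set_files(r"D:\testfile.pdf")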
```
[python+playwright 学习-21.文件上传-优雅处理](https://www.cnblogs.com/yoyoketang/p/17175183.html)
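## File download
```python
async with page.expect_download() as download_info:
    await page.get_by_role("button").click()
# Resolves once the download has started
download = await download_info.value
# Waits for the download to finish, then saves it under its suggested name
await download.save_as(download.suggested_filename)
# Carry on with other actions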
```
## Starting on the minute
```python
import time


def sleep_to_next_time(target_min, current_time=None):
    """Sleep until second 00 of the minute `target_min` minutes from now."""
    # Read the clock at call time. A default of time.localtime() in the
    # signature would be evaluated only once, when the function is defined.
    now = time.time() if current_time is None else time.mktime(current_time)
    # Round down to the start of the current minute, then add whole minutes.
    # Working in epoch seconds keeps the hour rollover (e.g. minute 59) correct.
    next_time = now - now % 60 + target_min * 60
    wait_time = next_time - now
    if wait_time > 0:
        print(f"sleep for {wait_time:.1f}s.")
        time.sleep(wait_time)
        print("sleep finished.")
    else:
        print(f"can't sleep for {wait_time}s.")


sleep_to_next_time(2)
```
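
The wait computation can also be expressed with `datetime` arithmetic, which handles the case where the target minute falls into the next hour (for example, when starting at minute 59). A small sketch that separates the calculation from the sleep so it can be checked on its own; `seconds_until_minute` is an illustrative name.

```python
from datetime import datetime, timedelta


def seconds_until_minute(now, target_min):
    """Seconds from `now` until second 00 of the minute `target_min` minutes ahead."""
    # Truncate to the start of the current minute, then add whole minutes;
    # timedelta arithmetic carries over into the next hour or day.
    target = now.replace(second=0, microsecond=0) + timedelta(minutes=target_min)
    return (target - now).total_seconds()


print(seconds_until_minute(datetime(2024, 1, 1, 23, 59, 30), 1))  # 30.0
print(seconds_until_minute(datetime(2024, 1, 1, 10, 15, 45), 2))  # 75.0
```

The result can be passed straight to `time.sleep()`.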
## Headless mode
The browser runs without a visible window. This is the default for `launch()`.
```python
from playwright.sync_api import sync_playwright

TEST_URL = ""  # set this to the page under test

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context()
    page = context.new_page()
    # Open the page
    page.goto(TEST_URL)
    browser.close()
```
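## Incognito mode
This is the default behaviour: every `browser.new_context()` starts from a clean, isolated session, so you have to log in again on each run.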
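
To avoid logging in on every run, a context's cookies and localStorage can be saved with `storage_state` and restored when the next context is created. A minimal sketch for the sync API; the helper names and the `state.json` file name are assumptions made for illustration.

```python
from pathlib import Path


def new_context_with_login(browser, state_file="state.json"):
    """Create a context, restoring cookies and localStorage if a saved state exists."""
    if Path(state_file).exists():
        return browser.new_context(storage_state=str(state_file))
    return browser.new_context()


def save_login(context, state_file="state.json"):
    """Write the context's cookies and localStorage to `state_file`."""
    context.storage_state(path=str(state_file))
```

A typical flow would be: create the context with `new_context_with_login(browser)`, log in once by hand or by script, then call `save_login(context)` before closing the browser.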
## Getting cookies
```python
from playwright.sync_api import sync_playwright

# Reference: https://www.cnblogs.com/yoyoketang/p/17662694.html

TEST_URL = ""  # set this to the page under test

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context()
    page = context.new_page()
    # Open the page
    page.goto(TEST_URL)
    # Print every cookie in the context
    print(context.cookies())
    # Print only the cookies that apply to the current page's URL
    print(context.cookies(page.url))
    # page.context is the same context, so this prints every cookie again
    print(page.context.cookies())
```
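
`context.cookies()` returns a list of dicts with keys such as `name`, `value`, and `domain`. When the cookies are needed outside the browser (for example in a plain HTTP client), they can be joined into a `Cookie` header. A sketch; `cookie_header` and the sample data are illustrative, not part of Playwright.

```python
def cookie_header(cookies, domain_suffix=None):
    """Join Playwright cookie dicts into a `name=value; ...` header string."""
    pairs = [
        f"{c['name']}={c['value']}"
        for c in cookies
        # Optionally keep only cookies whose domain ends with the given suffix
        if domain_suffix is None or c.get("domain", "").endswith(domain_suffix)
    ]
    return "; ".join(pairs)


# Sample data shaped like the dicts returned by context.cookies()
sample = [
    {"name": "sid", "value": "abc", "domain": ".example.com", "path": "/"},
    {"name": "theme", "value": "dark", "domain": "other.test", "path": "/"},
]
print(cookie_header(sample))                 # sid=abc; theme=dark
print(cookie_header(sample, "example.com"))  # sid=abc
```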
## Local browser
```python
from getpass import getuser
from playwright.sync_api import sync_playwright

# Open the browser installed on this machine, with its existing profile.
# Reference: https://blog.csdn.net/weixin_59938810/article/details/124350045

# User data directory, e.g.
# C:\Users\admin\AppData\Local\Microsoft\Edge\User Data
# C:\\Users\\{getuser()}\\AppData\\Local\\Google\\Chrome\\User Data
USER_DATA_DIR = f"C:\\Users\\{getuser()}\\AppData\\Local\\Microsoft\\Edge\\User Data"
# Browser executable, e.g.
# C:\Users\admin\AppData\Local\Microsoft\Edge\Application\msedge.exe
# C:\Program Files\Google\Chrome\Application\chrome.exe
EXECUTABLE_PATH = r"C:\Users\admin\AppData\Local\Microsoft\Edge\Application\msedge.exe"

playwright = sync_playwright().start()
# launch_persistent_context returns a BrowserContext, not a Browser
context = playwright.chromium.launch_persistent_context(
    # Profile directory of the local user
    user_data_dir=USER_DATA_DIR,
    # Path to the local browser executable
    executable_path=EXECUTABLE_PATH,
    # Allow file downloads (defaulted to False in older Playwright releases)
    accept_downloads=True,
    # Show the browser window
    headless=False,
    bypass_csp=True,
    slow_mo=10,
    # Hide the automation flag to reduce bot detection
    args=["--disable-blink-features=AutomationControlled"],
)
page = context.new_page()
page.goto("https://platform.openai.com/usage")
```
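
Instead of hard-coding the executable path, `launch_persistent_context` also accepts a `channel` argument that selects a system-installed browser. A sketch for Edge on Windows, assuming Edge is installed and that its profile lives under `%LOCALAPPDATA%`; the helper names `edge_user_data_dir` and `open_local_edge` are illustrative.

```python
import os
from pathlib import Path


def edge_user_data_dir(local_appdata=None):
    """Default Edge profile directory on Windows, derived from %LOCALAPPDATA%."""
    base = local_appdata or os.environ.get("LOCALAPPDATA", "")
    return Path(base) / "Microsoft" / "Edge" / "User Data"


def open_local_edge(playwright, user_data_dir=None):
    """Launch the installed Edge with a persistent profile (sync API)."""
    return playwright.chromium.launch_persistent_context(
        user_data_dir=str(user_data_dir or edge_user_data_dir()),
        channel="msedge",  # use the system-installed Edge instead of executable_path
        headless=False,
    )
```

With the sync API this would be called as `context = open_local_edge(sync_playwright().start())`.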
## Async tips
### Starting several browser instances at once
To test with several browsers at the same time on one machine, create a batch script named `startChromes.bat`:
```bat
@echo off
setlocal enabledelayedexpansion
set "chromePath=C:\Program Files\Google\Chrome\Application"
if "%1"=="" (
    echo Number of Chrome instances not set, defaulting to 1
    set /a num_browsers=1
) else (
    set /a num_browsers=%1
)
set current_port=12345
for /l %%i in (1, 1, %num_browsers%) do (
    start "" "%chromePath%\chrome.exe" --remote-debugging-port=!current_port! --incognito --start-maximized --user-data-dir="C:\selenium\chrome%%i" --auto-open-devtools-for-tabs --new-window
    set /a current_port+=1
)
python -m main %num_browsers%
cd ..
@REM exit /b
```
`--auto-open-devtools-for-tabs` opens DevTools automatically for every tab.
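
The same launch can be done from Python with `subprocess`, which keeps the port numbering in one place with the test code. A sketch that mirrors the flags of the batch script; the function names and the `profile_root` default are assumptions for illustration.

```python
import subprocess


def chrome_command(chrome_exe, port, index, profile_root=r"C:\selenium"):
    """Build the argument list for one Chrome instance, mirroring the flags above."""
    return [
        chrome_exe,
        f"--remote-debugging-port={port}",
        "--incognito",
        "--start-maximized",
        f"--user-data-dir={profile_root}\\chrome{index}",
        "--auto-open-devtools-for-tabs",
        "--new-window",
    ]


def start_chromes(chrome_exe, count, start_port=12345):
    """Start `count` Chrome processes on consecutive debugging ports."""
    return [
        subprocess.Popen(chrome_command(chrome_exe, start_port + i, i + 1))
        for i in range(count)
    ]
```

For example, `start_chromes(r"C:\Program Files\Google\Chrome\Application\chrome.exe", 3)` would start three instances on ports 12345 to 12347.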
### async & await
```python
from playwright.async_api import async_playwright
import asyncio
import random
import sys
import time

# First remote-debugging port; keep in sync with startChromes.bat
START_PORT = 12345
# URL where the test starts
TEST_URL = "https://cn.bing.com/"
# Number of concurrent browsers, passed on the command line
BROWSER_NUM = int(sys.argv[1])


async def do_preparations(page, browser_index):
    # Sleep for a random 0-5 seconds, seeded per browser
    random.seed(browser_index)
    await asyncio.sleep(random.uniform(0, 5))
    await page.goto(TEST_URL)
    await page.get_by_role("button").click()
    await page.screenshot(path=f"D:\\{browser_index}_{time.time()}.png", full_page=True)


async def run():
    playwright = await async_playwright().start()
    # Attach to each running Chrome over the Chrome DevTools Protocol
    browsers = [await playwright.chromium.connect_over_cdp(f"http://localhost:{START_PORT + i}/") for i in range(BROWSER_NUM)]
    contexts = [await browser.new_context() for browser in browsers]
    pages = await asyncio.gather(*(context.new_page() for context in contexts))
    preparations_tasks = [do_preparations(page, browser_index) for browser_index, page in enumerate(pages)]
    await asyncio.gather(*preparations_tasks)


asyncio.run(run())
```
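
By default `asyncio.gather` propagates the first exception, so one failing browser hides the outcome of the others. A sketch of a wrapper that caps concurrency with a semaphore and collects errors as results; `run_all`, `demo`, and the limit of 3 are illustrative, and `demo` stands in for a per-browser routine such as `do_preparations`.

```python
import asyncio


async def run_all(tasks, limit=3):
    """Run coroutines with at most `limit` in flight; collect results and errors."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(coro):
        async with semaphore:
            return await coro

    # return_exceptions=True returns raised exceptions as results instead of
    # propagating the first one, so one failing browser does not hide the rest.
    return await asyncio.gather(*(guarded(t) for t in tasks), return_exceptions=True)


async def demo(i):
    # Stand-in for a per-browser routine such as do_preparations(page, i)
    await asyncio.sleep(0.01)
    if i == 1:
        raise RuntimeError("browser 1 failed")
    return i


results = asyncio.run(run_all([demo(i) for i in range(3)]))
print(results)  # [0, RuntimeError('browser 1 failed'), 2]
```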
## References
[上海-悠悠 博客园 标签:python+playwright](https://www.cnblogs.com/yoyoketang/tag/python%2Bplaywright/): a very systematic walkthrough of many practical details.
[python+playwright 学习-3.页面操作 Action](https://www.cnblogs.com/yoyoketang/p/17140617.html): a systematic summary of common page actions.
[python+playwright 学习-43 Pyinstaller 打包生成独立的可执行文件。](https://www.cnblogs.com/yoyoketang/p/17274719.html): focuses on how to bundle the browser when packaging with PyInstaller.