playwright

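Advantages

Selenium drives the browser through WebDriver, whereas Playwright talks to the browser over its developer-tools (debugging) protocol. Installation is simple, and no separate driver binaries are needed.
Playwright has official bindings for JavaScript/TypeScript, Python, Java, and .NET. It does not depend on external drivers and launches the browser builds it installs itself, so startup is faster.
Selenium's classic WebDriver protocol runs over HTTP (one-way request/response), while Playwright uses a WebSocket (two-way) connection, so it can be notified of the browser's actual state.
For example, with Selenium you have to set up explicit waits for the elements you operate on, whereas Playwright waits automatically:
- Waiting for an element to appear (when locating an element it waits up to 30 s by default; the timeout is configurable, in milliseconds)
- Waiting for an event to occur
Playwright is generally much faster than Selenium and also offers an async API.
It can also send HTTP requests directly through its API.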
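The auto-wait timeout mentioned in the list above is expressed in milliseconds. Below is a minimal sketch of changing it, assuming a `page` object from Playwright's sync API. The helper name `set_timeouts` and the example values are illustrative and not part of the original post.

```python
def set_timeouts(page, action_seconds: float, navigation_seconds: float):
    """Apply default timeouts to a Playwright page and return them in ms."""
    action_ms = round(action_seconds * 1000)
    navigation_ms = round(navigation_seconds * 1000)
    # Default for methods that accept a timeout, e.g. click() and fill().
    page.set_default_timeout(action_ms)
    # Takes priority over the value above for goto(), reload() and similar calls.
    page.set_default_navigation_timeout(navigation_ms)
    return action_ms, navigation_ms


# Usage, given a page created with context.new_page():
#   set_timeouts(page, action_seconds=10, navigation_seconds=60)
```

An individual call can still override the default by passing its own `timeout` argument.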

Limitations

No support for legacy Edge or IE11. Playwright supports only the new, Chromium-based Microsoft Edge.
No testing on real mobile devices: Playwright emulates mobile devices in desktop browsers.

Installation

# Upgrade pip
pip install --upgrade pip
# Install the playwright package
pip install playwright
# Install the browsers Playwright drives; this may take a while
playwright install

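Testing

To record code, run the command below. It opens a browser and a code recorder, and every step you perform in the browser is recorded automatically as code.

python -m playwright codegen

The recorded code looks like this: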

from playwright.sync_api import Playwright, sync_playwright, expect
def run(playwright: Playwright) -> None:
    browser = playwright.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto("https://www.baidu.com/")
    page.locator("input[name=\"wd\"]").click()
    page.locator("input[name=\"wd\"]").fill("python")
    page.get_by_role("button", name="百度一下").click()
    page.wait_for_url("https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=1&rsv_idx=1&tn=baidu&wd=python&fenlei=256&rsv_pq=0xc3da98d700012600&rsv_t=a363ozUooWOMdrOI3S3PH3JauszohenVsQYNmRX6SyweDX91MOi0p89Sb4HG&rqlang=en&rsv_enter=0&rsv_dl=tb&rsv_sug3=6&rsv_sug1=1&rsv_sug7=100&rsv_btype=i&inputT=1846&rsv_sug4=1847&rsv_jmp=fail")
    # ---------------------
    context.close()
    browser.close()
with sync_playwright() as playwright:
    run(playwright)

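The code above shows that:

Playwright supports both synchronous and asynchronous usage
There is no need to download a WebDriver for each browser
Compared with Selenium there is one extra layer of abstraction, the context
Headless mode is supported and recommended (headless defaults to True)
Traditional locators (CSS, XPath, etc.) work, and there are also new locator styles (such as locating by text)
Instead of Selenium's pattern of locating an element first and then acting on it, the selector can be passed to the action method, so locating and acting happen together (Playwright also provides standalone locator methods as an option)
Many APIs use the with context-manager syntax
Of course, many people prefer to write test cases by hand in PyCharm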
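The recorded script uses the synchronous API. For comparison, here is a minimal sketch of the asynchronous style. The function name `fetch_title` is hypothetical, and it assumes the `playwright` argument is the object yielded by `async_playwright()`.

```python
import asyncio


async def fetch_title(playwright, url: str) -> str:
    """Launch Chromium, open `url`, and return the page title."""
    browser = await playwright.chromium.launch()
    try:
        page = await browser.new_page()
        await page.goto(url)
        return await page.title()
    finally:
        # Close the browser even if navigation fails.
        await browser.close()


# Usage (requires `pip install playwright` and `playwright install`):
#   from playwright.async_api import async_playwright
#
#   async def main():
#       async with async_playwright() as p:
#           print(await fetch_title(p, "https://example.com"))
#
#   asyncio.run(main())
```

Every call that talks to the browser is awaited, so several pages can be driven concurrently with `asyncio.gather`.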

Playwright basic concepts


Playwright's core concepts include:

Browser

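A Browser is an instance of Chromium, Firefox, or WebKit (the three browsers Playwright supports). A Playwright script usually starts by launching a browser instance and ends by closing it. A browser instance can be launched in headless mode (no GUI) or headed mode. Creating a Browser instance: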

from playwright.sync_api import sync_playwright
 
with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    browser.close()

Launching a browser instance is relatively expensive; Playwright is designed so that a single browser instance can serve many BrowserContexts efficiently.
API: Browser

BrowserContext

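A BrowserContext is like an independent incognito session: very lightweight, yet fully isolated.
(Translator's note: each browser instance can have multiple BrowserContexts, fully isolated from one another. For example, you can log in to two different accounts in two BrowserContexts, or use a different proxy in each context.)
Creating a context: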

browser = playwright.chromium.launch()
context = browser.new_context()
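To illustrate the isolation described above, the sketch below opens each URL in its own context on a single browser. The helper names `open_isolated_pages` and `close_all` are hypothetical. `browser` is assumed to be a Browser from the sync API.

```python
def open_isolated_pages(browser, urls):
    """Open each URL in its own BrowserContext; return (context, page) pairs."""
    sessions = []
    for url in urls:
        # Each context keeps its own cookies and storage, like a fresh
        # incognito window, so logins do not leak between sessions.
        context = browser.new_context()
        page = context.new_page()
        page.goto(url)
        sessions.append((context, page))
    return sessions


def close_all(sessions):
    """Close every context; closing a context also closes its pages."""
    for context, _page in sessions:
        context.close()


# Usage:
#   sessions = open_isolated_pages(browser, ["https://example.com"] * 2)
#   ... log in with a different account on each page ...
#   close_all(sessions)
```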

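A context can also be used to emulate multi-page scenarios involving mobile devices, permissions, locale, and color scheme. For example, creating a mobile context: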

from playwright.sync_api import sync_playwright
 
with sync_playwright() as p:
    iphone_11 = p.devices['iPhone 11 Pro']
    browser = p.webkit.launch(headless=False)
    context = browser.new_context(
        **iphone_11,
        locale='de-DE',
        geolocation={ 'longitude': 12.492507, 'latitude': 41.889938 },
        permissions=['geolocation']
    )
    browser.close()


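Page and Frame

A BrowserContext can have multiple pages; each page represents a tab or a popup window. A page is used to navigate to URLs and interact with the content inside it. Creating a page: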

page = context.new_page()
 
# Navigate explicitly, similar to entering a URL in the browser.
page.goto('http://example.com')
# Fill an input.
page.fill('#search', 'query')
 
# Navigate implicitly by clicking a link.
page.click('#submit')
# Expect a new url.
print(page.url)
 
# Page can navigate from the script - this will be picked up by Playwright.
# window.location.href = 'https://example.com'

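A page can have multiple frame objects but only one main frame; all page-level operations (such as click) act on the main frame. A page's other frames are attached through the iframe HTML tag, and they can be accessed to perform operations inside them.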

import re

# Get a frame by its name attribute
frame = page.frame('frame-login')
 
# Get a frame by URL (a plain string is treated as a glob, so pass a compiled regex)
frame = page.frame(url=re.compile(r'.*domain.*'))
 
# Get a frame through another selector
frame_element_handle = page.query_selector('.frame-class')
frame = frame_element_handle.content_frame()
 
# Interact with the frame
frame.fill('#username-input', 'John')

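In recording mode Playwright automatically detects whether an action happens inside a frame, so when a frame is hard to locate you can use the recorder to find it.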
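Besides the recorder, listing the frames attached to a page can help find the right one. This is a small sketch with a hypothetical helper name, `describe_frames`. It relies on the `page.frames`, `frame.name`, `frame.url`, and `frame.parent_frame` properties of the Python API.

```python
def describe_frames(page):
    """Return name, URL and a main-frame flag for every frame on the page."""
    return [
        {
            "name": frame.name,
            "url": frame.url,
            # The main frame is the only attached frame without a parent.
            "is_main": frame.parent_frame is None,
        }
        for frame in page.frames
    ]


# Usage:
#   for info in describe_frames(page):
#       print(info)
```

The printed name or URL can then be passed to `page.frame(...)` to get the frame object.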


Selector

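Playwright can locate elements by CSS selector, XPath selector, HTML attributes (such as id or data-test-id), or text content.
Except for XPath selectors, all selectors pierce shadow DOM by default. To search only the regular (light) DOM, use the *:light form of the engine, for example css:light. This is rarely needed.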

# Using data-test-id= selector engine
page.click('data-test-id=foo')
 
# CSS and XPath selector engines are automatically detected
page.click('div')
page.click('//html/body/div')
 
# Find node by text substring
page.click('text=Hello w')
 
# Explicit CSS and XPath notation
page.click('css=div')
page.click('xpath=//html/body/div')
 
# Only search light DOM, outside WebComponent shadow DOM:
page.click('css:light=div')
 
# Different selectors can be combined, joined with >>
# Click an element with text 'Sign Up' inside of a #free-month-promo.
page.click('#free-month-promo >> text=Sign Up')
 
# Capture textContent of a section that contains an element with text 'Selectors'.
section_text = page.eval_on_selector('*css=section >> text=Selectors', 'e => e.textContent')

Details:

Element selectors | Playwright Python
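When chained selectors are assembled from variables, a tiny helper keeps the `>>` syntax in one place. The function name `chain` is an illustrative assumption; it only builds a string and is not part of Playwright.

```python
def chain(*selectors: str) -> str:
    """Join selector parts with Playwright's '>>' chaining operator."""
    parts = [s.strip() for s in selectors if s and s.strip()]
    if not parts:
        raise ValueError("at least one non-empty selector is required")
    return " >> ".join(parts)


# Usage, with a page from the sync API:
#   page.click(chain("#free-month-promo", "text=Sign Up"))
```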

Auto-waiting

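Before performing an action, Playwright runs a series of actionability checks on the element to make sure the action behaves as expected. It automatically waits (auto-wait) until all relevant checks pass and only then performs the requested action. If the required checks do not pass within the given timeout, the action fails with a TimeoutError.
Actions such as page.click(selector, **kwargs) and page.fill(selector, value, **kwargs) auto-wait for the element to become visible and actionable. For example, click will:
- Wait for the element matching the selector to be attached to the DOM
- Wait for it to become visible: it has a non-empty bounding box and no visibility:hidden
- Wait for it to stop moving: for example, until a CSS transition finishes
- Scroll the element into view
- Wait for it to receive pointer events at the action point: for example, until it is no longer obscured by other elements
- Retry if the element is detached during any of the checks above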

# Playwright waits for #search element to be in the DOM
page.fill('#search', 'query')
 
# Playwright waits for element to stop animating
# and accept clicks.
page.click('#search')
 
# You can also wait explicitly
 
# Wait for #search to appear in the DOM.
page.wait_for_selector('#search', state='attached')
# Wait for #promo to become visible, for example with `visibility:visible`.
page.wait_for_selector('#promo')
 
# Wait for #details to become hidden, for example with `display:none`.
page.wait_for_selector('#details', state='hidden')
# Wait for #promo to be removed from the DOM.
page.wait_for_selector('#promo', state='detached')
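Conceptually, auto-waiting amounts to polling a condition until it holds or a deadline passes. The sketch below models that idea in plain Python. It is a simplified illustration and not Playwright's actual implementation. The name `wait_until` and the polling interval are assumptions.

```python
import time


def wait_until(condition, timeout_ms: float = 30_000, interval_ms: float = 100):
    """Poll `condition` until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout_ms} ms")
        time.sleep(interval_ms / 1000)


# Usage with any zero-argument callable, for example:
#   wait_until(lambda: page.is_visible("#promo"), timeout_ms=5_000)
```

In real tests the built-in waiting of `click`, `fill`, and `wait_for_selector` is usually enough; a helper like this is mainly useful for conditions Playwright cannot observe itself.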

Execution context

The page.evaluate(expression, **kwargs) API runs a JavaScript function in the context of the web page and returns the result to the Playwright environment. Browser globals such as window and document are available inside evaluate.

href = page.evaluate('() => document.location.href')
 
# if the result is a Promise or if the function is asynchronous evaluate will automatically wait until it's resolved
 
status = page.evaluate("""async () => {
  const response = await fetch(location.href)
  return response.status
}""")

Evaluation Argument

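page.evaluate(expression, **kwargs) accepts a single optional argument. This argument can be a mix of Serializable values and JSHandle or ElementHandle instances. Handles are automatically converted to the values they represent.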

result = page.evaluate("([x, y]) => Promise.resolve(x * y)", [7, 8])
print(result) # prints "56"
 
 
print(page.evaluate("1 + 2")) # prints "3"
x = 10
print(page.evaluate(f"1 + {x}")) # prints "11"
 
 
body_handle = page.query_selector("body")
html = page.evaluate("([body, suffix]) => body.innerHTML + suffix", [body_handle, "hello"])
body_handle.dispose()
 
 
# A primitive value.
page.evaluate('num => num', 42)
 
# An array.
page.evaluate('array => array.length', [1, 2, 3])
 
# An object.
page.evaluate('object => object.foo', { 'foo': 'bar' })
 
# A single handle.
button = page.query_selector('button')
page.evaluate('button => button.textContent', button)
 
# Alternative notation using elementHandle.evaluate.
button.evaluate('(button, from) => button.textContent.substring(from)', 5)
 
# Object with multiple handles.
button1 = page.query_selector('.button1')
button2 = page.query_selector('.button2')
page.evaluate("""o => o.button1.textContent + o.button2.textContent""",
    { 'button1': button1, 'button2': button2 })
 
# Object destructuring works. Note that property names must match
# between the destructured object and the argument.
# Also note the required parenthesis.
page.evaluate("""
    ({ button1, button2 }) => button1.textContent + button2.textContent""",
    { 'button1': button1, 'button2': button2 })
 
# Array works as well. Arbitrary names can be used for destructuring.
# Note the required parenthesis.
page.evaluate("""
    ([b1, b2]) => b1.textContent + b2.textContent""",
    [button1, button2])
 
# Any non-cyclic mix of serializables and handles works.
page.evaluate("""
    x => x.button1.textContent + x.list[0].textContent + String(x.foo)""",
    { 'button1': button1, 'list': [button2], 'foo': None })

Using with pytest

testcase\conftest.py

import pytest
from playwright.sync_api import sync_playwright
from py._xmlgen import html


@pytest.fixture()
def browser():
    playwright = sync_playwright().start()
    browser = playwright.chromium.launch(headless=False)

    # Hand the browser to the test
    yield browser

    # Teardown after the test finishes
    browser.close()
    playwright.stop()


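@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    report = outcome.get_result()
    # Use the test function's docstring as the description column
    report.description = str(item.function.__doc__)
    # Decode \uXXXX escapes so non-ASCII node ids display correctly
    report.nodeid = report.nodeid.encode("utf-8").decode("unicode_escape")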

def pytest_html_results_table_header(cells):
    cells.insert(1, html.th('Test case name'))
    cells.insert(2, html.th('Test_nodeid'))
    cells.pop(2)


def pytest_html_results_table_row(report, cells):
    cells.insert(1, html.td(report.description))
    cells.insert(2, html.td(report.nodeid))
    cells.pop(2)


def pytest_html_results_table_html(report, data):
    if report.passed:
        del data[:]
        data.append(html.div('No log output captured for passed tests.', class_='empty log'))


def pytest_html_report_title(report):
    report.title = "pytest sample project test report"

testcase\test1.py (page.request.get can send HTTP requests directly)

import pytest


class TestClassName:
    @pytest.mark.usefixtures("browser")
    def test_func_name1(self, browser):
        context = browser.new_context()
        page = context.new_page()
        # Send an HTTP request
        resp = page.request.get("http://www.kuaidi100.com/query?type=")
        print(resp.text())
        page.goto("https://www.baidu.com/")
        assert page.title() == "百度一下，你就知道"
        
        page.locator("input[name=\"wd\"]").click()
        page.locator("input[name=\"wd\"]").fill("python")
        page.get_by_role("button", name="百度一下").click()
        context.close()

    @pytest.mark.usefixtures("browser")
    def test_func_name1_1(self, browser):
        context = browser.new_context()
        page = context.new_page()
        page.goto("https://www.baidu.com/")
        assert page.title() == "百度一下，你就知道1"
        page.locator("input[name=\"wd\"]").click()
        page.locator("input[name=\"wd\"]").fill("python")
        page.get_by_role("button", name="百度一下").click()
        context.close()
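The `page.request.get` call used in the first test returns an APIResponse object. Below is a hedged sketch of wrapping it so that a failed request stops the test early. The helper name `fetch_json` is hypothetical, and it assumes the endpoint returns JSON.

```python
def fetch_json(page, url: str):
    """GET `url` via the page's request context and return the decoded JSON."""
    response = page.request.get(url)
    # APIResponse.ok is True for status codes in the 200-299 range.
    if not response.ok:
        raise RuntimeError(f"GET {url} failed with status {response.status}")
    return response.json()


# Usage inside a test, with `page` created from the browser fixture:
#   data = fetch_json(page, "https://example.com/api/items")
#   assert isinstance(data, (dict, list))
```

Requests sent this way share cookies with the page's context, which is convenient for preparing server-side state before a UI step.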

Running the tests

# Run all tests in a batch
pytest -s testcase\  --html=report.html --self-contained-html --capture=sys

# Run tests in multiple threads
pip install pytest-multithreading -i https://pypi.douban.com/simple
pytest -s testcase/ --th 10 --html=report.html --self-contained-html --capture=sys

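Viewing the results

Checking element visibility

When locating elements, it often happens that an element has appeared on the page but still cannot be located. In that case, check the visibility of the DOM element: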

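from playwright.sync_api import TimeoutError as PlaywrightTimeoutError


def find_el(page, el, timeout=10000):
    try:
        # Wait for the element to be attached to the DOM
        element = page.wait_for_selector(el, state="attached", timeout=timeout)
        # Wait for the element to become visible
        element.wait_for_element_state("visible", timeout=timeout)
        return element
    except PlaywrightTimeoutError:
        # The element was not attached or did not become visible in time
        return None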
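When an element seems to be present but cannot be acted on, it can help to count how many elements match the selector and how many of them are visible. This is a minimal sketch using the Locator API. The helper name `diagnose` is an assumption for illustration.

```python
def diagnose(page, selector: str) -> dict:
    """Report how many elements match `selector` and how many are visible."""
    locator = page.locator(selector)
    total = locator.count()
    # is_visible() checks the current state and does not wait.
    visible = sum(1 for i in range(total) if locator.nth(i).is_visible())
    return {"matches": total, "visible": visible}


# Usage:
#   print(diagnose(page, "input[name='wd']"))
```

A result with several matches but fewer visible ones suggests the selector also hits hidden duplicates and should be narrowed.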

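Miscellaneous

Integration with CI is still to be tried out
For running in parallel across machines, you can launch multiple processes, or create several nodes on the CI system to run the tests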
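One way to picture the multi-process option is to split the test files into groups and start one pytest process per group. The sketch below is an assumption about how this could be done. The function names are hypothetical and the post itself does not prescribe a method. The pytest-xdist plugin (`pytest -n 4`) is an existing alternative.

```python
import subprocess
import sys


def split_round_robin(items, n: int):
    """Distribute `items` over `n` groups in round-robin order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [list(items[i::n]) for i in range(n)]


def run_in_processes(test_files, workers: int):
    """Start one pytest process per non-empty group and return the exit codes."""
    groups = [g for g in split_round_robin(sorted(test_files), workers) if g]
    procs = [
        subprocess.Popen([sys.executable, "-m", "pytest", *group])
        for group in groups
    ]
    return [proc.wait() for proc in procs]


# Usage:
#   codes = run_in_processes(["testcase/test1.py", "testcase/test2.py"], workers=2)
```

Each process launches its own browser through the fixture, so the tests must not depend on shared state.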

Shi Kun's blog

Playwright, a powerful tool for web automation
