1. Background

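Some websites need a web page screenshot feature, for example to attach a screenshot to user feedback, or to periodically send out the data shown on a project's statistics report page. Some implementations use PhantomJS, but it has problems such as processes that are not cleaned up on exit and being easy for anti-crawler measures to detect. PhantomJS is also no longer updated or maintained.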

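A headless browser is a browser running without a user interface: it exposes the browser's capabilities without opening the browser GUI. Compared with other browsers, Chrome Headless makes it more convenient to run web automation, write crawlers, take screenshots, and so on, which fits the web page screenshot requirement well.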

2. selenium + Chrome headless

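Selenium is a tool for testing web applications. Its advantage is that any page a browser can open can generally also be fetched through selenium. Combined with Chrome headless, it handles the web page screenshot use case well.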

3. Code analysis

Let's look at how selenium implements driver.get().

The call mainly goes through the get method of org.openqa.selenium.remote.RemoteWebDriver:

    public void get(String url) {
        this.execute("get", ImmutableMap.of("url", url));
    }

The execute method is implemented as follows:

protected Response execute(String driverCommand, Map<String, ?> parameters) {
        Command command = new Command(this.sessionId, driverCommand, parameters);
        long start = System.currentTimeMillis();
        String currentName = Thread.currentThread().getName();
        Thread.currentThread().setName(String.format("Forwarding %s on session %s to remote", driverCommand, this.sessionId));

        Response response;
        try {
            this.log(this.sessionId, command.getName(), command, RemoteWebDriver.When.BEFORE);
            response = this.executor.execute(command);
            this.log(this.sessionId, command.getName(), command, RemoteWebDriver.When.AFTER);
            Object value;
            if (response == null) {
                value = null;
                return (Response)value;
            }

            value = this.converter.apply(response.getValue());
            response.setValue(value);
        } catch (SessionNotFoundException var17) {
            throw var17;
        } catch (Exception var18) {
            this.log(this.sessionId, command.getName(), command, RemoteWebDriver.When.EXCEPTION);
            String errorMessage = "Error communicating with the remote browser. It may have died.";
            if (driverCommand.equals("newSession")) {
                errorMessage = "Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.";
            }

            UnreachableBrowserException ube = new UnreachableBrowserException(errorMessage, var18);
            if (this.getSessionId() != null) {
                ube.addInfo("Session ID", this.getSessionId().toString());
            }

            if (this.getCapabilities() != null) {
                ube.addInfo("Capabilities", this.getCapabilities().toString());
            }

            throw ube;
        } finally {
            Thread.currentThread().setName(currentName);
        }

        try {
            this.errorHandler.throwIfResponseFailed(response, System.currentTimeMillis() - start);
        } catch (WebDriverException var16) {
            if (parameters != null && parameters.containsKey("using") && parameters.containsKey("value")) {
                var16.addInfo("*** Element info", String.format("{Using=%s, value=%s}", parameters.get("using"), parameters.get("value")));
            }

            var16.addInfo("Driver info", this.getClass().getName());
            if (this.getSessionId() != null) {
                var16.addInfo("Session ID", this.getSessionId().toString());
            }

            if (this.getCapabilities() != null) {
                var16.addInfo("Capabilities", this.getCapabilities().toString());
            }

            Throwables.propagate(var16);
        }

        return response;
    }

The URL is carried in parameters. It is first wrapped in a Command object, and then the execute method of DriverCommandExecutor is called:

Command command = new Command(this.sessionId, driverCommand, parameters);

public Response execute(Command command) throws IOException {
        if ("newSession".equals(command.getName())) {
            this.service.start();
        }

        Response var2;
        try {
            var2 = super.execute(command);
        } catch (Throwable var7) {
            Throwable rootCause = Throwables.getRootCause(var7);
            if (rootCause instanceof ConnectException && "Connection refused".equals(rootCause.getMessage()) && !this.service.isRunning()) {
                throw new WebDriverException("The driver server has unexpectedly died!", var7);
            }

            Throwables.propagateIfPossible(var7);
            throw new WebDriverException(var7);
        } finally {
            if ("quit".equals(command.getName())) {
                this.service.stop();
            }

        }

        return var2;
    }

That in turn calls the execute method of HttpCommandExecutor. The Command object still holds the request parameters from before, including the URL:

public Response execute(Command command) throws IOException {
        if (command.getSessionId() == null) {
            if ("quit".equals(command.getName())) {
                return new Response();
            }

            if (!"getAllSessions".equals(command.getName()) && !"newSession".equals(command.getName())) {
                throw new SessionNotFoundException("Session ID is null. Using WebDriver after calling quit()?");
            }
        }

        HttpRequest httpRequest = this.commandCodec.encode(command);

        try {
            this.log("profiler", new HttpProfilerLogEntry(command.getName(), true));
            HttpResponse httpResponse = this.client.execute(httpRequest, true);
            this.log("profiler", new HttpProfilerLogEntry(command.getName(), false));
            Response response = this.responseCodec.decode(httpResponse);
            if (response.getSessionId() == null && httpResponse.getTargetHost() != null) {
                String sessionId = HttpSessionId.getSessionId(httpResponse.getTargetHost());
                response.setSessionId(sessionId);
            }

            if ("quit".equals(command.getName())) {
                this.client.close();
            }

            return response;
        } catch (UnsupportedCommandException var6) {
            if (var6.getMessage() != null && !"".equals(var6.getMessage())) {
                throw var6;
            } else {
                throw new UnsupportedOperationException("No information from server. Command name was: " + command.getName(), var6.getCause());
            }
        }
    }

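The HTTP request (org.openqa.selenium.remote.http.HttpRequest) is then built in the encode method of the command codec, JsonHttpCommandCodec in the code below. Once the HttpRequest has been built, the remaining steps convert the parameters and send the request, so that the browser is driven to visit the URL: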

public HttpRequest encode(Command command) {
        JsonHttpCommandCodec.CommandSpec spec = (JsonHttpCommandCodec.CommandSpec)this.nameToSpec.get(command.getName());
        if (spec == null) {
            throw new UnsupportedCommandException(command.getName());
        } else {
            String uri = this.buildUri(command, spec);
            HttpRequest request = new HttpRequest(spec.method, uri);
            if (HttpMethod.POST == spec.method) {
                String content = this.beanToJsonConverter.convert(command.getParameters());
                byte[] data = content.getBytes(Charsets.UTF_8);
                request.setHeader("Content-Length", String.valueOf(data.length));
                request.setHeader("Content-Type", MediaType.JSON_UTF_8.toString());
                request.setContent(data);
            }

            if (HttpMethod.GET == spec.method) {
                request.setHeader("Cache-Control", "no-cache");
            }

            return request;
        }
    }
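
To make the data flow easier to follow, here is a minimal sketch of what the encoding step amounts to for the get command. It is a simplified stand-in written with the JDK only, not Selenium's actual code: the class GetCommandEncodingSketch, its helper methods and the session id abc123 are all hypothetical. The path /session/{sessionId}/url is the WebDriver endpoint for navigation. The point is only that the user-supplied URL is serialized into the JSON body unchanged.

```java
import java.util.Collections;
import java.util.Map;

/** Simplified, hypothetical stand-in for the encoding step; not Selenium's code. */
final class GetCommandEncodingSketch {

    /** Builds the path that the "get" (navigate) command is posted to. */
    static String buildUri(String sessionId) {
        return "/session/" + sessionId + "/url";
    }

    /** Minimal JSON string escaping, enough for this illustration. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"') {
                sb.append("\\\"");
            } else if (c == '\\') {
                sb.append("\\\\");
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Serializes the parameters as they are; no scheme or host check happens here. */
    static String encodeBody(Map<String, String> parameters) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : parameters.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
            first = false;
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        // The same parameter map shape that get(String url) builds: {"url": url}
        String body = encodeBody(Collections.singletonMap("url", "file:///etc/passwd"));
        System.out.println("POST " + buildUri("abc123") + " " + body);
    }
}
```

Whatever string the caller passes to get() reaches the driver as the value of the url field, so any filtering has to happen before get() is called.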

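Nowhere in this process is the incoming url subjected to any security check. Under the hood, the request is sent through org.apache.httpcomponents.httpclient; note that this HTTP request only delivers the command to the driver, and the target URL is then loaded by the browser itself. As a result, besides http and https, the file protocol can also be used to read arbitrary files: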

Passing file:///etc/passwd as the url parameter successfully produces a screenshot of the file's contents.
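
One way to close off the file protocol is to check the scheme before the URL ever reaches driver.get(). The following is a minimal sketch using only the JDK, assuming the screenshot feature only needs http and https; the class name ScreenshotUrlSchemeCheck and the method hasAllowedScheme are hypothetical names chosen for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: allowlist check for user-supplied screenshot URLs. */
final class ScreenshotUrlSchemeCheck {

    // Assumption: only these schemes are needed by the business feature.
    private static final Set<String> ALLOWED_SCHEMES =
            new HashSet<>(Arrays.asList("http", "https"));

    /** Returns true only for absolute URLs with an allowed scheme and a host. */
    static boolean hasAllowedScheme(String rawUrl) {
        if (rawUrl == null) {
            return false;
        }
        try {
            URI uri = new URI(rawUrl.trim());
            String scheme = uri.getScheme();
            return scheme != null
                    && uri.getHost() != null
                    && ALLOWED_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            // Anything that does not parse cleanly is rejected.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasAllowedScheme("https://example.com/report"));
        System.out.println(hasAllowedScheme("file:///etc/passwd"));
    }
}
```

An allowlist is used rather than a blocklist of file, because browsers understand further schemes as well. Keep in mind that a browser may parse unusual URLs more leniently than java.net.URI does, so rejecting anything that fails to parse is the safer default.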

4. Other

Beyond selenium, the Java library cdp4j has a similar problem. (cdp4j offers a clean, concise API for automating Chrome / Chromium based browsers, and it does so through the Google Chrome DevTools protocol.)

<dependency>
    <groupId>io.webfolder</groupId>
    <artifactId>cdp4j</artifactId>
    <version>2.2.1</version>
</dependency>

For example:

        ArrayList<String> arguments = new ArrayList<String>();
        // Adding this line would keep the Chrome window from popping up
        //arguments.add("--headless");
        Launcher launcher = new Launcher();
        // Start the local Chrome with these command-line arguments
        try (SessionFactory factory = launcher.launch(Arrays.asList("--disable-gpu", "--headless"));
             Session session = factory.create()) {
            // The URL of the page to fetch
            session.navigate("********");
            // Wait until the document has finished loading
            session.waitDocumentReady();
            // Read the fetched content
            String content = (String) session.getProperty("//body", "outerText");
            System.out.println("---------");
            System.out.println(content);
        }

If the URL passed to navigate is user-controllable, there is an SSRF risk, and the file protocol is supported here as well.
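
Apart from the file protocol, SSRF usually targets HTTP services that are only reachable from the server's own network. A possible mitigation, sketched below with the JDK only, is to resolve the host and refuse loopback, link-local and private addresses before calling navigate. The class InternalAddressCheck and its methods are hypothetical names, and the set of ranges treated as internal is an assumption to adjust for the actual deployment.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper: rejects hosts that resolve to internal addresses. */
final class InternalAddressCheck {

    /** True for loopback, wildcard, link-local and site-local addresses. */
    static boolean isInternal(InetAddress addr) {
        return addr.isLoopbackAddress()      // 127.0.0.0/8, ::1
                || addr.isAnyLocalAddress()  // 0.0.0.0, ::
                || addr.isLinkLocalAddress() // 169.254.0.0/16, fe80::/10
                || addr.isSiteLocalAddress(); // 10/8, 172.16/12, 192.168/16
        // Note: IPv6 unique local addresses (fc00::/7) are not covered by
        // these JDK predicates and would need an explicit check.
    }

    /** Returns true if any address the host resolves to is internal. */
    static boolean resolvesToInternal(String host) {
        try {
            for (InetAddress addr : InetAddress.getAllByName(host)) {
                if (isInternal(addr)) {
                    return true;
                }
            }
            return false;
        } catch (UnknownHostException e) {
            // Fail closed: an unresolvable host is treated as not allowed.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolvesToInternal("127.0.0.1"));
        System.out.println(resolvesToInternal("169.254.169.254"));
        System.out.println(resolvesToInternal("8.8.8.8"));
    }
}
```

This check has a known limitation: the browser resolves the host name again when it loads the page, so a DNS answer that changes between the check and the navigation can bypass it. Network-level restrictions on the machine running the headless browser are a useful complement.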

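In summary, when using third-party jars to implement business features, filter and check user-controllable parameters according to the actual scenario, so as to avoid unnecessary security risks. Likewise, in black-box testing, web page screenshot features are a risk point that test coverage should include.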
