Compare commits

128 Commits
4.5 ... master

Author SHA1 Message Date
Antoni Sawicki b098b3632a README update 2024-01-06 02:13:22 -08:00
Antoni Sawicki 5ec0266f75 update README 2024-01-06 02:10:01 -08:00
Antoni Sawicki 7d9cec6297 add some anti crawler detection 2024-01-02 21:55:39 -08:00
Antoni Sawicki 43df501d97 add useragent override flag 2024-01-02 21:54:59 -08:00
Antoni Sawicki 85d64a3577 copyright update 2024-01-02 02:09:38 -08:00
Antoni Sawicki 993e5723ed more mod updates 2024-01-02 02:08:14 -08:00
Antoni Sawicki 9056c798d6 go mod update 2024-01-02 02:00:30 -08:00
Antoni Sawicki 8fd65b3e47 readme update 2022-12-10 03:59:30 -08:00
Antoni Sawicki 8716a983ce readme update 2022-12-10 03:42:13 -08:00
Antoni Sawicki 4757cfd32b readme update 2022-12-10 03:38:56 -08:00
Antoni Sawicki 79e6f3a17e readme update 2022-12-10 03:38:22 -08:00
Antoni Sawicki d8f5c6fb28 print my own IP addresses 2022-12-08 01:49:46 -08:00
Antoni Sawicki c816ef712a readme update 2022-12-08 01:03:04 -08:00
Antoni Sawicki d28924583c move png to first case 2022-12-08 00:56:24 -08:00
Antoni Sawicki 1c17b39ea5 add jpeg encoding 2022-12-08 00:55:56 -08:00
Antoni Sawicki ecb2cc0c06 add todos for error handling 2022-12-08 00:28:28 -08:00
Antoni Sawicki b571df7a37 single function to handle context errors 2022-12-08 00:18:13 -08:00
Antoni Sawicki 78c568ac09 remove debug flag, never used 2022-12-07 23:54:18 -08:00
Antoni Sawicki 04a755749e close template files 2022-12-07 23:50:58 -08:00
Antoni Sawicki e983f244f8 move flags top top var() 2022-12-07 23:44:09 -08:00
Antoni Sawicki 9b76c045d6 fix html template embed 2022-11-30 02:00:53 -08:00
Antoni Sawicki c8e391274a update dependencies 2022-11-30 00:02:59 -08:00
Antoni Sawicki 7b1274b9d4 ver bump 2022-11-30 00:01:01 -08:00
Antoni Sawicki 0128b3ff8e add delay/sleep flag 2022-11-29 23:59:43 -08:00
Antoni Sawicki 6de3fad580 Merge pull request #101 from tenox7/fastgif: Fastgif 2022-11-29 23:45:53 -08:00
Antoni Sawicki d7dcb58adc readme update 2022-11-29 23:45:01 -08:00
Antoni Sawicki e5f83225f7 update readme 2022-11-29 07:02:06 -08:00
Antoni Sawicki 36803c4312 update readme 2022-11-29 06:57:31 -08:00
Antoni Sawicki 40e081be77 refactor fastgif code for readability 2022-11-29 05:50:17 -08:00
Antoni Sawicki 1c18fb9b81 Merge pull request #99 from mahiuchun/master: Introduce "fastgif" image type 2022-11-29 04:58:01 -08:00
Antoni Sawicki 363fbcd225 remove 32bit windows build 2022-11-24 02:11:05 -08:00
Hill Ma bf7e7bfb2c improve fastgif 2022-11-10 20:39:41 -08:00
Hill Ma fec812bc32 Add "fastgif" image type that provides much faster (but lower quality) GIF quantization. The default image type is changed to PNG as it is the most "trouble free" choice. 2022-11-09 21:32:46 -08:00
Hill Ma a238a0ea6f gif: faster median cut quantization 2022-11-09 20:48:18 -08:00
Antoni Sawicki 444b7b31d7 update copyrights 2022-11-06 01:41:40 -08:00
Antoni Sawicki 76e72d5368 bump version 2022-11-06 01:39:03 -08:00
Antoni Sawicki c4d9833707 add todo 2022-11-06 01:38:31 -08:00
Antoni Sawicki 9cd286add8 if/else to switch 2022-11-06 01:20:18 -08:00
Antoni Sawicki fdfbe80024 more fortunate name static->builtin 2022-11-06 01:12:26 -08:00
Antoni Sawicki d602124ed6 replace out/req with more common w/r 2022-11-06 01:11:49 -08:00
Antoni Sawicki 6e5e829b02 rename w to rq so that w can be used for http writer 2022-11-06 01:07:21 -08:00
Antoni Sawicki f978d91ba9 make wrp request have methods for readability 2022-11-06 01:04:00 -08:00
Antoni Sawicki 9b15feacb2 fix chromedp scrollbar capture 2022-11-06 01:47:59 -07:00
Antoni Sawicki 08a89c6097 update dependencies 2022-11-06 00:51:44 -07:00
Antoni Sawicki fa3bd3f8fb print time took to encode gif 2022-11-06 00:27:10 -07:00
Antoni Sawicki 55f4e45b4c reverse number of colors dropdown 2022-11-05 23:36:25 -07:00
Antoni Sawicki 9e77aa7261 Merge pull request #97 from DrJosh9000/master: Replace statik with embed 2022-03-16 20:52:28 -07:00
Josh Deprez 8a9870d8e2 Remove unneeded go:generate directive 2022-03-17 02:34:22 +00:00
Josh Deprez 7c33bc67dc Replace statik with embed 2022-03-17 02:27:34 +00:00
Antoni Sawicki 48c4ab8254 add 32bit windows target 2021-03-22 06:28:37 -07:00
Antoni Sawicki bc5f8cabb1 readme update 2021-03-08 05:20:11 -08:00
Antoni Sawicki cc16d6c3b9 readme update 2021-03-08 05:17:57 -08:00
Antoni Sawicki 1d26a451ea hide arrow keys to save some space 2021-03-08 05:14:18 -08:00
Antoni Sawicki cfb608c1f3 rename Scale to Zoom 2021-03-08 05:02:22 -08:00
Antoni Sawicki d1fcd30db8 fix markdown for scale 2021-03-08 04:48:51 -08:00
Antoni Sawicki c7811a6886 add support for Stop and Reload buttons 2021-03-08 04:41:13 -08:00
Antoni Sawicki 3b88b8665b tidy up 2021-03-08 04:26:09 -08:00
Antoni Sawicki ad84f6d087 go mod update 2021-03-08 04:18:53 -08:00
Antoni Sawicki 97d1443d8c Update README.md 2021-02-08 08:07:09 -08:00
Antoni Sawicki 82bbd4bdaa Update README.md 2020-11-13 17:56:59 -08:00
Antoni Sawicki 36d0cdcb0a Update README.md 2020-11-13 17:56:33 -08:00
Antoni Sawicki f5be172d43 Update README.md 2020-11-13 17:55:23 -08:00
Antoni Sawicki cf8b85e15a Update README.md 2020-11-11 07:37:13 -08:00
Antoni Sawicki 700c4aa495 rearrange functions to be groupped together 2020-11-05 08:05:50 -08:00
Antoni Sawicki f05dde8188 remove methods as they don't change state, wrpReq is an input to a function 2020-11-05 07:43:06 -08:00
Antoni Sawicki 2a22cfd755 add more camels 2020-10-31 08:51:20 -07:00
Antoni Sawicki 56ac414405 change input boxes to dropdowns 2020-10-31 08:36:14 -07:00
Antoni Sawicki a79f477948 add support for built-in wrp.html 2020-10-29 07:16:14 -07:00
Antoni Sawicki c036841c0a revert sleep setting 2020-10-27 04:39:14 -07:00
Antoni Sawicki 878f43af75 add initial support for ui/html template 2020-10-27 04:35:57 -07:00
Antoni Sawicki b54ebbf9e5 move chromedp.Sleep to just before screenshot to fix render issue 2020-10-27 04:05:17 -07:00
Antoni Sawicki 5dc4699ac9 add missing return 2020-10-27 03:44:43 -07:00
Antoni Sawicki f6e1f3ee88 readme update 2020-10-26 02:15:29 -07:00
Antoni Sawicki 8aaf435225 readme update 2020-10-26 02:08:44 -07:00
Antoni Sawicki 4fa913a9dd readme update 2020-10-26 02:06:45 -07:00
Antoni Sawicki 4302731bc8 readme update 2020-10-26 02:01:32 -07:00
Antoni Sawicki d6b33ad140 remove else to improve readability 2020-10-26 01:49:03 -07:00
Antoni Sawicki 3dddb70be0 remove else to improve readability 2020-10-26 01:42:11 -07:00
Antoni Sawicki 3ff226e1df reverse headed and headless to improve readability 2020-10-26 01:37:25 -07:00
Antoni Sawicki 1ddf005a23 if to switch to improve readability 2020-10-26 01:32:09 -07:00
Antoni Sawicki 69efa1fb92 if to switch to improve readability 2020-10-26 01:27:21 -07:00
Antoni Sawicki 2c2fbd11a6 refactor stuff to improve readability 2020-10-26 01:21:42 -07:00
Antoni Sawicki 259f998787 Merge pull request #73 from debounce2/master: add support for monochrome 2 colors 2020-09-29 12:21:47 -07:00
debounce 1ab9124a4f add halfgone to go modules 2020-09-29 11:39:42 -07:00
Antoni Sawicki 4d0c8b9e7e reference old archive repo 2020-09-28 22:00:20 -07:00
Antoni Sawicki fa25e816a7 moved old folder to separate repo 2020-09-28 21:58:28 -07:00
debounce 36427fac64 add support for monochrome 2 colors 2020-09-28 12:19:27 -07:00
Antoni Sawicki ac594cdebd Update README.md 2020-04-28 04:50:55 -07:00
Antoni Sawicki 11b5ce9b6d Update README.md 2020-04-28 04:40:02 -07:00
Antoni Sawicki b8ae1ceba5 readme update for ACI 2020-04-28 04:37:15 -07:00
Antoni Sawicki 3224c63fd1 readme update 2020-04-27 19:22:39 -07:00
Antoni Sawicki 733be4a14a readme update 2020-04-27 18:58:45 -07:00
Antoni Sawicki 4533e38a31 readme update 2020-04-27 15:52:11 -07:00
Antoni Sawicki d4043f0b7d update readme to add Chromium 2020-04-27 15:49:59 -07:00
Antoni Sawicki c64380dd72 args print with %q 2020-04-27 12:29:18 -07:00
Antoni Sawicki 4d911cb330 add printing image type / geometry 2020-04-27 12:22:59 -07:00
Antoni Sawicki a3beaf4b14 fix typo 2020-04-27 12:19:57 -07:00
Antoni Sawicki 311bb829da print args to aid debugging 2020-04-27 12:12:39 -07:00
Antoni Sawicki 41dfa7dae2 readme update 2020-04-27 01:56:56 -07:00
Antoni Sawicki 889561aeb0 Add Cloud Run info 2020-04-27 01:49:55 -07:00
Antoni Sawicki b90300ba2d allow env var PORT to override listen address flag for Cloud RUN, etc 2020-04-27 01:05:02 -07:00
Antoni Sawicki c80cb876ce readme update 2020-04-26 02:30:16 -07:00
Antoni Sawicki ef04d2da72 more fortunate log entry for wrpreq/from 2020-04-26 01:33:20 -07:00
Antoni Sawicki 2e9773f705 break down footer fprintf to multiple lines 2020-04-26 01:32:08 -07:00
Antoni Sawicki 0957fedaee change image size from MB to KB 2020-04-26 01:25:28 -07:00
Antoni Sawicki 60ca1a0d50 lame attempt at restarting cancelled context 2020-04-26 00:23:31 -07:00
Antoni Sawicki 0c728b08fe readme update 2020-04-26 00:18:27 -07:00
Antoni Sawicki 81b47eb59c readme update 2020-04-26 00:17:18 -07:00
Antoni Sawicki 15ebf497b8 readme update 2020-04-26 00:15:27 -07:00
Antoni Sawicki 9215ed57c0 readme update 2020-04-26 00:14:33 -07:00
Antoni Sawicki ba0b521762 add go module cruft 2020-04-26 00:07:50 -07:00
Antoni Sawicki 34b25be7d7 rename wrpReq variable names to be more readable 2020-04-25 23:56:14 -07:00
Antoni Sawicki c4e3671468 add some comments 2020-04-24 03:09:20 -07:00
Antoni Sawicki f73c778b7c add some comments 2020-04-24 03:06:21 -07:00
Antoni Sawicki c9cedb7f81 split capture with kbdmouse function 2020-04-24 02:45:34 -07:00
Antoni Sawicki f69a6e5219 include http req and out in wrpReq sturct 2020-04-24 00:47:11 -07:00
Antoni Sawicki a258f603b3 add chrome requirement info 2020-04-23 03:30:06 -07:00
Antoni Sawicki b30458930b copyright update 2020-04-23 03:27:16 -07:00
Antoni Sawicki 9fca2704dc gitignore update 2020-04-23 03:26:56 -07:00
Antoni Sawicki 62b11cb216 fix css computed style property type 2020-04-23 03:25:39 -07:00
Antoni Sawicki 78f9598af5 Update README.md 2020-03-23 19:54:03 -07:00
Antoni Sawicki 5ce1c2456f Update README.md 2020-03-23 19:47:35 -07:00
Antoni Sawicki c93c2c883e Merge pull request #65 from khawkins98/patch-1: Add a troubleshooting section 2020-02-11 22:20:14 -08:00
Ken Hawkins d7a47d366b Add a troubleshooting section: I'm an idiot and it took me a minute to realise that my download was not broken and I should open this in the terminal. Some text like this in the README.md might help others. 2020-02-11 20:20:46 +01:00
Antoni Sawicki ffcaca4907 readme update 2019-12-25 03:42:15 -08:00
Antoni Sawicki 260840adb5 readme update 2019-12-25 03:39:07 -08:00
Antoni Sawicki fcd746aa9a readme updates 2019-12-25 03:35:48 -08:00
Antoni Sawicki e2c06b2e7b change int64 to float64 for chromedp.MouseClickXY 2019-12-25 03:04:46 -08:00
12 changed files with 720 additions and 2359 deletions

5
.gitignore vendored

@ -1 +1,4 @@
wrp-*
wrp-*
wrp
wrp.exe
statik

3
Makefile Normal file → Executable file

@ -3,11 +3,12 @@ all: wrp
wrp: wrp.go
go build wrp.go
cross:
cross:
GOOS=linux GOARCH=amd64 go build -a -o wrp-amd64-linux wrp.go
GOOS=freebsd GOARCH=amd64 go build -a -o wrp-amd64-freebsd wrp.go
GOOS=openbsd GOARCH=amd64 go build -a -o wrp-amd64-openbsd wrp.go
GOOS=darwin GOARCH=amd64 go build -a -o wrp-amd64-macos wrp.go
GOOS=darwin GOARCH=arm64 go build -a -o wrp-arm64-macos wrp.go
GOOS=windows GOARCH=amd64 go build -a -o wrp-amd64-windows.exe wrp.go
GOOS=linux GOARCH=arm go build -a -o wrp-arm-linux wrp.go
GOOS=linux GOARCH=arm64 go build -a -o wrp-arm64-linux wrp.go

145
README.md

@ -4,64 +4,149 @@ A browser-in-browser "proxy" server that allows to use historical / vintage web
![Internet Explorer 1.5 doing Gmail](wrp.png)
## Usage
## Usage Instructions
1. [Download a WRP binary](https://github.com/tenox7/wrp/releases/) and run it on a machine that will become your WRP gateway/server.
2. Point your legacy browser to `http://address:port` of WRP server. Do not set or use it as a "http proxy server".
3. Type a search string or a http/https URL and click Go.
4. Adjust your screen width/height/scale/#colors to fit in your old browser.
5. Scroll web page by clicking on the in-image scroll bar.
6. Do not use client browser history-back, instead use **Bk** button in the app.
7. To send keystrokes, fill **K** input box and press Go. There also are buttons for backspace, enter and arrow keys.
8. Experimentally you can set height **H** to `0` to render in to one tall image without the vertical scrollbar. Note it will be large, slow to process, download and display on client browser.
* [Download a WRP binary](https://github.com/tenox7/wrp/releases/) and run it on a machine that will become your WRP gateway/server. This machine should have modern hardware and a modern OS, and Google Chrome / Chromium Browser must be preinstalled. Do not try to run WRP on an old machine like Windows XP or 98.
* Make sure you have disabled the firewall or opened the port WRP is listening on (8080 by default).
* Point your legacy browser to `http://address:port` of the WRP server. Do not set or use it as a "proxy server".
* Type a search string or a full http/https URL and click **Go**.
* Adjust your screen **W**idth/**H**eight/**S**cale/**C**olors to fit in your old browser.
* Scroll web page by clicking on the in-image scroll bar.
* WRP can also render **a single tall image without the vertical scrollbar** and rely on client scrolling. To enable this, set height **H** to `0`. However, this should not be used with old and low-spec clients: such tall images are very large and take a lot of memory and a long time to process, especially for GIFs.
* Do not use client browser history-back, instead use **Bk** button in the app.
* You can re-capture page screenshot without reloading by using **St** (Stop). This is useful if page didn't render fully before screenshot is taken.
* You can also reload and re-capture current page with **Re** (Reload).
* To send keystrokes, fill **K** input box and press **Go**. There also are buttons for backspace, enter and arrow keys.
* Prefer PNG over GIF if your browser supports it. PNG is much faster, whereas GIF requires a lot of additional processing on both client and server to encode/decode. Jpeg encoding is also quite fast.
* GIF images are by default encoded with 216 colors, "web safe" palette. This uses an ultra fast but not very accurate color mapping algorithm. If you want better color representation switch to 256 color mode.
## UI explanation
The first unnamed input box is either search (google) or URL starting with http/https
**Go** instructs browser to navigate to the url or perform search
**Bk** is History Back
**St** is Stop, also re-capture screenshot without refreshing page, for example if page
render takes a long time or it changes periodically
**Re** is Reload
**W** is width in pixels, adjust it to get rid of horizontal scroll bar
**H** is height in pixels, adjust it to get rid of vertical scroll bar.
It can also be set to 0 to produce one very tall image and use
client scroll. This 0 size is experimental, buggy and should be
used with PNG and lots of memory on a client side.
**Z** is zoom or scale
**C** is colors, for GIF images only (unused in PNG, JPG)
**K** is keystroke input, you can type some letters in it and when you click Go it will be typed in the remote browser.
**Bs** is backspace
**Rt** is return / enter
**< ^ v >** are arrow keys, typically for navigating a map, buggy.
### UI Customization
WRP supports customizing its own UI using an HTML template file. Download [wrp.html](wrp.html), place it in the same directory as the wrp binary and customize it to your liking.
## Docker
docker hub:
```shell
docker run -d -p 8080:8080 tenox7/wrp
$ docker run -d -p 80:8080 tenox7/wrp
```
gcr.io:
## Google Cloud Run
```shell
docker run -d -p 8080:8080 gcr.io/tenox7/wrp:latest
$ gcloud run deploy --platform managed --image=gcr.io/tenox7/wrp:latest --memory=2Gi --args='-t=png','-g=1280x0x256'
```
Or from [Gcloud Console](https://console.cloud.google.com/run). Use `gcr.io/tenox7/wrp:latest` as container image URL.
Note that unfortunately GCR forces https. Your browser support of encryption protocols and certification authorities will vary.
## Azure Container Instances
```shell
$ az container create --resource-group wrp --name wrp --image gcr.io/tenox7/wrp:latest --cpu 1 --memory 2 --ports 80 --protocol tcp --os-type Linux --ip-address Public --command-line '/wrp -l :80 -t png -g 1280x0x256'
```
Or from the [Azure Console](https://portal.azure.com/#create/Microsoft.ContainerInstances). Use `gcr.io/tenox7/wrp:latest` or `tenox7/wrp:latest` for image name.
Fortunately ACI allows port 80 without encryption.
## Flags
```flags
-l listen address:port, default :8080
-t image type gif (default) or png, when using PNG number of colors is ignored
-g image geometry, WxHXC, height can be 0 for unlimited, default 1152x600x256"
-h headed mode, display browser window on the server
-d chromedp debug logging
-n do not free maps and gif images after use
```text
-l listen address:port (default :8080)
-t image type gif, png or jpg (default gif)
-g image geometry, WxHxC, height can be 0 for unlimited (default 1152x600x216)
C (number of colors) is only used for GIF
-q Jpeg image quality, default 80%
-h headless mode, hide browser window on the server (default true)
-d chromedp debug logging (default false)
-n do not free maps and images after use (default false)
-ui html template file (default "wrp.html")
-ua user agent, override the default "headless" agent
-s delay/sleep after page is rendered before screenshot is taken (default 2s)
```
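As a rough illustration of the `WxHxC` geometry format accepted by `-g`, the sketch below parses such a string with `fmt.Sscanf`. The `geometry` type and `parseGeometry` function are hypothetical names, and this is not necessarily how WRP itself parses the flag.

```go
package main

import (
	"fmt"
)

// geometry holds the three fields of a WxHxC string.
// Height 0 means "unlimited", and Colors only matters for GIF.
type geometry struct {
	Width, Height, Colors int
}

// parseGeometry parses strings such as "1152x600x216".
// It returns an error if fewer than three numbers are present.
func parseGeometry(s string) (geometry, error) {
	var g geometry
	n, err := fmt.Sscanf(s, "%dx%dx%d", &g.Width, &g.Height, &g.Colors)
	if err != nil {
		return geometry{}, fmt.Errorf("invalid geometry %q: %w", s, err)
	}
	if n != 3 {
		return geometry{}, fmt.Errorf("invalid geometry %q: got %d fields", s, n)
	}
	return g, nil
}

func main() {
	g, err := parseGeometry("1280x0x256")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", g)
}
```

The `%d` verb reads plain decimal digits, so a height of `0` followed by `x` is not mistaken for a hexadecimal prefix.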
## Minimal Requirements
* Server Gateway should run on a modern hardware/os that supports memory hungry Chrome.
* Client Browser needs to support `HTML FORMs` and `ISMAP`. Typically Mosaic 2.0 would be minimum version for forms. However ISMAP was supported since 0.6B, so if you manually enter url using `?url=...`, you can use the ealier version.
* Server/Gateway requires modern hardware and operating system that is supported by [Go language](https://github.com/golang/go/wiki/MinimumRequirements) and Chrome/Chromium Browser, which must be installed.
* Client Browser needs to support `HTML FORMs` and `ISMAP`. Typically [Mosaic 2.0](http://www.ncsa.illinois.edu/enabling/mosaic/versions) would be minimum version for forms. However ISMAP was supported since 0.6B, so if you manually enter url using `?url=...`, you can use the earlier version.
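For context on the `ISMAP` requirement: with a server-side image map, the browser appends the click position to the link URL as `?x,y`. The sketch below shows one way a server could read those coordinates. The function name `parseISMAP` and the example URL are hypothetical, and WRP's own handler may differ.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseISMAP extracts click coordinates from a raw query string of
// the form "x,y", as sent by browsers for <img ismap> links.
// ok is false when the query does not contain two integers.
func parseISMAP(rawQuery string) (x, y int, ok bool) {
	n, err := fmt.Sscanf(rawQuery, "%d,%d", &x, &y)
	return x, y, err == nil && n == 2
}

func main() {
	// Hypothetical URL as a legacy browser might request it after a click.
	u, err := url.Parse("http://wrp.example/map/page.map?120,45")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	x, y, ok := parseISMAP(u.RawQuery)
	fmt.Println(x, y, ok)
}
```

The server can then replay the click at that position in the remote browser and return a fresh screenshot.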
## Troubleshooting
### I can't get it to run
This program does not have a GUI and is run from the command line. After downloading, you may need to enable executable bit on Unix systems, for example:
```shell
$ cd ~/Downloads
$ chmod +x wrp-amd64-macos
$ ./wrp-amd64-macos
```
### Websites are blocking headless browsers
This is a well known issue. WRP has some provisions to work around it, but it's a cat and mouse game. The first and
foremost recommendation is to change `User Agent`, so that it doesn't say "headless". Add `-ua="my agent"` to override the default one.
Obtain your regular desktop browser user agent and specify it as the flag. For example
```shell
$ wrp -ua="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"
```
## History
* In 2014, version 1.0 started as a *cgi-bin* script, adaptation of `webkit2png.py` and `pcidade.py`, [blog post](https://virtuallyfun.com/2014/03/03/surfing-modern-web-with-ancient-browsers/).
* Later in 2014, version 2.0 became a stand alone http-proxy server, also support for both Linux/MacOS, [another post](https://virtuallyfun.com/wordpress/2014/03/11/web-rendering-proxy-update//).
* In 2016 the whole internet migrated to HTTPS/SSL/TLS and WRP largely stopped working. Python code became unmaintainable and mostly unportable (especially to Windows, even WSL).
* In 2019 WRP 3.0 has been rewritten in Golang/Chromedp as browser-in-browser instead of http proxy.
* Later in 2019, WRP 4.0 has been completely refactored to use mouse clicks instead using a href nodes. Also in 4.1 added sending keystrokes in to input boxes. You can now login to Gmail. Also now runs as a Docker container.
* Version 1.0 (2014) started as a *cgi-bin* script, adaptation of `webkit2png.py` and `pcidade.py`, [blog post](https://virtuallyfun.com/2014/03/03/surfing-modern-web-with-ancient-browsers/).
* Version 2.0 became a stand alone http-proxy server, supporting both Linux and MacOS, [another post](https://virtuallyfun.com/wordpress/2014/03/11/web-rendering-proxy-update//).
* In 2016 thanks to EFF/Certbot the whole internet migrated to HTTPS/SSL/TLS and WRP largely stopped working. Python code became unmaintainable and there was no easy way to make it work on Windows, even under WSL.
* Version 3.0 (2019) has been rewritten in [Go](https://golang.org/) using [Chromedp](https://github.com/chromedp) as browser-in-browser instead of http-proxy. The initial version was [less than 100 lines of code](https://gist.github.com/tenox7/b0f03c039b0a8b67f6c1bf47e2dd0df0).
* Version 4.0 has been completely refactored to use mouse clicks via imagemap instead of parsing a href nodes.
* Version 4.1 added sending keystrokes in to input boxes. You can now login to Gmail. Also now runs as a Docker container and on Cloud Run/Azure Containers.
* Version 4.5 introduces rendering whole pages in to a single tall image with client scrolling.
* Version 4.6 adds blazing fast gif encoding by [Hill Ma](https://github.com/mahiuchun).
## Credits
## Credits
* Uses [chromedp](https://github.com/chromedp), thanks to [mvdan](https://github.com/mvdan) for dealing with my issues
* Uses [go-quantize](https://github.com/ericpauley/go-quantize), thanks to [ericpauley](https://github.com/ericpauley) for developing the missing go quantizer
* Thanks to Jason Stevens of [Fun With Virtualization](https://virtuallyfun.com/) for graciously hosting my rumblings
* Thanks to [claunia](https://github.com/claunia/) for help with the Python/Webkit version in the past
* Thanks to [Hill Ma](https://github.com/mahiuchun) for ultra fast gif encoding algorithm
* Historical Python/Webkit versions and prior art can be seen in [wrp-old](https://github.com/tenox7/wrp-old) repo
## Legal Stuff
License: Apache 2.0
Copyright (c) 2013-2018 Antoni Sawicki
Copyright (c) 2019 Google LLC
Copyright (c) 2019-2024 Google LLC

20
go.mod Normal file

@ -0,0 +1,20 @@
module github.com/tenox7/wrp
go 1.21.5
require (
github.com/MaxHalford/halfgone v0.0.0-20171017091812-482157b86ccb
github.com/chromedp/cdproto v0.0.0-20231205062650-00455a960d61
github.com/chromedp/chromedp v0.9.3
github.com/soniakeys/quant v1.0.0
)
require (
github.com/chromedp/sysutil v1.0.0 // indirect
github.com/gobwas/httphead v0.1.0 // indirect
github.com/gobwas/pool v0.2.1 // indirect
github.com/gobwas/ws v1.3.1 // indirect
github.com/josharian/intern v1.0.0 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
golang.org/x/sys v0.15.0 // indirect
)

29
go.sum Normal file

@ -0,0 +1,29 @@
github.com/MaxHalford/halfgone v0.0.0-20171017091812-482157b86ccb h1:YQ+d0g0P0F/06oDoeEgDHeZCIrnKgLxXcqYOpe8sTuU=
github.com/MaxHalford/halfgone v0.0.0-20171017091812-482157b86ccb/go.mod h1:J86XzS1wgzJPjpQmpriJ+SetP17JSQUd9l+HWQK86jA=
github.com/chromedp/cdproto v0.0.0-20231011050154-1d073bb38998/go.mod h1:GKljq0VrfU4D5yc+2qA6OVr8pmO/MBbPEWqWQ/oqGEs=
github.com/chromedp/cdproto v0.0.0-20231205062650-00455a960d61 h1:XD280QPATe9jaz20dylKe3vBsNcH1w3mkssGY0lidn8=
github.com/chromedp/cdproto v0.0.0-20231205062650-00455a960d61/go.mod h1:GKljq0VrfU4D5yc+2qA6OVr8pmO/MBbPEWqWQ/oqGEs=
github.com/chromedp/chromedp v0.9.3 h1:Wq58e0dZOdHsxaj9Owmfcf+ibtpYN1N0FWVbaxa/esg=
github.com/chromedp/chromedp v0.9.3/go.mod h1:NipeUkUcuzIdFbBP8eNNvl9upcceOfWzoJn6cRe4ksA=
github.com/chromedp/sysutil v1.0.0 h1:+ZxhTpfpZlmchB58ih/LBHX52ky7w2VhQVKQMucy3Ic=
github.com/chromedp/sysutil v1.0.0/go.mod h1:kgWmDdq8fTzXYcKIBqIYvRRTnYb9aNS9moAV0xufSww=
github.com/gobwas/httphead v0.1.0 h1:exrUm0f4YX0L7EBwZHuCF4GDp8aJfVeBrlLQrs6NqWU=
github.com/gobwas/httphead v0.1.0/go.mod h1:O/RXo79gxV8G+RqlR/otEwx4Q36zl9rqC5u12GKvMCM=
github.com/gobwas/pool v0.2.1 h1:xfeeEhW7pwmX8nuLVlqbzVc7udMDrwetjEv+TZIz1og=
github.com/gobwas/pool v0.2.1/go.mod h1:q8bcK0KcYlCgd9e7WYLm9LpyS+YeLd8JVDW6WezmKEw=
github.com/gobwas/ws v1.3.0/go.mod h1:hRKAFb8wOxFROYNsT1bqfWnhX+b5MFeJM9r2ZSwg/KY=
github.com/gobwas/ws v1.3.1 h1:Qi34dfLMWJbiKaNbDVzM9x27nZBjmkaW6i4+Ku+pGVU=
github.com/gobwas/ws v1.3.1/go.mod h1:hRKAFb8wOxFROYNsT1bqfWnhX+b5MFeJM9r2ZSwg/KY=
github.com/josharian/intern v1.0.0 h1:vlS4z54oSdjm0bgjRigI+G1HpF+tI+9rE5LLzOg8HmY=
github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y=
github.com/ledongthuc/pdf v0.0.0-20220302134840-0c2507a12d80 h1:6Yzfa6GP0rIo/kULo2bwGEkFvCePZ3qHDDTC3/J9Swo=
github.com/ledongthuc/pdf v0.0.0-20220302134840-0c2507a12d80/go.mod h1:imJHygn/1yfhB7XSJJKlFZKl/J+dCPAknuiaGOshXAs=
github.com/mailru/easyjson v0.7.7 h1:UGYAvKxe3sBsEDzO8ZeWOSlIQfWFlxbzLZe7hwFURr0=
github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc=
github.com/orisano/pixelmatch v0.0.0-20220722002657-fb0b55479cde h1:x0TT0RDC7UhAVbbWWBzr41ElhJx5tXPWkIHA2HWPRuw=
github.com/orisano/pixelmatch v0.0.0-20220722002657-fb0b55479cde/go.mod h1:nZgzbfBr3hhjoZnS66nKrHmduYNpc34ny7RK4z5/HM0=
github.com/soniakeys/quant v1.0.0 h1:N1um9ktjbkZVcywBVAAYpZYSHxEfJGzshHCxx/DaI0Y=
github.com/soniakeys/quant v1.0.0/go.mod h1:HI1k023QuVbD4H8i9YdfZP2munIHU4QpjsImz6Y6zds=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.15.0 h1:h48lPFYpsTvQJZF4EKyI4aLHaev3CxivZmv7yZig9pc=
golang.org/x/sys v0.15.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=


@ -1,9 +0,0 @@
Historical versions of WRP and prior art
License: GNU
Copyright (c) 2013-2018 Antoni Sawicki
Copyright (c) 2012-2013 picidae.net
Copyright (c) 2004-2013 Paul Hammond
Copyright (c) 2017-2018 Natalia Portillo
Copyright (c) 2018 //gir.st/


@ -1,392 +0,0 @@
#!/usr/bin/env python
# picidae.py - makes screenshots of webpages
# and analyzes the webpage structure and writes image-maps of the links
# as well as forms that are placed on the exact position of the old form.
# It is a part of the art project www.picidae.net
# http://www.picidae.net
#
# This script is based on webkit2png from Paul Hammond.
# It was extended by picidae.net
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
__version__ = "1.0"
import sys
#print "hello ... "
try:
import Foundation
import WebKit
import AppKit
import objc
import urllib
except ImportError:
print "Cannot find pyobjc library files. Are you sure it is installed?"
sys.exit()
#try:
# from optparse import OptionParser
#except ImportError:
# print "OptionParser not imported"
# sys.exit()
from optparse import OptionParser
class AppDelegate (Foundation.NSObject):
# what happens when the app starts up
def applicationDidFinishLaunching_(self, aNotification):
webview = aNotification.object().windows()[0].contentView()
webview.frameLoadDelegate().getURL(webview)
class WebkitLoad (Foundation.NSObject, WebKit.protocols.WebFrameLoadDelegate):
# what happens if something goes wrong while loading
def webView_didFailLoadWithError_forFrame_(self,webview,error,frame):
print " ... something went wrong 1"
self.getURL(webview)
def webView_didFailProvisionalLoadWithError_forFrame_(self,webview,error,frame):
print " ... something went wrong 2"
self.getURL(webview)
def makeFilename(self,URL,options):
# make the filename
if options.filename:
filename = options.filename
elif options.md5:
try:
import md5
except ImportError:
print "--md5 requires python md5 library"
AppKit.NSApplication.sharedApplication().terminate_(None)
filename = md5.new(URL).hexdigest()
else:
import re
filename = re.sub('\W','',URL);
filename = re.sub('^http','',filename);
if options.datestamp:
import time
now = time.strftime("%Y%m%d")
filename = now + "-" + filename
import os
dir = os.path.abspath(os.path.expanduser(options.dir))
return os.path.join(dir,filename)
def saveImages(self,bitmapdata,filename,options):
# save the fullsize png
if options.fullsize:
bitmapdata.representationUsingType_properties_(AppKit.NSPNGFileType,None).writeToFile_atomically_(filename + ".png",objc.YES)
if options.thumb or options.clipped:
# work out how big the thumbnail is
width = bitmapdata.pixelsWide()
height = bitmapdata.pixelsHigh()
thumbWidth = (width * options.scale)
thumbHeight = (height * options.scale)
# make the thumbnails in a scratch image
scratch = AppKit.NSImage.alloc().initWithSize_(
Foundation.NSMakeSize(thumbWidth,thumbHeight))
scratch.lockFocus()
AppKit.NSGraphicsContext.currentContext().setImageInterpolation_(
AppKit.NSImageInterpolationHigh)
thumbRect = Foundation.NSMakeRect(0.0, 0.0, thumbWidth, thumbHeight)
clipRect = Foundation.NSMakeRect(0.0,
thumbHeight-options.clipheight,
options.clipwidth, options.clipheight)
bitmapdata.drawInRect_(thumbRect)
thumbOutput = AppKit.NSBitmapImageRep.alloc().initWithFocusedViewRect_(thumbRect)
clipOutput = AppKit.NSBitmapImageRep.alloc().initWithFocusedViewRect_(clipRect)
scratch.unlockFocus()
# save the thumbnails as pngs
if options.thumb:
thumbOutput.representationUsingType_properties_(
AppKit.NSPNGFileType,None
).writeToFile_atomically_(filename + "-thumb.png",objc.YES)
if options.clipped:
clipOutput.representationUsingType_properties_(
AppKit.NSPNGFileType,None
).writeToFile_atomically_(filename + "-clipped.png",objc.YES)
def getURL(self,webview):
if self.urls:
if self.urls[0] == '-':
url = sys.stdin.readline().rstrip()
if not url: AppKit.NSApplication.sharedApplication().terminate_(None)
else:
url = self.urls.pop(0)
else:
AppKit.NSApplication.sharedApplication().terminate_(None)
#print "<urlcall href=\"\" />", url, "..."
#print "<urlcall href=\"%s\" />" % (url)
self.resetWebview(webview)
webview.mainFrame().loadRequest_(Foundation.NSURLRequest.requestWithURL_(Foundation.NSURL.URLWithString_(url)))
if not webview.mainFrame().provisionalDataSource():
print "<nosuccess />"
self.getURL(webview)
def resetWebview(self,webview):
rect = Foundation.NSMakeRect(0,0,self.options.initWidth,self.options.initHeight)
webview.window().setContentSize_((self.options.initWidth,self.options.initHeight))
webview.setFrame_(rect)
def resizeWebview(self,view):
view.window().display()
view.window().setContentSize_(view.bounds().size)
view.setFrame_(view.bounds())
def captureView(self,view):
view.lockFocus()
bitmapdata = AppKit.NSBitmapImageRep.alloc()
bitmapdata.initWithFocusedViewRect_(view.bounds())
view.unlockFocus()
return bitmapdata
# what happens when the page has finished loading
def webView_didFinishLoadForFrame_(self,webview,frame):
# don't care about subframes
if (frame == webview.mainFrame()):
view = frame.frameView().documentView()
self.resizeWebview(view)
URL = frame.dataSource().initialRequest().URL().absoluteString()
filename = self.makeFilename(URL, self.options)
bitmapdata = self.captureView(view)
self.saveImages(bitmapdata,filename,self.options)
# ----------------------------------
# picidae my stuff
#print "url"
print "<page>"
print frame.dataSource().request().URL().absoluteString()
print "</page>"
# Analyse HTML and get links
xmloutput = "<map name=\"map\">\r";
domdocument = frame.DOMDocument()
domnodelist = domdocument.getElementsByTagName_('A')
i = 0
while i < domnodelist.length():
# linkvalue
value = domnodelist.item_(i).valueForKey_('href')
# position-rect
myrect = domnodelist.item_(i).boundingBox()
xmin = Foundation.NSMinX(myrect)
ymin = Foundation.NSMinY(myrect)
xmax = Foundation.NSMaxX(myrect)
ymax = Foundation.NSMaxY(myrect)
# print Link
prefix = ""
xmloutput += "<area shape=\"rect\" coords=\"%i,%i,%i,%i\" alt=\"\"><![CDATA[%s%s]]></area>\r" % (xmin, ymin, xmax, ymax, prefix, value)
i += 1
#print "</map>"
xmloutput += "</map>"
f = open(filename +'.xml', 'w+')
f.write(xmloutput)
f.close()
# ----------------------------------
# get forms
xmloutput = "<forms>\r";
xmloutput += "<page><![CDATA["
xmloutput += frame.dataSource().request().URL().absoluteString()
xmloutput += "]]></page>\r"
domdocument = frame.DOMDocument()
domnodelist = domdocument.getElementsByTagName_('form')
i = 0
while i < domnodelist.length():
# form
action = domnodelist.item_(i).valueForKey_('action')
method = domnodelist.item_(i).valueForKey_('method')
xmloutput += "<form method=\"%s\" ><action><![CDATA[%s]]></action>\r" % (method, action)
# form fields
fieldlist = domnodelist.item_(i).getElementsByTagName_('input')
j=0
while j < fieldlist.length():
# values
type = fieldlist.item_(j).valueForKey_('type')
name = fieldlist.item_(j).valueForKey_('name')
formvalue = fieldlist.item_(j).valueForKey_('value')
size = fieldlist.item_(j).valueForKey_('size')
checked = fieldlist.item_(j).valueForKey_('checked')
# write output
xmloutput += "\t<input "
if (type):
xmloutput += "type=\"%s\" " % (type)
if (name):
xmloutput += "name=\"%s\" " % (name)
if (size):
xmloutput += "size=\"%s\" " % (size)
if (type and type != "hidden"):
myrect = fieldlist.item_(j).boundingBox()
xmin = Foundation.NSMinX(myrect)
ymin = Foundation.NSMinY(myrect)
xmax = Foundation.NSMaxX(myrect)
ymax = Foundation.NSMaxY(myrect)
height = ymax - ymin
width = xmax - xmin
if (type == "radio" or type == "checkbox"):
xmin -= 3
ymin -= 3
xmloutput += "style=\"position:absolute;top:%i;left:%i;width:%i;height:%i;\" " % (ymin, xmin, width, height)
if (checked):
xmloutput += "checked=\"%s\" " % (checked)
xmloutput += "><![CDATA["
if (formvalue and type!="text" and type!="password"):
#xmloutput += urllib.quote(formvalue)
dummy=10
xmloutput += "]]></input>\r"
j += 1
xmloutput += "</form>\r"
i += 1
xmloutput += "</forms>"
f = open(filename +'.form.xml', 'w+')
f.write(xmloutput)
f.close()
# End picidae
# ----------------------------------
#print " ... done"
self.getURL(webview)
#trying to give back the real url
def main():
# parse the command line
usage = """%prog [options] [http://example.net/ ...]
examples:
%prog http://google.com/ # screengrab google
%prog -W 1000 -H 1000 http://google.com/ # bigger screengrab of google
%prog -T http://google.com/ # just the thumbnail screengrab
%prog -TF http://google.com/ # just thumbnail and fullsize grab
%prog -o foo http://google.com/ # save images as "foo-thumb.png" etc
%prog - # screengrab urls from stdin"""
cmdparser = OptionParser(usage, version=("webkit2png "+__version__))
# TODO: add quiet/verbose options
cmdparser.add_option("-W", "--width",type="float",default=800.0,
help="initial (and minimum) width of browser (default: 800)")
cmdparser.add_option("-H", "--height",type="float",default=600.0,
help="initial (and minimum) height of browser (default: 600)")
cmdparser.add_option("--clipwidth",type="float",default=200.0,
help="width of clipped thumbnail (default: 200)",
metavar="WIDTH")
cmdparser.add_option("--clipheight",type="float",default=150.0,
help="height of clipped thumbnail (default: 150)",
metavar="HEIGHT")
cmdparser.add_option("-s", "--scale",type="float",default=0.25,
help="scale factor for thumbnails (default: 0.25)")
cmdparser.add_option("-m", "--md5", action="store_true",
help="use md5 hash for filename (like del.icio.us)")
cmdparser.add_option("-o", "--filename", type="string",default="",
metavar="NAME", help="save images as NAME.png,NAME-thumb.png etc")
cmdparser.add_option("-F", "--fullsize", action="store_true",
help="only create fullsize screenshot")
cmdparser.add_option("-T", "--thumb", action="store_true",
help="only create thumbnail screenshot")
cmdparser.add_option("-C", "--clipped", action="store_true",
help="only create clipped thumbnail screenshot")
cmdparser.add_option("-d", "--datestamp", action="store_true",
help="include date in filename")
cmdparser.add_option("-D", "--dir",type="string",default="./",
help="directory to place images into")
(options, args) = cmdparser.parse_args()
if len(args) == 0:
cmdparser.print_help()
return
if options.filename:
if len(args) != 1 or args[0] == "-":
print "--filename option requires exactly one url"
return
if options.scale == 0:
cmdparser.error("scale cannot be zero")
# make sure we're outputting something
if not (options.fullsize or options.thumb or options.clipped):
options.fullsize = True
options.thumb = True
options.clipped = True
# work out the initial size of the browser window
# (this might need to be larger so clipped image is right size)
options.initWidth = (options.clipwidth / options.scale)
options.initHeight = (options.clipheight / options.scale)
if options.width>options.initWidth:
options.initWidth = options.width
if options.height>options.initHeight:
options.initHeight = options.height
app = AppKit.NSApplication.sharedApplication()
# create an app delegate
delegate = AppDelegate.alloc().init()
AppKit.NSApp().setDelegate_(delegate)
# create a window
rect = Foundation.NSMakeRect(-16000,-16000,100,100)
win = AppKit.NSWindow.alloc()
win.initWithContentRect_styleMask_backing_defer_ (rect,
AppKit.NSBorderlessWindowMask, 2, 0)
# create a webview object
webview = WebKit.WebView.alloc()
webview.initWithFrame_(rect)
# turn off scrolling so the content is actually x wide and not x-15
webview.mainFrame().frameView().setAllowsScrolling_(objc.NO)
# add the webview to the window
win.setContentView_(webview)
# create a LoadDelegate
loaddelegate = WebkitLoad.alloc().init()
loaddelegate.options = options
loaddelegate.urls = args
webview.setFrameLoadDelegate_(loaddelegate)
app.run()
if __name__ == '__main__' : main()
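
The link-extraction loop in the script above writes one `<area>` element per anchor, with the anchor's bounding box as integer coordinates and the href wrapped in CDATA. A minimal standalone sketch of that string-building step follows. The function name `build_link_map` and the plain-tuple input are assumptions for illustration; the original reads the rectangles from WebKit DOM nodes.

```python
def build_link_map(links):
    """Build the <map> XML for a list of links.

    links: iterable of (href, (xmin, ymin, xmax, ymax)) tuples.
    This is a hypothetical helper; the script above builds the same
    string inline inside webView_didFinishLoadForFrame_.
    """
    parts = ['<map name="map">\r']
    for href, (xmin, ymin, xmax, ymax) in links:
        # %i truncates float coordinates to integers, as in the original
        parts.append(
            '<area shape="rect" coords="%i,%i,%i,%i" alt="">'
            '<![CDATA[%s]]></area>\r' % (xmin, ymin, xmax, ymax, href)
        )
    parts.append('</map>')
    return ''.join(parts)


if __name__ == '__main__':
    print(build_link_map([("http://example.net/", (1.0, 2.5, 30.0, 40.9))]))
```

Like the original, this sketch does not guard against an href that itself contains `]]>`, which would end the CDATA section early.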


@ -1,506 +0,0 @@
#!/usr/bin/python
# webkit2png - makes screenshots of web pages
# http://www.paulhammond.org/webkit2png
__version__ = "dev"
# Copyright (c) 2004-2013 Paul Hammond
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
import sys
import optparse
import re
import os
try:
import Foundation
import WebKit
import AppKit
import Quartz
import objc
except ImportError:
print "Cannot find pyobjc library files. Are you sure it is installed?"
sys.exit()
class AppDelegate(Foundation.NSObject):
# what happens when the app starts up
def applicationDidFinishLaunching_(self, aNotification):
webview = aNotification.object().windows()[0].contentView()
webview.frameLoadDelegate().getURL(webview)
self.performSelector_withObject_afterDelay_("timeout:", None,
self.timeout)
def timeout_(self, obj):
Foundation.NSLog("timed out!")
AppKit.NSApplication.sharedApplication().terminate_(None)
class Webkit2PngScriptBridge(Foundation.NSObject):
def init(self):
self = super(Webkit2PngScriptBridge, self).init()
self.is_stopped = False
self.start_callback = False
return self
def stop(self):
self.is_stopped = True
def start(self):
self.is_stopped = False
self.start_callback()
def isSelectorExcludedFromWebScript_(self, sel):
if sel in ['stop', 'start']:
return False
else:
return True
class WebkitLoad (Foundation.NSObject, WebKit.protocols.WebFrameLoadDelegate):
# what happens if something goes wrong while loading
def webView_didFailLoadWithError_forFrame_(self, webview, error, frame):
if error.code() == Foundation.NSURLErrorCancelled:
return
print " ... something went wrong: "+error.localizedDescription()
self.getURL(webview)
def webView_didFailProvisionalLoadWithError_forFrame_(self, webview, error,
frame):
if error.code() == Foundation.NSURLErrorCancelled:
return
print " ... something went wrong: "+error.localizedDescription()
self.getURL(webview)
def makeFilename(self, URL, options):
# make the filename
if options.filename:
filename = options.filename
elif options.md5:
try:
import md5
except ImportError:
print "--md5 requires python md5 library"
AppKit.NSApplication.sharedApplication().terminate_(None)
filename = md5.new(URL).hexdigest()
else:
filename = re.sub('^https?', '', URL)
filename = re.sub('\W', '', filename)
if options.datestamp:
import time
now = time.strftime("%Y%m%d")
filename = now + "-" + filename
dir = os.path.abspath(os.path.expanduser(options.dir))
if not os.path.exists(dir):
os.makedirs(dir)
return os.path.join(dir, filename)
def saveImages(self, bitmapdata, filename, options):
# save the fullsize png
if options.fullsize:
bitmapdata.representationUsingType_properties_(
AppKit.NSPNGFileType,
None
).writeToFile_atomically_(filename + "-full.png", objc.YES)
if options.thumb or options.clipped:
# work out how big the thumbnail is
width = bitmapdata.pixelsWide()
height = bitmapdata.pixelsHigh()
thumbWidth = (width * options.scale)
thumbHeight = (height * options.scale)
# make the thumbnails in a scratch image
scratch = AppKit.NSImage.alloc().initWithSize_(
Foundation.NSMakeSize(thumbWidth, thumbHeight))
scratch.lockFocus()
AppKit.NSGraphicsContext.currentContext().setImageInterpolation_(
AppKit.NSImageInterpolationHigh)
thumbRect = Foundation.NSMakeRect(0.0, 0.0, thumbWidth,
thumbHeight)
clipRect = Foundation.NSMakeRect(
0.0,
thumbHeight-options.clipheight,
options.clipwidth,
options.clipheight)
bitmapdata.drawInRect_(thumbRect)
thumbOutput = AppKit.NSBitmapImageRep.alloc()\
.initWithFocusedViewRect_(thumbRect)
clipOutput = AppKit.NSBitmapImageRep.alloc()\
.initWithFocusedViewRect_(clipRect)
scratch.unlockFocus()
# save the thumbnails as pngs
if options.thumb:
thumbOutput.representationUsingType_properties_(
AppKit.NSPNGFileType, None).writeToFile_atomically_(
filename + "-thumb.png", objc.YES)
if options.clipped:
clipOutput.representationUsingType_properties_(
AppKit.NSPNGFileType, None).writeToFile_atomically_(
filename + "-clipped.png", objc.YES)
def getURL(self, webview):
if self.urls:
if self.urls[0] == '-':
url = sys.stdin.readline().rstrip()
if not url:
AppKit.NSApplication.sharedApplication().terminate_(None)
else:
url = self.urls.pop(0)
else:
AppKit.NSApplication.sharedApplication().terminate_(None)
nsurl = Foundation.NSURL.URLWithString_(url)
if not (nsurl and nsurl.scheme()):
nsurl = Foundation.NSURL.alloc().initFileURLWithPath_(url)
nsurl = nsurl.absoluteURL()
if self.options.ignore_ssl_check:
Foundation.NSURLRequest.setAllowsAnyHTTPSCertificate_forHost_(objc.YES, nsurl.host())
print "Fetching", nsurl, "..."
self.resetWebview(webview)
scriptobject = webview.windowScriptObject()
scriptobject.setValue_forKey_(Webkit2PngScriptBridge.alloc().init(),
'webkit2png')
webview.mainFrame().loadRequest_(Foundation.NSURLRequest.requestWithURL_(nsurl))
if not webview.mainFrame().provisionalDataSource():
print " ... not a proper url?"
self.getURL(webview)
def resetWebview(self, webview):
rect = Foundation.NSMakeRect(0, 0, self.options.initWidth,
self.options.initHeight)
window = webview.window()
window.setContentSize_((self.options.initWidth,
self.options.initHeight))
if self.options.transparent:
window.setOpaque_(objc.NO)
window.setBackgroundColor_(AppKit.NSColor.clearColor())
webview.setDrawsBackground_(objc.NO)
webview.setFrame_(rect)
def captureView(self, view):
bounds = view.bounds()
if bounds.size.height > self.options.UNSAFE_max_height:
print >> sys.stderr, "Error: page height greater than %s, " \
"clipping to avoid crashing windowserver." % \
self.options.UNSAFE_max_height
bounds.size.height = self.options.UNSAFE_max_height
if bounds.size.width > self.options.UNSAFE_max_width:
print >> sys.stderr, "Error: page width greater than %s, " \
"clipping to avoid crashing windowserver." % \
self.options.UNSAFE_max_width
bounds.size.width = self.options.UNSAFE_max_width
view.window().display()
view.window().setContentSize_(
Foundation.NSSize(self.options.initWidth, self.options.initHeight))
view.setFrame_(bounds)
if hasattr(view, "bitmapImageRepForCachingDisplayInRect_"):
bitmapdata = view.bitmapImageRepForCachingDisplayInRect_(bounds)
view.cacheDisplayInRect_toBitmapImageRep_(bounds, bitmapdata)
else:
view.lockFocus()
bitmapdata = AppKit.NSBitmapImageRep.alloc()
bitmapdata.initWithFocusedViewRect_(bounds)
view.unlockFocus()
return bitmapdata
# what happens when the page has finished loading
def webView_didFinishLoadForFrame_(self, webview, frame):
# don't care about subframes
if (frame == webview.mainFrame()):
scriptobject = webview.windowScriptObject()
if self.options.js:
scriptobject.evaluateWebScript_(self.options.js)
bridge = scriptobject.valueForKey_('webkit2png')
def doGrab():
Foundation.NSTimer.\
scheduledTimerWithTimeInterval_target_selector_userInfo_repeats_(
self.options.delay, self, self.doGrab, webview, False)
if bridge.is_stopped:
bridge.start_callback = doGrab
else:
doGrab()
def doGrab(self, timer):
webview = timer.userInfo()
frame = webview.mainFrame()
view = frame.frameView().documentView()
URL = webview.mainFrame().dataSource().initialRequest().URL()\
.absoluteString()
filename = self.makeFilename(URL, self.options)
bitmapdata = self.captureView(view)
if self.options.selector:
doc = frame.DOMDocument()
el = doc.querySelector_(self.options.selector)
if not el:
print " ... no element matching %s found?" % \
self.options.selector
self.getURL(webview)
return
left, top = 0, 0
parent = el
while parent:
left += parent.offsetLeft()
top += parent.offsetTop()
parent = parent.offsetParent()
zoom = self.options.zoom
cropRect = view.window().convertRectToBacking_(Foundation.NSMakeRect(
zoom * left, zoom * top,
zoom * el.offsetWidth(), zoom * el.offsetHeight()))
cropped = Quartz.CGImageCreateWithImageInRect(
bitmapdata.CGImage(), cropRect)
bitmapdata = AppKit.NSBitmapImageRep.alloc().initWithCGImage_(
cropped)
Quartz.CGImageRelease(cropped)
self.saveImages(bitmapdata, filename, self.options)
print " ... done"
self.getURL(webview)
def main():
# parse the command line
usage = """%prog [options] [http://example.net/ ...]
Examples:
%prog http://google.com/ # screengrab google
%prog -W 1000 -H 1000 http://google.com/ # bigger screengrab of google
%prog -T http://google.com/ # just the thumbnail screengrab
%prog -TF http://google.com/ # just thumbnail and fullsize grab
%prog -o foo http://google.com/ # save images as "foo-thumb.png" etc
%prog - # screengrab urls from stdin
%prog /path/to/file.html # screengrab local html file
%prog -h | less # full documentation"""
cmdparser = optparse.OptionParser(usage,
version=("webkit2png " + __version__))
# TODO: add quiet/verbose options
cmdparser.add_option("--debug", action="store_true",
help=optparse.SUPPRESS_HELP)
# warning: setting these too high can crash your window server
cmdparser.add_option("--UNSAFE-max-height", type="int", default=30000,
help=optparse.SUPPRESS_HELP)
cmdparser.add_option("--UNSAFE-max-width", type="int", default=30000,
help=optparse.SUPPRESS_HELP)
group = optparse.OptionGroup(cmdparser, "Network Options")
group.add_option("--timeout", type="float", default=60.0,
help="page load timeout (default: 60)")
group.add_option("--user-agent", type="string", default=False,
help="set user agent header")
group.add_option("--ignore-ssl-check", action="store_true", default=False,
help="ignore SSL Certificate name mismatches")
cmdparser.add_option_group(group)
group = optparse.OptionGroup(cmdparser, "Browser Window Options")
group.add_option(
"-W", "--width", type="float", default=800.0,
help="initial (and minimum) width of browser (default: 800)")
group.add_option(
"-H", "--height", type="float", default=600.0,
help="initial (and minimum) height of browser (default: 600)")
group.add_option(
"-z", "--zoom", type="float", default=1.0,
help='zoom level of browser, equivalent to "Zoom In" and "Zoom Out" '
'in "View" menu (default: 1.0)')
group.add_option(
"--selector", type="string",
help="CSS selector for a single element to capture (first matching "
"element will be used)")
cmdparser.add_option_group(group)
group = optparse.OptionGroup(cmdparser, "Output size options")
group.add_option(
"-F", "--fullsize", action="store_true",
help="only create fullsize screenshot")
group.add_option(
"-T", "--thumb", action="store_true",
help="only create thumbnail screenshot")
group.add_option(
"-C", "--clipped", action="store_true",
help="only create clipped thumbnail screenshot")
group.add_option(
"--clipwidth", type="float", default=200.0,
help="width of clipped thumbnail (default: 200)",
metavar="WIDTH")
group.add_option(
"--clipheight", type="float", default=150.0,
help="height of clipped thumbnail (default: 150)",
metavar="HEIGHT")
group.add_option(
"-s", "--scale", type="float", default=0.25,
help="scale factor for thumbnails (default: 0.25)")
cmdparser.add_option_group(group)
group = optparse.OptionGroup(cmdparser, "Output filename options")
group.add_option(
"-D", "--dir", type="string", default="./",
help="directory to place images into")
group.add_option(
"-o", "--filename", type="string", default="",
metavar="NAME", help="save images as NAME-full.png,NAME-thumb.png etc")
group.add_option(
"-m", "--md5", action="store_true",
help="use md5 hash for filename (like del.icio.us)")
group.add_option(
"-d", "--datestamp", action="store_true",
help="include date in filename")
cmdparser.add_option_group(group)
group = optparse.OptionGroup(cmdparser, "Web page functionality")
group.add_option(
"--delay", type="float", default=0,
help="delay between page load finishing and screenshot")
group.add_option(
"--js", type="string", default=None,
help="JavaScript to execute when the window finishes loading "
"(example: --js='document.bgColor=\"red\";'). "
"If you need to wait for asynchronous code to finish before "
"capturing the screenshot, call webkit2png.stop() before the "
"async code runs, then webkit2png.start() to capture the image.")
group.add_option(
"--noimages", action="store_true",
help=optparse.SUPPRESS_HELP)
group.add_option(
"--no-images", action="store_true",
help="don't load images")
group.add_option(
"--nojs", action="store_true",
help=optparse.SUPPRESS_HELP)
group.add_option(
"--no-js", action="store_true",
help="disable JavaScript support")
group.add_option(
"--transparent", action="store_true",
help="render output on a transparent background (requires a web "
"page with a transparent background)", default=False)
cmdparser.add_option_group(group)
(options, args) = cmdparser.parse_args()
if len(args) == 0:
cmdparser.print_usage()
return
if options.filename:
if len(args) != 1 or args[0] == "-":
print "--filename option requires exactly one url"
return
# deprecated options
if options.nojs:
print >> sys.stderr, 'Warning: --nojs will be removed in ' \
'webkit2png 1.0. Please use --no-js.'
options.no_js = True
if options.noimages:
print >> sys.stderr, 'Warning: --noimages will be removed in ' \
'webkit2png 1.0. Please use --no-images.'
options.no_images = True
if options.scale == 0:
cmdparser.error("scale cannot be zero")
# make sure we're outputting something
if not (options.fullsize or options.thumb or options.clipped):
options.fullsize = True
options.thumb = True
options.clipped = True
# work out the initial size of the browser window
# (this might need to be larger so clipped image is right size)
options.initWidth = (options.clipwidth / options.scale)
options.initHeight = (options.clipheight / options.scale)
options.width *= options.zoom
if options.width > options.initWidth:
options.initWidth = options.width
if options.height > options.initHeight:
options.initHeight = options.height
# Hide the dock icon (needs to run before NSApplication.sharedApplication)
AppKit.NSBundle.mainBundle().infoDictionary()['LSBackgroundOnly'] = '1'
app = AppKit.NSApplication.sharedApplication()
# create an app delegate
delegate = AppDelegate.alloc().init()
delegate.timeout = options.timeout
AppKit.NSApp().setDelegate_(delegate)
# create a window
rect = Foundation.NSMakeRect(0, 0, 100, 100)
win = AppKit.NSWindow.alloc()
win.initWithContentRect_styleMask_backing_defer_(
rect, AppKit.NSBorderlessWindowMask, 2, 0)
if options.debug:
win.orderFrontRegardless()
# create a webview object
webview = WebKit.WebView.alloc()
webview.initWithFrame_(rect)
# turn off scrolling so the content is actually x wide and not x-15
webview.mainFrame().frameView().setAllowsScrolling_(objc.NO)
if options.user_agent:
webview.setCustomUserAgent_(options.user_agent)
else:
webkit_version = Foundation.NSBundle.bundleForClass_(WebKit.WebView)\
.objectForInfoDictionaryKey_(WebKit.kCFBundleVersionKey)[1:]
webview.setApplicationNameForUserAgent_(
"Like-Version/6.0 Safari/%s webkit2png/%s" % (webkit_version, __version__))
webview.setPreferencesIdentifier_('webkit2png')
webview.preferences().setLoadsImagesAutomatically_(not options.no_images)
webview.preferences().setJavaScriptEnabled_(not options.no_js)
if options.zoom != 1.0:
webview._setZoomMultiplier_isTextOnly_(options.zoom, False)
# add the webview to the window
win.setContentView_(webview)
# create a LoadDelegate
loaddelegate = WebkitLoad.alloc().init()
loaddelegate.options = options
loaddelegate.urls = args
webview.setFrameLoadDelegate_(loaddelegate)
app.run()
if __name__ == '__main__':
main()
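
The window-size logic near the end of `main()` above can be read as: start from the size the clipped thumbnail needs at full scale, then grow to the requested width and height if those are larger. A minimal sketch of that calculation, assuming a hypothetical helper `initial_size` that is not part of webkit2png itself:

```python
def initial_size(width, height, clipwidth, clipheight, scale, zoom=1.0):
    """Return (init_width, init_height) for the browser window.

    Hypothetical helper mirroring the arithmetic in main() above.
    """
    if scale == 0:
        raise ValueError("scale cannot be zero")
    # size needed so the clipped thumbnail comes out at clipwidth x clipheight
    init_width = clipwidth / scale
    init_height = clipheight / scale
    # the script above scales only the width by the zoom factor
    width = width * zoom
    return max(init_width, width), max(init_height, height)


if __name__ == '__main__':
    print(initial_size(800.0, 600.0, 200.0, 150.0, 0.25))
```

With the default options (800x600 window, 200x150 clip, scale 0.25) both candidates coincide, so the window stays at 800x600. A smaller scale factor enlarges the window so the clipped thumbnail still covers the requested area.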


@ -1,212 +0,0 @@
#!/usr/bin/env python
# webrender.py - recursively render web pages to a gif+imagemap of clickable links
# caveat: this script must be run as a regular user and cannot run as a daemon
# from Apache cgi-bin; you can use Python's built-in HTTP server instead
# usage:
# create cgi-bin directory, copy webrender.py to cgi-bin and chmod 755
# python -m CGIHTTPServer 8000
# navigate web browser to http://x.x.x.x:8000/cgi-bin/webrender.py
# the webrender-xxx.gif images are created in the CWD of the http server
__version__ = "1.0"
#
# This program is based on the software picidae.py 1.0 from http://www.picidae.net
# It was modified by Antoni Sawicki
#
# This program is based on the software webkit2png 0.4 from Paul Hammond.
# It was extended by picidae.net
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
try:
import sys
import os
import glob
import random
import Foundation
import WebKit
import AppKit
import objc
import string
import urllib
import socket
import cgi
import cgitb; cgitb.enable() # for troubleshooting
except ImportError:
print "Cannot find pyobjc library files. Are you sure it is installed?"
sys.exit()
from optparse import OptionParser
class AppDelegate (Foundation.NSObject):
# what happens when the app starts up
def applicationDidFinishLaunching_(self, aNotification):
webview = aNotification.object().windows()[0].contentView()
webview.frameLoadDelegate().getURL(webview)
class WebkitLoad (Foundation.NSObject, WebKit.protocols.WebFrameLoadDelegate):
# what happens if something goes wrong while loading
def webView_didFailLoadWithError_forFrame_(self,webview,error,frame):
print " ... something went wrong 1: " + error.localizedDescription()
self.getURL(webview)
def webView_didFailProvisionalLoadWithError_forFrame_(self,webview,error,frame):
print " ... something went wrong 2: " + error.localizedDescription()
self.getURL(webview)
def getURL(self,webview):
if self.urls:
if self.urls[0] == '-':
url = sys.stdin.readline().rstrip()
if not url: AppKit.NSApplication.sharedApplication().terminate_(None)
else:
url = self.urls.pop(0)
else:
AppKit.NSApplication.sharedApplication().terminate_(None)
self.resetWebview(webview)
webview.mainFrame().loadRequest_(Foundation.NSURLRequest.requestWithURL_(Foundation.NSURL.URLWithString_(url)))
if not webview.mainFrame().provisionalDataSource():
print "<nosuccess />"
self.getURL(webview)
def resetWebview(self,webview):
rect = Foundation.NSMakeRect(0,0,1024,768)
webview.window().setContentSize_((1024,768))
webview.setFrame_(rect)
def resizeWebview(self,view):
view.window().display()
view.window().setContentSize_(view.bounds().size)
view.setFrame_(view.bounds())
def captureView(self,view):
view.lockFocus()
bitmapdata = AppKit.NSBitmapImageRep.alloc()
bitmapdata.initWithFocusedViewRect_(view.bounds())
view.unlockFocus()
return bitmapdata
# what happens when the page has finished loading
def webView_didFinishLoadForFrame_(self,webview,frame):
# don't care about subframes
if (frame == webview.mainFrame()):
view = frame.frameView().documentView()
self.resizeWebview(view)
URL = frame.dataSource().initialRequest().URL().absoluteString()
for fl in glob.glob("webrender-*.gif"):
os.remove(fl)
GIF = "webrender-%s.gif" % (random.randrange(0,1000))
bitmapdata = self.captureView(view)
bitmapdata.representationUsingType_properties_(AppKit.NSGIFFileType,None).writeToFile_atomically_(GIF,objc.YES)
myurl = "http://%s:%s%s" % (socket.gethostbyname(socket.gethostname()), os.getenv("SERVER_PORT"), os.getenv("SCRIPT_NAME"))
print "Content-type: text/html\r\n\r\n"
print "<!-- webrender.py by Antoni Sawicki -->"
print "<html><head><title>Webrender - %s</title></head><body><table border=\"0\"><tr>" % (URL)
print "<td><form action=\"%s\">" % (myurl)
print "<input type=\"text\" name=\"url\" value=\"%s\" size=\"80\">" % (URL)
print "<input type=\"submit\" value=\"go\">"
print "</form></td><td>"
print "<form action=\"%s\">" % (myurl)
print "<input type=\"text\" name=\"search\" value=\"\" size=\"20\">"
print "<input type=\"submit\" value=\"search\">"
print "</form></td></tr></table>"
print "<img src=\"../%s\" alt=\"webrender\" usemap=\"#map\" border=\"0\">" % (GIF)
# Analyse HTML and get links
print "<map name=\"map\">";
domdocument = frame.DOMDocument()
domnodelist = domdocument.getElementsByTagName_('A')
i = 0
while i < domnodelist.length():
# linkvalue
value = domnodelist.item_(i).valueForKey_('href')
# position-rect
myrect = domnodelist.item_(i).boundingBox()
xmin = Foundation.NSMinX(myrect)
ymin = Foundation.NSMinY(myrect)
xmax = Foundation.NSMaxX(myrect)
ymax = Foundation.NSMaxY(myrect)
# print Link
escval = string.replace( string.replace(value, "?", "TNXQUE"), "&", "TNXAMP" )
print "<area shape=\"rect\" coords=\"%i,%i,%i,%i\" alt=\"\" href=\"%s?url=%s\"></area>" % (xmin, ymin, xmax, ymax, myurl, escval)
i += 1
print "</map>"
print "</body></html>"
self.getURL(webview)
def main():
# obtain url from cgi input
form = cgi.FieldStorage()
rawurl = form.getfirst("url", "http://www.google.com")
rawsearch = form.getfirst("search")
if rawsearch:
url = "http://www.google.com/search?q=%s" % (urllib.quote_plus(rawsearch))
else:
url = string.replace( string.replace(rawurl, "TNXAMP", "&"), "TNXQUE", "?")
AppKit.NSApplicationLoad();
app = AppKit.NSApplication.sharedApplication()
# create an app delegate
delegate = AppDelegate.alloc().init()
AppKit.NSApp().setDelegate_(delegate)
# create a window
rect = Foundation.NSMakeRect(-16000,-16000,100,100)
win = AppKit.NSWindow.alloc()
win.initWithContentRect_styleMask_backing_defer_ (rect, AppKit.NSBorderlessWindowMask, 2, 0)
# create a webview object
webview = WebKit.WebView.alloc()
webview.initWithFrame_(rect)
# turn off scrolling so the content is actually x wide and not x-15
webview.mainFrame().frameView().setAllowsScrolling_(objc.NO)
# add the webview to the window
win.setContentView_(webview)
# create a LoadDelegate
loaddelegate = WebkitLoad.alloc().init()
loaddelegate.options = [""]
loaddelegate.urls = [url]
webview.setFrameLoadDelegate_(loaddelegate)
app.run()
if __name__ == '__main__' : main()
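
webrender.py passes each link back to itself as a `url=` query parameter, so it replaces `?` and `&` in the target href with the placeholder tokens `TNXQUE` and `TNXAMP` and reverses the substitution in `main()`. A minimal sketch of that round trip, with hypothetical function names `escape_link` and `unescape_link` (the script does both steps inline with `string.replace`):

```python
def escape_link(href):
    """Replace '?' and '&' with the placeholder tokens used by webrender.py."""
    return href.replace("?", "TNXQUE").replace("&", "TNXAMP")


def unescape_link(token):
    """Reverse escape_link, in the same order as main() above."""
    return token.replace("TNXAMP", "&").replace("TNXQUE", "?")


if __name__ == '__main__':
    escaped = escape_link("http://example.net/s?q=a&b=c")
    print(escaped)
    print(unescape_link(escaped))
```

The scheme is only lossless for URLs that do not already contain the literal tokens. Standard percent-encoding would avoid that limitation, but it is not what the script does.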


@ -1,931 +0,0 @@
#!/usr/bin/env python2.7
# wrp.py - Web Rendering Proxy - https://github.com/tenox7/wrp
# An HTTP proxy service that renders the requested URL into an image associated
# with an imagemap of clickable links. This is an adaptation of previous works by
# picidae.net and Paul Hammond.
__version__ = "2.0"
#
# This program is based on the software picidae.py from picidae.net
# It was modified by Antoni Sawicki and Natalia Portillo
#
# This program is based on the software webkit2png from Paul Hammond.
# It was extended by picidae.net
#
# Copyright (c) 2013-2018 Antoni Sawicki
# Copyright (c) 2012-2013 picidae.net
# Copyright (c) 2004-2013 Paul Hammond
# Copyright (c) 2017-2018 Natalia Portillo
# Copyright (c) 2018 //gir.st/
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
# Configuration options:
PORT = 8080
WIDTH = 1024
HEIGHT = 768
ISMAP = False # ISMAP=True is Server side for Mosaic 1.1 and up. HTML 3.2 supports Client side maps (ISMAP=False)
WAIT = 1 # sleep for 1 second to allow javascript renders
QUALITY = 75 # For JPEG: image quality 0-100; For PNG: sets compression level (leftmost digit 0 fastest, 9 best)
AUTOWIDTH = True # Check for browser width using javascript
FORMAT = "AUTO" # AUTO = GIF for mac OS, JPG for rest; PNG, GIF, JPG as supported values.
SSLSTRIP = True # enable to automatically downgrade secure requests
# PythonMagick configuration options
MK_MONOCHROME = False # Convert the render to a black and white dithered image
MK_GRAYSCALE = False # Convert the render to a grayscale dithered image
MK_COLORS = 0 # Reduce number of colors in the image. 0 for not reducing. Less than 256 works in grayscale also.
MK_DITHER = False # Dither the image to reduce size. GIFs will always be dithered. Ignored if MK_COLORS is not set.
import re
import random
import os
import time
import string
import urllib
import socket
import SocketServer
import SimpleHTTPServer
import threading
import Queue
import sys
import logging
import StringIO
import subprocess
try:
import PythonMagick
HasMagick = True
except ImportError:
HasMagick = False
# Request queue (URLs go in here)
REQ = Queue.Queue()
# Response queue (dummy response objects)
RESP = Queue.Queue()
# Renders dictionary
RENDERS = {}
#######################
### Linux CODEPATH ###
#######################
if sys.platform.startswith('linux') or sys.platform.startswith('freebsd'):
try:
from PyQt5.QtCore import *
from PyQt5.QtGui import *
from PyQt5.QtWebKit import *
from PyQt5.QtWebKitWidgets import *
from PyQt5.QtNetwork import *
from PyQt5.QtWidgets import *
IsPyQt5 = True
except ImportError:
from PyQt4.QtCore import *
from PyQt4.QtGui import *
from PyQt4.QtWebKit import *
from PyQt4.QtNetwork import *
IsPyQt5 = False
# claunia: Check how to use this in macOS
logging.basicConfig(filename='/dev/stdout', level=logging.WARN, )
logger = logging.getLogger('wrp')
# Class for Website-Rendering. Uses QWebPage, which
# requires a running QtGui to work.
class WebkitRenderer(QObject):
def __init__(self, **kwargs):
"""Sets default values for the properties."""
if not QApplication.instance():
raise RuntimeError(self.__class__.__name__ + \
" requires a running QApplication instance")
QObject.__init__(self)
# Initialize default properties
self.width = kwargs.get('width', 0)
self.height = kwargs.get('height', 0)
self.timeout = kwargs.get('timeout', 0)
self.wait = kwargs.get('wait', 0)
self.logger = kwargs.get('logger', None)
# Set this to true if you want to capture flash.
# Note that your desktop must be large enough for
# fitting the whole window.
self.grabWholeWindow = kwargs.get('grabWholeWindow', False)
# Set some default options for QWebPage
self.qWebSettings = {
QWebSettings.JavascriptEnabled : True,
QWebSettings.PluginsEnabled : True,
QWebSettings.PrivateBrowsingEnabled : True,
QWebSettings.JavascriptCanOpenWindows : False
}
def render(self, url):
"""Renders the given URL into a QImage object"""
# We have to use this helper object because
# QApplication.processEvents may be called, causing
# this method to get called while it has not returned yet.
helper = _WebkitRendererHelper(self)
helper._window.resize(self.width, self.height)
image = helper.render(url)
# Bind helper instance to this image to prevent the
# object from being cleaned up (and with it the QWebPage, etc)
# before the data has been used.
image.helper = helper
return image
class _WebkitRendererHelper(QObject):
"""This helper class is doing the real work. It is required to
allow WebkitRenderer.render() to be called "asynchronously"
(but always from Qt's GUI thread).
"""
def __init__(self, parent):
"""Copies the properties from the parent (WebkitRenderer) object,
creates the required instances of QWebPage, QWebView and QMainWindow
and registers some Slots.
"""
QObject.__init__(self)
# Copy properties from parent
for key, value in parent.__dict__.items():
setattr(self, key, value)
# Create and connect required PyQt4 objects
self._page = CustomWebPage(logger=self.logger)
self._view = QWebView()
self._view.setPage(self._page)
self._window = QMainWindow()
self._window.setCentralWidget(self._view)
# Import QWebSettings
for key, value in self.qWebSettings.iteritems():
self._page.settings().setAttribute(key, value)
# Connect required event listeners
if IsPyQt5:
self._page.loadFinished.connect(self._on_load_finished)
self._page.loadStarted.connect(self._on_load_started)
self._page.networkAccessManager().sslErrors.connect(self._on_ssl_errors)
self._page.networkAccessManager().finished.connect(self._on_each_reply)
else:
self.connect(self._page, SIGNAL("loadFinished(bool)"), self._on_load_finished)
self.connect(self._page, SIGNAL("loadStarted()"), self._on_load_started)
self.connect(self._page.networkAccessManager(),
SIGNAL("sslErrors(QNetworkReply *,const QList<QSslError>&)"),
self._on_ssl_errors)
self.connect(self._page.networkAccessManager(),
SIGNAL("finished(QNetworkReply *)"),
self._on_each_reply)
# The way we will use this, it seems to be unnecessary to have scrollbars enabled
self._page.mainFrame().setScrollBarPolicy(Qt.Horizontal, Qt.ScrollBarAlwaysOff)
self._page.mainFrame().setScrollBarPolicy(Qt.Vertical, Qt.ScrollBarAlwaysOff)
self._page.settings().setUserStyleSheetUrl(
QUrl("data:text/css,html,body{overflow-y:hidden !important;}"))
# Show this widget
# self._window.show()
def __del__(self):
"""Clean up Qt4 objects. """
self._window.close()
del self._window
del self._view
del self._page
def render(self, url):
"""The real worker. Loads the page (_load_page) and awaits
the end of the given 'delay'. While it is waiting outstanding
QApplication events are processed.
After the given delay, the Window or Widget (depends
on the value of 'grabWholeWindow') is drawn into a QPixmap
"""
self._load_page(url, self.width, self.height, self.timeout)
# Wait for end of timer. In this time, process
# other outstanding Qt events.
if self.wait > 0:
if self.logger: self.logger.debug("Waiting %d seconds " % self.wait)
waitToTime = time.time() + self.wait
while time.time() < waitToTime:
if QApplication.hasPendingEvents():
QApplication.processEvents()
if self.grabWholeWindow:
# Note that this does not fully ensure that the
# window still has the focus when the screen is
# grabbed. This might result in a race condition.
self._view.activateWindow()
if IsPyQt5:
image = QScreen.grabWindow(self._window.winId())
else:
image = QPixmap.grabWindow(self._window.winId())
else:
if IsPyQt5:
image = QWidget.grab(self._window)
else:
image = QPixmap.grabWidget(self._window)
httpout = WebkitRenderer.httpout
frame = self._view.page().currentFrame()
web_url = frame.url().toString()
# Write URL map
httpout.write("<!-- Web Rendering Proxy v%s by Antoni Sawicki -->\n"
% (__version__))
httpout.write("<!-- Request for [%s] frame [%s] -->\n"
% (WebkitRenderer.req_url, web_url))
# Get title
httpout.write("<HTML><HEAD>")
for ttl in frame.findAllElements('title'):
httpout.write((u"<TITLE>%s</TITLE>"
% ttl.toPlainText()).encode('utf-8', errors='ignore'))
break # Don't repeat bad HTML coding with several title marks
httpout.write("</HEAD>\n<BODY>\n")
if AUTOWIDTH:
httpout.write("<script>document.write('<span style=\"display: none;\"><img src=\"http://width-' + document.body.clientWidth + '-px.jpg\" width=\"0\" height=\"0\"></span>');</script>\n")
if ISMAP == True:
httpout.write("<A HREF=\"http://%s\">"
"<IMG SRC=\"http://%s\" ALT=\"wrp-render\" ISMAP>\n"
"</A>\n" % (WebkitRenderer.req_map, WebkitRenderer.req_img))
mapfile = StringIO.StringIO()
mapfile.write("default %s\n" % (web_url))
else:
httpout.write("<IMG SRC=\"http://%s\" ALT=\"wrp-render\" USEMAP=\"#map\">\n"
"<MAP NAME=\"map\">\n" % (WebkitRenderer.req_img))
for x in frame.findAllElements('a'):
turl = QUrl(web_url).resolved(QUrl(x.attribute('href'))).toString()
xmin, ymin, xmax, ymax = x.geometry().getCoords()
if ISMAP == True:
mapfile.write("rect %s %i,%i %i,%i\n".decode('utf-8', errors='ignore') % (turl, xmin, ymin, xmax, ymax))
else:
httpout.write(("<AREA SHAPE=\"RECT\""
" COORDS=\"%i,%i,%i,%i\""
" ALT=\"%s\" HREF=\"%s\">\n".decode('utf-8', errors='ignore')
% (xmin, ymin, xmax, ymax, turl, turl)).encode("utf-8"))
if ISMAP != True:
httpout.write("</MAP>\n")
httpout.write("</BODY>\n</HTML>\n")
if ISMAP == True:
RENDERS[WebkitRenderer.req_map] = mapfile
return image
def _load_page(self, url, width, height, timeout):
"""
This method implements the logic for retrieving and displaying
the requested page.
"""
# This is an event-based application. So we have to wait until
# "loadFinished(bool)" raised.
cancelAt = time.time() + timeout
self.__loading = True
self.__loading_result = False # Default; same attribute that _on_load_finished sets
self._page.mainFrame().load(QUrl(url))
while self.__loading:
if timeout > 0 and time.time() >= cancelAt:
raise RuntimeError("Request timed out on %s" % url)
while QApplication.hasPendingEvents() and self.__loading:
QCoreApplication.processEvents()
if self.logger: self.logger.debug("Processing result")
if self.__loading_result == False:
if self.logger: self.logger.warning("Failed to load %s" % url)
# Set initial viewport (the size of the "window")
size = self._page.mainFrame().contentsSize()
if self.logger: self.logger.debug("contentsSize: %s", size)
if width > 0:
size.setWidth(width)
if height > 0:
size.setHeight(height)
self._window.resize(size)
def _on_each_reply(self, reply):
"""Logs each requested uri"""
if self.logger: self.logger.debug("Received %s" % (reply.url().toString()))
# Eventhandler for "loadStarted()" signal
def _on_load_started(self):
"""Slot that sets the '__loading' property to true."""
if self.logger: self.logger.debug("loading started")
self.__loading = True
# Eventhandler for "loadFinished(bool)" signal
def _on_load_finished(self, result):
"""Slot that sets the '__loading' property to false and stores
the result code in '__loading_result'.
"""
if self.logger: self.logger.debug("loading finished with result %s", result)
self.__loading = False
self.__loading_result = result
# Eventhandler for "sslErrors(QNetworkReply *,const QList<QSslError>&)" signal
def _on_ssl_errors(self, reply, errors):
"""Slot that writes SSL warnings into the log but ignores them."""
for e in errors:
if self.logger: self.logger.warn("SSL: " + e.errorString())
reply.ignoreSslErrors()
class CustomWebPage(QWebPage):
def __init__(self, **kwargs):
super(CustomWebPage, self).__init__()
self.logger = kwargs.get('logger', None)
def javaScriptAlert(self, frame, message):
if self.logger: self.logger.debug('Alert: %s', message)
def javaScriptConfirm(self, frame, message):
if self.logger: self.logger.debug('Confirm: %s', message)
return False
def javaScriptPrompt(self, frame, message, default, result):
"""This function is called whenever a JavaScript program running inside frame tries to
prompt the user for input. The program may provide an optional message, msg, as well
as a default value for the input in defaultValue.
If the prompt was cancelled by the user the implementation should return false;
otherwise the result should be written to result and true should be returned.
If the prompt was not cancelled by the user, the implementation should return true and
the result string must not be null.
"""
if self.logger: self.logger.debug('Prompt: %s (%s)' % (message, default))
return False
def shouldInterruptJavaScript(self):
"""This function is called when a JavaScript program is running for a long period of
time. If the user wanted to stop the JavaScript the implementation should return
true; otherwise false.
"""
if self.logger: self.logger.debug("WebKit ask to interrupt JavaScript")
return True
#===============================================================================
def init_qtgui(display=None, style=None, qtargs=None):
"""Initiates the QApplication environment using the given args."""
if QApplication.instance():
logger.debug("QApplication has already been instantiated. \
Ignoring given arguments and returning existing QApplication.")
return QApplication.instance()
qtargs2 = [sys.argv[0]]
if display:
qtargs2.append('-display')
qtargs2.append(display)
# Also export DISPLAY var as this may be used
# by flash plugin
os.environ["DISPLAY"] = display
if style:
qtargs2.append('-style')
qtargs2.append(style)
qtargs2.extend(qtargs or [])
return QApplication(qtargs2)
# Technically, this is a QtGui application, because QWebPage requires it
# to be. But because we will have no user interaction, and rendering can
# not start before 'app.exec_()' is called, we have to trigger our "main"
# by a timer event.
def __main_qt():
# Render the page.
# If this method times out or loading failed, a
# RuntimeError is raised
try:
while True:
req = REQ.get()
WebkitRenderer.httpout = req[0]
WebkitRenderer.req_url = req[1]
WebkitRenderer.req_img = req[2]
WebkitRenderer.req_map = req[3]
if WebkitRenderer.req_url == "http://wrp.stop/" or WebkitRenderer.req_url == "http://www.wrp.stop/":
print ">>> Terminate Request Received"
QApplication.exit(0)
break
# Initialize WebkitRenderer object
renderer = WebkitRenderer()
renderer.logger = logger
renderer.width = WIDTH
renderer.height = HEIGHT
renderer.timeout = 60
renderer.wait = WAIT
renderer.grabWholeWindow = False
image = renderer.render(WebkitRenderer.req_url)
qBuffer = QBuffer()
if HasMagick:
image.save(qBuffer, 'png', QUALITY)
blob = PythonMagick.Blob(qBuffer.buffer().data())
mimg = PythonMagick.Image(blob)
mimg.quality(QUALITY)
if FORMAT=="GIF" and not MK_MONOCHROME and not MK_GRAYSCALE and not MK_DITHER and MK_COLORS != 0 and not MK_COLORS <= 256:
mimg.quantizeColors(256)
mimg.quantizeDither()
mimg.quantize()
if MK_MONOCHROME:
mimg.quantizeColorSpace(PythonMagick.ColorspaceType.GRAYColorspace)
mimg.quantizeColors(2)
mimg.quantizeDither()
mimg.quantize()
mimg.monochrome()
elif MK_GRAYSCALE:
mimg.quantizeColorSpace(PythonMagick.ColorspaceType.GRAYColorspace)
if MK_COLORS > 0 and MK_COLORS < 256:
mimg.quantizeColors(MK_COLORS)
else:
mimg.quantizeColors(256)
mimg.quantizeDither()
mimg.quantize()
else:
if MK_COLORS > 0:
mimg.quantizeColors(MK_COLORS)
if MK_DITHER:
mimg.quantizeDither()
mimg.quantize()
if FORMAT=="AUTO" or FORMAT=="JPG":
mimg.write(blob, "jpg")
elif FORMAT=="PNG":
mimg.write(blob, "png")
elif FORMAT=="GIF":
mimg.write(blob, "gif")
output = StringIO.StringIO()
output.write(blob.data)
else:
if FORMAT=="AUTO" or FORMAT=="JPG":
image.save(qBuffer, 'jpg', QUALITY)
elif FORMAT=="PNG":
image.save(qBuffer, 'png', QUALITY)
output = StringIO.StringIO()
output.write(qBuffer.buffer().data())
RENDERS[req[2]] = output
del renderer
print ">>> done: %s [%d kb]..." % (WebkitRenderer.req_img, output.len/1024)
RESP.put('')
QApplication.exit(0)
except RuntimeError, e:
logger.error("main: %s" % e)
print >> sys.stderr, e
QApplication.exit(1)
######################
### macOS CODEPATH ###
######################
elif sys.platform == "darwin":
import Foundation
import WebKit
import AppKit
import objc
class AppDelegate(Foundation.NSObject):
# what happens when the app starts up
def applicationDidFinishLaunching_(self, aNotification):
webview = aNotification.object().windows()[0].contentView()
webview.frameLoadDelegate().getURL(webview)
class WebkitLoad(Foundation.NSObject, WebKit.protocols.WebFrameLoadDelegate):
# what happens if something goes wrong while loading
def webView_didFailLoadWithError_forFrame_(self, webview, error, frame):
if error.code() == Foundation.NSURLErrorCancelled:
return
print " ... something went wrong 1: " + error.localizedDescription()
AppKit.NSApplication.sharedApplication().terminate_(None)
def webView_didFailProvisionalLoadWithError_forFrame_(self, webview, error, frame):
if error.code() == Foundation.NSURLErrorCancelled:
return
print " ... something went wrong 2: " + error.localizedDescription()
AppKit.NSApplication.sharedApplication().terminate_(None)
def getURL(self, webview):
req = REQ.get()
WebkitLoad.httpout = req[0]
WebkitLoad.req_url = req[1]
WebkitLoad.req_img = req[2]
WebkitLoad.req_map = req[3]
if WebkitLoad.req_url == "http://wrp.stop/" or WebkitLoad.req_url == "http://www.wrp.stop/":
print ">>> Terminate Request Received"
AppKit.NSApplication.sharedApplication().terminate_(None)
nsurl = Foundation.NSURL.URLWithString_(WebkitLoad.req_url)
if not (nsurl and nsurl.scheme()):
nsurl = Foundation.NSURL.alloc().initFileURLWithPath_(WebkitLoad.req_url)
nsurl = nsurl.absoluteURL()
Foundation.NSURLRequest.setAllowsAnyHTTPSCertificate_forHost_(objc.YES, nsurl.host())
self.resetWebview(webview)
webview.mainFrame().loadRequest_(Foundation.NSURLRequest.requestWithURL_(nsurl))
if not webview.mainFrame().provisionalDataSource():
print " ... not a proper url?"
RESP.put('')
self.getURL(webview)
def resetWebview(self, webview):
rect = Foundation.NSMakeRect(0, 0, WIDTH, HEIGHT)
webview.window().setContentSize_((WIDTH, HEIGHT))
webview.setFrame_(rect)
def captureView(self, view):
view.window().display()
view.window().setContentSize_(view.bounds().size)
view.setFrame_(view.bounds())
if hasattr(view, "bitmapImageRepForCachingDisplayInRect_"):
bitmapdata = view.bitmapImageRepForCachingDisplayInRect_(view.bounds())
view.cacheDisplayInRect_toBitmapImageRep_(view.bounds(), bitmapdata)
else:
view.lockFocus()
bitmapdata = AppKit.NSBitmapImageRep.alloc()
bitmapdata.initWithFocusedViewRect_(view.bounds())
view.unlockFocus()
return bitmapdata
# what happens when the page has finished loading
def webView_didFinishLoadForFrame_(self, webview, frame):
# don't care about subframes
if frame == webview.mainFrame():
view = frame.frameView().documentView()
output = StringIO.StringIO()
if HasMagick:
output.write(self.captureView(view).representationUsingType_properties_(
AppKit.NSPNGFileType, None))
blob = PythonMagick.Blob(output)
mimg = PythonMagick.Image(blob)
mimg.quality(QUALITY)
if FORMAT=="GIF" and not MK_MONOCHROME and not MK_GRAYSCALE and not MK_DITHER and MK_COLORS != 0 and not MK_COLORS <= 256:
mimg.quantizeColors(256)
mimg.quantizeDither()
mimg.quantize()
if MK_MONOCHROME:
mimg.quantizeColorSpace(PythonMagick.ColorspaceType.GRAYColorspace)
mimg.quantizeColors(2)
mimg.quantizeDither()
mimg.quantize()
mimg.monochrome()
elif MK_GRAYSCALE:
mimg.quantizeColorSpace(PythonMagick.ColorspaceType.GRAYColorspace)
if MK_COLORS > 0 and MK_COLORS < 256:
mimg.quantizeColors(MK_COLORS)
else:
mimg.quantizeColors(256)
mimg.quantizeDither()
mimg.quantize()
else:
if MK_COLORS > 0:
mimg.quantizeColors(MK_COLORS)
if MK_DITHER:
mimg.quantizeDither()
mimg.quantize()
if FORMAT=="JPG":
mimg.write(blob, "jpg")
elif FORMAT=="PNG":
mimg.write(blob, "png")
elif FORMAT=="AUTO" or FORMAT=="GIF":
mimg.write(blob, "gif")
output = StringIO.StringIO()
output.write(blob.data)
else:
if FORMAT=="AUTO" or FORMAT=="GIF":
output.write(self.captureView(view).representationUsingType_properties_(
AppKit.NSGIFFileType, None))
elif FORMAT=="JPG":
output.write(self.captureView(view).representationUsingType_properties_(
AppKit.NSJPEGFileType, None))
elif FORMAT=="PNG":
output.write(self.captureView(view).representationUsingType_properties_(
AppKit.NSPNGFileType, None))
RENDERS[WebkitLoad.req_img] = output
# url of the rendered page
web_url = frame.dataSource().initialRequest().URL().absoluteString()
httpout = WebkitLoad.httpout
httpout.write("<!-- Web Rendering Proxy v%s by Antoni Sawicki -->\n"
% (__version__))
httpout.write("<!-- Request for [%s] frame [%s] -->\n"
% (WebkitLoad.req_url, web_url))
domdocument = frame.DOMDocument()
# Get title
httpout.write("<HTML><HEAD>")
httpout.write((u"<TITLE>%s</TITLE>"
% domdocument.title()).encode('utf-8', errors='ignore'))
httpout.write("</HEAD>\n<BODY>\n")
if AUTOWIDTH:
httpout.write("<script>document.write('<span style=\"display: none;\"><img src=\"http://width-' + document.body.clientWidth + '-px.jpg\" width=\"0\" height=\"0\"></span>');</script>\n")
if ISMAP == True:
httpout.write("<A HREF=\"http://%s\">"
"<IMG SRC=\"http://%s\" ALT=\"wrp-render\" ISMAP>\n"
"</A>\n" % (WebkitLoad.req_map, WebkitLoad.req_img))
mapfile = StringIO.StringIO()
mapfile.write("default %s\n" % (web_url))
else:
httpout.write("<IMG SRC=\"http://%s\" ALT=\"wrp-render\" USEMAP=\"#map\">\n"
"<MAP NAME=\"map\">\n" % (WebkitLoad.req_img))
domnodelist = domdocument.getElementsByTagName_('A')
i = 0
while i < domnodelist.length():
turl = domnodelist.item_(i).valueForKey_('href')
#TODO: crashes? validate url? insert web_url if wrong?
myrect = domnodelist.item_(i).boundingBox()
xmin = Foundation.NSMinX(myrect)
ymin = Foundation.NSMinY(myrect)
xmax = Foundation.NSMaxX(myrect)
ymax = Foundation.NSMaxY(myrect)
if ISMAP == True:
mapfile.write("rect %s %i,%i %i,%i\n".decode('utf-8', errors='ignore') % (turl, xmin, ymin, xmax, ymax))
else:
httpout.write("<AREA SHAPE=\"RECT\""
" COORDS=\"%i,%i,%i,%i\""
" ALT=\"%s\" HREF=\"%s\">\n".decode('utf-8', errors='ignore')
% (xmin, ymin, xmax, ymax, turl, turl))
i += 1
if ISMAP != True:
httpout.write("</MAP>\n")
httpout.write("</BODY>\n</HTML>\n")
if ISMAP == True:
RENDERS[WebkitLoad.req_map] = mapfile
# Return to Proxy thread and Loop...
RESP.put('')
self.getURL(webview)
def main_cocoa():
# Launch NS Application
AppKit.NSApplicationLoad()
app = AppKit.NSApplication.sharedApplication()
delegate = AppDelegate.alloc().init()
AppKit.NSApp().setDelegate_(delegate)
AppKit.NSBundle.mainBundle().infoDictionary()['NSAppTransportSecurity'] = \
dict(NSAllowsArbitraryLoads=True)
rect = Foundation.NSMakeRect(-16000, -16000, 100, 100)
win = AppKit.NSWindow.alloc()
win.initWithContentRect_styleMask_backing_defer_(rect, AppKit.NSBorderlessWindowMask, 2, 0)
webview = WebKit.WebView.alloc()
webview.initWithFrame_(rect)
webview.mainFrame().frameView().setAllowsScrolling_(objc.NO)
webkit_version = Foundation.NSBundle.bundleForClass_(WebKit.WebView). \
objectForInfoDictionaryKey_(WebKit.kCFBundleVersionKey)[1:]
webview.setApplicationNameForUserAgent_("Like-Version/6.0 Safari/%s wrp/%s"
% (webkit_version, __version__))
win.setContentView_(webview)
loaddelegate = WebkitLoad.alloc().init()
loaddelegate.options = [""]
webview.setFrameLoadDelegate_(loaddelegate)
app.run()
#######################
### COMMON CODEPATH ###
#######################
class Proxy(SimpleHTTPServer.SimpleHTTPRequestHandler):
def do_GET(self):
req_url = self.path
httpout = self.wfile
map_re = re.match(r"http://(wrp-\d+\.map).*?(\d+),(\d+)", req_url)
wid_re = re.match(r"http://(width-[0-9]+-px\.jpg).*", req_url)
gif_re = re.match(r"http://(wrp-\d+\.gif).*", req_url)
jpg_re = re.match(r"http://(wrp-\d+\.jpg).*", req_url)
png_re = re.match(r"http://(wrp-\d+\.png).*", req_url)
# Serve Rendered GIF
if gif_re:
img = gif_re.group(1)
print ">>> request for rendered gif image... %s [%d kb]" \
% (img, RENDERS[img].len/1024)
self.send_response(200, 'OK')
self.send_header('Content-type', 'image/gif')
self.end_headers()
httpout.write(RENDERS[img].getvalue())
del RENDERS[img]
elif jpg_re:
img = jpg_re.group(1)
print ">>> request for rendered jpg image... %s [%d kb]" \
% (img, RENDERS[img].len/1024)
self.send_response(200, 'OK')
self.send_header('Content-type', 'image/jpeg')
self.end_headers()
httpout.write(RENDERS[img].getvalue())
del RENDERS[img]
elif png_re:
img = png_re.group(1)
print ">>> request for rendered png image... %s [%d kb]" \
% (img, RENDERS[img].len/1024)
self.send_response(200, 'OK')
self.send_header('Content-type', 'image/png')
self.end_headers()
httpout.write(RENDERS[img].getvalue())
del RENDERS[img]
elif wid_re:
global WIDTH
try:
wid = req_url.split("-")
WIDTH = int(wid[1])
print ">>> width request: %d" % WIDTH
except:
print ">>> width request error"
self.send_error(404, "Width request")
self.end_headers()
# Process ISMAP Request
elif map_re:
map = map_re.group(1)
req_x = int(map_re.group(2))
req_y = int(map_re.group(3))
print ">>> ISMAP request... %s [%d,%d] " % (map, req_x, req_y)
mapf = RENDERS[map]
mapf.seek(0)
goto_url = "none"
for line in mapf.readlines():
if re.match(r"(\S+)", line).group(1) == "default":
default_url = re.match(r"\S+\s+(\S+)", line).group(1)
elif re.match(r"(\S+)", line).group(1) == "rect":
try:
rect = re.match(r"(\S+)\s+(\S+)\s+(\d+),(\d+)\s+(\d+),(\d+)", line)
min_x = int(rect.group(3))
min_y = int(rect.group(4))
max_x = int(rect.group(5))
max_y = int(rect.group(6))
if (req_x >= min_x) and \
(req_x <= max_x) and \
(req_y >= min_y) and \
(req_y <= max_y):
goto_url = rect.group(2)
except AttributeError:
pass
if goto_url == "none":
goto_url = default_url
print ">>> ISMAP redirect: %s\n" % (goto_url)
self.send_response(302, "Found")
self.send_header("Location", goto_url)
self.send_header("Content-type", "text/html")
self.end_headers()
httpout.write("<HTML><BODY><A HREF=\"%s\">%s</A></BODY></HTML>\n"
% (goto_url, goto_url))
# Process a web page request and generate image
else:
print ">>> URL request... " + req_url
if req_url == "http://wrp.stop/" or req_url == "http://www.wrp.stop/":
REQ.put((httpout, req_url, "", ""))
RESP.get()
else:
reqst = urllib.urlopen(req_url)
if reqst.info().type == "text/html" or reqst.info().type == "application/xhtml+xml":
# If an error occurs, send error headers to the requester
if reqst.getcode() >= 400:
self.send_response(reqst.getcode())
for hdr in reqst.info():
self.send_header(hdr, reqst.info()[hdr])
self.end_headers()
else:
self.send_response(200, 'OK')
self.send_header('Content-type', 'text/html')
self.end_headers()
rnd = random.randrange(0, 1000)
if FORMAT == "GIF":
req_extension = ".gif"
elif FORMAT == "JPG":
req_extension = ".jpg"
elif FORMAT == "PNG":
req_extension = ".png"
elif (sys.platform.startswith('linux') or sys.platform.startswith('freebsd')) and FORMAT == "AUTO":
req_extension = ".jpg"
elif sys.platform == "darwin" and FORMAT == "AUTO":
req_extension = ".gif"
req_img = "wrp-%s%s" % (rnd, req_extension)
req_map = "wrp-%s.map" % (rnd)
# To WebKit Thread
REQ.put((httpout, req_url, req_img, req_map))
# Wait for completion
RESP.get()
# If the requested file is not HTML or XHTML, just return it as is.
else:
self.send_response(reqst.getcode())
for hdr in reqst.info():
self.send_header(hdr, reqst.info()[hdr])
self.end_headers()
httpout.write(reqst.read())
def run_proxy():
httpd = SocketServer.TCPServer(('', PORT), Proxy)
print "Web Rendering Proxy v%s serving at port: %s" % (__version__, PORT)
while 1:
httpd.serve_forever()
def main():
if(FORMAT != "AUTO" and FORMAT != "GIF" and FORMAT != "JPG" and FORMAT != "PNG"):
sys.exit("Unsupported image format \"%s\". Exiting." % FORMAT)
if (sys.platform.startswith('linux') or sys.platform.startswith('freebsd')) and FORMAT == "GIF" and not HasMagick:
sys.exit("GIF format is not supported on this platform. Exiting.")
# run traffic through sslstrip as a quick workaround for getting SSL webpages to work
# NOTE: modern browsers are doing their best to stop this kind of 'attack'. Firefox
# supports an about:config flag test.currentTimeOffsetSeconds(int) = 12000000, which
# you can use to circumvent those checks.
if SSLSTRIP:
try:
subprocess.check_output(["pidof", "sslstrip"])
except:
subprocess.Popen(["sslstrip"], stdout=open(os.devnull,'w'), stderr=subprocess.STDOUT) # runs on port 10000 by default
QNetworkProxy.setApplicationProxy(QNetworkProxy(QNetworkProxy.HttpProxy, "localhost", 10000))
# Launch Proxy Thread
threading.Thread(target=run_proxy).start()
if sys.platform.startswith('linux') or sys.platform.startswith('freebsd'):
import signal
try:
import PyQt5.QtCore
except ImportError:
import PyQt4.QtCore
# Initialize Qt-Application, but make this script
# abortable via CTRL-C
app = init_qtgui(display=None, style=None)
signal.signal(signal.SIGINT, signal.SIG_DFL)
QTimer.singleShot(0, __main_qt)
sys.exit(app.exec_())
elif sys.platform == "darwin":
main_cocoa()
else:
sys.exit("Unsupported platform: %s. Exiting." % sys.platform)
if __name__ == '__main__': main()
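
The server-side ISMAP branch of `Proxy.do_GET` above stores one `default <url>` line plus one `rect <url> x1,y1 x2,y2` line per link, then redirects a click to the matching rectangle. The following is a minimal standalone sketch of that lookup, not the script's own code: the function name `resolve_click`, the `sample_map` data, and the example.com URLs are hypothetical, and it is written for Python 3 rather than the Python 2 used by the script.

```python
import re

# Same "rect <url> x1,y1 x2,y2" layout that the renderer writes into the map file.
RECT_RE = re.compile(r"rect\s+(\S+)\s+(\d+),(\d+)\s+(\d+),(\d+)")


def resolve_click(map_text, x, y):
    """Return the URL whose rectangle contains (x, y), else the default URL."""
    default_url = None
    target = None
    for line in map_text.splitlines():
        if line.startswith("default "):
            default_url = line.split(None, 1)[1].strip()
            continue
        match = RECT_RE.match(line)
        if not match:
            continue  # skip malformed lines instead of failing the request
        url = match.group(1)
        x1, y1, x2, y2 = (int(g) for g in match.groups()[1:])
        if x1 <= x <= x2 and y1 <= y <= y2:
            target = url  # no early exit: a later matching rectangle wins
    return target if target is not None else default_url


# Hypothetical map contents for illustration only.
sample_map = (
    "default http://example.com/\n"
    "rect http://example.com/a 10,10 100,40\n"
    "rect http://example.com/b 10,50 100,80\n"
)

if __name__ == "__main__":
    print(resolve_click(sample_map, 20, 60))
    print(resolve_click(sample_map, 500, 500))
```

Like the handler, the sketch lets the last matching rectangle override earlier ones and falls back to the default URL when the click lands outside every link.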

769
wrp.go

@ -2,7 +2,7 @@
// WRP - Web Rendering Proxy
//
// Copyright (c) 2013-2018 Antoni Sawicki
// Copyright (c) 2019 Google LLC
// Copyright (c) 2019-2024 Google LLC
//
package main
@ -10,14 +10,21 @@ package main
import (
"bytes"
"context"
"embed"
"flag"
"fmt"
"html/template"
"image"
"image/color/palette"
"image/gif"
"image/jpeg"
"image/png"
"io"
"io/ioutil"
"log"
"math"
"math/rand"
"net"
"net/http"
"net/url"
"os"
@ -27,238 +34,281 @@ import (
"syscall"
"time"
"github.com/MaxHalford/halfgone"
"github.com/chromedp/cdproto/css"
"github.com/chromedp/cdproto/emulation"
"github.com/chromedp/cdproto/page"
"github.com/chromedp/chromedp"
"github.com/ericpauley/go-quantize/quantize"
"github.com/soniakeys/quant/median"
)
const version = "4.6.0"
var (
version = "4.5"
srv http.Server
ctx context.Context
cancel context.CancelFunc
img = make(map[string]bytes.Buffer)
ismap = make(map[string]wrpReq)
nodel bool
deftype string
defgeom geom
addr = flag.String("l", ":8080", "Listen address:port, default :8080")
headless = flag.Bool("h", true, "Headless mode / hide browser window (default true)")
noDel = flag.Bool("n", false, "Do not free maps and images after use")
defType = flag.String("t", "gif", "Image type: png|gif|jpg")
jpgQual = flag.Int("q", 80, "Jpeg image quality, default 80%")
fgeom = flag.String("g", "1152x600x216", "Geometry: width x height x colors, height can be 0 for unlimited")
htmFnam = flag.String("ui", "wrp.html", "HTML template file for the UI")
delay = flag.Duration("s", 2*time.Second, "Delay/sleep after page is rendered and before screenshot is taken")
userAgent = flag.String("ua", "", "override chrome user agent")
srv http.Server
actx, ctx context.Context
acncl, cncl context.CancelFunc
img = make(map[string]bytes.Buffer)
ismap = make(map[string]wrpReq)
defGeom geom
htmlTmpl *template.Template
)
//go:embed *.html
var fs embed.FS
type geom struct {
w int64
h int64
c int64
}
// Data for html template
type uiData struct {
Version string
URL string
BgColor string
NColors int64
Width int64
Height int64
Zoom float64
ImgType string
ImgURL string
ImgSize string
ImgWidth int
ImgHeight int
MapURL string
PageHeight string
}
// Parameters for HTML print function
type printParams struct {
bgColor string
pageHeight string
imgSize string
imgURL string
mapURL string
imgWidth int
imgHeight int
}
// WRP Request
type wrpReq struct {
U string // url
W int64 // width
H int64 // height
S float64 // scale
C int64 // #colors
X int64 // mouseX
Y int64 // mouseY
K string // keys to send
F string // Fn buttons
T string // imgtype
url string // url
width int64 // width
height int64 // height
zoom float64 // zoom/scale
colors int64 // #colors
mouseX int64 // mouseX
mouseY int64 // mouseY
keys string // keys to send
buttons string // Fn buttons
imgType string // imgtype
w http.ResponseWriter
r *http.Request
}
func (w *wrpReq) parseForm(req *http.Request) {
req.ParseForm()
w.U = req.FormValue("url")
if len(w.U) > 1 && !strings.HasPrefix(w.U, "http") {
w.U = fmt.Sprintf("http://www.google.com/search?q=%s", url.QueryEscape(w.U))
// Parse HTML Form, Process Input Boxes, Etc.
func (rq *wrpReq) parseForm() {
rq.r.ParseForm()
rq.url = rq.r.FormValue("url")
if len(rq.url) > 1 && !strings.HasPrefix(rq.url, "http") {
rq.url = fmt.Sprintf("http://www.google.com/search?q=%s", url.QueryEscape(rq.url))
}
w.W, _ = strconv.ParseInt(req.FormValue("w"), 10, 64)
w.H, _ = strconv.ParseInt(req.FormValue("h"), 10, 64)
if w.W < 10 && w.H < 10 {
w.W = defgeom.w
w.H = defgeom.h
rq.width, _ = strconv.ParseInt(rq.r.FormValue("w"), 10, 64)
rq.height, _ = strconv.ParseInt(rq.r.FormValue("h"), 10, 64)
if rq.width < 10 && rq.height < 10 {
rq.width = defGeom.w
rq.height = defGeom.h
}
w.S, _ = strconv.ParseFloat(req.FormValue("s"), 64)
if w.S < 0.1 {
w.S = 1.0
rq.zoom, _ = strconv.ParseFloat(rq.r.FormValue("z"), 64)
if rq.zoom < 0.1 {
rq.zoom = 1.0
}
w.C, _ = strconv.ParseInt(req.FormValue("c"), 10, 64)
if w.C < 2 || w.C > 256 {
w.C = defgeom.c
rq.colors, _ = strconv.ParseInt(rq.r.FormValue("c"), 10, 64)
if rq.colors < 2 || rq.colors > 256 {
rq.colors = defGeom.c
}
w.K = req.FormValue("k")
w.F = req.FormValue("Fn")
w.T = req.FormValue("t")
if w.T != "gif" && w.T != "png" {
w.T = deftype
rq.keys = rq.r.FormValue("k")
rq.buttons = rq.r.FormValue("Fn")
rq.imgType = rq.r.FormValue("t")
switch rq.imgType {
case "png":
case "gif":
case "jpg":
default:
rq.imgType = *defType
}
log.Printf("%s WrpReq from Form: %+v\n", req.RemoteAddr, w)
log.Printf("%s WrpReq from UI Form: %+v\n", rq.r.RemoteAddr, rq)
}
func (w wrpReq) printPage(out http.ResponseWriter, bgcolor string) {
var s string
out.Header().Set("Cache-Control", "max-age=0")
out.Header().Set("Expires", "-1")
out.Header().Set("Pragma", "no-cache")
out.Header().Set("Content-Type", "text/html")
fmt.Fprintf(out, "<!-- Web Rendering Proxy Version %s -->\n", version)
fmt.Fprintf(out, "<HTML>\n<HEAD><TITLE>WRP %s</TITLE></HEAD>\n<BODY BGCOLOR=\"%s\">\n", w.U, bgcolor)
fmt.Fprintf(out, "<FORM ACTION=\"/\" METHOD=\"POST\">\n")
fmt.Fprintf(out, "<INPUT TYPE=\"TEXT\" NAME=\"url\" VALUE=\"%s\" SIZE=\"20\">", w.U)
fmt.Fprintf(out, "<INPUT TYPE=\"SUBMIT\" VALUE=\"Go\">\n")
fmt.Fprintf(out, "<INPUT TYPE=\"SUBMIT\" NAME=\"Fn\" VALUE=\"Bk\">\n")
fmt.Fprintf(out, "W <INPUT TYPE=\"TEXT\" NAME=\"w\" VALUE=\"%d\" SIZE=\"4\"> \n", w.W)
fmt.Fprintf(out, "H <INPUT TYPE=\"TEXT\" NAME=\"h\" VALUE=\"%d\" SIZE=\"4\"> \n", w.H)
fmt.Fprintf(out, "S <SELECT NAME=\"s\">\n")
for _, v := range []float64{0.65, 0.75, 0.85, 0.95, 1.0, 1.05, 1.15, 1.25} {
if v == w.S {
s = "SELECTED"
} else {
s = ""
}
fmt.Fprintf(out, "<OPTION VALUE=\"%1.2f\" %s>%1.2f</OPTION>\n", v, s, v)
}
fmt.Fprintf(out, "</SELECT>\n")
fmt.Fprintf(out, "T <SELECT NAME=\"t\">\n")
for _, v := range []string{"gif", "png"} {
if v == w.T {
s = "SELECTED"
} else {
s = ""
}
fmt.Fprintf(out, "<OPTION VALUE=\"%s\" %s>%s</OPTION>\n", v, s, strings.ToUpper(v))
}
fmt.Fprintf(out, "</SELECT>\n")
fmt.Fprintf(out, "C <INPUT TYPE=\"TEXT\" NAME=\"c\" VALUE=\"%d\" SIZE=\"3\">\n", w.C)
fmt.Fprintf(out, "K <INPUT TYPE=\"TEXT\" NAME=\"k\" VALUE=\"\" SIZE=\"4\"> \n")
fmt.Fprintf(out, "<INPUT TYPE=\"SUBMIT\" NAME=\"Fn\" VALUE=\"Bs\">\n")
fmt.Fprintf(out, "<INPUT TYPE=\"SUBMIT\" NAME=\"Fn\" VALUE=\"Rt\">\n")
fmt.Fprintf(out, "<INPUT TYPE=\"SUBMIT\" NAME=\"Fn\" VALUE=\"&lt;\">\n")
fmt.Fprintf(out, "<INPUT TYPE=\"SUBMIT\" NAME=\"Fn\" VALUE=\"^\">\n")
fmt.Fprintf(out, "<INPUT TYPE=\"SUBMIT\" NAME=\"Fn\" VALUE=\"v\">\n")
fmt.Fprintf(out, "<INPUT TYPE=\"SUBMIT\" NAME=\"Fn\" VALUE=\"&gt;\" SIZE=\"1\">\n")
fmt.Fprintf(out, "</FORM><BR>\n")
}
func (w wrpReq) printFooter(out http.ResponseWriter, h string, s string) {
fmt.Fprintf(out, "\n<P><FONT SIZE=\"-2\"><A HREF=\"/?url=https://github.com/tenox7/wrp/&w=%d&h=%d&s=%1.2f&c=%d&t=%s\">"+
"Web Rendering Proxy Version %s</A> | <A HREF=\"/shutdown/\">Shutdown WRP</A> | "+
"<A HREF=\"/\">Page Height: %s</A> | <A HREF=\"/\">Img Size: %s</A></FONT></BODY>\n</HTML>\n", w.W, w.H, w.S, w.C, w.T, version, h, s)
}
func pageServer(out http.ResponseWriter, req *http.Request) {
log.Printf("%s Page Request for %s [%+v]\n", req.RemoteAddr, req.URL.Path, req.URL.RawQuery)
var w wrpReq
w.parseForm(req)
if len(w.U) > 4 {
w.capture(req.RemoteAddr, out)
} else {
w.printPage(out, "#FFFFFF")
w.printFooter(out, "", "")
}
}
func mapServer(out http.ResponseWriter, req *http.Request) {
log.Printf("%s ISMAP Request for %s [%+v]\n", req.RemoteAddr, req.URL.Path, req.URL.RawQuery)
w, ok := ismap[req.URL.Path]
if !ok {
fmt.Fprintf(out, "Unable to find map %s\n", req.URL.Path)
log.Printf("Unable to find map %s\n", req.URL.Path)
return
}
if !nodel {
defer delete(ismap, req.URL.Path)
}
n, err := fmt.Sscanf(req.URL.RawQuery, "%d,%d", &w.X, &w.Y)
if err != nil || n != 2 {
fmt.Fprintf(out, "n=%d, err=%s\n", n, err)
log.Printf("%s ISMAP n=%d, err=%s\n", req.RemoteAddr, n, err)
return
}
log.Printf("%s WrpReq from ISMAP: %+v\n", req.RemoteAddr, w)
if len(w.U) > 4 {
w.capture(req.RemoteAddr, out)
} else {
w.printPage(out, "#FFFFFF")
w.printFooter(out, "", "")
}
}
func imgServer(out http.ResponseWriter, req *http.Request) {
log.Printf("%s IMG Request for %s\n", req.RemoteAddr, req.URL.Path)
imgbuf, ok := img[req.URL.Path]
if !ok || imgbuf.Bytes() == nil {
fmt.Fprintf(out, "Unable to find image %s\n", req.URL.Path)
log.Printf("%s Unable to find image %s\n", req.RemoteAddr, req.URL.Path)
return
}
if !nodel {
defer delete(img, req.URL.Path)
}
if strings.HasPrefix(req.URL.Path, ".gif") {
out.Header().Set("Content-Type", "image/gif")
} else if strings.HasPrefix(req.URL.Path, ".png") {
out.Header().Set("Content-Type", "image/png")
}
out.Header().Set("Content-Length", strconv.Itoa(len(imgbuf.Bytes())))
out.Header().Set("Cache-Control", "max-age=0")
out.Header().Set("Expires", "-1")
out.Header().Set("Pragma", "no-cache")
out.Write(imgbuf.Bytes())
out.(http.Flusher).Flush()
}
func (w wrpReq) capture(c string, out http.ResponseWriter) {
var err error
if w.X > 0 && w.Y > 0 {
log.Printf("%s Mouse Click %d,%d\n", c, w.X, w.Y)
err = chromedp.Run(ctx, chromedp.MouseClickXY(int64(float64(w.X)/float64(w.S)), int64(float64(w.Y)/float64(w.S))))
} else if len(w.F) > 0 {
log.Printf("%s Button %v\n", c, w.F)
switch w.F {
case "Bk":
err = chromedp.Run(ctx, chromedp.NavigateBack())
case "Bs":
err = chromedp.Run(ctx, chromedp.KeyEvent("\b"))
case "Rt":
err = chromedp.Run(ctx, chromedp.KeyEvent("\r"))
case "<":
err = chromedp.Run(ctx, chromedp.KeyEvent("\u0302"))
case "^":
err = chromedp.Run(ctx, chromedp.KeyEvent("\u0304"))
case "v":
err = chromedp.Run(ctx, chromedp.KeyEvent("\u0301"))
case ">":
err = chromedp.Run(ctx, chromedp.KeyEvent("\u0303"))
}
} else if len(w.K) > 0 {
log.Printf("%s Sending Keys: %#v\n", c, w.K)
err = chromedp.Run(ctx, chromedp.KeyEvent(w.K))
} else {
log.Printf("%s Processing Capture Request for %s\n", c, w.U)
err = chromedp.Run(ctx, chromedp.Navigate(w.U))
// Display WRP UI
func (rq *wrpReq) printHTML(p printParams) {
rq.w.Header().Set("Cache-Control", "max-age=0")
rq.w.Header().Set("Expires", "-1")
rq.w.Header().Set("Pragma", "no-cache")
rq.w.Header().Set("Content-Type", "text/html")
data := uiData{
Version: version,
URL: rq.url,
BgColor: p.bgColor,
Width: rq.width,
Height: rq.height,
NColors: rq.colors,
Zoom: rq.zoom,
ImgType: rq.imgType,
ImgSize: p.imgSize,
ImgWidth: p.imgWidth,
ImgHeight: p.imgHeight,
ImgURL: p.imgURL,
MapURL: p.mapURL,
PageHeight: p.pageHeight,
}
err := htmlTmpl.Execute(rq.w, data)
if err != nil {
if err.Error() == "context canceled" {
log.Printf("%s Contex cancelled, try again", c)
fmt.Fprintf(out, "<BR>%s<BR> -- restarting, try again", err)
ctx, cancel = chromedp.NewContext(context.Background())
} else {
log.Printf("%s %s", c, err)
fmt.Fprintf(out, "<BR>%s<BR>", err)
log.Fatal(err)
}
}
// Determine what action to take
func (rq *wrpReq) action() chromedp.Action {
// Mouse Click
if rq.mouseX > 0 && rq.mouseY > 0 {
log.Printf("%s Mouse Click %d,%d\n", rq.r.RemoteAddr, rq.mouseX, rq.mouseY)
return chromedp.MouseClickXY(float64(rq.mouseX)/float64(rq.zoom), float64(rq.mouseY)/float64(rq.zoom))
}
// Buttons
if len(rq.buttons) > 0 {
log.Printf("%s Button %v\n", rq.r.RemoteAddr, rq.buttons)
switch rq.buttons {
case "Bk":
return chromedp.NavigateBack()
case "St":
return chromedp.Stop()
case "Re":
return chromedp.Reload()
case "Bs":
return chromedp.KeyEvent("\b")
case "Rt":
return chromedp.KeyEvent("\r")
case "<":
return chromedp.KeyEvent("\u0302")
case "^":
return chromedp.KeyEvent("\u0304")
case "v":
return chromedp.KeyEvent("\u0301")
case ">":
return chromedp.KeyEvent("\u0303")
}
}
// Keys
if len(rq.keys) > 0 {
log.Printf("%s Sending Keys: %#v\n", rq.r.RemoteAddr, rq.keys)
return chromedp.KeyEvent(rq.keys)
}
// Navigate to URL
log.Printf("%s Processing Capture Request for %s\n", rq.r.RemoteAddr, rq.url)
return chromedp.Navigate(rq.url)
}
// Navigate to the desired URL.
func (rq *wrpReq) navigate() {
ctxErr(chromedp.Run(ctx, rq.action()), rq.w)
}
// Handle context errors
func ctxErr(err error, w io.Writer) {
// TODO: callers should have retry logic, perhaps create another function
// that takes ...chromedp.Action and retries with give up
if err == nil {
return
}
var styles []*css.ComputedProperty
log.Printf("Context error: %s", err)
fmt.Fprintf(w, "Context error: %s<BR>\n", err)
if err.Error() != "context canceled" {
return
}
ctx, cncl = chromedp.NewContext(actx)
log.Printf("Created new context, try again")
fmt.Fprintln(w, "Created new context, try again")
}
// https://github.com/chromedp/chromedp/issues/979
func chromedpCaptureScreenshot(res *[]byte, h int64) chromedp.Action {
if res == nil {
panic("res cannot be nil")
}
if h == 0 {
return chromedp.CaptureScreenshot(res)
}
return chromedp.ActionFunc(func(ctx context.Context) error {
var err error
*res, err = page.CaptureScreenshot().Do(ctx)
return err
})
}
func gifPalette(i image.Image, n int64) image.Image {
switch n {
case 2:
i = halfgone.FloydSteinbergDitherer{}.Apply(halfgone.ImageToGray(i))
case 216:
var FastGifLut = [256]int{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5}
r := i.Bounds()
// NOTE: the color index computation below works only for palette.WebSafe!
p := image.NewPaletted(r, palette.WebSafe)
if i64, ok := i.(image.RGBA64Image); ok {
for y := r.Min.Y; y < r.Max.Y; y++ {
for x := r.Min.X; x < r.Max.X; x++ {
c := i64.RGBA64At(x, y)
r6 := FastGifLut[c.R>>8]
g6 := FastGifLut[c.G>>8]
b6 := FastGifLut[c.B>>8]
p.SetColorIndex(x, y, uint8(36*r6+6*g6+b6))
}
}
} else {
for y := r.Min.Y; y < r.Max.Y; y++ {
for x := r.Min.X; x < r.Max.X; x++ {
c := i.At(x, y)
r, g, b, _ := c.RGBA()
r6 := FastGifLut[r&0xff]
g6 := FastGifLut[g&0xff]
b6 := FastGifLut[b&0xff]
p.SetColorIndex(x, y, uint8(36*r6+6*g6+b6))
}
}
}
i = p
default:
q := median.Quantizer(n)
i = q.Paletted(i)
}
return i
}
// Capture currently rendered web page to an image and fake ISMAP
func (rq *wrpReq) capture() {
var styles []*css.ComputedStyleProperty
var r, g, b int
var h int64
var pngcap []byte
var pngCap []byte
chromedp.Run(ctx,
emulation.SetDeviceMetricsOverride(int64(float64(w.W)/w.S), 10, w.S, false),
chromedp.Sleep(time.Second*2),
chromedp.Location(&w.U),
emulation.SetDeviceMetricsOverride(int64(float64(rq.width)/rq.zoom), 10, rq.zoom, false),
chromedp.Location(&rq.url),
chromedp.ComputedStyle("body", &styles, chromedp.ByQuery),
chromedp.ActionFunc(func(ctx context.Context) error {
_, _, s, err := page.GetLayoutMetrics().Do(ctx)
_, _, _, _, _, s, err := page.GetLayoutMetrics().Do(ctx)
if err == nil {
h = int64(math.Ceil(s.Height))
}
@@ -270,123 +320,288 @@ func (w wrpReq) capture(c string, out http.ResponseWriter) {
fmt.Sscanf(style.Value, "rgb(%d,%d,%d)", &r, &g, &b)
}
}
log.Printf("%s Landed on: %s, Height: %v\n", c, w.U, h)
w.printPage(out, fmt.Sprintf("#%02X%02X%02X", r, g, b))
if w.H == 0 && h > 0 {
chromedp.Run(ctx, emulation.SetDeviceMetricsOverride(int64(float64(w.W)/w.S), h+30, w.S, false))
} else {
chromedp.Run(ctx, emulation.SetDeviceMetricsOverride(int64(float64(w.W)/w.S), int64(float64(w.H)/w.S), w.S, false))
}
err = chromedp.Run(ctx, chromedp.CaptureScreenshot(&pngcap))
if err != nil {
log.Printf("%s Failed to capture screenshot: %s\n", c, err)
fmt.Fprintf(out, "<BR>Unable to capture screenshot:<BR>%s<BR>\n", err)
return
log.Printf("%s Landed on: %s, Height: %v\n", rq.r.RemoteAddr, rq.url, h)
height := int64(float64(rq.height) / rq.zoom)
if rq.height == 0 && h > 0 {
height = h + 30
}
chromedp.Run(
ctx, emulation.SetDeviceMetricsOverride(int64(float64(rq.width)/rq.zoom), height, rq.zoom, false),
chromedp.Sleep(*delay), // TODO(tenox): find a better way to determine if page is rendered
)
// Capture screenshot...
ctxErr(chromedp.Run(ctx, chromedpCaptureScreenshot(&pngCap, rq.height)), rq.w)
seq := rand.Intn(9999)
imgpath := fmt.Sprintf("/img/%04d.%s", seq, w.T)
mappath := fmt.Sprintf("/map/%04d.map", seq)
ismap[mappath] = w
var ssize string
var sw, sh int
if w.T == "gif" {
i, err := png.Decode(bytes.NewReader(pngcap))
imgPath := fmt.Sprintf("/img/%04d.%s", seq, rq.imgType)
mapPath := fmt.Sprintf("/map/%04d.map", seq)
ismap[mapPath] = *rq
var sSize string
var iW, iH int
switch rq.imgType {
case "png":
pngBuf := bytes.NewBuffer(pngCap)
img[imgPath] = *pngBuf
cfg, _, _ := image.DecodeConfig(pngBuf)
sSize = fmt.Sprintf("%.0f KB", float32(len(pngBuf.Bytes()))/1024.0)
iW = cfg.Width
iH = cfg.Height
log.Printf("%s Got PNG image: %s, Size: %s, Res: %dx%d\n", rq.r.RemoteAddr, imgPath, sSize, iW, iH)
case "gif":
i, err := png.Decode(bytes.NewReader(pngCap))
if err != nil {
log.Printf("%s Failed to decode screenshot: %s\n", c, err)
fmt.Fprintf(out, "<BR>Unable to decode page screenshot:<BR>%s<BR>\n", err)
log.Printf("%s Failed to decode PNG screenshot: %s\n", rq.r.RemoteAddr, err)
fmt.Fprintf(rq.w, "<BR>Unable to decode page PNG screenshot:<BR>%s<BR>\n", err)
return
}
var gifbuf bytes.Buffer
err = gif.Encode(&gifbuf, i, &gif.Options{NumColors: int(w.C), Quantizer: quantize.MedianCutQuantizer{}})
st := time.Now()
var gifBuf bytes.Buffer
err = gif.Encode(&gifBuf, gifPalette(i, rq.colors), &gif.Options{})
if err != nil {
log.Printf("%s Failed to encode GIF: %s\n", c, err)
fmt.Fprintf(out, "<BR>Unable to encode GIF:<BR>%s<BR>\n", err)
log.Printf("%s Failed to encode GIF: %s\n", rq.r.RemoteAddr, err)
fmt.Fprintf(rq.w, "<BR>Unable to encode GIF:<BR>%s<BR>\n", err)
return
}
img[imgpath] = gifbuf
ssize = fmt.Sprintf("%.1f MB", float32(len(gifbuf.Bytes()))/1024.0/1024.0)
sw = i.Bounds().Max.X
sh = i.Bounds().Max.Y
log.Printf("%s Encoded GIF image: %s, Size: %s, Colors: %d, %dx%d\n", c, imgpath, ssize, w.C, sw, sh)
} else if w.T == "png" {
pngbuf := bytes.NewBuffer(pngcap)
img[imgpath] = *pngbuf
cfg, _, _ := image.DecodeConfig(pngbuf)
ssize = fmt.Sprintf("%.1f MB", float32(len(pngbuf.Bytes()))/1024.0/1024.0)
sw = cfg.Width
sh = cfg.Height
log.Printf("%s Got PNG image: %s, Size: %s, %dx%d\n", c, imgpath, ssize, sw, sh)
img[imgPath] = gifBuf
sSize = fmt.Sprintf("%.0f KB", float32(len(gifBuf.Bytes()))/1024.0)
iW = i.Bounds().Max.X
iH = i.Bounds().Max.Y
log.Printf("%s Encoded GIF image: %s, Size: %s, Colors: %d, Res: %dx%d, Time: %vms\n", rq.r.RemoteAddr, imgPath, sSize, rq.colors, iW, iH, time.Since(st).Milliseconds())
case "jpg":
i, err := png.Decode(bytes.NewReader(pngCap))
if err != nil {
log.Printf("%s Failed to decode PNG screenshot: %s\n", rq.r.RemoteAddr, err)
fmt.Fprintf(rq.w, "<BR>Unable to decode page PNG screenshot:<BR>%s<BR>\n", err)
return
}
st := time.Now()
var jpgBuf bytes.Buffer
err = jpeg.Encode(&jpgBuf, i, &jpeg.Options{Quality: *jpgQual})
if err != nil {
log.Printf("%s Failed to encode JPG: %s\n", rq.r.RemoteAddr, err)
fmt.Fprintf(rq.w, "<BR>Unable to encode JPG:<BR>%s<BR>\n", err)
return
}
img[imgPath] = jpgBuf
sSize = fmt.Sprintf("%.0f KB", float32(len(jpgBuf.Bytes()))/1024.0)
iW = i.Bounds().Max.X
iH = i.Bounds().Max.Y
log.Printf("%s Encoded JPG image: %s, Size: %s, Quality: %d, Res: %dx%d, Time: %vms\n", rq.r.RemoteAddr, imgPath, sSize, *jpgQual, iW, iH, time.Since(st).Milliseconds())
}
fmt.Fprintf(out, "<A HREF=\"%s\"><IMG SRC=\"%s\" BORDER=\"0\" ALT=\"Url: %s, Size: %s\" WIDTH=\"%d\" HEIGHT=\"%d\" ISMAP></A>", mappath, imgpath, w.U, ssize, sw, sh)
w.printFooter(out, fmt.Sprintf("%d PX", h), ssize)
log.Printf("%s Done with caputure for %s\n", c, w.U)
rq.printHTML(printParams{
bgColor: fmt.Sprintf("#%02X%02X%02X", r, g, b),
pageHeight: fmt.Sprintf("%d PX", h),
imgSize: sSize,
imgURL: imgPath,
mapURL: mapPath,
imgWidth: iW,
imgHeight: iH,
})
log.Printf("%s Done with capture for %s\n", rq.r.RemoteAddr, rq.url)
}
func haltServer(out http.ResponseWriter, req *http.Request) {
log.Printf("%s Shutdown Request for %s\n", req.RemoteAddr, req.URL.Path)
out.Header().Set("Cache-Control", "max-age=0")
out.Header().Set("Expires", "-1")
out.Header().Set("Pragma", "no-cache")
out.Header().Set("Content-Type", "text/plain")
fmt.Fprintf(out, "Shutting down WRP...\n")
out.(http.Flusher).Flush()
// Process HTTP requests to WRP '/' url
func pageServer(w http.ResponseWriter, r *http.Request) {
log.Printf("%s Page Request for %s [%+v]\n", r.RemoteAddr, r.URL.Path, r.URL.RawQuery)
rq := wrpReq{
r: r,
w: w,
}
rq.parseForm()
if len(rq.url) < 4 {
rq.printHTML(printParams{bgColor: "#FFFFFF"})
return
}
rq.navigate() // TODO: if error from navigate do not capture
rq.capture()
}
// Process HTTP requests to ISMAP '/map/' url
func mapServer(w http.ResponseWriter, r *http.Request) {
log.Printf("%s ISMAP Request for %s [%+v]\n", r.RemoteAddr, r.URL.Path, r.URL.RawQuery)
rq, ok := ismap[r.URL.Path]
rq.r = r
rq.w = w
if !ok {
fmt.Fprintf(w, "Unable to find map %s\n", r.URL.Path)
log.Printf("Unable to find map %s\n", r.URL.Path)
return
}
if !*noDel {
defer delete(ismap, r.URL.Path)
}
n, err := fmt.Sscanf(r.URL.RawQuery, "%d,%d", &rq.mouseX, &rq.mouseY)
if err != nil || n != 2 {
fmt.Fprintf(w, "n=%d, err=%s\n", n, err)
log.Printf("%s ISMAP n=%d, err=%s\n", r.RemoteAddr, n, err)
return
}
log.Printf("%s WrpReq from ISMAP: %+v\n", r.RemoteAddr, rq)
if len(rq.url) < 4 {
rq.printHTML(printParams{bgColor: "#FFFFFF"})
return
}
rq.navigate() // TODO: if error from navigate do not capture
rq.capture()
}
// Process HTTP requests for images '/img/' url
func imgServer(w http.ResponseWriter, r *http.Request) {
log.Printf("%s IMG Request for %s\n", r.RemoteAddr, r.URL.Path)
imgBuf, ok := img[r.URL.Path]
if !ok || imgBuf.Bytes() == nil {
fmt.Fprintf(w, "Unable to find image %s\n", r.URL.Path)
log.Printf("%s Unable to find image %s\n", r.RemoteAddr, r.URL.Path)
return
}
if !*noDel {
defer delete(img, r.URL.Path)
}
switch {
case strings.HasSuffix(r.URL.Path, ".gif"):
w.Header().Set("Content-Type", "image/gif")
case strings.HasSuffix(r.URL.Path, ".png"):
w.Header().Set("Content-Type", "image/png")
case strings.HasSuffix(r.URL.Path, ".jpg"):
w.Header().Set("Content-Type", "image/jpeg")
}
w.Header().Set("Content-Length", strconv.Itoa(len(imgBuf.Bytes())))
w.Header().Set("Cache-Control", "max-age=0")
w.Header().Set("Expires", "-1")
w.Header().Set("Pragma", "no-cache")
w.Write(imgBuf.Bytes())
w.(http.Flusher).Flush()
}
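
Image paths have the form /img/NNNN.ext, so the file type is carried by the end of the path. As a sketch of an alternative to chained string checks, the extension can be extracted with path.Ext and mapped to a MIME type. The helper contentType is hypothetical and not part of WRP.

```go
package main

import (
	"fmt"
	"path"
)

// contentType is a hypothetical helper that maps an image path such as
// "/img/0042.gif" to a Content-Type value. It returns an empty string for
// extensions it does not know.
func contentType(p string) string {
	switch path.Ext(p) {
	case ".gif":
		return "image/gif"
	case ".png":
		return "image/png"
	case ".jpg":
		return "image/jpeg"
	}
	return ""
}

func main() {
	for _, p := range []string{"/img/0042.gif", "/img/0042.png", "/img/0042.jpg", "/img/0042.map"} {
		fmt.Printf("%s -> %q\n", p, contentType(p))
	}
}
```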
// Process HTTP requests for Shutdown via '/shutdown/' url
func haltServer(w http.ResponseWriter, r *http.Request) {
log.Printf("%s Shutdown Request for %s\n", r.RemoteAddr, r.URL.Path)
w.Header().Set("Cache-Control", "max-age=0")
w.Header().Set("Expires", "-1")
w.Header().Set("Pragma", "no-cache")
w.Header().Set("Content-Type", "text/plain")
fmt.Fprintf(w, "Shutting down WRP...\n")
w.(http.Flusher).Flush()
time.Sleep(time.Second * 2)
cancel()
cncl()
acncl()
srv.Shutdown(context.Background())
os.Exit(1)
}
func main() {
var addr, fgeom string
var head, headless bool
var debug bool
var err error
flag.StringVar(&addr, "l", ":8080", "Listen address:port, default :8080")
flag.BoolVar(&head, "h", false, "Headed mode - display browser window")
flag.BoolVar(&debug, "d", false, "Debug ChromeDP")
flag.BoolVar(&nodel, "n", false, "Do not free maps and images after use")
flag.StringVar(&deftype, "t", "gif", "Image type: gif|png")
flag.StringVar(&fgeom, "g", "1152x600x256", "Geometry: width x height x colors, height can be 0 for unlimited")
flag.Parse()
if head {
headless = false
} else {
headless = true
// returns html template, either from html file or built-in
func tmpl(t string) string {
var tmpl []byte
fh, err := os.Open(t)
if err != nil {
goto builtin
}
n, err := fmt.Sscanf(fgeom, "%dx%dx%d", &defgeom.w, &defgeom.h, &defgeom.c)
defer fh.Close()
tmpl, err = ioutil.ReadAll(fh)
if err != nil {
goto builtin
}
log.Printf("Got HTML UI template from %v file, size %v \n", t, len(tmpl))
return string(tmpl)
builtin:
fhs, err := fs.Open("wrp.html")
if err != nil {
log.Fatal(err)
}
defer fhs.Close()
tmpl, err = ioutil.ReadAll(fhs)
if err != nil {
log.Fatal(err)
}
log.Printf("Got HTML UI template from embed\n")
return string(tmpl)
}
// Print my own IP addresses
func printIPs(b string) {
ap := strings.Split(b, ":")
if len(ap) < 1 {
log.Fatal("Wrong format of ipaddress:port")
}
log.Printf("Listen address: %v", b)
if ap[0] != "" && ap[0] != "0.0.0.0" {
return
}
a, err := net.InterfaceAddrs()
if err != nil {
log.Print("Unable to get interfaces: ", err)
return
}
var m string
for _, i := range a {
n, ok := i.(*net.IPNet)
if !ok || n.IP.IsLoopback() || strings.Contains(n.IP.String(), ":") {
continue
}
m = m + n.IP.String() + " "
}
log.Print("My IP addresses: ", m)
}
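
printIPs splits the listen address on ":" to decide whether the server binds to all interfaces. A sketch of the same check using net.SplitHostPort, which also handles bracketed IPv6 addresses, is shown below. The helper listensOnAll is hypothetical and not part of WRP.

```go
package main

import (
	"fmt"
	"net"
)

// listensOnAll is a hypothetical helper. It reports whether a listen address
// such as ":8080" or "0.0.0.0:8080" binds to all interfaces. An address
// without a port is reported as an error.
func listensOnAll(addr string) (bool, error) {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return false, err
	}
	if host == "" {
		return true, nil
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsUnspecified(), nil
}

func main() {
	for _, a := range []string{":8080", "0.0.0.0:8080", "127.0.0.1:8080", "[::]:8080"} {
		all, err := listensOnAll(a)
		fmt.Println(a, all, err)
	}
}
```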
// Main
func main() {
var err error
flag.Parse()
log.Printf("Web Rendering Proxy Version %s\n", version)
log.Printf("Args: %q", os.Args)
if len(os.Getenv("PORT")) > 0 {
*addr = ":" + os.Getenv("PORT")
}
printIPs(*addr)
n, err := fmt.Sscanf(*fgeom, "%dx%dx%d", &defGeom.w, &defGeom.h, &defGeom.c)
if err != nil || n != 3 {
log.Fatalf("Unable to parse -g geometry flag / %s", err)
}
opts := append(chromedp.DefaultExecAllocatorOptions[:],
chromedp.Flag("headless", headless),
chromedp.Flag("headless", *headless),
chromedp.Flag("hide-scrollbars", false),
chromedp.Flag("enable-automation", false),
chromedp.Flag("disable-blink-features", "AutomationControlled"),
)
actx, acancel := chromedp.NewExecAllocator(context.Background(), opts...)
defer acancel()
if debug {
ctx, cancel = chromedp.NewContext(actx, chromedp.WithDebugf(log.Printf))
} else {
ctx, cancel = chromedp.NewContext(actx)
if *userAgent != "" {
opts = append(opts, chromedp.UserAgent(*userAgent))
}
defer cancel()
actx, acncl = chromedp.NewExecAllocator(context.Background(), opts...)
defer acncl()
ctx, cncl = chromedp.NewContext(actx)
defer cncl()
rand.Seed(time.Now().UnixNano())
c := make(chan os.Signal, 1) // buffered: signal.Notify does not block when sending
signal.Notify(c, os.Interrupt, syscall.SIGTERM)
go func() {
<-c
log.Printf("Interrupt - shutting down.")
cancel()
cncl()
acncl()
srv.Shutdown(context.Background())
os.Exit(1)
}()
http.HandleFunc("/", pageServer)
http.HandleFunc("/map/", mapServer)
http.HandleFunc("/img/", imgServer)
http.HandleFunc("/shutdown/", haltServer)
http.HandleFunc("/favicon.ico", http.NotFound)
log.Printf("Web Rendering Proxy Version %s\n", version)
log.Printf("Starting WRP http server on %s\n", addr)
srv.Addr = addr
log.Printf("Default Img Type: %v, Geometry: %+v", *defType, defGeom)
htmlTmpl, err = template.New("wrp.html").Parse(tmpl(*htmFnam))
if err != nil {
log.Fatal(err)
}
log.Print("Starting WRP http server")
srv.Addr = *addr
err = srv.ListenAndServe()
if err != nil {
log.Fatal(err)

58
wrp.html Normal file

@@ -0,0 +1,58 @@
<HTML>
<HEAD>
<TITLE>WRP {{.URL}}</TITLE>
</HEAD>
<BODY BGCOLOR="{{.BgColor}}">
<FORM ACTION="/" METHOD="POST">
<INPUT TYPE="TEXT" NAME="url" VALUE="{{.URL}}" SIZE="20">
<INPUT TYPE="SUBMIT" VALUE="Go">
<INPUT TYPE="SUBMIT" NAME="Fn" VALUE="Bk">
<INPUT TYPE="SUBMIT" NAME="Fn" VALUE="St">
<INPUT TYPE="SUBMIT" NAME="Fn" VALUE="Re">
W <INPUT TYPE="TEXT" NAME="w" VALUE="{{.Width}}" SIZE="4">
H <INPUT TYPE="TEXT" NAME="h" VALUE="{{.Height}}" SIZE="4">
Z <SELECT NAME="z">
<OPTION VALUE="0.7" {{ if eq .Zoom 0.7}}SELECTED{{end}}>0.7 x</OPTION>
<OPTION VALUE="0.8" {{ if eq .Zoom 0.8}}SELECTED{{end}}>0.8 x</OPTION>
<OPTION VALUE="0.9" {{ if eq .Zoom 0.9}}SELECTED{{end}}>0.9 x</OPTION>
<OPTION VALUE="1.0" {{ if eq .Zoom 1.0}}SELECTED{{end}}>1.0 x</OPTION>
<OPTION VALUE="1.1" {{ if eq .Zoom 1.1}}SELECTED{{end}}>1.1 x</OPTION>
<OPTION VALUE="1.2" {{ if eq .Zoom 1.2}}SELECTED{{end}}>1.2 x</OPTION>
<OPTION VALUE="1.3" {{ if eq .Zoom 1.3}}SELECTED{{end}}>1.3 x</OPTION>
</SELECT>
T <SELECT NAME="t">
<OPTION VALUE="png" {{ if eq .ImgType "png"}}SELECTED{{end}}>PNG</OPTION>
<OPTION VALUE="gif" {{ if eq .ImgType "gif"}}SELECTED{{end}}>GIF</OPTION>
<OPTION VALUE="jpg" {{ if eq .ImgType "jpg"}}SELECTED{{end}}>JPG</OPTION>
</SELECT>
C <SELECT NAME="c">
<OPTION VALUE="256" {{ if eq .NColors 256}}SELECTED{{end}}>256</OPTION>
<OPTION VALUE="216" {{ if eq .NColors 216}}SELECTED{{end}}>216</OPTION>
<OPTION VALUE="128" {{ if eq .NColors 128}}SELECTED{{end}}>128</OPTION>
<OPTION VALUE="64" {{ if eq .NColors 64}}SELECTED{{end}}>64</OPTION>
<OPTION VALUE="16" {{ if eq .NColors 16}}SELECTED{{end}}>16</OPTION>
<OPTION VALUE="2" {{ if eq .NColors 2}}SELECTED{{end}}>2</OPTION>
</SELECT>
K <INPUT TYPE="TEXT" NAME="k" VALUE="" SIZE="4">
<INPUT TYPE="SUBMIT" NAME="Fn" VALUE="Bs">
<INPUT TYPE="SUBMIT" NAME="Fn" VALUE="Rt"><!--
<INPUT TYPE="SUBMIT" NAME="Fn" VALUE="&lt;">
<INPUT TYPE="SUBMIT" NAME="Fn" VALUE="^">
<INPUT TYPE="SUBMIT" NAME="Fn" VALUE="v">
<INPUT TYPE="SUBMIT" NAME="Fn" VALUE="&gt;" SIZE="1">-->
</FORM>
<BR>
{{if .ImgURL}}
<A HREF="{{.MapURL}}">
<IMG SRC="{{.ImgURL}}" BORDER="0" ALT="Url: {{.URL}}, Size: {{.ImgSize}} PageHeight: {{.PageHeight}}" WIDTH="{{.ImgWidth}}" HEIGHT="{{.ImgHeight}}" ISMAP>
</A>
<P>
{{end}}
<FONT SIZE="-2">
<A HREF="/?url=https://github.com/tenox7/wrp/&w={{.Width}}&h={{.Height}}&z={{printf "%.1f" .Zoom}}&c={{.NColors}}&t={{.ImgType}}">Web Rendering Proxy {{.Version}}</A> |
<A HREF="/shutdown/">Shutdown WRP</A> |
<A HREF="/">Page Height: {{.PageHeight}}</A> |
<A HREF="/">Img Size: {{.ImgSize}}</A>
</FONT>
</BODY>
</HTML>