API Reference WebPage
This is a living document. As the codebase is updated, we hope to keep this document updated as well. Unless otherwise stated, this document currently applies to the latest PhantomJS release: PhantomJS 1.8.0
Note: This page serves as a reference. To learn step-by-step on how to use PhantomJS, please refer to the Quick Start guide.
## Module: WebPage ##

A `WebPage` object encapsulates a web page. It is usually instantiated using the following pattern:

```js
var page = require('webpage').create();
```

Note: For backward compatibility with legacy PhantomJS applications, the constructor also remains exposed as a deprecated global `WebPage` object:
var page = new WebPage();
clipRect
canGoBack
canGoForward
content
cookies
customHeaders
event
focusedFrameName
frameContent
frameName
framePlainText
frameTitle
frameUrl
framesCount
framesName
libraryPath
navigationLocked
offlineStoragePath
offlineStorageQuota
ownsPages
pages
pagesWindowName
paperSize
plainText
scrollPosition
settings
title
url
viewportSize
windowName
zoomFactor
addCookie()
childFramesCount()
childFramesName()
clearCookies()
close()
currentFrameName()
deleteCookie()
evaluateJavaScript()
evaluate()
evaluateAsync()
getPage()
go()
goBack()
goForward()
includeJs()
injectJs()
open()
openUrl()
release()
reload()
render()
renderBase64()
sendEvent()
setContent()
stop()
switchToFocusedFrame()
switchToFrame()
switchToChildFrame()
switchToMainFrame()
switchToParentFrame()
uploadFile()
onAlert
onCallback
onClosing
onConfirm
onConsoleMessage
onError
onFilePicker
onInitialized
onLoadFinished
onLoadStarted
onNavigationRequested
onPageCreated
onPrompt
onResourceRequested
onResourceReceived
onResourceTimeout
onResourceError
onUrlChanged
Internal methods to trigger callbacks:
closing()
initialized()
javaScriptAlertSent()
javaScriptConsoleMessageSent()
loadFinished()
loadStarted()
navigationRequested()
rawPageCreated()
resourceReceived()
resourceRequested()
urlChanged()
Example:
page.clipRect = { top: 14, left: 3, width: 400, height: 300 };
See also `plainText` to get the content without any HTML tags.
Example:
// Send two additional headers 'X-Test' and 'DNT'.
page.customHeaders = {
    'X-Test': 'foo',
    'DNT': '1'
};
Do you only want these `customHeaders` passed to the initial `WebPage#open` request? Here's the recommended workaround:
// Send two additional headers 'X-Test' and 'DNT'.
page.customHeaders = {
    'X-Test': 'foo',
    'DNT': '1'
};
page.onInitialized = function() {
    page.customHeaders = {};
};
The given object should be in one of the following two formats:
{ width: '200px', height: '300px', border: '0px' }
{ format: 'A4', orientation: 'portrait', border: '1cm' }
If no `paperSize` is defined, the size is defined by the web page. Supported dimension units are: `'mm'`, `'cm'`, `'in'`, `'px'`. No unit means `'px'`. Border is optional and defaults to `0`. A non-uniform border can be specified in the form `{left: '2cm', top: '2cm', right: '2cm', bottom: '3cm'}`. Supported formats are: `'A3'`, `'A4'`, `'A5'`, `'Legal'`, `'Letter'`, `'Tabloid'`. Orientation (`'portrait'`, `'landscape'`) is optional and defaults to `'portrait'`.
Example:
page.paperSize = { width: '5in', height: '7in', border: '20px' };
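The rules above can be collected into a small helper. The following is a minimal sketch, not part of the PhantomJS API: `buildPaperSize` is a hypothetical name, and it only checks the format and orientation values listed in this section before returning an object suitable for `page.paperSize`.

```js
// Hypothetical helper (not part of PhantomJS): builds a format-based
// paperSize object and rejects values not listed in this reference.
var SUPPORTED_FORMATS = ['A3', 'A4', 'A5', 'Legal', 'Letter', 'Tabloid'];
var SUPPORTED_ORIENTATIONS = ['portrait', 'landscape'];

function buildPaperSize(format, orientation, border) {
    if (SUPPORTED_FORMATS.indexOf(format) === -1) {
        throw new Error('Unsupported format: ' + format);
    }
    // Orientation is optional and defaults to 'portrait'.
    var chosenOrientation = orientation || 'portrait';
    if (SUPPORTED_ORIENTATIONS.indexOf(chosenOrientation) === -1) {
        throw new Error('Unsupported orientation: ' + chosenOrientation);
    }
    var size = { format: format, orientation: chosenOrientation };
    // Border is optional; it may be a string such as '1cm' or an object
    // with left, top, right and bottom keys for a non-uniform border.
    if (border !== undefined) {
        size.border = border;
    }
    return size;
}

// Usage inside a PhantomJS script:
// page.paperSize = buildPaperSize('A4', 'landscape',
//     { left: '2cm', top: '2cm', right: '2cm', bottom: '3cm' });
```

Rejecting unknown values early is a design choice of this sketch; how PhantomJS itself treats an unsupported format is not covered by this reference.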
A repeating page header and footer can also be added via this property, as in this example:
page.paperSize = {
    format: 'A4',
    // ...
    header: {
        height: "1cm",
        contents: phantom.callback(function(pageNum, numPages) {
            if (pageNum == 1) {
                return "";
            }
            return "<h1>Header <span style='float:right'>" + pageNum + " / " + numPages + "</span></h1>";
        })
    }
};
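The example above only shows a header. A footer takes the same shape under a `footer` key; the sketch below is illustrative only, and `footerHtml` and `addFooter` are hypothetical names. The `wrap` parameter stands in for `phantom.callback` so that the helper can also be exercised outside PhantomJS.

```js
// Hypothetical helper: returns the HTML for a page-number footer.
function footerHtml(pageNum, numPages) {
    return "<p style='text-align:center'>Page " + pageNum + " of " + numPages + "</p>";
}

// Hypothetical helper: copies a paperSize object and adds a footer to it.
// `wrap` is expected to delegate to phantom.callback inside PhantomJS.
function addFooter(paperSize, height, wrap) {
    var result = {};
    Object.keys(paperSize).forEach(function(key) {
        result[key] = paperSize[key];
    });
    result.footer = { height: height, contents: wrap(footerHtml) };
    return result;
}

// Usage inside a PhantomJS script:
// page.paperSize = addFooter({ format: 'A4' }, '1cm', function(fn) {
//     return phantom.callback(fn);
// });
```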
See also: `content`, which returns the content with element tags.
Example:
page.scrollPosition = { top: 100, left: 0 };
Note: The `settings` apply only during the initial call to the `WebPage#open` function. Subsequent modification of the `settings` object will not have any impact.
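Because of this, settings are best assigned before `open` is called. A minimal sketch, assuming a page object created with `require('webpage').create()`; `openWithSettings` is a hypothetical helper name, and `loadImages` appears in the usage note only as an example settings key.

```js
// Hypothetical helper: applies settings first, then opens the URL.
function openWithSettings(page, url, overrides, callback) {
    Object.keys(overrides).forEach(function(key) {
        page.settings[key] = overrides[key];
    });
    // The settings above are in place before the initial open() call.
    page.open(url, callback);
}

// Usage inside a PhantomJS script:
// var page = require('webpage').create();
// openWithSettings(page, 'http://example.com/', { loadImages: false }, function(status) {
//     console.log('Status: ' + status);
//     phantom.exit();
// });
```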
Because PhantomJS is headless (nothing is shown), `viewportSize` effectively simulates the size of the window like in a traditional browser.
Example:
page.viewportSize = { width: 480, height: 800 };
Example:
// Create a thumbnail preview with 25% zoom
page.zoomFactor = 0.25;
page.render('capture.png');
Example:
page.addCookie({
    'name'  : 'Added-Cookie-Name',
    'value' : 'Added-Cookie-Value',
    'domain': 'Added-Cookie-Domain'
});
#### `childFramesCount()` ####

Deprecated.
#### `childFramesName()` ####

Deprecated.

#### `clearCookies()` {void} ####

**Introduced:** PhantomJS 1.7

Delete all [Cookies](#cookie) visible to the current URL.

#### `close()` {void} ####

**Introduced:** PhantomJS 1.7

Closes the page and releases the memory heap associated with it. Do not use the page instance after calling this.

Due to some technical limitations, the web page object might not be completely garbage collected. This is often encountered when the same object is used over and over again. Calling this function may stop the increasing heap allocation.
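One way to apply this advice is to create a fresh page per task and close it when the task finishes, rather than reusing a single instance. A sketch under that assumption; `withPage` is a hypothetical helper, and `createPage` stands in for a function that returns `require('webpage').create()`.

```js
// Hypothetical helper: opens a URL in a fresh page and always closes it.
function withPage(createPage, url, work) {
    var page = createPage();
    page.open(url, function(status) {
        try {
            work(page, status);
        } finally {
            // Do not use `page` after this call.
            page.close();
        }
    });
}

// Usage inside a PhantomJS script:
// withPage(function() { return require('webpage').create(); },
//     'http://example.com/', function(page, status) {
//         console.log(status + ': ' + page.title);
//     });
```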
#### `currentFrameName()` ####

Deprecated.

#### `deleteCookie(cookieName)` {boolean} ####

**Introduced:** PhantomJS 1.7

Delete any [Cookies](#cookie) visible to the current URL with a 'name' property matching `cookieName`. Returns `true` if successfully deleted, otherwise `false`.

Example:
page.deleteCookie('Added-Cookie-Name');
Example:
var page = require('webpage').create();
page.open('http://m.bing.com', function(status) {
    var title = page.evaluate(function() {
        return document.title;
    });
    console.log(title);
    phantom.exit();
});
As of PhantomJS 1.6, JSON-serializable arguments can be passed to the function. The following example achieves the same end goal as the previous one, but the element whose text is extracted is chosen by a selector that is passed to the `evaluate` call:
var page = require('webpage').create();
page.open('http://m.bing.com', function(status) {
    var title = page.evaluate(function(s) {
        return document.querySelector(s).innerText;
    }, 'title');
    console.log(title);
    phantom.exit();
});
Note: The arguments and the return value to the `evaluate` function must be a simple primitive object. The rule of thumb: if it can be serialized via JSON, then it is fine. Closures, functions, DOM nodes, etc. will not work!
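The rule of thumb can be tried outside PhantomJS with a JSON round trip. This is only an approximation of what crosses the boundary between the script and the page, and `jsonRoundTrip` is a hypothetical helper name.

```js
// Hypothetical helper: approximates what survives the trip into or out of
// evaluate() by serializing to JSON and parsing the result back.
function jsonRoundTrip(value) {
    var text = JSON.stringify(value);
    // JSON.stringify returns undefined for functions and for undefined.
    return text === undefined ? undefined : JSON.parse(text);
}

var survived = jsonRoundTrip({
    selector: 'title',
    limit: 3,
    onDone: function() {}   // function-valued properties are dropped
});
console.log(JSON.stringify(survived)); // prints {"selector":"title","limit":3}
```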
Example:
page.includeJs('http://ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js', function() {
    /* jQuery is loaded, now manipulate the DOM */
});
Example:
page.open('http://www.google.com/', function(status) {
    console.log('Status: ' + status);
    // Do other things here...
});
As of PhantomJS 1.2, the open function can be used to request a URL with methods other than GET. This syntax also includes the ability to specify data to be sent with the request. In the following example, we make a request using the POST method, and include some basic data.
Example:
var data = 'user=username&password=password';
page.open('http://www.google.com/', 'POST', data, function(status) {
    console.log('Status: ' + status);
    // Do other things here...
});
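The `data` string above is written by hand in the usual `key=value&key=value` form, so values containing characters such as `&` or spaces would need escaping. A small sketch of building such a string; `toFormData` is a hypothetical helper, not a PhantomJS function.

```js
// Hypothetical helper: encodes a flat object as a form-style body string.
function toFormData(params) {
    return Object.keys(params).map(function(key) {
        return encodeURIComponent(key) + '=' + encodeURIComponent(params[key]);
    }).join('&');
}

// Usage inside a PhantomJS script:
// var data = toFormData({ user: 'username', password: 'pass word&more' });
// page.open('http://www.google.com/', 'POST', data, function(status) {
//     console.log('Status: ' + status);
// });
```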
Due to some technical limitations, the web page object might not be completely garbage collected. This is often encountered when the same object is used over and over again. Calling this function may stop the increasing heap allocation.
#### `reload()` ####

#### `render(filename, options)` {void} ####

Renders the web page to an image buffer and saves it as the specified `filename`. The special files `/dev/stdout` and `/dev/stderr` can be used here. The `options` hash is optional, and may contain these options: `format` and `quality`.

// render to file named "test.jpg" with JPEG format
page.render("test.jpg");
// render to file named "test.jpg" with PNG format. format option will override format of file extension.
page.render("test.jpg", { format: "png" });
// render to "test.jpg" with JPEG format and 50 quality
page.render("test.jpg", { quality: 50 });
// render to "test.jpg" with JPEG format and 50 quality
page.render("test.jpg", { format: "jpg", quality: 50 });
// render to stdout with PNG format. PNG is default for stdout.
page.render("/dev/stdout");
// render to stdout with JPEG format.
page.render("/dev/stdout", { format: "jpg" });
// render to stdout with JPEG format and 50 quality.
page.render("/dev/stdout", { format: "jpg", quality: 50 });
The output format is determined automatically by the file extension. Supported formats include:
- PNG
- GIF
- JPEG
- And any other formats available in the QImage class.
Supported formats for `renderBase64()` include:
- PNG
- GIF
- JPEG
The events are not like synthetic DOM events. Each event is sent to the web page as if it comes as part of user interaction.
The first argument is the event type. Supported types are `'mouseup'`, `'mousedown'`, `'mousemove'`, `'doubleclick'` and `'click'`. The next two arguments are optional but represent the mouse position for the event.
The button parameter (defaults to `left`) specifies the button to push.
For `'mousemove'`, however, there is no button pressed (i.e. it is not dragging).
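For illustration, a click at a given position could be sent as follows. This is a sketch: `clickAt` is a hypothetical helper, and the coordinates are assumed to be in whatever coordinate space `sendEvent` expects.

```js
// Hypothetical helper: moves the mouse to (x, y), then clicks there.
function clickAt(page, x, y, button) {
    // 'mousemove' never presses a button, so no button argument is passed.
    page.sendEvent('mousemove', x, y);
    // Fall back to the documented default button when none is given.
    page.sendEvent('click', x, y, button || 'left');
}

// Usage inside a PhantomJS script:
// clickAt(page, 120, 45);
```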
The first argument is the event type. The supported types are: `keyup`, `keypress` and `keydown`. The second parameter is a key (from `page.event.key`), or a string.
You can also indicate a fifth argument, which is an integer indicating the modifier key.
- 0: No modifier key is pressed
- 0x02000000: A Shift key on the keyboard is pressed
- 0x04000000: A Ctrl key on the keyboard is pressed
- 0x08000000: An Alt key on the keyboard is pressed
- 0x10000000: A Meta key on the keyboard is pressed
- 0x20000000: A keypad button is pressed
The third and fourth arguments are not taken into account for keyboard events. Just pass `null` for them.
Example:
page.sendEvent('keypress', page.event.key.A, null, null, 0x02000000 | 0x08000000 );
This simulates a Shift+Alt+A keyboard combination.
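Modifier values are combined with a bitwise OR, which is easier to read with named constants. A minimal sketch using the values listed in this section; `MODIFIERS` and `modifierMask` are hypothetical names, not part of the PhantomJS API.

```js
// Modifier values as listed in this reference.
var MODIFIERS = {
    shift: 0x02000000,
    ctrl: 0x04000000,
    alt: 0x08000000,
    meta: 0x10000000,
    keypad: 0x20000000
};

// Hypothetical helper: combines modifier names into a single bitmask.
function modifierMask(names) {
    return names.reduce(function(mask, name) {
        if (!MODIFIERS.hasOwnProperty(name)) {
            throw new Error('Unknown modifier: ' + name);
        }
        return mask | MODIFIERS[name];
    }, 0);
}

// Usage inside a PhantomJS script:
// page.sendEvent('keypress', page.event.key.A, null, null, modifierMask(['shift', 'alt']));
```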
#### `setContent(content, url)` ####

**Introduced:** PhantomJS 1.8

Sets both the `WebPage#content` and `WebPage#url` properties.

The webpage will be reloaded with the new content and the current location set as the given url, without any actual http request being made.
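A minimal sketch of how this might be used, for example to load a fixture in a test; `loadFixture` is a hypothetical helper and `http://example.com/` is a placeholder URL.

```js
// Hypothetical helper: loads literal HTML into a page under a given URL.
function loadFixture(page, bodyHtml, url) {
    var html = '<html><head><title>Fixture</title></head><body>' +
        bodyHtml + '</body></html>';
    // No HTTP request is made; the page reports `url` as its location.
    page.setContent(html, url);
    return html;
}

// Usage inside a PhantomJS script:
// loadFixture(page, '<p id="greeting">Hello</p>', 'http://example.com/');
```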
#### `stop()` ####

#### `switchToFocusedFrame()` ####

#### `switchToFrame(frameName)` or `switchToFrame(framePosition)` ####

#### `switchToChildFrame(frameName)` or `switchToChildFrame(framePosition)` ####

Deprecated.

#### `switchToMainFrame()` ####

#### `switchToParentFrame()` ####

#### `uploadFile(selector, filename)` ####

Uploads the specified file (`filename`) to the form element associated with the `selector`.

This function is used to automate the upload of a file, which is usually handled with a file dialog in a traditional browser. Since there is no dialog in this headless mode, such an upload mechanism is handled via this special function instead.
Example:
page.uploadFile('input[name=image]', '/path/to/some/photo.jpg');
Example:
page.onAlert = function(msg) {
    console.log('ALERT: ' + msg);
};
Note: `window.callPhantom` is still an experimental API. In the near future, it will likely be replaced with a message-based solution which will still provide the same functionality.
Although there are many possible use cases for this inversion of control, the primary one so far is to prevent the need for a PhantomJS script to be continually polling for some variable on the web page.
Example:
WebPage (client-side)
if (typeof window.callPhantom === 'function') {
    window.callPhantom({ hello: 'world' });
}
PhantomJS (server-side)
page.onCallback = function(data) {
    console.log('CALLBACK: ' + JSON.stringify(data)); // Prints 'CALLBACK: {"hello":"world"}'
};
Additionally, note that the `WebPage#onCallback` handler can return a data object that will be carried back as the result of the originating `window.callPhantom` call, too.
Example:
WebPage (client-side)
if (typeof window.callPhantom === 'function') {
    var status = window.callPhantom({ secret: 'ghostly' });
    alert(status); // Will either print 'Accepted.' or 'DENIED!'
}
PhantomJS (server-side)
page.onCallback = function(data) {
    if (data && data.secret && data.secret === 'ghostly') {
        return 'Accepted.';
    }
    return 'DENIED!';
};
Example:
page.onClosing = function(closingPage) {
    console.log('The page is closing! URL: ' + closingPage.url);
};
Example:
page.onConfirm = function(msg) {
    console.log('CONFIRM: ' + msg);
    return true; // `true` === pressing the "OK" button, `false` === pressing the "Cancel" button
};
By default, `console` messages from the web page are not displayed. Using this callback is a typical way to redirect them.
Example:
page.onConsoleMessage = function(msg, lineNum, sourceId) {
    console.log('CONSOLE: ' + msg + ' (from line #' + lineNum + ' in "' + sourceId + '")');
};
Note: The line number and source identifier are not used yet, at least in PhantomJS <= 1.8.1. You receive undefined values.
#### `onError` ####

**Introduced:** PhantomJS 1.5

This callback is invoked when there is a JavaScript execution error. It is a good way to catch problems when evaluating a script in the web page context. The arguments passed to the callback are the error message and the stack trace [as an Array].

Example:
page.onError = function(msg, trace) {
    var msgStack = ['ERROR: ' + msg];
    if (trace && trace.length) {
        msgStack.push('TRACE:');
        trace.forEach(function(t) {
            msgStack.push(' -> ' + t.file + ': ' + t.line + (t.function ? ' (in function "' + t.function + '")' : ''));
        });
    }
    console.error(msgStack.join('\n'));
};
Example:
page.onInitialized = function() {
    page.evaluate(function() {
        document.addEventListener('DOMContentLoaded', function() {
            console.log('DOM content has loaded.');
        }, false);
    });
};
Also see `WebPage#open` for an alternate hook for the `onLoadFinished` callback.
Example:
page.onLoadFinished = function(status) {
    console.log('Status: ' + status);
    // Do other things here...
};
Example:
page.onLoadStarted = function() {
    var currentUrl = page.evaluate(function() {
        return window.location.href;
    });
    console.log('Current page ' + currentUrl + ' will be gone...');
    console.log('Now loading a new page...');
};
Example:
page.onNavigationRequested = function(url, type, willNavigate, main) {
    console.log('Trying to navigate to: ' + url);
    console.log('Caused by: ' + type);
    console.log('Will actually navigate: ' + willNavigate);
    console.log("Sent from the page's main frame: " + main);
};
Example:
page.onPageCreated = function(newPage) {
    console.log('A new child page was created! Its requested URL is not yet available, though.');
    // Decorate
    newPage.onClosing = function(closingPage) {
        console.log('A child page is closing: ' + closingPage.url);
    };
};
Example:
page.onPrompt = function(msg, defaultVal) {
    if (msg === "What's your name?") {
        return 'PhantomJS';
    }
    return defaultVal;
};
Example:
page.onResourceRequested = function(requestData, networkRequest) {
    console.log('Request (#' + requestData.id + '): ' + JSON.stringify(requestData));
};
The `requestData` metadata object contains these properties:
- `id`: the number of the requested resource
- `method`: http method
- `url`: the URL of the requested resource
- `time`: Date object containing the date of the request
- `headers`: list of http headers

The `networkRequest` object contains these functions:
- `abort()`: aborts the current network request. Aborting the current network request will invoke the `onResourceError` callback.
- `changeUrl(url)`: changes the current URL of the network request.
- `setHeader(key, value)`
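As an illustration of `abort()`, the handler below cancels requests whose URL matches a pattern. It is a sketch: `blockRequests` is a hypothetical helper name, and the pattern in the usage note is only an example.

```js
// Hypothetical helper: aborts every request whose URL matches `pattern`
// and returns an array that collects the blocked URLs.
function blockRequests(page, pattern) {
    var blocked = [];
    page.onResourceRequested = function(requestData, networkRequest) {
        if (pattern.test(requestData.url)) {
            blocked.push(requestData.url);
            // Aborting will in turn invoke the onResourceError callback.
            networkRequest.abort();
        }
    };
    return blocked;
}

// Usage inside a PhantomJS script (skip stylesheets):
// var blocked = blockRequests(page, /\.css(\?|$)/);
```

Use a pattern without the `g` flag here, since a global regular expression keeps state between `test` calls.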
If the resource is large and sent by the server in multiple chunks, `onResourceReceived` will be invoked for every chunk received by PhantomJS.
Example:
page.onResourceReceived = function(response) {
    console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + JSON.stringify(response));
};
The `response` metadata object contains these properties:
- `id`: the number of the requested resource
- `url`: the URL of the requested resource
- `time`: Date object containing the date of the response
- `headers`: list of http headers
- `bodySize`: size of the received content decompressed (entire content or chunk content)
- `contentType`: the content type if specified
- `redirectURL`: if there is a redirection, the redirected URL
- `stage`: "start", "end" (FIXME: other value for intermediate chunk?)
- `status`: http status code. ex: `200`
- `statusText`: http status text. ex: `OK`
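These properties can be combined, for instance to record responses with an error status. A sketch only: `trackFailedResponses` is a hypothetical helper, and it assumes that looking at the `"end"` stage alone is enough to see each response once.

```js
// Hypothetical helper: collects responses whose HTTP status is 400 or above.
function trackFailedResponses(page) {
    var failures = [];
    page.onResourceReceived = function(response) {
        // Ignore every stage except 'end' so each resource is counted once.
        if (response.stage !== 'end') {
            return;
        }
        if (response.status >= 400) {
            failures.push({
                id: response.id,
                url: response.url,
                status: response.status,
                statusText: response.statusText
            });
        }
    };
    return failures;
}

// Usage inside a PhantomJS script:
// var failures = trackFailedResponses(page);
// page.open('http://example.com/', function() {
//     console.log(JSON.stringify(failures));
//     phantom.exit();
// });
```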
Example:
page.onResourceTimeout = function(request) {
    console.log('Response (#' + request.id + '): ' + JSON.stringify(request));
};
The `request` metadata object contains extra error related properties:
- `id`: the number of the requested resource
- `method`: http method
- `url`: the URL of the requested resource
- `time`: Date object containing the date of the request
- `headers`: list of http headers
- `errorCode`: the error code of the error
- `errorString`: text message of the error
Example:
page.onUrlChanged = function(targetUrl) {
    console.log('New URL: ' + targetUrl);
};
To retrieve the old URL, use the onLoadStarted callback.
#### `onResourceError` ####

**Introduced:** PhantomJS 1.9

This callback is invoked when a web page was unable to load a resource. The only argument to the callback is the `resourceError` metadata object.

Example:
page.onResourceError = function(resourceError) {
    console.log('Unable to load resource (#' + resourceError.id + ' URL: ' + resourceError.url + ')');
    console.log('Error code: ' + resourceError.errorCode + '. Description: ' + resourceError.errorString);
};
The `resourceError` metadata object contains these properties:
- `id`: the number of the request
- `url`: the resource url
- `errorCode`: the error code
- `errorString`: the error description
These functions trigger the corresponding callbacks. They are used for tests.
#### `closing(page)` ####

#### `initialized()` ####

#### `javaScriptAlertSent(message)` ####

#### `javaScriptConsoleMessageSent(message)` ####

#### `loadFinished(status)` ####

#### `loadStarted()` ####

#### `navigationRequested(url, navigationType, navigationLocked, isMainFrame)` ####

#### `rawPageCreated(page)` ####

#### `resourceReceived(request)` ####

#### `resourceRequested(resource)` ####

#### `resourceError(resource)` ####

#### `urlChanged(url)` ####