Symfony2 API
Class

Symfony\Component\DomCrawler\Crawler

class Crawler extends SplObjectStorage

Crawler eases navigation of a list of \DOMNode objects.

Methods

__construct(mixed $node = null, string $uri = null)

Constructor.

clear()

Removes all the nodes.

add(null|DOMNodeList|array|DOMNode $node)

Adds a node to the current list of nodes.

null|void addContent(string $content, null|string $type = null)

Adds HTML/XML content.

addHtmlContent(string $content, string $charset = 'UTF-8')

Adds HTML content to the list of nodes.

addXmlContent(string $content, string $charset = 'UTF-8')

Adds XML content to the list of nodes.

addDocument(DOMDocument $dom)

Adds a \DOMDocument to the list of nodes.

addNodeList(DOMNodeList $nodes)

Adds a \DOMNodeList to the list of nodes.

addNodes(array $nodes)

Adds an array of \DOMNode instances to the list of nodes.

addNode(DOMNode $node)

Adds a \DOMNode instance to the list of nodes.

Crawler eq(integer $position)

Returns a node given its position in the node list.

array each(Closure $closure)

Calls an anonymous function on each node of the list.

Crawler reduce(Closure $closure)

Reduces the list of nodes by calling an anonymous function.

Crawler first()

Returns the first node of the current selection.

Crawler last()

Returns the last node of the current selection.

Crawler siblings()

Returns the sibling nodes of the current selection.

Crawler nextAll()

Returns the next sibling nodes of the current selection.

Crawler previousAll()

Returns the previous sibling nodes of the current selection.

Crawler parents()

Returns the parent nodes of the current selection.

Crawler children()

Returns the child nodes of the current selection.

string attr(string $attribute)

Returns the attribute value of the first node of the list.

string text()

Returns the node value of the first node of the list.

array extract(array $attributes)

Extracts information from the list of nodes.

Crawler filterXPath(string $xpath)

Filters the list of nodes with an XPath expression.

Crawler filter(string $selector)

Filters the list of nodes with a CSS selector.

Crawler selectLink(string $value)

Selects links by their text, or by the alt value of clickable images.

Crawler selectButton(string $value)

Selects a button by its text, or by the alt value of image buttons.

Link link(string $method = 'get')

Returns a Link object for the first node in the list.

array links()

Returns an array of Link objects for the nodes in the list.

Form form(array $values = null, string $method = null)

Returns a Form object for the first node in the list.

static string xpathLiteral(string $s)

Converts a string for use in XPath expressions.

Details

at line 38
public __construct(mixed $node = null, string $uri = null)

Constructor.

Parameters

mixed $node A node to use as the base for crawling
string $uri The current URI or the base href value

at line 50
public clear()

Removes all the nodes.

at line 65
public add(null|DOMNodeList|array|DOMNode $node)

Adds a node to the current list of nodes.

This method uses the appropriate specialized add*() method based
on the type of the argument.

Parameters

null|DOMNodeList|array|DOMNode $node A node

at line 86
public null|void addContent(string $content, null|string $type = null)

Adds HTML/XML content.

Parameters

string $content A string to parse as HTML/XML
null|string $type The content type of the string

Return Value

null|void

at line 120
public addHtmlContent(string $content, string $charset = 'UTF-8')

Adds HTML content to the list of nodes.

Parameters

string $content The HTML content
string $charset The charset
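
As a usage sketch of the add*() methods described so far: the snippet below assumes the component is installed with Composer and autoloaded from vendor/autoload.php (an assumed path), and the markup is purely illustrative. It parses an HTML string into an empty Crawler, counts the nodes, then empties the list with clear().

```php
<?php
// Assumption: the component is installed with Composer; adjust the autoload path.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Illustrative markup, not taken from the documentation.
$html = '<html><body><h1>Title</h1><p>First</p><p>Second</p></body></html>';

// Start from an empty node list, then parse the HTML string into it.
$crawler = new Crawler();
$crawler->addHtmlContent($html, 'UTF-8');

// The list now holds the root element of the parsed document.
$countAfterAdd = count($crawler);

// clear() removes every node again.
$crawler->clear();
$countAfterClear = count($crawler);

echo $countAfterAdd, ' ', $countAfterClear, "\n";
```

Because the class is countable, count() is a convenient way to check whether parsing produced any nodes before navigating further.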

at line 143
public addXmlContent(string $content, string $charset = 'UTF-8')

Adds XML content to the list of nodes.

Parameters

string $content The XML content
string $charset The charset

at line 160
public addDocument(DOMDocument $dom)

Adds a \DOMDocument to the list of nodes.

Parameters

DOMDocument $dom A \DOMDocument instance

at line 174
public addNodeList(DOMNodeList $nodes)

Adds a \DOMNodeList to the list of nodes.

Parameters

DOMNodeList $nodes A \DOMNodeList instance

at line 188
public addNodes(array $nodes)

Adds an array of \DOMNode instances to the list of nodes.

Parameters

array $nodes An array of \DOMNode instances

at line 202
public addNode(DOMNode $node)

Adds a \DOMNode instance to the list of nodes.

Parameters

DOMNode $node A \DOMNode instance

at line 220
public Crawler eq(integer $position)

Returns a node given its position in the node list.

Parameters

integer $position The position

Return Value

Crawler A new instance of the Crawler with the selected node, or an empty Crawler if it does not exist.

at line 249
public array each(Closure $closure)

Calls an anonymous function on each node of the list.

The anonymous function receives the node and its position as arguments.

Example:

$crawler->filter('h1')->each(function ($node, $i) {
    return $node->nodeValue;
});

Parameters

Closure $closure An anonymous function

Return Value

array An array of values returned by the anonymous function

at line 270
public Crawler reduce(Closure $closure)

Reduces the list of nodes by calling an anonymous function.

To remove a node from the list, the anonymous function must return false.

Parameters

Closure $closure An anonymous function

Return Value

Crawler A Crawler instance with the selected nodes.
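
A minimal sketch of reduce(), assuming a Composer autoloader at vendor/autoload.php and an illustrative list of four items. The closure decides on the position argument only, so it does not depend on the type of the node argument.

```php
<?php
// Assumption: the component is installed with Composer; adjust the autoload path.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler();
$crawler->addHtmlContent(
    '<html><body><ul><li>a</li><li>b</li><li>c</li><li>d</li></ul></body></html>'
);

$items = $crawler->filterXPath('//li');

// Keep only the nodes at even positions (0, 2, ...); returning false drops a node.
$even = $items->reduce(function ($node, $i) {
    return $i % 2 === 0;
});

// Iterating a Crawler yields the underlying \DOMNode objects.
$texts = array();
foreach ($even as $node) {
    $texts[] = $node->nodeValue;
}

echo implode(',', $texts), "\n";
```

The original $items list is left untouched; reduce() returns a separate Crawler holding the nodes that were kept.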

at line 289
public Crawler first()

Returns the first node of the current selection.

Return Value

Crawler A Crawler instance with the first selected node

at line 301
public Crawler last()

Returns the last node of the current selection.

Return Value

Crawler A Crawler instance with the last selected node
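
The positional accessors eq(), first() and last() can be sketched together. The example assumes a Composer autoloader at vendor/autoload.php and illustrative markup; positions passed to eq() are zero-based.

```php
<?php
// Assumption: the component is installed with Composer; adjust the autoload path.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler();
$crawler->addHtmlContent(
    '<html><body><ul><li>a</li><li>b</li><li>c</li></ul></body></html>'
);

$items = $crawler->filterXPath('//li');

// Each call returns a new Crawler wrapping a single node.
$firstText  = $items->first()->text();
$secondText = $items->eq(1)->text(); // position 1 is the second node
$lastText   = $items->last()->text();

// An out-of-range position yields an empty Crawler rather than an error.
$missingCount = count($items->eq(10));

echo $firstText, $secondText, $lastText, ' ', $missingCount, "\n";
```

Checking count() on the result of eq() avoids the InvalidArgumentException that text() and attr() raise on an empty list.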

at line 315
public Crawler siblings()

Returns the sibling nodes of the current selection.

Return Value

Crawler A Crawler instance with the sibling nodes

Exceptions

InvalidArgumentException When the current node list is empty

at line 333
public Crawler nextAll()

Returns the next sibling nodes of the current selection.

Return Value

Crawler A Crawler instance with the next sibling nodes

Exceptions

InvalidArgumentException When the current node list is empty

at line 349
public Crawler previousAll()

Returns the previous sibling nodes of the current selection.

Return Value

Crawler A Crawler instance with the previous sibling nodes

at line 367
public Crawler parents()

Returns the parent nodes of the current selection.

Return Value

Crawler A Crawler instance with the parent nodes of the current selection

Exceptions

InvalidArgumentException When the current node list is empty

at line 394
public Crawler children()

Returns the child nodes of the current selection.

Return Value

Crawler A Crawler instance with the child nodes

Exceptions

InvalidArgumentException When the current node list is empty
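
The navigation methods siblings(), nextAll(), previousAll() and children() all work relative to the first node of the current selection. The sketch below assumes a Composer autoloader at vendor/autoload.php; the markup and the nodeTexts() helper are hypothetical and exist only for this illustration.

```php
<?php
// Assumption: the component is installed with Composer; adjust the autoload path.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical helper: collects the text of every node in a Crawler.
function nodeTexts(Crawler $nodes)
{
    $texts = array();
    foreach ($nodes as $node) {
        $texts[] = $node->nodeValue;
    }

    return $texts;
}

$crawler = new Crawler();
$crawler->addHtmlContent(
    '<html><body><ul>'
    . '<li>a</li><li id="current">b</li><li>c</li><li>d</li>'
    . '</ul></body></html>'
);

$current = $crawler->filterXPath('//li[@id="current"]');

$next     = nodeTexts($current->nextAll());     // elements after "b"
$previous = nodeTexts($current->previousAll()); // elements before "b"
$siblings = nodeTexts($current->siblings());    // every sibling except "b" itself
$children = nodeTexts($crawler->filterXPath('//ul')->children());

echo implode(',', $siblings), "\n";
```

Each of these calls throws an InvalidArgumentException on an empty selection, so it is worth checking count() first when the XPath may match nothing.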

at line 414
public string attr(string $attribute)

Returns the attribute value of the first node of the list.

Parameters

string $attribute The attribute name

Return Value

string The attribute value

Exceptions

InvalidArgumentException When the current node list is empty

at line 432
public string text()

Returns the node value of the first node of the list.

Return Value

string The node value

Exceptions

InvalidArgumentException When the current node list is empty
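
A short sketch of attr() and text(), assuming a Composer autoloader at vendor/autoload.php and illustrative markup. Both methods read from the first node only, and both throw when the selection is empty.

```php
<?php
// Assumption: the component is installed with Composer; adjust the autoload path.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler();
$crawler->addHtmlContent(
    '<html><body><a href="/first">First</a><a href="/second">Second</a></body></html>'
);

$links = $crawler->filterXPath('//a');

// Only the first of the two matched nodes is consulted.
$href  = $links->attr('href');
$label = $links->text();

// An empty selection raises InvalidArgumentException.
$threw = false;
try {
    $crawler->filterXPath('//img')->attr('src');
} catch (\InvalidArgumentException $e) {
    $threw = true;
}

echo $href, ' ', $label, "\n";
```

To read values from every matched node rather than the first one, extract() or each() is the better fit.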

at line 456
public array extract(array $attributes)

Extracts information from the list of nodes.

You can extract attributes and/or the node value (_text).

Example:

$crawler->filter('h1 a')->extract(array('_text', 'href'));

Parameters

array $attributes An array of attributes

Return Value

array An array of extracted values
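
To make the shape of the returned array concrete, here is a sketch that assumes a Composer autoloader at vendor/autoload.php and illustrative markup. It uses filterXPath() instead of filter() so that the CssSelector component is not required.

```php
<?php
// Assumption: the component is installed with Composer; adjust the autoload path.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler();
$crawler->addHtmlContent(
    '<html><body>'
    . '<h1><a href="/">Home</a></h1>'
    . '<h1><a href="/docs">Docs</a></h1>'
    . '</body></html>'
);

$anchors = $crawler->filterXPath('//h1/a');

// Several attributes: one inner array per node, in the order requested.
$rows = $anchors->extract(array('_text', 'href'));

// A single attribute: a flat array with one value per node.
$hrefs = $anchors->extract(array('href'));

print_r($rows);
```

Note the difference between the two calls: requesting one attribute flattens the result, which matters when the attribute list is built dynamically.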

at line 486
public Crawler filterXPath(string $xpath)

Filters the list of nodes with an XPath expression.

Parameters

string $xpath An XPath expression

Return Value

Crawler A new instance of Crawler with the filtered list of nodes

at line 512
public Crawler filter(string $selector)

Filters the list of nodes with a CSS selector.

This method only works if you have installed the CssSelector Symfony Component.

Parameters

string $selector A CSS selector

Return Value

Crawler A new instance of Crawler with the filtered list of nodes

Exceptions

RuntimeException if the CssSelector Component is not available

public Crawler selectLink(string $value)

Selects links by their text, or by the alt value of clickable images.

Parameters

string $value The link text

Return Value

Crawler A new instance of Crawler with the filtered list of nodes

at line 549
public Crawler selectButton(string $value)

Selects a button by its text, or by the alt value of image buttons.

Parameters

string $value The button text

Return Value

Crawler A new instance of Crawler with the filtered list of nodes

public Link link(string $method = 'get')

Returns a Link object for the first node in the list.

Parameters

string $method The method for the link (get by default)

Return Value

Link A Link instance

Exceptions

InvalidArgumentException If the current node list is empty
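
A sketch combining selectLink() and link(), assuming a Composer autoloader at vendor/autoload.php. The base URI and markup are made up for the example, and getUri() belongs to the Link class, which is documented separately.

```php
<?php
// Assumption: the component is installed with Composer; adjust the autoload path.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// The second constructor argument is the current URI, used to resolve hrefs.
$crawler = new Crawler(null, 'http://example.com/docs/');
$crawler->addHtmlContent(
    '<html><body><a href="/about">About us</a></body></html>'
);

// Find the anchor by its text, then wrap the first match in a Link object.
$link = $crawler->selectLink('About us')->link();

// The Link resolves the href against the current URI.
$uri = $link->getUri();

echo $uri, "\n";
```

Passing an absolute current URI to the constructor matters here, because a Link needs it to turn a relative href into a full address.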

public array links()

Returns an array of Link objects for the nodes in the list.

Return Value

array An array of Link instances

at line 609
public Form form(array $values = null, string $method = null)

Returns a Form object for the first node in the list.

Parameters

array $values An array of values for the form fields
string $method The method for the form

Return Value

Form A Form instance

Exceptions

InvalidArgumentException If the current node list is empty
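
A sketch of selectButton() followed by form(), assuming a Composer autoloader at vendor/autoload.php. The search form is illustrative, and getValues() belongs to the Form class, which is documented separately.

```php
<?php
// Assumption: the component is installed with Composer; adjust the autoload path.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// An absolute current URI lets the Form resolve its action attribute.
$crawler = new Crawler(null, 'http://example.com/');
$crawler->addHtmlContent(
    '<html><body>'
    . '<form action="/search" method="get">'
    . '<input type="text" name="q" value="" />'
    . '<input type="submit" value="Search" />'
    . '</form>'
    . '</body></html>'
);

// Locate the submit button by its label and build the enclosing form,
// overriding the value of the "q" field.
$form = $crawler->selectButton('Search')->form(array('q' => 'symfony'));

$values = $form->getValues();

print_r($values);
```

The $values argument is a shortcut for filling in fields at the moment the Form object is created.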

at line 646
static public string xpathLiteral(string $s)

Converts a string for use in XPath expressions.

Escaped characters are: quotes (") and apostrophes (').

Examples:

echo Crawler::xpathLiteral('foo " bar');
// prints 'foo " bar'

echo Crawler::xpathLiteral("foo ' bar");
// prints "foo ' bar"

echo Crawler::xpathLiteral('a\'b"c');
// prints concat('a', "'", 'b"c')

Parameters

string $s String to be escaped

Return Value

string Converted string
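
One place where xpathLiteral() is useful is building an XPath expression around text that contains both kinds of quote. The sketch below assumes a Composer autoloader at vendor/autoload.php; the markup and the search string are illustrative.

```php
<?php
// Assumption: the component is installed with Composer; adjust the autoload path.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

$crawler = new Crawler();
$crawler->addHtmlContent(
    '<html><body><a href="/bob">Bob\'s "best" page</a></body></html>'
);

// Text containing an apostrophe and double quotes cannot be pasted into an
// XPath string literal directly, so convert it first.
$needle = 'Bob\'s "best" page';
$xpath  = sprintf('//a[normalize-space(.) = %s]', Crawler::xpathLiteral($needle));

$match      = $crawler->filterXPath($xpath);
$matchCount = count($match);
$href       = $match->attr('href');

echo $matchCount, ' ', $href, "\n";
```

Interpolating the raw string instead would yield an invalid expression as soon as the text contains the quote character used to delimit the literal.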