Last updated: Fri, 06 Nov 2009

The DOMDocument class

Introduction

Represents an entire HTML or XML document and serves as the root of the document tree.

Class synopsis

DOMDocument
DOMDocument extends DOMNode {
/* Properties */
readonly public string $actualEncoding ;
readonly public DOMConfiguration $config ;
readonly public DOMDocumentType $doctype ;
readonly public DOMElement $documentElement ;
public string $documentURI ;
public string $encoding ;
public bool $formatOutput ;
readonly public DOMImplementation $implementation ;
public bool $preserveWhiteSpace = true ;
public bool $recover ;
public bool $resolveExternals ;
public bool $standalone ;
public bool $strictErrorChecking = true ;
public bool $substituteEntities ;
public bool $validateOnParse = false ;
public string $version ;
readonly public string $xmlEncoding ;
public bool $xmlStandalone ;
public string $xmlVersion ;
/* Methods */
__construct ([ string $version [, string $encoding ]] )
DOMAttr createAttribute ( string $name )
DOMAttr createAttributeNS ( string $namespaceURI , string $qualifiedName )
DOMCDATASection createCDATASection ( string $data )
DOMComment createComment ( string $data )
DOMDocumentFragment createDocumentFragment ( void )
DOMElement createElement ( string $name [, string $value ] )
DOMElement createElementNS ( string $namespaceURI , string $qualifiedName [, string $value ] )
DOMEntityReference createEntityReference ( string $name )
DOMProcessingInstruction createProcessingInstruction ( string $target [, string $data ] )
DOMText createTextNode ( string $content )
DOMElement getElementById ( string $elementId )
DOMNodeList getElementsByTagName ( string $name )
DOMNodeList getElementsByTagNameNS ( string $namespaceURI , string $localName )
DOMNode importNode ( DOMNode $importedNode [, bool $deep ] )
mixed load ( string $filename [, int $options = 0 ] )
bool loadHTML ( string $source )
bool loadHTMLFile ( string $filename )
mixed loadXML ( string $source [, int $options = 0 ] )
void normalizeDocument ( void )
bool registerNodeClass ( string $baseclass , string $extendedclass )
bool relaxNGValidate ( string $filename )
bool relaxNGValidateSource ( string $source )
int save ( string $filename [, int $options ] )
string saveHTML ( void )
int saveHTMLFile ( string $filename )
string saveXML ([ DOMNode $node [, int $options ]] )
bool schemaValidate ( string $filename )
bool schemaValidateSource ( string $source )
bool validate ( void )
int xinclude ([ int $options ] )
/* Inherited methods */
DOMNode DOMNode::appendChild ( DOMNode $newnode )
DOMNode DOMNode::cloneNode ([ bool $deep ] )
public int DOMNode::getLineNo ( void )
bool DOMNode::hasAttributes ( void )
bool DOMNode::hasChildNodes ( void )
DOMNode DOMNode::insertBefore ( DOMNode $newnode [, DOMNode $refnode ] )
bool DOMNode::isDefaultNamespace ( string $namespaceURI )
bool DOMNode::isSupported ( string $feature , string $version )
string DOMNode::lookupNamespaceURI ( string $prefix )
string DOMNode::lookupPrefix ( string $namespaceURI )
void DOMNode::normalize ( void )
DOMNode DOMNode::removeChild ( DOMNode $oldnode )
DOMNode DOMNode::replaceChild ( DOMNode $newnode , DOMNode $oldnode )
}

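Properties

actualEncoding

Deprecated. The actual encoding of the document. Read-only equivalent of encoding.

config

Deprecated. Configuration used when DOMDocument::normalizeDocument() is invoked.

doctype

The document type declaration associated with this document.

documentElement

A convenience attribute that allows direct access to the child node that is the document element of the document.

documentURI

The location of the document, or NULL if undefined.

encoding

Encoding of the document, as specified by the XML declaration. This attribute is not present in the final DOM Level 3 specification, but it is the only way to manipulate the XML document encoding in this implementation.

formatOutput

Nicely formats the output with indentation and extra whitespace.

implementation

The DOMImplementation object that handles this document.

preserveWhiteSpace

Do not remove redundant whitespace. Defaults to TRUE.

recover

Proprietary. Enables recovery mode, i.e. tries to parse documents that are not well-formed. This attribute is not part of the DOM specification and is specific to libxml.

resolveExternals

Set to TRUE to load external entities from a doctype declaration. This is useful for including character entities in your XML document.

standalone

Deprecated. Whether or not the document is standalone, as specified by the XML declaration. Corresponds to xmlStandalone.

strictErrorChecking

Throws DOMException on errors. Defaults to TRUE.

substituteEntities

Proprietary. Whether or not to substitute entities. This attribute is not part of the DOM specification and is specific to libxml.

validateOnParse

Loads the DTD and validates against it. Defaults to FALSE.

version

Deprecated. The XML version. Corresponds to xmlVersion.

xmlEncoding

An attribute specifying, as part of the XML declaration, the encoding of this document. NULL when unspecified or unknown (for example, when the document was created in memory).

xmlStandalone

An attribute specifying, as part of the XML declaration, whether this document is standalone. FALSE when unspecified.

xmlVersion

An attribute specifying, as part of the XML declaration, the version number of this document. If no version is declared and the document supports the "XML" feature, the value is "1.0".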
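
As a rough illustration of a few of the properties listed above, the following sketch loads a small document and reads some of them back. The XML string and the variable names are made up for this example and are not part of the manual.

```php
<?php
// Hypothetical sample input; any well-formed XML string would do.
$source = '<?xml version="1.0" encoding="UTF-8"?><library><book>1984</book></library>';

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false; // set before loading, since it affects parsing
$doc->formatOutput = true;        // affects how saveXML() serializes the tree
$doc->loadXML($source);

echo $doc->xmlVersion, "\n";                // version taken from the XML declaration
echo $doc->encoding, "\n";                  // encoding taken from the XML declaration
echo $doc->documentElement->nodeName, "\n"; // name of the root element
echo $doc->saveXML();
?>
```

Setting preserveWhiteSpace before the call to loadXML() matters because the property is consulted while parsing, whereas formatOutput is only consulted when the document is serialized.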

User Contributed Notes
DOMDocument
fcartegnie
31-Oct-2009 09:30
Be careful when using the formatOutput property.

Creating an empty node like this:
createElement('foo','')
instead of
createElement('foo')
will break the formatted output.
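
One way to see the difference for yourself is to create both variants side by side and inspect them. This is only a sketch: the element names are arbitrary, and the exact serialization may depend on the PHP and libxml versions in use, so the script prints its findings rather than assuming them.

```php
<?php
$doc = new DOMDocument();
$doc->formatOutput = true;
$root = $doc->appendChild($doc->createElement('root'));

// Element created without a value argument
$withoutValue = $root->appendChild($doc->createElement('foo'));
// Element created with an empty-string value
$withEmptyValue = $root->appendChild($doc->createElement('bar', ''));

// Check whether either element ended up with child nodes
var_dump($withoutValue->hasChildNodes());
var_dump($withEmptyValue->hasChildNodes());

// Compare how the two elements are serialized
echo $doc->saveXML();
?>
```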
Yarg Dahc
06-Oct-2009 12:08
A child class of DOMDocument that adds a toArray() method. Enjoy and/or improve it.
<?php
class MyDOMDocument extends DOMDocument
{
    public function toArray(DOMNode $oDomNode = null)
    {
        // return empty array if dom is blank
        if (is_null($oDomNode) && !$this->hasChildNodes()) {
            return array();
        }
        $oDomNode = (is_null($oDomNode)) ? $this->documentElement : $oDomNode;
        if (!$oDomNode->hasChildNodes()) {
            $mResult = $oDomNode->nodeValue;
        } else {
            $mResult = array();
            foreach ($oDomNode->childNodes as $oChildNode) {
                // how many of these child nodes do we have?
                // this will give us a clue as to what the result structure should be
                $oChildNodeList = $oDomNode->getElementsByTagName($oChildNode->nodeName);
                $iChildCount = 0;
                // there are x number of children in this node that have the same tag name
                // however, we are only interested in the # of siblings with the same tag name
                foreach ($oChildNodeList as $oNode) {
                    if ($oNode->parentNode->isSameNode($oChildNode->parentNode)) {
                        $iChildCount++;
                    }
                }
                $mValue = $this->toArray($oChildNode);
                // text and other special nodes have names starting with '#'
                // ($str[0] replaces the $str{0} syntax, which newer PHP versions reject)
                $sKey   = ($oChildNode->nodeName[0] == '#') ? 0 : $oChildNode->nodeName;
                $mValue = is_array($mValue) ? $mValue[$oChildNode->nodeName] : $mValue;
                // how many of these child nodes do we have?
                if ($iChildCount > 1) {  // more than 1 child - make numeric array
                    $mResult[$sKey][] = $mValue;
                } else {
                    $mResult[$sKey] = $mValue;
                }
            }
            // if the child is <foo>bar</foo>, the result will be array(bar)
            // make the result just 'bar'
            if (count($mResult) == 1 && isset($mResult[0]) && !is_array($mResult[0])) {
                $mResult = $mResult[0];
            }
        }
        // get our attributes if we have any
        $arAttributes = array();
        if ($oDomNode->hasAttributes()) {
            foreach ($oDomNode->attributes as $sAttrName=>$oAttrNode) {
                // retain namespace prefixes
                $arAttributes["@{$oAttrNode->nodeName}"] = $oAttrNode->nodeValue;
            }
        }
        // check for namespace attribute - Namespaces will not show up in the attributes list
        if ($oDomNode instanceof DOMElement && $oDomNode->getAttribute('xmlns')) {
            $arAttributes["@xmlns"] = $oDomNode->getAttribute('xmlns');
        }
        if (count($arAttributes)) {
            if (!is_array($mResult)) {
                $mResult = (trim($mResult)) ? array($mResult) : array();
            }
            $mResult = array_merge($mResult, $arAttributes);
        }
        $arResult = array($oDomNode->nodeName=>$mResult);
        return $arResult;
    }
}

$sXml = <<<XML
<nodes>
    <node>text</node>
    <node>
        <field>hello</field>
        <field>world</field>
    </node>
</nodes>
XML;
$dom = new MyDOMDocument;
// drop whitespace-only text nodes so the indentation does not show up in the array
$dom->preserveWhiteSpace = false;
$dom->loadXML($sXml);
var_dump($dom->toArray());
?>
Output:

array (
    "nodes" => array (
        "node" => array (
            0 => "text",
            1 => array (
                "field" => array (
                    0 => "hello",
                    1 => "world"
                )
            )
        )
    )
)
PhilipWayneRollins at gmail dot com
15-Aug-2009 08:32
If you want to use DOMDocument to create xHTML documents, here is a simple class.

Note that this is designed for creating xHTML documents from scratch, but it could easily be extended to work with existing xHTML documents. Also, this is for xHTML, not XML.

<?php
    class Document
    {
        public $doctype;
        public $head;
        public $title = 'Sensei Ninja';
        public $body;
        private $styles;
        private $metas;
        private $scripts;
        private $document;

        function __construct (  )
        {
            $this->document = new DOMDocument( );
            $this->head = $this->document->createElement( 'head', ' ' );
            $this->body = $this->document->createElement( 'body', ' ' );
        }

        public function addStyleSheet ( $url, $media='all' )
        {
            $element = $this->document->createElement( 'link' );
            // rel is needed for browsers to treat the link as a stylesheet
            $element->setAttribute( 'rel', 'stylesheet' );
            $element->setAttribute( 'type', 'text/css' );
            $element->setAttribute( 'href', $url );
            $element->setAttribute( 'media', $media );
            $this->styles[] = $element;
        }

        public function addScript ( $url )
        {
            $element = $this->document->createElement( 'script', ' ' );
            $element->setAttribute( 'type', 'text/javascript' );
            $element->setAttribute( 'src', $url );
            $this->scripts[] = $element;
        }

        public function addMetaTag ( $name, $content )
        {
            $element = $this->document->createElement( 'meta' );
            $element->setAttribute( 'name', $name );
            $element->setAttribute( 'content', $content );
            $this->metas[] = $element;
        }

        public function setDescription ( $dec )
        {
            $this->addMetaTag( 'description', $dec );
        }

        public function setKeywords ( $keywords )
        {
            $this->addMetaTag( 'keywords', $keywords );
        }

        public function createElement ( $nodeName, $nodeValue=null )
        {
            // Only pass a value when one was given, so no null reaches DOMDocument::createElement()
            if ( $nodeValue === null )
                return $this->document->createElement( $nodeName );
            return $this->document->createElement( $nodeName, $nodeValue );
        }

        public function assemble ( )
        {
            // Doctype creation
            $doctype = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">';

            // Create the head element
            $title = $this->document->createElement( 'title', $this->title );
            // Add stylesheets if needed
            if ( is_array( $this->styles ))
                foreach ( $this->styles as $element )
                    $this->head->appendChild( $element );
            // Add scripts if needed
            if ( is_array( $this->scripts ))
                foreach ( $this->scripts as $element )
                    $this->head->appendChild( $element );
            // Add meta tags if needed
            if ( is_array( $this->metas ))
                foreach ( $this->metas as $element )
                    $this->head->appendChild( $element );
            $this->head->appendChild( $title );

            // Create the document
            $html = $this->document->createElement( 'html' );
            $html->setAttribute( 'xmlns', 'http://www.w3.org/1999/xhtml' );
            $html->setAttribute( 'xml:lang', 'en' );
            $html->setAttribute( 'lang', 'en' );
            $html->appendChild( $this->head );
            $html->appendChild( $this->body );

            $this->document->appendChild( $html );
            // Serialize only the <html> element, so that no XML declaration
            // is emitted after the DOCTYPE
            return $doctype . "\n" . $this->document->saveXML( $html );
        }
    }
?>

Small example

<?php
    $document = new Document( );
    $document->title = 'Hello';
    $document->addStyleSheet( 'StyleSheets/main.css' );
    $div = $document->createElement( 'div' );
    $div->nodeValue = 'Hello, world!';
    $div->setAttribute( 'style', 'color: red;' );
    $document->body->appendChild( $div );
    printf( '%s', $document->assemble( ) );
?>
e dot sand at elisand dot com
19-Jun-2009 04:19
It should be pointed out that DOMDocument extends DOMNode in every way... that means that you even have access to the DOMNode properties (even though the documentation here does not mention them as being inherited).

I used to use an XPath query to access nodes from a DOMDocument (when getElementById or getElementsByTagName weren't usable), as I believed this to be the only way.  However, since DOMDocument fully extends DOMNode, you can use DOMDocument->firstChild for example to get the first child node.

This simplifies things quite a bit when using an XPath query may seem a bit excessive to get access to something as simple as the child nodes.
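
A minimal sketch of the point above, using a made-up two-book document: the inherited DOMNode properties are read directly on the document, and the equivalent XPath query is shown only for comparison.

```php
<?php
$doc = new DOMDocument();
// Hypothetical sample input without an XML declaration or doctype
$doc->loadXML('<library><book>1984</book><book>Frankenstein</book></library>');

// Properties inherited from DOMNode work directly on the document.
$root = $doc->firstChild; // here, the <library> element
echo $root->nodeName, "\n";

// Walk the children of the root without XPath.
foreach ($root->childNodes as $child) {
    echo $child->nodeName, ': ', $child->nodeValue, "\n";
}

// The same nodes fetched through an XPath query.
$xpath = new DOMXPath($doc);
echo $xpath->query('/library/book')->length, "\n";
?>
```

Note that firstChild is not always the root element: if the document starts with a doctype or a comment, that node comes first, and documentElement is the safer way to reach the root.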
cmyk777 at gmail dot com
23-May-2009 07:31
This function may help to debug the current DOM element:

<?php
function dom_dump($obj) {
    // get_class() only accepts objects, so check first
    if (is_object($obj)) {
        $classname = get_class($obj);
        $retval = "Instance of $classname, node list: \n";
        switch (true) {
            case ($obj instanceof DOMDocument):
                $retval .= "XPath: {$obj->getNodePath()}\n".$obj->saveXML($obj);
                break;
            case ($obj instanceof DOMElement):
                $retval .= "XPath: {$obj->getNodePath()}\n".$obj->ownerDocument->saveXML($obj);
                break;
            case ($obj instanceof DOMAttr):
                $retval .= "XPath: {$obj->getNodePath()}\n".$obj->ownerDocument->saveXML($obj);
                break;
            case ($obj instanceof DOMNodeList):
                for ($i = 0; $i < $obj->length; $i++) {
                    $retval .= "Item #$i, XPath: {$obj->item($i)->getNodePath()}\n".
                        "{$obj->item($i)->ownerDocument->saveXML($obj->item($i))}\n";
                }
                break;
            default:
                return "Instance of unknown class";
        }
    } else {
        return 'no elements...';
    }
    return htmlspecialchars($retval);
}
?>

Example usage:

<?php
$dom = new DOMDocument();
$dom->load('test.xml');
$body = $dom->documentElement->getElementsByTagName('book');
echo '<pre>'.dom_dump($body).'</pre>';
?>

Output:

Instance of DOMNodeList, node list:
Item #0, XPath: /library/book[1]
<book isbn="0345342968">
<title>Fahrenheit 451</title>
<author>R. Bradbury</author>
<publisher>Del Rey</publisher>
</book>
Item #1, XPath: /library/book[2]
<book isbn="0048231398">
<title>The Silmarillion</title>
<author>J.R.R. Tolkien</author>
<publisher>G. Allen &amp; Unwin</publisher>
</book>
Item #2, XPath: /library/book[3]
<book isbn="0451524934">
<title>1984</title>
<author>G. Orwell</author>
<publisher>Signet</publisher>
</book>
Item #3, XPath: /library/book[4]
<book isbn="031219126X">
<title>Frankenstein</title>
<author>M. Shelley</author>
<publisher>Bedford</publisher>
</book>
Item #4, XPath: /library/book[5]
<book isbn="0312863551">
<title>The Moon Is a Harsh Mistress</title>
<author>R. A. Heinlein</author>
<publisher>Orb</publisher>
</book>
Atanas Markov (dreamer79bg at gmail dot com)
30-Nov-2008 07:59
Here is a simple web scraping example using the PHP DOM that tries to get the largest text body of an HTML document. I needed it for a spider that had to show a short description for a page. It assumes that the page's description is the largest <div>, <td> or <p> element in the page.
The example also shows a way to work around a bug in the DOM, which sometimes fails to recognize the HTML encoding. It seems to work if you put a charset meta tag right after the head tag of the document.

<?php
$ch= curl_init();
curl_setopt($ch, CURLOPT_URL, '...put url here...');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'set sth...');
curl_setopt($ch, CURLOPT_REFERER, '...set sth...'); //just a fake referer
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_MAXREDIRS, 20); //follow at most 20 redirects

$html= curl_exec($ch);
$html1= curl_getinfo($ch);

//try to get page encoding as it was sent from server
if ($html1['content_type'] && strpos($html1['content_type'],'charset=')!==false){
    $arr= explode('charset=',$html1['content_type']);
    $csethdr= strtolower(trim($arr[1]));
} else {
    $csethdr= false;
}

$cset= false;
$arr= array();

//This has to replace page meta tags for charset with utf-8, but it doesn't actually help(see the bug info).
if (preg_match_all(
    '/(<meta\s*http-equiv="Content-Type"\s*content="[^;]*;\s*charset=([^"]*?)(?:"|\;)[^>]*>)/',
    $html,$arr,PREG_PATTERN_ORDER)){
    $cset= strtolower(trim($arr[2][0]));
    if ($cset!='utf-8'||$cset!=$csethdr){
        $new= str_replace($arr[2][0],'utf-8',$arr[1][0]);
        $html= str_replace($arr[1][0],$new,$html);
        $cset= $csethdr;
    } else {
        $cset= false;
    }

    if ($cset=='utf-8'){
        $cset= false;
    }
}
unset($arr);
if ($cset){
    $html= iconv($cset,'utf-8',$html);
}
unset($cset);

//solve dom bug
$html=preg_replace('/<head[^>]*>/','<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
',$html);

$dom= new DOMDocument();
$dom->loadHTML($html);
$dom->preserveWhiteSpace = false;

//collect all <div>, <td> and <p> descendants of a node into a plain array
function collectTextBlocks($node){
    $blocks= array();
    foreach (array('div','td','p') as $tag) {
        foreach ($node->getElementsByTagName($tag) as $value) {
            $blocks[]= $value;
        }
    }
    return $blocks;
}

function getMaxTextBody($dom){
    $new= collectTextBlocks($dom);

    $maxlen= 0;
    $result= '';
    foreach ($new as $item)
    {
        $str= $item->nodeValue;
        if (strlen($str)>$maxlen){
            $contentnew= collectTextBlocks($item);

            if (count($contentnew)==0){
                $result= $str;
            } else {
                foreach ($contentnew as $value) {
                    $str1= getMaxTextBody($value);
                    $str2= $value->nodeValue;
                    //let's say largest body has more than 50% of the text in its parent
                    if (strlen($str1)*2<strlen($str2)){
                        $str1= $str2;
                    }
                    if (strlen($str1)*2>strlen($str)&&strlen($str1)>$maxlen){
                        $result= $str1;
                    } elseif (strlen($str1)>$maxlen){
                        $result= $str1;
                    }
                    $maxlen= strlen($result);
                }
            }
            $maxlen= strlen($result);
            unset($contentnew);
        }
    }

    unset($new);
    return $result;
}
print getMaxTextBody($dom);

?>
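
The charset workaround from the note above can be tried in isolation. The sketch below assumes the input is already UTF-8; the sample markup and the variable names are invented for illustration, and the behaviour of the HTML parser may differ between libxml versions.

```php
<?php
// Hypothetical UTF-8 input; in practice this would be the fetched page.
$html = "<html><head><title>t</title></head><body><p>caf\u{00E9}</p></body></html>";

// Put the charset declaration directly after the opening <head> tag.
$hint = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
$html = preg_replace('/<head(\s[^>]*)?>/i', '$0' . $hint, $html, 1);

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

// Read the paragraph text back to check that the accented character survived.
$text = $dom->getElementsByTagName('p')->item(0)->nodeValue;
echo $text, "\n";
?>
```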
Jochem Blok
15-May-2008 01:58
To pretty-print XML with indentation, I use:

<?php
$sXML = '<root><element><key>a</key><value>b</value></element></root>';
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->formatOutput   = true;
$doc->loadXML($sXML);
echo $doc->saveXML();
?>
Fernando H
11-Apr-2008 07:48
Here is a quick example of how to use this class, so that new users can get started without having to figure it all out by themselves. (At the time of posting, this documentation had just been added and lacked examples.)

<?php

// Set the content type to XML, so that the browser will recognise it as XML.
header( "content-type: application/xml; charset=ISO-8859-15" );

// "Create" the document.
$xml = new DOMDocument( "1.0", "ISO-8859-15" );

// Create some elements.
$xml_album = $xml->createElement( "Album" );
$xml_track = $xml->createElement( "Track", "The ninth symphony" );

// Set the attributes.
$xml_track->setAttribute( "length", "0:01:15" );
$xml_track->setAttribute( "bitrate", "64kb/s" );
$xml_track->setAttribute( "channels", "2" );

// Create another element, just to show you can add any (realistic to computer) number of sublevels.
$xml_note = $xml->createElement( "Note", "The last symphony composed by Ludwig van Beethoven." );

// Append the whole bunch.
$xml_track->appendChild( $xml_note );
$xml_album->appendChild( $xml_track );

// Repeat the above with some different values..
$xml_track = $xml->createElement( "Track", "Highway Blues" );

$xml_track->setAttribute( "length", "0:01:33" );
$xml_track->setAttribute( "bitrate", "64kb/s" );
$xml_track->setAttribute( "channels", "2" );
$xml_album->appendChild( $xml_track );

$xml->appendChild( $xml_album );

// Serialize the document and print the XML.
print $xml->saveXML();

?>

Output (indented here for readability, XML declaration omitted):
<Album>
  <Track length="0:01:15" bitrate="64kb/s" channels="2">
    The ninth symphony
    <Note>
      The last symphony composed by Ludwig van Beethoven.
    </Note>
  </Track>
  <Track length="0:01:33" bitrate="64kb/s" channels="2">Highway Blues</Track>
</Album>

If you want your PHP->DOM code to run under the .xml extension, you should set up your web server to run .xml files through PHP (refer to the PHP installation/configuration documentation on how to do this).

Note that this:
<?php
$xml = new DOMDocument( "1.0", "ISO-8859-15" );
$xml_album = $xml->createElement( "Album" );
$xml_track = $xml->createElement( "Track" );
$xml_album->appendChild( $xml_track );
$xml->appendChild( $xml_album );
?>

is NOT the same as this:
<?php
// Will NOT work.
$xml = new DOMDocument( "1.0", "ISO-8859-15" );
$xml_album = new DOMElement( "Album" );
$xml_track = new DOMElement( "Track" );
$xml_album->appendChild( $xml_track );
$xml->appendChild( $xml_album );
?>

although this will work:
<?php
$xml = new DOMDocument( "1.0", "ISO-8859-15" );
$xml_album = new DOMElement( "Album" );
$xml->appendChild( $xml_album );
?>
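
The difference comes down to when the element is attached to a document: an element created with new DOMElement is read-only until then. A possible way to keep using new DOMElement is therefore to append each element first and add children to the attached node afterwards. The sketch below reuses the element names from the note; the encoding and the $failed flag are choices made for this example only.

```php
<?php
$xml = new DOMDocument('1.0', 'UTF-8');

// Attach the element to the document first; appendChild() returns the attached node.
$album = $xml->appendChild(new DOMElement('Album'));

// The attached element can now take children of its own.
$album->appendChild(new DOMElement('Track'));

echo $xml->saveXML();

// For contrast: try to add a child to an element that was never attached.
$failed = false;
try {
    $detached = new DOMElement('Album');
    $detached->appendChild(new DOMElement('Track'));
} catch (DOMException $e) {
    $failed = true; // the detached element refused the modification
}
var_dump($failed);
?>
```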
