Creating A Sitemap For Search Engines
Creating a sitemap XML file is easy using PHP.
Structure Of The Sitemap
XML sitemaps can be submitted to all of the major search engine providers to identify the files within a website that you want to be indexed. The structure of a sitemap is defined at www.sitemaps.org and consists of a set of header information followed by a repeating set of information for each file to be indexed.
The only mandatory item of information is the URL of the file, given in <loc>. All other information, e.g. <changefreq>, is optional.
All of these elements are defined in detail at www.sitemaps.org.
XML
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/one.html</loc>
    <lastmod>2014-03-24</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
  .
  . repeated for each URL
  .
  <url>
    <loc>http://www.example.com/two.php</loc>
    <lastmod>2014-03-24</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
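To illustrate how this structure is consumed, the following minimal sketch reads a sitemap back with PHP's DOM extension and lists each URL together with its optional change frequency. The helper name listSitemapUrls and the sample data are assumptions made for illustration; only the namespace URI comes from the sitemap protocol.

```php
<?php
/* Namespace defined by the sitemap protocol. */
const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';

/* Hypothetical helper: returns an array mapping each <loc> value to its
   <changefreq> value, or null where that optional element is absent. */
function listSitemapUrls($xml)
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    $result = array();
    foreach ($dom->getElementsByTagNameNS(SITEMAP_NS, 'url') as $url) {
        $loc = $url->getElementsByTagNameNS(SITEMAP_NS, 'loc')->item(0);
        if ($loc === null) {
            continue; /* <loc> is mandatory, so skip malformed entries */
        }
        $freq = $url->getElementsByTagNameNS(SITEMAP_NS, 'changefreq')->item(0);
        $result[trim($loc->textContent)] =
            $freq === null ? null : trim($freq->textContent);
    }
    return $result;
}

/* Illustrative sample: the second entry carries only the mandatory <loc>. */
$sample = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/one.html</loc><changefreq>daily</changefreq></url>
  <url><loc>http://www.example.com/two.php</loc></url>
</urlset>';

print_r(listSitemapUrls($sample));
```

Looking elements up by namespace keeps the sketch independent of how the namespace is declared in a particular file.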
Creating The Sitemap
The sitemap is created with PHP's DOM extension: DOMImplementation creates the document, and the structure is then built from an array of URLs to be inserted.
The resulting structure is formatted to look 'pretty' with formatOutput = true and saved to a nominated .xml file with save($outputFile).
The basic sequence of creating the structure is that an element is created within the DOM, e.g. $urlData = $dom->createElement('url'), and then appended as a child to a previously defined element, e.g. $urlSet->appendChild($urlData). That example creates an element with no contained data. If the element has contained data, this can be specified when it is created, e.g. $dom->createElement('changefreq','daily').
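The create-then-append sequence can be sketched in isolation as follows. This is a minimal illustration, not the article's full listing: it uses new DOMDocument() directly for brevity, and the URL with a query string is a made-up example. It also shows one point worth knowing: createElement() does not escape its value argument, so text that may contain an ampersand is safer added through createTextNode().

```php
<?php
$dom = new DOMDocument('1.0', 'UTF-8');

/* Create an element with no contained data and attach it. */
$urlData = $dom->createElement('url');
$dom->appendChild($urlData);

/* Create an element with contained data given at creation time. */
$urlData->appendChild($dom->createElement('changefreq', 'daily'));

/* Hypothetical URL containing an ampersand. It is added as a text node,
   which is escaped when the document is serialized. */
$loc = $dom->createElement('loc');
$loc->appendChild(
    $dom->createTextNode('http://www.example.com/list.php?a=1&b=2'));
$urlData->appendChild($loc);

/* Serialize just the <url> element to inspect the result. */
$fragment = $dom->saveXML($urlData);
echo $fragment, "\n";
```

In the serialized fragment the ampersand appears as &amp;, which is what a valid XML file requires.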
PHP
/* Build a sitemap from an array of URLs and save it to $outputFile.
   Returns true on success, false on failure.
   The function name createSitemap is illustrative. */
function createSitemap(
    $urlList,
    $outputFile)
{
    $xmlSiteMap = new DOMImplementation();
    /* create the basic document */
    $dom = $xmlSiteMap->createDocument();
    /* set the encoding */
    $dom->encoding = "UTF-8";
    /* create the root element in the sitemap namespace */
    $urlSet = $dom->createElementNS(
        'http://www.sitemaps.org/schemas/sitemap/0.9',
        'urlset');
    $dom->appendChild($urlSet);
    foreach ($urlList as $url) {
        $urlData = $dom->createElement('url');
        $urlSet->appendChild($urlData);
        /* the url is the minimum info required; it is added as a
           text node so that characters such as & are escaped */
        $locData = $dom->createElement('loc');
        $locData->appendChild($dom->createTextNode($url));
        $urlData->appendChild($locData);
        /* add any optional elements */
        $urlData->appendChild(
            $dom->createElement('lastmod',
                date("Y-m-d")));
        $urlData->appendChild($dom->createElement(
            'changefreq',
            'daily'));
        $urlData->appendChild($dom->createElement(
            'priority',
            '0.5'));
    }
    /* make it look pretty */
    $dom->formatOutput = true;
    /* save() returns the number of bytes written, or false on failure */
    return $dom->save($outputFile) !== false;
}
Validating The Sitemap
It is useful to validate the structure of the sitemap before submitting it to a search engine.
This is done using the schemaValidate function, supplied with the XML Schema Definition file (.xsd) provided at www.sitemaps.org.
Normally any errors found would be displayed on the screen, but if necessary they can be trapped and handled in whatever way is appropriate.
PHP
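/* Validate a sitemap file against the sitemaps.org schema.
   Returns true if valid, otherwise an array of LibXMLError objects.
   The function name validateSitemap is illustrative. */
function validateSitemap($inputFile)
{
    /* stop validation errors going to the display,
       remembering the previous setting */
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    /* load the sitemap file, then validate it */
    if ($dom->load($inputFile) &&
        $dom->schemaValidate(
            'http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd')) {
        $return = true;
    } else {
        /* get the errors into an array */
        $return = libxml_get_errors();
    }
    /* clear stored errors and restore the previous error handling */
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $return;
}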
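One way of handling trapped errors is to turn them into readable messages. The sketch below is an assumption about how that might be done rather than part of the original listing: the helper name formatXmlErrors is hypothetical, and a tiny inline schema stands in for the sitemap .xsd so that the example runs without network access.

```php
<?php
/* Hypothetical helper: turn LibXMLError objects into readable strings. */
function formatXmlErrors(array $errors)
{
    $lines = array();
    foreach ($errors as $error) {
        $level = $error->level === LIBXML_ERR_WARNING ? 'Warning' : 'Error';
        $lines[] = sprintf('%s on line %d: %s',
            $level, $error->line, trim($error->message));
    }
    return $lines;
}

/* Tiny illustrative schema: a <urlset> holding one or more <loc>. */
$xsd = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="urlset">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="loc" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>';

/* Trap errors, validate a deliberately invalid document, collect messages. */
$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadXML('<urlset><bogus/></urlset>');
$valid = $dom->schemaValidateSource($xsd);
$messages = formatXmlErrors(libxml_get_errors());
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($messages as $message) {
    echo $message, "\n";
}
```

Each LibXMLError object also carries code, column and file properties, which can be added to the message if more detail is needed.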
Submitting The Sitemap
A sitemap can be manually submitted to a search engine website, but the easiest way is to provide an entry in the robots.txt file.
ROBOTS
Disallow: /test/
.
.
.
Sitemap: http://www.example.com/sitemap.xml
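If the robots.txt file is maintained by a script, the Sitemap entry can be added programmatically. The following is a minimal sketch under stated assumptions: the helper name addSitemapToRobots is hypothetical, and a temporary file stands in for the real robots.txt.

```php
<?php
/* Hypothetical helper: append a Sitemap line to a robots.txt file
   unless the same line is already present. Returns true on success. */
function addSitemapToRobots($robotsFile, $sitemapUrl)
{
    $entry = 'Sitemap: ' . $sitemapUrl;
    $lines = is_file($robotsFile)
        ? file($robotsFile, FILE_IGNORE_NEW_LINES)
        : array();
    foreach ($lines as $line) {
        if (trim($line) === $entry) {
            return true; /* already listed, nothing to do */
        }
    }
    $lines[] = $entry;
    return file_put_contents(
        $robotsFile, implode("\n", $lines) . "\n") !== false;
}

/* Example run against a temporary file standing in for robots.txt. */
$robotsFile = tempnam(sys_get_temp_dir(), 'robots');
file_put_contents($robotsFile, "User-agent: *\nDisallow: /test/\n");
addSitemapToRobots($robotsFile, 'http://www.example.com/sitemap.xml');
/* A second call leaves the file unchanged. */
addSitemapToRobots($robotsFile, 'http://www.example.com/sitemap.xml');
$robotsContent = file_get_contents($robotsFile);
echo $robotsContent;
unlink($robotsFile);
```

Checking for an existing entry first means the script can be run after every sitemap rebuild without duplicating the line.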