Foreword
Many websites support multiple languages, and some serve multiple countries, each with several languages. These language versions might be variations of the main website or exist as standalone applications. Even though they have a lot in common and share most of the functionality, their behavior may vary. This is especially true for content: each country has its own restrictions, rules, and regulations. As a result, pages differ from one version to another, and a search engine like Google should still be able to index the website correctly.
This can be solved with a sitemap. If you haven't heard of sitemaps before, a sitemap is basically a way to tell Google which pages to crawl. There are two ways to implement sitemaps for multilingual and multinational websites:
- use one global sitemap (alternate language pages)
- have a separate sitemap for each country-language version.
URL structure
Before we look at the specific sitemap implementations, I want to highlight one important concept: the URL structure for multilingual and multinational websites. It's very important to get it right at the very beginning, since it might be challenging to change it in the future (although it's possible).
Each specific version of the website should have a country and language code in the URL. The preferred method (used by a lot of companies, but not required) is the following:
- http://www.example.com/en/index.html - English version of the website
- http://www.example.com/us/en/index.html - United States English version of the website
- http://www.example.com/de/de/index.html - German version for Germany
- http://www.example.com/be/fr/index.html - French version for Belgium
In the example above I use the following country and language codes:
- "en" for English
- "us" for the United States
- "de" for Germany and German
- "fr" for French (can also be used for France)
- "be" for Belgium
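The URL convention above can be sketched in a few lines of Python. This is only an illustration: the `build_url` helper and the `BASE_URL` constant are names I made up for this example, and the optional country segment simply mirrors the sample URLs in this section.

```python
from typing import Optional

BASE_URL = "http://www.example.com"  # example host used throughout the article


def build_url(language: str, page: str, country: Optional[str] = None) -> str:
    """Return a URL of the form /<country>/<language>/<page>.

    The country segment is left out for language-only versions.
    """
    segments = [country, language, page] if country else [language, page]
    return BASE_URL + "/" + "/".join(segments)


print(build_url("en", "index.html"))                # http://www.example.com/en/index.html
print(build_url("fr", "index.html", country="be"))  # http://www.example.com/be/fr/index.html
```

Keeping the URL construction in one place like this makes it easier to stay consistent when the same pattern is later reused in sitemaps.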
Country and language codes are standardized: two-letter language codes come from ISO 639-1 and two-letter country codes from ISO 3166-1 alpha-2.
After becoming familiar with the concept of URLs, let's take a look at the actual ways of adding sitemaps.
One global sitemap approach
This approach is especially useful when all country/language versions are variations of the main website. The solution is also described in Google's SEO documentation, where it is known as indicating alternate language pages.
The core idea is to have a base site in the main language. Very often it's an English version with the following URL structure: http://www.example.com/en/index.html. All other websites are considered to be alternatives to the main site, for example, http://www.example.com/de/index.html, http://www.example.com/fr/index.html, etc.
With this approach it's possible to have only one sitemap for all country-language variations. It should be located at the base level: http://www.example.com/sitemap.xml. The sitemap itself should look like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>http://www.example.com/en/index.html</loc>
    <xhtml:link
      rel="alternate"
      hreflang="en"
      href="http://www.example.com/en/index.html"
    />
    <xhtml:link
      rel="alternate"
      hreflang="de"
      href="http://www.example.com/de/index.html"
    />
    <xhtml:link
      rel="alternate"
      hreflang="it-it"
      href="http://www.example.com/it/it/index.html"
    />
  </url>
  <url>
    <loc>http://www.example.com/en/about.html</loc>
    <xhtml:link
      rel="alternate"
      hreflang="en"
      href="http://www.example.com/en/about.html"
    />
    <xhtml:link
      rel="alternate"
      hreflang="de"
      href="http://www.example.com/de/about.html"
    />
    <xhtml:link
      rel="alternate"
      hreflang="it-it"
      href="http://www.example.com/it/it/about.html"
    />
  </url>
</urlset>
As you may recognize from the previous section, we used the language and country codes in the hreflang attribute.
Each page requires a <url></url> block with a <loc> tag indicating the page URL on the "base site", plus an alternate link for each language version.
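Writing these blocks by hand gets tedious quickly, so here is a minimal sketch of how such a sitemap could be generated with Python's standard xml.etree.ElementTree module. The `VERSIONS` mapping and the `build_global_sitemap` function are my own illustrative names, and the sketch assumes that every page exists under the same file name in every version.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Assumed mapping for this example: hreflang value -> URL prefix of that version.
VERSIONS = {
    "en": "http://www.example.com/en/",
    "de": "http://www.example.com/de/",
    "it-it": "http://www.example.com/it/it/",
}


def build_global_sitemap(pages, versions=VERSIONS, base="en"):
    """Return sitemap XML with one <url> block per page of the base site."""
    # Serialize the sitemap namespace as the default one and xhtml with a prefix.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = versions[base] + page
        # One alternate link per version, including the base version itself.
        for hreflang, prefix in versions.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": hreflang, "href": prefix + page},
            )
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_global_sitemap(["index.html", "about.html"]))
```

The output is a single line of XML rather than the indented form shown in the article, which makes no difference to a crawler.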
The downside of this approach is that all variations are tightly coupled to the base version. If a language version contains a page that the main site doesn't, that page has no <url> block in the sitemap and won't be indexed. All websites should have the same set of pages to be indexed properly.
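This coupling can be checked before the sitemap is generated. The sketch below assumes you can get a list of page names per version; the `pages_missing_from_base` function and the sample data are made up for illustration.

```python
def pages_missing_from_base(pages_by_version, base="en"):
    """Map each version to the pages it has that the base version lacks."""
    base_pages = set(pages_by_version[base])
    missing = {}
    for version, pages in pages_by_version.items():
        extra = sorted(set(pages) - base_pages)
        if version != base and extra:
            missing[version] = extra
    return missing


# Hypothetical page inventory for three versions of the site.
site_pages = {
    "en": ["index.html", "about.html"],
    "de": ["index.html", "about.html", "impressum.html"],
    "it-it": ["index.html"],
}

print(pages_missing_from_base(site_pages))  # {'de': ['impressum.html']}
```

Any page reported here would be left out of a global sitemap built around the base site, which is a hint that the multiple sitemaps approach may fit better.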
Multiple sitemaps approach
In some cases, it's not possible to have one sitemap for all countries and languages. This happens when the sites are more independent and have their own pages that don't exist in their siblings. In this case, we are talking about country-language sitemaps (in this context I also call them individual sitemaps).
Create individual sitemaps
In this situation, each individual sitemap should be hosted in its own subfolder, for example, the English sitemap at http://www.example.com/us/en/ and the Spanish sitemap at http://www.example.com/us/es/. Individual sitemaps are no different from the sitemap of a simple one-language website. Here is an example of a sitemap for our "Spanish version":
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/us/es/index.html</loc>
    <lastmod>2018-03-10</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>https://www.example.com/us/es/about.html</loc>
    <lastmod>2018-03-10</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>https://www.example.com/us/es/contact-us.html</loc>
    <lastmod>2018-03-10</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  ...
</urlset>
However, we can end up having a lot of sitemaps for the different country-language versions, for example:
- http://www.example.com/us/en/sitemap.xml
- http://www.example.com/us/es/sitemap.xml
- http://www.example.com/be/fr/sitemap.xml
- http://www.example.com/be/ch/sitemap.xml
- ….
All of them should be submitted to the search engines. Doing that manually requires a lot of work. For that purpose, there is another concept: the sitemap index file.
Create sitemap index
It's a file that stores references to all individual sitemaps to simplify the submission process. A fragment of a sitemap index looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/en/sitemap.xml</loc>
    <lastmod>2004-10-01T18:23:17+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/es/sitemap.xml</loc>
    <lastmod>2005-01-01</lastmod>
  </sitemap>
</sitemapindex>
As you may notice, all reference URLs are stored in the <loc> tag. The sitemap index should be stored under the root folder of the entire website.
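Since the index only lists sitemap locations, it is easy to generate from the list of country-language versions. Here is a rough sketch using Python's standard xml.etree.ElementTree module; the `build_sitemap_index` function, the `versions` list, and the single shared `lastmod` value are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls, lastmod=None):
    """Return sitemap index XML with one <sitemap> entry per sitemap URL."""
    ET.register_namespace("", SITEMAP_NS)  # serialize without a namespace prefix
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
        if lastmod:
            ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical (country, language) pairs that each have an individual sitemap.
versions = [("us", "en"), ("us", "es"), ("be", "fr")]
sitemap_urls = [
    f"http://www.example.com/{country}/{language}/sitemap.xml"
    for country, language in versions
]

print(build_sitemap_index(sitemap_urls, lastmod="2018-03-10"))
```

In a real setup each sitemap would more likely carry its own modification date, but a shared value keeps the sketch short.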
Having a sitemap index allows you to submit all sitemaps at once. It can be done through the Google Webmasters Console or by requesting the following URL: <searchengine_URL>/ping?sitemap=http://www.example.com/sitemap.xml.
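The sitemap location in the ping URL should be URL-encoded. Below is a small sketch that only builds the URL and does not send any request. The `build_ping_url` helper and the `https://searchengine.example` host are placeholders of mine, not a real search engine endpoint.

```python
from urllib.parse import quote


def build_ping_url(search_engine_url, sitemap_url):
    """Return <searchengine_URL>/ping?sitemap=<URL-encoded sitemap location>."""
    encoded = quote(sitemap_url, safe="")  # also encodes ':' and '/'
    return f"{search_engine_url.rstrip('/')}/ping?sitemap={encoded}"


print(build_ping_url("https://searchengine.example", "http://www.example.com/sitemap.xml"))
# https://searchengine.example/ping?sitemap=http%3A%2F%2Fwww.example.com%2Fsitemap.xml
```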
That's pretty much it. I hope this article helped you better understand how to manage multilingual and multinational sitemaps. Happy coding :-) .