
Google Clarifies Googlebot Crawls and Indexes the First 15MB of HTML Content Per Page

Tech giant Google recently clarified how Googlebot works: it crawls and indexes only the first 15MB of HTML content on a page. "Googlebot And The 15 MB Thing" has since become one of the hottest topics among site owners. According to Google, Googlebot stops crawling a page's HTML after 15MB. In simple words, anything beyond this limit is not included in ranking calculations.

Google has also mentioned that any resources referenced in the HTML, including CSS, videos, and images, are fetched separately. After the first 15MB of an HTML file, Googlebot stops crawling and considers only that first 15MB for indexing. The file size limit applies to the uncompressed data. Other crawlers may have different limits. When Googlebot’s 15MB limit was announced, it undoubtedly caused some trepidation, and many wondered whether it was a new Google update.

What is Googlebot?

Googlebot is the web crawler Google uses to collect information and build a searchable index of the web. It has both desktop and mobile crawlers, along with specialized crawlers for videos, images, and news.

Google uses additional crawlers for specific tasks. Each crawler identifies itself with a distinct string of text known as a user agent. Googlebot sees websites as users would in the latest version of the Chrome browser. It also determines what to crawl on a site and how fast.
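To illustrate how a crawler can be recognized by its user agent, here is a minimal Python sketch. The function name `googlebot_token` and the token list are illustrative assumptions, not an official API. A string match like this only shows what a request claims to be, so treat the result as a hint and not as proof that the request came from Google.

```python
from typing import Optional

# Tokens for the crawler types mentioned above, most specific first so that
# "Googlebot-Image" is not reported as plain "Googlebot".
# Assumption: this short list is for illustration and is not exhaustive.
GOOGLEBOT_TOKENS = (
    "Googlebot-Image",
    "Googlebot-Video",
    "Googlebot-News",
    "Googlebot",
)


def googlebot_token(user_agent: str) -> Optional[str]:
    """Return the first Googlebot token found in a user agent string, or None."""
    lowered = user_agent.lower()
    for token in GOOGLEBOT_TOKENS:
        if token.lower() in lowered:
            return token
    return None
```

For example, `googlebot_token("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")` returns `"Googlebot"`, while an ordinary browser user agent returns `None`.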

Google’s Gary Illyes addressed the matter in a blog post titled “Googlebot And The 15 MB Thing.” He explained that the threshold is not a brand-new concept. It has been around for a while; Google is just now making it clear.

What does the 15MB limit mean?

Only a few pages on the web are bigger than 15MB. The median size of an HTML file is about 500 times smaller, at roughly 30 kilobytes. But if you own an HTML page that is larger than 15MB, consider moving some inline CSS and inline scripts to external files. Because external resources are fetched separately, this helps Googlebot crawl and index more of your HTML content.

If the HTML content of a page exceeds 15MB, Googlebot will consider only the first 15MB for indexing. This limit applies to the fetches Googlebot makes for the file types supported by Google Search.
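The effect of the limit can be pictured with a short Python sketch. This is an illustration only, not Google's implementation. It assumes "15MB" means 15,000,000 bytes and "30 kilobytes" means 30,000 bytes, since the exact byte counts are not spelled out here; the function names are hypothetical.

```python
# Assumption: decimal units (1 MB = 1,000,000 bytes, 1 KB = 1,000 bytes).
LIMIT_BYTES = 15_000_000
MEDIAN_HTML_BYTES = 30_000  # median HTML size quoted above


def indexable_portion(html: bytes, limit: int = LIMIT_BYTES) -> bytes:
    """Return the leading part of an uncompressed HTML payload within the limit."""
    return html[:limit]


def ignored_bytes(html: bytes, limit: int = LIMIT_BYTES) -> int:
    """Return how many trailing bytes fall beyond the limit (0 if none)."""
    return max(0, len(html) - limit)


# The limit is 500 times the median HTML size under these assumptions.
print(LIMIT_BYTES // MEDIAN_HTML_BYTES)
```

A page whose uncompressed HTML is 10 bytes over the limit would have exactly those last 10 bytes left out of the portion considered for indexing.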

How to check the size of the page

There are plenty of ways to check the size of a page, but using your own browser and its developer tools is the easiest. First, load the page as you normally would, then open the Developer Tools.

After that, switch to the Network tab and reload the page. You will see all the requests your browser makes to render the page. The top request is usually the one you are looking for. The byte size of the page is shown in the Size column.
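If you prefer a script to the browser, the same check can be sketched in Python with only the standard library. This is a rough sketch under stated assumptions: the limit is taken as 15,000,000 bytes, only gzip compression is handled, and the function names (`uncompressed_size`, `fetch_html_size`, `describe`) are made up for this example. Because the limit applies to uncompressed data, the script measures the body after decompression.

```python
import gzip
import urllib.request

LIMIT_BYTES = 15_000_000  # assumption: "15MB" read as decimal megabytes


def uncompressed_size(body: bytes, content_encoding: str = "") -> int:
    """Return the body size in bytes after undoing gzip compression, if any."""
    if content_encoding.strip().lower() == "gzip":
        body = gzip.decompress(body)
    return len(body)


def fetch_html_size(url: str) -> int:
    """Download a page and return the uncompressed size of its HTML in bytes."""
    with urllib.request.urlopen(url, timeout=30) as response:
        body = response.read()
        encoding = response.headers.get("Content-Encoding", "")
    return uncompressed_size(body, encoding)


def describe(size: int, limit: int = LIMIT_BYTES) -> str:
    """Summarize whether a page size is within the limit."""
    if size > limit:
        return f"{size} bytes: {size - limit} bytes are beyond the limit"
    return f"{size} bytes: within the limit"


# Usage (needs network access; the URL is a placeholder):
# print(describe(fetch_html_size("https://example.com/")))
```

Only the HTML document itself is measured here, since images, CSS, and other referenced resources are fetched separately.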
