Tuesday, June 28, 2022

Over the last few days we’ve received a great many questions about a recent update to our
documentation about Googlebot. Namely, we’ve documented that Googlebot only ever “sees” the first
15 megabytes (MB) when fetching certain file types.
This threshold is not new; it’s been around for many years. We just added it to our documentation
because it might be helpful for some folks when debugging, and because it rarely ever changes.

This limit only applies to the bytes (content) received for the initial request Googlebot makes,
not the referenced resources within the page.
For example, when you open https://example.com/puppies.html, your browser will
initially download the bytes of the HTML file, and based on those bytes it might make further
requests for external JavaScript, images, or whatever else is referenced with a URL in the HTML.
Googlebot does the same thing.
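
To make the distinction concrete, here is a minimal Python sketch that separates the bytes of the initial HTML from the URLs that would be requested afterwards. It is only an illustration: the `ResourceCollector` class, the `referenced_resources` helper, and the set of tags it inspects are assumptions made for this example, not a description of how Googlebot parses pages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect URLs that a client would request separately from the HTML."""

    # Hypothetical selection of (tag, attribute) pairs for this sketch only.
    REFERENCE_ATTRS = {
        ("img", "src"),
        ("script", "src"),
        ("link", "href"),
        ("video", "src"),
        ("source", "src"),
    }

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) not in self.REFERENCE_ATTRS or not value:
                continue
            # Data URIs live inside the HTML itself, so they are not separate fetches.
            if value.startswith("data:"):
                continue
            self.resources.append(urljoin(self.base_url, value))


def referenced_resources(html_text, base_url):
    """Return absolute URLs of resources referenced by the HTML."""
    collector = ResourceCollector(base_url)
    collector.feed(html_text)
    return collector.resources


page = (
    '<html><head><script src="/js/app.js"></script></head>'
    '<body><img src="https://example.com/images/puppy.jpg" alt="cute puppy" />'
    "</body></html>"
)

# Only these bytes belong to the initial request.
print(len(page.encode("utf-8")))
# Each of these URLs is a separate request with its own bytes.
print(referenced_resources(page, "https://example.com/puppies.html"))
```

In this sketch the script and the image are listed as separate URLs, so their bytes are not part of the HTML's own size.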

What does this 15 MB limit mean to me?
Most likely nothing. There are very few pages on the internet that are bigger in size. You, dear
reader, are unlikely to be the owner of one, since the median size of an HTML file is about 500
times smaller: 30 kilobytes (kB). However, if you are the owner of an HTML page that’s over 15 MB,
perhaps you could at least move some inline scripts and CSS dust to external files, pretty please.

What happens to the content after 15 MB?
The content after the first 15 MB is dropped by Googlebot, and only the first 15 MB gets forwarded
to indexing.
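
As a rough illustration of that cut-off, the following Python sketch splits a response body at the limit. The exact byte count is an assumption: the post only says "15 MB", and this example reads that as 15 * 1024 * 1024 bytes. The `split_at_limit` helper is hypothetical and exists only for this example.

```python
# Assumption: "15 MB" is interpreted as 15 * 1024 * 1024 bytes in this sketch.
FETCH_LIMIT_BYTES = 15 * 1024 * 1024


def split_at_limit(body: bytes, limit: int = FETCH_LIMIT_BYTES):
    """Return (kept, dropped): kept is the part that would be forwarded on."""
    return body[:limit], body[limit:]


# A made-up body that is 1000 bytes over the assumed limit.
body = b"a" * (FETCH_LIMIT_BYTES + 1000)
kept, dropped = split_at_limit(body)
print(len(kept), len(dropped))  # 15728640 1000
```

Anything that ends up in `dropped` here corresponds to content that would not reach indexing, which is why important content is safer near the top of a very large file.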

What content types does the 15 MB limit apply to?
The 15 MB limit applies to fetches made by Googlebot (Googlebot Smartphone and Googlebot Desktop)
when fetching file types supported by Google Search.

Does this mean Googlebot doesn’t see my image or video?
No. Googlebot fetches videos and images that are referenced in the HTML with a URL (for example,
<img src="https://example.com/images/puppy.jpg" alt="cute puppy looking very disappointed" />)
separately with consecutive fetches.

Do data URIs add to the HTML file size?
Yes. Data URIs contribute to the HTML file size because their content is embedded in the HTML
file itself.
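
If you want a rough idea of how much of a page is taken up by data URIs, a small Python sketch like the one below can help. The regular expression is a simple heuristic assumed for this example (it looks for `data:` up to the next quote, whitespace, or closing bracket) and will not handle every way a data URI can be written; `data_uri_bytes` is a hypothetical helper name.

```python
import re

# Heuristic for this sketch: a data URI runs from "data:" to the next quote,
# whitespace, closing parenthesis, or closing angle bracket.
DATA_URI_PATTERN = re.compile(r"""\bdata:[^"'\s)>]+""")


def data_uri_bytes(html_text: str) -> int:
    """Return the total number of bytes occupied by data URIs in the HTML."""
    return sum(
        len(match.group(0).encode("utf-8"))
        for match in DATA_URI_PATTERN.finditer(html_text)
    )


inline_page = '<img src="data:image/png;base64,iVBORw0KGgo=" alt="dot">'
external_page = '<img src="https://example.com/images/puppy.jpg" alt="cute puppy">'

total = len(inline_page.encode("utf-8"))
print(data_uri_bytes(inline_page), "of", total, "bytes are data URIs")
print(data_uri_bytes(external_page), "bytes of data URIs in the externally referenced version")
```

The second page references its image by URL, so the image bytes are fetched separately and do not count toward the HTML file size.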

How can I look up the size of a page?
There are a number of ways, but the easiest is probably using your own browser and its Developer
Tools. Load the page as you normally would, then launch the Developer Tools and switch to the
Network tab. Reload the page, and you should see all the requests your browser had to make to
render the page. The top request is what you’re looking for, with the byte size of the page in
the Size column.

For example, in Chrome Developer Tools it might look something like this, with 150 kB in the
Size column:

The Network tab in Chrome Developer Tools

If you’re more adventurous, you can use cURL from a command line:

curl \
  -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36" \
  -so /dev/null https://example.com/puppies.html -w '%{size_download}'
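
If you would rather script the check, the same measurement can be sketched in Python with the standard library. This is only an illustration under stated assumptions: `html_size_bytes` and `exceeds_limit` are hypothetical helper names, the default user agent string is a placeholder, and the limit is read as 15 * 1024 * 1024 bytes because the post does not give an exact byte count.

```python
import urllib.request

# Assumption: "15 MB" is interpreted as 15 * 1024 * 1024 bytes in this sketch.
FETCH_LIMIT_BYTES = 15 * 1024 * 1024


def html_size_bytes(url: str, user_agent: str = "Mozilla/5.0") -> int:
    """Download only the initial response for a URL and return its size in bytes."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return len(response.read())


def exceeds_limit(size_bytes: int, limit: int = FETCH_LIMIT_BYTES) -> bool:
    """Return True if the body is larger than the assumed 15 MB limit."""
    return size_bytes > limit


# Example usage (performs a network request, so it is left commented out):
# size = html_size_bytes("https://example.com/puppies.html")
# print(size, "bytes; over the limit:", exceeds_limit(size))

# A 150 kB page, like the one in the Developer Tools example, is far below the limit.
print(exceeds_limit(150_000))
```

Like the cURL command, this only measures the initial response, not the images, scripts, or stylesheets the page references.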

If you have more questions, you can find us on Twitter and in the Search Central Forums, and if
you need more clarification about our documentation, leave us feedback on the pages themselves.






By Ryan Bullet

