More News Sites Default To Blocking AI Crawlers

Reuters and Time now default to blocking AI bots, allowing only approved crawlers through allowlists, Digiday reports.

Both publishers made the decision in May, joining People Inc. and The Atlantic, which adopted similar setups within the past year.

Reuters says the change hasn’t cost it traffic and has cut what it spends serving bots. Executives credit the added friction with helping push AI companies toward licensing talks.

Why Blocklists Weren’t Enough

Robots.txt works only when crawlers choose to honor it. Digiday cited a Tollbit report finding that 30% of total AI bot scrapes didn’t comply with explicit robots.txt instructions.

Blocking at other levels still has teeth, the executives say. Scrapers that route around blocks pay for workarounds, and that expense is the point.

A blocklist catches only the bots a publisher can name. People Inc. found that switching to an allowlist raised the number of user agents it blocked from about 2,100 to more than 30,000. Lindsay Van Kirk, SVP of innovation, shared the figures at an IAB Tech Lab event in late May.

That scale matches what robots.txt data has shown for months. A BuzzStream analysis we covered in January found 79% of top news publishers block at least one AI training bot. Anthropic’s crawler documentation now warns publishers about the visibility cost of blocking its search bot. In the UK, a new conduct requirement obliges Google to let websites opt out of AI search features.

How Publishers Decide Which Bots To Allow

Blocking by default, a setup sometimes called default-deny, changes the decision from which bots to block to which bots to let in.
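To make the difference concrete, here is a minimal Python sketch of the two policies. The crawler tokens and function names are hypothetical, chosen only for illustration, and none of this reflects any publisher's actual setup. It also matches on the user-agent string alone, which is a simplification because that string can be spoofed.

```python
# Hypothetical crawler tokens, for illustration only.
BLOCKLIST = {"badbot-a", "badbot-b"}
ALLOWLIST = {"googlebot", "bingbot"}


def blocklist_allows(user_agent: str) -> bool:
    """Default-allow: admit every crawler except those named in BLOCKLIST."""
    ua = user_agent.lower()
    return not any(token in ua for token in BLOCKLIST)


def allowlist_allows(user_agent: str) -> bool:
    """Default-deny: admit only crawlers named in ALLOWLIST."""
    ua = user_agent.lower()
    return any(token in ua for token in ALLOWLIST)


if __name__ == "__main__":
    # Compare how each policy treats a known-good, a known-bad,
    # and a never-before-seen crawler.
    for ua in ("Googlebot/2.1", "BadBot-A/1.0", "NeverSeenBeforeBot/0.1"):
        print(f"{ua}: blocklist={blocklist_allows(ua)} "
              f"allowlist={allowlist_allows(ua)}")
```

The unnamed crawler is the case that matters. The blocklist lets it through because nobody listed it, while the allowlist turns it away for the same reason.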

Reuters approves a bot when it offers a “fair value exchange,” head of Reuters Professional Josh London told Digiday. That exchange covers four kinds of value. A bot can pay for content through licensing, send traffic back, keep the site running, or support monetization.

The result is visible in the live Reuters robots.txt file. It lists approved crawlers from Amazon, Google, Bing/Microsoft, Yahoo, and OpenAI, then disallows other bots from most of the site.
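As a rough illustration of that pattern, the sketch below parses a made-up allowlist-style robots.txt with Python's standard urllib.robotparser module. The file contents, the example.com URL, and the may_fetch helper are assumptions for this example. It is not the actual Reuters file, which disallows other bots from most of the site rather than all of it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist-style robots.txt; NOT the actual Reuters file.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: bingbot
Allow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(user_agent: str,
              url: str = "https://example.com/world/story") -> bool:
    """Return True if the parsed rules permit this user agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Named crawlers match their own group; everything else falls
    # through to the wildcard group and is disallowed.
    for agent in ("Googlebot", "bingbot", "ExampleUnlistedBot"):
        print(agent, may_fetch(agent))
```

Note that this check runs on the crawler's side. A bot that never consults robots.txt is unaffected by it, which is why the publishers also block at other levels.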

Why This Matters

Crawler access has worked the same way since robots.txt was created. Every bot gets in unless a publisher names it and blocks it.

Now Reuters and Time are reversing that default, and the People Inc. figures show why. You can’t block a bot you’ve never heard of.

Blocking has costs, though. Block a crawler, and you lose whatever it was sending back, like AI search visibility or referral traffic. That’s why both publishers ask what each bot gives them before letting it in. It’s a question worth asking about your own robots.txt.

Looking Ahead

The publishers are betting there’s strength in numbers. One site blocking AI bots is easy to ignore. The SPUR Coalition is building shared standards for licensing and content use. It grew to 36 organizations this month after adding 30 members. Thirty-six publishers blocking together is harder to dismiss than one.

What’s less clear is who this works for. Reuters came to the table with a newswire business and licensing deals already signed. Smaller publishers face the same choice without that leverage. They can block, but blocking costs AI visibility and doesn’t guarantee anyone shows up to negotiate.

In a deep dive I wrote a few months ago, I found that the payment pools stay small relative to traditional search revenue. If deals only come in for the biggest names, default-deny could stay a big-publisher tool.

Featured Image: Grenar/Shutterstock

By Rose Milev
