AI Visibility Report

carbonremote.com

How visible is Carbon Remote to AI agents, LLM-powered search engines, and automated research tools? A comprehensive audit of agent readiness, content signals, and discoverability gaps.

📅 14 May 2026 🔗 www.carbonremote.com 🏗️ Webflow

Agent Readiness Score: 40/100

Carbon Remote has solid traditional SEO foundations and an unusually smart robots.txt. But it's missing the three signals that actually get you surfaced in LLMs: an agent discovery file, structured data, and markdown-negotiable content.

1. Signal-by-Signal Scorecard

Every signal that AI agents, crawlers, and LLM search engines use to discover and understand your site.

| Signal | Status | Impact | Finding |
| --- | --- | --- | --- |
| Agent discovery file | ❌ Missing | Critical | No /llms.txt or /.well-known/llms.txt file. This is the front door for AI models to discover your content. |
| Full content endpoint | ❌ 404 | High | No dedicated markdown content endpoint for bulk ingestion by AI systems. |
| Markdown content negotiation | ❌ Returns HTML | High | Accept: text/markdown requests receive the full JS-heavy HTML page. AI crawlers cannot parse this efficiently. |
| JSON-LD structured data | ❌ None | High | Zero JSON-LD blocks on homepage. No Schema.org entity descriptions for the company, services, or articles. |
| Semantic heading structure | ⚠️ Minimal | Medium | Homepage: 1 h1, 4 h2, 3 h3. Functional but minimal. Most content is laid out via Webflow divs, not semantic elements. |
| robots.txt (AI crawlers) | ✅ Excellent | High | Explicitly allows GPTBot, PerplexityBot, Anthropic, MetaAI, DeepSeekBot, Google-Extended, and Applebot. |
| robots.txt (Scrapers) | ✅ Well-configured | Low | Blocks 100+ known scraping/spam bots aggressively. |
| Sitemap.xml | ✅ Present | Medium | Multiple sitemaps found covering blog posts, service pages, and core site pages. |
| OpenGraph / Twitter Cards | ✅ Configured | Medium | Title, description, and image tags present for both OG and Twitter. |
| Canonical URL | ✅ Clean | Low | Set to www.carbonremote.com — no duplicate issues. |
| Analytics (GTM + GA4) | ✅ Present | - | Both Google Tag Manager and GA4 tracking installed. |
| Google Search Console | ✅ Verified | - | Site verification meta tag present. |
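The machine-checkable signals above can be reproduced with a short script. A minimal sketch using only Python's standard library — the function names, thresholds, and heuristics are illustrative, not part of any formal audit tooling:

```python
import urllib.error
import urllib.request


def has_json_ld(html: str) -> bool:
    """True if the page embeds at least one JSON-LD block."""
    return "application/ld+json" in html


def looks_like_markdown(content_type: str, body: str) -> bool:
    """True if a response to Accept: text/markdown actually returned markdown."""
    return content_type.startswith("text/markdown") and "<html" not in body.lower()


def fetch(url: str, accept: str = "*/*") -> tuple[int, str, str]:
    """Return (status, content_type, body); 4xx/5xx are reported, not raised."""
    req = urllib.request.Request(
        url, headers={"User-Agent": "visibility-audit", "Accept": accept}
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status, resp.headers.get("Content-Type", ""), body
    except urllib.error.HTTPError as e:
        return e.code, "", ""


def audit(base: str = "https://www.carbonremote.com") -> dict[str, bool]:
    """Probe the three critical signals: discovery file, negotiation, JSON-LD."""
    root_status, _, _ = fetch(base + "/llms.txt")
    wk_status, _, _ = fetch(base + "/.well-known/llms.txt")
    md_status, md_type, md_body = fetch(base + "/", accept="text/markdown")
    _, _, home = fetch(base + "/")
    return {
        "agent discovery file": root_status == 200 or wk_status == 200,
        "markdown negotiation": md_status == 200 and looks_like_markdown(md_type, md_body),
        "JSON-LD on homepage": has_json_ld(home),
    }
```

For Carbon Remote today, all three checks would come back False, matching the scorecard.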

2. The Robots.txt Story

This is the standout finding. Carbon Remote's robots.txt is unusually well-configured for the AI era — it blocks known scraping bots aggressively while explicitly welcoming every major AI crawler. The configuration shows real awareness of the AI landscape.

✅ Allowed AI Crawlers

GPTBot (OpenAI/ChatGPT) · PerplexityBot · anthropic-ai (Claude) · MetaAI · DeepSeekBot · Google-Extended · Applebot

🚫 Blocked Scrapers

100+ known bad actors, including BLEXBot, dotbot, EmailCollector, HTTrack, and dozens more. Well-maintained blocklist.
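The policy pattern is simple to express in robots.txt. An abridged sketch — bot names are taken from the findings above, but the directives shown are illustrative of the pattern rather than a copy of Carbon's actual file:

```text
# Welcome major AI crawlers
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: anthropic-ai
Allow: /

# Block known scrapers (the real file lists 100+)
User-agent: BLEXBot
Disallow: /

User-agent: dotbot
Disallow: /

User-agent: HTTrack
Disallow: /
```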

💡 The Catch

AI crawlers have permission, but when they arrive there's nothing structured for them to consume. The front door is open — the library is in a language they can't read.

"Carbon's robots.txt is in the top 5% of sites I've audited for AI crawler configuration. Most companies are either blocking everything or haven't thought about it. Carbon got the policy right — now they need to give the crawlers something useful to read."

— Audit observation, May 2026

3. Content & Site Architecture

Platform & Structure

Built on Webflow (last published 24 Feb 2026). Single-page sections cover Solutions, About, AI, The Carbon Way, Success Stories, and Blog. JS-heavy: animations powered by Webflow's native animation engine and Lottie, with Crisp for live chat.

Page Inventory

Key pages from sitemap analysis:

Core Pages

Home · About Us · How It Works · Success Stories · Contact Us · Careers

Services

Team Augmentation · Talent Hubs · Product Studio · Artificial Intelligence

Content

Blog (44+ articles) · Covering offshoring, engineering productivity, BOT models, talent strategy, digital transformation

Legal

Privacy Policy · Cookies Policy · Terms & Conditions · Legal Notice

Blog Quality Assessment

Carbon's blog is a genuine asset. Articles are long-form (1,000–2,500 words), properly structured with h1/h2 headings, and reference real frameworks like DORA metrics, the SPACE framework, Atlassian research, and GitHub's State of Distributed Development report. These are real thought leadership pieces, not AI-generated filler.

"Engineering Productivity Across Distributed Teams in 2025" — 7 h2 sections, methodology grounded in published research, directly relevant to their ICP (CTOs, VPs of Engineering, technical founders). This is exactly the kind of content AI models should be citing.

Structural Problems

| Issue | Severity | Why It Matters |
| --- | --- | --- |
| Webflow JS shell | High | Navigation, animations, and dynamic content all depend on JavaScript execution. AI crawlers may only read the raw HTML — missing large portions of the rendered content. |
| No semantic outlines | Medium | Content is laid out visually via Webflow divs. LLMs parsing raw HTML see a flat structure with no clear document hierarchy beyond the headings that exist. |
| No blog structured data | Medium | None of the 44+ blog articles have JSON-LD Article markup. No author, datePublished, or about schema — all signals AI models use to assess authority. |
| Single HTML response type | Medium | Every request returns the same JS-heavy HTML regardless of Accept header. No content negotiation for AI-native formats. |

4. Competitive Context

Carbon Remote competes in the remote engineering talent / staff augmentation space. This section maps the competitive landscape for AI visibility.

| Signal | Carbon Remote | Industry Benchmark | Leaders |
| --- | --- | --- | --- |
| llms.txt | ❌ Missing | Rare (under 5%) | Pioneers: smaller agencies adopting early |
| JSON-LD | ❌ None | Mixed (~40%) | Larger platforms with dedicated SEO teams |
| AI crawler allowlist | ✅ Excellent | ~20% have explicit AI policy | Carbon is ahead of most here |
| Blog authority | ✅ Strong | Varies widely | Content quality is a differentiator |
| Agent readiness | 40/100 | 45–55 typical for SMBs | 65+ for tech-forward companies |

Key takeaway: Carbon is in the middle of the pack. The robots.txt configuration puts them ahead of most competitors on policy, but the absence of structured data and agent-native content formats pulls them back to average. The blog quality is a real asset most competitors don't have — it's just not discoverable by AI yet.

5. Recommended Actions (Priority Order)

These are ordered by impact-to-effort ratio. Items 1-3 are high-impact wins that can be implemented in days, not months.

PRIORITY 1

Create llms.txt with blog content indexing (Low Effort)

One text file at the site root. List your key pages with descriptions, then link to your full-content markdown endpoint and blog index. This is the single highest-impact change — it's how ChatGPT, Claude, Perplexity, and Google AI discover what matters on your site. Carbon's 44+ blog articles are the perfect content to expose here.
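The file is plain markdown. A sketch of what Carbon's could look like — page names come from the sitemap inventory above, but the URL paths and one-line descriptions are illustrative placeholders, not verified links:

```markdown
# Carbon Remote

> Remote engineering teams for scaling companies: team augmentation,
> talent hubs, and a product studio.

## Services

- [Team Augmentation](https://www.carbonremote.com/services/team-augmentation): Embedded remote engineers
- [Talent Hubs](https://www.carbonremote.com/services/talent-hubs): Dedicated offshore teams
- [Product Studio](https://www.carbonremote.com/services/product-studio): End-to-end product delivery
- [Artificial Intelligence](https://www.carbonremote.com/services/artificial-intelligence): AI engineering services

## Blog

- [Blog index](https://www.carbonremote.com/blog): 44+ long-form articles on offshoring,
  engineering productivity, BOT models, and talent strategy

## Optional

- [Full content](https://www.carbonremote.com/llms-full.txt): All key content as a single markdown file
```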

PRIORITY 2

Serve markdown versions of blog content (Medium Effort)

Add content negotiation: when an AI crawler requests a blog article with Accept: text/markdown, return a clean markdown version. This turns your blog from invisible to indexable overnight. The articles are already well-structured — extraction from the Webflow CMS should be straightforward.
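Webflow cannot negotiate content types natively, so this logic would sit in a reverse proxy or edge worker in front of the site. A minimal sketch of the decision itself — the function name and the simplified Accept parsing (q-values ignored) are assumptions, not a prescribed implementation:

```python
def negotiate(accept_header: str, html_body: str, markdown_body: str) -> tuple[str, str]:
    """Pick a response body from the request's Accept header.

    Returns (content_type, body). Prefers markdown whenever the client
    explicitly lists text/markdown; q-values are ignored, which is good
    enough for AI crawlers that send it as their primary type.
    """
    # "text/markdown;q=0.9, text/html" -> {"text/markdown", "text/html"}
    accepted = {part.split(";")[0].strip() for part in accept_header.split(",")}
    if "text/markdown" in accepted:
        return "text/markdown", markdown_body
    return "text/html", html_body
```

Browsers keep getting the normal Webflow page; only clients that ask for markdown get the lightweight version.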

PRIORITY 3

Add JSON-LD structured data (Medium Effort)

Start with Article schema on blog posts (author, datePublished, headline, about) and Organization schema on the homepage (name, description, sameAs links). This gives AI models machine-readable entity descriptions — not just raw HTML to parse.
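A sketch of both blocks, dropped into page `<head>` sections via Webflow's custom-code embeds. The headline comes from the article cited in section 3; the datePublished, description, and sameAs values are placeholders to be replaced with real ones:

```html
<!-- Organization schema on the homepage -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Carbon Remote",
  "url": "https://www.carbonremote.com",
  "description": "Remote engineering teams: team augmentation, talent hubs, and product studio.",
  "sameAs": ["https://example.com/replace-with-real-profiles"]
}
</script>

<!-- Article schema on each blog post -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Engineering Productivity Across Distributed Teams in 2025",
  "author": { "@type": "Organization", "name": "Carbon Remote" },
  "datePublished": "2026-01-01",
  "about": ["engineering productivity", "distributed teams"]
}
</script>
```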

PRIORITY 4

Add full-content markdown endpoint (Medium Effort)

A single /full.txt or /llms-full.txt containing all key content in one markdown file. This is the bulk ingestion endpoint that lets AI models consume everything at once. Link it from your llms.txt.
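Once Priority 2's markdown export exists, building this file is a concatenation step. A sketch assuming each article has already been exported as a .md file; the function name and the filename-to-title convention are illustrative:

```python
from pathlib import Path


def build_llms_full(articles_dir: str, header: str) -> str:
    """Concatenate exported markdown articles into one llms-full.txt body.

    Each article's filename (e.g. bot-models.md) becomes its section
    title; articles are separated by horizontal rules for easy chunking.
    """
    parts = [header.rstrip(), ""]
    for md_file in sorted(Path(articles_dir).glob("*.md")):
        title = md_file.stem.replace("-", " ").title()
        parts += ["---", "", f"# {title}", "", md_file.read_text(encoding="utf-8").strip(), ""]
    return "\n".join(parts)
```

The output is a static file, so it can be regenerated on each publish and served from the site root.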

PRIORITY 5

Improve semantic HTML on landing pages (Higher Effort)

Since the site runs on Webflow, this means working within the Webflow designer to replace div-based layouts with proper section/article/nav elements and expand the heading hierarchy. Lower priority because it affects LLM parsing less than the items above — but it improves everything: accessibility, SEO, and AI readability.
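The kind of substitution involved, sketched on a generic section — class names and content are illustrative, not taken from Carbon's actual markup:

```html
<!-- Before: visual-only divs, invisible to outline parsers -->
<div class="section">
  <div class="heading-large">Success Stories</div>
  <div class="card">...</div>
</div>

<!-- After: semantic landmarks LLMs and screen readers can outline -->
<section aria-labelledby="success-stories">
  <h2 id="success-stories">Success Stories</h2>
  <article>...</article>
</section>
```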

6. What This Means for Carbon Remote

Carbon's ICP — CTOs, VPs of Engineering, technical founders — are exactly the buyers who use AI search to research vendors. Queries like "best nearshore engineering teams," "build-operate-transfer staffing model," or "Eastern European tech talent" are high-intent and increasingly answered by LLMs rather than traditional search.

🟢 Strengths

Excellent robots.txt policy · Strong blog with genuine thought leadership · Clean OpenGraph/social metadata · Well-structured sitemap · Analytics infrastructure in place

🟡 Opportunities

Be first-mover on llms.txt in the remote staffing space · Blog content is AI-ready in quality, just not format · Webflow CMS can support structured data additions

🔴 Risks

Competitors adopting AI visibility faster · Blog content invisible when prospects use AI search · Webflow JS dependency limits what AI crawlers can extract today

The gap is not content quality — it's content format. Carbon has the hardest part (genuine expert content) already done. The remaining work is technical plumbing: making that content available in the formats AI systems expect. This is a two-week fix, not a six-month content program.

— Summary assessment, May 2026