# aicodingpricing leaderboard — final SEO/product recheck

Date: 2026-06-05 UTC
Task: t_3b8b5df8
Site: https://aicodingpricing.com
Scope:
- https://aicodingpricing.com/llm-leaderboard
- https://aicodingpricing.com/best-llm-for-coding
- https://aicodingpricing.com/coding-agent-cost-calculator

## Verdict

SEO_NO_GO.

Production is technically crawlable and the P0 pages are live, but launch gate should not pass until one data-integrity blocker is fixed: `/llm-leaderboard` visibly maps the SWE-bench `56.20%` score to `GPT-5.4 mini` while the checked SWE-bench source names the row as `GPT-5 Mini`. The page caveat says the benchmark mapping remains unverified, but the numeric benchmark is still displayed in the model row. This violates the freeze rule: exact model/source alignment not verified => mark benchmark as `source needs recheck` or `not publicly benchmarked`, not a numeric value.

## Production technical evidence

Audit command/output stored in worker evidence:
- `/root/.hermes/kanban/boards/site-review/workspaces/t_3b8b5df8/audit_production.py`
- `/root/.hermes/kanban/boards/site-review/workspaces/t_3b8b5df8/production_audit.json`

| URL | HTTP | Indexable | Title | Meta desc | H1 | H2/H3 | Words | Canonical | Schema | Sitemap |
|---|---:|---|---:|---:|---:|---:|---:|---|---|---|
| `/llm-leaderboard` | 200 | yes | 69 | 142 | 1 | 7/23 | 2233 | self | WebPage, BreadcrumbList, FAQPage, Dataset | yes |
| `/best-llm-for-coding` | 200 | yes | 56 | 142 | 1 | 6/13 | 1094 | self | WebPage, BreadcrumbList, FAQPage | yes |
| `/coding-agent-cost-calculator` | 200 | yes | 62 | 139 | 1 | 6/7 | 902 | self | WebPage, BreadcrumbList, FAQPage, SoftwareApplication | yes |

Supporting files:
- `/robots.txt`: 200; allows `/`, blocks only `/_state/`; declares sitemap.
- `/sitemap.xml`: 200; contains all 3 scoped URLs.
- `/indexnow-key.txt`: 200; key file visible.
- All scoped pages have `og:title`, `og:description`, `og:url`, `twitter:card`.
- No missing image alt found in scoped HTML.

## Freeze contract recheck

### `/llm-leaderboard`

Resolved:
- 200, self canonical, indexable, sitemap included.
- Title matches freeze exactly: `AI Coding Model Leaderboard: Compare Coding LLMs by Cost & Benchmarks`.
- Meta description is in target range and matches intent.
- Visible answer-first block exists near top.
- Methodology/caveat copy visible: `No fake universal score`, `not disclosed`, `not publicly benchmarked`, `partial evidence`, source/freshness language.
- Table is crawlable HTML, not canvas-only.
- Calculator links exist from nav, hero CTA, and row CTA with query params.
- Schema includes WebPage, BreadcrumbList, FAQPage, Dataset.

Waived/minor:
- H1 differs from freeze (`Compare Coding Models by Workflow Cost` vs frozen `AI Coding Model Leaderboard for Coding Agents and Developer Workflows`). This is acceptable as product hero copy because Title, nav, body, and headings still carry the target cluster. Not a blocker.

Still open / blocker:
- `GPT-5.4 mini` row shows `56.20% SWE-bench Verified · medium` while public SWE-bench extract verified `GPT-5 Mini` at `56.20%`, not exact `GPT-5.4 mini`. The caveat is visible, but the numeric value should not be attached to the renamed model row.
- Row-level source display emphasizes `OpenAI API pricing` and does not clearly show a row-level benchmark source/date alongside the benchmark metric. The table intro says every row should show source, last checked, confidence, and data status; this is partly satisfied but weak for benchmark evidence.

Required fix:
- Either rename row to exact benchmark model alias `GPT-5 Mini` and keep pricing unknown/source-recheck if exact API price is not the same row, or keep `GPT-5.4 mini` and change benchmark metric to `source needs recheck` / `not publicly benchmarked` until exact benchmark source exists.
- Add explicit row-level benchmark source/date for each numeric benchmark metric, not only pricing source.

### `/best-llm-for-coding`

Resolved:
- 200, self canonical, indexable, sitemap included.
- Title/meta match intent and length constraints.
- H1 is unique and page has scenario-based answer-first copy.
- Internal links to `/llm-leaderboard` and `/coding-agent-cost-calculator` exist.
- Schema includes WebPage, BreadcrumbList, FAQPage.
- Copy avoids universal winner positioning.

Waived/minor:
- Exact CTA text `Estimate Coding-Agent Cost` is not present; equivalent CTA `Calculate My Coding Cost` and calculator row CTAs exist. Acceptable.
- Visible word count is 1094 vs freeze target 1200–1600. Below target but above the core 800-word indexability gate; not a blocker for this deployment.

Still open:
- If the page reuses the same `GPT-5.4 mini` evidence row/shortlist, the alias-mapping blocker from `/llm-leaderboard` also affects recommendation confidence.

### `/coding-agent-cost-calculator`

Resolved:
- 200, self canonical, indexable, sitemap included.
- Calculator-intent Title/H1/meta present.
- Interactive calculator page exposes model, workflow, input/output tokens, retry, cache, monthly volume assumptions.
- Caveat visible near result: estimate, not billing quote.
- Internal link back to `/llm-leaderboard` exists.
- Schema includes WebPage, BreadcrumbList, FAQPage, SoftwareApplication.

Waived/minor:
- Meta description length is 139 chars, one character below the 140-char preferred threshold. Not a blocker; can be padded in next copy pass.
- Exact copy `Compare Alternatives` not found; equivalent `Compare Models First` / leaderboard return flow exists.

Still open:
- Calculator is SEO/product-ready only after source rows feeding model prefill are corrected for exact alias/data integrity.

## No-fake-claims check

Pass:
- No broad claims like `best LLM overall`, `most authoritative global leaderboard`, `guaranteed cheapest coding model`, `complete ranking`, `100% accurate ranking`, `real billing quote`, or `official benchmark owned by AICodingPricing` found as affirmative claims.
- P0 copy repeatedly says workflow-specific, partial evidence, no universal score, and estimates not billing quotes.
- DeepSeek V4 rows correctly mark exact coding benchmark as `not_publicly_benchmarked` while showing source-backed pricing/context.
- Public source spot checks confirm:
  - OpenAI pricing page lists GPT-5.4 mini at $0.75 input / $4.50 output per 1M tokens.
  - Anthropic pricing page lists Claude Opus 4.5 at $5 input / $25 output, Sonnet 4.5 at $3 / $15, Haiku 4.5 at $1 / $5.
  - DeepSeek docs list V4 Flash/Pro pricing and 1M context.
  - SWE-bench source lists Claude 4.5 Opus 76.80%, Gemini 3 Flash high reasoning 75.80%, Claude 4.5 Sonnet 71.40%, Kimi K2.5 70.80%, GPT-5 Mini 56.20%.

Fail / blocker:
- `GPT-5.4 mini` benchmark row uses the `GPT-5 Mini` SWE-bench score. Caveat text does not fully cure the visible numeric mapping.

## SERP target coverage

Production copy covers the planned clusters:
- `ai coding model leaderboard`: Title, meta, schema/page positioning, leaderboard table.
- `best llm for coding`: dedicated `/best-llm-for-coding` page with scenario recommendations.
- `coding agent cost calculator`: dedicated calculator page with pricing-to-task-cost flow.
- `llm api pricing`, `cheapest coding model`, `coding model benchmark`: represented as supporting terms, but P1 pages still needed for full coverage.

Current SERP check on 2026-06-05:
- Exact site query `site:aicodingpricing.com/llm-leaderboard "AI Coding Model Leaderboard"`: no web results returned by configured search backend yet.
- `"ai coding model leaderboard" "aicodingpricing"`: no aicodingpricing result in top 5; visible competitors include Onyx, Kilo, SWFTE, LiveBench, YouTube.
- `"best llm for coding" "aicodingpricing"`: no aicodingpricing result in top 5; visible competitors include Onyx, Vellum, Tenten AI, Codingscape, Medium.
- `"coding agent cost calculator" "aicodingpricing"`: no aicodingpricing result in top 5; visible results are broader AI/coding cost pages.

Interpretation: this is not a production blocker immediately after deploy, but it confirms the initial plan’s competitive risk. Technical indexability is ready; actual ranking/visibility still requires indexing and data-trust cleanup.

## Internal link / conversion path

Pass:
- `/llm-leaderboard` has hero CTA `Estimate Task Cost` -> `/coding-agent-cost-calculator`.
- Model rows link to `/coding-agent-cost-calculator?model=<model>&workflow=<workflow>`.
- `/best-llm-for-coding` links to `/llm-leaderboard` and calculator paths.
- `/coding-agent-cost-calculator` links back to `/llm-leaderboard`.

Risk:
- The conversion path should not prefill or imply confidence from rows whose benchmark alias is unverified.

## Blockers

P0 blocker:
1. Data integrity / no-fake-benchmark gate: replace the `GPT-5.4 mini` visible SWE-bench numeric value with `source needs recheck` / `not publicly benchmarked`, or split exact benchmark alias (`GPT-5 Mini`) from exact pricing alias (`GPT-5.4 mini`) so no metric is attached to the wrong model.

P1 follow-ups:
1. Add row-level benchmark source/date display for every numeric benchmark metric.
2. Pad calculator meta description from 139 to >=140 chars.
3. Consider changing `/llm-leaderboard` H1 closer to the frozen H1 if search intent underperforms after indexing.
4. Submit/monitor indexing: sitemap already includes pages; GSC/Bing/IndexNow evidence should be checked after remediation.

## Next gate

`SEO_NO_GO -> mojie remediation -> re-run SEO recheck -> then motest final QA`.

Do not promote final product QA as a clean launch gate until the benchmark alias/source-label issue is corrected.