Foundations

The five papers CiteForge is built on.

Every product decision maps to a citable claim—numbers you can audit, not marketing adjectives. Start here when you need to explain why a feature exists.

Cite map

Papers on the left; product surfaces on the right. Lines are schematic—each card below cites the same mapping in prose.

PapersProductGEO · KDDYu · SFESrc coverageChen · probesNestaas · PMA/analyzer/analyzer/authority/citations/probes/defensive
KDD 2024

GEO: Generative Engine Optimization

Aggarwal, Mittal, Murthy, Tian, Patel et al. (KDD 2024)

Why it matters

Foundational. Shows +41% visibility from GEO methods, +115% lift specifically for rank-5 sites. The headline result that says: structural changes can move you from invisible to cited.

Applied in

Content Agent uses Aggarwal's top techniques: Cite Sources, Statistics, Quotation Addition.

Preprint 2025

Structural Feature Engineering for Generative Engine Optimization

Yu et al. (2025)

Why it matters

Operational. Defines six numerical thresholds (heading depth, paragraph length, format diversity, emphasis density, internal-link density, content density) and three engine paradigms (STS, IR, ISG) with per-paradigm weight profiles.

Applied in

Content Analyzer encodes every threshold and paradigm weight verbatim. Semantic preservation guardrail uses Yu's eq. 16 thresholds.

Under review

Source Coverage and Citation Bias in LLM-based vs. Traditional Search Engines

Anonymous (under review, 2025)

Why it matters

Strategic. Shows LLM-SEs cite 37% unique domains vs Google, and that every LLM-SE except ChatGPT prefers less popular domains. Inverts the 'big brands will dominate AEO' objection.

Applied in

Authority Agent targets less-popular but topical domains. Citation Graph computes popularity-controlled lift over Tranco rank.

ICLR 2025

Adversarial Search Engine Optimization for Large Language Models

Nestaas et al. (ICLR 2025)

Why it matters

Defensive. Catalogues PMA attack classes: visual hidden injection, instruction override, preference biasing, cross-page injection. Proves the position effect and the prisoner's dilemma.

Applied in

Defensive Mode detector implements all four attack classes. Enterprise upsell scans both your pages and competitor pages.

What's not in here yet

v2 research ingestion on the roadmap includes deeper Google AI Overviews coverage, ContextCite-style attribution beyond our enterprise defensive tier, and Tranco rank as a first-class popularity prior for the Citation Graph—shipping as those pipelines harden.