Generate and verify BibTeX entries from paper notes, writing `citations/ref.bib` and `citations/verified.jsonl`. **Trigger**: citation, BibTeX, ref.bib, verified.jsonl, references, 引用, 参考文献. **Use when**: `papers/paper_notes.jsonl` already exists and you need traceable citations for prose/LaTeX (each with a url/date/title verification record). **Skip if**: there are no paper notes yet (or this deliverable does not need citations/references). **Network**: automatic verification usually requires network access; without it, record first and mark entries as needing manual verification. **Guardrail**: every BibTeX entry must correspond to one `citations/verified.jsonl` record; prose may only use citation keys that already exist in `citations/ref.bib`.
Installation
After installing, this skill will be available to your AI coding assistant.
Verify installation:
npx agent-skills-cli list
Skill Instructions
name: citation-verifier
description: |
Generate and verify BibTeX entries from paper notes, writing citations/ref.bib and citations/verified.jsonl.
Trigger: citation, BibTeX, ref.bib, verified.jsonl, references, 引用, 参考文献.
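Use when: papers/paper_notes.jsonl already exists and you need traceable citations for prose/LaTeX (each with a url/date/title verification record).
Skip if: there are no paper notes yet (or this deliverable does not need citations/references).
Network: automatic verification usually requires network access; without it, record first and mark entries as needing manual verification.
Guardrail: every BibTeX entry must correspond to one citations/verified.jsonl record; prose may only use citation keys that already exist in citations/ref.bib.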
Citation Verifier
Generate citations/ref.bib and ensure every entry has a traceable verification record in citations/verified.jsonl.
When network access is restricted, prefer a “record now, verify later” workflow: keep URLs/titles consistent and leave a clear verification note.
Input
- `papers/paper_notes.jsonl`
Outputs
- `citations/ref.bib`
- `citations/verified.jsonl`
Workflow (heuristic)
- Collect `bibkey`, `title`, `url`, `year`, `authors` from `papers/paper_notes.jsonl`.
- Write/refresh `citations/ref.bib`:
  - Prefer arXiv-style fields when `arxiv_id`/`primary_category` exist (`eprint`, `archivePrefix`, `primaryClass`).
- Write one verification record per BibTeX entry to `citations/verified.jsonl` with at least: `bibkey`, `title`, `url`, `date`.
- If you cannot verify via network, record a clear `notes` field (e.g., “auto-generated; needs manual verification”) and/or request human confirmation depending on your policy.
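The workflow above can be sketched roughly as follows. This is a minimal illustration and not the code in `scripts/run.py`: the helper names `note_to_bibtex` and `note_to_record` are hypothetical, `authors` is assumed to be a list of strings, the `@article` entry type is an arbitrary choice, and `date` is assumed to be the day the record was written.

```python
def note_to_bibtex(note: dict) -> str:
    """Render one paper note as a BibTeX entry (hypothetical helper)."""
    fields = {
        "title": note["title"],
        "author": " and ".join(note.get("authors", [])),  # assumes a list of names
        "year": str(note.get("year", "")),
        "url": note["url"],
    }
    # Prefer arXiv-style fields when the note carries arXiv metadata.
    if note.get("arxiv_id"):
        fields["eprint"] = note["arxiv_id"]
        fields["archivePrefix"] = "arXiv"
        if note.get("primary_category"):
            fields["primaryClass"] = note["primary_category"]
    # Skip empty fields so the entry never contains `key = {}`.
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@article{{{note['bibkey']},\n{body}\n}}"


def note_to_record(note: dict, today: str, verified: bool = False) -> dict:
    """Build the matching verification record (one per BibTeX entry)."""
    record = {
        "bibkey": note["bibkey"],
        "title": note["title"],
        "url": note["url"],
        "date": today,  # assumption: the date the record was written
    }
    if not verified:
        # Leave a clear note when network verification was not possible.
        record["notes"] = "auto-generated; needs manual verification"
    return record
```

Each record would then be serialized with `json.dumps` and written as one line of `citations/verified.jsonl`.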
Quality checklist
- Every BibTeX entry has a corresponding `verified.jsonl` record.
- No missing `url`/`date`/`title` in verification records.
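Both checklist items can be checked mechanically. The sketch below is one possible way to do it, not part of the shipped script: `bib_keys` and `check_citations` are hypothetical names, and the regular expression is a simplification that assumes entries of the form `@type{key,`.

```python
import json
import re

REQUIRED_FIELDS = ("url", "date", "title")


def bib_keys(bib_text: str) -> set[str]:
    """Extract citation keys from BibTeX text (simplified pattern)."""
    return set(re.findall(r"@\w+\s*\{\s*([^,\s]+)\s*,", bib_text))


def check_citations(bib_text: str, jsonl_text: str) -> list[str]:
    """Return human-readable problems; an empty list means the checklist passes."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    recorded = {r.get("bibkey") for r in records}
    problems = []
    # Checklist item 1: every BibTeX entry has a verification record.
    for key in sorted(bib_keys(bib_text)):
        if key not in recorded:
            problems.append(f"{key}: no verified.jsonl record")
    # Checklist item 2: no record is missing url/date/title.
    for r in records:
        for field in REQUIRED_FIELDS:
            if not r.get(field):
                problems.append(f"{r.get('bibkey')}: missing {field}")
    return problems
```

In practice the two arguments would be the contents of `citations/ref.bib` and `citations/verified.jsonl`, read with `pathlib.Path.read_text`.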
Offline Mode
When network access is restricted, run in offline mode to produce auditable records now, then verify later.
- Generate offline records: `verification_status: offline_generated`
- Verify later (when network is available): `--verify-only`
verification_status
- `offline_generated`: record was generated without network verification (needs later verification)
- `verified_online`: URL/title verified successfully by the script
- `verify_failed`: verification was attempted but failed (network error or title mismatch)
- `needs_manual_verification`: missing/ambiguous fields (e.g., empty `url`/`title`)
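One way to read the four status values is as a small decision function. The sketch below is an interpretation, not the script's actual logic: `classify` is a hypothetical name, the precedence (missing fields checked before offline mode) is an assumption, and so is the case- and whitespace-insensitive title comparison. A `fetched_title` of `None` stands in for a network error.

```python
import re
from typing import Optional


def _normalize(title: str) -> str:
    # Assumed comparison rule: ignore case and collapse runs of whitespace.
    return re.sub(r"\s+", " ", title).strip().lower()


def classify(record: dict, *, offline: bool, fetched_title: Optional[str] = None) -> str:
    """Pick a verification_status for one record (illustrative only)."""
    # Missing or empty url/title cannot be verified automatically.
    if not record.get("url") or not record.get("title"):
        return "needs_manual_verification"
    # Offline mode records now and defers verification.
    if offline:
        return "offline_generated"
    # No title came back: treat it as a failed attempt (e.g., network error).
    if fetched_title is None:
        return "verify_failed"
    # A title mismatch also counts as a failed verification.
    if _normalize(fetched_title) != _normalize(record["title"]):
        return "verify_failed"
    return "verified_online"
```

Under these assumptions, a later `--verify-only` pass corresponds to calling the function again with `offline=False` on records that are still `offline_generated`.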
Script
Quick Start
- `python .codex/skills/citation-verifier/scripts/run.py --help`
- Offline (record now, verify later): `python .codex/skills/citation-verifier/scripts/run.py --workspace <workspace_dir> --offline`
All Options
- `--offline`: do not attempt network verification; write `verification_status=offline_generated`
- `--verify-only`: verify existing `citations/verified.jsonl` records (does not rewrite BibTeX)
- `--verification-note <text>`: stored in `citations/verified.jsonl` `notes`
Examples
- Generate BibTeX + offline verification records:
python .codex/skills/citation-verifier/scripts/run.py --workspace <ws> --offline --verification-note "auto-generated; needs manual verification"
- Later, verify-only (when network is available):
python .codex/skills/citation-verifier/scripts/run.py --workspace <ws> --verify-only
Notes
- Minimal requirement for every verification record: `url`, `date`, `title`.
- The script sanitizes stray/unbalanced `{}` in titles to keep `bibtex` parsing robust.
- The script escapes LaTeX special chars in text fields (`& % $ # _`) and rewrites superscript patterns like `X^N` or `X$^N$` as `X\textsuperscript{N}` to keep LaTeX builds stable.
- URLs are kept raw in BibTeX `url` fields (BibTeX styles wrap them with `\url{...}`); `@misc` uses `howpublished=\url{...}`.
- In offline mode, records are not truly verified; treat `offline_generated` as a to-do for human/network verification.
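The escaping and superscript rewriting described in the notes could look roughly like this. It is a sketch under assumptions rather than the script's real code: `escape_latex` is a hypothetical name, exponents are assumed to be plain word characters, and characters that are already backslash-escaped are left untouched.

```python
import re

# Matches X^N or X$^N$, where X is one word character and N is a run of word characters.
_SUPERSCRIPT = re.compile(r"(\w)(?:\$\^(\w+)\$|\^(\w+))")
# Matches & % $ # _ unless already preceded by a backslash.
_SPECIAL = re.compile(r"(?<!\\)([&%$#_])")


def escape_latex(text: str) -> str:
    """Escape LaTeX specials and rewrite simple superscripts (illustrative only)."""

    def _to_textsuperscript(match: re.Match) -> str:
        exponent = match.group(2) or match.group(3)
        return f"{match.group(1)}\\textsuperscript{{{exponent}}}"

    # Rewrite superscripts first so the `$` delimiters are consumed
    # before the escaping pass would turn them into `\$`.
    text = _SUPERSCRIPT.sub(_to_textsuperscript, text)
    return _SPECIAL.sub(r"\\\1", text)
```

The order of the two passes matters: escaping first would change `X$^N$` into `X\$^N\$`, and the superscript pattern would no longer match cleanly.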
Troubleshooting
Common Issues
Issue: Missing bibkey / missing url in notes
Symptom:
- `citations/ref.bib` is missing entries, or `verified.jsonl` has empty `url`/`title`.
Causes:
- `papers/paper_notes.jsonl` lacks `bibkey`/`url` fields.
Solutions:
- Ensure each core paper note has a stable `bibkey` and a canonical `url`.
- Rerun citation generation after fixing notes.
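Before rerunning, it may help to list which notes are incomplete. The helper below is a hypothetical pre-flight check, not something the skill ships; it assumes one JSON object per line and treats an empty string the same as a missing field.

```python
import json


def notes_missing_fields(jsonl_text: str, required=("bibkey", "url")) -> list:
    """Return (line_number, missing_fields) for every incomplete note."""
    problems = []
    for lineno, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        note = json.loads(line)
        missing = [field for field in required if not note.get(field)]
        if missing:
            problems.append((lineno, missing))
    return problems
```

Passing the contents of `papers/paper_notes.jsonl` returns the line numbers to fix; an empty list means every note has both fields.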
Issue: verification_status=offline_generated
Symptom:
- Records exist but are not truly verified.
Causes:
- `--offline` was used, or network verification was unavailable.
Solutions:
- When network is available, run `--verify-only` to upgrade records.
- Or manually verify and update `citations/verified.jsonl` with notes.
Recovery Checklist
- Every BibTeX entry has a matching `citations/verified.jsonl` record.
- Verification records include `url`, `date`, `title`.
