
Active Frontier Source Registry

This registry tracks where to monitor the active autonomy research frontier and how to query each source. It is a manual-first operating page for perception, SLAM/mapping, world models, VLA/VLM, datasets/benchmarks, and validation.

It owns source discovery metadata only. Candidate status remains in the canonical owners: Perception Coverage Audit, SLAM Coverage Audit, Knowledge Gap Backlog, and Continuous Research Loop.

How To Use This Registry

  1. Scan a source or saved query at its review cadence.
  2. Verify the primary source before routing: paper, project page, dataset page, benchmark page, release note, standard, or official repository.
  3. Dedupe against existing registry rows, audits, canonical pages, and gap backlogs.
  4. Route manually:
    • route-to-backlog: durable gap belongs in Knowledge Gap Backlog.
    • route-to-audit: perception or SLAM evidence belongs in the relevant audit.
    • route-to-loop: process or cadence change belongs in Continuous Research Loop.
    • watch: relevant but not yet actionable; if it creates ongoing research work, place it in the canonical watchlist/P2 row.
    • ignore: out of scope, duplicate, superseded, or insufficiently primary.
  5. Link the registry row to the canonical destination instead of restating the backlog item.

Scope

Included track | Watch for
perception | 3D detection, BEV, occupancy, open-world/OOD, sensor fusion, radar, event, thermal, and robustness methods.
slam-mapping | LiDAR-inertial, visual-inertial, radar, Gaussian/neural SLAM, loop closure, pose graphs, lifelong mapping, and map-change methods.
world-models | Occupancy world models, scene generation, neural simulation, 4D forecasting, diffusion planning, and closed-loop simulation.
vla-vlm | Driving VLM/VLA reliability, spatial reasoning, grounded language, action heads, and closed-loop evaluation.
datasets-benchmarks | Driving, adverse-weather, corruption, FOD, V2X, 4D radar, occupancy, and SLAM datasets/benchmarks.
validation | Closed-loop evaluation, scenario testing, uncertainty calibration, runtime monitoring, safety evidence, and fault injection.

Out of scope: broad company tracking, market intelligence, general operations, and platform maintenance unless they directly affect one of the six active frontier tracks.

Registry Content Model

Each source row uses stable IDs and parseable fields so a future Node checker can validate the page without changing the document model.

Field | Purpose
Source ID | Stable machine-readable identifier.
Source | Human-readable source name.
Source Type | Controlled source class such as preprint, venue, publisher-index, dataset, benchmark, code, model, standard, regulator, industry-lab, or weak-signal.
Authority Tier | T1, T2, T3, or T4.
Source Role | Discovery role: primary evidence, aggregator, code artifact, benchmark, dataset, weak signal, or standards source.
Frontier Tracks | Comma-separated track IDs.
Evidence Objects | Expected primary evidence: paper, DOI, arXiv/OpenReview ID, repo, project page, dataset page, benchmark page, standard, release note.
Native Filters | Native categories, venue filters, search operators, alert options, or API parameters.
Manual Query Pattern | Human-readable query seed.
Watch Method | RSS, API, email alert, saved search, page watch, manual page review, or issue digest.
Cadence | Weekly, monthly, quarterly, event-driven, or conference-cycle.
Last Checked | ISO date.
Next Review | ISO date.
Source Health | Controlled health value.
Automation | high, medium, low, or none.
Verification Rule | Required primary-source check before routing.
Caveats | Short source-specific warnings.

Authority tiers:

  • T1: peer-reviewed venue, official benchmark/dataset, official standard, or regulator source.
  • T2: author/lab page, official repository/model release, industry technical report, or preprint with artifacts.
  • T3: scholarly aggregator or metadata index.
  • T4: social, newsletter, community, vendor, or other weak signal.

Automation values:

  • high: stable API, RSS, export, or predictable identifiers.
  • medium: stable pages, but scraping/page-diff/manual query shaping likely.
  • low: useful manually, brittle for scripts.
  • none: manual-only.

Source health values:

  • active: source works and returns relevant frontier material.
  • degraded: filters, feeds, or result quality have weakened but remain useful.
  • paused: temporarily noisy, unavailable, or off-cycle.
  • retired: no longer useful; replacement or rationale belongs in caveats.
  • moved: source appears relocated; keep old URL until manually verified.
  • stale: review date has expired; content is not automatically wrong.
  • unreachable: last check failed; requires retry or manual review.
  • withdrawn-or-retracted: only set after primary-source confirmation.
  • license-or-access-limited: source exists but use or redistribution is constrained.
  • unknown: automation could not determine status.
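
The controlled values above lend themselves to a small offline check. The following is a minimal sketch of how a row could be validated, not the registry's actual checker: the `RawSourceRow` shape, the function names, and the kebab-case rule for Source IDs are assumptions made for illustration (the kebab-case rule is inferred from the IDs currently in the registry).

```typescript
// Controlled vocabularies copied from the registry lists.
const AUTHORITY_TIERS: readonly string[] = ["T1", "T2", "T3", "T4"];
const AUTOMATION_LEVELS: readonly string[] = ["high", "medium", "low", "none"];
const SOURCE_HEALTH: readonly string[] = [
  "active", "degraded", "paused", "retired", "moved", "stale", "unreachable",
  "withdrawn-or-retracted", "license-or-access-limited", "unknown",
];
const TRACK_IDS: readonly string[] = [
  "perception", "slam-mapping", "world-models",
  "vla-vlm", "datasets-benchmarks", "validation",
];

// Hypothetical parsed row: every value is still the raw cell text.
interface RawSourceRow {
  sourceId: string;
  authorityTier: string;
  frontierTracks: string; // comma-separated track IDs
  lastChecked: string;
  nextReview: string;
  sourceHealth: string;
  automation: string;
}

// True only for real calendar dates written as YYYY-MM-DD.
function isIsoDate(value: string): boolean {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (match === null) return false;
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

// Returns diagnostics only; an empty array means the row passed.
function validateRow(row: RawSourceRow): string[] {
  const problems: string[] = [];
  // Assumption: IDs are lowercase kebab-case, as in the current rows.
  if (!/^[a-z0-9]+(-[a-z0-9]+)*$/.test(row.sourceId)) {
    problems.push("Source ID is not lowercase kebab-case: " + row.sourceId);
  }
  if (AUTHORITY_TIERS.indexOf(row.authorityTier) === -1) {
    problems.push("Unknown Authority Tier: " + row.authorityTier);
  }
  if (AUTOMATION_LEVELS.indexOf(row.automation) === -1) {
    problems.push("Unknown Automation value: " + row.automation);
  }
  if (SOURCE_HEALTH.indexOf(row.sourceHealth) === -1) {
    problems.push("Unknown Source Health value: " + row.sourceHealth);
  }
  const tracks = row.frontierTracks.split(",").map((t) => t.trim());
  for (const track of tracks) {
    if (TRACK_IDS.indexOf(track) === -1) {
      problems.push("Unknown track ID: " + track);
    }
  }
  if (!isIsoDate(row.lastChecked)) {
    problems.push("Last Checked is not an ISO date: " + row.lastChecked);
  }
  if (!isIsoDate(row.nextReview)) {
    problems.push("Next Review is not an ISO date: " + row.nextReview);
  }
  return problems;
}

// Usage with values taken from the arxiv-cs-cv registry row.
const arxivCsCv: RawSourceRow = {
  sourceId: "arxiv-cs-cv",
  authorityTier: "T2",
  frontierTracks: "perception, world-models, vla-vlm, datasets-benchmarks",
  lastChecked: "2026-05-12",
  nextReview: "2026-05-19",
  sourceHealth: "active",
  automation: "high",
};
const arxivProblems = validateRow(arxivCsCv); // no diagnostics for this row
```

Returning diagnostics rather than throwing keeps the check in line with the manual-first model: a failed row is reported for human review and nothing is edited.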

Source Categories

Category | Example sources | Role
Preprints and metadata | arXiv, OpenReview, Semantic Scholar, OpenAlex, Crossref, DBLP | Early discovery and metadata cross-checking.
Core venues | CVPR, ICCV, ECCV, WACV, NeurIPS, ICML, ICLR, CoRL, RSS, ICRA, IROS, IEEE IV, ITSC | Accepted paper and workshop tracking.
Publisher indexes | IEEE Xplore, ACM Digital Library, SpringerLink, ScienceDirect, SAGE/IJRR, SAE | Final proceedings, journals, robotics/control/sensor papers.
Datasets and benchmarks | Waymo Open Dataset, nuScenes, Argoverse, KITTI/SemanticKITTI, nuPlan, CARLA Leaderboard, NAVSIM, Bench2Drive, OpenLane/OpenLane-V2, V2X, adverse-weather, corruption, and SLAM benchmarks | Dataset and evaluation freshness.
Code and model artifacts | GitHub, Hugging Face, official project pages, OpenMMLab, OpenPCDet, Autoware/ROS ecosystems | Reproducibility and release signals.
Simulation and validation infrastructure | CARLA, CommonRoad, Scenic, MetaDrive, Waymax, ASAM OpenSCENARIO, ASAM OpenODD, ASAM OpenDRIVE | Closed-loop and scenario-evidence monitoring.
Safety and regulatory frontier | ISO 21448/SOTIF, ISO 26262, UL 4600, SAE, UNECE, NHTSA ADS materials | Validation, assurance, and standards signals.
Industry research | Waymo, NVIDIA, Waabi, Toyota/Woven, Mobileye, Bosch, Aurora, Zoox, Tesla AI | Production-adjacent technical signals.
Sensor frontier | LiDAR, 4D radar, FMCW LiDAR, event-camera, and thermal vendors; SPIE/SAE/IEEE sensor papers; patents and application notes | Hardware-driven perception and SLAM frontier signals.
Weak signals | Lab pages, newsletters, challenge pages, curated GitHub lists, social/community posts | Discovery only; never sufficient for promotion.

Source Registry

Source ID | Source | Source Type | Authority Tier | Source Role | Frontier Tracks | Evidence Objects | Native Filters | Manual Query Pattern | Watch Method | Cadence | Last Checked | Next Review | Source Health | Automation | Verification Rule | Caveats
arxiv-cs-cv | arXiv cs.CV | preprint | T2 | preprint discovery | perception, world-models, vla-vlm, datasets-benchmarks | paper, arXiv ID, project page, repo | cat:cs.CV, recent submissions | autonomous driving plus occupancy, BEV, 3D detection, OOD, VLM, world model | RSS/API | weekly | 2026-05-12 | 2026-05-19 | active | high | Verify paper page and artifact links before routing. | Preprint quality varies; dedupe against conference versions.
arxiv-cs-ro | arXiv cs.RO | preprint | T2 | preprint discovery | slam-mapping, validation, datasets-benchmarks | paper, arXiv ID, repo, dataset page | cat:cs.RO, recent submissions | SLAM, odometry, mapping, pose graph, closed-loop evaluation, benchmark | RSS/API | weekly | 2026-05-12 | 2026-05-19 | active | high | Verify robotics relevance and primary artifacts before routing. | Many generic robotics papers need AV relevance filters.
arxiv-eess-iv | arXiv eess.IV | preprint | T2 | preprint discovery | perception, datasets-benchmarks | paper, arXiv ID, repo | cat:eess.IV, recent submissions | image/video perception, event camera, thermal, sensor fusion, driving scenes | RSS/API | weekly | 2026-05-12 | 2026-05-19 | active | high | Verify driving or robotic perception fit before routing. | Medical and generic image papers require negative filtering.
arxiv-eess-sy | arXiv eess.SY | preprint | T2 | preprint discovery | validation, slam-mapping | paper, arXiv ID, repo | cat:eess.SY, recent submissions | control, safety, runtime verification, uncertainty, fault injection, localization | RSS/API | weekly | 2026-05-12 | 2026-05-19 | active | high | Verify autonomy or safety-case relevance before routing. | Broad systems/control category can be noisy.
openreview-venues | OpenReview venues | venue | T1 | open-review venue discovery | world-models, vla-vlm, perception, validation | paper, OpenReview ID, decision, reviews | venue invitation, title, abstract, decision | ICLR, ICML, NeurIPS, CoRL with driving, embodied, world model, VLA, robustness | API/manual | weekly | 2026-05-12 | 2026-05-19 | active | medium | Verify venue decision and official paper page before routing. | Venue fields differ and API versions vary.
cvf-openaccess | CVF Open Access | venue | T1 | proceedings discovery | perception, world-models, vla-vlm, datasets-benchmarks | paper, proceedings page, project page, repo | conference year, title search | CVPR, ICCV, ECCV, WACV with occupancy, BEV, 3D, driving, VLM, dataset | manual/page watch | conference-cycle | 2026-05-12 | 2026-08-12 | active | medium | Verify official CVF page and artifact links before routing. | Static pages are stable but not a clean API.
pmlr-proceedings | PMLR Proceedings | venue | T1 | proceedings discovery | world-models, vla-vlm, validation | paper, proceedings page, PDF | proceedings volume, title search | CoRL, ICML, AISTATS with robotics, driving, world model, policy, evaluation | manual/page watch | monthly | 2026-05-12 | 2026-06-12 | active | medium | Verify proceedings record and project artifacts before routing. | Some relevant venues use OpenReview instead.
rss-proceedings | Robotics: Science and Systems | venue | T1 | proceedings discovery | slam-mapping, validation, datasets-benchmarks | paper, proceedings page, project page, repo | conference year, title search | SLAM, mapping, localization, multi-robot, benchmark, field robotics | manual/page watch | conference-cycle | 2026-05-12 | 2026-08-12 | active | medium | Verify official RSS paper page before routing. | Proceedings page structure changes by year.
icra | IEEE RAS ICRA | venue | T1 | proceedings discovery | slam-mapping, perception, validation, datasets-benchmarks | paper, DOI, IEEE record, project page | program, title, keywords, IEEE Xplore | SLAM, odometry, mapping, field robotics, planning, evaluation, datasets | manual/page watch | conference-cycle | 2026-05-12 | 2026-08-12 | active | low | Verify official program or IEEE record before routing. | Final metadata often lands in IEEE Xplore.
iros | IEEE RAS IROS | venue | T1 | proceedings discovery | slam-mapping, perception, validation | paper, DOI, IEEE record, project page | program, title, keywords, IEEE Xplore | robust perception, SLAM, localization, mapping, multi-robot, safety | manual/page watch | conference-cycle | 2026-05-12 | 2026-08-12 | active | low | Verify official program or IEEE record before routing. | Program pages and accepted-paper lists vary by year.
ieee-xplore | IEEE Xplore | publisher-index | T1 | publisher record | perception, slam-mapping, validation, datasets-benchmarks | DOI, publisher page, abstract, venue | publication title, year, author, IEEE terms | IV, ITSC, ICRA, IROS, T-RO, RA-L with driving, SLAM, sensor fusion, safety | saved search/API | monthly | 2026-05-12 | 2026-06-12 | active | medium | Verify publisher record and avoid relying on abstract-only claims. | API access requires credentials; many records are paywalled.
acm-dl | ACM Digital Library | publisher-index | T1 | publisher record | vla-vlm, world-models, validation | DOI, publisher page, conference page | title, abstract, venue, date | embodied AI, HRI, simulation, evaluation, foundation models, safety | saved search/manual | monthly | 2026-05-12 | 2026-06-12 | active | low | Verify DOI and official paper page before routing. | Less central for AV but useful for HRI and embodied-AI overlap.
semantic-scholar | Semantic Scholar | publisher-index | T3 | metadata aggregator | perception, slam-mapping, world-models, vla-vlm, validation | paper metadata, DOI, arXiv ID, citations | query, field of study, date | exact method names, citation trails, related work, code links | API/manual | weekly | 2026-05-12 | 2026-05-19 | active | medium | Use only to discover or cross-check primary sources. | Aggregator metadata can lag or merge records incorrectly.
openalex | OpenAlex | publisher-index | T3 | metadata aggregator | perception, slam-mapping, world-models, validation | DOI, venue, authors, concepts | works search, concept, date | topic discovery, publication date checks, DOI lookup | API/manual | weekly | 2026-05-12 | 2026-05-19 | active | high | Use only to discover or cross-check primary sources. | Concept tags are broad and require manual validation.
dblp | DBLP | publisher-index | T3 | bibliographic index | perception, slam-mapping, world-models, vla-vlm | venue record, author record, DOI link | author, title, venue, year | venue completeness, author/lab follow-up, accepted paper cross-check | manual/API | monthly | 2026-05-12 | 2026-06-12 | active | medium | Use for bibliographic cross-checking, not promotion alone. | Does not provide technical evidence.
github-search | GitHub Search | code | T2 | code artifact discovery | perception, slam-mapping, world-models, vla-vlm, datasets-benchmarks | repo, release, license, README, issues | topic, language, stars, pushed date | autonomous-driving occupancy SLAM VLM dataset benchmark stars pushed | API/manual | weekly | 2026-05-12 | 2026-05-19 | active | high | Verify official repo link from paper/project page before routing. | Stars are weak evidence; do not clone arbitrary repos in CI.
huggingface-papers | Hugging Face Papers | model | T3 | model and paper discovery | world-models, vla-vlm, perception, datasets-benchmarks | paper, model card, dataset card, license | papers, models, datasets, date | driving VLM, robot foundation model, world model, occupancy, dataset | manual/API | weekly | 2026-05-12 | 2026-05-19 | active | medium | Verify canonical paper and license before routing. | Trending pages are discovery signals only.
waymo-open | Waymo Open Dataset | dataset | T1 | dataset and challenge source | datasets-benchmarks, perception, world-models, validation | dataset page, challenge page, paper, license | challenge, task, release note | 3D detection, motion, occupancy, E2E, simulation, challenge updates | page watch/manual | monthly | 2026-05-12 | 2026-06-12 | active | medium | Verify official release notes and task definitions before routing. | Terms and benchmark splits matter for claims.
nuscenes | nuScenes | dataset | T1 | dataset and benchmark source | perception, datasets-benchmarks, validation | dataset page, benchmark, devkit, license | task, benchmark, release | detection, tracking, prediction, occupancy, map, sensor fusion | page watch/manual | monthly | 2026-05-12 | 2026-06-12 | active | medium | Verify benchmark task and devkit version before routing. | Older but still central; watch for derivative benchmarks.
argoverse | Argoverse | dataset | T1 | dataset and benchmark source | datasets-benchmarks, perception, world-models, validation | dataset page, benchmark, paper, license | task, benchmark, release | motion forecasting, sensor dataset, maps, E2E, scenario mining | page watch/manual | monthly | 2026-05-12 | 2026-06-12 | active | medium | Verify official task page before routing. | Derivative papers may use outdated versions.
kitti | KITTI | dataset | T1 | legacy benchmark source | perception, slam-mapping, datasets-benchmarks | dataset page, benchmark page, paper | task, benchmark, raw data | odometry, object detection, tracking, semantic scene completion | manual | quarterly | 2026-05-12 | 2026-08-12 | active | low | Verify task relevance and avoid over-weighting legacy leaderboard claims. | Useful baseline but not representative of modern AV stacks alone.
carla-leaderboard | CARLA Leaderboard | benchmark | T1 | closed-loop benchmark source | validation, world-models, vla-vlm | benchmark page, paper, code, scenario definitions | leaderboard, challenge, route, scenario | closed-loop driving, leaderboard, agent robustness, scenario evaluation | page watch/manual | monthly | 2026-05-12 | 2026-06-12 | active | medium | Verify scenario version and leaderboard rules before routing. | Synthetic benchmark claims need real-world caveats.
navsim | NAVSIM | benchmark | T1 | closed-loop evaluation source | validation, world-models, vla-vlm | benchmark page, paper, code | task, metric, release | open-loop and closed-loop planning, E2E evaluation, driving score | page watch/manual | monthly | 2026-05-12 | 2026-06-12 | active | medium | Verify official task and metric definitions before routing. | Benchmark versions can change quickly.
bench2drive | Bench2Drive | benchmark | T1 | closed-loop benchmark source | validation, vla-vlm, world-models | benchmark page, paper, code | task, metric, scenario | closed-loop driving, VLM/VLA evaluation, long-tail scenarios | page watch/manual | monthly | 2026-05-12 | 2026-06-12 | active | medium | Verify official benchmark page and code before routing. | CARLA-based results need sim-to-real caveats.
asam-openx | ASAM Standards | standard | T1 | scenario and ODD standard source | validation, datasets-benchmarks | standard page, release notes, schema | OpenSCENARIO, OpenODD, OpenDRIVE | scenario testing, ODD, road network, simulation evidence | page watch/manual | quarterly | 2026-05-12 | 2026-08-12 | active | low | Verify official standard version before routing. | Access and versioning differ across standards.
nhtsa-ads | NHTSA Automated Vehicles Safety | regulator | T1 | regulator source | validation | official guidance, rulemaking, reporting page | ADS, crash reporting, guidance, exemptions | ADS safety, reporting, post-market monitoring, incident evidence | page watch/manual | monthly | 2026-05-12 | 2026-06-12 | active | low | Verify official docket, rule, or guidance page before routing. | Date-sensitive; do not infer final rules from proposals.
nvidia-research | NVIDIA Research | industry-lab | T2 | industry research source | perception, world-models, vla-vlm, validation | paper, project page, code, dataset, model | publication page, project page, date | Cosmos, simulation, 3D perception, world models, autonomous driving | page watch/manual | monthly | 2026-05-12 | 2026-06-12 | active | medium | Verify paper/project artifacts and license before routing. | Product pages and research pages should not be mixed.
waymo-research | Waymo Research | industry-lab | T2 | industry research source | perception, world-models, datasets-benchmarks, validation | paper, dataset, benchmark, technical report | publication page, date, topic | safety, simulation, datasets, planning, perception, E2E driving | page watch/manual | monthly | 2026-05-12 | 2026-06-12 | active | medium | Verify official paper or dataset page before routing. | Company claims should stay attributed.
researchgate | ResearchGate | weak-signal | T4 | author follow-up source | perception, slam-mapping, world-models, vla-vlm, datasets-benchmarks, validation | author page, project update, paper pointer | author, topic, project | exact author or method follow-up after primary source discovery | manual | monthly | 2026-05-12 | 2026-06-12 | active | none | Use only to find a primary source elsewhere. | Not reliable as canonical evidence and not suitable for automation.
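
The Last Checked and Next Review dates in these rows appear to follow the cadence: weekly rows advance 7 days, monthly rows one month, and quarterly and conference-cycle rows three months. That mapping is inferred from the dates, not stated as a rule anywhere on this page. A minimal sketch of the arithmetic, with a hypothetical `nextReview` helper:

```typescript
type Cadence =
  | "weekly"
  | "monthly"
  | "quarterly"
  | "conference-cycle"
  | "event-driven";

// Formats a UTC date as YYYY-MM-DD.
function toIsoDate(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Returns the next review date, or null for event-driven sources,
// which have no fixed interval.
function nextReview(lastChecked: string, cadence: Cadence): string | null {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(lastChecked);
  if (match === null) {
    throw new Error("Not an ISO date: " + lastChecked);
  }
  const year = Number(match[1]);
  const month = Number(match[2]) - 1; // Date.UTC months are zero-based
  const day = Number(match[3]);

  if (cadence === "event-driven") return null;
  if (cadence === "weekly") {
    return toIsoDate(new Date(Date.UTC(year, month, day + 7)));
  }
  // Assumption: quarterly and conference-cycle both mean three months,
  // matching the dates currently in the registry.
  const months = cadence === "monthly" ? 1 : 3;
  // Day 0 of the following month is the last day of the target month,
  // so a review due on the 31st clamps instead of spilling over.
  const lastDay = new Date(Date.UTC(year, month + months + 1, 0)).getUTCDate();
  return toIsoDate(new Date(Date.UTC(year, month + months, Math.min(day, lastDay))));
}

const weekly = nextReview("2026-05-12", "weekly");       // "2026-05-19"
const monthly = nextReview("2026-05-12", "monthly");     // "2026-06-12"
const quarterly = nextReview("2026-05-12", "quarterly"); // "2026-08-12"
```

A checker could compare the computed date with the recorded Next Review and report a mismatch as a diagnostic, leaving the correction to a human reviewer.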

Active Frontier Query Bank

Query ID | Track | Source IDs | Query Pattern | Native Filters | Intent | Cadence
perception-occupancy | perception | arxiv-cs-cv, cvf-openaccess, openreview-venues, semantic-scholar | "occupancy" AND "autonomous driving" plus BEV, panoptic, semantic occupancy, open occupancy | cat:cs.CV, CVPR/ICCV/WACV, recent date | Find occupancy perception methods and benchmarks. | weekly
perception-open-world | perception | arxiv-cs-cv, cvf-openaccess, huggingface-papers, github-search | "open-vocabulary 3D" OR "open-world" OR "OOD" OR "anomaly segmentation" plus driving | cat:cs.CV, project/repo links | Find open-world, OOD, and rare-object perception methods. | weekly
perception-radar-sensor | perception | arxiv-eess-iv, ieee-xplore, icra, iros | "4D radar" OR "FMCW LiDAR" OR "event camera" OR "thermal" plus driving or robotics | cat:eess.IV, IEEE IV/ITSC/ICRA/IROS | Find sensor-frontier perception and fusion work. | weekly
slam-lio-radar | slam-mapping | arxiv-cs-ro, icra, iros, rss-proceedings, ieee-xplore | "LiDAR-inertial odometry" OR "radar odometry" OR "visual-inertial SLAM" plus benchmark | cat:cs.RO, ICRA/IROS/RSS, T-RO/RA-L | Find SLAM, odometry, and localization methods. | weekly
slam-map-change | slam-mapping | arxiv-cs-ro, rss-proceedings, semantic-scholar, github-search | "lifelong mapping" OR "map change detection" OR "dynamic object removal" OR "pose graph optimization" | cat:cs.RO, recent date, repo links | Find map hygiene, lifelong mapping, and backend gaps. | weekly
world-models-driving | world-models | arxiv-cs-cv, openreview-venues, cvf-openaccess, pmlr-proceedings, nvidia-research | "driving world model" OR "occupancy world model" OR "4D occupancy forecasting" OR "neural simulation" | cat:cs.CV, OpenReview venues, CVF | Find frontier driving world-model and simulation papers. | weekly
vla-vlm-driving | vla-vlm | arxiv-cs-cv, openreview-venues, huggingface-papers, github-search | "driving VLM" OR "vision-language-action" OR "spatial reasoning" OR "action head" plus driving or robot | cat:cs.CV, OpenReview, model cards | Find driving VLM/VLA methods and reliability benchmarks. | weekly
datasets-adverse-fod | datasets-benchmarks | waymo-open, nuscenes, argoverse, kitti, arxiv-cs-cv, semantic-scholar | "adverse weather dataset" OR "sensor corruption" OR "FOD" OR "V2X dataset" OR "4D radar dataset" | dataset release pages, benchmark pages | Find datasets and benchmarks that should feed audits. | monthly
validation-closed-loop | validation | carla-leaderboard, navsim, bench2drive, openreview-venues, pmlr-proceedings | "closed-loop evaluation" OR "scenario testing" OR "distribution shift" OR "fault injection" | benchmark task, metrics, recent date | Find validation protocols and closed-loop benchmarks. | monthly
validation-standards | validation | asam-openx, nhtsa-ads, ieee-xplore | "OpenSCENARIO" OR "OpenODD" OR "ADS safety" OR "runtime monitor" OR "safety case" | standard version, official guidance, publication date | Find scenario-evidence, safety, and regulatory updates. | monthly

Cross-cutting modifiers:

  • autonomous driving
  • self-driving
  • ego vehicle
  • driving scenes
  • closed-loop
  • ODD
  • sim-to-real
  • long-tail
  • safety-critical

Negative filters:

  • Add NOT medical when image or segmentation queries return medical imaging.
  • Add NOT satellite when perception queries return remote-sensing papers.
  • Add NOT generic LLM when VLM queries lose driving or robotics grounding.
  • Add NOT manipulation when robotics queries are unrelated to vehicle autonomy.
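
A query pattern, the cross-cutting modifiers, and the negative filters combine mechanically into one query seed. The sketch below shows one way to assemble it; `QuerySeed` and `buildQuery` are hypothetical names. The output is a human-readable seed, not a provider-ready string, because each source has its own operator syntax (the arXiv API, for instance, spells exclusion as ANDNOT).

```typescript
// Hypothetical seed shape for one saved query.
interface QuerySeed {
  terms: string[];     // phrases from a Query Pattern, joined with OR
  modifiers: string[]; // cross-cutting modifiers, joined with OR, then ANDed
  negatives: string[]; // negative filters, each excluded with NOT
}

// Wraps multi-word phrases in double quotes so they stay exact matches.
function quotePhrase(term: string): string {
  return /\s/.test(term) ? '"' + term + '"' : term;
}

function buildQuery(seed: QuerySeed): string {
  if (seed.terms.length === 0) {
    throw new Error("A query needs at least one term");
  }
  let query = "(" + seed.terms.map(quotePhrase).join(" OR ") + ")";
  if (seed.modifiers.length > 0) {
    query += " AND (" + seed.modifiers.map(quotePhrase).join(" OR ") + ")";
  }
  for (const negative of seed.negatives) {
    query += " NOT " + quotePhrase(negative);
  }
  return query;
}

// Terms from perception-radar-sensor, one modifier, two negative filters.
const radarSeed = buildQuery({
  terms: ["4D radar", "event camera", "thermal"],
  modifiers: ["driving scenes"],
  negatives: ["medical", "satellite"],
});
// radarSeed is:
// ("4D radar" OR "event camera" OR thermal) AND ("driving scenes") NOT medical NOT satellite
```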

Candidate Routing

This registry owns source discovery only. Candidate status is maintained in the canonical owner, not in this page.

Candidate type | Canonical owner after triage
Perception methods, perception robustness, perception datasets | Perception Coverage Audit
SLAM, odometry, localization, mapping methods, SLAM benchmarks | SLAM Coverage Audit
World models, VLA/VLM, end-to-end driving, simulation frontier | Knowledge Gap Backlog until a dedicated audit exists
Cross-track datasets and benchmarks | Owning domain audit if method-specific; otherwise Knowledge Gap Backlog
Validation, safety evidence, evaluation protocols | Knowledge Gap Backlog until a dedicated validation audit exists

Promotion rule:

A candidate can be promoted only after primary-source verification: canonical paper/DOI/arXiv/OpenReview ID, official repo or project page where available, dataset/benchmark page if relevant, artifact/license/access notes, validation or benchmark claim, and at least one caveat.
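
The promotion rule reads naturally as a checklist. The sketch below reports which items are still missing for a candidate; it is a reviewer aid that promotes nothing itself. The `CandidateEvidence` shape and the `promotionGaps` name are assumptions for illustration, and the sample values are placeholders rather than a real paper.

```typescript
// Hypothetical evidence record filled in by a human reviewer.
interface CandidateEvidence {
  primaryId: string;             // canonical paper, DOI, arXiv, or OpenReview ID
  officialArtifactUrl: string;   // repo or project page; "" when none exists
  needsDatasetPage: boolean;     // true when a dataset or benchmark is relevant
  datasetOrBenchmarkUrl: string; // checked only when needsDatasetPage is true
  accessNotes: string;           // artifact, license, and access notes
  claim: string;                 // validation or benchmark claim
  caveats: string[];
}

// Lists the promotion-rule items that are still missing.
function promotionGaps(candidate: CandidateEvidence): string[] {
  const gaps: string[] = [];
  if (candidate.primaryId.trim() === "") {
    gaps.push("canonical paper/DOI/arXiv/OpenReview ID");
  }
  // officialArtifactUrl is required only "where available", so an empty
  // value is never reported as a gap.
  if (candidate.needsDatasetPage && candidate.datasetOrBenchmarkUrl.trim() === "") {
    gaps.push("dataset/benchmark page");
  }
  if (candidate.accessNotes.trim() === "") {
    gaps.push("artifact/license/access notes");
  }
  if (candidate.claim.trim() === "") {
    gaps.push("validation or benchmark claim");
  }
  if (candidate.caveats.filter((c) => c.trim() !== "").length === 0) {
    gaps.push("at least one caveat");
  }
  return gaps;
}

// Placeholder candidate with no caveat recorded yet.
const draftCandidate: CandidateEvidence = {
  primaryId: "arXiv:0000.00000 (placeholder)",
  officialArtifactUrl: "",
  needsDatasetPage: false,
  datasetOrBenchmarkUrl: "",
  accessNotes: "license not yet stated (placeholder)",
  claim: "placeholder benchmark claim",
  caveats: [],
};
const draftGaps = promotionGaps(draftCandidate); // ["at least one caveat"]
```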

Candidate flow:

source scan -> primary-source verification -> dedupe -> canonical backlog/audit -> atomic page or watchlist

Workflow And Automation

Manual-First Boundary

The registry is a manual-first intake surface. Automation may validate structure, report diagnostics, and produce candidate digests. It must not edit canonical backlogs, route candidates, delete sources, promote work, or replace primary-source evidence with aggregator metadata.

Review Cadence And Gates

  • Weekly: review new candidates and automation notes, dedupe them, and assign a routing decision.
  • Monthly: prune stale watch traces, confirm canonical links still resolve, and close entries that have been routed.
  • Quarterly: review whether source categories, scope boundaries, or automation checks need adjustment in Continuous Research Loop.

Human review is required before any source changes canonical backlog, audit, or loop content. A source may be routed only after primary-source verification and dedupe are complete.

Automation Phases

  • Phase 0: Markdown-only registry.
  • Phase 1: offline Node checker validates IDs, enums, required fields, ISO dates, duplicate IDs/URLs, source-health values, stale review dates, and canonical links.
  • Phase 2: opt-in Node scanner creates review artifacts only: dated Markdown digest, raw-result manifest/cache, and skipped/error summary.
  • Phase 3: scheduled GitHub Action may open a review issue or upload artifacts. It must not commit edits or modify canonical files.
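
Two of the Phase 1 checks, duplicate IDs and stale review dates, are simple enough to sketch. This is an illustration under assumptions, not the planned checker: `ReviewEntry`, `duplicateIds`, and `staleIds` are hypothetical names, and the reference date is passed in so the result stays deterministic.

```typescript
// Hypothetical minimal view of a registry row.
interface ReviewEntry {
  sourceId: string;
  nextReview: string; // ISO date, YYYY-MM-DD
}

// Returns each Source ID that appears more than once, reported once.
function duplicateIds(entries: ReviewEntry[]): string[] {
  // Prototype-free object so IDs cannot collide with built-in keys.
  const counts: { [id: string]: number } = Object.create(null);
  const duplicates: string[] = [];
  for (const entry of entries) {
    const seen = (counts[entry.sourceId] || 0) + 1;
    counts[entry.sourceId] = seen;
    if (seen === 2) duplicates.push(entry.sourceId);
  }
  return duplicates;
}

// Returns IDs whose Next Review date is before the given day.
// ISO dates sort lexicographically, so string comparison is sufficient.
function staleIds(entries: ReviewEntry[], today: string): string[] {
  return entries
    .filter((entry) => entry.nextReview < today)
    .map((entry) => entry.sourceId);
}

// Dates taken from the registry; the repeated row is deliberate.
const sampleEntries: ReviewEntry[] = [
  { sourceId: "arxiv-cs-cv", nextReview: "2026-05-19" },
  { sourceId: "kitti", nextReview: "2026-08-12" },
  { sourceId: "arxiv-cs-cv", nextReview: "2026-05-19" },
];
const repeated = duplicateIds(sampleEntries);        // ["arxiv-cs-cv"]
const overdue = staleIds(sampleEntries, "2026-05-20"); // both arxiv-cs-cv rows
```

Taking `today` as an argument instead of reading the clock keeps the check reproducible in fixture-based tests.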

CI And Fetch Policy

Normal PR checks stay offline and deterministic: schema checks, internal links, tests, and VitePress build.

Live source fetching runs only through manual commands or scheduled jobs. Scheduled fetches must use provider allowlists, short timeouts, bounded retries, low concurrency, per-provider budgets, stable user-agent strings, cached responses where permitted, and secrets only from GitHub Actions or local environment variables.

Missing API keys skip that provider rather than failing the build.
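
The fetch policy can be expressed as explicit configuration plus a planning step that decides which providers run. The sketch below is an assumption-laden illustration: the `ProviderConfig` fields, the limit values, and the environment variable name `S2_API_KEY` are hypothetical, not the registry's real settings.

```typescript
// Hypothetical per-provider limits, kept as configuration.
interface ProviderConfig {
  id: string;
  apiKeyEnv: string | null; // env var holding the key; null if no key is needed
  timeoutMs: number;
  maxRetries: number;
  maxRequestsPerRun: number;
}

interface FetchPlan {
  run: ProviderConfig[];
  skipped: { id: string; reason: string }[];
}

// Decides which providers may be fetched. Nothing is fetched here, and a
// missing key produces a skip entry instead of an error.
function planFetch(
  providers: ProviderConfig[],
  allowlist: string[],
  env: { [name: string]: string | undefined },
): FetchPlan {
  const plan: FetchPlan = { run: [], skipped: [] };
  for (const provider of providers) {
    if (allowlist.indexOf(provider.id) === -1) {
      plan.skipped.push({ id: provider.id, reason: "not on allowlist" });
      continue;
    }
    if (provider.apiKeyEnv !== null) {
      const key = env[provider.apiKeyEnv];
      if (key === undefined || key === "") {
        plan.skipped.push({ id: provider.id, reason: "missing API key" });
        continue;
      }
    }
    plan.run.push(provider);
  }
  return plan;
}

// Illustrative values only.
const providers: ProviderConfig[] = [
  { id: "arxiv-cs-cv", apiKeyEnv: null, timeoutMs: 5000, maxRetries: 2, maxRequestsPerRun: 10 },
  { id: "semantic-scholar", apiKeyEnv: "S2_API_KEY", timeoutMs: 5000, maxRetries: 2, maxRequestsPerRun: 20 },
  { id: "researchgate", apiKeyEnv: null, timeoutMs: 5000, maxRetries: 0, maxRequestsPerRun: 0 },
];
const fetchPlan = planFetch(providers, ["arxiv-cs-cv", "semantic-scholar"], {});
// fetchPlan.run holds arxiv-cs-cv; semantic-scholar is skipped for its
// missing key and researchgate for not being on the allowlist.
```

In a Node job the third argument would typically be `process.env`; passing it in rather than reading it inside the function keeps the planner testable offline.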

Failure, Security, And Terms

External failures produce diagnostics, not content changes. API outages, 403/429 responses, DNS/TLS failures, malformed data, and schema drift should be recorded as unknown or unreachable until manually reviewed.
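
One way to keep failures diagnostic-only is to map each failure kind to a recorded health value and nothing else. The split below is an assumption, not a rule from this page: failures where no usable response arrived map to unreachable, while responses that arrived but could not be interpreted map to unknown. The type and function names are hypothetical.

```typescript
type FailureKind =
  | "api-outage"
  | "http-403"
  | "http-429"
  | "dns"
  | "tls"
  | "malformed-data"
  | "schema-drift";

type RecordedHealth = "unknown" | "unreachable";

// Hypothetical diagnostic record; it never carries a content edit.
interface FetchDiagnostic {
  sourceId: string;
  failure: FailureKind;
  recordedHealth: RecordedHealth;
  needsManualReview: true;
}

// Assumed mapping: uninterpretable responses are "unknown",
// everything else is "unreachable".
function recordedHealthFor(failure: FailureKind): RecordedHealth {
  switch (failure) {
    case "malformed-data":
    case "schema-drift":
      return "unknown";
    default:
      return "unreachable";
  }
}

function toDiagnostic(sourceId: string, failure: FailureKind): FetchDiagnostic {
  return {
    sourceId,
    failure,
    recordedHealth: recordedHealthFor(failure),
    needsManualReview: true,
  };
}

const rateLimited = toDiagnostic("github-search", "http-429");
// rateLimited.recordedHealth is "unreachable"
const drifted = toDiagnostic("openreview-venues", "schema-drift");
// drifted.recordedHealth is "unknown"
```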

No API keys, cookies, signed URLs, or tokens belong in Markdown, VitePress output, logs, or PR jobs. Scheduled jobs should use least permissions, such as contents: read and issues: write.
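
A secret-pattern check over Markdown, build output, or logs can be approximated with a few regular expressions. The patterns below are illustrative and far from exhaustive: they cover GitHub token prefixes, bearer and cookie headers, and AWS-style signed URLs, and a real check would likely need provider-specific additions. `findSecretHits` is a hypothetical name.

```typescript
// Illustrative patterns only; a production check needs a broader set.
const SECRET_PATTERNS: { name: string; pattern: RegExp }[] = [
  // GitHub tokens start with ghp_, gho_, ghu_, ghs_, or ghr_.
  { name: "github-token", pattern: /\bgh[pousr]_[A-Za-z0-9]{36,}\b/ },
  { name: "bearer-header", pattern: /Authorization:\s*Bearer\s+\S+/i },
  { name: "cookie-header", pattern: /\b(?:Set-)?Cookie:\s*\S+/i },
  // AWS-style presigned URLs carry the signature as a query parameter.
  { name: "signed-url", pattern: /[?&]X-Amz-Signature=[0-9a-f]{16,}/i },
];

// Returns the names of the patterns that match, never the matched text,
// so the report itself cannot leak a secret.
function findSecretHits(text: string): string[] {
  const hits: string[] = [];
  for (const { name, pattern } of SECRET_PATTERNS) {
    if (pattern.test(text)) hits.push(name);
  }
  return hits;
}

const cleanHits = findSecretHits("Last Checked: 2026-05-12"); // []
```

Reporting pattern names instead of matches keeps the diagnostic safe to print in CI logs.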

Use official APIs where possible. Do not store copied abstracts, paper bodies, README bodies, fetched PDFs, dataset contents, or large scraped text. Store metadata, URLs, license/access notes, and human-written review notes.

Testing Split

Run on PRs:

  • Registry schema validation.
  • Internal link checks.
  • VitePress build.
  • Fixture-based parser tests.
  • Deterministic diagnostic snapshots.
  • Secret-pattern checks.

Run only manually or on schedule:

  • Live provider smoke tests.
  • Freshness scans.
  • Source-health reports.
  • Candidate discovery digests.

Later Automation Feasibility

Initial high-feasibility automation sources:

  • arXiv API/RSS
  • GitHub Search API
  • OpenAlex API

Medium-feasibility sources:

  • Semantic Scholar API
  • OpenReview API
  • CVF static proceedings pages
  • stable dataset/benchmark pages

Manual-only or weak automation sources:

  • vendor blogs
  • benchmark leaderboards with changing HTML
  • standards pages behind access controls
  • ResearchGate
  • social/news/community sources

Provider limits must be configuration, not hidden constants. arXiv expects polite delays and bounded slices; GitHub search uses separate rate buckets and secondary limits; Semantic Scholar and OpenAlex have key-dependent throughput; OpenReview has venue-specific fields and API-version differences.

Sources

Public research notes collected from public sources.