- Feature RSS feed for civilgrandjury.org /new (newly discovered grand jury reports)
- Feature feat(cgj): discoverer should enumerate Google Drive folders + embedded-Drive iframes
- Fix CGJ: don't nest supporting files under consolidated-split children — use the single end section
- Fix CGJ consolidated splitter: misses numbered 'List of Reports' TOCs without page numbers (Ventura 2023-24)
- Fix CGJ discovery: Ventura per-report 'Ref-0NN'/'Att-0NN' reference & attachment files ingested as standalone reports
- Fix CGJ discovery: Ventura 2025-26 reports mis-titled 'Environment' and mis-filed under 2015-2016 (Elementor h1 layout)
- Fix fix(cgj): generalize respondent place-token/wrong-function gate (matcher + audit + Gavelly web) + re-match ~109 legacy bad links
- Fix fix(cgj): discoverer re-enumerates the same Drive folder once per crawled page
- Fix fix(cgj): phantom pre-1950 jury years from URL digit-strings mis-parsed as years (#analysis shows 1905/1907/1922)
- Fix CGJ: respondent matcher force-assigns a government entity to non-government / unresolvable names; Gavelly doesn't validate the assignment
- Fix CGJ: "Permit Sonoma" (Sonoma County dept) matched to City of Sonoma in 21 reports — should be Sonoma County
- Fix fix: auto-fix code quality issues from global check (2026-06-22)
- Fix Entities: 193 CA cities duplicated as mis-typed special_district/other entities
- CGJ /new: swap Discovered/Published column order so sorted column leads
- Canonicalize civilgrandjury.org: 301 www→apex + pin sitemap to apex
- CGJ: news-first email subscriptions on /news (inverse of /new)
- audit(cgj): frozen batch_pattern/context_pattern respondent matches bypass the cross-domain audit
- fix(cgj): consolidated detector over-splits single reports (section-outline density guard)
- CGJ: per-jury non-statutory report-count audit script (dedup + cross-year + statutory-gap corrections)
- CGJ corpus pollution: kern 2020-2021 (redistricting) + mariposa 2023-2024 (market study) mis-seeded years
- santa-cruz/2020-2021: 2021 redistricting-commission docs mis-crawled into CGJ corpus (maps as reports)
- civilgrandjury.org/search: add 'most recent' sort option (date vs relevance)
- CGJ follow-up: reclassify held-back context docs that carry extracted content
- Widen CGJ NON_REPORT_PATTERNS: rosters, GJ membership, minutes, bare agendas, press releases, jury/applicant demographics, to/from-jury letters
- CGJ R2 publish has no concurrency lock — parallel runs thrash, manifest never rotates
- CGJ: review non_report-titled 'report' rows that have extracted findings/recs (held back by backfill)
- fix(cgj): unify masthead/typography across all page types (two-tier hero system)
- fix(cgj): Shasta 2024-2025 consolidated mis-split — missing 'Shasta County Jail' report, wrong ranges, duplicate Juvenile Rehab
- fix(cgj): Shasta 2025-2026 consolidated final report unflagged & unsplit (TOC parse below auto-flag threshold)
- CGJ: 'Final Report <case-number>' titles not recognized as generic (real title hidden)
- CGJ: strip leading jury-name + year boilerplate from display titles (follow-up)
- fix(cgj): county page news heading → 'This county in the news', drop 'automatically gathered'
- CGJ classifier: truncated/glued 'Respons<Agency>' titles misclassified as reports (Del Norte Drive)
- CGJ /new QA pass: continuity reclassification, Marin/Tulare title artifacts, Solano BOS response misclass
- CGJ: group non-report docs into an 'Additional documents' section on the county-year page
- CGJ header: long title overflows on phones — needs responsive/hamburger treatment
- CGJ homepage/header polish: drop logo divider, larger nav, roomier county index, intro lede
- docs: fix civilgrandjury.org URL examples in CLAUDE.md (/reports/<county> 500s — use /<county>/<year>)
- Redesign /analysis: unify deep reports + 60-topic taxonomy into one page
- CGJ: full-width page header — logo + title left, nav center, language/login right; relabel + reorder menu
- Finding↔recommendation association ignores the number-prefix convention (R1A/R1B→F1)
- Archive the Gavelly Word add-in — keep the code, remove the (dev-only/broken) public release surface
- CGJ rate-limiting gaps: standalone POSTs bypass middleware, Gavelly GPU amplifier, /search full scans
- performance(cgj): fetch_all_news has no LIMIT — unbounded statewide load
- /news: hide redundant county tag when filtered to one county; clarify report link + mark off-site story links
- CGJ monitor detect_cms: a Legistar/Granicus link must not override the host CMS
- /news: sortable Date·Source·Story table + bookmarkable filter/sort; hide entity chips
- Surface "Report a problem with this page" link inline at end of county-page disclaimer
- CGJ monitor: name the specific report years added/removed in website-change alerts
- CGJ email alerts: per-alert Edit/Remove on the /new panel + raise cap to 10
- What's New page: surface a signed-in user's existing email alerts (+ Manage link)
- CGJ website-change monitor: reduce false-positive noise + add debug detail
- fix(cgj): consolidated-PDF splitter produces overlapping page ranges + promotes subtitles to sub-reports
- test(request/cgj): test coverage gaps for messaging, push-token, and tracking routes
- Atlas: dedup + acronym-slug standardization for CA regional agencies (SANDAG/SCAG/MTC/ABAG/BART)
- Document the UnGovr URN scheme on the Den (slash vs colon delimiter)
- Rate-limit hardening: guard test for unbucketed POSTs, forwarded-IP trust, body-size/slowloris, durability, edge rules
- fix(ops): show the app waffle (public apps) to logged-out visitors + nudge geo breadcrumb 1px
- Clean up untracked working-tree backlog (gitignore venvs/nltk/build, delete ephemera)
- Atlas landing: welcome intro band + new /intro getting-started guide
- Rate-limit: outbound email/SMS endpoints slip their tight bucket into the write catch-all
- Codify + apply Cloudflare edge rate-limit rules (request intake + Gavelly) — follow-up
- Login does not return users to the originating page (request.ungovr.ie → /e; www/law nav has no redirect)
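
Note on the finding↔recommendation entry above: the number-prefix convention (R1A/R1B→F1) could be sketched roughly as follows. This is an illustrative sketch only, not the shipped matcher. The function name `finding_for_recommendation` and the label regex are hypothetical, and it assumes labels are a single letter, a number, and an optional letter suffix.

```python
import re
from typing import Optional

# Assumed label shape: F or R, a number, then an optional single-letter suffix.
_LABEL = re.compile(r"^([FR])(\d+)([A-Z]?)$")


def finding_for_recommendation(rec_label: str) -> Optional[str]:
    """Map a recommendation label to the finding sharing its number prefix.

    R1A and R1B both map to F1. Returns None for anything that is not a
    recognizable recommendation label.
    """
    match = _LABEL.match(rec_label.strip().upper())
    if match is None or match.group(1) != "R":
        return None
    return f"F{match.group(2)}"
```

Under these assumptions, `finding_for_recommendation("R1B")` yields `"F1"`, while a finding label such as `"F1"` yields `None`.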
Release notes
What's changed
- Feature feat(api,mcp): product availability on the entity graph — products field + inverted product→coverage API
- Feature www /impact: drop team blurb that duplicates /people
- fix(cgj): consolidated-volume rows leak into the per-report list as 'consolidated report'
- Remove "Development Preview" banner; add page-seeded content-feedback link
- chore: untrack ungovr-cgj egg-info build artifacts (already gitignored)
- Authenticate the civilgrandjury.org subscription-confirmation email
- bug(cgj): ToU gate leaks /ungovr/cgj mount prefix → strands CGJ-login users on ungovr.org
- fix(i18n): es-419 localization gaps on civilgrandjury.org home + footer
- Bump UnGovrBot crawler version to 0.3.53 and unify all UA strings
- Mail relay TLS: internal clients use mail.ungovr.org but certs only cover mail-00X.ungovr.cc (hostname mismatch)
- Fix CGJ static publish can wipe entire R2 site when origin (5002) is down at build time
- Fix CGJ: consolidated reports with non-'consolidated' link titles ingest as single reports and surface raw on /new (no auto-split)
- Fix fix(cgj): /consolidate correctness, performance & test gaps
- Fix security(cgj): harden /consolidate — decompression bomb + dead Turnstile CSRF exemption
- Fix fix(cgj): digest CLI doesn't init core DB → email suppression check + core email-audit log skipped
- Fix fix(cgj): subscription digest cron missing 'doppler run --' → no subscriber emails since launch (regression)
- Fix CGJ titles: 'View the document' not recognized as generic (54 Marin reports show anchor text); Plumas Fairgrounds report mistitled
- Fix reextract_multiprocess.py --county crashes: Pass 0 uses alias 'r' from wrong query
- Fix CGJ /new: title cleanup — run-on cover OCR, (PDF,NNkb) suffix, acronym casing, numbering noise
- Fix CGJ /new: generic crawler titles leak when extracted_title is NULL (backfill + always-extract)
- Fix CGJ /new: agency responses misclassified as reports (classifier blind to generic/typo filenames)
- Fix ungovr1 postgres: recurring backend SIGSEGV in core executor — symptom of hardware instability, no data corruption found
- Fix Rate-limit + audit-log IP collapse: worker-proxied traffic presents one Cloudflare egress IP
- CGJ: wire generated_title into year_detail footnote + audit helper (residuals)
- CGJ: account-based ★ Track this jury + homepage tracked-juries band
- CGJ county page: surface redesign (title, news strip, entities-by-search, breadcrumb, totals)
- CGJ: is_consolidated_title misses underscore-joined titles (ingest detection gap)
- CGJ generated titles not surfaced in /api/new feed, email digest, or JSON API endpoints
- LLM agency-naming title generator for CGJ reports
- CGJ audit: teach the workflow prompt to consume the new discoverer.* fields
- CGJ: fix generic/truncated extracted titles (marin responses; audit cross-county)
- CGJ: audit_consolidated_candidates cron log lines are doubled (StreamHandler + FileHandler + cron redirect to same file)
- CGJ report-audit helper: handle JS-interaction-gated report lists (kern tabs) + title-based DB matching (marin) — false unreadable/missing
- fix(cgj): light/dark variants + larger size for UnGovr popout icon
- fix(cgj): use canonical UnGovr popout icon on county entities page
- Polish county 'Keep me updated' CTA: bigger email icon + brand-blue button
- CGJ /new: link county name in report list to county page
- cgjweb: same Accept-Language 500 in its ported i18n copy (sibling of)
- CGJ season-pace re-check reminder (Jun 25-30): one-shot cron re-runs county-coverage comparison + emails verdict
- CGJ daily monitor: scan all 58 counties per night (raise MAX_REPORT_SCAN_COUNTIES 20→58) for same-day report/response discovery
- CGJ discoverer blind on courts.ca.gov template — Amador + Mono yield pdfs_on_site=0 (miss-risk for 2025-2026 reports)
- CGJ News Tracker — Phase 3: Tier 3 statewide /news
- CGJ News Tracker — Phase 2: Tier 2 county news feed
- CGJ News Tracker — Phase 1: Tier 1 per-report 'In the News' section
- Add 'Keep me updated' subscribe CTA to county grand jury pages + news-stories opt-in
- civilgrandjury.org static assets (logo, favicon, CSS, JS) missing from R2 → broken when origin is down
- Gavelly: warn when a report uses more than one font
- New /consolidate tool: merge reports into one DOCX/PDF with report delimiters that let consolidated reports be split back into individual reports — a primary UnGovr concern
- Gavelly respondent extractor: grouped 'X of the following Y – N days' header drops the member lines
- Gavelly: infer Findings/Recommendations for sole-respondent reports in the contacts XLSX
- Gavelly: respondent section-finder grabs §933 boilerplate, missing Orange County 'Comments to the Presiding Judge … required/requested from:' lists
- Gavelly: respondent entity-matching skipped because county detection is case-sensitive (misses ALL-CAPS headers)
- CGJ classifier: detect agency response LETTERS misclassified as reports (corpus noise)
- Gavelly: download XLSX of required respondents + their UnGovr contacts (stateless, browser-mediated)
- chore(platform): bundle shared maintenance-decision module instead of hand-inlining
- GPU driver down on ungovr1 — Secure Boot MOK enrollment wiped (nvidia module rejected)
- Rebrand UnGovr Verbatim → UnGovr Words (final family name; keep codeline meeting)
- Add UnGovr profile pop-out icon to CGJ entity detail header
- CGJ: report-count sanity check false-positives on split consolidated reports (count excludes consolidated:// children)
- CGJ: Evotiva UserFiles crawler is dead code — reimplement over httpx + wire into discover path (unblinds calaveras + any Evotiva county)
- Link county pages to sheriffoversight.org when an oversight body exists
- CGJ discoverer can't reach calaveras (DotNetNuke) grand-jury report listing
- CGJ reachability audit: false 'missing' alerts from CivicPlus version-tick URL drift (Madera)
- Public-site leak guard misses GitHub issue numbers in CSS/JS comments + skips CGJ templates
- i18n Phase 1: language switcher persists to global user setting + text-only + CGJ/CGJA fixes
- i18n: worker should bypass edge cache for language-negotiated ungovr.org requests
- i18n: Accept-Language detection is dead code (regional catch-all preempts step 5)
- Backup hardening: close disaster-recovery audit gaps
- Meeting: reduce Stage-1 GPU load (diarization dominates; TF32 + batched embeds = ~40% win)
- Feature CGJ topics: per-topic source listings on /analysis/topics/{slug}
- Feature CGJ discovery: per-county reachability audit (PDFs on landing pages vs cgj_reports)
- Feature Top 60 topics page at /analysis/topics
- Feature analysis: link quotes to source report; collapse sources page counties
- Feature F/R extraction: capture lifted recommender/finder prefix on grouped reports
- Feature Reusable entity picker widget for 5-100 items; replace CGJ /new comma-separated input
- Fix CGJ: consolidated audit flags single reports with internal-section TOCs as consolidated (section-TOC false positives)
- Fix CGJ: blank-divider page text leaks into extracted_title (e.g. Homelessness in Nevada County renders as 'This Page Intentionally Blank')
- Fix fix(cgj): /new feed goes blank during re-extraction passes
- Fix CGJ extraction daemon: pdf_url dropped → publication_date stays NULL for newly-discovered reports
- Fix Tighten State Oversight Context relevance bar: oversight/critique + primary topic only
- Fix fix(cgj): CGJ digest header — civilgrandjury.org subtitle auto-links as blue-on-blue
- Fix bug: 28 pre-existing test failures across 6 subsystems (full suite)
- Fix Merge CA C-tail cross-type duplicates (69 clusters across 9 type-pairs)
- Fix Dedup CA entities: slug collisions, abbreviation variants, cross-type pairs
- Remove broken WCAG link from Gavelly accessibility check group
- Add Privacy Policy + Terms of Use links to CGJ and UnGovr Request footers
- CGJ home page: remove stats + drop free-service colophon from main site
- County page search box should say 'Search {County} Reports…'
- Remove WCAG accessibility audit from public view (hub card + audit pages)
- Hide Accessibility Audit link on CGJ county pages
- Broader response-misclassification cleanup across counties
- Detect responses mis-classified as reports (Inyo 2024-2025 case)
- Refresh CGJ home + county pages with civic-almanac aesthetic
- civilgrandjury.org/search: add category pills (jump-to-section + breakdown)
- fix(cgj): report titles extracted from appendix headings — display canonical title, harden extractor
- Tone down red on CGJ dev preview banner
- civilgrandjury.org/search undersells corpus: reports query is title/summary-only, 100-row cap, no relevance sort, no in-result county filter
- Gavelly: relabel 'fact sheet' citations — no such public document exists
- CGJ: link reclassified responses to parents + render response-typed pages distinctly
- CGJ: OCGJ responses misclassified as reports (nested-li DOM signal missing)
- Document unlisted /m/ memo path in ungovr-cgj/CLAUDE.md
- CGJ: detect-and-flag de-facto consolidated reports tagged as single
- Basic/OAuth users bounced to /e on /account/* alias paths (e.g. /account/api-keys)
- Feature Integrate CA Civil Grand Jury investigations of sheriff's office (incl. deaths in custody) into county pages
- Feature feat(law,www): classify broken links body-vs-core, 2-run debounce for body, /fix-urls skill
- Feature feat(law,www): weekly external-link checker — email digest of broken outbound links
- Fix modeleval: Gavelly adapter's model swap is a no-op (module constant read at import time)
- Fix fix(cgj): worker strips per-request CSP nonce, breaks inline JS on civilgrandjury.org (e.g. /analysis tile clicks)
- Fix cgj: report titles polluted by TOC artifacts (bullets, dot leaders, HTML, encoding errors)
- Fix modeleval: Gavelly adapter passes raw Evidence objects to scorer, breaking Jaccard set()
- Fix fix(cgj): wrong parent_report_id linking on response docs blocks LLM tier
- Fix fix(www): 39 broken outbound URLs on www.ungovr.org (sources/us, tech, unclaimed, …)
- Fix law: 5 subnational rows still need source URLs (mh/ts records, ar/s meetings)
- gavelly: harden JSON extraction — pass format=json to Ollama + strip <think> tags
- modeleval: clean 4 missing-text reports from Gavelly pin; evaluate qwen3:30b-a3b and qwen3.5:35b
- CGJ: thread term-window into extract_date_from_pdf_cover_page (prevent in-body year hits)
- CGJ: pdf-cover publication dates outside term window (e.g., 2006 on a 2011-2012 OC report)
- civilgrandjury.org on R2: static HTML + scheduled PDF publishing
- Model evaluation framework: pluggable harness for local + cloud LLM swaps
- Feature Gavelly: detect non-government actors in recommendations (Lompoc/Pajaro pattern)
- Feature Gavelly: detect 'wrong-respondent' recommendations (action-subject vs respondent mismatch)
- Feature Trust PDF /Title metadata for CGJ extraction (current jury year forward); always for Gavelly
- Feature Weekly referrer tracker digest (CF GraphQL → email + Slack)
- Fix CGJ extraction: subsequent reports/responses/continuity bleed into single-report extracted text
- Fix Gavelly: detect duplicate / out-of-sequence finding & recommendation labels
- Convert thematic reports to static dated artifacts (no DB queries on render)
- Gavelly: regression check — every change must compare prior-year report counts
- Gavelly: align response-language guidance with Penal Code § 933.05 wording
- Feature Add California Civil Grand Juries to /law/oversight/us/ca
- Feature Prominent link to civilgrandjury.org from CA grand jury oversight page
- Feature Law data coordination — FACTS 500 coverage & quality (tracking)
- Fix CGJ home page takes 7s to load due to cgj_county_summary view Cartesian explosion
- Fix Gavelly Python score calculation includes gray checks in denominator (caps clean reports below 100)
- Fix cgj_daily_monitor_watchdog: '\n' in alert subject crashes Resend send
- Fix security(cgj-changelog): render_html() passes javascript: URIs through to | safe output
- Fix fix(cgj): Marin sub-page respondents misclassified as 'report' when title is generic
- Fix CGJ: Sierra County low findings rate — junk docs + narrative-only reports + missing GJ terms
- Fix CGJ: Kern County 0 extracted titles — backfill never run + null-byte fix
- Fix CGJ: add term-year-from-year-record fallback to pub date extraction chain
- Fix CGJ: add YY-YY URL path handler for Sacramento-style docs/reports/21-22/ paths
- Fix CGJ: investigate 64 reports with NULL publication_date and no cached PDF
- Fix CGJ: clear 5 zero-byte SJ cached PDFs and mark dead links
- Fix CGJ: Surya OCR fallback for NULL publication_date rows (scanned PDFs)
- Fix Fix false-positive alerts in CGJ website monitor
- Fix bug(cgj): confirm_canonical — no input validation, wrong count returned, self-referential allowed
- Fix bug(cgj): log_audit_event called inside transaction in confirm_canonical
- Fix bug(ops): social_media.py file I/O blocks event loop (threading.Lock + sync reads in async routes)
- Fix mobile: Android build fails on Linux — manifest merger conflict (com.android.support 28 vs AndroidX)
- Move CGJ daily monitor log/state out of /tmp (wiped on reboot)
- cgj: validate 285 title_backfill children after page ranges assigned
- cgj: triage 563 remaining no-range no-findings consolidated children
- cgj: exclude canonical_report_id rows from cgj_county_summary coverage
- cgj: GPU OCR index pass for 122 OCR-only consolidated parents
- CGJ: anchor /new feed to fixed start date (2026-03-01) instead of sliding window
- CGJ: tighten /new "hide older" cutoff from 1 year to 90 days
- CGJ: Sitecore dedup leftover — canonicalize pre-existing FILETIME-tokened pdf_urls (Madera/Monterey/Yolo)
- audit(cgj): tighten NON_REPORT_PATTERNS — false positives + 150+ unflagged candidates
- fix(cgj): dedupe Sutter sutter.courts.ca.gov vs suttercourts.com aliases
- fix(cgj): more response/handbook miscategorizations on new-reports page
- CGJ: data completeness audit — findings/recs extraction rate, dead links, missing terms
- Discovery scraper captures link accessibility text instead of report title (sjcourts.org)
- CGJ: weekly changelog at civilgrandjury.org/changelog
- chore(cgj): confirm_canonical DB update + audit log not in a transaction
- security(cgj): dismissed.json write is not atomic — race condition + unbounded signature
- security(cgj): changelog URI sanitizer incomplete — unquoted attrs, vbscript:, encoded colons
- test(ops): add coverage for dev_mac, mobile_screenshots, and intranet digest routes
- chore(www): add asgi-lifespan to /tech page and review changelog system listing
- chore(cgj): move dismissed duplicate signatures from disk JSON to database
- chore(ops): add CF-Connecting-IP extraction helper for audit IP logging behind Cloudflare
- chore(tests): add test coverage for changelog_admin, SSH key validation, and social_media validation
- performance(ops): N+1 queries in crawl_backlog CRUD loops and missing DB indexes
- Add ISO-8601 timestamps to ungovrd-ops.log and other restart.sh logs
- chore(ops): add test coverage gate for new route modules (dev_mac, mobile_screenshots)
- chore(ops): audit all write routes for log_audit_event placement inside vs outside transactions
- Collapse legacy users.role into users.system_role
- chore(www): update /tech page — tldextract and google-analytics-data added to requirements.txt
- perf(geotracker): add partial index on documents WHERE resolved_at IS NULL
- users API: VALID_ROLES asymmetry between create and update endpoints
- Add UnGovr Law to /applications page
- law(oversight): show 'Full text of law →' instead of raw statute URL
- Feature CGJ: consolidated child reports have no fetchable URL — link to parent PDF or viewer
- Feature ADA Title II compliance deadlines extended by one year (DOJ interim final rule, 2026-04-20)
- Feature Labs: turn /request into a subsection (move existing project + Rumble underneath)
- Feature Add patch watchdog for OS / pip / ollama security updates across all servers
- Fix CGJ: Sonoma BoS responses still misclassified as reports
- Fix security(cgj): actionsContainer innerHTML XSS risk in problem_reports.html
- Fix a11y: PDF iframe in problem_reports.html missing title attribute (SC 4.1.2)
- Title extraction fails on 40 San Joaquin reports — "Download Report (PDF)(opens in new tab)"
- Clean up 39 orphan consolidated:// rows (NULL consolidated_report_id)
- Image-only cover-page date fallback for 75 reports
- Re-crawl Sonoma from sonomacourt.org (domain migration)
- CGJ: improve publication_date extraction (full-text scan + Sitefinity URL ticks fallback)
- civilgrandjury.org/new: hide newly-discovered older reports by default
- Audit: re-extract CGJ reports OCR'd despite having a clean native text layer
- CGJ /new lists 1744 broken consolidated:// links and 2465-row backfill flood
- Phase 0: Parametrize paths/DB name + adopt Alembic (dev onboarding prereq)
- Feature CGJ: per-child F&R re-extraction for 3,904 new synthetic children
- Fix a11y: easy wins from WCAG audit — form control labels + in-paragraph link underlines
- Fix a11y harness: wcag_audit.py doesn't validate HTTP status — 404/429 responses are scored as if real pages
- Fix CGJ: recover 4 orphan parents with broken cached_pdf_path
- Fix CGJ: improve consolidated_splitter SKIP_SECTIONS to suppress non-report TOC entries
- Notify IndexNow (Bing/Yandex) when static site content changes
- chore(crawl-backlog): sync file I/O in async context in gavelly llm_checker
- CGJ: update sub_report_duplication check to respect consolidated_report_id
- CGJ: remove country flags from language dropdown (MX flag too narrow a signal)
- cloudflare-purge: add civilgrandjury.org zone and accept full domain names
- CGJ: language dropdown shows 'US'/'MX' text instead of flags on Windows
- CGJ: simplify language dropdown label ('English (US)' wraps to 2 lines)
- Remove duplicate 'Become a Civil Grand Juror' CTA from CGJ home page
- Conservative Python venv + Ollama upgrade pass (post-reboot 2026-04-13)
- chore(csp): tighten app-wide policy — /request doesn't need Turnstile, translate.google, or jsdelivr
- Feature Gavelly: extract footnotes and check font size compliance
- Feature Gavelly: detect tracked changes, warn users, and extract original text
- Feature Gavelly: add copy buttons for individual issues and full results
- Feature Gavelly: map appointed staff respondents to their governing board
- Feature Gavelly home page: simplify layout and move security details to modal
- Feature Gavelly for Word: landing page with installation instructions
- Feature Gavelly Word Add-in: real-time compliance checking inside Microsoft Word
- Feature Gavelly checks page: style 'How to fix' as blue fix-panel
- Feature Gavelly home: add UnGovr logo, move privacy box, add 'What Gavelly Checks' page
- Fix CGJ: Flag remaining unflagged continuity/compliance metadata reports
- Fix Gavelly: fix broken logo + rate limit display mismatch
- Fix Gavelly: fix run-to-run inconsistency — single prompt, per-report seed, move mechanizable checks out of LLM
- Labs: restyle from cyan to amber Mission Control theme
- gavelly: downgrade 'Additional checks require PDF' from yellow warning to end-of-results note
- docs: update hub labels, add Gavelly page, add Open Records research section
- docs: fix critical documentation inaccuracies — routing, security, developer workflow, schemas
- docs: comprehensive hub.ungovr.org documentation audit — 60+ inaccuracies found
- feat: intranet home page at home.ungovr.org
- Feature feat: /audit directory — US state audit & oversight bodies
- Feature Add What's New page at /new showing recently discovered reports
- Feature Create fetching-external-pages skill
- Feature Build CGJ website changes triage page
- Feature WCAG 2.1 AA compliance audit page for CGJ reports
- Feature CGJ: Activate report monitoring pipeline (daily crawl, hourly notifier, weekly monitor)
- Fix Investigate ~611 reports with ALL-CAPS CONCLUSIONS headers missing annotations
- Fix Audit and fix inline event handlers blocked by nonce-based CSP
- Fix Data quality: ~1,134 agency responses misclassified as reports
- Fix Bug: 'View Original PDF' links broken when cached_pdf_path set but file not in R2
- Fix security(ops,cgj): XSS via unescaped API values in pipeline.html innerHTML
- Fix CGJ extraction: jury year misassignment on some reports
- Fix CGJ extraction: related_findings range references (F1-F5) not parsed
- Fix CGJ extraction: page header/footer text contaminating findings and recommendations
- Fix Fix daily monitor Phase 1 to use crawler infrastructure
- Fix CGJ: Fix false positives in schedule monitor analyzers
- Fix fix(cgj): update Ollama model name and add focused LLM options
- Labs: move How It Works sections lower on project pages
- chore(ops,cgj): audit and eliminate | safe filter usage on database/API-sourced content in templates
- chore(cgj): add regression tests for CGJ PDF extraction pipeline before merging extractor changes
- chore(ops,cgj,core): write tests for security-critical auth functions (Bearer, OAuth state, unsubscribe HMAC)
- chore(ops): add lint rule flagging subprocess.run() inside async def functions
- chore(crawler): add Ruff/lint rule to block raw httpx/requests imports outside ungovr-crawler/
- chore(cgj): missing tests for CGJ PDF extraction pipeline
- security(ops,cgj): replace raw httpx usage with HttpClientManager across non-crawler code
- chore: auto-fix code quality issues from global check
- Add permanent human-readable IDs to website change alerts
- Audit codebase for raw HTTP client usage on external URLs
- Classify CGJ findings by type and extract annotations (conclusions, commendations)
- CGJ homepage report count inflated by responses and non-reports
- Feature Exclude compliance/continuity reports from CGJ counts and WCAG analysis
- Feature WCAG 2.1 AA compliance tracking for CGJ reports after 4/24/2026
- Feature feat: migrate to qwen3.5:27b (LLM) and qwen3-embedding:0.6b (embeddings)
- Feature feat(cgj): add bullying in schools thematic report
- Feature Add MPP (Machine Payments Protocol) support for MCP server and data API
- Feature Unblock search engine crawling of www.ungovr.org
- Feature feat(watchdogs): add semantic page correctness checks to site watchdogs
- Fix Reports assigned to wrong fiscal year window based on web server date instead of publication date
- Fix fix(ocr): add CUDA GPU guard to prevent Surya running on CPU
- Fix fix(cgj): add response title patterns for Humboldt convention
- Fix fix(cgj): reclassify Orange County agency responses miscategorized as reports (no title/URL signal)
- Fix fix(cgj): mark Fresno annual report+response bundles as consolidated (11 records)
- Fix fix(cgj): resolve document_type='consolidated' inconsistencies (35 records)
- Fix fix(cgj): remove or reclassify ~60 non-report documents (press releases, appendices, cover letters)
- Fix fix(cgj): reclassify ~79 agency responses miscategorized as reports
- Fix CGJ: Deduplicate title-based standalone reports with different URLs
- Fix CGJ: Review Mendocino 2007-2008 duplicate 'Mendocino County District Attorney' reports
- Fix CGJ: Remove standalone-to-standalone duplicate reports (www vs non-www URLs)
- Fix fix(mypy): migrate TemplateResponse calls to new Starlette API
- Fix perf: gavelly_api.py Ollama timeout too short for large documents
- Fix perf: sync Anthropic client blocks event loop in rematch_cgj_entities.py
- Fix CGJ: Re-check all reports against current non-report detection patterns
- Fix CGJ: Garbled extracted_title displayed instead of correct title
- Fix fix(gavelly): data quality improvements — Spanish skip, response-deadline, large PDFs, consolidated page-range splits
- Delete non-report PDFs ingested as CGJ reports (demographics, rosters, boilerplate)
- Fix CGJ reports assigned to wrong fiscal year based on title year
- Embedding watchdog sends false stall SMS alerts during CGJ GPU yield
- Change CGJ monitor from weekly to daily at 1 AM Eastern
- Add 7 missing watchdog scripts to crontab
- Optimize CGJ extraction GPU utilization with per-pass worker tuning
- CSP: img-src blocks www.googletagmanager.com
- chore(mypy): add type annotations to bin utility scripts
- perf: standardize Ollama keep_alive settings
- perf: add connection pooling for Ollama HTTP clients
- Feature Build status.ungovr.org status page
- Fix Cloudflared watchdog fails to detect tunnel down when other user's tunnel is running
- Fix CGJ: Reject impossible future years from crawler, clean up existing bad data
- Fix fix(gavelly): LLM overrides correct GREEN findings-numbered for Alameda XX-N format
- Fix fix: Labs GA tag undocumented, audit script parser bug, minor code quality
- Fix Gavelly: style guide not applied when using preloaded reports or dev drafts
- Fix fix(auth): Google OAuth fails for cross-domain login to civilgrandjury.org
- Fix fix(cgj): broken logo and missing report count on civilgrandjury.org homepage
- Fix civilgrandjury.org blocked by Cloudflare managed challenge
- Fix civilgrandjury.org has no external health monitoring
- fix(gavelly): evidence field can be null from LLM, crashing parse loop
- perf(cgj): fix N+1 on juries schedule page (58 sequential DB queries)
- perf(cgj): add GIN trigram index on cgj_findings.text + cache thematic reports
- Fix 6 high-severity security findings from audit
- perf: full performance audit — fixes and infrastructure
- Add UG favicon to all UnGovr sites
- Add external uptime monitoring that doesn't bypass Cloudflare security
- Auto-purge Cloudflare cache for both zones after tunnel recovery
- Add civilgrandjury.org to external health monitoring
- Feature CGJ: Sources page enhancements and automated report discovery
- Feature MCP: add offset pagination to search_entities and search_cgj_reports
- Feature Serve CGJ PDFs from R2 on cgj.ungovr.org
- Feature data-api: add live endpoint integration tests for data worker and MCP
- Feature CGJ: Show per-year term exceptions on /juries/schedule page
- Feature data-api: deploy workers and publish data to R2
- Feature data-api: run end-to-end build pipeline and verify output
- Feature CGJ: Support non-standard grand jury term schedules (calendar year, COVID extensions)
- Feature CGJ analysis: challenges of privatizing government services (mental healthcare focus)
- Feature Traffic anomaly monitoring for Open Data API and MCP server
- Feature Build Cloudflare MCP server for UnGovr Open Data
- Feature Build UnGovr Open Data API (R2 + CDN static JSON feeds)
- Feature New thematic cross-jury analysis reports: Homelessness, Cybersecurity, Mental Health Services, In-Custody Deaths & Jail Overcrowding, Education & Schools, Infrastructure & Public Works, Financial Oversight, Child & Family Services, Wildfire & Emergency Services
- Feature CGJ: Add source reports page for each analysis report
- Feature Rename CGJ 'Reports' section to 'Analysis'
- Feature CGJ: Improve quote selection quality in thematic reports
- Feature CGJ: Normalize time-series data for historical report availability bias
- Feature Add contact note for report copies after dev preview exit
- Feature CGJ: Add /errors/{county} page showing problematic reports
- Feature CGJ Gavelly: Relax rec-finding-linkage check to account for implicit linking
- Feature Add California charter schools to government entity list
- Feature Complete consolidated CGJ PDF splitting: Layer 1 OCR indexing for ~1,730 image-only PDFs
- Fix MCP: harmonize CGJ daily rate limits between data worker (25) and MCP worker (50)
- Fix MCP: add list_cgj_counties tool (tests expect 6 tools, worker has 5)
- Fix Gavelly style guide upload, LLM parsing, and title case
- Fix Audit pages blocked by Cloudflare WAF managed rules
- Fix security: KV rate limit race condition allows CGJ API quota bypass
- Fix Data quality: 685 consolidated children missing page_range_start/page_range_end
- Fix Data quality: Monterey County has 53 cannabis-related reports — likely duplicates or over-matching
- Fix Investigate path traversal attempt returning 200 on CGJ pms endpoint
- Fix CGJ catch-all route returns 200 for attack probe paths (.php, .aws/, config.)
- Fix fix(cgj): notify_on_completion context manager is broken — JobNotifier is created but never started
- Fix fix(security): real Twilio account and messaging service SIDs may be in notifications.py docstring
- Fix fix(security): cgj_daily_monitor_watchdog.py uses Python % string formatting to build SQL query
- Fix fix(cgj): fitz.Document not closed in finally block in pdf_processor.py
- Fix fix(cgj): CSRF_WARN_ONLY production guard missing in CGJ app (present in ops app)
- Fix fix(cgj): remove_headers_footers() renumbers pages from 1, discarding original page numbers
- Fix fix(gavelly): _jobs dict mutated across await points without coordination — cleanup can cancel mid-write
- Fix fix(gavelly): greedy regex in LLM response parsing silently drops results when response contains multiple JSON objects
- Fix fix(cgj): temp PDF file may be deleted while Surya is still reading it
- Fix CGJ: Fix recommendation-to-finding linkback logic
- Fix Gavelly: report upload clears and style guide upload fails
- Fix [Open Redirect] Unauthenticated user redirect via crossdomain_auth returns_to=<external_url> when silent=1
- Fix [Error Disclosure] Raw exception strings returned in API responses in domain_validator.py and gavelly/routes.py
- Fix [IP Spoofing] X-Forwarded-For trusted without CF-Connecting-IP check in connect.py and gavelly/routes.py rate limiters
- Fix [Path Traversal] DB-sourced county_code used unsanitized in CGJ file paths
- Fix [Path Traversal] Unvalidated DB-sourced extracted_text_path in gavelly routes
- Fix [Path Traversal] Unvalidated DB-sourced file path in CGJ PDF serve endpoint
- Fix [XSS] Unescaped report fields from database in CGJ problem_reports.html
- Fix [Auth] CGJ CSRF exempt paths use exact-match set instead of prefix matching
- Fix [Auth] Gavelly dev-draft endpoints rely on hardcoded email/username allowlist
- Fix Gavelly: LLM respondent extraction times out on web UI
- Fix fix(security): PageFetcher.fetch_page() has no is_safe_url() check — SSRF on entry point used by CGJ and enrichment
- Fix fix(security): domain validator write endpoints missing role-based authorization
- Fix Entity route catch-all returns 200 for attack probe paths (/e/us/.aws/, config/)
- Remove dead Turnstile/CAPTCHA code from Gavelly
- fix(security): verify data-ungovr R2 bucket has public access disabled
- feat(mcp): add authentication and rate limiting to mcp.ungovr.org
- Move /juries/schedule to /data/juries/schedule
- Re-generate thematic analysis reports after CGJ data quality cleanup
- perf(www): publish_www.py purge_everything flushes entire zone, not just www pages
- Fix 19 mypy errors across services
- Fix 3 remaining Ruff errors in services
- Serve CGJ PDFs from R2 via cgj.ungovr.org
- Add rate limiting for civilgrandjury.org
- Update About page to highlight first-of-its-kind archive
- Embedding daemon stalled: 9.8M chunks pending, watchdog disabled
- CGJ: Weekly checker to verify/update problem report status
- Clean up ~878 non-report children in consolidated CGJ data
- Rename CGJ is_authenticated() to has_cgj_access() — name is misleading
- Re-extract consolidated CGJ children with duplicated findings
- Fix CGJ extraction pipeline to scope findings to child page ranges
- Verify CGJ extraction queue completion (1,335 reports queued for OCR re-extraction)
- Review and link 11 CGJ Spanish report orphans (no English match)
- Write incident runbooks for all critical services
- fix(security): add clickjacking protection headers to www.ungovr.org
- i18n: Fix hardcoded <html lang="en"> in 26 ops templates
- Migrate ungovr.org to www.ungovr.org: stop WordPress proxying
- Fix Gavelly: logout URL matched as county slug, shows 'County logout not found'
- Fix CSP nonce missing on several pages
- Gavelly: numbered findings contradiction on Female Inmates report
- Gavelly: contradicts itself — finds numbered F1/R1 items then says they aren't numbered
- Gavelly: respondents-categorized check should be yellow, not red, when sections are missing
- Gavelly: check results should cite their source (published guidelines, best practices, etc.)
- Gavelly: 'AI analysis unavailable' message shown when it shouldn't be
- Gavelly: warn when report targets state-authorized charter schools
- Set up linting infrastructure: Ruff + mypy + Pylint + pre-commit hooks
- Microsoft OAuth: Cloudflare blocks callback for work/org accounts
- Feature Thematic cross-jury reports (homelessness, jails, fire districts, water districts, juvenile services)
- Feature JPA jurisdiction support: direct (§925a) and indirect statewide JPA coverage
- Feature GPU time-sharing scheduler for multi-service coordination
- Feature CGJ entity-centric view pages: per-entity findings, recommendations, and response tracking
- Feature County entities summary page with sorting, grouping, and confidence filter
- Feature /juries/schedule page with non-standard term schedule support (calendar year, COVID extensions)
- Improve CGJ entity matching: domain validation, context disambiguation, LLM enrichment
- Improve Continuity / compliance reports: distinct flag + filtering across UI
- Improve Sitecore /showpublisheddocument duplicate detection
- Improve Cover-page title + publication-date extraction from PDFs
- Improve Scanner-PDF detection and OCR quality scoring
- Fix Many crawl, extraction, and entity-routing improvements through March 2026
- Feature Civil Grand Jury report archive launched: 58 California counties, year and report-level pages, OCR + extraction
- Feature Daily monitor: detect new reports + website-structure changes; per-county digests
- Feature Search: semantic RAG search across all CGJ reports
- Feature Gavelly tooling for CGJ report quality checking (numbered findings, contradiction detection)
- Feature Security middleware, CSRF enforcement, and Web Bot Authentication (RFC 9421) for crawler
- Improve Civil Grand Jury extraction: GPU-accelerated Surya OCR + smart-quality detection + scanner-PDF flagging
- Improve Crawler: automatic escalation from plain HTTP to a full headless browser for hard-to-fetch grand jury sites
- Improve Year extraction priority and 15-minute extraction timeout
- Fix Extraction and crawl pipeline improvements across the 58 California counties