Release notes

What's changed

Weekly release notes for the California Civil Grand Jury archive.

v0.18 Jun 28, 2026 · 63 changes this week
  • Feature RSS feed for civilgrandjury.org /new (newly discovered grand jury reports)
  • Feature feat(cgj): discoverer should enumerate Google Drive folders + embedded-Drive iframes
  • Fix CGJ: don't nest supporting files under consolidated-split children — use the single end section
  • Fix CGJ consolidated splitter: misses numbered 'List of Reports' TOCs without page numbers (Ventura 2023-24)
  • Fix CGJ discovery: Ventura per-report 'Ref-0NN'/'Att-0NN' reference & attachment files ingested as standalone reports
  • Fix CGJ discovery: Ventura 2025-26 reports mis-titled 'Environment' and mis-filed under 2015-2016 (Elementor h1 layout)
  • Fix fix(cgj): generalize respondent place-token/wrong-function gate (matcher + audit + Gavelly web) + re-match ~109 legacy bad links
  • Fix fix(cgj): discoverer re-enumerates the same Drive folder once per crawled page
  • Fix fix(cgj): phantom pre-1950 jury years from URL digit-strings mis-parsed as years (#analysis shows 1905/1907/1922)
  • Fix CGJ: respondent matcher force-assigns a government entity to non-government / unresolvable names; Gavelly doesn't validate the assignment
  • Fix CGJ: "Permit Sonoma" (Sonoma County dept) matched to City of Sonoma in 21 reports — should be Sonoma County
  • Fix fix: auto-fix code quality issues from global check (2026-06-22)
  • Fix Entities: 193 CA cities duplicated as mis-typed special_district/other entities
  • CGJ /new: swap Discovered/Published column order so sorted column leads
  • Canonicalize civilgrandjury.org: 301 www→apex + pin sitemap to apex
  • CGJ: news-first email subscriptions on /news (inverse of /new)
  • audit(cgj): frozen batch_pattern/context_pattern respondent matches bypass the cross-domain audit
  • fix(cgj): consolidated detector over-splits single reports (section-outline density guard)
  • CGJ: per-jury non-statutory report-count audit script (dedup + cross-year + statutory-gap corrections)
  • CGJ corpus pollution: kern 2020-2021 (redistricting) + mariposa 2023-2024 (market study) mis-seeded years
  • santa-cruz/2020-2021: 2021 redistricting-commission docs mis-crawled into CGJ corpus (maps as reports)
  • civilgrandjury.org/search: add 'most recent' sort option (date vs relevance)
  • CGJ follow-up: reclassify held-back context docs that carry extracted content
  • Widen CGJ NON_REPORT_PATTERNS: rosters, GJ membership, minutes, bare agendas, press releases, jury/applicant demographics, to/from-jury letters
  • CGJ R2 publish has no concurrency lock — parallel runs thrash, manifest never rotates
  • CGJ: review non_report-titled 'report' rows that have extracted findings/recs (held back by backfill)
  • fix(cgj): unify masthead/typography across all page types (two-tier hero system)
  • fix(cgj): Shasta 2024-2025 consolidated mis-split — missing 'Shasta County Jail' report, wrong ranges, duplicate Juvenile Rehab
  • fix(cgj): Shasta 2025-2026 consolidated final report unflagged & unsplit (TOC parse below auto-flag threshold)
  • CGJ: 'Final Report <case-number>' titles not recognized as generic (real title hidden)
  • CGJ: strip leading jury-name + year boilerplate from display titles (follow-up)
  • fix(cgj): county page news heading → 'This county in the news', drop 'automatically gathered'
  • CGJ classifier: truncated/glued 'Respons<Agency>' titles misclassified as reports (Del Norte Drive)
  • CGJ /new QA pass: continuity reclassification, Marin/Tulare title artifacts, Solano BOS response misclass
  • CGJ: group non-report docs into an 'Additional documents' section on the county-year page
  • CGJ header: long title overflows on phones — needs responsive/hamburger treatment
  • CGJ homepage/header polish: drop logo divider, larger nav, roomier county index, intro lede
  • docs: fix civilgrandjury.org URL examples in CLAUDE.md (/reports/<county> 500s — use /<county>/<year>)
  • Redesign /analysis: unify deep reports + 60-topic taxonomy into one page
  • CGJ: full-width page header — logo + title left, nav center, language/login right; relabel + reorder menu
  • Finding↔recommendation association ignores the number-prefix convention (R1A/R1B→F1)
  • Archive the Gavelly Word add-in — keep the code, remove the (dev-only/broken) public release surface
  • CGJ rate-limiting gaps: standalone POSTs bypass middleware, Gavelly GPU amplifier, /search full scans
  • performance(cgj): fetch_all_news has no LIMIT — unbounded statewide load
  • /news: hide redundant county tag when filtered to one county; clarify report link + mark off-site story links
  • CGJ monitor detect_cms: a Legistar/Granicus link must not override the host CMS
  • /news: sortable Date·Source·Story table + bookmarkable filter/sort; hide entity chips
  • Surface "Report a problem with this page" link inline at end of county-page disclaimer
  • CGJ monitor: name the specific report years added/removed in website-change alerts
  • CGJ email alerts: per-alert Edit/Remove on the /new panel + raise cap to 10
  • What's New page: surface a signed-in user's existing email alerts (+ Manage link)
  • CGJ website-change monitor: reduce false-positive noise + add debug detail
  • fix(cgj): consolidated-PDF splitter produces overlapping page ranges + promotes subtitles to sub-reports
  • test(request/cgj): test coverage gaps for messaging, push-token, and tracking routes
  • Atlas: dedup + acronym-slug standardization for CA regional agencies (SANDAG/SCAG/MTC/ABAG/BART)
  • Document the UnGovr URN scheme on the Den (slash vs colon delimiter)
  • Rate-limit hardening: guard test for unbucketed POSTs, forwarded-IP trust, body-size/slowloris, durability, edge rules
  • fix(ops): show the app waffle (public apps) to logged-out visitors + nudge geo breadcrumb 1px
  • Clean up untracked working-tree backlog (gitignore venvs/nltk/build, delete ephemera)
  • Atlas landing: welcome intro band + new /intro getting-started guide
  • Rate-limit: outbound email/SMS endpoints slip their tight bucket into the write catch-all
  • Codify + apply Cloudflare edge rate-limit rules (request intake + Gavelly) — follow-up
  • Login does not return users to the originating page (request.ungovr.ie → /e; www/law nav has no redirect)
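
The phantom-year fix listed above (URL digit-strings mis-parsed as jury years) can be illustrated with a minimal sketch. It assumes a year is only accepted when it is a standalone 4-digit token inside a plausible window. The names `plausible_years`, `MIN_JURY_YEAR`, and `MAX_JURY_YEAR` are hypothetical and are not taken from the CGJ codebase; the 1950 floor is inferred only from the "pre-1950" wording of the release note.

```python
import re

# Hypothetical bounds: the release note only says pre-1950 years were phantom.
MIN_JURY_YEAR = 1950
MAX_JURY_YEAR = 2100

# A 4-digit run that is not part of a longer digit string (IDs, timestamps).
_YEAR_RE = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def plausible_years(url: str) -> list[int]:
    """Return standalone 4-digit tokens in `url` that fall inside the year window."""
    years = []
    for match in _YEAR_RE.finditer(url):
        year = int(match.group(1))
        if MIN_JURY_YEAR <= year <= MAX_JURY_YEAR:
            years.append(year)
    return years


# "19051234" is an 8-digit ID, so "1905" is never read as a year.
print(plausible_years("https://example.org/files/19051234/report-2023.pdf"))  # [2023]
```

The digit-boundary lookarounds reject years embedded in longer numbers, and the window check rejects standalone tokens such as `1905` that happen to look like years.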
v0.17 Jun 21, 2026 · 10 changes this week
  • Feature feat(api,mcp): product availability on the entity graph — products field + inverted product→coverage API
  • Feature www /impact: drop team blurb that duplicates /people
  • fix(cgj): consolidated-volume rows leak into the per-report list as 'consolidated report'
  • Remove "Development Preview" banner; add page-seeded content-feedback link
  • chore: untrack ungovr-cgj egg-info build artifacts (already gitignored)
  • Authenticate the civilgrandjury.org subscription-confirmation email
  • bug(cgj): ToU gate leaks /ungovr/cgj mount prefix → strands CGJ-login users on ungovr.org
  • fix(i18n): es-419 localization gaps on civilgrandjury.org home + footer
  • Bump UnGovrBot crawler version to 0.3.53 and unify all UA strings
  • Mail relay TLS: internal clients use mail.ungovr.org but certs only cover mail-00X.ungovr.cc (hostname mismatch)
v0.16 Jun 14, 2026 · 47 changes this week
  • Fix CGJ static publish can wipe entire R2 site when origin (5002) is down at build time
  • Fix CGJ: consolidated reports with non-'consolidated' link titles ingest as single reports and surface raw on /new (no auto-split)
  • Fix fix(cgj): /consolidate correctness, performance & test gaps
  • Fix security(cgj): harden /consolidate — decompression bomb + dead Turnstile CSRF exemption
  • Fix fix(cgj): digest CLI doesn't init core DB → email suppression check + core email-audit log skipped
  • Fix fix(cgj): subscription digest cron missing doppler run -- → no subscriber emails since launch (regression)
  • Fix CGJ titles: 'View the document' not recognized as generic (54 Marin reports show anchor text); Plumas Fairgrounds report mistitled
  • Fix reextract_multiprocess.py --county crashes: Pass 0 uses alias 'r' from wrong query
  • Fix CGJ /new: title cleanup — run-on cover OCR, (PDF,NNkb) suffix, acronym casing, numbering noise
  • Fix CGJ /new: generic crawler titles leak when extracted_title is NULL (backfill + always-extract)
  • Fix CGJ /new: agency responses misclassified as reports (classifier blind to generic/typo filenames)
  • Fix ungovr1 postgres: recurring backend SIGSEGV in core executor — symptom of hardware instability, no data corruption found
  • Fix Rate-limit + audit-log IP collapse: worker-proxied traffic presents one Cloudflare egress IP
  • CGJ: wire generated_title into year_detail footnote + audit helper (residuals)
  • CGJ: account-based ★ Track this jury + homepage tracked-juries band
  • CGJ county page: surface redesign (title, news strip, entities-by-search, breadcrumb, totals)
  • CGJ: is_consolidated_title misses underscore-joined titles (ingest detection gap)
  • CGJ generated titles not surfaced in /api/new feed, email digest, or JSON API endpoints
  • LLM agency-naming title generator for CGJ reports
  • CGJ audit: teach the workflow prompt to consume the new discoverer.* fields
  • CGJ: fix generic/truncated extracted titles (marin responses; audit cross-county)
  • CGJ: audit_consolidated_candidates cron log lines are doubled (StreamHandler + FileHandler + cron redirect to same file)
  • CGJ report-audit helper: handle JS-interaction-gated report lists (kern tabs) + title-based DB matching (marin) — false unreadable/missing
  • fix(cgj): light/dark variants + larger size for UnGovr popout icon
  • fix(cgj): use canonical UnGovr popout icon on county entities page
  • Polish county 'Keep me updated' CTA: bigger email icon + brand-blue button
  • CGJ /new: link county name in report list to county page
  • cgjweb: same Accept-Language 500 in its ported i18n copy (sibling of)
  • CGJ season-pace re-check reminder (Jun 25-30): one-shot cron re-runs county-coverage comparison + emails verdict
  • CGJ daily monitor: scan all 58 counties per night (raise MAX_REPORT_SCAN_COUNTIES 20→58) for same-day report/response discovery
  • CGJ discoverer blind on courts.ca.gov template — Amador + Mono yield pdfs_on_site=0 (miss-risk for 2025-2026 reports)
  • CGJ News Tracker — Phase 3: Tier 3 statewide /news
  • CGJ News Tracker — Phase 2: Tier 2 county news feed
  • CGJ News Tracker — Phase 1: Tier 1 per-report 'In the News' section
  • Add 'Keep me updated' subscribe CTA to county grand jury pages + news-stories opt-in
  • civilgrandjury.org static assets (logo, favicon, CSS, JS) missing from R2 → broken when origin is down
  • Gavelly: warn when a report uses more than one font
  • New /consolidate tool: merge reports into one DOCX/PDF with report delimiters that let consolidated reports be split back into individual reports — a primary UnGovr concern
  • Gavelly respondent extractor: grouped 'X of the following Y – N days' header drops the member lines
  • Gavelly: infer Findings/Recommendations for sole-respondent reports in the contacts XLSX
  • Gavelly: respondent section-finder grabs §933 boilerplate, missing Orange County 'Comments to the Presiding Judge … required/requested from:' lists
  • Gavelly: respondent entity-matching skipped because county detection is case-sensitive (misses ALL-CAPS headers)
  • CGJ classifier: detect agency response LETTERS misclassified as reports (corpus noise)
  • Gavelly: download XLSX of required respondents + their UnGovr contacts (stateless, browser-mediated)
  • chore(platform): bundle shared maintenance-decision module instead of hand-inlining
  • GPU driver down on ungovr1 — Secure Boot MOK enrollment wiped (nvidia module rejected)
  • Rebrand UnGovr Verbatim → UnGovr Words (final family name; keep codeline meeting)
v0.15 Jun 7, 2026 · 13 changes this week
  • Add UnGovr profile pop-out icon to CGJ entity detail header
  • CGJ: report-count sanity check false-positives on split consolidated reports (count excludes consolidated:// children)
  • CGJ: Evotiva UserFiles crawler is dead code — reimplement over httpx + wire into discover path (unblinds calaveras + any Evotiva county)
  • Link county pages to sheriffoversight.org when an oversight body exists
  • CGJ discoverer can't reach calaveras (DotNetNuke) grand-jury report listing
  • CGJ reachability audit: false 'missing' alerts from CivicPlus version-tick URL drift (Madera)
  • Public-site leak guard misses GitHub issue numbers in CSS/JS comments + skips CGJ templates
  • i18n Phase 1: language switcher persists to global user setting + text-only + CGJ/CGJA fixes
  • i18n: worker should bypass edge cache for language-negotiated ungovr.org requests
  • i18n: Accept-Language detection is dead code (regional catch-all preempts step 5)
  • Backup hardening: close disaster-recovery audit gaps
  • Meeting: reduce Stage-1 GPU load (diarization dominates; TF32 + batched embeds = ~40% win)
v0.14 May 31, 2026 · 47 changes this week
  • Feature CGJ topics: per-topic source listings on /analysis/topics/{slug}
  • Feature CGJ discovery: per-county reachability audit (PDFs on landing pages vs cgj_reports)
  • Feature Top 60 topics page at /analysis/topics
  • Feature analysis: link quotes to source report; collapse sources page counties
  • Feature F/R extraction: capture lifted recommender/finder prefix on grouped reports
  • Feature Reusable entity picker widget for 5-100 items; replace CGJ /new comma-separated input
  • Fix CGJ: consolidated audit flags single reports with internal-section TOCs as consolidated (section-TOC false positives)
  • Fix CGJ: blank-divider page text leaks into extracted_title (e.g. Homelessness in Nevada County renders as 'This Page Intentionally Blank')
  • Fix fix(cgj): /new feed goes blank during re-extraction passes
  • Fix CGJ extraction daemon: pdf_url dropped → publication_date stays NULL for newly-discovered reports
  • Fix Tighten State Oversight Context relevance bar: oversight/critique + primary topic only
  • Fix fix(cgj): CGJ digest header — civilgrandjury.org subtitle auto-links as blue-on-blue
  • Fix bug: 28 pre-existing test failures across 6 subsystems (full suite)
  • Fix Merge CA C-tail cross-type duplicates (69 clusters across 9 type-pairs)
  • Fix Dedup CA entities: slug collisions, abbreviation variants, cross-type pairs
  • Remove broken WCAG link from Gavelly accessibility check group
  • Add Privacy Policy + Terms of Use links to CGJ and UnGovr Request footers
  • CGJ home page: remove stats + drop free-service colophon from main site
  • County page search box should say 'Search {County} Reports…'
  • Remove WCAG accessibility audit from public view (hub card + audit pages)
  • Hide Accessibility Audit link on CGJ county pages
  • Broader response-misclassification cleanup across counties
  • Detect responses mis-classified as reports (Inyo 2024-2025 case)
  • Refresh CGJ home + county pages with civic-almanac aesthetic
  • civilgrandjury.org/search: add category pills (jump-to-section + breakdown)
  • fix(cgj): report titles extracted from appendix headings — display canonical title, harden extractor
  • Tone down red on CGJ dev preview banner
  • civilgrandjury.org/search undersells corpus: reports query is title/summary-only, 100-row cap, no relevance sort, no in-result county filter
  • Gavelly: relabel 'fact sheet' citations — no such public document exists
  • CGJ: link reclassified responses to parents + render response-typed pages distinctly
  • CGJ: OCGJ responses misclassified as reports (nested-li DOM signal missing)
  • Document unlisted /m/ memo path in ungovr-cgj/CLAUDE.md
  • CGJ: detect-and-flag de-facto consolidated reports tagged as single
  • Basic/OAuth users bounced to /e on /account/* alias paths (e.g. /account/api-keys)
v0.13 May 17, 2026 · 16 changes this week
  • Feature Integrate CA Civil Grand Jury investigations of sheriff's office (incl. deaths in custody) into county pages
  • Feature feat(law,www): classify broken links body-vs-core, 2-run debounce for body, /fix-urls skill
  • Feature feat(law,www): weekly external-link checker — email digest of broken outbound links
  • Fix modeleval: Gavelly adapter's model swap is a no-op (module constant read at import time)
  • Fix fix(cgj): worker strips per-request CSP nonce, breaks inline JS on civilgrandjury.org (e.g. /analysis tile clicks)
  • Fix cgj: report titles polluted by TOC artifacts (bullets, dot leaders, HTML, encoding errors)
  • Fix modeleval: Gavelly adapter passes raw Evidence objects to scorer, breaking Jaccard set()
  • Fix fix(cgj): wrong parent_report_id linking on response docs blocks LLM tier
  • Fix fix(www): 39 broken outbound URLs on www.ungovr.org (sources/us, tech, unclaimed, …)
  • Fix law: 5 subnational rows still need source URLs (mh/ts records, ar/s meetings)
  • gavelly: harden JSON extraction — pass format=json to Ollama + strip <think> tags
  • modeleval: clean 4 missing-text reports from Gavelly pin; evaluate qwen3:30b-a3b and qwen3.5:35b
  • CGJ: thread term-window into extract_date_from_pdf_cover_page (prevent in-body year hits)
  • CGJ: pdf-cover publication dates outside term window (e.g., 2006 on a 2011-2012 OC report)
  • civilgrandjury.org on R2: static HTML + scheduled PDF publishing
  • Model evaluation framework: pluggable harness for local + cloud LLM swaps
v0.12 May 10, 2026 · 10 changes this week
  • Feature Gavelly: detect non-government actors in recommendations (Lompoc/Pajaro pattern)
  • Feature Gavelly: detect 'wrong-respondent' recommendations (action-subject vs respondent mismatch)
  • Feature Trust PDF /Title metadata for CGJ extraction (current jury year forward); always for Gavelly
  • Feature Weekly referrer tracker digest (CF GraphQL → email + Slack)
  • Fix CGJ extraction: subsequent reports/responses/continuity bleed into single-report extracted text
  • Fix Gavelly: detect duplicate / out-of-sequence finding & recommendation labels
  • Convert thematic reports to static dated artifacts (no DB queries on render)
  • Gavelly: regression check — every change must compare prior-year report counts
  • Gavelly: align response-language guidance with Penal Code § 933.05 wording
v0.11 May 3, 2026 · 54 changes this week
  • Feature Add California Civil Grand Juries to /law/oversight/us/ca
  • Feature Prominent link to civilgrandjury.org from CA grand jury oversight page
  • Feature Law data coordination — FACTS 500 coverage & quality (tracking)
  • Fix CGJ home page takes 7s to load due to cgj_county_summary view Cartesian explosion
  • Fix Gavelly Python score calculation includes gray checks in denominator (caps clean reports below 100)
  • Fix cgj_daily_monitor_watchdog: '\n' in alert subject crashes Resend send
  • Fix security(cgj-changelog): render_html() passes javascript: URIs through to | safe output
  • Fix fix(cgj): Marin sub-page respondents misclassified as 'report' when title is generic
  • Fix CGJ: Sierra County low findings rate — junk docs + narrative-only reports + missing GJ terms
  • Fix CGJ: Kern County 0 extracted titles — backfill never run + null-byte fix
  • Fix CGJ: add term-year-from-year-record fallback to pub date extraction chain
  • Fix CGJ: add YY-YY URL path handler for Sacramento-style docs/reports/21-22/ paths
  • Fix CGJ: investigate 64 reports with NULL publication_date and no cached PDF
  • Fix CGJ: clear 5 zero-byte SJ cached PDFs and mark dead links
  • Fix CGJ: Surya OCR fallback for NULL publication_date rows (scanned PDFs)
  • Fix Fix false-positive alerts in CGJ website monitor
  • Fix bug(cgj): confirm_canonical — no input validation, wrong count returned, self-referential allowed
  • Fix bug(cgj): log_audit_event called inside transaction in confirm_canonical
  • Fix bug(ops): social_media.py file I/O blocks event loop (threading.Lock + sync reads in async routes)
  • Fix mobile: Android build fails on Linux — manifest merger conflict (com.android.support 28 vs AndroidX)
  • Move CGJ daily monitor log/state out of /tmp (wiped on reboot)
  • cgj: validate 285 title_backfill children after page ranges assigned
  • cgj: triage 563 remaining no-range no-findings consolidated children
  • cgj: exclude canonical_report_id rows from cgj_county_summary coverage
  • cgj: GPU OCR index pass for 122 OCR-only consolidated parents
  • CGJ: anchor /new feed to fixed start date (2026-03-01) instead of sliding window
  • CGJ: tighten /new "hide older" cutoff from 1 year to 90 days
  • CGJ: Sitecore dedup leftover — canonicalize pre-existing FILETIME-tokened pdf_urls (Madera/Monterey/Yolo)
  • audit(cgj): tighten NON_REPORT_PATTERNS — false positives + 150+ unflagged candidates
  • fix(cgj): dedupe Sutter sutter.courts.ca.gov vs suttercourts.com aliases
  • fix(cgj): more response/handbook miscategorizations on new-reports page
  • CGJ: data completeness audit — findings/recs extraction rate, dead links, missing terms
  • Discovery scraper captures link accessibility text instead of report title (sjcourts.org)
  • CGJ: weekly changelog at civilgrandjury.org/changelog
  • chore(cgj): confirm_canonical DB update + audit log not in a transaction
  • security(cgj): dismissed.json write is not atomic — race condition + unbounded signature
  • security(cgj): changelog URI sanitizer incomplete — unquoted attrs, vbscript:, encoded colons
  • test(ops): add coverage for dev_mac, mobile_screenshots, and intranet digest routes
  • chore(www): add asgi-lifespan to /tech page and review changelog system listing
  • chore(cgj): move dismissed duplicate signatures from disk JSON to database
  • chore(ops): add CF-Connecting-IP extraction helper for audit IP logging behind Cloudflare
  • chore(tests): add test coverage for changelog_admin, SSH key validation, and social_media validation
  • performance(ops): N+1 queries in crawl_backlog CRUD loops and missing DB indexes
  • Add ISO-8601 timestamps to ungovrd-ops.log and other restart.sh logs
  • chore(ops): add test coverage gate for new route modules (dev_mac, mobile_screenshots)
  • chore(ops): audit all write routes for log_audit_event placement inside vs outside transactions
  • Collapse legacy users.role into users.system_role
  • chore(www): update /tech page — tldextract and google-analytics-data added to requirements.txt
  • perf(geotracker): add partial index on documents WHERE resolved_at IS NULL
  • users API: VALID_ROLES asymmetry between create and update endpoints
  • Add UnGovr Law to /applications page
  • law(oversight): show 'Full text of law →' instead of raw statute URL
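
The YY-YY URL path handler listed above (Sacramento-style `docs/reports/21-22/` paths) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's handler: the function name `term_from_path` and the century pivot of 50 are hypothetical, and the sketch only accepts segments whose two years are consecutive.

```python
import re
from typing import Optional

# Two 2-digit years joined by a hyphen, occupying a whole path segment.
_YY_YY_RE = re.compile(r"/(\d{2})-(\d{2})(?=/|$)")


def term_from_path(path: str, pivot: int = 50) -> Optional[str]:
    """Map a '/21-22/' style path segment to a '2021-2022' term label.

    `pivot` is an assumed cutoff: 2-digit years >= pivot are read as 19xx,
    the rest as 20xx. Returns None when no consecutive YY-YY segment exists.
    """
    match = _YY_YY_RE.search(path)
    if match is None:
        return None
    start_yy, end_yy = int(match.group(1)), int(match.group(2))
    # A jury term spans consecutive years; allow the 99 -> 00 rollover.
    if (start_yy + 1) % 100 != end_yy:
        return None
    century = 1900 if start_yy >= pivot else 2000
    start = century + start_yy
    return f"{start}-{start + 1}"


print(term_from_path("docs/reports/21-22/final.pdf"))  # 2021-2022
```

Requiring consecutive years keeps unrelated segments such as `/12-34/` from being read as a term.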
v0.10 Apr 26, 2026 · 18 changes this week
  • Feature CGJ: consolidated child reports have no fetchable URL — link to parent PDF or viewer
  • Feature ADA Title II compliance deadlines extended by one year (DOJ interim final rule, 2026-04-20)
  • Feature Labs: turn /request into a subsection (move existing project + Rumble underneath)
  • Feature Add patch watchdog for OS / pip / ollama security updates across all servers
  • Fix CGJ: Sonoma BoS responses still misclassified as reports
  • Fix security(cgj): actionsContainer innerHTML XSS risk in problem_reports.html
  • Fix a11y: PDF iframe in problem_reports.html missing title attribute (SC 4.1.2)
  • Title extraction fails on 40 San Joaquin reports — "Download Report (PDF)(opens in new tab)"
  • Clean up 39 orphan consolidated:// rows (NULL consolidated_report_id)
  • Image-only cover-page date fallback for 75 reports
  • Re-crawl Sonoma from sonomacourt.org (domain migration)
  • CGJ: improve publication_date extraction (full-text scan + Sitefinity URL ticks fallback)
  • civilgrandjury.org/new: hide newly-discovered older reports by default
  • Audit: re-extract CGJ reports OCR'd despite having a clean native text layer
  • CGJ /new lists 1744 broken consolidated:// links and 2465-row backfill flood
  • Phase 0: Parametrize paths/DB name + adopt Alembic (dev onboarding prereq)
v0.9 Apr 19, 2026 · 16 changes this week
  • Feature CGJ: per-child F&R re-extraction for 3,904 new synthetic children
  • Fix a11y: easy wins from WCAG audit — form control labels + in-paragraph link underlines
  • Fix a11y harness: wcag_audit.py doesn't validate HTTP status — 404/429 responses are scored as if real pages
  • Fix CGJ: recover 4 orphan parents with broken cached_pdf_path
  • Fix CGJ: improve consolidated_splitter SKIP_SECTIONS to suppress non-report TOC entries
  • Notify IndexNow (Bing/Yandex) when static site content changes
  • chore(crawl-backlog): sync file I/O in async context in gavelly llm_checker
  • CGJ: update sub_report_duplication check to respect consolidated_report_id
  • CGJ: remove country flags from language dropdown (MX flag too narrow a signal)
  • cloudflare-purge: add civilgrandjury.org zone and accept full domain names
  • CGJ: language dropdown shows 'US'/'MX' text instead of flags on Windows
  • CGJ: simplify language dropdown label ('English (US)' wraps to 2 lines)
  • Remove duplicate 'Become a Civil Grand Juror' CTA from CGJ home page
  • Conservative Python venv + Ollama upgrade pass (post-reboot 2026-04-13)
  • chore(csp): tighten app-wide policy — /request doesn't need Turnstile, translate.google, or jsdelivr
v0.8 Apr 12, 2026 · 19 changes this week
  • Feature Gavelly: extract footnotes and check font size compliance
  • Feature Gavelly: detect tracked changes, warn users, and extract original text
  • Feature Gavelly: add copy buttons for individual issues and full results
  • Feature Gavelly: map appointed staff respondents to their governing board
  • Feature Gavelly home page: simplify layout and move security details to modal
  • Feature Gavelly for Word: landing page with installation instructions
  • Feature Gavelly Word Add-in: real-time compliance checking inside Microsoft Word
  • Feature Gavelly checks page: style 'How to fix' as blue fix-panel
  • Feature Gavelly home: add UnGovr logo, move privacy box, add 'What Gavelly Checks' page
  • Fix CGJ: Flag remaining unflagged continuity/compliance metadata reports
  • Fix Gavelly: fix broken logo + rate limit display mismatch
  • Fix Gavelly: fix run-to-run inconsistency — single prompt, per-report seed, move mechanizable checks out of LLM
  • Labs: restyle from cyan to amber Mission Control theme
  • gavelly: downgrade 'Additional checks require PDF' from yellow warning to end-of-results note
  • docs: update hub labels, add Gavelly page, add Open Records research section
  • docs: fix critical documentation inaccuracies — routing, security, developer workflow, schemas
  • docs: comprehensive hub.ungovr.org documentation audit — 60+ inaccuracies found
  • feat: intranet home page at home.ungovr.org
v0.7 Apr 5, 2026 · 32 changes this week
  • Feature feat: /audit directory — US state audit & oversight bodies
  • Feature Add What's New page at /new showing recently discovered reports
  • Feature Create fetching-external-pages skill
  • Feature Build CGJ website changes triage page
  • Feature WCAG 2.1 AA compliance audit page for CGJ reports
  • Feature CGJ: Activate report monitoring pipeline (daily crawl, hourly notifier, weekly monitor)
  • Fix Investigate ~611 reports with ALL-CAPS CONCLUSIONS headers missing annotations
  • Fix Audit and fix inline event handlers blocked by nonce-based CSP
  • Fix Data quality: ~1,134 agency responses misclassified as reports
  • Fix Bug: 'View Original PDF' links broken when cached_pdf_path set but file not in R2
  • Fix security(ops,cgj): XSS via unescaped API values in pipeline.html innerHTML
  • Fix CGJ extraction: jury year misassignment on some reports
  • Fix CGJ extraction: related_findings range references (F1-F5) not parsed
  • Fix CGJ extraction: page header/footer text contaminating findings and recommendations
  • Fix Fix daily monitor Phase 1 to use crawler infrastructure
  • Fix CGJ: Fix false positives in schedule monitor analyzers
  • Fix fix(cgj): update Ollama model name and add focused LLM options
  • Labs: move How It Works sections lower on project pages
  • chore(ops,cgj): audit and eliminate | safe filter usage on database/API-sourced content in templates
  • chore(cgj): add regression tests for CGJ PDF extraction pipeline before merging extractor changes
  • chore(ops,cgj,core): write tests for security-critical auth functions (Bearer, OAuth state, unsubscribe HMAC)
  • chore(ops): add lint rule flagging subprocess.run() inside async def functions
  • chore(crawler): add Ruff/lint rule to block raw httpx/requests imports outside ungovr-crawler/
  • chore(cgj): missing tests for CGJ PDF extraction pipeline
  • security(ops,cgj): replace raw httpx usage with HttpClientManager across non-crawler code
  • chore: auto-fix code quality issues from global check
  • Add permanent human-readable IDs to website change alerts
  • Audit codebase for raw HTTP client usage on external URLs
  • Classify CGJ findings by type and extract annotations (conclusions, commendations)
  • CGJ homepage report count inflated by responses and non-reports
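
The related_findings fix listed above (range references such as F1-F5 not parsed) comes down to expanding a range into its individual labels. A minimal sketch of that expansion, assuming labels are a single `F` or `R` prefix followed by a number; the function name `expand_refs` is hypothetical and the real extractor may handle more label formats:

```python
import re

# A label (F3, R12) optionally followed by a hyphen or en dash and a range end.
# The range end may repeat the prefix ("F1-F5") or omit it ("F1-5").
_REF_RE = re.compile(r"\b([FR])(\d+)(?:\s*[-\u2013]\s*\1?(\d+))?\b")


def expand_refs(text: str) -> list[str]:
    """Expand references like 'F1-F5' into ['F1', 'F2', 'F3', 'F4', 'F5']."""
    labels = []
    for match in _REF_RE.finditer(text):
        prefix, start = match.group(1), int(match.group(2))
        end = int(match.group(3)) if match.group(3) else start
        if end < start:
            end = start  # malformed range: keep only the first label
        labels.extend(f"{prefix}{n}" for n in range(start, end + 1))
    return labels


print(expand_refs("F1-F3 and F7"))  # ['F1', 'F2', 'F3', 'F7']
```

Single labels pass through unchanged, so the same function can process a mixed list of ranges and individual references.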
v0.6 Mar 29, 2026 · 41 changes this week
  • Feature Exclude compliance/continuity reports from CGJ counts and WCAG analysis
  • Feature WCAG 2.1 AA compliance tracking for CGJ reports after 4/24/2026
  • Feature feat: migrate to qwen3.5:27b (LLM) and qwen3-embedding:0.6b (embeddings)
  • Feature feat(cgj): add bullying in schools thematic report
  • Feature Add MPP (Machine Payments Protocol) support for MCP server and data API
  • Feature Unblock search engine crawling of www.ungovr.org
  • Feature feat(watchdogs): add semantic page correctness checks to site watchdogs
  • Fix Reports assigned to wrong fiscal year window based on web server date instead of publication date
  • Fix fix(ocr): add CUDA GPU guard to prevent Surya running on CPU
  • Fix fix(cgj): add response title patterns for Humboldt convention
  • Fix fix(cgj): reclassify Orange County agency responses miscategorized as reports (no title/URL signal)
  • Fix fix(cgj): mark Fresno annual report+response bundles as consolidated (11 records)
  • Fix fix(cgj): resolve document_type='consolidated' inconsistencies (35 records)
  • Fix fix(cgj): remove or reclassify ~60 non-report documents (press releases, appendices, cover letters)
  • Fix fix(cgj): reclassify ~79 agency responses miscategorized as reports
  • Fix CGJ: Deduplicate title-based standalone reports with different URLs
  • Fix CGJ: Review Mendocino 2007-2008 duplicate 'Mendocino County District Attorney' reports
  • Fix CGJ: Remove standalone-to-standalone duplicate reports (www vs non-www URLs)
  • Fix fix(mypy): migrate TemplateResponse calls to new Starlette API
  • Fix perf: gavelly_api.py Ollama timeout too short for large documents
  • Fix perf: sync Anthropic client blocks event loop in rematch_cgj_entities.py
  • Fix CGJ: Re-check all reports against current non-report detection patterns
  • Fix CGJ: Garbled extracted_title displayed instead of correct title
  • Fix fix(gavelly): data quality improvements — Spanish skip, response-deadline, large PDFs, consolidated page-range splits
  • Delete non-report PDFs ingested as CGJ reports (demographics, rosters, boilerplate)
  • Fix CGJ reports assigned to wrong fiscal year based on title year
  • Embedding watchdog sends false stall SMS alerts during CGJ GPU yield
  • Change CGJ monitor from weekly to daily at 1 AM Eastern
  • Add 7 missing watchdog scripts to crontab
  • Optimize CGJ extraction GPU utilization with per-pass worker tuning
  • CSP: img-src blocks www.googletagmanager.com
  • chore(mypy): add type annotations to bin utility scripts
  • perf: standardize Ollama keep_alive settings
  • perf: add connection pooling for Ollama HTTP clients
v0.5 Mar 22, 2026 · 28 changes this week
  • Feature Build status.ungovr.org status page
  • Fix Cloudflared watchdog fails to detect tunnel down when other user's tunnel is running
  • Fix CGJ: Reject impossible future years from crawler, clean up existing bad data
  • Fix fix(gavelly): LLM overrides a correct GREEN findings-numbered result for Alameda XX-N format
  • Fix fix: Labs GA tag undocumented, audit script parser bug, minor code quality
  • Fix Gavelly: style guide not applied when using preloaded reports or dev drafts
  • Fix fix(auth): Google OAuth fails for cross-domain login to civilgrandjury.org
  • Fix fix(cgj): broken logo and missing report count on civilgrandjury.org homepage
  • Fix civilgrandjury.org blocked by Cloudflare managed challenge
  • Fix civilgrandjury.org has no external health monitoring
  • fix(gavelly): evidence field can be null from LLM, crashing parse loop
  • perf(cgj): fix N+1 on juries schedule page (58 sequential DB queries)
  • perf(cgj): add GIN trigram index on cgj_findings.text + cache thematic reports
  • Fix 6 high-severity security findings from audit
  • perf: full performance audit — fixes and infrastructure
  • Add UG favicon to all UnGovr sites
  • Add external uptime monitoring that doesn't bypass Cloudflare security
  • Auto-purge Cloudflare cache for both zones after tunnel recovery
  • Add civilgrandjury.org to external health monitoring
v0.4 Mar 15, 2026 · 89 changes this week
  • Feature CGJ: Sources page enhancements and automated report discovery
  • Feature MCP: add offset pagination to search_entities and search_cgj_reports
  • Feature Serve CGJ PDFs from R2 on cgj.ungovr.org
  • Feature data-api: add live endpoint integration tests for data worker and MCP
  • Feature CGJ: Show per-year term exceptions on /juries/schedule page
  • Feature data-api: deploy workers and publish data to R2
  • Feature data-api: run end-to-end build pipeline and verify output
  • Feature CGJ: Support non-standard grand jury term schedules (calendar year, COVID extensions)
  • Feature CGJ analysis: challenges of privatizing government services (mental healthcare focus)
  • Feature Traffic anomaly monitoring for Open Data API and MCP server
  • Feature Build Cloudflare MCP server for UnGovr Open Data
  • Feature Build UnGovr Open Data API (R2 + CDN static JSON feeds)
  • Feature New thematic cross-jury analysis reports: Homelessness, Cybersecurity, Mental Health Services, In-Custody Deaths & Jail Overcrowding, Education & Schools, Infrastructure & Public Works, Financial Oversight, Child & Family Services, Wildfire & Emergency Services
  • Feature CGJ: Add source reports page for each analysis report
  • Feature Rename CGJ 'Reports' section to 'Analysis'
  • Feature CGJ: Improve quote selection quality in thematic reports
  • Feature CGJ: Normalize time-series data for historical report availability bias
  • Feature Add contact note for report copies after dev preview exit
  • Feature CGJ: Add /errors/{county} page showing problematic reports
  • Feature CGJ Gavelly: Relax rec-finding-linkage check to account for implicit linking
  • Feature Add California charter schools to government entity list
  • Feature Complete consolidated CGJ PDF splitting: Layer 1 OCR indexing for ~1,730 image-only PDFs
  • Fix MCP: harmonize CGJ daily rate limits between data worker (25) and MCP worker (50)
  • Fix MCP: add list_cgj_counties tool (tests expect 6 tools, worker has 5)
  • Fix Fix Gavelly style guide upload, LLM parsing, and title case
  • Fix Audit pages blocked by Cloudflare WAF managed rules
  • Fix security: KV rate limit race condition allows CGJ API quota bypass
  • Fix Data quality: 685 consolidated children missing page_range_start/page_range_end
  • Fix Data quality: Monterey County has 53 cannabis-related reports — likely duplicates or over-matching
  • Fix Investigate path traversal attempt returning 200 on CGJ pms endpoint
  • Fix CGJ catch-all route returns 200 for attack probe paths (.php, .aws/, config.)
  • Fix fix(cgj): notify_on_completion context manager is broken — JobNotifier is created but never started
  • Fix fix(security): real Twilio account and messaging service SIDs may be in notifications.py docstring
  • Fix fix(security): cgj_daily_monitor_watchdog.py uses Python % string formatting to build SQL query
  • Fix fix(cgj): fitz.Document not closed in finally block in pdf_processor.py
  • Fix fix(cgj): CSRF_WARN_ONLY production guard missing in CGJ app (present in ops app)
  • Fix fix(cgj): remove_headers_footers() renumbers pages from 1, discarding original page numbers
  • Fix fix(gavelly): _jobs dict mutated across await points without coordination — cleanup can cancel mid-write
  • Fix fix(gavelly): greedy regex in LLM response parsing silently drops results when response contains multiple JSON objects
  • Fix fix(cgj): temp PDF file may be deleted while Surya is still reading it
  • Fix CGJ: Fix recommendation-to-finding linkback logic
  • Fix Gavelly: report upload clears and style guide upload fails
  • Fix [Open Redirect] Unauthenticated user redirect via crossdomain_auth returns_to=<external_url> when silent=1
  • Fix [Error Disclosure] Raw exception strings returned in API responses in domain_validator.py and gavelly/routes.py
  • Fix [IP Spoofing] X-Forwarded-For trusted without CF-Connecting-IP check in connect.py and gavelly/routes.py rate limiters
  • Fix [Path Traversal] DB-sourced county_code used unsanitized in CGJ file paths
  • Fix [Path Traversal] Unvalidated DB-sourced extracted_text_path in gavelly routes
  • Fix [Path Traversal] Unvalidated DB-sourced file path in CGJ PDF serve endpoint
  • Fix [XSS] Unescaped report fields from database in CGJ problem_reports.html
  • Fix [Auth] CGJ CSRF exempt paths use exact-match set instead of prefix matching
  • Fix [Auth] Gavelly dev-draft endpoints rely on hardcoded email/username allowlist
  • Fix Gavelly: LLM respondent extraction times out on web UI
  • Fix fix(security): PageFetcher.fetch_page() has no is_safe_url() check — SSRF on entry point used by CGJ and enrichment
  • Fix fix(security): domain validator write endpoints missing role-based authorization
  • Fix Entity route catch-all returns 200 for attack probe paths (/e/us/.aws/, config/)
  • Remove dead Turnstile/CAPTCHA code from Gavelly
  • fix(security): verify data-ungovr R2 bucket has public access disabled
  • feat(mcp): add authentication and rate limiting to mcp.ungovr.org
  • Move /juries/schedule to /data/juries/schedule
  • Re-generate thematic analysis reports after CGJ data quality cleanup
  • perf(www): publish_www.py purge_everything flushes entire zone, not just www pages
  • Fix 19 mypy errors across services
  • Fix 3 remaining Ruff errors in services
  • Serve CGJ PDFs from R2 via cgj.ungovr.org
  • Add rate limiting for civilgrandjury.org
  • Update About page to highlight first-of-its-kind archive
  • Embedding daemon stalled: 9.8M chunks pending, watchdog disabled
  • CGJ: Weekly checker to verify/update problem report status
  • Clean up ~878 non-report children in consolidated CGJ data
  • Rename CGJ is_authenticated() to has_cgj_access() — name is misleading
  • Re-extract consolidated CGJ children with duplicated findings
  • Fix CGJ extraction pipeline to scope findings to child page ranges
  • Verify CGJ extraction queue completion (1,335 reports queued for OCR re-extraction)
  • Review and link 11 CGJ Spanish report orphans (no English match)
  • Write incident runbooks for all critical services
  • fix(security): add clickjacking protection headers to www.ungovr.org
  • i18n: Fix hardcoded <html lang="en"> in 26 ops templates
  • Migrate ungovr.org to www.ungovr.org: stop WordPress proxying
v0.3 Mar 8, 2026 · 11 changes this week
  • Fix Gavelly: logout URL matched as county slug, shows 'County logout not found'
  • Fix CSP nonce missing on several pages
  • Gavelly: numbered findings contradiction on Female Inmates report
  • Gavelly: contradicts itself — finds numbered F1/R1 items then says they aren't numbered
  • Gavelly: respondents-categorized check should be yellow, not red, when sections are missing
  • Gavelly: check results should cite their source (published guidelines, best practices, etc.)
  • Gavelly: 'AI analysis unavailable' message shown when it shouldn't be
  • Gavelly: warn when report targets state-authorized charter schools
  • Set up linting infrastructure: Ruff + mypy + Pylint + pre-commit hooks
  • Microsoft OAuth: Cloudflare blocks callback for work/org accounts
v0.2 Mar 1, 2026 · January–February 2026 update
  • Feature Thematic cross-jury reports (homelessness, jails, fire districts, water districts, juvenile services)
  • Feature JPA jurisdiction support: direct (§925a) and indirect statewide JPA coverage
  • Feature GPU time-sharing scheduler for multi-service coordination
  • Feature CGJ entity-centric view pages: per-entity findings, recommendations, and response tracking
  • Feature County entities summary page with sorting, grouping, and confidence filter
  • Feature /juries/schedule page with non-standard term schedule support (calendar year, COVID extensions)
  • Improve CGJ entity matching: domain validation, context disambiguation, LLM enrichment
  • Improve Continuity / compliance reports: distinct flag + filtering across UI
  • Improve Sitecore /showpublisheddocument duplicate detection
  • Improve Cover-page title + publication-date extraction from PDFs
  • Improve Scanner-PDF detection and OCR quality scoring
  • Fix Many crawl, extraction, and entity-routing improvements through March 2026
v0.1 Dec 31, 2025 · Initial release
  • Feature Civil Grand Jury report archive launched: 58 California counties, year and report-level pages, OCR + extraction
  • Feature Daily monitor: detect new reports + website-structure changes; per-county digests
  • Feature Search: semantic RAG search across all CGJ reports
  • Feature Gavelly tooling for CGJ report quality checking (numbered findings, contradiction detection)
  • Feature Security middleware, CSRF enforcement, and Web Bot Authentication (RFC 9421) for crawler
  • Improve Civil Grand Jury extraction: GPU-accelerated Surya OCR + smart-quality detection + scanner-PDF flagging
  • Improve Crawler: automatic escalation from plain HTTP to a full headless browser for hard-to-fetch grand jury sites
  • Improve Year extraction priority and 15-minute extraction timeout
  • Fix Extraction and crawl pipeline improvements across the 58 California counties