From 54a3eab511b339f4bd4fdc3c4a5089312bd51f51 Mon Sep 17 00:00:00 2001
From: hiina
Date: Fri, 11 Apr 2025 18:05:09 -0600
Subject: [PATCH] fix typos and a few boogs, link git

---
 src/full-text-search.md | 3 +++
 src/index.md            | 9 ++++++++-
 src/substring-search.md | 7 +++++++
 src/thread-browser.md   | 5 ++++-
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/full-text-search.md b/src/full-text-search.md
index 60c195a..11f3054 100644
--- a/src/full-text-search.md
+++ b/src/full-text-search.md
@@ -11,6 +11,9 @@ is built offline and loaded as parquet files.
 It is a bit limited, in that it only searches stems of common words, and no search operators (I think).
 But still, very fast for ~1.3 million posts.

+If you see errors, try a different browser, or turn off the "advanced tracking protection". Not sure why it
+breaks this script particularly when the javascript is just as horrifying on the other pages, but oh well.
+
 ```js
 const schema_sql = `
 LOAD fts;
diff --git a/src/index.md b/src/index.md
index 1c521c1..ad981ec 100644
--- a/src/index.md
+++ b/src/index.md
@@ -18,7 +18,7 @@ I don't have thumbnails or images (yet), but I'm working on
 it. Until then, enjoy the data.

 All queries in your browser, which means you'll download a
-fair amount (~100MB) of data. So probably don't browse this your phone.
+fair amount (~100MB) of data. So probably don't browse this on your phone.

 ## Pages

@@ -74,6 +74,13 @@ This site uses [Observable Framework](https://observablehq.com), which includes
 a [DuckDB](https://duckdb.org) wasm build, which queries the archive as parquet
 files. It's kind of horrifying yeah but also cool.

+https://git.vrg.party/hiina/vrg-archive has the source if you want to stare that the sql.
+
+I don't have any scraper code uploaded yet, but full disclosure: it's all
+(almost) one-shot python slop by gemini 2.5 pro, so you might as well ask "I want to
+scrape a fuuka-based archiver for a single thread" and have it slop it out for
+you yourself.
+
 ### Can you add X feature?

 Maybe, post in the thread about it. If you don't want to wait, you can also just
diff --git a/src/substring-search.md b/src/substring-search.md
index 00d5aeb..18e610e 100644
--- a/src/substring-search.md
+++ b/src/substring-search.md
@@ -73,6 +73,13 @@ WHERE
 GROUP BY
   themonth,
   st.term -- Group by month AND by term
+ORDER BY
+  themonth,
+  st.term -- Sort by month AND by term
+```
+
+```js
+const width = 600;
 ```

 ```js
diff --git a/src/thread-browser.md b/src/thread-browser.md
index a82659b..d531ef1 100644
--- a/src/thread-browser.md
+++ b/src/thread-browser.md
@@ -7,7 +7,10 @@ sql:

 # Thread Browser

-Browser posts in old threads in a somewhat faithful format. It takes a bit to load and this framework is bad about indicating loading progress, so wait around if you don't see anything.
+Browser posts in old threads in a somewhat faithful format. It takes a bit to
+load and this framework is bad about indicating loading progress, so wait around
+if you don't see anything. **Select a thread to display from the table, else
+you'll see some (transient) errors**.

 There are no thumbnails or full images (yet).