Latest updates to D'ni Sphider

Latest news and information on the D'ni Sphider search engine.

December 8, 2022

D'ni Sphider - end of life

This service has gone largely unmaintained for the past 9 years and has had almost no new material added to it for several years. Many of the sites that were previously indexed no longer exist, so search results may well be useless.

The code for the search engine is now well out of date. A planned update, although started, was never completed due to lack of time, and given the facts above it now seems pointless to continue this exercise. So, with some reluctance, I am closing D'ni Sphider permanently.

August 21, 2013

Top twenty queries

Now that D'ni Sphider has accumulated over 9 million keyword-link relations, and a bit of usage, here's a top 20 list of search queries, just for a bit of fun:

Query Count Average results Last queried
realxcv 5 9.0 2013-08-15 01:47:37
Sleepers 4 13.0 2013-08-14 01:35:01
Chuckles58 3 10.0 2013-08-13 01:40:59
uru 3 154.3 2013-08-17 09:21:25
Myst 3 180.3 2013-08-19 11:32:25
OPenUru 3 47.0 2013-08-16 21:39:41
"It's begun whether we like it or not" 3 0.0 2013-08-11 16:28:38
RAWA 3 68.3 2013-08-16 19:12:51
mysterium 2013 2 33.0 2013-08-20 03:17:34
ae'gura 2 2.0 2013-08-16 05:47:09
"realXCV" 2 9.0 2013-08-10 06:14:13
mystpedia 2 11.0 2013-08-11 15:12:48
drc 2 106.0 2013-08-16 18:43:49
religion 2 34.0 2013-08-21 18:39:28
"revelation editor" 2 2.0 2013-08-15 01:47:37
relto 2 86.0 2013-08-15 00:05:27
"shed some light on the myst" 2 0.0 2013-08-11 17:23:48
open cave 2 62.0 2013-08-15 17:18:33
"mentioned the Myst music as original" 2 0.0 2013-08-16 19:21:19
atrus 2 117.5 2013-08-13 05:56:41
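
For anyone curious how the count, average results and last-queried columns could be derived from a raw query log, here is a minimal Python sketch. It is not the actual Sphider code: the function name `top_queries`, the shape of the log entries and the sample data are all assumptions made up for illustration. Queries are grouped exactly as typed, which would explain why realxcv and "realXCV" show up as separate rows.

```python
from collections import defaultdict


def top_queries(log, limit=20):
    """Summarise a query log into (query, count, average results, last queried).

    `log` is assumed to be an iterable of (query, result_count, timestamp)
    tuples, with timestamps as 'YYYY-MM-DD HH:MM:SS' strings so that plain
    string comparison orders them chronologically.
    """
    stats = defaultdict(lambda: {"count": 0, "results": 0, "last": ""})
    for query, result_count, timestamp in log:
        entry = stats[query]  # grouped by the exact text typed
        entry["count"] += 1
        entry["results"] += result_count
        entry["last"] = max(entry["last"], timestamp)

    rows = [
        (query, e["count"], round(e["results"] / e["count"], 1), e["last"])
        for query, e in stats.items()
    ]
    # Most frequent first; ties keep the order in which queries first appeared.
    rows.sort(key=lambda row: -row[1])
    return rows[:limit]


# Invented sample data, not taken from the real log.
SAMPLE_LOG = [
    ("uru", 154, "2013-08-12 10:00:00"),
    ("relto", 86, "2013-08-14 23:50:00"),
    ("uru", 155, "2013-08-17 09:21:25"),
    ("uru", 154, "2013-08-15 08:00:00"),
    ("relto", 86, "2013-08-15 00:05:27"),
]

if __name__ == "__main__":
    for row in top_queries(SAMPLE_LOG):
        print(*row)
```

The average is rounded to one decimal place to match the table; the tie-breaking rule is a guess, since the list above does not say how equal counts were ordered.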

Planned downtime

I'm planning to take D'ni Sphider down for some maintenance on Wednesday, August 21. When I originally set up the database for D'ni Sphider I forgot to check that it'd handle non-Latin characters properly, so there's a mix of Latin-1 and UTF-8 between the code and the database now (d'oh!). To fix that, I need to convert the database tables (and all their contents) from Latin-1 to UTF-8. Since the database is now around 1.4GB that could take a bit of time.

The outage is likely to be 7-9 PM BST (2-4 PM EDT, 8-10 PM CET), but I'll post here when the work is done.

The database has been modified and D'ni Sphider is back online. There are still some oddities with UTF-8 hanging around, but they're in the code and easier to pick off as I find them than the database issue.
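
To show what a Latin-1/UTF-8 mix like the one described above does to text, here is a small Python sketch. It only illustrates the general problem and one common repair; it is not the conversion that was actually run on the database, and the function names `garble` and `repair` are made up for the example.

```python
def garble(text: str) -> str:
    """Simulate the mix-up: text stored as UTF-8 bytes but read back as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Reverse the misreading, leaving text alone if it does not look garbled.

    Re-encoding as Latin-1 recovers the original bytes; if those bytes are
    not valid UTF-8 (or the text has characters outside Latin-1), the text
    was not garbled in this particular way and is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    broken = garble("café")
    print(broken)          # cafÃ©
    print(repair(broken))  # café
    print(repair("Myst"))  # plain ASCII passes through unchanged
```

Plain ASCII text is identical in both encodings, which is why a mix-up like this can go unnoticed until pages with accented or non-Latin characters are indexed.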

August 15, 2013

Update: New feedback link

I've added some links to the top of the search page (that's probably how you found this page, so it may be stating the obvious). What's maybe more important, though, is that one of the links is for "feedback". That can take many forms: asking us to add a site (or indeed remove one), reporting links that D'ni Sphider is returning that you don't think ought to be there, or just making a general comment, observation or suggestion. It saves trying to PM me through the forums!

August 14, 2013

Update: Now with added PDFs (sort of)

D'ni Sphider can now index (some) PDFs. This is an incomplete PDF extraction, but it should deal with most common things. It won't, obviously, extract data from any PDF that has security settings enabled to prevent copying of text; it doesn't seem to handle some older versions of PDF (not sure why yet, though); it can't locate URLs embedded within a PDF; and text added as captions to images seems to be getting dropped. But it's better than just letting all those PDFs out there go unindexed.

I'm sure that Tai'lahr (chief indexer and acting unpaid bot driver) has several sites identified as hosting PDFs that will now need to be reindexed.

I cleaned up the database this morning and removed just shy of 10,000 keywords that weren't associated with any valid link in the database. Those will have come from sites that were deleted or edited because the searches were going out of scope and catching things that weren't Uru/Myst/Cyan related. Even with that I've still got almost 230,000 unique keywords in the database, which in itself seems quite a remarkable statistic. I also got rid of nearly a quarter of a million entries that had got stuck in the temporary tables of the database (probably as a result of timeouts or errors during indexing).
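
The keyword clean-up described above amounts to deleting every keyword that no longer has a relation pointing at a link that still exists. A minimal sketch of that idea follows, using Python's built-in sqlite3 module and a deliberately simplified, hypothetical schema (`links`, `keywords`, `link_keyword`); the real D'ni Sphider tables and the queries actually used may well differ.

```python
import sqlite3


def remove_orphan_keywords(conn: sqlite3.Connection) -> int:
    """Delete keywords with no relation to an existing link; return how many went."""
    cur = conn.execute(
        """
        DELETE FROM keywords
        WHERE NOT EXISTS (
            SELECT 1
            FROM link_keyword AS lk
            JOIN links AS l ON l.link_id = lk.link_id
            WHERE lk.keyword_id = keywords.keyword_id
        )
        """
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Build a tiny in-memory database (hypothetical schema) and clean it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE links (link_id INTEGER PRIMARY KEY, url TEXT);
        CREATE TABLE keywords (keyword_id INTEGER PRIMARY KEY, keyword TEXT);
        CREATE TABLE link_keyword (link_id INTEGER, keyword_id INTEGER);
        INSERT INTO links VALUES (1, 'http://example.org/');
        INSERT INTO keywords VALUES (1, 'relto');
        INSERT INTO keywords VALUES (2, 'unused');    -- no relation at all
        INSERT INTO keywords VALUES (3, 'dangling');  -- relation to a deleted link
        INSERT INTO link_keyword VALUES (1, 1);
        INSERT INTO link_keyword VALUES (99, 3);
        """
    )
    removed = remove_orphan_keywords(conn)
    remaining = [
        row[0]
        for row in conn.execute("SELECT keyword FROM keywords ORDER BY keyword_id")
    ]
    conn.close()
    return removed, remaining


if __name__ == "__main__":
    print(demo())
```

The join against `links` matters: a keyword whose only relations point at links that have since been deleted counts as orphaned too, which is the case described for sites that were removed or edited.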

The metrics as of this moment for D'ni Sphider are: