Resurrecting MindTouch DekiWiki

Posted by Jedd on 2015-08-30

My favourite wiki -- DekiWiki by MindTouch -- used to run on a cranky old Pentium-based server that had started suffering from sufficiently frequent random reboots, plus MindTouch had evidently pivoted into a company that doesn't actually provide any of the stuff they said they'd provide forever, so a few years ago I powered that beast down and started what I thought would be a short search for a new personal wiki for me and the family to use.

General consensus was that the state of the art had improved substantially over the past decade, but also that there was less demand or requirement for personal (or minimally shared) wikis. Alas, people were wrong on both counts. I experimented with dokuwiki, mediawiki, moinmoin, wordpress (with many plugins), and even revisited joomla and drupal (also with many plugins).

But nothing quite got me to the same place that DekiWiki was at nearly a decade ago -- a fast wiki with a hierarchical structure and a decent editor.

Then, a few weeks ago, I was having lunch with a friend from a place where we'd both worked, and where we'd set up a DekiWiki as their internal document repository back in 2006 (IIRC). He mentioned that they're still running that same VM as their production wiki for some dozens of staff.

This made me realise -- I don't need something that's robust to the outside world, and I don't need to spend a few weeks trying to get wordpress, drupal, dokuwiki, joomla, or whatever, to kind of but not really play nice, with dodgy hierarchy & WYSIWYG plugins -- I can just stick with unsupported, but still (mostly) solid software.

A brief history of DekiWiki

Wikipedia has a concise page describing the potted history of MindTouch and the DekiWiki product. This is fortunate as the MindTouch page(s) do a very bad job of describing what the fuck happened to their product lineup .. and instead keep trying to force you to agree to be contacted by a salesperson so that you can then evaluate a product that they don't actually care to describe on their web site.

The closest thing I found to an explanation of the abandonment was this blog written back in 2013-04, by Aaron Fulkerson, describing The End of MindTouch Core & Platform (the things that most people recognised at the time as the wiki / cms product).

So, to backtrack a smidge, and very loosely -- MindTouch was a commercial entity that built an enterprise-quality CMS. They originally forked MediaWiki (that which Wikipedia runs on), then wrote a funky front end (in PHP) and back end (in C#), and produced a community edition, providing on-prem self-hosting, as well as a hosted (SaaS) edition later.

The system shipped as a virtual machine image -- and was a bit fragile if you tried to fiddle with the underlying OS too much -- they declined to ship robust .deb or .rpm packages, citing platform differences and development overheads in trying to do so (apt-cache stats on my current Debian jessie workstation suggests that some 73,000 packages are available, so 'development overhead' for Debian packages is perhaps a solved problem for many developers).

Anyway, the point was that you had to be running VMware (though I once did get the image converted to run under VirtualBox), and you had to be happy to suck it up and trust in their occasional update releases, while trying to resist the urge to be too eager in updating the underlying shipped OS image. Having said that, it wasn't too difficult or risky to update key components -- mysql, apache, php, mono, and so on. And given it was a VM, it was close to painless to rollback if necessary.

Although, as noted below, there is some evidence of basic system bewilderment in the way they configured some services (cron, for example), which suggests the developers didn't really get (or didn't really care about) the way Debian systems should be managed.

Anyhoo.

In 2011 they announced the shift, and in 2013 they confirmed the product was dead, and if anyone wanted support or new versions they had to host it in the cloud -- a horrendous phrase that almost no one understood at the time.

Not that anyone really understands what in the cloud means today, either.

Fuck I hate that phrase.

Anyhoo.

The irony of posting a blog saying "We're going to drastically change our business model, and the software you relied upon yesterday will not be supported tomorrow", while then going on to say "Instead we recommend you move all your data into our hosted systems -- trust us -- they'll be there forever" was evidently lost on them at the time.

I'm sure plenty went along for the ride. Others were kinda stuck (or a word that kinda rhymes).

Because the underlying architecture was to store content in XML files (I shit thee nay), migration away to another platform wasn't especially easy. And that's ignoring the problem of trying to find a replacement product that would satisfy a range of technical and non-technical users. That last bit is the real challenge.

On the other hand, the product was rock solid, and people using it on intranets weren't quite so worried about flakiness in the PHP front end or other unpatched issues that may pop up on the underlying Debian OS. As noted, general security patching worked fine for the OS components, and major Debian releases didn't happen too frequently.

On the other other hand, the thing had always been a bit painful simply because of its dodgy delivery (here, have a whole VM), challenges with maintaining the base OS, and flakiness in choice of platform (OMG -- it's C#, mono, and php -- the three worst bits of technology currently available -- and they weren't installed as Debian packages).

So the volume of people who were abandoned was not spectacularly high, and one consequence of that was that no one stepped up to start a community supported edition of the product. In retrospect, MindTouch probably would not have been happy about this -- they appear to be disinclined to maintain (m)any of the old web resources (forums, faqs, developer guides, etc) that were tremendously useful when troubleshooting this bit of free software, which gives some small insight.

Pondering this, even aside from the stinginess aspect (come on, it's a tiny $-cost to keep a VM with old forums and developer pages up especially if only in read-only mode), if they'd kept the forums going, even if they didn't contribute any staff time for helping users, it would have kept the largest set of potential customers for their new in the cloud products close by and in frequent communication with the company.

Oh well.

Getting the VM image

So, Sourceforge still maintains the canonical copy of DekiWiki -- and despite SF's relative evilness, it's fairly safe to download a zip file that contains a vmdk/vmx pairing, as they're unlikely to be inserting advertisements into binary blobs ... at least not just yet.

The DekiWiki at SourceForge Files Download page has the penultimate version of MindTouch Deki VM -- 10.1.3, the Pipestone release. 10.1.4 is available, but we can only get it from within the VM itself for some reason.

Note that 10.1.3 has a release date of 2012-01-26.

Annoyingly it's using Lenny (see below), and Lenny was superseded by Squeeze in 2011-02.

Fixing the VM image

First, we need to transform the disk image to something ESXi can use. ESXi is VMware's $-free variant of their hypervisor. As noted above it's possible to wrangle the DekiWiki image such that it runs on other hypervisors, but it's a bit of a mess, and as they say in the classics it's beyond the scope of this document. That includes the idea of running it under VMware Workstation, which probably wouldn't be too painful to get going, just a little bit silly.

The image file that comes off Sourceforge is only 600MB. The VM's disk image is sized to 30GB.

On ESXi it's possible to have Thin Provisioning, and I tend to pick that out of habit, as the performance benefits of Thick Provisioning seldom outweigh the capacity management issues of lots of reserved but never used space. Especially with applications like this that won't be trying to do large writes very often. Anyway, take your pick.

Copy (using scp) the MindTouch.vmdk and MindTouch.vmx files to your ESXi box -- pick somewhere other than where you host your VMs. The vmdk is 1.5GB at rest, but may blow out to a ~30GB image when it lands on a VMFS partition. We'll fix that shortly.

Enable SSH access to ESXi, if you haven't already, then connect to it. Create a directory that will house your actual VM -- on my system that's /vmfs/volumes/FAST3/py-deki-01

Assuming my path name above, and that you copied the files to /vmfs/volumes/scratch/foo/ (obviously adjust to suit your directory structure)

cd /vmfs/volumes/FAST3/py-deki-01
vmkfstools -i /vmfs/volumes/scratch/foo/MindTouch.vmdk -d thin py-deki-01.vmdk

This will convert the file to an ESXi-friendly image, set provisioning to Thin, reducing the on-disk size back to around 1 or 2GB.
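
To confirm that thin provisioning actually took, it helps to compare a file's apparent size with the space it really occupies. The following is a minimal sketch using only ls, du and awk; the function names are mine, and it assumes that your ESXi shell has those tools and that the data lives in a companion -flat.vmdk file next to the descriptor that vmkfstools wrote.

```shell
# Apparent (logical) size of a file in KB, taken from the size column of ls -l.
apparent_kb() {
    ls -l "$1" | awk '{ print int($5 / 1024) }'
}

# Space actually allocated on disk in KB, as reported by du.
ondisk_kb() {
    du -k "$1" | awk '{ print $1 }'
}

# Print both figures side by side for one file.
show_sizes() {
    printf '%s: apparent %s KB, on disk %s KB\n' \
        "$1" "$(apparent_kb "$1")" "$(ondisk_kb "$1")"
}
```

Something like show_sizes py-deki-01-flat.vmdk (file name assumed) should then report an apparent size near 30GB and a much smaller on-disk figure.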

Copy over the MindTouch.vmx file, and edit it (or edit it before you send it to ESXi if you don't like using vi).

Change two lines:

scsi0:0.fileName = "py-deki-01.vmdk"

and

displayName = "py-deki-01"
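
If you'd rather not open an editor at all, the same two changes can be scripted. This is a sketch only: retarget_vmx is a name I've made up, and it assumes both keys sit at the start of a line in the shipped vmx, as they do in the lines quoted above. It keeps a .bak copy of the original.

```shell
# Point a copied .vmx at the converted disk and rename the guest.
# $1 = path to the .vmx file, $2 = new guest name (e.g. py-deki-01)
retarget_vmx() {
    vmx="$1"
    name="$2"
    cp "$vmx" "${vmx}.bak"
    sed -e "s|^scsi0:0.fileName *=.*|scsi0:0.fileName = \"${name}.vmdk\"|" \
        -e "s|^displayName *=.*|displayName = \"${name}\"|" \
        "${vmx}.bak" > "$vmx"
}
```

For the layout used here that would be retarget_vmx MindTouch.vmx py-deki-01, run from the VM's directory.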

At this point we're ready to import it into ESXi's Inventory.

Import appliance and upgrade hardware

On ESXi GUI -- I'm using the free VIClient (Win32) -- go to Storage, DataStore Browser, and then navigate to the directory you've set up for this VM. Right click on the vmx file and 'Add to Inventory'. Some gronking will occur.

DO NOT right-click on the appliance and choose 'upgrade virtual hardware', as this defaults to moving to Version 10, which then becomes unmanageable if you're using the free client. If you're a paid-up user of vSphere, then knock yourself out, I guess.

Start the guest machine. Log in to the virtual appliance using the ESXi console, with user root and a password of password. Then run shutdown. This will give you all the other goodies (files) that a virtual machine needs, and tidy up any automatically tidyable confusion about the format of the vmx file.

To upgrade the VM image to version 9 (still accessible from the free VIClient) do the following:

With the guest powered off, from the ESXi CLI run:

vim-cmd vmsvc/getallvms

Make a note of the VM ID of the deki appliance, and then:

vim-cmd vmsvc/upgrade {vm id} vmx-09

Substitute in the ID as appropriate.
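
Picking the ID out by eye is fine for a handful of guests. As a sketch, the lookup can also be done with awk; vmid_for is a hypothetical helper, and it assumes the getallvms listing has one header line followed by rows whose first two columns are the Vmid and a Name without spaces.

```shell
# Print the Vmid of the guest whose Name column matches $1 exactly.
# Reads the output of `vim-cmd vmsvc/getallvms` on stdin.
vmid_for() {
    awk -v name="$1" 'NR > 1 && $2 == name { print $1 }'
}

# Example use on the ESXi CLI (not run here):
#   vmid=$(vim-cmd vmsvc/getallvms | vmid_for py-deki-01)
#   vim-cmd vmsvc/upgrade "$vmid" vmx-09
```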

Your guest is now in version 9 format.

I edited the guest's settings -- removing the existing network adapter, and adding a new one, using the E1000 NIC type.

I also bumped the memory from 256MB up to 512MB, as I'm not short of RAM on this ESXi host.

At this point we're ready to start the appliance.

First steps with the dekiwiki virtual machine

Power on the guest, watching the output in the console. In particular, check that the network comes up okay.

I use a DHCP-issued IP address from my ADSL router, based upon the MAC address of the newly created E1000 NIC. You can alternatively force a fixed IP in /etc/network/interfaces if that's your preference.
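
For the fixed-IP route, the stanza in /etc/network/interfaces looks roughly like what this sketch prints. The helper name is mine, and the interface name, netmask and gateway in the example call are assumptions about your network, not values taken from the appliance.

```shell
# Print an /etc/network/interfaces stanza for a fixed address.
# $1 = interface, $2 = address, $3 = netmask, $4 = gateway
static_stanza() {
    iface="$1"
    address="$2"
    netmask="$3"
    gateway="$4"
    cat <<EOF
auto $iface
iface $iface inet static
    address $address
    netmask $netmask
    gateway $gateway
EOF
}

# Example (values assumed):
#   static_stanza eth0 192.168.1.250 255.255.255.0 192.168.1.1
```

The output replaces the existing dhcp stanza for that interface; paste it in by hand rather than appending blindly.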

Log in at the console and change the root password (remember, it's currently set to password).

Edit /etc/ssh/sshd_config and change PermitRootLogin to yes. Restart SSH, then copy across your preferred public keys to /root/.ssh/authorized_keys, change PermitRootLogin in sshd_config to without-password, and restart the ssh server once more.

This will give you passwordless login to the box, with half-way decent security (with the obvious caveat that it's running Lenny-era php).
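
Since PermitRootLogin gets flipped twice, a small helper saves some vi time. This is a sketch, with a made-up function name; it assumes the setting appears at most once (commented out or not) and appends it if it's missing. A .bak copy of the file is left alongside.

```shell
# Set PermitRootLogin in an sshd_config file.
# $1 = path to sshd_config, $2 = value (yes, no, without-password)
set_permit_root_login() {
    conf="$1"
    value="$2"
    cp "$conf" "${conf}.bak"
    if grep -q '^[#[:space:]]*PermitRootLogin[[:space:]]' "${conf}.bak"; then
        # Rewrite the existing line, uncommenting it if necessary.
        sed "s/^[#[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin ${value}/" \
            "${conf}.bak" > "$conf"
    else
        printf 'PermitRootLogin %s\n' "$value" >> "$conf"
    fi
}

# Example (not run here):
#   set_permit_root_login /etc/ssh/sshd_config without-password
#   /etc/init.d/ssh restart
```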

Navigate to the IP address (or if you've set up address allocation, and a DNS name, to the name instead) of the VM.

On first login you get three installation options -- Desktop Suite, MindTouch Core, and some Success Cruft. Choose the MindTouch Core.

Enter some data as requested, and proceed.

A licence is sent to your (publicly accessible) email address here. This is an XML file that comes through as an attachment, which you then have to upload. This part of the process is quite painless, though if / when MindTouch abandons whatever system they have for generating these licences we're a bit fucked. (You may care to repeat this process a couple of times at this point if you want to hoard some potential VMs, as some kind of insurance against a one-day sysadmin at the company wondering what happens if they flick that switch over there ...)

As noted, we're now running 10.1.3 -- but 10.1.4 is available and can be updated via the CLI using /usr/bin/updateWiki.sh

At this point I do the first full backup of the appliance (shut the guest down, go into ESXi, do a copy of the vmdk file using the same method described above to convert the image).

Once the guest is back up, run /usr/bin/updateWiki.sh -- perhaps surprisingly it works fine, despite coming off a by-definition crufty subversion repository somewhere in MindTouch's network, specifically https://svn.mindtouch.com/source/trunk/product/deki/web

Via the GUI, go into system Settings / Extensions and disable all the cruft you don't care for -- in my case this included: dapper, accuweather, addthis, digg, flickr, flowplayer, linked in, quicktime, rtm, skype, twitter, windows live, yahoo (seriously?).

(It may actually work fine as a 256MB spec once these are removed.)

I note that I get a follow-up, automated email from MindTouch saying that this appliance was last updated in 2010, and that I should follow the link provided to get a hosted or supported version. Depressingly, both links actually get redirected to the main www.mindtouch.com page, with its suspiciously information-absent Making Success Easy bollocks.

Now we're ready to try a Debian upgrade -- we have to do this in two steps, via squeeze, and I'll spoil the later surprise that I haven't (yet) been able to get onto jessie, only as far as wheezy.

Upgrading to squeeze

Remember, do backups, especially if you're going to try something slightly different.

I've done the following twice, with a few variations, so I'm reasonably confident it'll work for you.

The following are all my use case, so adjust to suit your requirements of course.

Update the /etc/hosts file with your hostname. Not essential, but good practice.

192.168.1.250   py-deki-01 py-deki-01.int.jeddi.org

Change /etc/hostname from mindtouch to py-deki-01

Reboot

Remove the shipped SSH keys. I can't recall when in Debian's history we had the problem with these, so I did this again later when I got to wheezy. No great cost.

cd /etc/ssh
rm ssh_host_*
dpkg-reconfigure openssh-server

This gives us unique (non-shipped) server keys that match the new hostname. Changing the hostname hasn't, so far as I've seen, broken anything later.

I noted errors at boot about eth1 not existing -- so I removed those three lines from /etc/network/interfaces

I had to apt-get install the linux-image (2.6.26) package as it was showing (dpkg -l | grep -v ^ii) as being not fully installed -- and dpkg-reconfigure didn't fix it.

I changed /etc/apt/sources.list to:

deb http://mirror.internode.on.net/pub/debian/ squeeze         main
deb http://security.debian.org/                squeeze/updates main

Obviously pick a more local mirror, after ensuring squeeze is on it.
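
As the same two-line file gets rewritten for each release, here is a sketch that generates it. sources_for is a hypothetical helper; the default mirror is simply the one I use above, and the second argument lets you swap in your own.

```shell
# Print a minimal sources.list for a Debian release.
# $1 = release name (squeeze, wheezy), $2 = optional mirror URL
sources_for() {
    release="$1"
    mirror="${2:-http://mirror.internode.on.net/pub/debian/}"
    printf 'deb %s %s main\n' "$mirror" "$release"
    printf 'deb http://security.debian.org/ %s/updates main\n' "$release"
}

# Example (not run here):
#   cp /etc/apt/sources.list /etc/apt/sources.list.lenny
#   sources_for squeeze > /etc/apt/sources.list
```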

Then run:

apt-get update ; apt-get dist-upgrade

Many many warnings float past, and lots of package config file questions come up. I said 'No' to pretty much everything in terms of replacing configuration files. Usually 'no' is safe enough. Despite all the missing locale data, mktemp failing with dodgy parameters, legacy grub fallback warning, updating of disk id's, and so on, the upgrade works okay.

Pay particular attention to the my.cnf (mysql) file -- I compared the existing and the new one, and decided the new one was safe. It was not. The culprit is either the bump from 128k to 192k for the thread_stack parameter (very unlikely), or one of the skip_bdb or skip_innodb settings in the MindTouch-shipped my.cnf.

In any case, you have been warned.
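
Comparing the two files is easier with the comments stripped out. The sketch below does that and diffs what's left; the function names are mine, and because the lines are sorted, section headers like [mysqld] lose their position, so treat the output as a hint rather than a verdict.

```shell
# Print the effective settings of a my.cnf-style file: comments and blank
# lines dropped, whitespace around "=" removed, lines sorted.
cnf_settings() {
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" \
        | sed 's/[[:space:]]*=[[:space:]]*/=/' \
        | LC_ALL=C sort
}

# Diff the effective settings of two files ($1 = old, $2 = new).
# Returns diff's exit status: 0 if identical, 1 if they differ.
cnf_diff() {
    old_tmp=$(mktemp) || return 2
    new_tmp=$(mktemp) || return 2
    cnf_settings "$1" > "$old_tmp"
    cnf_settings "$2" > "$new_tmp"
    diff "$old_tmp" "$new_tmp"
    status=$?
    rm -f "$old_tmp" "$new_tmp"
    return $status
}
```

When dpkg asks about a changed config file it leaves the other copy beside it, as my.cnf.dpkg-dist if you kept yours or my.cnf.dpkg-old if you took the new one, so something like cnf_diff /etc/mysql/my.cnf /etc/mysql/my.cnf.dpkg-dist (paths assumed) shows what would change.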

Once done, reboot, and marvel at the smarts of the Debian developers.

Navigate to the web site to make sure that your wiki is there and happy to let you login.

Run the upgrade-from-grub-legacy script, choose to install to sda only, as per the recommendation.

Reboot again to confirm grub is still happy.

We are now running kernel 2.6.32, up from 2.6.26

You can resolve the annoying Could not reliably determine the server's fully qualified domain name error generated when Apache starts up by doing the following.

Edit /etc/hosts and change the line:

127.0.1.1       mindtouch.localdomain   mindtouch

to:

127.0.1.1       mindtouch.localdomain   mindtouch  py-deki-01

(Or whatever host name you're using.) I'm keeping the mindtouch names there in case some mono / php components talk to each other with that hostname rather than localhost.

Run apachectl configtest to confirm that it's happy now.

Maybe do another backup of your guest VM, as we're about to do an upgrade to wheezy.

Upgrading to wheezy

Feeling a bit more cautious about dragging some of this cruft into the future, I did the following to help troubleshooting.

It's entirely optional, but equally it's a tremendously good idea for all your boxes.

From bash on the dekiwiki CLI:

apt-get install git
apt-get install etckeeper

Change the /etc/apt/sources.list to point towards wheezy:

deb http://mirror.internode.on.net/pub/debian/ wheezy         main
deb http://security.debian.org/                wheezy/updates main

And kicked off another:

apt-get update

Noted the following errors:

W: There is no public key available for the following key IDs:
7638D0442B90D010
W: There is no public key available for the following key IDs:
9D6D8F6BC857C906

Fixed these by running:

apt-get install debian-keyring debian-archive-keyring
apt-get update

All good now. Run the distribution upgrade proper:

apt-get dist-upgrade -u

A warning is raised about wanting to upgrade /etc/crontab -- a file that should never be changed by local administrators, but it turns out that MindTouch didn't understand this, and put in a call to /usr/bin/checkdeki that runs every five minutes {sigh}

Aside: I note in that script that there's a MON_MEM_THRESHOLD of 300MB -- no idea what that threshold actually refers to (script documentation wasn't their strong point either) but the decision to bump to 512MB physical memory is probably sound.

The bigger point is that people shouldn't be fiddling with /etc/crontab -- such fragments should be dropped into /etc/cron.d/ -- and it's something that I (and you) may want to fix up later.
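
A sketch of that later fix-up: lift the offending line out of /etc/crontab and into its own fragment. move_cron_entry is a made-up helper; it relies on /etc/cron.d files using the same format as /etc/crontab (user field included), and on the fragment name containing no dots, which Debian's cron would otherwise ignore.

```shell
# Move every line matching a pattern from a crontab file into a fragment.
# $1 = crontab file, $2 = pattern, $3 = fragment file to create
move_cron_entry() {
    crontab_file="$1"
    pattern="$2"
    fragment="$3"
    # Bail out, leaving everything untouched, if nothing matches.
    if ! grep -e "$pattern" "$crontab_file" > "$fragment"; then
        rm -f "$fragment"
        return 1
    fi
    cp "$crontab_file" "${crontab_file}.bak"
    grep -v -e "$pattern" "${crontab_file}.bak" > "$crontab_file" || true
}

# Example (not run here):
#   move_cron_entry /etc/crontab /usr/bin/checkdeki /etc/cron.d/checkdeki
```

With the local line gone, /etc/crontab should match what the package expects again, and the next upgrade has one less thing to ask about.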

Also /etc/sudoers wants to update its configuration file, but while the bulk is consistent boilerplate, in the shipped sudoers there's a line:

www-data ALL=(root) NOPASSWD: /etc/init.d/dekiwiki

Which will allow Apache (or rather, the web app) to restart the wiki service at will.

Another thing that may need revisiting if I ever finish all the other stuff on my ToDo list.

Mind, it's also an entry in the other very long list of things that are likely to savagely break the guest when we try an upgrade to systemd (on jessie).

Anyway, the dist-upgrade goes fine, and after a reboot we are running kernel version 3.2.0

Shut it down, do another snapshot, bring it back, and proceed to use your dekiwiki.

There are a couple of things that are outstanding for me ...

Things I've yet to sort out

Upgrading to Jessie (Debian stable release as of 2015-04)

This fails -- monumentally -- and I need to revisit it.

I always suspected it would, as there are massive changes to the OS on this release, most notably systemd.

The approach will be to take a ps aux and perhaps pstree output from a running wheezy snapshot, and compare it against the same output after upgrading to jessie.
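
I haven't done this yet, so the following is only a sketch of how the comparison might go: capture the set of process names before and after, then list whatever went missing. The function names are mine, and it assumes the procps ps that ships with wheezy and jessie.

```shell
# List the distinct command names of all running processes, one per line.
snapshot_procs() {
    ps -eo comm= | LC_ALL=C sort -u
}

# Print names present in the first snapshot file but absent from the second.
# Both files must be sorted the way snapshot_procs sorts them.
missing_after() {
    LC_ALL=C comm -23 "$1" "$2"
}

# Example (not run here):
#   snapshot_procs > /root/procs.wheezy     # on the wheezy snapshot
#   snapshot_procs > /root/procs.jessie     # after the upgrade
#   missing_after /root/procs.wheezy /root/procs.jessie
```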

Also there are lots of log files for apache and dekiwiki itself, and I recall some old mailing list items that talked about ways to interrogate the MindTouch REST API (via http) to determine the state of the daemon, and what ails it. Unfortunately much of that data is no longer on public web sites, and access to people who may be able to help is pretty much gone, also.

Alternatively rather than dist-upgrade, try to make the move to systemd semi-manually, as it may not be systemd that's causing us grief, and/or (more likely) it's a combination of systemd and other upgrades.

Backup and archive of my dekiwiki

I have some scripts, somewhere, that I need to relocate -- or just write some new ones -- that connect to the running appliance and do a mysqldump, then copy the resultant data back to my archiving VM for long term backup and storage.

I'm sorely tempted to just have a script running from my core VM that ssh's into the dekiwiki, shuts it down, then ssh's to the ESXi CLI, does a vmkfstools based backup of the entire VM, and then restarts the guest. There's only three of us that'll be using the VM, and we're all in the same (mostly) timezone, so this isn't going to be a huge inconvenience, especially as it'll involve a 5-10 minute outage once a week, and be used in addition to a daily mysql backup.
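
That weekly job might look something like the sketch below. Nothing here has been run against my ESXi host: the ESXi hostname, VM ID and backup directory are placeholders, and the wait loop assumes that vim-cmd vmsvc/power.getstate prints the words Powered off for a stopped guest. Setting DRY_RUN prints the commands instead of running them.

```shell
GUEST=py-deki-01                                     # guest hostname used in this post
ESXI=esxi.example                                    # hypothetical ESXi hostname
VMID=1                                               # hypothetical; see vim-cmd vmsvc/getallvms
SRC=/vmfs/volumes/FAST3/py-deki-01/py-deki-01.vmdk   # disk created earlier in this post
DEST_DIR=/vmfs/volumes/scratch/backups               # hypothetical backup directory

# Run a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Poll ESXi until the guest reports that it is powered off.
wait_powered_off() {
    [ -n "${DRY_RUN:-}" ] && return 0
    until ssh "root@$ESXI" vim-cmd vmsvc/power.getstate "$VMID" | grep -q 'Powered off'; do
        sleep 10
    done
}

# Shut the guest down, clone its disk as a thin image, start it again.
# $1 = date stamp to embed in the backup's file name
backup_guest() {
    stamp="$1"
    run ssh "root@$GUEST" shutdown -h now
    wait_powered_off
    run ssh "root@$ESXI" vmkfstools -i "$SRC" -d thin "$DEST_DIR/py-deki-01-$stamp.vmdk"
    # Power the guest back on even if the copy failed.
    run ssh "root@$ESXI" vim-cmd vmsvc/power.on "$VMID"
}
```

A cron entry on the core VM would call backup_guest "$(date +%Y-%m-%d)", ideally after a first pass with DRY_RUN=1 to check the commands look sane.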

I recall that dekiwiki used to store its attachments / photos / other files in a weird and whacky location, and the references to them were based on something vaguely incrementally indexed, rather than hashes, which meant occasionally if you did a restore and didn't have the DB aligned with your attachments, you'd get the wrong photos appearing on some pages. Very messy, but I may be misremembering or misattributing the root cause of those old woes.

Either way, 3GB for a base backup is pretty tight, and I only need to keep a half dozen of these cycling through over the past several weeks.
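
Cycling the old ones out can be a few lines of shell. This sketch assumes the backup directory holds nothing but backups, and that each file name embeds an ISO date so that sorting by name is sorting by age; prune_backups is a made-up name.

```shell
# Keep only the newest $2 entries in directory $1, judged by file name.
prune_backups() {
    dir="$1"
    keep="$2"
    ls -1 "$dir" | LC_ALL=C sort -r | tail -n +"$((keep + 1))" | while read -r name; do
        rm -f "$dir/$name"
    done
}

# Example (not run here):
#   prune_backups /vmfs/volumes/scratch/backups 12
```

A vmkfstools copy lands as a pair of files, a descriptor plus a -flat.vmdk, so keeping six backups means keeping twelve names.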

On the subject of sizing, you can probably gain some benefit from (after all the updates done above) running:

apt-get clean

This removes the ~500MB of .deb packages in /var/cache/apt/archives. I don't know if ESXi will attempt to re-use previously allocated logical disk space first -- if it doesn't then it's not too much of a pain, but if so then it means your backups are up to 500MB smaller (at least until your wiki grows to more than 500MB of content).

Either way, du on ESXi tells me that my virtual dekiwiki guest is now only 3.7GB, which is easy to manage in 2015.

Migrating content from my old wiki

While I had been experimenting with dokuwiki and moinmoin, much of my historical data is in a very old dekiwiki instance -- and at the moment my thinking is to not entertain any idea of a backup and restore process, but rather to go through all the pages and migrate stuff sensibly.

This will work for me because a) it's not a monumental number of pages, b) they need some cleansing, c) I prefer to be confident of data integrity / accuracy with these bits of information, d) I don't have many files or photos attached within my old wiki, and e) I really, really, don't trust the export / import process, especially between different wiki database versions.


If you decide to go down this path, good luck!