State of my toolchain 2022

Welcome to my now-nearly-yearly State of my Toolchain report (you can see previous editions for 2021, 2019, 2018 and 2016). I began these posts as a way to document the tools, applications and hardware that were useful to me in the work that I did, but also to help observe how they shifted over time – as technology evolved, my tasks changed, and as the underpinning assumptions of usage shifted. In this year’s post, I’m still going to cover my toolchain at a glance, and report on what’s changed, and what gaps I still have in my workflow – and importantly – reflect on the shifts that have occurred over 5 years.

At a glance

Hardware, wearables and accessories

Software

  • Atom with a range of plugins for writing code, thesis notes (no change since last report)
  • Pandoc for document generation from Markdown (no change since last report)
  • Zotero for referencing (using Better BibTeX extension) (no change since last report)
  • OneNote for Linux by @patrikx3 (no change since last report)
  • Nightly edition of Firefox (no change since last report)
  • Zoom (no change since last report)
  • Microsoft Teams for Linux (no change since last report)
  • Gogh for Linux terminal preferences (no change since last report)
  • Super Productivity (instead of Task Warrior) (changed since last report)
  • Cuckoo Timer for Pomodoro sessions (changed since last report)
  • RescueTime for time tracking (no change since last report)
  • Beeminder for commitment based goals (no change since last report)
  • Mycroft as my Linux-based voice assistant (no change since last report)
  • Okular as my preferred PDF reader (instead of Evince on Linux and Adobe Acrobat on Windows) (changed since last report)
  • NocoDB for visual database work (changed this report)
  • ObservableHQ for data visualisation (changed this report)

Techniques

  • Pomodoro (no change since last report)
  • Passion Planner for planning (no change since last report)
  • Time blocking (used on and off, but a lot more recently)

What’s changed since the last report?

There’s very little that’s changed since my last State of My Toolchain report in 2021: I’m still doing a PhD at the Australian National University’s School of Cybernetics, and the majority of my work is researching, writing, interviewing, and working with data.

Tools for PhD work

My key tools are MaxQDA for qualitative data analysis – Windows only, unfortunately, and prone to being buggy with OneDrive. My writing workflow is done using Atom. One particularly useful tool I’ve adopted in the last year has been NocoDB – it’s an open source alternative to visual database interfaces like Notion and Airtable, and I found it very useful – even if the front end was a little clunky. Working across Windows and Linux, I’ve settled on Okular as my preferred PDF reader and annotator – I read on average about 300-400 pages of PDF content a week, and Adobe Acrobat was buggy as hell. Okular has fine-grained annotating tools, and the interface is the same across Windows and Linux. Another tool I’ve started to use a lot this year is ObservableHQ – it’s like Jupyter notebook, but for d3.js data visualisations. Unfortunately, they’ve recently brought in a change to their pricing structure, and it’s going to cost me $USD 15 a month for private notebooks – and I don’t think the price point is worth it.

Hardware and wearables

The key changes this year are a phone upgrade – my Pixel 3 screen died, and the cost to replace the screen was exorbitant – a classic example of planned obsolescence. I’ve been happy with Google’s phones – as long as I disable all the spyware voice-enabled features – so I settled on the Pixel 4a 5G. It’s been a great choice – clear, crisp photos, snappy processor, and excellent battery life.

After nearly four years, my Mobvoi Ticwatch Pro started suffering the “ghost touch” problem, where the touch interface started picking up non-existent taps. A factory restart didn’t solve the problem, so I got the next model up – the Ticwatch Pro 2020 – at 50% off. This wearable has been one of my favourite pieces of hardware – fast, responsive, durable – and I can’t imagine not having a smartwatch now. I’ve settled on the Flower watch face after using Pujie Black for a long time – both heavily customisable. The love Google is giving to Wear OS is telling – I have much smoother integration between phone apps and Wear OS apps than even 1-2 years ago.

After having two Plantronics Backbeat Pro headphones – one from around 2017 and the other circa 2021, both still going, but the first with very poor battery life and battered earpads – I invested in my first pair of reasonable headphones – the Sennheiser Momentum Pro 3. The sound quality is incredible – I got them for $AUD 300 which I thought was a lot to pay for headphones, but they’ve been worth every penny – particularly when listening to speech recognition data.

With so much PhD research and typing, I found my Logitech MK240 just wasn’t what I needed – it’s a great little unit if you don’t have anything else, but it was time for a mechanical keyboard because I love expensive hobbies. After some research, and a mis-step with the far too small HuoJi Z-88 (the keypresses for Linux command line tasks were horrendous), I settled on the Keychron K8 and haven’t looked back. Solid, sturdy, blue Gateron switches – it’s a dream to type on, and works well across Windows and Linux. However, on Linux it uses a Mac keyboard layout, so I had to do some tweaking with a keymapper, and used keyd. My only disappointment with Keychron is the hackiness needed to get it working properly on Linux.

Productivity

My Passion Planner is still going strong, but I haven’t been as diligent about using it as a second brain as I have been in the past, and the price changes this year meant that shipping one to Australia cost me nearly $AUD 120 in total – and that’s unaffordable in the longer term – so I’m actively looking at alternatives such as Bullet Journalling. The Passion Planner is great – it’s just expensive.

I’ve also dropped Task Warrior in favour of Super Productivity this year. Task Warrior isn’t cross-platform – I can’t use it on Android, or on Windows, and thanks to MaxQDA software, I’m spending a lot more of my time in Windows. The Gothenburg Bit Factory are actively developing Task Warrior – full transparency, I’m a GitHub sponsor of theirs – but the cloud-based and cross-platform features seem to be taking a while to come to fruition.

I’m also using time-blocking a lot more, and am regularly using Cuckoo as a pomodoro timer with a PhD cohort colleague, T. We have an idea for a web app that optimises the timing of Pomodoros based on a feedback loop – but more on that next year.

Current gaps in my toolchain

Visual Git editor

In my last State of My Toolchain report, I lamented not having a good Visual Git Editor. That’s been solved in Windows with GitHub’s desktop application, but as of writing the Linux variant appears to be permanently mothballed. I’m sure this has nothing to do with Microsoft buying GitHub. So I am still on the lookout for a good Linux desktop Git GUI. On the other hand, doing everything by CLI is always good practice.

Second Brain

In my last report I also mentioned having taken Huginn for a spin, and being let down at its immaturity. It doesn’t seem to have come very far. So I’m still on the lookout for “Second Brain” software – this is more than the knowledge management software in the space that tools like Roam and Obsidian occupy, but much more an organise-your-life tool. The Microsoft suite – Office, Teams, and their stablemates – are trying to fill this niche, but I want something that’s not dependent on an enterprise login. But I’ve decided to reframe this gap as a “Second Brain” gap – after reading Tiago Forte’s book on the topic.

The Fediverse

Triggered by Elon Musk’s purchase, and subsequent transformation of Twitter into a flaming dumpster fire, I’ve become re-acquainted with the Fediverse – you can find me on Mastodon here, on Pixelfed.au here, and on Bookwyrm here. However, the tooling infrastructure around the Fediverse isn’t as mature – understandably – as commercial platforms. I’m using Tusky as my Android app, and the advanced web interface. But there is a lack of hosting options for the Fediverse – I can’t find a pre-configured Digital Ocean Droplet for Mastodon, for example – and I think the next year will see some development in this space. If you’re not across Mastodon, I wrote a piece that uses cybernetic principles to compare and contrast it with Twitter.

5 years of toolchain trends

After five years of the State of My Toolchain report, I want to share some reflections on the longer-term trends that have been influential in my choice of tools.

Cross-platform availability and dropping support for Linux

I work across three main operating systems – Linux, Windows (because I have to for certain applications) and Android. The tools I use need to work seamlessly across all three. There’s been a distinct trend over the last five years for applications to start providing Linux support but then move to a “community” model or drop support altogether. Two cases in point were Pomodone – which I dropped because of its lack of Linux support, and RescueTime – which still works on Linux for me, albeit with some quirks (such as not restarting properly when the machine awakes from suspend). This is counter-intuitive given the increasing usage of Linux on the desktop. The aspiration of many Linux aficionados that the current year will be “The Year of Linux on the Desktop” is not close to fruition – but the statistics show a continued, steady rise – if small – in the number of Linux desktop users. This is understandable though – startups and small SaaS providers cannot justify supporting such a small user base. That said, they shouldn’t claim to support the operating system then drop support – as both Pomodone and RescueTime have done.

Takeaway: products I use need to work cross-platform, anywhere, anytime – and especially on Linux.

Please don’t make me change my infrastructure to work with your product

A key reason for choosing the Ticwatch Pro 2020 over other Mobvoi offerings was that the watch’s charger was the same between hardware models. I’d bought a couple of extra chargers to have handy, and didn’t want to have to buy more “spares”. This mirrors a broader issue with hardware – it has a secondary ecosystem. I don’t just need a mobile phone, I need a charger, a case, and glass screen protectors – a bunch of accessories. These are all different – they exhibit variety – a deliberate reduction in re-usability and a buffer against commodification. But in choosing hardware, one of my selection criteria is now re-usability or upgradeability – how can I re-use this hardware’s supporting infrastructure. The recent decision by Europe to standardise on USB-C is the right one.

Takeaway: don’t make me buy a second infrastructure to use your product.

I’m happy to pay for your product, but it has to represent value for money, or it’s gone

Several of my tools are open source – Super Productivity, NocoDB, Atom, Pandoc – and where I can, I GitHub sponsor them or provide a monetary contribution. On the whole, these pieces of software are often worth a lot more to me than the paid proprietary software I use – for example, MaxQDA is over $AUD 300 a year – predominantly because it only has one main competitor, NVIVO. I have no issue paying for software, but it has to represent value for money. If I can get the same value – or nearly equivalent – from an open source product, then I’m choosing open source. Taguette wasn’t yet a match for MaxQDA, but Super Productivity has equivalent functionality to Pomodone. Open source products keep proprietary products competitive – and this is a great reason to invest in open source where you are able.

That’s it! Are there any products or platforms you’ve found particularly helpful? Let me know in the comments.

Solving MaxQDA error 1001: Error while converting the project!

As folx might know, I’m currently undertaking a PhD at ANU’s School of Cybernetics – where I’m researching voice and speech data and datasets that are used to train machine learning models used for things like speech recognition and wake word detection. And if you’ve been following my posts on the State of my Toolchain, and my previous post exploring Taguette, you’ll know that I’ve settled on MaxQDA as my qualitative data analysis software. In general, I’ve found MaxQDA to be great software – the user interface is intuitive and the analytical features it has make qualitative data analysis faster. It’s expensive – and is a yearly subscription – but at the moment, it’s earning its price tag.

One definite bugbear I have though is how MaxQDA interacts with SharePoint. As part of my ethics protocol, I am storing my PhD data on university systems – not on external cloud tools like Dropbox or Nextcloud. Instead, I save the MaxQDA files to my local (Windows – MaxQDA doesn’t have a Linux client, unfortunately) machine. This is then synced with OneDrive to the University’s SharePoint server.

This works well. Except when it doesn’t.

A couple of months ago I had an error that seemed like a once-off; an error where MaxQDA apparently couldn’t convert the project file. This error was presented when I opened the MaxQDA file (a .mx22 file):

MaxQDA Error code: 1001 “Error while converting the project!”

Like so many error messages, it violated design principles for good error messages; it wasn’t a precise description of what had gone wrong, it wasn’t human readable, and it didn’t give me any helpful advice on how to solve the problem. So, I had to figure it out myself.

I tried the obvious things first:

  • I closed MaxQDA and re-launched the software; the error persisted.
  • I restarted my computer and then re-launched the MaxQDA software; the error persisted.
  • I stopped and started the OneDrive service; the error persisted.

At this point, it was clear I’d have to dig deeper into OneDrive. In File Explorer, I could see that the file was still synchronising with OneDrive:

Figure: Windows File Explorer showing the MaxQDA file still synchronising with OneDrive

By rights, stopping and starting OneDrive should have re-synchronised the file; but it hadn’t.

OneDrive was also showing a synchronisation error:

Figure: OneDrive reporting a sync issue for the MaxQDA file, because it thinks the file is in use

Clearly, OneDrive thought that the MaxQDA file was still open; and was not syncing the file to the cloud for this reason. However, closing MaxQDA, OneDrive and a whole reboot had not fixed this error.

My conclusion from this investigation is that MaxQDA somehow leaves an open file handle; for example if the application closes unexpectedly. The open file handle is not cleared via MaxQDA, via OneDrive, or via the underlying Windows operating system. So how else might you clear an open file handle?
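For what it’s worth, the open-handle hypothesis can be probed with a few lines of Python. This is only a rough sketch: it assumes Windows sharing semantics (where opening a file that another process holds without write sharing fails), the function name and the example file name are hypothetical, and a read-only file or missing permissions would trigger the same result.

```python
import os
import tempfile


def looks_locked(path):
    """Heuristic: return True if the file cannot be opened for read/write.

    On Windows, a file held by another process without write sharing raises
    PermissionError here. On Linux and macOS locks are advisory, so this
    usually returns False even while another program has the file open.
    A read-only file also raises PermissionError, so treat True as a hint.
    """
    try:
        # "r+b" asks for read/write access without truncating the file.
        with open(path, "r+b"):
            return False
    except PermissionError:
        return True


if __name__ == "__main__":
    # Demonstrate on a throwaway file; "example.mx22" is a stand-in name.
    with tempfile.TemporaryDirectory() as tmp:
        project = os.path.join(tmp, "example.mx22")
        with open(project, "wb") as handle:
            handle.write(b"placeholder")
        print(looks_locked(project))
```

A `True` result for a project file, with MaxQDA closed, would be consistent with something else still holding the file, but it doesn’t say which process that is.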

Windows isn’t my preferred operating system; and I don’t know enough about the OS internals to go digging into file handles and how to clear them. So I went rm -rf; or about as close to it as you can get on Windows …

The solution

The only thing that did fix the issue was uninstalling OneDrive, re-installing OneDrive, and then re-authenticating OneDrive and allowing it to sync to the cloud. My working hypothesis is that the uninstallation of OneDrive forces Windows to clear any open OneDrive file handles; then the re-installation returns the MaxQDA file to a known good state.

All in all, this took about an hour of investigation to identify the issue and find a workaround. And to be clear – the solution is just a workaround – it doesn’t address the underlying problem – which is that MaxQDA files that synchronise from a local machine to the cloud via OneDrive or SharePoint are prone to synchronisation failure that manifests in an open file handle; which in turn leads to an obscure error message.

This has now happened to me twice – but at least now I know how to fix it next time …

Update: Resetting the OneDrive cache appears to resolve this issue

So, after encountering this issue with MaxQDA for around the fifth time, and even after uninstalling and reinstalling OneDrive, I did a bit more digging, and found some blog posts that suggested resetting the OneDrive cache.

This is covered in this how-to-guide, but the commands are essentially:

  1. Open the Windows command tool as Administrator (you need to be an administrator to clear the OneDrive cache)
  2. Run the command %localappdata%\Microsoft\OneDrive\onedrive.exe /reset
  3. Then restart OneDrive by running the application
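If you’d rather script the reset than type it, the steps above can be sketched in Python. This is a minimal sketch, not an official procedure: the function names are hypothetical, it assumes OneDrive is installed at the per-user path used in the command above, and it only attempts the reset when run on Windows.

```python
import os
import subprocess
import sys
from pathlib import PureWindowsPath


def onedrive_reset_command(env=None):
    """Build the reset command from the LOCALAPPDATA environment variable."""
    env = os.environ if env is None else env
    local_app_data = env.get("LOCALAPPDATA")
    if not local_app_data:
        raise RuntimeError("LOCALAPPDATA is not set; is this a Windows session?")
    # Same location as %localappdata%\Microsoft\OneDrive\onedrive.exe
    exe = PureWindowsPath(local_app_data) / "Microsoft" / "OneDrive" / "onedrive.exe"
    return [str(exe), "/reset"]


def reset_onedrive():
    """Run the reset command; only meaningful on Windows."""
    if sys.platform != "win32":
        raise RuntimeError("The OneDrive reset only applies on Windows")
    subprocess.run(onedrive_reset_command())


if __name__ == "__main__":
    if sys.platform == "win32":
        reset_onedrive()
    else:
        print("Not on Windows; nothing to reset.")
```

Building the command separately from running it means the path can be checked before anything is executed. OneDrive still needs to be restarted afterwards, as in step 3.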

This worked for me – so it’s another possible workaround for this very frustrating issue!

A review of Taguette – an open source alternative for qualitative data coding

Motivation and context

As you might know, I’m currently undertaking a PhD program at Australian National University’s School of Cybernetics, looking at voice dataset documentation practices, and what we might be able to improve about them to reduce statistical and experienced bias in voice technologies like speech recognition and wake words. As part of this journey, I’ve learned an array of new research methods – surveys, interviews, ethics approaches, literature review and so on. I’m now embarking on some early qualitative data analysis.

The default tool in the qualitative data analysis space is NVIVO, made by Melbourne-based company, QSR. However, NVIVO has both a steep learning curve and a hefty price tag. I’m lucky enough that this pricing is abstracted away from me – ANU provides NVIVO for free to HDR students and staff – but reports suggest that the enterprise licensing starts at around $USD 85 per user. NVIVO operates predominantly as a desktop-based piece of software and is only available for Mac or Windows. My preferred operating system is Linux – as that is what my academic writing toolchain, built on LaTeX, Atom and Pandoc, runs on – and I wanted to see if there was a tool with equivalent functionality that aligned with this toolchain.

About Taguette

Taguette is a BSD-3 licensed qualitative coding tool, positioned as an alternative to NVIVO. It’s written by a small team of library specialists and software developers, based in New York. The developers are very clear about their motivation in creating Taguette:

Qualitative methods generate rich, detailed research materials that leave individuals’ perspectives intact as well as provide multiple contexts for understanding the phenomenon under study. Qualitative methods are used in a wide range of fields, such as anthropology, education, nursing, psychology, sociology, and marketing. Qualitative data has a similarly wide range: observations, interviews, documents, audiovisual materials, and more. However – the software options for qualitative researchers are either far too expensive, don’t allow for the seminal method of highlighting and tagging materials, or actually perform quantitative analysis, just on text. It’s not right or fair that qualitative researchers without massive research funds cannot afford the basic software to do their research. So, to bolster a fair and equitable entry into qualitative methods, we’ve made Taguette!

Taguette.org website, “About” page

This motivation spoke to me, and aligned with my own interest in free and open source software.

Running Taguette and identifying its limitations

For reproducibility, I ran Taguette version 1.1.1 on Ubuntu 20.04 LTS with Python 3.8.10.

Taguette can be run in the cloud, and the website provides a demo server so that you can explore the cloud offering. However, I was more interested in the locally-hosted option, which runs on a combination of Python, Calibre, and I believe SQLite as the database backend, with SQLAlchemy for mappings. The install instructions recommend running Taguette in a virtual environment, and this worked well for me – presumably running the binary from the command line spawns a Flask- or Gunicorn-type web application, which you can then access in your browser. This locally hosted feature was super helpful for me, as my ethics protocol has restrictions on what cloud services I could use.
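As a rough illustration of the virtual environment route, here’s a minimal Python sketch using the standard library’s venv module. It assumes the package on PyPI and the launcher it installs are both called taguette; the helper names and the environment location are hypothetical, and the project’s own install instructions remain the authority.

```python
import sys
import venv
from pathlib import Path


def create_env(env_dir):
    """Create a virtual environment that includes pip."""
    venv.create(env_dir, with_pip=True)


def venv_bin_dir(env_dir, platform=sys.platform):
    """Return the directory that holds the environment's executables."""
    return Path(env_dir) / ("Scripts" if platform == "win32" else "bin")


def taguette_commands(env_dir, platform=sys.platform):
    """Commands to install and then launch Taguette inside the environment.

    Assumes the PyPI package and the launcher are both named "taguette".
    """
    bin_dir = venv_bin_dir(env_dir, platform)
    return [
        [str(bin_dir / "python"), "-m", "pip", "install", "taguette"],
        [str(bin_dir / "taguette")],
    ]


if __name__ == "__main__":
    env_dir = Path.home() / "taguette-env"  # hypothetical location
    # Call create_env(env_dir) first, then run each command, e.g. with
    # subprocess.run(command, check=True). Here they are only printed.
    for command in taguette_commands(env_dir):
        print(" ".join(command))
```

Keeping Taguette in its own environment stops its dependencies from colliding with anything else in the Python toolchain.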

To try Taguette, I first created a project, then uploaded a Word document in docx format, and began highlighting. This was smooth and seamless. However, I soon ran into my first limitation. My coding approach is to use nested codes. Taguette has no functionality for nested codes, and no concomitant functionality for “rolling up” nested codes. This was a major blocker for me.
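To make “rolling up” concrete: with nested codes, a highlight tagged with a child code should also count towards every parent code. The sketch below shows that aggregation in Python. The slash-separated code names and the counts are invented for illustration, and nothing here reflects how Taguette, NVIVO or MaxQDA store codes.

```python
from collections import Counter


def roll_up(code_counts, separator="/"):
    """Add each code's count to the code itself and to every ancestor."""
    totals = Counter()
    for code, count in code_counts.items():
        parts = code.split(separator)
        # "bias/accent" contributes to "bias" and to "bias/accent".
        for depth in range(1, len(parts) + 1):
            totals[separator.join(parts[:depth])] += count
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical codes and highlight counts, purely for illustration.
    counts = {
        "bias/accent": 3,
        "bias/gender": 2,
        "documentation/licence": 4,
        "documentation": 1,
    }
    for code, total in sorted(roll_up(counts).items()):
        print(f"{code}: {total}")
```

In this example the parent code bias rolls up to 5 (3 + 2), and documentation rolls up to 5 as well (1 highlight of its own plus 4 from its child).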

However, I was impressed that I could add tags in multiple languages, including non-Latin orthographies, such as Japanese and Arabic. Presumably, although I didn’t check this, Taguette uses Unicode under the hood – so it’s foreseeable that you could use emojis as tags as well, which might be useful for researchers of social media.
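As a small sanity check of the Unicode assumption, the sketch below stores tags in Japanese, Arabic and emoji in an in-memory SQLite database and reads them back. It uses Python’s built-in sqlite3 module with a made-up table; it is not Taguette’s schema, and it says nothing about how Taguette itself handles text.

```python
import sqlite3


def round_trip_tags(tags):
    """Store tags in an in-memory SQLite table and read them back in order."""
    connection = sqlite3.connect(":memory:")
    try:
        # Hypothetical table layout, for illustration only.
        connection.execute(
            "CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )
        connection.executemany(
            "INSERT INTO tags (name) VALUES (?)", [(tag,) for tag in tags]
        )
        rows = connection.execute("SELECT name FROM tags ORDER BY id").fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    sample = ["インタビュー", "مقابلة", "🎤"]
    # True if every tag came back unchanged.
    print(round_trip_tags(sample) == sample)
```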

Taguette has no statistical analysis tools built in, such as word frequency distributions, clustering or other corpus-type methods. While these weren’t as important for me at this stage of my research, they are functions that I envisage using in the future.
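For readers who haven’t met the term, a word frequency distribution is simply a count of how often each word appears in a text. A minimal Python sketch follows. The tokenisation is deliberately naive (it lowercases and splits on non-word characters, so contractions get broken up), and it doesn’t replicate any particular tool’s feature.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each lowercased word appears in the text."""
    # \w+ is a crude tokeniser: "don't" becomes "don" and "t".
    words = re.findall(r"\w+", text.lower())
    return Counter(words)


if __name__ == "__main__":
    # Made-up sentence, purely for illustration.
    transcript = "The data was messy, but the data was rich."
    for word, count in word_frequencies(transcript).most_common(3):
        print(word, count)
```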

Taguette’s CodeBook export and import functions work really well, and I was impressed with the range of formats that could be imported or exported.

What I would like Taguette to do in the future

I really need nested tags that have aggregation functionality for Taguette to be a viable software tool for my qualitative data analysis – this is a high priority feature, followed by statistical analysis tools.

Some thoughts on the broader academic software ecosystem

Even though I won’t be adopting Taguette, I admire and respect the vision it has – to free qualitative researchers from being anchored to expensive, limiting tools. While I’m fortunate enough to be afforded an NVIVO license, many smaller, less wealthy or less research-intensive universities will struggle to provide a license seat for all qualitative researchers.

This is another manifestation of universities becoming increasingly beholden to large software manufacturers, rather than having in-house capabilities to produce and manage software that directly adds value to a university’s core capability of generating new knowledge. We’ve seen it in academic journals – with companies like EBSCO, Sage and Elsevier intermediating the publication of journals, hoarding copyrights to articles and collecting a tidy profit in the process – and we’re increasingly seeing it in academic software. Learning Management Systems such as Desire2Learn and Blackboard are now prohibitively expensive, while open source alternatives such as Moodle still require skilled (and therefore expensive) staff to be maintained and integrated – a challenge when universities are shedding staff in the post-COVID era.

Moreover, tools like NVIVO are imbricated in other structures which reinforce their dominance. University HDR training courses and resource guides are devoted to software tools which are in common use. Additionally, supervisors and senior academics are likely to use the dominant software, and so are in an influential position to recommend its use to their students. This support infrastructure reinforces their dominance by ascribing them a special, or reified status within the institution. At a broader level, even though open source has become a dominant business model, the advocacy behind free and open source software (FOSS) appears to be waning; open source is now the mainstream, and it no longer requires a rebel army of misfits, nerds and outliers (myself included) to be its flag-bearers. This raises the question – who advocates for FOSS within the academy? And more importantly – what influence do they have compared with a slick marketing and sales effort from a global multi-national? I’m reminded here of Eben Moglen’s wise words at linux.conf.au 2015 in Auckland in the context of opposing patent trolls through collective efforts – “freedom itself depends upon how we make use of the technologies we are creating”. That is, universities themselves have created the dependence on academic technologies which now restrict them.

There is hope, however. Platforms like arXiv – the free distribution service and open access archive for nearly two million pre-prints in mathematics, computer science and other (primarily quant) fields – are starting to challenge the status quo. For example, the Australian Research Council recently overturned their prohibition on the citation of pre-prints in competitive grant applications.

Imagine if universities combined their resources – like they have done with arXiv – to provide an open source qualitative coding tool, locally hosted, and accessible to everyone. In the words of Freire,

“Reading is not walking on the words; it’s grasping the soul of them.”

Paulo Freire, Pedagogy of the Oppressed

Qualitative analysis tools allow us to grasp the soul of the artefacts we create through research; and that ability should be afforded to everyone – not just those who can afford it.