
For the past couple weeks, I’ve been working with Thomas “cmdln” Gideon (host of the fabulously nerdy Command Line podcast) on a free software project for writers called “Flashbake.” This is a set of Python scripts that check your hot files for changes every 15 minutes, and check in any changed files to a local git repository. Git is a free “source control” program used by programmers to track changes to source-code, but it works equally well on any text file. If you write in a text-editor like I do, then Flashbake can keep track of your changes for you as you go.

I was prompted to do this after discussions with several digital archivists who pointed out that, prior to the computerized era, writers produced a series of complete drafts on the way to publication, complete with erasures, annotations, and so on. These are archival gold, since they illuminate the creative process in a way that often reveals the hidden stories behind the books we care about. By contrast, many writers today produce only a single (or a few) digital files that are modified right up to publication time, without any real systematic records of the interim states between the first bit of composition and the final draft.

Enter Flashbake. Every 15 minutes, Flashbake looks at any files that you ask it to check (I have it looking at all my fiction-in-progress, my todo list, my file of useful bits of information, and the completed electronic versions of my recent books), and records any changes made since the last check, annotating them with the current timezone on the system-clock, the weather in that timezone as fetched from Google, and the last three headlines with your by-line under them in your blog’s RSS feed (I’ve been characterizing this as “Where am I, what’s it like there, and what am I thinking about?”). It also records your computer’s uptime. For a future version, I think it’d be fun to have the most recent three songs played by your music player.
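Here is a rough sketch of that annotate-and-commit step. It is not Flashbake's actual code: the function names `build_commit_message` and `commit_hot_files` are made up for illustration, and the weather and headlines are simply passed in rather than fetched from anywhere. The only git commands it relies on are `git add`, `git diff --cached --quiet` and `git commit -m`.

```python
import subprocess
from pathlib import Path


def build_commit_message(timezone, weather, headlines, uptime):
    """Format the context annotations as a plain-text commit message."""
    lines = [
        f"Current time zone is {timezone}",
        f"Current weather is {weather}",
        f"Machine uptime is {uptime}",
    ]
    # Keep only the three most recent headlines, as described above.
    lines += [f"Recent post: {title}" for title in headlines[:3]]
    return "\n".join(lines)


def commit_hot_files(repo_dir, hot_files, message):
    """Stage the watched files and commit them if anything changed.

    Returns True when a commit was made, False when nothing had changed.
    """
    repo = Path(repo_dir)
    subprocess.run(["git", "add", "--", *hot_files], cwd=repo, check=True)
    # `git diff --cached --quiet` exits non-zero when staged changes exist.
    staged = subprocess.run(["git", "diff", "--cached", "--quiet"], cwd=repo)
    if staged.returncode == 0:
        return False
    subprocess.run(["git", "commit", "-m", message], cwd=repo, check=True)
    return True
```

Run from a cron job every 15 minutes, a call like `commit_hot_files("/home/me/writing", ["novel.txt", "todo.txt"], message)` (hypothetical paths) would leave one annotated commit per interval in which something changed. The standard library's `time.tzname` is one possible source for the time zone string.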

The effect of this is to thoroughly — exhaustively — annotate the entire creative process, almost down to the keystroke level. Want to know what day you wrote a particular passage? Flashbake can tell you. Want to know what passage you wrote on a given day? That too. Plus, keeping track of my todo.txt file means that I get a searchable database of all the todo items I’ve ever used, with timestamps for their appearance and erasure.

Additionally, since git repositories are made to replicate, you can publish some or all of your projects to the public web or to a private site. I’m hoping that my publisher will use a public git repo to check out the most recent versions of my in-print books every time they go back to press for a new edition, and use the built-in compare (“diff”) function to find all the typos I’ve fixed since the last edition.

It’s all pretty nerdy, I admit. But if you’re running some kind of Unix variant (I use Ubuntu Intrepid Ibex, but this’d probably do fine on a Mac with OS X, too) and you want to give it a whirl, Thomas has made all the scripts available as free software. He’s working on a new version now with plugin support, which is exciting!

Thomas describes how the project came together:

Cory wanted the version to carry prompts, snapshots of where he was at the time an automated commit occurred and what he was thinking. I quickly sketched out a Python script to pull the contextual information he wanted and started hacking together a shell script to drive git, using the Python script’s output for the commit comment when a cron job invoked the shell wrapper.

I added my own idea to the project, borrowing from continuous integration build systems the idea of a quiet period. I could easily imagine Cory actively working on a story, saving continually and a commit happening mechanically in the midst of that writing being less useful than if the script could find a quiet time to commit. This enhancement prompted me to ditch my shell script wrapper and pull that logic all into Python.
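One way to picture the quiet period is as a filter on file modification times: a file is only eligible for a commit once it has gone untouched for some minimum interval. The sketch below is an assumption about how such a check could look, not the logic Flashbake actually ships; the name `files_past_quiet_period` and the `quiet_seconds` parameter are invented for the example.

```python
import os
import time


def files_past_quiet_period(paths, quiet_seconds, now=None):
    """Return the paths whose last modification is at least quiet_seconds old."""
    now = time.time() if now is None else now
    settled = []
    for path in paths:
        try:
            mtime = os.path.getmtime(path)
        except OSError:
            # Missing or unreadable file: skip it this round.
            continue
        if now - mtime >= quiet_seconds:
            settled.append(path)
    return settled
```

With `quiet_seconds=300`, a story saved a minute ago would be held back until a later run, while a todo list untouched for an hour would be committed right away.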

Flashbake

(Thanks, Thomas!)

14 Responses to “Flashbake: Free version-control for writers using git”

  1. shaydchara

    Very cool idea! I’d switched to Scrivener on my Mac after confusing my non-versioned text files on a PC too many times. This sounds like a great alternative.

  2. John Shimek

    Very cool idea that I will play with. But one thing, why isn’t Flashbake hosted on something like github? It seems like a good fit, an open source program that interacts with git.

  3. Rich Vazquez

    I started to use git manually. One of the problems I had is I wanted to use the diff options to see text changes. But I have to use plain text or something similar. Do you just ignore this feature or do you use something where you can see the plain text differences when you write?

  4. Max Battcher

    Rich: Cory mentioned above that he uses a plain-text based format most of the time.

    There are some useful plain-text based formats for writing. I often use reStructuredText (rst2a.com has a stylish intro for the non-coder) for that purpose.

    I know of a trick that should work with newer, richer formats, that you could probably even integrate into flashbake. The .docx format of Word (standard in Word 07 and available as a plugin to earlier versions) and the .odt format of OpenOffice both are zipped wrappers around a collection of files including an XML of the actual text. (That is, they are just about exactly like any other .zip file.) What this means is that a tool like flashbake should be able to unzip the .docx/.odt into a directory structure to record more meaningful commits that show some almost-plain-text differences. If you want to get a point-in-time copy of the file you just need to zip the contents back up into the original filename…

  5. Kasper Souren

    Funny, I started writing a book 2 days ago (using pyroom) and I instinctively set up a git repo and a cron job that pushes my changes to my server. Synchronicity.

  6. Utopiah

    Very cool but… what’s the advantage over a wiki? A wiki stores the version history of each file, it’s easy to set up and can embed visual and editing tools right within.

  7. Cory Doctorow

    Well, a Wiki isn’t as nice for entering text, for starters. The way this works now is, I just maintain my existing text-files and git and the scripts manage the changelog for them. I don’t want visual tools, I want the EMACS keybindings I’ve used for 15 years and the full-suite of in-editor plugins and macros that are part of my writing practice already.

  8. LDJessee

    I struggle with many things, and version tracking my writing is one of them. My first request for a plugin is one that takes a shot with a webcam (Mac, so iSight builtin). Just more archiving. :)

  9. Tim

    Just listened to your TwiT podcast appearance, and thought you missed a point in the DTV that flashbake is the exponent of.
    When VCRs become obsolete, and the providers have de-facto control over the media player, who’s going to have the digital copies of the crap that I don’t track?
    The BBC is transmitting on BBC7 and other channels stuff that it got back from the void with its ‘amnesty’ and re-mastered for re-broadcast. Lost Goon shows, and other old radio are now back in the archive. If you love a show, and you cannot keep a permanent copy because you don’t own your media player….

    Just a quick FYC (for your consideration)

  10. Francis Fish

    I’m trying to work out what format you use for your text files – and this is the nearest-ish article google found.

    I use emacs for coding and want to use it for everything else, but don’t want to type HTML or something equally nasty. HTML is for computers to read – I am not one. I will look at rst, but my blog will be using redcloth (probably). What would you recommend, Cory? What do you use?

    Thanks

  11. tycho garen

    I’ve been using git (manually, without cool add-ons like this) for nearly a year (and subversion for a year or so before that) to track this kind of thing–though not as automatically–and it’s been a great experience, and it works really great. Basically, at this point my entire text-based digital existence gets stored in git repositories. Config files for emacs (and other programs), wikis using the ikiwiki engine, todolists (via org-mode, but same deal), and so forth. I even replicate my MailDir inside of a git repo (which works oh so much better than IMAP).

    Despite how cool version control is (and it is), the distributed nature of git is what seals the deal for me. I have clones of my repositories on my desktop and my laptop, as well as on a server. So I can bounce changesets between computers and servers as I need to. Running multiple machines is no longer a headache. Being out in the wild without my machine is fine as long as I have a thumb drive with putty on it.

    re #10 Francis Fish: I use Markdown. Between the HTML converters, the plugins for most blogging applications, and the ability to translate it to LaTeX>PDF with Maruku it is made out of win.

  12. J3r1m1ah

    i read ur book little brother, & loved it . i was totally inspired by everything. im trying to get in2 this scene but i need tech answers and help. what do i read? who do i talk 2? can i ask u?

  13. Max Battcher

    Was thinking about flashbake again today and read through the comments. Thought it might be interesting to belatedly point out that I actually built the tool I hinted at above. Called musdex (http://pythonhosted.org/musdex/), which is an intentionally obscure old Zork universe magic word, it’s a tool you can set up in pre/post hooks to auto-unzip and format things like Word’s docx or OO’s odt files (and Celtx files and more) into a source-controlled directory. Some people may find it a nice side-by-side tool.

    Just noticed I’ve still not written Git integration documentation, but that is fairly straightforward to write.

  14. Max Battcher

    Documented using musdex with git, if anyone is interested (documentation link in #13). I may have missed something, so feedback would be appreciated.
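The unzip trick from the comments can be tried with nothing but Python's standard `zipfile` module. This is a bare-bones sketch of the idea, not musdex and not part of Flashbake; `explode_archive` and `rebuild_archive` are names invented for the example.

```python
import zipfile
from pathlib import Path


def explode_archive(archive_path, target_dir):
    """Unpack a zipped document into target_dir; return its member names, sorted."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())


def rebuild_archive(source_dir, archive_path):
    """Zip an exploded directory back up into a single document file."""
    source = Path(source_dir)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store each file under its path relative to the directory root.
                archive.write(path, path.relative_to(source).as_posix())
```

Committing the exploded directory instead of the binary file is what makes the diffs readable. One caveat: OpenDocument files expect their `mimetype` entry to come first and to be stored uncompressed, which this sketch does not attempt, so a rebuilt .odt may need extra care before an office suite will open it.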
