Digitizing and indexing handwritten notes and sketches with linknotes

2012-05-26 by mira

I have made it a habit to jot down ideas in a paper notebook. They range from conceptual sketches to concrete todo lists and relate either to projects I'm currently working on or to candidates for future projects. In short, I write down everything that seems worth keeping, even if there's no immediate use for it. Paper notebooks have one disadvantage: they support only sequential access, so one has to flip through the pages to find a particular note. The more notes one has, the more this disadvantage shows.

To make specific notes easier to find, I implemented the following system:

  1. scan new pages from the notebook periodically to get a digital representation (bitmap images)
  2. tag the images with keywords which classify their content (topic, project, ...)
  3. create a browsable index based on the tags

The two driving goals for the design process were:

  1. create minimal work overhead for the system user (me)
  2. write only the code that is absolutely necessary (favor a straightforward and pragmatic implementation)

This led to the following key design decisions:

  1. tags are contained in the filename: adding/editing/deleting tags means renaming a file
  2. tag parsing is simplified by a naming convention: a fixed index in the filename marks the beginning of the tag section
  3. the browsable index is simply one directory per tag, containing symlinks to all related images
  4. using the index means launching an image viewer inside the directory for a given tag

Let's have a look at an example:

[Figure: Linknotes sample]

I don't bother to split an image in two if a notebook page contains two unrelated sections. Tags apply to an image as a whole, not to a section; sections exist only conceptually.

mira@apu:~/temp/linknotes_test$ tree
.
|-- index
|-- linknotes.jar
`-- notes
    |-- 2011-07-13_nb_001_001_td4j_mdsd_sitegen.jpg
    `-- 2011-07-13_nb_001_002_mdsd_td4j_semweb.jpg

2 directories, 3 files

Filename tokens: in this example there are four distinct tag values: td4j, mdsd, sitegen and semweb.
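
To illustrate the naming convention, here is a minimal sketch of how the tags could be extracted from a filename. It is not taken from the linknotes source: the class TagParser, the method parseTags and the constant FIRST_TAG_INDEX are hypothetical names, and the sketch assumes that the "fixed index" is a token index, i.e. that the first four underscore-separated tokens form a fixed prefix and everything after them is a tag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the linknotes source.
class TagParser {

    // Assumption: the first four '_'-separated tokens (date, "nb" and two
    // numbers) are a fixed prefix, so the tag section starts at token index 4.
    static final int FIRST_TAG_INDEX = 4;

    static List<String> parseTags(String fileName) {
        // Drop the file extension (e.g. ".jpg") if there is one.
        int dot = fileName.lastIndexOf('.');
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;

        String[] tokens = base.split("_");
        if (tokens.length <= FIRST_TAG_INDEX) {
            return new ArrayList<>(); // the filename has no tag section
        }
        // Everything from the fixed index onwards is a tag.
        return new ArrayList<>(
                Arrays.asList(tokens).subList(FIRST_TAG_INDEX, tokens.length));
    }

    public static void main(String[] args) {
        // prints [td4j, mdsd, sitegen]
        System.out.println(parseTags("2011-07-13_nb_001_001_td4j_mdsd_sitegen.jpg"));
    }
}
```

Because the tag section has a fixed start, no delimiter or escaping scheme is needed; the price is that a tag itself cannot contain an underscore.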

If we run the indexer on the examples, we get the following result:

mira@apu:~/temp/linknotes_test$ java -jar linknotes.jar notes/ index/
notes: /home/mira/temp/linknotes_test/notes
index: /home/mira/temp/linknotes_test/index

mira@apu:~/temp/linknotes_test$ tree
.
|-- index
|   |-- mdsd
|   |   |-- 2011-07-13_nb_001_001_td4j_mdsd_sitegen.jpg -> /home/mira/temp/linknotes_test/notes/2011-07-13_nb_001_001_td4j_mdsd_sitegen.jpg
|   |   `-- 2011-07-13_nb_001_002_mdsd_td4j_semweb.jpg -> /home/mira/temp/linknotes_test/notes/2011-07-13_nb_001_002_mdsd_td4j_semweb.jpg
|   |-- semweb
|   |   `-- 2011-07-13_nb_001_002_mdsd_td4j_semweb.jpg -> /home/mira/temp/linknotes_test/notes/2011-07-13_nb_001_002_mdsd_td4j_semweb.jpg
|   |-- sitegen
|   |   `-- 2011-07-13_nb_001_001_td4j_mdsd_sitegen.jpg -> /home/mira/temp/linknotes_test/notes/2011-07-13_nb_001_001_td4j_mdsd_sitegen.jpg
|   `-- td4j
|       |-- 2011-07-13_nb_001_001_td4j_mdsd_sitegen.jpg -> /home/mira/temp/linknotes_test/notes/2011-07-13_nb_001_001_td4j_mdsd_sitegen.jpg
|       `-- 2011-07-13_nb_001_002_mdsd_td4j_semweb.jpg -> /home/mira/temp/linknotes_test/notes/2011-07-13_nb_001_002_mdsd_td4j_semweb.jpg
|-- linknotes.jar
`-- notes
    |-- 2011-07-13_nb_001_001_td4j_mdsd_sitegen.jpg
    `-- 2011-07-13_nb_001_002_mdsd_td4j_semweb.jpg

6 directories, 9 files
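
A tree like the one above could be produced by a routine along the following lines. This is only a sketch under assumptions, not the actual linknotes implementation: the class IndexBuilder and its method buildIndex are hypothetical names, tags are again assumed to start at underscore-separated token 4, and the standard java.nio.file API (Java 7 and later) is used to create the symlinks.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch for illustration; not the actual linknotes code.
class IndexBuilder {

    // Assumption: tags start at '_'-separated token index 4.
    static final int FIRST_TAG_INDEX = 4;

    static void buildIndex(Path notesDir, Path indexDir) throws IOException {
        // Absolute paths, so the symlink targets stay valid from anywhere.
        Path notes = notesDir.toAbsolutePath().normalize();
        Path index = indexDir.toAbsolutePath().normalize();

        try (DirectoryStream<Path> images = Files.newDirectoryStream(notes)) {
            for (Path image : images) {
                if (!Files.isRegularFile(image)) {
                    continue; // skip subdirectories and the like
                }
                String name = image.getFileName().toString();
                int dot = name.lastIndexOf('.');
                String base = dot > 0 ? name.substring(0, dot) : name;
                String[] tokens = base.split("_");

                // One directory per tag, holding one symlink per tagged image.
                for (int i = FIRST_TAG_INDEX; i < tokens.length; i++) {
                    Path tagDir = Files.createDirectories(index.resolve(tokens[i]));
                    Path link = tagDir.resolve(name);
                    Files.deleteIfExists(link); // makes re-runs safe
                    Files.createSymbolicLink(link, image);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        buildIndex(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Note that this sketch never removes links for tags that have since been renamed or deleted; emptying the index directory before each run would be one simple way to handle that.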

The source code is hosted on Bitbucket.
