How I Built a Personal Knowledge Base for Myself

Sung Cho Sep 9, 2018 Stories

During the past year and a half, I have been building and using my own personal knowledge base. This knowledge base contains virtually everything I have learned during this period. It supports categorization and search and automates spaced repetition via email. In this post, I would like to share how I built it and highlight the ways that it has been particularly helpful so far.

It all started with my programming notes in Evernote. Years ago, I kept a notebook in Evernote for coding knowledge that I thought would be valuable to my future self. The notebook contained things such as gotchas and useful snippets, as well as computer science theory. I spent a lot of time elaborating on the notes and applying appropriate tags in the hope that I could make that knowledge truly mine. Yet this system was not particularly effective.

Organizing knowledge should never become the goal

A super clean organization system does not actually help us retain knowledge. Rather, such a system requires so much maintenance effort that it subverts the very thing it promises to do: help us learn. I ended up expending an exorbitant amount of time managing tags, keywords, and a hierarchy of data that I might never even retrieve. All in all, organizing knowledge should never become the goal when building a personal knowledge base.

Instead, the goal should be to retain knowledge with minimal effort. With this in mind, I abandoned my years' worth of technical notes in Evernote and decided instead to just jot down everything I learned as quickly as possible. I wrote a simple command-line program called Dnote that just kept appending notes to a YAML file. There was no tagging. In the spirit of keeping overhead minimal, every note had to be a single line. I used to run it like this:

$ dnote add javascript "You cannot make a synchronous result out of an asynchronous operation"

And the note would be appended to a YAML file in the home directory:

javascript:
- .bind() creates a new function
- You cannot make a synchronous result out of an asynchronous operation

linux:
- Rename stuff using `rename -n "s/ /_/g" *`
- SIGHUP stands for 'signal hangup' and is sent to a process when its controlling terminal is closed
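
As a rough illustration of the append step, here is a minimal Python sketch. It is not Dnote's actual implementation: the function names (parse_books, render_books, add_note) are hypothetical, and the parser only understands the simple "book:" and "- note" layout shown above rather than full YAML.

```python
def parse_books(text):
    """Parse the simple 'book:' / '- note' layout into {book: [notes]}.

    This is deliberately not a YAML parser; it only handles the flat
    layout used in the example file.
    """
    books = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines separate books
        if stripped.startswith("- "):
            if current is None:
                raise ValueError("note found before any book header")
            books[current].append(stripped[2:])
        elif stripped.endswith(":"):
            current = stripped[:-1]
            books.setdefault(current, [])
        else:
            raise ValueError(f"unrecognized line: {line!r}")
    return books


def render_books(books):
    """Render {book: [notes]} back into the same flat layout."""
    chunks = []
    for name, notes in books.items():
        lines = [f"{name}:"] + [f"- {note}" for note in notes]
        chunks.append("\n".join(lines))
    return "\n\n".join(chunks) + "\n"


def add_note(text, book, content):
    """Return the file contents with one single-line note appended to a book."""
    if "\n" in content:
        raise ValueError("notes must be a single line")
    books = parse_books(text)
    books.setdefault(book, []).append(content)
    return render_books(books)
```

Calling add_note(existing_text, "javascript", "some note") returns the new file contents with the note added at the end of the javascript book, creating the book if it does not exist yet. Writing the result back to disk is left out of the sketch.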

This minimalist approach helped me see that knowledge was all that mattered, and that developing a sophisticated system should never have become a goal in itself.

Searching is far more efficient than organizing

Even though the new system did not have an elaborate organizational scheme, all my knowledge was now easily reachable. I could just grep for a keyword because everything was in a single text file! Later I started syncing my notes to a remote Postgres database and made a web interface that could do full-text search. Even without complicated tags or taxonomies, all my knowledge could be easily retrieved. I have learned that, when it comes to building a knowledge base, search is far more important and effective than a super neat organization system.
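
To make the idea concrete, here is a small Python sketch of a keyword search over a single flat notes file. It is only an illustration under assumptions: search_notes is a hypothetical name, the file is assumed to use the "book:" and "- note" layout shown earlier, and matching is a plain case-insensitive substring test rather than real full-text search.

```python
def search_notes(text, keyword):
    """Return (book, note) pairs whose note contains keyword, ignoring case."""
    needle = keyword.lower()
    results = []
    book = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            note = stripped[2:]
            if needle in note.lower():
                results.append((book, note))
        elif stripped.endswith(":"):
            # A line such as "linux:" starts a new book.
            book = stripped[:-1]
    return results
```

Unlike a bare grep, this keeps track of which book each matching note belongs to, which is the only context the flat file carries.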

At this point, this newly established knowledge base had become a buffer for my web search. And that buffer was surprisingly effective. As programmers, we often search for a solution on the web by scavenging websites such as Stack Overflow. It takes time not only to find the right information but also to digest it. But when we already have information that we ourselves have written down readily available at our disposal, the time to get to the solution is dramatically reduced.

For instance, despite having used Linux for a decade, I still do not know the correct arcane combination of tar flags for decompressing an archive. Usually, I would search for “how to unzip tar,” land on a Stack Overflow answer, and read it to get the solution. But now I simply search for “tar” in Dnote or run cat ~/.dnote/dnote | grep tar.gz, and I am presented with information that I composed myself.

Reading my knowledge base gets me to the solution much faster than searching the web because I composed the knowledge base myself. On the web, we bookmark or ‘star’ a Stack Overflow question at best, and the information still remains at large in the vast sea of the Internet, in someone else’s words. In contrast, a knowledge base contains only the things we know we have already learned, in our own voice. All we need to do is jog our memory rather than re-learn the material every time.

A knowledge base needs to be local first and open source

When we are building a personal knowledge base, we are building something that should last a lifetime because learning is for life. Therefore we should not have to surrender our data to a software-as-a-service company or to a closed source program. If the service terminates or a program is no longer maintained, we jeopardize not only our accumulated knowledge but also the precious habit of keeping a knowledge base we will have acquired the hard way.

I find that it is imperative for any kind of knowledge base to be local-first and open source. That way, we are not putting our knowledge at the mercy of someone else. Instead, we control and own our data. It has been said that knowledge is the most important asset. Why should we have to give up our control over it to proprietary software? Therefore Dnote is built to support local use first and is free software under the GPL. This setup gives power back to learners such as myself. We can access our knowledge whenever and however we want, and we control the software rather than the other way around.

In addition, I find that a knowledge base should store our knowledge in a format that is as simple and as easily accessible as possible. That way, we can easily move to another tool if the need arises for any reason. As we have seen, I chose to store my knowledge in YAML at first. I have since migrated to JSON in order to add metadata such as timestamps. The file now looks like this:

{
  "algorithm": {
    "name": "algorithm",
    "notes": [
      {
        "uuid": "9a7634ff-8dc7-4afd-9d61-7fc6aff08452",
        "content": "in-place means no extra space required. it mutates the input",
        "added_on": 1506148290,
        "edited_on": 0,
        "public": false
      },
      {
        "uuid": "6155b21a-894b-47e4-babc-4c72aeb462c3",
        "content": "stable sort means elements of equal keys appear in the same order in the output as input",
        "added_on": 1515966992,
        "edited_on": 0,
        "public": false
      },
      ...
    ]
  },
  ...
}

This way, we can easily transfer our knowledge to other tools or do all sorts of creative things with it. We can write a simple script to change the format to suit our needs. Or we could write a script to automatically do spaced repetition on our notes to help us retain the knowledge. That is actually one of the first things I did after making Dnote: it has been automatically sending me an email digest for more than a year now. The possibilities are endless when we are in control of our knowledge.
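
As an example of such a format-changing script, the following Python sketch flattens the JSON layout shown above into dated plain-text lines. It assumes the file has exactly the structure in the sample (books keyed by name, each with a "notes" list carrying "content" and a Unix "added_on" timestamp); the function names and the output format are hypothetical choices made for illustration.

```python
import json
from datetime import datetime, timezone


def load_notes(path):
    """Read the JSON notes file from a path supplied by the caller."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def flatten_notes(data):
    """Turn {book: {"notes": [...]}} into (added_on, book, content) rows.

    Rows are sorted by timestamp, so the oldest note comes first.
    """
    rows = []
    for book_name, book in data.items():
        for note in book.get("notes", []):
            rows.append((note["added_on"], book_name, note["content"]))
    rows.sort()
    return rows


def to_text(rows):
    """Render rows as 'YYYY-MM-DD  [book]  content' lines (dates in UTC)."""
    lines = []
    for added_on, book_name, content in rows:
        day = datetime.fromtimestamp(added_on, tz=timezone.utc).date().isoformat()
        lines.append(f"{day}  [{book_name}]  {content}")
    return "\n".join(lines)
```

Running to_text(flatten_notes(load_notes(path))) on the notes file gives a chronological, greppable listing. Swapping to_text for a different renderer is all it takes to target another tool.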

The bottom line is that a knowledge base should not force us to put our knowledge at the mercy of a proprietary technology. Instead, we should have full control of our knowledge because learning is for life.

We almost never look back at our notes

Spaced repetition is no longer supported in an effort to simplify Dnote. Find out more in this product update.

Another reason not to spend too much time organizing our notes is that we do not actually go back to them all that often. Simply put, there are not enough incentives for us to regularly sit down and read what we have put in the knowledge base thus far. My technical notes in Evernote rarely saw the light of day once they had been saved. We can pour knowledge into a pot, but we need something to stir the pot to prevent knowledge rot.

That is where the automatic email digest came in. I wrote a simple Go program to send me an email digest every Friday to remind me of what I had learned that week, the week before, and in the further past. I think this method has been effective because it does not require any extra effort. We just need to go on with our lives and occasionally check our inboxes as we normally would, and the spaced repetition happens automatically. This feature has also been available to all Dnote users.
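
The selection step of such a digest can be sketched in a few lines. The original program was written in Go and its exact rules are not described here, so this Python version is an assumption-laden illustration: build_digest, the seven-day buckets, and the random sample of older notes are all hypothetical choices, and sending the email is left out.

```python
import random

WEEK = 7 * 24 * 60 * 60  # seconds in one week


def build_digest(notes, now, older_sample=3, rng=None):
    """Group notes by age relative to `now` (a Unix timestamp).

    Each note is a dict with an "added_on" Unix timestamp. Notes from the
    last 7 days and from the 7 days before that are all included; from
    anything older, a random sample of at most `older_sample` is picked.
    """
    if rng is None:
        rng = random.Random()
    this_week, last_week, older = [], [], []
    for note in notes:
        age = now - note["added_on"]
        if age < 0:
            continue  # ignore notes dated in the future
        if age < WEEK:
            this_week.append(note)
        elif age < 2 * WEEK:
            last_week.append(note)
        else:
            older.append(note)
    picked = rng.sample(older, min(older_sample, len(older)))
    return {"this_week": this_week, "last_week": last_week, "older": picked}
```

A weekly scheduled job could call build_digest with the current time and format the three groups into an email body. Sampling the older notes at random is a crude stand-in for a real spaced-repetition schedule, which would instead track when each note was last reviewed.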

Again, keeping a neat knowledge base should never be the end goal in itself. It is all the other creative things you can do with the knowledge base that make it worthwhile to keep one. I am proud to say that, over the years, the automated email digest has helped me learn things that I would otherwise have forgotten. These creative uses are further reasons for a knowledge base to be open source and to store knowledge in the most accessible format possible.

Note sharing is no longer supported in an effort to simplify Dnote. Find out more in this product update.

A neat benefit of having a personal knowledge base is being able to package bite-sized knowledge into easily consumable units that can help us in everyday situations. It can save time to have a well-indexed personal repository of knowledge that is easily sharable with others. In a technical discussion, a debugging session, or a group learning situation, being able to simply share a piece of information we carefully composed in the past reduces the communication overhead.

For instance, while pair-programming, I am able to simply quote something from my personal knowledge base to concretely explain an abstract concept. Or even better, I can just share a link to my knowledge using the web client. Without the knowledge base, I would have shared a long blog article or a Stack Overflow answer to reasonably convey technical ideas. But since ideas were already packaged and indexed by none other than myself, I could just share it and others could immediately understand the gist of ideas no matter how complex the ideas were.

It is as though every piece of my knowledge now has a unique address on the Internet that I can easily share with others. And I can choose to break the pieces down or compose them however I want, and make them private or public.

All in all, it has been like having a second brain that I can share with people to clearly convey complex ideas in minimal time. I can imagine this collaborative aspect of a personal knowledge base being useful in many other situations in which ideas have to be expressed.

Conclusion

Building and using a personal knowledge base has helped me acquire knowledge in subject areas beyond just programming. Nowadays, I find myself using this system to learn things such as vocabulary, foreign languages, finance, and philosophy.

Saving ideas to this system feels like committing them to source control. Doing so is oddly satisfying because I know that the knowledge is securely stored in an open source system that I fully control, and that each piece of knowledge will remain forever accessible at my fingertips and easily sharable with someone else. At the same time, it is almost therapeutic in that I do not have to consciously bear the burden of an idea. I think I have built a system onto which I can unload some of my cognitive overhead.

Learning is for life. And I feel that our knowledge deserves a safe home for life. Through building and using a personal knowledge base, I have been able to safely retain ideas that would normally have escaped my limited memory. I would suggest that anyone aspiring to grow give their learning a proper home. We all need a personal repository of knowledge because otherwise, our knowledge might not stand the test of time.