Books can sometimes change your perspective on things and make you reconsider the way you have done things before. I’ve recently read one of those books, one that made me re-evaluate the openness of projects.

Living in Data

In Living in Data, Jer Thorp suggests some questions that help you score the openness of a dataset. They are:

  1. Does the project have comprehensible documentation, examples, and tutorials?
  2. Are there materials that offer context around the data so someone unfamiliar with the project can understand why it might be important?
  3. Can a nonprogrammer access the data?
  4. Is there documentation available in more than one language?
  5. Is your documentation and the site it is hosted on compatible with screen readers?

Although my project isn’t just data, it doesn’t do too well when I score it with these questions. Let’s review them one by one.

1: Does the project have comprehensible documentation, examples, and tutorials?

Well, sort of. I do have a GitHub repository with a decent readme (I think). And I have some blogposts about my project. If you have a GitHub account and access to the internet, I think the information available is okay. I’ll be nice and give the project 1 point here.

2: Are there materials that offer context around the data so someone unfamiliar with the project can understand why it might be important?

Maybe. In my blogposts, I try to be a bit more generic in my terminology. But I don’t really try to explain why my project might be important, not even to people familiar with the project. So that’s my first 0.

3: Can a nonprogrammer access the data?

No, they can’t. My project is built on top of a Google API. Users need to have access to Google, need a Google account, and need a credit card to be able to use the API. Only then can they generate any data and continue with the project. Another 0 points here.

4: Is there documentation available in more than one language?

Nope. Only in English. I might add Dutch, but I think that won’t increase the size of my audience by much. So that’s another 0.

5: Is your documentation and the site it is hosted on compatible with screen readers?

Maybe it is, but I haven’t tested it. And that gets me another zero points.


So my score comes down to 1+0+0+0+0. That’s 1 out of 5. Not good, now is it?
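
If you want to keep track of a score like this while improving a project, a few lines of Python are enough. This is only a sketch: the short labels and the openness_score function are made up for illustration and don’t come from the book. The points are the ones given above.

```python
# Short labels for the five questions, paired with the points given above.
# The labels are illustrative shorthand, not wording from the book.
scores = {
    "documentation, examples, and tutorials": 1,
    "context for people unfamiliar with the project": 0,
    "access for nonprogrammers": 0,
    "documentation in more than one language": 0,
    "screen reader compatibility": 0,
}


def openness_score(scores):
    """Return (points, maximum) for a dict with one 0/1 score per question."""
    return sum(scores.values()), len(scores)


points, maximum = openness_score(scores)
print(f"{points} out of {maximum}")
```

Change a 0 to a 1 whenever one of the questions can honestly be answered with yes, and run it again.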

Even though the questions are aimed at open data repositories, I think they do a decent job of reviewing the openness of code projects. They got me thinking about ways to improve the openness of my data project. And that is a good thing.

Here are a few things I want to implement in the near future.

How I plan to improve the openness of my project

My main idea is to create a dedicated project page that points people to the right information. There should at least be a technical resource (GitHub) and a page written for nonprogrammers.

The page should also include various assets of the project. For a full DIY experience, people can clone the repository. But the code of the project consists of two parts: text analysis and data visualisations. Why not support users that want to use either one?

The text analysis is done through a Google Cloud API. There are various reasons why this might be a problem for users:

  • A firewall blocks their access to Google.
  • They don’t feel like creating a Google account just to give my project a try.
  • They have a Google account, but no credit card, so they can’t access the APIs.

To support users that don’t have access to Google, the least I could do is include some sample data sets for people to work with. This is also helpful for nonprogrammers that want to look at the data.

Next up is the data visualisation part. What if people would like to work with the generated visuals, but don’t have the technical skills to generate them? They should be able to use those assets if they want to.

The main idea is to deconstruct the project into various phases and allow users to start from any point they want, based on their skills and needs.

When that is all said and done, I’ll start looking at other languages for the page.

Relearning how to do open code projects

Before reading the book, I thought my project was very open because I shared the code on GitHub. After reading the book, I’m not so sure if it is that open. Luckily, I can learn, unlearn and relearn things. Let’s see where relearning ‘how to open up a project’ takes me.