Sometimes, it feels like life is pushing me to do something. I first heard about W.E.B. Du Bois in April 2021, during S-H-O-W 2021 (a data visualisation event). I asked a speaker for book recommendations, and he recommended The Souls of Black Folk.

After that, it was quiet for a while. But recently a few things happened in the span of a couple of weeks:

  • After my first year of full-time freelancing (2023), I decided to reward myself with some inspiration and got myself a copy of W.E.B. Du Bois’s Data Portraits: Visualizing Black America.
  • Around the same time, I started reading the sci-fi book Seveneves, in which one of the main characters is called Dubois.
  • When I finished that, I started Underground Railroads. This book gave me some new insights into the history of slavery.
  • And a few days later I took time to sit down on a quiet afternoon to finally read W.E.B. Du Bois’s Data Portraits.

Not long after that, I see an announcement from the Data Visualization Society (DVS) about an open competition:

The goal of the challenge is to celebrate the data visualization legacy of W.E.B Du Bois by recreating the visualizations from the 1900 Paris Exposition using modern tools.

And I feel like I just have to give this a go.


Making Data Portraits in Matplotlib

The competition feels like a good next step in my personal Du Bois journey, and a good reason to exercise my Matplotlib skills. Recreating some of the data visualisation plates he and his team made by hand(!) in 1900 is an interesting way to get up close and personal with his work.

The #DuBoisChallenge2024 runs for 10 weeks, and each week features one of the data portraits. The organisers provide all the necessary data for each. My goal, as stated in the title, is to recreate these using only Python and Matplotlib (and some Pandas).

Besides that, I try to limit myself to 2-3 hours of work per week on one plate. There are some I think I can manage in that time, but others (like the first one) feel a bit more tricky.


Jupyter Notebooks

Curious about the code? You can view the Jupyter Notebook here. It has been updated to include challenges 01 to 03. For general thoughts and notes, continue reading below (I put my most recent plot first).

(Notebook for challenge 04 is still a mess. I will add it when I spend some more time on it.)


Challenge 04: The Georgia Negro (plate 01)

You can view the original here. And if you do, you can clearly see that I haven’t managed to recreate this one properly.

Here’s my version.

It turns out that spherical map projections, and projecting custom shapes onto them, are something I’m not familiar with yet. I tried a few approaches, but after about 2-3 hours’ worth of attempts, I decided I wasn’t going to make anything work within my planned 2-3 hour limit per challenge.

So I opted to give myself a few more hours, tried to make something different but related, and got to the image you just saw.

There are a few things I like. I tried to put the focus on the destination ports that slaves were transported to. There’s a dot for each destination port and an open circle sized by the number of slaves that were transported there.

The low-opacity lines that represent slave routes (from source port to destination port) make the routes harder to pinpoint, which is something most people were not interested in doing after buying slaves back then (I think). I also decided not to highlight the source ports, to contribute to this effect.
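A minimal sketch of that encoding, assuming made-up port coordinates and counts (the `routes` list is hypothetical, and this is not the code from the notebook). It draws faint route lines, a small dot per destination port, and an open circle around it sized by the count:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

# Hypothetical routes: (source lon, source lat, destination lon, destination lat, count).
# All values are invented for illustration.
routes = [
    (-17.4, 14.7, -81.1, 32.1, 300),
    (13.2, -8.8, -38.5, -13.0, 900),
    (3.4, 6.5, -76.8, 18.0, 600),
]

fig, ax = plt.subplots()

# Faint lines for the routes; the source ports get no marker on purpose.
for src_lon, src_lat, dst_lon, dst_lat, _ in routes:
    ax.plot([src_lon, dst_lon], [src_lat, dst_lat],
            color="black", alpha=0.15, linewidth=1)

dst_lons = [r[2] for r in routes]
dst_lats = [r[3] for r in routes]
sizes = [r[4] for r in routes]  # scatter's `s` is the marker area in points^2

# Small filled dot for each destination port.
dots = ax.scatter(dst_lons, dst_lats, s=10, color="black", zorder=3)

# Open circle around each dot, sized by the count for that port.
rings = ax.scatter(dst_lons, dst_lats, s=sizes,
                   facecolors="none", edgecolors="crimson", zorder=2)
```

Because `s` is an area, a port with four times the count gets a circle with twice the radius, which keeps large ports from swamping the map.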

This week’s lesson: there are still some things that Du Bois did in 1900 that I can’t easily recreate using only code. (And it’s a lot harder to make something nice if I have to design an alternative myself 😅).

A day later…

I found another hour to work on it and gave the geospatial projection another try. Thanks to swatchai I got the data projection working on the Matplotlib Basemap projection:

After this version, I decided to slightly rotate the right globe to make the two semi circles in the centre align. It does ‘hide’ Africa a bit too much, but it helps emphasise the point of origin of the lines in the left globe.

It’s still far from a perfect match of the original, but at least it has the spherical maps with data on them.
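For anyone curious what a globe-style projection does to the data, here is a rough sketch of the orthographic projection maths in plain NumPy. It is not Basemap’s API and not the code I used; the function name `orthographic` and its parameters are made up for this example. Rotating a globe comes down to changing the centre longitude `lon0_deg`.

```python
import numpy as np

def orthographic(lon_deg, lat_deg, lon0_deg=0.0, lat0_deg=0.0):
    """Project lon/lat (degrees) onto a unit globe viewed from (lon0, lat0).

    Returns x, y in the range [-1, 1] and a flag telling whether the point
    lies on the visible half of the globe.
    """
    lon, lat = np.radians(lon_deg), np.radians(lat_deg)
    lon0, lat0 = np.radians(lon0_deg), np.radians(lat0_deg)

    x = np.cos(lat) * np.sin(lon - lon0)
    y = (np.cos(lat0) * np.sin(lat)
         - np.sin(lat0) * np.cos(lat) * np.cos(lon - lon0))

    # Cosine of the angular distance to the view centre: negative means
    # the point sits on the far side of the globe and should be hidden.
    cos_c = (np.sin(lat0) * np.sin(lat)
             + np.cos(lat0) * np.cos(lat) * np.cos(lon - lon0))
    return x, y, cos_c >= 0

# The view centre itself always lands in the middle of the disc.
cx, cy, c_visible = orthographic(-60.0, 0.0, lon0_deg=-60.0)

# The point on the opposite side of the globe is flagged as hidden.
_, _, far_visible = orthographic(120.0, 0.0, lon0_deg=-60.0)
```

Points flagged as hidden need to be dropped (or lines clipped) before plotting, otherwise routes appear to cut straight through the globe.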

Yes! 🙂


Challenge 03: Acres of Land Owned by Negroes in Georgia (plate 19)

You can view the original here.

It feels like this is very matplotlib-able. And it turns out it is. Here’s my version of Du Bois’s plate 19.

You may notice some changes. I decided to go for a serif font for the title and annotations (Roboto Slab). I also changed the aspect ratio to that of A-series paper, in preparation for later: I want to print my versions when they are all done.

I like how this plate about ownership of land leverages the full plot area available. Compared to the previous plot, a lot of margins are removed. I see this as a powerful way to emphasise the size of the lands.

Besides that, this week is a good week to show the impact that design makes on code.

Adding Du Bois’s style changes the code from this:

fig, ax = plt.subplots()
ax.barh(df.index.values, df['acres'])
ax.set_ylim(max(df.index.values)+1, -1)

To this:

fig, ax = plt.subplots(
    figsize=(7.4, 10.5),
    facecolor=dubois_colors['bg']
)

rob_font_heavy = {'fontname': 'Roboto Slab', 'fontweight': 'black'}
rob_font_light = {'fontname': 'Roboto', 'fontweight': 'light'}

ax.barh(df.index.values, df['acres'], color=dubois_colors['crimson'], height=.55, alpha=.95)

# Strip the default chrome: tick marks, axes background, x-axis and spines
ax.tick_params(left=False)
ax.patch.set_alpha(0)
ax.get_xaxis().set_visible(False)
ax.spines[['left', 'top', 'right', 'bottom']].set_visible(False)

# Label each bar with its year; the reversed limits put the first row at the top
ax.set_yticks(df.index.values, df['year'], fontsize=12, **rob_font_light)
ax.set_ylim(max(df.index.values)+1, -1)

plt.title('ACRES OF LAND OWNED BY NEGROES\nIN GEORGIA.', pad=-5, fontsize=18, **rob_font_heavy)

# Annotate the first and last bars with their values, centred inside the bar
annotation_1 = int(df['acres'].iloc[0])
annotate_kwargs = {
    'fontsize': 14,
    'horizontalalignment': 'center',
    'verticalalignment': 'center',
}
ax.annotate(format(annotation_1, ','), (annotation_1/2, 0+.01), **annotate_kwargs, **rob_font_heavy)

last_index = len(df.index.values) - 1
annotation_2 = int(df['acres'].iloc[last_index])
ax.annotate(format(annotation_2, ','), (annotation_2/2, last_index+.01), **annotate_kwargs, **rob_font_heavy)

# Let the bars run to the edges of the page
plt.subplots_adjust(top=0.93, bottom=.00, left=0.08, right=1)

And the visual from this:

To this:

It’s again a special feeling to be handling Du Bois’s style in Python.
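The snippets in this section rely on a `df` and a `dubois_colors` dictionary that are defined elsewhere in the notebook. If you want to try them yourself, a stand-in setup could look like the sketch below. The numbers and colour values here are invented placeholders; the real ones come from the challenge data and style guide.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder palette: named Matplotlib colours standing in for the
# style guide values (assumption, not the real hex codes).
dubois_colors = {"bg": "antiquewhite", "crimson": "crimson"}

# Invented numbers, only to give the bars something to draw.
df = pd.DataFrame({
    "year": [1874, 1880, 1890, 1899],
    "acres": [300_000, 500_000, 900_000, 1_000_000],
})

# The plain version of the plot, exactly as in the short snippet above.
fig, ax = plt.subplots()
ax.barh(df.index.values, df["acres"])
ax.set_ylim(max(df.index.values) + 1, -1)  # reversed: first row on top
```

With the default integer index, the row number doubles as the bar position, which is why the snippets pass `df.index.values` to `barh`.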


Challenge 02: Slave and Free Negroes (plate 12)

You can see the original here and my version below:

One thing stands out to me about this challenge: the value of a well-developed sense of design. To get a feel for that, have a look at the graph below:

Now this is technically the same thing as the one that W.E.B. Du Bois made, give or take the y-axis labels and a title.

There are a lot of things that Du Bois does to design his data:

  • y axis labels
  • x axis labels
  • x axis scales
  • title of plot
  • title of x axis
  • smart usage of a double y-axis
  • colour palette
  • ripped paper effect on the left (which I recreated using a random data set)
  • breathing room for the plot to be in

And there is probably a lot more that I don’t see.
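The ripped paper effect is the least obvious item on that list, so here is a rough sketch of one way to fake it with random data. It assumes placeholder colours and an arbitrary tear depth, and it is a simplified illustration rather than the exact code from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=1900)  # fixed seed: same tear on every run

n_points = 200
y = np.linspace(0.0, 1.0, n_points)               # positions along the left edge
tear = rng.uniform(0.0, 0.05, size=n_points)      # how far the tear reaches in

fig, ax = plt.subplots(figsize=(4, 6), facecolor="antiquewhite")
ax.set_facecolor("crimson")   # stand-in for the coloured plot area
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)

# Paint background colour from the left edge up to the jagged random line,
# so the coloured area looks torn instead of cut straight.
torn_edge = ax.fill_betweenx(y, 0, tear, color="antiquewhite", zorder=3)
```

More points give a finer, more paper-like edge; a larger upper bound in `rng.uniform` makes the tear deeper.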

Besides all that, there is one thing the original does that I haven’t implemented. In the final period, the percentage changes from 0.8% to 100%. Du Bois chooses to show an incline there, but with the x-axis range set to 3% – 0%, that should not be that visible.

That is what I like about this challenge. I get to experience some of the design decisions Du Bois and his team made. As a reminder of that, I keep the line uncorrected in my version.


Challenge 01: Negro Population of Georgia by Counties, 1870, 1880 (plate 06)

You can view the original here and my version below:

Developing this was an interesting experience, as I haven’t worked with maps a lot in Matplotlib. And I somehow got off on the wrong foot. I tried to load all the .csv files, instead of the very handy shapefile…

When I develop my data visualisations, I often start by using system-default (in this case Matplotlib-default) colours. It allows me to focus on the functional setup of my code. The creators of the competition were kind enough to include a style guide. So when the code for the layout was set, I swapped the system colours for the Du Bois colours. And that felt nice. It felt like applying Du Bois’s sense of style to my visual.
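A small sketch of that workflow, assuming placeholder colour names instead of the real style guide values (the dictionaries and the `draw` helper are made up for this example). The drawing code stays the same; only the palette that gets passed in changes.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

# Stage 1: Matplotlib's default colour cycle ("C0", "C1", ...).
default_palette = {"bg": "white", "low": "C0", "mid": "C1", "high": "C2"}

# Stage 2: placeholder Du Bois-like palette (assumption; take the real
# values from the challenge's style guide).
dubois_palette = {"bg": "antiquewhite", "low": "gold",
                  "mid": "crimson", "high": "steelblue"}

def draw(palette):
    """Draw the same three bars; only the palette argument changes."""
    fig, ax = plt.subplots(facecolor=palette["bg"])
    ax.set_facecolor(palette["bg"])
    ax.bar(["low", "mid", "high"], [1, 2, 3],
           color=[palette["low"], palette["mid"], palette["high"]])
    return fig, ax

fig_default, ax_default = draw(default_palette)
fig_dubois, ax_dubois = draw(dubois_palette)
```

Keeping every colour behind a dictionary key means the switch is a one-line change, and it makes a before/after comparison easy to produce.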

The GIF below rotates between the system colours and Du Bois colours.

Du Bois’s colour palette improves it a lot, right?

I’d also like to give you an insight into the technical setup. I’ve added an extra image below that shows the four elements of the overall plot:

As you can see, I use four overlapping subplots and make the background of the plots transparent. The legends in the top-right and bottom-left plots are technically scatter plots.
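In code, that setup can be sketched roughly like this. The rectangle positions, class names and colours below are placeholders I made up for the example, not the values from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 8), facecolor="antiquewhite")

# [left, bottom, width, height] in figure fractions.
# The rectangles overlap on purpose (hypothetical positions).
layout = {
    "map_top_left": [0.00, 0.45, 0.60, 0.50],
    "legend_top_right": [0.50, 0.55, 0.50, 0.40],
    "legend_bottom_left": [0.00, 0.05, 0.50, 0.40],
    "map_bottom_right": [0.40, 0.05, 0.60, 0.50],
}

axes = {}
for name, rect in layout.items():
    ax = fig.add_axes(rect, label=name)
    ax.patch.set_alpha(0)  # transparent background, so overlaps stay visible
    ax.set_axis_off()      # no ticks or spines
    axes[name] = ax

# A legend built as a scatter plot: one marker per class plus a text label.
classes = ["class A", "class B", "class C"]   # placeholder labels
colours = ["crimson", "gold", "steelblue"]    # placeholder colours
ys = [2, 1, 0]

legend_ax = axes["legend_top_right"]
legend_ax.scatter([0, 0, 0], ys, s=300, c=colours, edgecolors="black")
for y_pos, text in zip(ys, classes):
    legend_ax.annotate(text, (0.3, y_pos), verticalalignment="center")
legend_ax.set_xlim(-0.5, 3)
legend_ax.set_ylim(-1, 3)
```

Placing the axes by hand with `fig.add_axes` gives full control over the overlap, which a regular `plt.subplots` grid does not allow.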

That’s it for now. See you next week 🙂


Coming up:

  • Challenge 04
  • Challenge 05
  • Challenge 06
  • Challenge 07
  • Challenge 08
  • Challenge 09
  • Challenge 10