
Wikipedia Growth (Part 2 - Analysis)

Updated: Sep 22, 2020

How has Wikipedia grown in the past two decades?

After summarizing Wikipedia's growth in Part 1, I noticed that several measures of growth appeared to follow similar trends. I decided to plot them together to see how far those similarities extend.



I used Python and pandas in a Jupyter notebook to aggregate and sort through the data before visualizing the numbers with matplotlib.

(Note: This data pertains only to the English Wikipedia.)


Data Analysis



The chart above compares several measures of Wikipedia's health and growth. They loosely follow a similar trajectory, rising and then leveling off around 2008. Because the measures sit on very different scales, though, they are difficult to compare visually in this form.

When we compare a logarithmic version of these trends, their similarities become much more apparent. Active editors, edited pages, bytes added/removed, user edits, and new pages all appear to be strongly correlated, although the nature of their relationships cannot be discerned from this graph alone.
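To make the idea of comparing logarithmic versions of the trends concrete, here is a minimal sketch of one way to do it with pandas. It is not the notebook used for this post: the column names and every number in it are made up for illustration, and Pearson correlation is just one assumed way to put a number on "similar".

```python
import numpy as np
import pandas as pd

# Hypothetical monthly figures, invented purely for illustration
# (NOT real Wikipedia statistics). Column names are assumptions.
monthly = pd.DataFrame(
    {
        "active_editors": [1_000, 2_000, 4_000, 8_000, 16_000],
        "user_edits": [50_000, 100_000, 200_000, 400_000, 800_000],
        "new_pages": [300, 500, 1_200, 2_500, 4_700],
    },
    index=pd.period_range("2003-01", periods=5, freq="M"),
)

# Take log10 of every value so that series on very different
# scales become comparable on a single axis.
log_monthly = np.log10(monthly)

# Pairwise Pearson correlation between the log-transformed series.
log_corr = log_monthly.corr()
print(log_corr.round(3))
```

If the goal is only the chart, calling `set_yscale("log")` on a matplotlib axis gives a similar view without transforming the data first. A high correlation here only says the series move together; it says nothing about why.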

I also looked at the cumulative changes to Wikipedia over time by summing its monthly changes. As expected, the cumulative totals have grown at a roughly constant rate since the monthly changes leveled off near 2008. Note how the number of user edits and edited pages increases at a higher rate than the number of new pages and the total size.
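Summing monthly changes into a running total is a one-liner in pandas. The sketch below uses a made-up series (the name `monthly_new_pages` and its values are hypothetical, not taken from the real data) to show the step and what happens once the monthly figure stops changing.

```python
import pandas as pd

# Hypothetical monthly changes, invented for illustration only
# (NOT real Wikipedia statistics).
monthly_new_pages = pd.Series(
    [100, 300, 500, 500, 500, 500],
    index=pd.period_range("2007-09", periods=6, freq="M"),
    name="new_pages",
)

# Running total: each entry is the sum of all monthly changes so far.
cumulative = monthly_new_pages.cumsum()

# Month-over-month increase of the running total. This recovers the
# monthly changes (the first entry has no predecessor, so it is NaN).
step = cumulative.diff()

print(cumulative.tolist())
print(step.tolist())
```

Once the monthly figure settles at a constant value, the running total climbs by that same amount every month, which appears as a straight line on a linear axis.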

I also plotted the cumulative changes on a logarithmic scale to make the similarities easier to see. Interestingly, the total size of Wikipedia seems to have leveled off noticeably more than the other measures.




Again, thanks to the Wikipedia/Wikimedia people for publishing their data for public use.


