How much would it cost to edit, proof, index and print Wikipedia? (all of it)

In my project-management capacity, I generally have an encyclopaedia or two on the go at any one time. These usually range from around 500,000 to around 1.5 million words. The largest modern encyclopaedias are upwards of 40 million words (Britannica’s 2013 print edition has 44 million).

These are difficult works to handle, with a whole raft of consistency and data-handling considerations that simply don’t apply to ‘normal’ books.

Compared to Wikipedia, though, they’re like children’s picture books. The largest encyclopaedia I’ve ever worked on had four volumes and was around 2 million words. That’s about 0.077% of Wikipedia, which according to its own figures currently contains approximately 2.6 billion words.

Just for squeaks and giggles, let’s pretend we’ve been asked to manage the production of Wikipedia and estimate the costs and time involved in putting all 2.6 billion words, or around 4.5 million articles, through the standard process of readying a book for publication.

Brace yourself: there will be maths.

(If you want to skip straight to the summary, jump to ‘How much did you say?’ below.)

Getting started

The following flights of fancy will necessarily be selective. They cover only what is generally called the ‘production’ stage of producing a book. For a normal encyclopaedia, before production can start, the ‘commissioning’ process has to happen. In this process, academic editors who are experts in their fields decide what areas they want to cover and then commission articles on those subjects. The articles are then reviewed and revised until they meet the requisite quality standard.

This process alone can take many years for an encyclopaedia of normal length, but one of the few advantages of printing Wikipedia is that the articles already exist. And, assuming the publisher wanted to print Wikipedia as it is, we could skip right over assessment of article quality and checking for gaps or gluts in coverage.

‘Hooray! We’ve saved time and money before we’ve even started!’ you exclaim.

But not so fast. The lack of a peer-review process will mean:

  1. many of the articles produced will be complete garbage and not worth printing, but, seeing as we’re taking the purist option of printing all of Wikipedia, they will nevertheless need to be edited into a form that makes some sort of sense;
  2. provision will need to be made to have a panel of experts available to answer the copy-editors’ queries (on everything from axoplasmic transport to the skirt dancer Kate Vaughan);
  3. the copy-editors, indexers and proofreaders themselves will need to be of a very high calibre, able to make decisions with little guidance (but maximum communication between themselves) on innumerable style and content issues.

Still interested in managing this beast? Let’s take a closer look at what it will involve.

Copy-editing

Let’s pick an average copy-editing speed of 2,000 words per hour (ignoring the possibility of vast variation in the quality of the text and therefore in editing speeds). That’s 1.3 million hours of editing, or 162,500 days (at eight hours’ editing per day), or 677 years (working 240 days per year).

To complete the copy-editing in anything like a reasonable timeframe, say a year, you’d therefore need upwards of 650 copy-editors – probably more than an entire country’s worth of editors with the necessary experience level. As a result, all other publishers will hate you (you’ll have swiped all the good editors). And, if the copy-editors wise up to the fact you’ve effectively handed them a monopoly, they might be tempted to put their rates up, meaning your already insane copy-editing budget will skyrocket.

Just how insane would that budget be? The UK’s Society for Editors and Proofreaders (SfEP) suggests a minimum copy-editing rate of £25.70 per hour. Many experienced editors charge more, but let’s take that figure as our ballpark number. £25.70 × 1,300,000 is approximately £33.4 million.

How insane? Very.
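
For anyone who wants to check my sums, the copy-editing arithmetic above can be sketched in a few lines of Python. The figures are the ones quoted in this section; the variable names are purely illustrative.

```python
# Figures quoted in the text; the names are illustrative, not from any real tool.
WORDS = 2_600_000_000        # Wikipedia word count used in this article
WORDS_PER_HOUR = 2_000       # assumed average copy-editing speed
HOURS_PER_DAY = 8
DAYS_PER_YEAR = 240
RATE_GBP_PER_HOUR = 25.70    # SfEP suggested minimum copy-editing rate

hours = WORDS / WORDS_PER_HOUR       # total editing hours
days = hours / HOURS_PER_DAY         # working days for one editor
years = days / DAYS_PER_YEAR         # person-years for one editor
cost = hours * RATE_GBP_PER_HOUR     # total fee in pounds

print(f"{hours:,.0f} hours, {days:,.0f} days, {years:,.0f} person-years")
print(f"Copy-editing cost: £{cost / 1e6:.1f} million")
```

The person-years figure is also, roughly, the number of copy-editors needed to finish within a single year.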

Typesetting

I’m not too hot on the intricacies of typesetting rates, but £4 per page is a reasonable rough-and-ready figure. This includes:

  1. flowing the text on the pages;
  2. processing and redrawing images and tables, and arranging them nicely within the text;
  3. producing PDF proofs;
  4. implementing the proofreaders’ corrections;
  5. producing another round of PDF proofs;
  6. implementing the inevitable tweaks that will still be needed;
  7. generating a final shiny batch of proofs for printing.

At an estimated 5 million pages, including space for images and tables, that’s another £20 million in costs.

Proofreading and indexing

Next we have proofreading and indexing, which happen simultaneously once the proofs have been generated by the typesetter. The SfEP suggests a minimum proofreading rate of £22 per hour. Let’s be super-optimistic and assume our copy-editors have done such a brilliant job that the proofreaders (of whom, by the way, we’ll need around 270 to get the proofreading done in a year) can manage 5,000 words per hour. That’s 520,000 hours at a cost of £11.4 million.
Indexing is often charged by the page. We’ll budget £2.50 per page, which gives a cost of £12.5 million. And I’m sure you’ve noticed the pattern by now and can deduce that the number of indexers required will be similarly silly.
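
The proofreading and indexing figures can be reproduced with the rough sketch below. It assumes the same 8-hour day and 240-day year used for copy-editing, and the names are again illustrative.

```python
# Figures quoted in the text; the working year is carried over from the
# copy-editing section as an assumption.
WORDS = 2_600_000_000
PAGES = 5_000_000                 # estimated page count
PROOF_WORDS_PER_HOUR = 5_000      # super-optimistic proofreading speed
PROOF_RATE_GBP_PER_HOUR = 22      # SfEP suggested minimum proofreading rate
INDEX_RATE_GBP_PER_PAGE = 2.50
HOURS_PER_YEAR = 8 * 240          # one person's working year

proof_hours = WORDS / PROOF_WORDS_PER_HOUR
proof_cost = proof_hours * PROOF_RATE_GBP_PER_HOUR
proofreaders = proof_hours / HOURS_PER_YEAR   # headcount to finish in a year
index_cost = PAGES * INDEX_RATE_GBP_PER_PAGE

print(f"Proofreading: {proof_hours:,.0f} hours, £{proof_cost / 1e6:.1f} million")
print(f"Proofreaders needed for a one-year schedule: about {proofreaders:.0f}")
print(f"Indexing: £{index_cost / 1e6:.1f} million")
```

The headcount works out at a shade over 270, which is where the ‘around 270’ above comes from.
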
As an aside, most encyclopaedias have extensive sections of ‘prelims’ – introductory material such as tables of contents, lists of contributors, lists of abbreviations, and perhaps an introduction and a preface. Seeing as even a table of contents for Wikipedia would likely be around 100 volumes and that compiling (let alone attempting to print) any kind of list of contributors (around 22.8 million) would be a task of truly frightening complications, let’s give the prelims up as a bad job. No one will notice, anyway – they’ll be too busy calculating how many miles of shelving they’ll need to buy to house their new purchase.

Revisions

A second group of proofreaders then has to check that the corrections have been implemented fully and without errors being introduced. Assuming the corrections were minimal, let’s hope these proofreaders can get through 100 pages an hour. At the SfEP’s suggested rate, that’s another £1.1 million of work.
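
The step from pages to pounds isn’t spelled out above, so here is a minimal sketch of it. It assumes ‘the SfEP’s suggested rate’ means the £22 per hour proofreading rate quoted earlier, which matches the £1.1 million figure.

```python
PAGES = 5_000_000          # estimated page count used in this article
PAGES_PER_HOUR = 100       # hoped-for checking speed
RATE_GBP_PER_HOUR = 22     # assumption: the proofreading rate quoted earlier

check_hours = PAGES / PAGES_PER_HOUR            # hours of checking
check_cost = check_hours * RATE_GBP_PER_HOUR    # fee in pounds

print(f"{check_hours:,.0f} hours, £{check_cost / 1e6:.1f} million")
```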

Printing

Once these rounds of corrections have been completed and the final PDF proofs have been signed off by the publisher, we can proceed to printing. Wikipedia itself gives an estimate of £0.03 (US$0.05) per page for printing alone – but we’ll need to bind the books too, so let’s double the price as a rough-and-ready estimate. At 5 million pages, the printing and binding cost would therefore be a relatively modest £300,000 per copy.

Who’s steering this thing?

A final consideration is the cost of hiring a team to manage this whole process. Sometimes publishers do this in-house, but increasingly the task is outsourced to freelancers or specialist project-management companies.
This project-management team talks to the contributors, the publisher, the expert advisers, the copy-editors, the typesetters, the illustrators, the proofreaders, the indexers and the printers and attempts to maintain a degree of sanity and direction. This team also needs to be paid – let’s say £4 per 1,000 words, adding another £10.4 million to our budget.

How much did you say?

So, here’s your summary of the costs for quick reference next time someone rings you up and asks you to manage the production of a 2.6-billion-word book:

| Component | Est. cost (£ millions) | Est. hours |
| --- | --- | --- |
| Copy-editing | 33.4 | 1,300,000 |
| Typesetting* | 20.0 | 800,000 |
| Proofreading | 11.4 | 520,000 |
| Indexing* | 12.5 | 600,000 |
| Checking corrections made | 1.1 | 50,000 |
| Printing (per copy) | 0.3 | n/a |
| Management | 10.4 | n/a |
| Total (one printed copy of Wikipedia) | 89.1 | 3,270,000 hours, or 1.7 millennia for one person working on their own for 8 hours a day, 240 days per year |

* The estimated numbers of hours for these components are even more guesstimated than the others, as detailed calculations of typesetting and indexing times are outside my expertise. Corrections from better-informed people welcomed!
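
If you’d like to rerun the whole budget with your own rates, the sketch below rebuilds the summary from the per-word, per-page and per-hour figures quoted in each section. It is a rough illustration rather than a costing tool: the dictionary keys and variable names are mine, and the typesetting and indexing hours are simply the guesstimates from the table.

```python
WORDS = 2_600_000_000
PAGES = 5_000_000

# Costs in pounds, derived from the rates quoted in each section.
costs = {
    "Copy-editing": WORDS / 2_000 * 25.70,
    "Typesetting": PAGES * 4,
    "Proofreading": WORDS / 5_000 * 22,
    "Indexing": PAGES * 2.50,
    "Checking corrections": PAGES / 100 * 22,
    "Printing (per copy)": PAGES * 6 / 100,   # 6p per page, printed and bound
    "Management": WORDS / 1_000 * 4,
}

# Hours as given in the summary table (typesetting and indexing are guesses).
hours = {
    "Copy-editing": 1_300_000,
    "Typesetting": 800_000,
    "Proofreading": 520_000,
    "Indexing": 600_000,
    "Checking corrections": 50_000,
}

total_cost = sum(costs.values())
total_hours = sum(hours.values())
solo_years = total_hours / (8 * 240)   # one person, 8 hours a day, 240 days a year

for name, cost in costs.items():
    print(f"{name:<22} £{cost / 1e6:5.1f}m")
print(f"Total hours: {total_hours:,} (about {solo_years:,.0f} years for one person)")
print(f"Total cost: roughly £{total_cost / 1e6:.2f} million")
```

Summed before rounding, the total comes to roughly £89.15 million; the table’s 89.1 is the sum of the rounded rows.
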
I haven’t even included fees for overheads such as:

  1. the publishing staff themselves;
  2. the cost of retaining expert advisers;
  3. software to handle the project’s looming mountains of data (Excel’s maximum of just over a million rows is paltry in the face of 4.5 million articles);
  4. courier costs of freighting the encyclopaedia’s 17,000 volumes (roughly an entire 20-ft shipping container per copy).

Cost to purchase (sale price)

The above calculations have only dealt with the unit cost – the manufacturing costs. But, unless the publisher is a charity, they’re going to want to justify this whole crazy endeavour by making some sort of profit.

However, here’s where things fall apart (if indeed they were ever stuck together). Given that a printed Wikipedia would be uselessly unwieldy and out of date before copy-editing even started – not to mention that storing it would require purchasing a small fleet of delivery lorries’ worth of shelving – the market for a printed Wikipedia would quite certainly be zero. And I do not know of any costings model that is capable of suggesting a sale price for a product with no market.


In summary, Wikipedia, despite its shortcomings, is a stupendous human achievement. Were it to be printed, in terms of the basics it could be managed much like any other encyclopaedia. But attempting to print it would be a monument to human insanity the likes of which have rarely been seen.


Have you ever been asked to work on a 2.6-billion-word book? How did it go? What challenges did you face and how did you overcome them? Please consider sharing your experiences below!

Last updated 15 November 2022

About Hazel Bird

Hazel works with non-fiction clients around the world to help them deliver some of their most prestigious publications in areas such as charity and peace work, digital and technology, and business and leadership. An editor since 2007, she aims to see the big picture while pinpointing every detail. She has been described as ‘superhuman’ and a ‘secret weapon’, but until Tony Stark comes calling she’s dedicating her superpowers to text-based endeavours.
