Work on the Statutes Project in November 2016:
0: The big news is that the OCRing of the digitized volumes of statutes is now complete. That’s a total of 137 separate volumes. Quite how many words that is I haven’t checked yet, but the Danby Pickering series alone contains around 13 million words. There should be a more or less complete set of public acts from 1761 to 1875: 115 years’ worth of legislation, in the volumes published contemporaneously. Before 1761 the statutes are incomplete, as many acts that had either been repealed or had simply expired were not included in the collections. The number of missing acts is yet to be ascertained.
The raw OCR is available via GitHub: https://github.com/Anterotesis/statutes
With this stage complete, I now need to consider how best to correct the OCR and organize the texts. More news on this next month.
1: Added the Riot Act of 1714 and the Cruelty to Animals Act 1876 to the collection of miscellaneous statutes.