Category Archives: Meta

New Year, new social media

As Twitter has been turned into a neo-fascist cesspit by its white-supremacist owner, I am no longer posting on the Statutes Project account there. Once I have done some tidying, I will be locking the account.

In its place, I have started accounts on two other social media networks. I will endeavour to keep these accounts in sync, manually at first, pending some sort of general social media app that can handle multiple systems.

You can find the Statutes Project on Mastodon, courtesy of Humanities Commons, at https://hcommons.social/@Statutes, and on Bluesky at @statutes.bsky.social.

One hundred thousand page views

A milestone: this site has just had its one hundred thousandth page view.

The most popular pages – leaving aside the home page – have been the bibliography of the Statutes of the Realm, and the Chronological Bibliography; the most viewed laws have been James I’s notorious Act against Witchcraft, and George I’s Transportation Act.

It’s very gratifying that this site has been so useful, even if, as yet, it is a rather random collection of statutes, and a fraction of the total number passed.

But I persist in thinking that this material is of considerable historical significance and utility. As such, it requires cataloguing and curation – and I’m amazed that this hasn’t been done already – and also rendering into formats useful for humans, and the computers they use.

Obviously, this project is constrained by time and abilities; it is taking considerably longer than I had envisaged to correct the texts, and has in the meantime given me a thorough course in regular expressions. Not only can the machine do only so much correcting, but every correction pattern has to be formulated by a human.

Currently, I am focusing on producing tables of the acts, public, local and private, from Restoration to Irish Independence, 1660 to 1921. I hope these will allow the finding of both individual acts and acts by type, and open a way towards statistical analysis of legislative patterns over two and a half centuries.

To see the hundreds of tables already transcribed, see these directories in the Statutes’ Github Repository:

Public Acts; Local Acts; Private Acts.

If you find this site, and the larger project, useful, feel free to make a donation via Kofi. Contributions will be used to pay the hosting charges and expenses related to research, reward the editor with a nice meal, and if the donations are significant, hire people to develop the site and proof the texts.

It should go without saying, however, that all content on this site is, and will always be, free to read, use and download, either public domain (which applies to all the legislation) or open access by CC-SA-4 (my own contributions).

Standardizing Statutes

I have just added the 1689 act ‘Absence of King William’ to the statutes text section.

I took the text from Wikisource, which in turn transcribed it from the Statutes of the Realm collection, volume 6. It is also available from British History Online, which has transcribed three volumes of that series.

Statutes of the Realm is the most complete collection of pre-Union legislation available; it was commissioned to collect all the laws up to the union with Scotland, without regard to whether an act was in force or not. The act is not included in either Pickering’s or Ruffhead’s ‘Statutes At Large’ series, presumably because it had long since expired at the time those were published, and those collections were more pragmatically focused.

The text I’ve posted is different from the other transcriptions, in that I have standardized it. The Statutes of the Realm sought fidelity to the original manuscripts, reconciling the originals and the inrolled copies, noting their differences, omissions, and discrepancies, and strictly following original spellings. This makes for difficult, interrupted reading for humans; similarly, it is an obstacle to ‘distant reading’, that is, the digital analysis of large volumes of text.

Consequently, with the help of a simple line of code and a short, hand-compiled list of obsolete spellings, the version I publish is readable both for people and machines.

All the changes to the text are quite minor: replacing antiquated and inconsistent spellings with regular, modern ones, often just removing a superfluous last letter (Regal for Regall, public for publick, etc.). The list of standardization couples is available on github. It’s short, just 52 pairs, but it’s a start. I haven’t uploaded a script to utilise them yet, mainly because just one line is adequate:

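while read -r n k; do sed -i.bak "s/\b$n\b/$k/g" target/*.txt; done < word-standardization-couples.txt

This should produce corrected versions of the texts in the folder called target (insert your own path), alongside backups named *.txt.bak. Note that sed rewrites each backup on every pass through the loop, so the .bak file ends up holding the text as it stood before the last pair was applied, not the untouched original; keep a separate copy of the originals before running it.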

Note this has been tested on Lubuntu 18.04 and Mac OS High Sierra; other operating systems are available.
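
The one-liner above runs sed once for every spelling pair. As a minimal sketch of an alternative, the function below turns the whole list into a single sed script and applies it in one pass, so each file is rewritten, and backed up, only once. The name standardize is hypothetical and not part of the project's repository; the sketch assumes GNU sed (for the \b word boundary) and a pairs file with one space-separated "old new" couple per line.

```shell
# Sketch only: "standardize" is a hypothetical helper, not the project's script.
# Assumes GNU sed and a pairs file with one "old new" couple per line.
standardize() {
    pairs=$1
    shift
    script=$(mktemp)
    # Turn each "old new" line into one sed substitution command,
    # e.g. "Regall Regal" becomes s/\bRegall\b/Regal/g
    while read -r old new; do
        [ -n "$old" ] && printf 's/\\b%s\\b/%s/g\n' "$old" "$new"
    done < "$pairs" > "$script"
    # Apply every substitution in a single pass; -i.bak keeps one backup
    # per file, taken before any change is made.
    sed -i.bak -f "$script" "$@"
    rm -f "$script"
}
```

It could be called as: standardize word-standardization-couples.txt target/*.txt. Words containing characters that are special to sed, such as / or &, would need escaping first.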

There is obviously a great deal more to say about manipulating texts in this way, covering matters ethical, academic, technical, and typographical. For the moment I leave all that aside, but it is worth noting these issues.

Tables of Statutes of the United Kingdom, 1801 to 1921

I have now completed tables of the full, long titles of public statutes passed by the parliament of the United Kingdom of Great Britain and Ireland, from the Act of Union in 1801 up to 1921, when Ireland was divided and the south achieved independence. They can be found on github. All these tables are public domain, and can be reused for any purpose and in any way one wishes.

I am currently working on generating tables of abbreviated titles of private and local acts for this period, using the annotated lists of local acts and private acts produced by Legislation.gov.uk.

This will be quicker than working through the full titles in the volumes of statutes for this period, although at the cost of less detail. (Tables giving full titles will be produced eventually as I work on correcting the OCR of the scanned volumes, but this will take some time.)

Once the private and local tables have been created, I will produce a more convenient package of these lists, easy to download and suitable for searching and text mining.

Updates, August and September 2017

The last two months have seen: continuing automated correction of the OCR-generated text of Pickering’s Statutes At Large, and some of the Butterworths-published volumes (1807 to 1819, in other words those using the ‘long s’). The bash script I have written for this is improving, and I hope to release it soon on github (under a free license of course).

A side effect of hunting down erroneous OCR is the production of lists of such mistranscriptions. I have started to put those on Github; used with the forthcoming script, these will constitute an easy way of improving raw OCR of eighteenth-century books.
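
One rough way of surfacing candidate mistranscriptions, offered here only as a sketch and not as the method used for this project, is to list the word forms that occur least often in a text, since OCR errors tend to be rare one-off spellings. The function name rare_words and its arguments are hypothetical; the output is a list for a human to review, not a list of confirmed errors.

```shell
# Sketch only: "rare_words" is a hypothetical helper.
# $1 = text file, $2 = highest occurrence count to report (default 1)
rare_words() {
    max=${2:-1}
    # Split the text into one word per line, lower-case it,
    # count each distinct form, and keep the rare ones.
    tr -cs '[:alpha:]' '\n' < "$1" |
        tr '[:upper:]' '[:lower:]' |
        sort | uniq -c |
        awk -v max="$max" '$1 <= max && $2 != "" { print $2 }'
}
```

Running rare_words on a volume of raw OCR would print each word form seen only once; genuine but uncommon words will appear too, so every candidate still has to be checked by eye before it goes into a correction list.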

I have started a page collecting volumes of historic American state legislation, mainly colonial, but with some post-revolutionary laws.

SSL has been enabled for the site, courtesy of a free certificate via my hosts Evohosting and Let’s Encrypt! I will be making all URLs secure by default at some point in the future; this should not break any pages you have bookmarked. Until then, simply starting any of them with ‘https://’ will call up the secure version.

New laws added to the site, including: the 1807 Abolition of the Slave Trade Act; from 1740, encouragement of mariners; and Hogarth’s act for protecting copyright in engravings of 1735.

There will now be a hiatus until November, whilst I concentrate upon writing my PhD thesis.

May, June and July 2017 updates

Work on the Statutes Project in the last three months has mainly consisted of running bash scripts on the OCR of Pickering’s Statutes At Large, to correct the more obvious errors. It’s slowly getting into reasonable shape. You can find the latest plain text on the Statutes Github repository.

More tables of acts have been added: they now run from 1716 to 1736, with some others up to 1760. The aim is to have a complete set covering the reigns of Georges one and two by autumn. Then some text mining can begin. Again, find them on Github.

Various individual acts have been added to this website, including the Licensing Act of 1737, the Irish Dependency Act, its repeal and a clarification of the repeal. Plus the Equal Franchise Act of 1928, to accompany the recent election and its surrounding ballyhoo.

Also added is the 1661 Tumultuous Petitioning Act, taken from the Ruffhead edition of the Statutes at Large, as it does not appear in the Danby Pickering series I have been concentrating on. Consequently, it looks a little different, as the two versions have different standards and protocols.

On the agenda for the next couple of months: more automatic OCR correction, and more tables.

March and April 2017 Updates

Work on the Statutes Project in March and April 2017:

0: Numerous corrections to Pickering’s series of Statutes at Large. Latest versions to be found, as ever, on Github.

1: More tables of statutes uploaded to Github. Currently, there are tables for public acts 1716 to 1736, with just 1721 missing. This I’ll upload shortly.

2: More legislation collected, to the point that the menus are getting unwieldy and I’ll have to do some reorganizing. Acts added include:
The Murder Act of 1751, giving the corpses of the hanged to the surgeons (and occasioning many a riot).
The Regency Act 1729, allowing the Queen to govern whilst George the Second went off to Hanover.
The Septennial Act, extending the life of a parliament to seven years. A quite undemocratic act, had there been any meaningful suffrage.

On the to do list for May 2017: due to the demands of my PhD, I’ll be working on the insolvent debtor relief acts from 1649 to 1813 over the next month; consequently, those texts will be corrected and added.

February 2017 Updates

Work on the Statutes Project in February 2017:

0: Numerous corrections to the OCR of the Pickering and Ruffhead editions of the Statutes At Large, uploaded to Github. Still a long way from readable, but getting there.

1: A new series OCR’d, or at least half a series. The Statutes of the Realm was the most academic, comprehensive and careful collection of acts, the text generally taken from the statute rolls themselves. Consequently, it is a typographical nightmare, and the OCR is worse than for the – admittedly less reliable – series of the Statutes At Large. I have put on Github the text for two volumes (numbers 3 and 5) found on Google, and, thanks to the University of Southampton waiving their No Derivatives license, the text for volumes 6 to 11 from the British Parliamentary Publications set, digitized by Soton, on archive.org.

2: I have also started extracting the tables of acts from the OCR’d volumes, and uploading them to Github. The idea is to create a reliable list of legislation enacted, with the long title of each act. Given the length of titles, this will constitute a corpus of sufficient size for text mining and distant reading (I hope). It also constitutes the first step in creating metadata for this project.

3: Laws collected from around the web:

1536 27 Henry 8 c.19: An act limiting an order for Sanctuaries and Sanctuary persons.

The 1918 Parliament (Qualification of Women) Act: Allowing women to sit in Parliament, and the shortest statute at a mere 27 words, when preamble and short title clause are put aside.

4: And also: a short post on James I’s laws on sanctuary, over at my Alsatia blog.

Planned for March: More acts collected from round the web, and more tables of statutes.

January Updates

Work on the Statutes project for January 2017:

0: OCRed two volumes of ‘The Statutes Of The Realm’, as digitized by Google. This is an important collection of legislation from Magna Charta to 1714, derived from close study of the original manuscripts, and contains laws not found in other collections. Raw OCR can be found on Github, and be warned, it’s very raw, as this series goes to great lengths to transcribe the original texts, with all their irregularities and without softening them for the modern eye. I hope to have more news concerning other volumes soon.

1: Tidying up of, and various corrections made to – by hand and by find-and-replace – the OCR of the Pickering and Ruffhead editions of The Statutes At Large. I needed to locate every use of the term ‘sanctuary’, so waded through the alphanumeric soup. Investigations into automatic correction are ongoing.

2: Added a bibliography for the digitized volumes of Ruffhead’s series of ‘The Statutes At Large.’ This gives the particular volumes I have OCRed, each with their own idiosyncrasies and missing pages. Also added a bibliography for historic French statutes up to the revolution of 1789, though I have no plans to do anything with these right now.

3: Added to the statutes collected from round the web:
The Quartering Act 1774
1849: 12 & 13 Victoria c.92: Cruelty to Animals Act
The Debtors Act 1869

December Updates

Work on the statutes project in December 2016:

Not much accomplished this month, given the seasonal festivities. However:

0: On a whim I have OCR’d the 12 volumes collecting the statutes of Ireland, 1310 – 1800. The raw and very messy OCR can be found on Github. As with many of these retrospective collections, they are far from complete, neglecting statutes expired or repealed. I have not decided what to do with them, given that the English / British statutes take priority, and are daunting enough a task alone, but I will probably be producing corrected versions of legislation dealing with debt, as and when I need them. After the Union of 1800, Irish legislation is to be found in the main body of United Kingdom law.

1: Added to the general collection of statutes gathered from around the internet: the infamous ‘Cat and Mouse’ act of 1913, used against imprisoned Suffragette hunger-strikers.

On the to do list for January: thinking about automatic correction of OCR’d text, using titles of statutes as metadata, and hopefully some long overdue blog posts.