New Year, new social media

As Twitter has been turned into a neo-fascist cesspit by its white-supremacist owner, I am no longer posting on the Statutes Project account there. Once I have done some tidying, I will be locking the account.

In its place, I have started accounts on two other social media networks. I will endeavour to keep these accounts in sync, manually at first, pending some sort of general social media app that can handle multiple systems.

You can find the Statutes Project on Mastodon, courtesy of Humanities Commons, at https://hcommons.social/@Statutes, and on Bluesky at @statutes.bsky.social

One hundred thousand page views

A milestone: this site has just had its one hundred thousandth page view.

The most popular pages – leaving aside the home page – have been the bibliography of the Statutes of the Realm, and the Chronological Bibliography; the most viewed laws have been James 1’s notorious Act against Witchcraft, and George 1’s Transportation Act.

It’s very gratifying that this site has been so useful, even if, as yet, it is a rather random collection of statutes, and a fraction of the total number passed.

But I persist in thinking that this material is of considerable historical significance and utility. As such, it requires cataloguing and curation – and I’m amazed that this hasn’t been done already – and also rendering into formats useful for humans, and the computers they use.

Obviously, this project is constrained by time and abilities; it is taking considerably longer than I had envisaged to correct the texts, and has in the meantime given me a thorough course in regular expressions. Not only can the machine do only so much correcting, every correction pattern has to be formulated by a human.
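To illustrate what one such hand-formulated pattern might look like, here is a minimal sketch. It assumes GNU sed, and it targets a hypothetical but common OCR confusion in older print, where the long s is read as an f. The function name and the three-word list are invented for the example; they are not taken from the project's own corrections.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sed (for -E and the \b word boundary).
# fix_long_s is a hypothetical helper; the word list is illustrative.

fix_long_s() {
  # Rewrite "fhall", "fuch" and "faid" as "shall", "such" and "said",
  # but only where they stand as whole words.
  sed -E 's/\bf(hall|uch|aid)\b/s\1/g' "$@"
}

# Example: fix_long_s some-act.txt > some-act.corrected.txt
```

A word such as "aforefaid" passes through untouched, because the pattern only covers the whole words a human has listed, which is exactly the limitation described above.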

Currently, I am focusing on producing tables of the acts, public, local and private, from Restoration to Irish Independence, 1660 to 1921. I hope these will allow the finding of both individual acts and acts by type, and open a way towards statistical analysis of legislative patterns over two and a half centuries.

To view the hundreds of tables already transcribed, see these directories in the Statutes’ Github Repository:

Public Acts; Local Acts; Private Acts.

If you find this site, and the larger project, useful, feel free to make a donation via Kofi. Contributions will be used to pay the hosting charges and expenses related to research, reward the editor with a nice meal, and if the donations are significant, hire people to develop the site and proof the texts.

It should go without saying, however, that all content on this site is, and will always be, free to read, use and download, either public domain (which applies to all the legislation) or open access by CC-SA-4 (my own contributions).

Chronological Bibliography, from Magna Carta to 1970.

The Chronological Bibliography of British and U.K. public statutes now runs up to 1970, and so, for copyright reasons, can be regarded as complete. Together with the Statutes of the Realm series, this means that the majority of public legislation made over circa 765 years should be fairly easy to access.

(The earliest statute in the Realm series is that of Merton, 1235-6, 20 Henry 3; but it is preceded by a variety of charters dating back to 1101; the starting point for Pickering and Ruffhead is Magna Carta of 1215. Many of the earlier acts are given in Latin or French, sometimes without translation.)

I managed to get Google to release the later volumes by citing the Copyright, Designs and Patents Act 1988, section 164, which states that Royal copyright “subsists in the case of an Act or a Measure of the General Synod of the Church of England, until the end of the period of 50 years from the end of the calendar year in which Royal Assent was given.”

This 50 year term makes 1970 the limit for this bibliography, although there is the possibility of annual extensions, depending on whether the volumes have actually been digitized. From 1988 on, Legislation.gov.uk have a complete set of U.K. public general acts.

These volumes contain the vast majority of laws, especially from the early nineteenth century. But they do not contain every public act passed, and the eighteenth century is much abbreviated. In some cases, the scanned copies of the books are severely marked and worn; and fonts introduce ambiguities. Inevitably, there are also flaws and occlusions in the digital images.

Nevertheless, this collation is both useful and useable. If you find any errors, or locate any better quality scans, please leave a comment.

I am now scanning these volumes, and uploading the OCR’d text to Github. It is my aim to start publishing the complete, corrected texts en masse, in an easy to navigate archival format, next year.

The English Reports, updated.

Further to my last post, I have located another 9 freely available volumes of the English Reports. The bibliography is now missing volumes 42, 68, 74, 83, 101, 112, 117, 127 and 165, because Google, Hathi Trust and Internet Archive do not have digital copies; volumes 170 to 176 are missing because, although Google have copies, the full text is not being made available, presumably due to copyright issues. In all, just 16 volumes, around 9%, are absent from the full set of 178 books.

I have also adapted the table of the reports archived in the Wayback Machine, to include links to the individual volumes. This provides a convenient way of locating material by original publication, and by the abbreviations generally used to refer to them.

Update, 11 May 2021: As pointed out in the comments, Common LII hold a full set of the English Reports, broken down into single PDFs. I don’t find their database particularly easy to search or browse, though: some reports appear misfiled under the wrong date, and the PDFs often have fragments of other cases included, which means searching often throws up false positives. The OCR, as ever, also leaves something to be desired. Nevertheless, it is the most convenient, complete and openly accessible set available.

The English Reports

Although this project is focused on the acts passed by the British parliament, the law made by Judges, in the courts, is a constituent part of the common law system. It is also just as rich as historical source material, in both quantity and quality. And similarly, it requires much the same sifting and organization as the statutes do, although thankfully much of the heavy lifting was done in the early twentieth century with the consolidation of the many historic series of law reports in the form of the English Reports.

To this end, I have now added to the bibliographies of British legal materials one listing all the freely available volumes of the English Reports. Of the 178 volumes published between 1901 and 1930, I have found 153 accessible digitizations online, held variously by Google Books, Hathi Trust, and Internet Archive. In all, I estimate they contain approximately 200,000 pages of text, covering cases from 1220 to 1865. (In 1866, the ICLR began publication of their own series of Law Reports.)

These Reports are very different from the Proceedings of the Old Bailey; those are the records of the regular business of London’s main courthouse, whilst these are selected, important precedents set out in the high courts. They are not intended to gather the legally mundane, or record charges, facts and judgements, but collate those decisions and opinions that interpreted and clarified the statute law. But notwithstanding their specific legal purpose, there is considerable larger historical interest in these volumes. For example, Somerset v Stewart, 1772, is the case that led to the end of slavery in Britain, and the King v Thames Ditton the case that showed how ambiguous and precarious that ending was. The Reports are also international and far-reaching: there are cases concerning India, the Caribbean, and relating to international and maritime law.

But such is the volume of material – most volumes are well over one thousand pages – that it is difficult to know what exactly is within them, or get any sense of how matters are distributed across time and space.

Happily, the two index volumes are freely available, which goes some way to making this vast compilation useable; there is also a table of the books comprising each volume available via the Wayback Machine. Some volumes of the Reports – it is unclear which – can be found on CommonLII in searchable PDF format. I am looking for other reference material, and considering the best way to present it digitally. The ideal solution would be to OCR it all and have it as plain text, but that is too much work for one person to do on top of producing a standardized collection of the statutes.

Building upon Google Books.

Some months ago I finished compiling a Chronological Bibliography of British and U.K. statutes – volumes of statutes organized by regnal year or years. This is an easier way of locating (British) laws than via the other bibliographies I’ve compiled. Each link is to an openly accessible, public domain book, the majority digitized by Google, and hosted on either Google Books or the Internet Archive. In the course of searching for these I’ve been able to extend the coverage up to 1920, 10 & 11 George 5. In a very few cases, this is because I had overlooked volumes; but mostly, it is because I raised an issue via the Google Books Inquiry form.

Through this, I was able to request that the full content of out-of-copyright volumes available in ‘snippet view’ be made available. And in the vast majority of cases – just one refusal, and one request unresolved – the full text has been made available, and promptly so.

Without Google Books, and the similar Internet Archive, this project, based on nearly 200 volumes of British statutes, would not be possible. It would be just too difficult and time-consuming for a single person to approach and negotiate with however many organisations and libraries, obtain hundreds of books and digitize them, before getting to the stage I am at, of correcting the OCR’d text. This vast, free to access library of out-of-copyright and out-of-print volumes can be a foundation on which to build all sorts of historical resources, investigations and analyses.

Against this, of course, is a whole series of problems relating to how Google Books was conceived and run: as an industrial process, on a huge scale, producing a vast reservoir of data, aiming simply to get enough right, the maximum return from the smallest possible investment. This is Google Books’ literal ‘darker’ side: precarious and poorly paid workers, frequently women, frequently black.

A direct consequence of this labour-intensive, high-tempo factory system is the poor curation. There’s the notoriously poor metadata – a veritable train wreck – attached to the books; the hideous OCR, although there has been some automated correction of it; the many poor scans, distorted and obscured; the worn, worn-out books indiscriminately put through the production line.

Even worse than all these specific flaws is just how opaque the library is as a whole. There seems to be no way to comprehend it as an archive, no way to know what is in it, no way to extract subsets of books or their metadata. Even something as simple as listing all the titles in their archive for a year or decade isn’t possible. Given that search is Google’s forte, this obscurity has to be deliberate; the public-facing library is fundamentally a side product of a big (linguistic) data haul, a negotiation with the libraries that provide the books, and a swerve round the publishers that hold copyrights. (And I wonder if the absence of a list of the half million titles recently added to Google from the British Library has been contractually forbidden, perhaps under clause 4.7, restricting automated access. It’s impossible that there isn’t such a manifest, and one has been released for the Microsoft-digitized volumes held by the B.L. Of course it is possible the B.L. just doesn’t want to release it.) By contrast, the Internet Archive goes to great lengths to allow deep searches and bulk downloads of their holdings. That they take in Google’s scanned books frees them from these obstacles.

The limitations and restrictions of Google Books may well dissuade the building of projects upon it. Really, it is just a large repository of page images in PDFs without much support. But if one accepts its limitations and expects no more, it is still useful. Projects like this one can curate a subset of interrelated documents within certain parameters. Even if there is considerable work to be done, a significant part has been done. And it is better that the creation of historical archives is made by historians than corporations.

Update, 8 October 2021: StoryTracer has published a step by step guide to requesting Google Books release public domain books.

 

The post Peterloo ‘Six Acts’

2019 is the bicentenary of the Peterloo massacre, when a pro-reform demonstration in Manchester was attacked by Yeomanry and Hussars, resulting in as many as 18 protestors being killed and up to 700 more injured. (Figures are disputed: these are taken from the Peterloo Massacre website.)

If the historical event itself is well-known, the ramifications and repercussions are perhaps less so. It became a national event, with pamphlets recounting the bloodshed and condemning the government widely circulated, protests and demonstrations in support of the victims held nationwide, and reports of imminent uprising sent from all over to the Home Office.

(For an interesting way of presenting the fallout, see the Peterloo 1819 news Twitter account; a remarkable and comprehensive tracking of events.)

In response, the Government passed a series of laws – the ‘Six Acts’, as they became known – at the end of 1819, a legislative program against the democratic movement. These statutes firstly strengthened the state’s local presence by giving exceptional powers to the Justices of the Peace. The J.P.s could act in neighbouring jurisdictions, issue warrants to raid houses and oblige public meetings to be authorized. Legal procedure was quickened, to the detriment of the accused. Rights of assembly and organization were limited, public meetings and military drilling alike (giving the state a monopoly over the latter).

The last two statutes dealt with publications: the Seditious Libel Act permitting the seizure of works critical of state and church and punishing repeat offenders with banishment and transportation, and the Stamp Duties Act taxing printed works, to make them too expensive for their plebeian and proletarian audience.

Notwithstanding the few concessions wrung out of the government by the Whig opposition, these acts offer both an anatomy of, and a program against, the radical movement. They treat the movement as geographically diffuse, present all over the country, so local authorities are given powers to oppose it. Each locale is a point for gathering people together to communicate with each other, so meetings and ‘military’ associating are repressed. The locales are connected with each other, made national, through the medium of print, so publications are taxed and seized.

The acts also describe the government of the day: as fundamentally repressive and based in the final instance on brute military force, the violence of which provoked the subsequent movement.

Although the drilling provisions lasted longest in law (until 2008), the tax on print was the most repressive measure, since time limits were set on the seizure and meetings acts. However, it led to the ‘War of the Unstamped’, the refusal of publishers and vendors to pay the duty, and their willingness to go to prison for their pains. The stamp on newspapers was lowered to a penny in 1836, then abolished in 1855, thirty-six years after its passing.

If the Peterloo massacre can be fixed to a time and place, its consequences, of which these statutes are just a few, were very directly felt for years afterwards.

60 George 3 & 1 George 4 c.1: The Unlawful Drilling Act

60 George 3 & 1 George 4 c.2: The Seizure of Arms Act

60 George 3 & 1 George 4 c.4: The Misdemeanours Act

60 George 3 & 1 George 4 c.6: The Seditious Meetings Act

60 George 3 & 1 George 4 c.8: Blasphemous and Seditious Libels

60 George 3 & 1 George 4 c.9: The Newspaper and Stamp Duties Act

Standardizing Statutes

I have just added the 1689 act ‘Absence of King William’ to the statutes text section.

I took the text from Wikisource, which in turn transcribed it from the Statutes of the Realm collection, volume 6. It is also available from British History Online, which has transcribed three volumes of that series.

Statutes of the Realm is the most complete collection of pre-Union legislation available; it was commissioned to collect all the laws up to the union with Scotland, without regard to whether an act was in force or not. The act is not included in either Pickering’s or Ruffhead’s ‘Statutes At Large’ series, presumably because it had long since expired at the time those were published, and those collections were more pragmatically focused.

The text I’ve posted is different from the other transcriptions, in that I have standardized it. The Statutes of the Realm sought fidelity to the original manuscripts, reconciling the originals and the inrolled copies, noting their differences, omissions, and discrepancies, and strictly following original spellings. This makes for difficult, interrupted reading for humans; similarly, it is an obstacle to ‘distant reading’, that is, the digital analysis of large volumes of text.

Consequently, with the help of a simple line of code and a short, hand compiled list of obsolete spellings, the version I publish is readable both for people and machines.

All the changes to the text are quite minor: replacing antiquated and inconsistent spellings with regular, modern ones, often just removing a superfluous last letter (Regal for Regall, public for publick, etc.). The list of standardization couples is available on Github. It’s short, just 52 pairs, but it’s a start. I haven’t uploaded a script to utilise them yet, mainly because just one line is adequate:

while read -r n k; do sed -i.bak "s/\b$n\b/$k/g" target/*.txt; done < word-standardization-couples.txt

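This should produce corrected versions of the texts in the folder called target (insert your own path), with backup copies saved as *.txt.bak. Because sed runs once for each word pair and overwrites the backup each time, the .bak files hold the text as it stood before the last pair was applied, not the untouched originals; copy the folder beforehand if you need those.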
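A possible variant, offered as a sketch rather than the project's actual script, gathers all the pairs into one temporary sed script and applies it in a single pass, so each backup is written only once. It assumes GNU sed and a couples file with one whitespace-separated pair per line. The function name standardize is hypothetical, and words containing characters special to sed (such as / or &) would need escaping first.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sed and a couples file holding one
# whitespace-separated "old new" pair per line.

standardize() {
  local couples="$1"; shift
  local script old new
  script="$(mktemp)"
  # Turn each pair into a sed command of the form s/\bold\b/new/g.
  while read -r old new || [ -n "$old" ]; do
    if [ -n "$old" ]; then
      printf 's/\\b%s\\b/%s/g\n' "$old" "$new"
    fi
  done < "$couples" > "$script"
  # One pass over the files, so each .bak is written once, from the original.
  sed -i.bak -f "$script" "$@"
  rm -f "$script"
}

# Example: standardize word-standardization-couples.txt target/*.txt
```

The substitutions are still applied in the order they appear in the list, so the resulting texts should match those from the one-line loop; only the handling of the backups differs.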

Note this has been tested on Lubuntu 18.04 and Mac OS High Sierra; other operating systems are available.

There is obviously a great deal more to say about manipulating texts in this way, covering matters ethical, academic, technical, and typographical. For the moment I leave all that aside, but it is worth noting these issues.

A Chronological Bibliography

Following an exchange on Twitter with the Victorian Commons project, I have rejigged part of my first listing of volumes of statutes, and published a chronological bibliography of nineteenth century law.

This will make it easier to locate the texts of laws in the editions held by Google Books and the Internet Archive, as long as you know the correct calendar and regnal years for an act.

At the moment, this bibliography covers the years 1806 to 1908, but many later nineteenth century volumes are missing. These will be added as they are located, and when I have time.

 

Statutes in the Parliament.UK Digital Archive

I have recently found a new digital archive of English, British and U.K. statutes, at the parliament.uk website.

It appears to have around 1,200 items of legislation, some of which are professionally photographed manuscripts, and some of which are PDFs. The vast majority are of local acts; there are only 56 (at the time of writing) public statutes available. The reproductions of the rolls and manuscripts are of high quality, and hosted externally on a system called ‘CollectionsBase.’ There is a download button in the bottom left hand corner, which, with the ‘Gallery’ view (top right corner) allows all the pages of a document to be downloaded in .jpg format.

Unfortunately, the system for items hosted on their own site is less usable. I have not found a single PDF file with the extension .pdf, even though the links to these documents describe them as PDFs. This can cause problems with displaying the document, whether through the browser or using a desktop app, and creates work for the user in that every PDF downloaded needs to be renamed. Many local acts have the pseudo extension .local, though I have also found .South, .Western, and .Clydebank. I presume these are due to the use of multiple full stops in the file names; the processing software seems to have truncated the name at the first of them.
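The renaming can be automated. What follows is a minimal sketch, not a tested tool for this archive: it assumes the downloads sit together in one folder, and it recognises a PDF by the "%PDF" signature at the start of the file. The function name fix_pdf_extensions is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: append .pdf to downloaded files that are really PDFs.
# Assumes all the downloads are in a single folder, passed as $1.

fix_pdf_extensions() {
  local dir="$1" f
  for f in "$dir"/*; do
    [ -f "$f" ] || continue
    # Skip files that already carry the right extension.
    case "$f" in *.pdf) continue ;; esac
    # A PDF file begins with the four bytes "%PDF".
    if [ "$(head -c 4 "$f" | tr -d '\0')" = "%PDF" ]; then
      # -n avoids overwriting an existing file of the same name.
      mv -n -- "$f" "$f.pdf"
    fi
  done
}

# Example: fix_pdf_extensions ~/Downloads/acts
```

The extension is appended rather than swapped, so that whatever survives of the original title (.local, .South and so on) is kept in the file name.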

Furthermore, it is difficult to navigate the catalogue other than with the search function. This means that it is difficult to know what is generally available, such as how many enclosure acts there are, and what proportion they constitute of the total legislation passed.

However, there are ways of finding all the public and private acts using the search function. These links are on the site, but I had difficulty finding them.

Find all digitized public acts.

Find all digitized private acts.

In total, right now there are over 5,000 digitized documents. Find them all here.