AI and the law – the future is here

Artificial intelligence (AI) is already making significant inroads into the practice of law, producing efficiencies and cost savings. This article looks at how AI is being utilised in different parts of legal practice, and at the transformation already underway in the delivery of legal services, from litigation through to contract management and chat bots.

Litigation & eDiscovery 

The production of documents has traditionally been a very expensive part of the litigation process. The development of eDiscovery software tools to identify, retrieve, process, filter and search documents provides significant cost savings in the litigation process. These cost savings are even more significant when the latest software tools and the right expertise are utilised. The latest developments in the eDiscovery industry include the use of AI technology.

Early forms of AI were built into the globally dominant eDiscovery platforms. For the past 10 years these platforms have enabled document clustering and concept extraction, and over time increasingly sophisticated visualisations were included. These features were then expanded to include a ‘find similar’ function (against a document, a phrase or a sentence). About 3-4 years ago, predictive coding engines as well as sampling functions were introduced to the platforms. Predictive coding is typically referred to as technology assisted review or TAR. The use of TAR has been accepted by courts in various jurisdictions: firstly in several US cases, then in early 2016 in the UK in the High Court case Pyrrho Investments Ltd v MWB Property Ltd [2016] EWHC 256 (Ch), and finally in December 2016 in Australia in a decision of the Supreme Court of Victoria (McConnell Dowell Constructors (Aust) Pty Ltd v Santam Ltd & Ors (No 1) [2016] VSC 734), and in orders in a Federal Court of Australia matter (Money Max Int v QBE Insurance, VID513/2015) relating to the TAR algorithms used and the methodology for training and validation. In January 2017 the Supreme Court of Victoria issued a revised practice note, which included a TAR protocol (Practice Note SC GEN 5 Guidelines for the Use of Technology).

The AI tools in eDiscovery have become more important as the size of the pool of potentially relevant documents has steadily increased. A typical run-of-the-mill commercial litigation matter in 2017 starts with between 1 and 2 million documents, so by necessity a variety of filtering techniques are employed to reduce this, including the clustering and other AI tools set out below. The filtering techniques enable the costs of eDiscovery to be minimised, notwithstanding that the volume of data held by organisations is increasing.


Sampling is a tool that is extensively used for the selection of documents for training/coding. It involves selecting a sample size (a % of the corpus or a set number), after which the sampling system selects a document set that is statistically representative of the larger corpus. By using small sample sizes (e.g. 100 documents) and multiple samples (e.g. if you have 5 lawyers, create 5 samples of 100 documents, each coded for relevance to the issues in dispute), you can use the coding made against these 500 documents to provide an indication of the relevancy of the larger corpus. The more documents that are reviewed, the greater the precision of the prediction system.
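The sampling idea described above can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular eDiscovery platform works: the function names `draw_sample` and `estimate_relevance` are hypothetical, and the margin of error uses a simple normal approximation that assumes a simple random sample.

```python
import math
import random


def draw_sample(doc_ids, sample_size, seed=None):
    """Pick a simple random sample of document ids, without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(doc_ids), sample_size)


def estimate_relevance(coded, z=1.96):
    """Estimate the relevant share of the corpus from lawyer coding.

    coded: list of booleans (True = coded relevant) for the sampled documents.
    Returns (proportion, margin_of_error) using a normal approximation
    (z = 1.96 corresponds to roughly 95% confidence). Assumption: the
    sample is random and reasonably large.
    """
    n = len(coded)
    p = sum(coded) / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, margin


if __name__ == "__main__":
    sample = draw_sample(range(1_000_000), 500, seed=42)
    # Pretend the review team coded 50 of the 500 sampled documents as relevant.
    coding = [True] * 50 + [False] * 450
    share, margin = estimate_relevance(coding)
    print(f"Estimated relevant share: {share:.1%} +/- {margin:.1%}")
```

Reviewing more documents shrinks the margin of error, which is the sense in which a larger review improves the estimate.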

Technology Assisted Review - TAR

TAR can be used in many ways in large scale document review. The principle is that lawyers code a set of documents, and the TAR system then uses the lawyer coding to predict the % of relevance of similar yet uncoded documents. The TAR system also presents a set of coding anomalies where the lawyer’s coding is contradictory on similar documents. In some circumstances, you can use this mechanism to complete a production where you are producing documents that have not been reviewed. In this circumstance, you would have TAR protocols in place with the other parties in which you agree the size of the sample or training set that is coded, the % of recall and the % of precision.
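Recall and precision, the two measures typically agreed in a TAR protocol, have standard definitions that are easy to compute once a validation set has been coded by lawyers. The sketch below is illustrative; the function name `precision_recall` and the document ids are hypothetical.

```python
def precision_recall(predicted_relevant, truly_relevant):
    """Compare the system's predictions with lawyer coding on a validation set.

    precision = share of documents predicted relevant that really are relevant
    recall    = share of truly relevant documents that the system found
    """
    predicted = set(predicted_relevant)
    actual = set(truly_relevant)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical document ids: the system flagged 4, lawyers found 5 relevant.
    precision, recall = precision_recall({1, 2, 3, 4}, {2, 3, 4, 5, 6})
    print(f"precision={precision:.0%}, recall={recall:.0%}")
```

A protocol that sets a recall target is, in effect, agreeing how many relevant documents may be missed; a precision target limits how many irrelevant documents are produced.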

Other ways of using TAR include using it as a way of prioritising or ranking the review. The same principle is applied: you commence the review via sampling and then schedule a regular (e.g. hourly) update of the training/coding, after which the predicted percentage of relevance is recalculated. This can be a very effective way of ensuring consistency of coding decisions amongst a review team.
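The prioritised review loop can be illustrated with a deliberately simple scorer. This is a toy stand-in for a real predictive coding engine, which would use far more sophisticated models; the word-vote scoring, the function names and the sample texts are all assumptions made for illustration. Each scheduled update would re-run `rank_uncoded` with the latest coding.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(coded_docs):
    """coded_docs: list of (text, is_relevant). Counts documents per word."""
    relevant, not_relevant = Counter(), Counter()
    for text, is_relevant in coded_docs:
        target = relevant if is_relevant else not_relevant
        target.update(set(tokenize(text)))
    return relevant, not_relevant


def score(text, model):
    """Average, over the words in the document, of a smoothed estimate of
    how often that word appeared in documents coded relevant."""
    relevant, not_relevant = model
    words = set(tokenize(text))
    if not words:
        return 0.0
    votes = [
        (relevant[w] + 1) / (relevant[w] + not_relevant[w] + 2) for w in words
    ]
    return sum(votes) / len(votes)


def rank_uncoded(uncoded, coded_docs):
    """Re-train on the latest coding and put likely-relevant documents first."""
    model = train(coded_docs)
    return sorted(uncoded, key=lambda text: score(text, model), reverse=True)


if __name__ == "__main__":
    coded = [("loan default notice", True), ("lunch menu friday", False)]
    queue = ["friday lunch order", "notice of loan default"]
    for doc in rank_uncoded(queue, coded):
        print(doc)
```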

TAR is also a very useful way of helping senior lawyers conduct second round review and ensuring that all privileged or highly sensitive documents have been correctly identified and coded. Depending upon the TAR engine, you can also have many different TAR models running concurrently – for example, you may have a TAR model for privilege which uses the privilege coding field but runs against the non-responsive document set.

Other features that can be very helpful to organisations include:

  • entity extraction (person, organisation, location, currency, date/time, number); and
  • search mechanisms to find patterns for Personally Identifiable Information (PII) such as Medicare (or social security) numbers, credit card numbers, phone numbers and email addresses (note that a lot of these are strings of numbers).

It is important to note that these techniques generally only apply to text and not numbers. While these tools are very effective for emails, documents and to some extent presentations, they are not as effective for spreadsheets and drawings/plans/maps.
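Pattern searches for PII of the kind listed above are often built from regular expressions plus a validity check. The sketch below is a simplified assumption of how this might be done: the two patterns are illustrative rather than exhaustive, and the Luhn checksum (the standard check digit scheme for payment card numbers) is used to discard digit strings that merely look like card numbers.

```python
import re

# Illustrative patterns only; real tools use much broader pattern libraries.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii(text):
    """Return (kind, value) pairs for email addresses and likely card numbers."""
    hits = [("email", m.group()) for m in EMAIL_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if luhn_valid(digits):
            hits.append(("card", digits))
    return hits


if __name__ == "__main__":
    sample = "Contact jane@example.com, card 4111 1111 1111 1111."
    print(find_pii(sample))
```

The number 4111 1111 1111 1111 used here is a widely published test card number, not a real account.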

AI – differences between eDiscovery & Contract Management

While eDiscovery is a new industry that has grown to a $10 billion per annum global industry, there is now a significant revolution underway focusing on AI for contract management and due diligence in merger and acquisition legal work. The major difference in the use of AI is that in contract management the AI system is focused on identifying similar clauses from a bank of pre-trained clauses. In contrast, eDiscovery systems typically don’t utilise pre-training, as each litigation matter requires the creation of a seed set of documents relevant to the issues in that particular dispute.

AI in Contract Management

Over the past 5 years there has been a legal technology focus on developing AI-based platforms for large scale contract review for due diligence, which can also be used for contract management. There are now many technology providers competing in this market. While there are globally dominant providers, there are also many promising and innovative start-ups.

Earlier this year it was reported that JP Morgan Chase & Co had developed a program called COIN (for Contract Intelligence) which interprets commercial-loan agreements, work that had previously taken 360,000 hours of lawyers’ time annually. It was reported that the ‘software reviews documents in seconds, is less error-prone and never asks for time off’.

The way in which AI is being developed to extract value for organisations, improving performance (including financial performance that flows directly to the bottom line), is explained below.

AI extracting value

As the JP Morgan Chase & Co example illustrates, these AI platforms are designed to identify clauses or standard agreements, and generally come pre-trained with a bank of clauses or standard agreements that can be automatically generated at significantly reduced cost and applied to the particular deal. As an agreement is generated by the AI system, it utilises machine learning to automatically extract the clauses relevant to the particular situation. In addition, most of these platforms also have a custom clause feature where each client (the bulk of their clients are typically law firms) trains its own specific bank of clauses.

A major benefit of AI in contracts is that risks can be minimised by automatic auditing which ensures contracts are standardised – this is particularly important in large organisations operating across different regions or globally. Value is also extracted through, for example, improved and automatic auditing of contract terms such as billing and contract renewals, ensuring that maximum dollars are extracted in accordance with the contract.

Dashboards and visualisations

In addition to clause banks, the contract AI systems also have dashboards and visualisation mechanisms that enable insights into the corpus at a glance. For example, you are able to see which documents contain a clause, or more importantly which documents omit a clause, as well as other insights such as entity extraction (person, organisation, location, currency, date/time, numbers).

In some instances where the corpus is largely standard contracts that have little variance from the standard, you are able to use these insights very effectively, because reviewing the standard and then sampling similar copies can significantly expedite the review.

Clustering of similar contracts

Another feature is the ability to navigate a set of documents via clusters of similar documents. The similarity of these clusters is influenced by the clauses contained within the documents. Clustering can be used as a way to quickly navigate through a corpus, and this feature, coupled with the dashboards and other visualisations, enables you to rapidly gain insights into the contracts prior to actually reviewing them.
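To make the idea of clause-driven clustering concrete, the sketch below groups contracts by the overlap of the clause types identified in each. It is a simplified assumption: Jaccard similarity and a greedy single pass stand in for whatever method a commercial platform actually uses, and the contract names, clause labels and threshold are hypothetical.

```python
def jaccard(a, b):
    """Share of clause types two contracts have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_contracts(contract_clauses, threshold=0.5):
    """Greedy clustering: each contract joins the first cluster whose seed
    (first member) is at least `threshold` similar, else starts a new one."""
    clusters = []
    for name, clauses in contract_clauses.items():
        for cluster in clusters:
            seed = cluster[0]
            if jaccard(clauses, contract_clauses[seed]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


if __name__ == "__main__":
    contracts = {
        "lease_a": {"rent", "term", "renewal"},
        "lease_b": {"rent", "term", "assignment"},
        "nda_a": {"confidentiality", "term"},
    }
    print(cluster_contracts(contracts))
```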

Identification of non-standard and anomalous clauses and associated risk identification

Another very useful function is the capability to identify non-standard clauses, or clauses that have a significant variance, as well as the ability to train the system on the risk associated with variance from a standard clause. This can be an important flag as to any potential risks or issues, or as to a beneficial change that should be made to future AI generated contracts.
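One simple way to picture variance from a standard clause is as a word-level difference score. The sketch below uses `difflib.SequenceMatcher` from the Python standard library; the clause texts, the function names and the 0.2 threshold are hypothetical, and real platforms would rely on trained models rather than plain text comparison.

```python
from difflib import SequenceMatcher


def clause_variance(clause, standard):
    """Return 0.0 for an identical clause, approaching 1.0 as wording diverges.

    Compares the clauses word by word, ignoring case.
    """
    matcher = SequenceMatcher(None, standard.lower().split(), clause.lower().split())
    return 1.0 - matcher.ratio()


def flag_non_standard(clauses, standard, max_variance=0.2):
    """Return the names of clauses whose variance exceeds the threshold."""
    return [
        name
        for name, text in clauses.items()
        if clause_variance(text, standard) > max_variance
    ]


if __name__ == "__main__":
    standard = "Either party may terminate this agreement on 30 days written notice"
    found = {
        "contract_1": "Either party may terminate this agreement on 30 days written notice",
        "contract_2": "The supplier may terminate this agreement immediately without notice",
    }
    print(flag_non_standard(found, standard))
```

A risk weighting could then be attached to each flagged clause, for example by having lawyers label a sample of variants as low or high risk.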

Auto-routing of contracts to SMEs

These systems also have the capability to quickly and efficiently route or allocate documents to subject matter experts (SMEs). For example, having the ability to route real estate leases to real estate experts based upon the clauses identified in the document.
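Routing of this kind can be pictured as a lookup from identified clause types to expert teams. The sketch below is illustrative only: the routing table, team names and default queue are hypothetical, and a production system would take the clause labels from its own clause-identification step.

```python
from collections import Counter

# Hypothetical mapping from identified clause types to SME teams.
ROUTING_RULES = {
    "rent review": "real estate",
    "make good": "real estate",
    "licence grant": "intellectual property",
    "security interest": "banking and finance",
}


def route_contract(identified_clauses, rules=ROUTING_RULES, default="general commercial"):
    """Send the contract to the team matched by the most identified clauses."""
    votes = Counter(rules[c] for c in identified_clauses if c in rules)
    if not votes:
        return default
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(route_contract(["rent review", "make good", "licence grant"]))
    print(route_contract(["governing law"]))
```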

Automated report generation

As the lawyers review the document and the associated clauses that have been identified, they also make annotations and comments against the document. These can then be automatically exported and presented to clients with minimal editing.

Contract management systems

These systems and techniques can also be used for augmenting contract management systems. The key difference is scale: in typical due diligence you are looking at a selection of key contracts, whereas these systems can be utilised to give insights across all the contracts. If these systems are used within an organisation, in theory there would be a lot less variance in the pool of template or standard contracts, which enables greater focus on fine tuning the organisation’s bank of standard clauses, which in turn would result in higher accuracy and efficiency.

Other features of AI in contracts include:

  • auto identification of legislation / jurisdiction and location
  • auto identification of parties
  • entity extraction (currency / numbers)
  • key dates
  • key clauses
  • identification of non-standard and anomalous clauses/contracts and associated risk identification

Chat bots

Over the past 12 months, there have been some interesting developments with chat bots entering the legal area. A chat bot is a chatter robot, which is a type of conversational agent. It is a computer program designed to simulate a conversation with human users through textual or auditory methods. A chat bot is typically suited to simple questions and answers, such as Frequently Asked Questions (FAQ), where there is a set number of answers on topics. The bot will analyse a question and attempt to provide an answer on the related topic. Chat bots require considerable training so that the system is able to match the question text against its training set for a relevant topic.
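The question-matching step of a simple FAQ bot can be sketched as follows. This is a minimal illustration under stated assumptions: the FAQ entries and answers are placeholders, and plain word overlap stands in for the training that a real chat bot requires.

```python
import re

# Placeholder FAQ content, for illustration only.
FAQ = {
    "How do I dispute a parking ticket?": "See our guide to disputing parking fines.",
    "When is my rent due?": "See our guide to tenancy payments.",
}
FALLBACK = "Sorry, I don't have an answer for that yet."


def tokens(text):
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def answer(question, faq=FAQ, min_overlap=0.3):
    """Return the answer whose known question shares the most words with
    the user's question, or a fallback if nothing is close enough."""
    asked = tokens(question)
    best_answer, best_score = FALLBACK, 0.0
    for known_question, known_answer in faq.items():
        known = tokens(known_question)
        union = asked | known
        score = len(asked & known) / len(union) if union else 0.0
        if score > best_score:
            best_answer, best_score = known_answer, score
    return best_answer if best_score >= min_overlap else FALLBACK


if __name__ == "__main__":
    print(answer("Can I dispute my parking ticket?"))
    print(answer("What is the weather?"))
```

Word overlap breaks down quickly on free text phrased differently from the stored questions, which is the gap that NLP-powered bots aim to close.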

The more powerful chat bots are AI powered and can readily understand free text (called Natural Language Processing (NLP) or Natural Language Understanding (NLU)). These can be applied in many different scenarios – and it is easy to foresee the societal benefits they could bring, for example, to community legal aid centres, where they are able to help triage and quantify basic legal questions such as tenancy issues and debt management issues.

The next frontier is the expansion to speech - e.g. Apple Siri, Alphabet Ok Google, Microsoft Cortana, Amazon Alexa.

A very interesting, and arguably the most well-known, use of chat bots is the 'Do Not Pay' bot, which had astonishing success in the UK. In its first 21 months, it took on 250,000 cases relating to parking tickets and won 160,000 (a success rate of 64%), worth over US$4 million in parking tickets. Over the past 3 months, the service has expanded to help refugees applying for asylum in the US, the UK and Canada.

Following the recent Equifax data breach in September 2017, which is reported to have exposed the personal information of nearly half the US population (about 143 million people), the author of the Do Not Pay chat bot (Joshua Browder) has launched a chat bot designed to help people make a claim against Equifax for up to $25,000.

Developments in chat bots highlight the many innovative uses of AI in the legal industry across the spectrum - from start-ups, incumbent technology companies and law firms, through to the increasing trend for in-house counsel to deploy solutions internally. As the existing technology systems evolve and new systems emerge, we will continue to see the transformation and convergence of technology and the law, which will improve efficiency and deliver cost savings to consumers.

Matthew Golab
Gilbert + Tobin