Building An Evidence Base
When litigation is taken, the person taking the case must prove their case to the court.
This can be an incredibly difficult exercise when it comes to litigating on issues concerning machine learning and automated decision-making technologies, as the specifics of how they are used and operate are often kept hidden and confidential.
When taking a case, it is necessary to show the court that what is claimed to have happened actually happened. The court will look for evidence that:
the behaviour that is being alleged to have breached human rights has happened (i.e. the conduct of the other party), for example that machine learning technologies have been developed or used in a certain way;
this behaviour has resulted in a harmful impact on an individual’s or individuals’ human rights (i.e. the type of harm suffered by the person taking the case, and where, when and how it happened). If the person is seeking financial compensation, there will need to be evidence of the loss or damage suffered by this harm;
the person or entity against whom the case has been taken was responsible for this behaviour; and
this particular behaviour has caused the harm.
The behaviour complained of might not be an action taken by the other party, but a failure to take action. This is common in cases concerning automated systems where, for example, a public body failed to take steps to safeguard rights against a harmful technology. Proving that something has not happened can sometimes be more difficult than demonstrating that it has happened.
Who has to prove the case?
It is usually up to the person taking the case to demonstrate all the elements outlined above. However, there are some exceptions.
For example, where discrimination is being alleged, once the person taking the case has shown that they were treated differently by the other party to persons in a similar position on the basis of a protected status (e.g. race, or gender), then it will be for the other party to show that there was a legitimate and reasonable explanation for this difference in treatment. It will, therefore, not be necessary for the person taking the case to show actual discrimination against a specific individual.
This happened in the case of Italian General Confederation of Labour v. Deliveroo. Once Deliveroo riders could produce evidence from which discrimination by use of an algorithm could be inferred, the burden was placed on Deliveroo to show that their behaviour was non-discriminatory.
The specifics of what needs to be shown, and the extent to which it needs to be demonstrated, can vary depending on the law that is being enforced before the court and the legal system in which the case is taken. As a result, it is always a good idea to talk to a lawyer about what evidence you might need and the best way to gather, record and store it.
In human rights cases, the rule is normally that the party taking the case must show that it is more likely than not that their human rights have been violated by the other party. This standard is lower than what needs to be proven in criminal cases, which is usually “beyond reasonable doubt.”
What evidence might be needed?
There are various forms of evidence that can be produced before a court. The key types of evidence that might be used include:
Real (tangible) evidence: this is usually a material object of some kind, that can be produced for inspection. This can be used as evidence to show that the object itself exists, or so that inferences can be drawn as to its condition. In cases concerning automated systems, this might be the physical entity that uses such machine learning (e.g. robotics technology).
Documentary evidence: this is evidence that is capable of being recorded in documentary form. It can include a whole spectrum of hard copy and digital documentation, including contracts, letters, notes, certificates, emails, reports (e.g. research and civil society reports), photographs, videos, audio files, source code, statistical models, data sets, web content, manuals and legal memos. Many courts will accept computer printouts and copies of digital evidence as originals.
Expert evidence: this takes the form of written reports from experts that are produced for submission to the court. An expert can provide their opinion on a set of facts in order to help the court understand and make sense of the facts before it. Courts may, of their own initiative, appoint a single expert to give evidence on a particular matter. This expert might, for example, help a court understand the statistical or probabilistic modelling that underpins a system under scrutiny in a particular case. In some circumstances, judges might get the experts of both sides together, a process known in some places as “hot tubbing,” to have a conversation on their area of expertise. Expert evidence has been used successfully in a UK challenge to police use of facial recognition technology. In that case, Dr. Anil Jain explained to the court (among other things) that, without access to the training data for the technology, the police would not be able to satisfy themselves that the technology was not biased on the basis of race or gender.
Witness statements and testimony: a witness statement is a written document that records the evidence (i.e. assertions of fact) of a particular person. This is the primary means by which an individual can explain to the court what has happened to them, and the harm they have sustained. It is also a means by which others can give evidence on relevant things that they saw, heard or felt. The witness statement is usually signed by the individual whose account it is, in order to attest to the truth of the statement. It may also be necessary for the witness to give their evidence orally in open court. In Italian General Confederation of Labour v. Deliveroo, witness evidence was given by a trade union leader, Silvia Simoncini, on the fact that a Deliveroo performance algorithm had deterred riders from taking strike or protest action because they were worried about how such activities would impact their future job opportunities. There are freely available resources that provide guidance on preparing witness statements, including here and here.
Broadly speaking, real and documentary evidence will carry the greatest weight before a court.
Courts will also have detailed rules around the kind of evidence they will accept in particular cases.
However, as a general rule, they will only consider evidence that is relevant to the case. The court will also want to be sure of the reliability of the evidence presented to it.
With documents, the court will want to be assured that the documents submitted are originals or authentic copies. It will also want to be sure that evidence is handled and stored in a way that preserves its integrity, authenticity and reliability, so it is always important for evidence gathering to include clear and detailed records and a paper trail showing the chain of custody of the evidence and the context in which it was created.
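One common way to support a record of integrity for digital documents is to compute a cryptographic hash of each file when it is collected and to log who handled it and in what context. The following is only a minimal sketch of that idea, not a statement of what any court requires: the function names, the log format and the fields recorded are illustrative assumptions, and the appropriate approach should be confirmed with a lawyer in the relevant jurisdiction.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(path: Path, handler: str, action: str, context: str) -> dict:
    """Build one chain-of-custody record (hypothetical field names)."""
    return {
        "file": path.name,
        "sha256": sha256_of_file(path),
        "size_bytes": path.stat().st_size,
        "handler": handler,    # who handled the file
        "action": action,      # e.g. "collected", "copied", "transferred"
        "context": context,    # how and where the file was obtained
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(),
    }


def append_to_log(entry: dict, log_path: Path) -> None:
    """Append the record as one JSON line, leaving earlier lines untouched."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


def is_unchanged(path: Path, entry: dict) -> bool:
    """Check that the file still matches the hash recorded earlier."""
    return sha256_of_file(path) == entry["sha256"]
```

In use, an entry would be created and logged each time a file is collected or changes hands, and `is_unchanged` could be run later to check that the file's contents still match what was originally recorded.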
Where do I find this evidence?
Collecting evidence to prove a case can be a long and arduous process that draws on a range of avenues of research. Sources of evidence include:
Speaking with people: the person who has suffered the legal harm is central to any human rights claim, and they will provide evidence through their own witness testimony. However, there will also be others with valuable insights and evidence to share. For instance, further evidence may be gathered by speaking with those in a similar position or who have experienced a similar harm. They may end up taking their own legal action. Talking with others can even provide insight into the specifics of the automated system being used. Take the case of Glenn Rodríguez, who was looking to challenge a risk scoring algorithm that was being relied on to deny him parole. He started speaking with other inmates and, in doing so, was able to work out the specific input that was giving him a poor risk score by comparing his input fields with the input fields of other inmates (these inputs were recorded in a questionnaire carried out by a corrections officer). He found one person who had all the same input fields as him except one, and he worked out that this data point had been incorrectly recorded by a corrections officer. He was then able to use this evidence at his next parole hearing to show the inaccuracy of the risk score.
Publicly available information: it may be that evidence can be collected through research of public records and documents. This might be done through an online search, or by trawling through other forms of media. For example, a challenge in the UK against a government algorithm used to stream visa applicants was brought after a national newspaper had uncovered its use. Marketing materials of companies developing “AI” technologies might also provide insight into how their technologies eventually get used. In 2019, a contract between a police force in Greece and a technology company for “smart policing” technology was made public by the vendor while the local police kept the contract under wraps. Public registers may also be a source of valuable evidence. In some countries there may be registers of automated decision-making systems used by government departments or of public contracts. Details might be published on the budgets of public bodies, or on the procurement or tendering of government contracts, that can give an insight into technologies being used or acquired by governments. It may even be possible to find certain policy or guidance documents online, such as impact assessments (here is one from a UK police force on live facial recognition technology). You may be surprised by what you can find freely available online. In Italian General Confederation of Labour v. Deliveroo, for example, the court cited Deliveroo’s own website several times as evidence that they used an algorithm to unlawfully discriminate against riders.
Reports: it may be possible to gather evidence from the reports of NGOs, researchers, independent experts, and government organisations. There are a number of NGOs and non-profits, for example, that publish reports on machine learning technologies, from the Algorithmic Justice League to AlgorithmWatch. It may also be possible to gather evidence from reports published by government or official bodies. In 2019, for example, the Law Society of England and Wales (a professional body that governs lawyers) published a detailed report on the use of algorithms in the criminal justice system.
Journalism and research sources: more forensic evidence gathering might require the help of an investigative journalist, technologist or computer scientist. A good place to start for finding help from journalists is the membership of the Global Investigative Journalism Network, which includes organisations that conduct investigative and data journalism. One of their members, the Bureau of Investigative Journalism, for example, currently has a project on “Decision Machines.” Many great computer scientists and researchers can also be found amongst the organisers and participants of the ACM FAccT Conference (ACM Conference on Fairness, Accountability, and Transparency).
Pre-litigation correspondence and disclosure: before taking a case, there will usually be correspondence between the person thinking of taking the case and the potential target for the litigation. This process, in itself, can be a useful means of asking for and receiving information about a system. When the case is finally taken, a process of disclosure will take place. This is where both parties are expected to share all evidence in their possession that has a bearing on the case. It is more likely than not that the party with access to details of an automated system will refuse to disclose such information because it is a “trade secret” or is otherwise protected by confidentiality. Nonetheless, the courts have mechanisms that allow confidential information to be exchanged in legal disputes while protecting it. For instance, a “confidentiality ring” allows parties to exchange information in a safe and secure environment, with access made available to a limited number of individuals under the supervision of each party. In a recent commercial dispute between Foundem and Google, for example, a court in the UK was willing to allow Foundem’s expert witness to have access to a “confidentiality ring” in order to inspect information disclosing the operation of Google's search algorithms.
Other cases: it may be that other court cases have been taken on the same or similar matters in the past. The decisions in these other cases, or the documents filed in them, can be a useful source of evidence.
Other accountability mechanisms: there may be other mechanisms of accountability, other than the courts, that can provide insight into an entity and the automated systems it might be using. This could involve asking politicians to raise certain questions in parliament or at committee hearings where they involve the interrogation of individuals or entities using certain systems. Regulatory bodies can also play a fact-finding role in investigating the legal and ethical implications of certain technologies. In 2020, for example, the controversial facial recognition technology company, Clearview AI, was subject to a joint investigation by data protection authorities in the UK and Australia.
Subject access requests: data protection laws often recognise the right of the individual to ask for copies of their personal data from a company or other entity. This can be a means by which an individual can find out what personal information an entity holds about them, how it is being used (including by automated means), who it is being shared with, and where the data came from. There are freely available resources on putting together such requests; one is available here.
Freedom of information requests: it is an established principle of international human rights law that countries must facilitate access to “information” held by public bodies. This is often referred to as the right to access information (ATI), or freedom of information (FOI). The type of “information” that countries must facilitate such access to has been given an incredibly broad definition and has been recognised by a number of countries to include algorithms and computer code. There are cases from France, Italy, and the US supporting this. In one case from the US, a court reasoned that computer programmes should be subject to such laws because they “preserve information and perpetuate knowledge.” If a public body refuses to disclose such information without a lawful reason, it may be necessary to challenge this refusal before a court. This was successfully done, for example, in Sweden, where a journalist took a case to an administrative court of appeal arguing that the source code of an algorithm used to process social security payments should be disclosed under freedom of information law. The court agreed. There are a number of great resources on freedom of information requests, including here. Some resources look specifically at accessing information on automated systems; see here and here.
Laws on what kind of information can be admitted as evidence in court, and on what information might be disclosed in response to subject access requests and freedom of information requests, vary from jurisdiction to jurisdiction. It is therefore always recommended to consult a local lawyer practising in the country where the legal action is to be taken before gathering evidence.
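The parole example above, in which an individual compared his questionnaire inputs with those of other inmates to find the one field that differed, can be illustrated with a short sketch. It is an assumption-laden illustration only: the field names, answers and record labels are entirely hypothetical, and the real comparison was carried out by hand through conversations rather than with code.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def differing_fields(a: Record, b: Record) -> List[str]:
    """List the field names on which two records disagree."""
    keys = sorted(set(a) | set(b))
    return [key for key in keys if a.get(key) != b.get(key)]


def near_matches(
    target: Record, others: Dict[str, Record], max_diff: int = 1
) -> List[Tuple[str, List[str]]]:
    """Find records that differ from the target on at most max_diff fields.

    Identical records are skipped, since they reveal nothing about which
    input might explain a different outcome.
    """
    results = []
    for label, record in others.items():
        diff = differing_fields(target, record)
        if 0 < len(diff) <= max_diff:
            results.append((label, diff))
    return results


if __name__ == "__main__":
    # Hypothetical questionnaire answers, invented for illustration.
    mine = {"question_1": "no", "question_2": "yes", "question_3": "low"}
    others = {
        "person_a": {"question_1": "no", "question_2": "no", "question_3": "low"},
        "person_b": {"question_1": "yes", "question_2": "no", "question_3": "high"},
    }
    for label, fields in near_matches(mine, others):
        print(label, "differs only on", fields)
```

A record that differs on a single field and leads to a different outcome points to that field as worth checking, for example for a recording error. On its own, it does not prove how the system weighs that field.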
What questions might be asked?
Gathering evidence is a long process of asking questions.
To try to gather as much information as possible about an automated system, you may try to ask a public body or (where possible, and with slight adaptation) another entity to facilitate access to the following information.
Identifying the Use of Relevant Systems
If you suspect that algorithmically driven systems are being used by an entity, you might ask questions aimed at confirming whether such systems are in fact being used by that entity and the purpose of the systems.
To do this, you may want to ask for all records, inventories, briefing notes, reports, descriptions, studies, evaluations or summaries concerning or detailing:
the algorithmically driven data systems currently (or previously) being used by the entity;
the purpose of the algorithmically driven data systems (i.e. what they do for the relevant entity); and
the technical functioning of the algorithmically driven data systems, including but not limited to details about the algorithms themselves, their software, hardware, operations, source code, models, developer documentation, operator manuals, the types of input data they use, and other relevant technical information.
Development and Procurement
It is also worth asking for information around how the system has been designed, developed and procured. In doing so, you may wish to clarify whether the relevant entity uses a commercial/off-the-shelf product or whether they have developed their own system. For example, you may ask for:
any records referencing or relating to the public process preceding the procurement or acquisition of the algorithmically driven data system, including public meeting agendas or minutes, public notices, objective setting documents, project proposals, analyses, studies, tender documents, or communications between the body and elected officials or public servants;
any contracts, agreements, or licenses related to the system or related services, including any agreements for or permission to develop, use, test, or evaluate such tools and services with any third-party vendor or consultant(s);
any documents, including data processing agreements, summaries, descriptions and visualisations, regarding the data that has been shared with third parties to develop, train or test the system and its previous iterations/prototypes, including for regression analyses.
Note: If you already have a specific system that you are trying to get further information on, then it is always worth asking for all contracts relating to that specific system. Be prepared for the entity to rely on commercial sensitivity as a reason not to release these contracts. In that case, indicate that the contract can be redacted, and that details that could still be released include the existence of a system, its purpose, the tasks being contracted, continued relationships, general descriptions, types of data being used, and how this data is shared.
Functioning of the System
It may also be worth asking for specific information around the functioning of the system, including the input data it uses and how it may profile individuals. For example, you may ask for:
all records about the traits, characteristics, or factors used to develop the data fields in the system;
all records showing the full list of data fields in the system;
all records of de-identified input data processed by the system;
all records of de-identified output data from the system;
details of the variables that form the basis of the system;
any data visualisation outputs connected to the system;
an overview about how system outputs are produced;
the source code, algorithms and algorithmic processes incorporated in the system, including mathematical descriptions of such algorithms;
any promotional material or presentation material related to the system, in particular those describing the function and purpose of the system.
Note: You may want to ask about the extent to which they have access to or control of the above information. If this information is not available to the entity, then this raises concerns over the extent to which they have oversight and control of the system.
As well as determining how the system itself functions, the request might also try to clarify how the system is used within or by the entity itself. For example, you might ask for:
any internal communications, policies, practices, procedures, reports, memoranda, and training/educational materials (e.g. slides, manuals etc.) for using the system and for collecting, storing, accessing, and sharing information, data and analysis processed by the system, including transfers to third parties;
all records of the decisions the system is asked to make or assist;
all records showing how staff use system outputs in decision-making;
information on the protocol, process or method that is adopted to inform an individual that they have been subject to a decision made or aided by the system;
any records showing which entities outside of the entity have accessed, used or requested to use the system.
Note: You may also ask more targeted questions aimed at confirming the suspected purpose or function of the system. For example: Does the system predict behaviour or actions? Is the system used to determine what resources a person will receive or how they may be treated? Is the system classifying or profiling an individual or group of people? Is the system making or assisting decisions that have an impact on the legal rights or entitlements of other individuals?
Testing and Auditing
Your request may seek information that discloses the extent to which the system is tested, audited, assessed and verified, particularly with reference to whether the system is operating within the law and in compliance with human rights. For example, you may ask for:
all assessments or evaluations of algorithmically driven data systems, including all records of audits, and internal reviews of validation studies of the system;
all impact assessments, including Data Protection Impact Assessments, Human Rights Impact Assessments, and Equality Impact Assessments, conducted in relation to the system;
summaries of all legal actions or complaints that have been taken in relation to decisions that have been made or aided by the system.
Sometimes it is best to tailor your requests to be as specific as possible, to avoid the other party arguing that they do not have capacity or resources to process a request for information in a timely manner. That means, if the name of the system is known, it is best to use that in requests.
When gathering all this information, it is always worth consulting with a lawyer, computer scientist, technologist and/or social scientist to understand the implications of the information that has been disclosed.
It is usual for evidence to be gathered right up to the filing of a case in court, and even beyond. Another thing that will need to be worked on at the same time is a strategy. You can read more about what that entails in the explainer on Working Out A Strategy.