Becoming a Digital Archaeologist by Gathering Early Case Intelligence

Published: Aug 08, 2025

As organizations’ volumes of ESI (electronically stored information) have grown, early case assessment (ECA) has become a kind of “digital archaeology” in search of actionable intelligence: sifting through enormous digital dig sites, finding the key players, determining the facts, unearthing the documents, and curating your case. We call this broader evolution of ECA “Early Case Intelligence” or ECI, encompassing traditional ECA, early data assessment (EDA), and review preparation.

It’s not enough for lawyers to merely understand the law and their clients; they must build this ECI competency as digital archaeologists.

To explore the set of practices referred to as “ECI” or Early Case Intelligence, we must first explore the digital archaeology that drives traditional ECA.

In this article, we review the tools and techniques for gathering ECI so that legal practitioners can work as digital archaeologists who reliably unearth actionable intelligence.

Starting with ECA and ECI

In August 2012, the American Bar Association (ABA) implemented changes to its Model Rules of Professional Conduct, making the need to maintain technology competence explicit. Since then, forty states have adopted this requirement or a variation of it.

When it comes to being an effective archaeologist, it is not a question of utilizing as many tools as possible.  Rather, it is a question of selecting the right ones to best serve your current goal, whether that’s traditional ECA, data assessment, investigation, or review preparation, then building on those steps to achieve all your project’s goals. 

Pursuing Traditional ECA

Traditional ECA is a set of tactics for reducing uncertainty about the risks and expenses of a new legal matter, so that you can make informed decisions on how to proceed.

Undertaking this requires reviewing relevant electronically stored information (ESI); as such, using eDiscovery processes is essential for effective traditional ECA.

If you don’t know a lot about the materials you’re seeking, you will want to start with one or more of the tools and techniques best suited to revealing unknowns:

  • Prevalence estimations, based on formal sampling, are used to determine how much of something is in your collection (e.g., relevant documents, privileged documents, documents requiring redaction, etc.).
  • Formal and judgmental sampling can be used to get a cross-section of the language used in your materials. These sampling techniques are used to select and analyze subsets of data, such as documents, emails, or other ESI, to make decisions about the larger data set.
  • Visualization tools (communication maps, word clouds, etc.) can reveal patterns of communication and behavior and help complete the picture of what happened.
  • Conceptual indexing features let you use concept searching to find relevant materials without knowing the best search terms, concept clustering to explore a cross-section of topics, and categorization to use a few relevant examples.
  • Entity Extraction: Newer entity extraction tools also aid in traditional ECA by helping you identify the key individuals, organizations, and locations in your matter. Analytics tools can identify and extract a broad array of information, such as people’s names, addresses, email addresses, phone numbers, organization names, etc. Entity extraction applications, which are also known as Named Entity Recognizers (NERs), apply artificial intelligence to classify various types of entities, such as a location, a person, or an organization.
  • Image Analysis: Image analysis tools allow users to find objects of interest within images or videos. Searching images for objects of interest (e.g., a person, a passport photo) often occurs when working with mobile phone data or security video. Image analysis applications work to identify objects by examining an image’s pixels. When a group of pixels forms a pattern, the application can then isolate distinct objects within the image.
  • Continuous active learning (TAR 2.0) workflows can rapidly surface relevant materials in suitable document collections.
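
To make the prevalence estimate concrete, here is a minimal Python sketch. It assumes a simple random sample has already been drawn and coded by reviewers. The sample size, the counts, the collection size, and the function names are all hypothetical, and the Wilson score interval is only one common choice for expressing the margin of error, not a method prescribed by any particular tool.

```python
import math
import random


def draw_sample(doc_ids, size, seed=0):
    """Draw a simple random sample of document IDs (seeded so it can be repeated)."""
    rng = random.Random(seed)
    return rng.sample(doc_ids, size)


def estimate_prevalence(sample_labels, z=1.96):
    """Return (point estimate, lower bound, upper bound) for the share of
    positive labels, using the Wilson score interval (z=1.96 is roughly 95%)."""
    n = len(sample_labels)
    if n == 0:
        raise ValueError("sample must not be empty")
    p = sum(sample_labels) / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)


# Hypothetical collection of 250,000 documents, identified by integer IDs.
collection_size = 250_000
sample_ids = draw_sample(list(range(collection_size)), 400)

# Hypothetical coding result: reviewers marked 60 of the 400 sampled docs relevant.
labels = [True] * 60 + [False] * 340

p, low, high = estimate_prevalence(labels)
print(f"Estimated prevalence: {p:.1%} (approx. 95% interval {low:.1%} to {high:.1%})")
print(f"Projected relevant documents: {low * collection_size:,.0f} to {high * collection_size:,.0f}")
```

The same function works for any yes/no property coded on the sample, such as privileged or requiring redaction.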

Once you understand what you’re seeking, you can transition from these initial, exploratory efforts to more targeted search and filtering efforts, which can quickly find relevant materials.  As you find relevant documents to review, you can also use thread and duplicate management tools to find related materials for context, such as with related emails, alternate drafts, etc.
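
As a rough illustration of how near-duplicate detection can work, the sketch below compares documents by the overlap of their word pairs (Jaccard similarity over two-word "shingles"). This is an assumption made for illustration, not a description of any specific product; the document IDs, texts, and threshold are hypothetical, and the all-pairs comparison shown here would be too slow for large collections, where hashing techniques such as MinHash are typically used instead.

```python
def shingles(text, k=2):
    """Break text into a set of overlapping k-word sequences."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Share of shingles two documents have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicate_pairs(docs, threshold=0.6, k=2):
    """Compare every pair of documents and keep those at or above the threshold."""
    sets = {doc_id: shingles(text, k) for doc_id, text in docs.items()}
    ids = sorted(sets)
    pairs = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            score = jaccard(sets[left], sets[right])
            if score >= threshold:
                pairs.append((left, right, round(score, 2)))
    return pairs


# Hypothetical documents: two drafts that differ by one word, plus an unrelated email.
docs = {
    "DOC-001": "Please review the attached draft agreement before Friday and send comments",
    "DOC-002": "Please review the attached draft agreement before Monday and send comments",
    "DOC-003": "Lunch order for the team meeting is due by noon",
}

pairs = near_duplicate_pairs(docs)
print(pairs)
```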

Pursuing Early Data Assessment (EDA)

If your top priority is assessing your ESI, finding individual documents is less important than ensuring a sufficiently complete collection has taken place and that any filtering applied during processing has not been excessive.  In such situations, your focus should be on seeing the big picture of your ESI collection and revealing the gaps within it:

  • Metadata filtering and visualization tools can help you assess the completeness of your ESI collection by discovering gaps in metadata values (including date ranges), generating communication maps to show the connections between custodians, and generating word clouds to show the language commonly used.
  • Concept clustering can provide a valuable overview of the content types and topics within your materials, including revealing an absence of things you expected or the presence of unnecessary things.
  • Thread and duplicate management tools can also reveal gaps requiring further collection, or they can reveal the presence of excessive near-duplicates, suggesting a collection or processing issue.
  • PII Analytics: Personally Identifiable Information (PII) includes someone’s name, social security number, passport number, driver’s license number, taxpayer identification number, patient identification number, biometric data, etc. As privacy laws and regulations spread across jurisdictions, protecting PII is becoming all the more important, because it also appears in the ESI gathered for litigation and investigations. PII analytics tools are designed to automatically identify certain kinds of PII wherever it appears in the collected ESI. These tools work in one of two ways: by pattern matching, or by algorithmic or AI analysis.
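
The pattern-matching approach to PII detection can be sketched in a few lines of Python. The patterns below are deliberately simplified assumptions covering only US-style social security numbers, email addresses, and phone numbers; real tools use far more extensive and carefully validated rules, and the document ID and text are hypothetical.

```python
import re

# Simplified, illustrative patterns (assumptions, not production-grade rules).
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def find_pii(doc_id, text):
    """Return (doc_id, kind, matched text, character offset) for every pattern hit."""
    hits = []
    for kind, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((doc_id, kind, match.group(), match.start()))
    return hits


# Hypothetical document text.
sample_text = "Contact jane.doe@example.com or 555-867-5309. SSN on file: 123-45-6789."

hits = find_pii("DOC-042", sample_text)
for hit in hits:
    print(hit)
```

Pattern matching of this kind is fast and predictable, but it can only find PII that follows a fixed format, which is one reason it is often paired with algorithmic or AI analysis.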

Formal random sampling is useful if there are disputes over the appropriate scope of preservation and collection that need to be resolved.  Sampling to estimate prevalence can be used to prioritize different sources and custodians and to estimate costs and benefits associated with specific proposed work.
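
The date-range gap check described earlier in this section can be illustrated with a short Python sketch. It assumes you can export the sent dates for one custodian’s documents; the dates, the expected collection period, and the seven-day threshold are hypothetical values chosen for the example.

```python
from datetime import date, timedelta


def find_date_gaps(sent_dates, start, end, min_gap_days=7):
    """Return (gap_start, gap_end) ranges inside [start, end] that contain
    no documents for at least min_gap_days consecutive days."""
    days_with_docs = sorted({d for d in sent_dates if start <= d <= end})
    gaps = []
    previous = start - timedelta(days=1)
    # A sentinel one day past the end lets the loop also check the final stretch.
    for current in days_with_docs + [end + timedelta(days=1)]:
        empty_days = (current - previous).days - 1
        if empty_days >= min_gap_days:
            gaps.append((previous + timedelta(days=1), current - timedelta(days=1)))
        previous = current
    return gaps


# Hypothetical custodian: daily email in early January, then nothing until February.
sent_dates = [date(2024, 1, day) for day in range(2, 11)]
sent_dates += [date(2024, 2, day) for day in range(1, 6)]

gaps = find_date_gaps(sent_dates, start=date(2024, 1, 1), end=date(2024, 2, 10))
for gap_start, gap_end in gaps:
    print(f"No documents from {gap_start} to {gap_end}")
```

A gap like this does not prove that something is missing, but it flags a period worth asking about before review begins.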

Pursuing Review Prep

When your top priority is review planning and preparation, you’ll need to understand the properties and composition of the ESI to be reviewed. That understanding can then inform the selection of effective tools and techniques for culling and the selection of effective methodologies for review.  You can leverage the following techniques:

  • Formal prevalence estimates reveal important details about a document collection, including how much relevant, privileged, or sensitive material is present. With this information, legal practitioners can evaluate potential workflows, including TAR and CAL workflows, and estimate the time and costs required more accurately.
  • Formal random sampling can be used to test classifiers, iteratively improving searches you will apply for culling, ensuring that they minimize unnecessary downstream review work and avoid missing any important materials. 
  • Searching and metadata filtering can eliminate much of the chaff without losing an unreasonable amount of the wheat, reducing all downstream review and production costs. Search and filtering tools, including newer visualization tools, help you find specific materials, identify gaps in your collection, and eliminate irrelevant materials prior to review.
  • Thread and duplicate management tools can dramatically speed up later review work, eliminating materials not requiring review and providing superior organization.
  • Semantic indexing features offer concept clustering to organize and prioritize subsequent review activity and categorization to power TAR workflows.
  • Generative AI: Generative AI (GenAI) applications can generate new content based on their understanding of existing content. Applications have been created that can respond to natural language queries, draft natural language responses, and generate images, audio, and video. Large language models (LLMs) are a kind of GenAI trained on enormous collections of text, often scraped from the Internet. LLMs allow generative AI applications to understand grammar and sentence structure, enabling them to predict the best next set of words accurately.
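
To show how a coded random sample can be used to test a culling search, here is a minimal Python sketch. It assumes each sampled document has been recorded as a pair: whether the search returned it, and whether a reviewer coded it relevant. The counts and names are hypothetical, and the results are point estimates only, so small samples will carry a wide margin of error.

```python
def search_metrics(coded_sample):
    """Summarize a coded random sample of (hit_by_search, relevant) pairs."""
    tp = sum(1 for hit, rel in coded_sample if hit and rel)          # found and relevant
    fp = sum(1 for hit, rel in coded_sample if hit and not rel)      # found but irrelevant
    fn = sum(1 for hit, rel in coded_sample if not hit and rel)      # relevant but missed
    tn = sum(1 for hit, rel in coded_sample if not hit and not rel)  # correctly culled
    return {
        # Share of relevant documents the search finds.
        "recall": tp / (tp + fn) if tp + fn else None,
        # Share of search hits that are actually relevant.
        "precision": tp / (tp + fp) if tp + fp else None,
        # Share of culled documents that were relevant after all.
        "elusion": fn / (fn + tn) if fn + tn else None,
    }


# Hypothetical sample of 500 coded documents.
coded_sample = (
    [(True, True)] * 80
    + [(True, False)] * 40
    + [(False, True)] * 20
    + [(False, False)] * 360
)

metrics = search_metrics(coded_sample)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Re-running the same measurement after each revision of the search terms gives a simple way to see whether a change reduced unnecessary review work without missing more relevant material.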

Mastering Early Case Intelligence

The early case assessment phase of discovery has evolved, encompassing three connected-but-distinct “archaeological” intelligence gathering activities: traditional early case assessment, early data assessment, and review preparation.

To pursue these three activities and their goals, practitioners can leverage the following tools and techniques: sampling; searching and filtering; structural analytics; conceptual analytics; and newer tools that use PII analytics, entity extraction, image analysis, and generative AI.

Successfully achieving your ESI goals requires leveraging the right combination of these tools and techniques based on what you seek to find and how much you already know.
