eDiscovery skills

If you're curious about eDiscovery, what skills should you learn?

Legal technology is one subset of legal innovation.  I'd file eDiscovery under legaltech, though it could also be considered a service.  eDiscovery is a part of most major litigation.  The basic goal is for defendants to 'produce' (turn over) electronic information in accordance with the court's requirements, and for plaintiffs to find useful evidence within that data.  Recently, the Duke Conference and the 2015 Amendments to the Federal Rules of Civil Procedure (FRCP; pay special attention to Rules 16, 26, and 37(e), the last of which covers enforcement) explicitly addressed electronic evidence.  I have to say, linking to supremecourt.gov is always fun.

I asked my professor for 'homework' to do over Winter break, because I'm that sort of person.  Here is what he recommended for eDiscovery.

Learn Python

To generalize, Python is a high-level language (you write more abstract, i.e. less, code for any given task) with a large community, and it is widely considered a good language for working with data.
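To make "less code for any given task" concrete, here is a minimal sketch that finds the most common words in a piece of text.  The sample text and the function name top_words are made up for illustration; only the standard library is used.

```python
from collections import Counter
import re

def top_words(text, n=3):
    """Return the n most common words in text, ignoring case."""
    # Pull out runs of letters (and apostrophes) as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)

# Hypothetical sample text, invented for this example.
sample = (
    "The contract was signed. The contract was later amended, "
    "and the amendment was disputed."
)

print(top_words(sample))
```

Counting words by hand in a lower-level language would take noticeably more code; here the counting and the sorting are each one call.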

One benefit of Python - if you have a Mac you may already have it installed.  Search for "Terminal" in Spotlight, then type python3 --version to check.

If you've never written a line of code, I recommend starting with Zed Shaw's Learn Python the Hard Way (promise me you'll do the terminal exercises), but I've also seen people have great outcomes (i.e. going from zero software experience to a six-figure job) using Treehouse.

Read about Information Retrieval

Information Retrieval (IR for short) is the discipline of using technology to access information.  Google and other search engines are the obvious logical extension of IR.  It's funny - the intro of the book recommended below points out that the public used to prefer talking to another person to get information.  My generation, of course, prefers to interact with a machine, as it's more option-rich, usually quicker, and arguably more accurate for the bulk of retrieval activities, i.e. getting facts or data.

Here is the recommended book: Introduction to Information Retrieval by Manning, Raghavan, and Schütze of Stanford. (NB: pay attention to the prerequisites.)
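To give a feel for what IR looks like in code, here is a minimal sketch of an inverted index, a structure that maps each word to the documents containing it, plus a simple "all of these words" search.  The document ids, the text, and the function names build_index and search are all hypothetical, and a real eDiscovery tool would do far more (stemming, ranking, metadata, and so on).

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            # Strip common punctuation from the edges of each word.
            index[word.strip(".,;:!?\"'")].add(doc_id)
    return index

def search(index, query):
    """Return sorted ids of documents that contain every word in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return sorted(results)

# Hypothetical documents, invented for this example.
docs = {
    "email_1": "Please review the merger agreement before Friday.",
    "email_2": "Lunch on Friday?",
    "memo_7": "The merger closes next quarter.",
}

index = build_index(docs)
print(search(index, "merger"))
print(search(index, "merger friday"))
```

The point of building the index up front is that each search only touches the entries for the query words, rather than rereading every document.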

Take a look at Machine Learning

My understanding is that machine learning (ML) is, by analogy, "the opposite" of programming.  In programming, the maker creates something (software) that tells a machine what to do with an input (e.g. multiply the input by 2), then later passes it an input and asks for an output (e.g. input = 3, 3*2, output = 6).  In machine learning, the maker instead provides both inputs and outputs and asks the computer to create a program that conforms to them.  This is called "training the model", and as far as I'm currently aware ML is retroactive and backwards-looking, helping you answer what the effects of something were (e.g. does a yellow background increase or decrease clickthroughs?).
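The "multiply by 2" contrast can be sketched in a few lines.  In the first function the rule is written by hand; in the second, the computer is given only example inputs and outputs and works out the multiplier itself.  This uses a least-squares fit of a single weight, which is just one very simple way to "train a model" and is my choice for illustration; the names double, train, and model and the example numbers are made up.

```python
def double(x):
    """Traditional programming: the rule (multiply by 2) is written by hand."""
    return x * 2

def train(inputs, outputs):
    """Learn a weight w so that w * input best matches output.

    This is a least-squares fit of a line through the origin.
    """
    numerator = sum(x * y for x, y in zip(inputs, outputs))
    denominator = sum(x * x for x in inputs)
    return numerator / denominator

# Example input/output pairs (invented); note that 3 is not among them.
inputs = [1, 2, 4, 5]
outputs = [2, 4, 8, 10]

w = train(inputs, outputs)  # "training the model"

def model(x):
    """The learned program: apply the weight found during training."""
    return w * x

print(double(3))  # hand-written rule
print(model(3))   # learned rule, applied to an input it never saw
```

Both calls give 6 for an input of 3, but only the first one was ever told to multiply by 2.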

Learn on Coursera


It's worth pointing out that this is much more work than any normal person can accomplish in 3 weeks.