LETC - Little (eDiscovery) Engine That Could

Problem

eDiscovery engines need to solve two problems upfront: stability and scalability.

They need stability to process gigabytes of data without interruption or operator intervention. And they need scalability to cope with ever-increasing volumes of information.

Scalability means more than the ability to run on a grid of hundreds or thousands of machines. It also means that adding significant processing power should not add significantly to licensing costs.

Solution

An eDiscovery search engine has a good chance of being stable if it runs on Linux and uses open-source software components. It then bypasses the problems usually associated with consumer-oriented packages.

Moreover, Linux and open-source components allow the engine to scale on demand, using hundreds or thousands of computers in the cloud without additional license costs.

Our eDiscovery engine runs on Linux, is written in Java, and uses the Lucene search engine combined with multiple open-source libraries for file processing.
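
To make the indexing idea concrete, here is a minimal sketch of an inverted index, the data structure at the heart of search engines such as Lucene. It is not the engine's code and it does not use the Lucene API; the class name TinyIndex, its methods, and the sample file names are all hypothetical, and it relies only on the Java standard library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Toy inverted index (hypothetical, not Lucene): maps each lowercase term to the documents containing it. */
public class TinyIndex {
    private final Map<String, TreeSet<String>> postings = new HashMap<>();

    /** Splits text on non-alphanumeric characters and records each term under docId. */
    public void add(String docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    /** Returns the ids of documents containing the term, in sorted order. */
    public List<String> search(String term) {
        TreeSet<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        // Sample documents are invented for illustration only.
        index.add("memo1.txt", "Contract signed in March");
        index.add("mail7.txt", "Please review the contract draft");
        index.add("note3.txt", "Lunch on Friday");
        System.out.println(index.search("contract")); // prints [mail7.txt, memo1.txt]
    }
}
```

A production engine would first extract text from each file format with a file-processing library and would delegate tokenizing, storage, and ranking to Lucene; the sketch only shows why a term lookup stays fast as the document count grows.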

This simple interface provides a view into the concept.


The engine can serve as the core for extended functionality and can integrate with more specialized discovery systems.