LETC - Little (eDiscovery) Engine That Could
Problem
eDiscovery engines need to solve two problems upfront: stability and scalability.
They need stability to process gigabytes of data without interruption
or operator intervention. And they need scalability to cope with
ever-increasing volumes of information.
Scalability does not only mean that
the engine should be able to work on a grid of computers, utilizing hundreds or
thousands of machines. It also means that adding significant processing power
should not add significantly to licensing costs.
Solution
An eDiscovery search engine has a good chance of being stable if it runs
on Linux and uses open-source software components. It then bypasses the problems
usually associated with consumer-oriented packages.
Moreover, Linux and open-source components allow it to scale on demand, using hundreds
or thousands of computers in the cloud without additional license costs.
Our eDiscovery engine runs on Linux, is written in Java, and uses the Lucene search
engine combined with multiple open-source libraries for file processing.
This simple interface provides a view into the concept.

The engine can serve as the core
for extended functionality and can be integrated with more specialized discovery systems.
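
To make the idea of an indexing and search core more concrete, here is a minimal sketch in Java of what such a core does: it splits document text into terms, records which documents contain each term, and answers term lookups. This is not the engine's code and it does not use Lucene; the class name TinyIndex and its methods are hypothetical, and the real engine delegates this work to Lucene, which adds analyzers, ranking, and on-disk indexes.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for an indexing core (assumption: not the engine's code).
// The real engine relies on Lucene for this; only the JDK is used here.
public class TinyIndex {
    // Maps each term to the names of the documents that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Split the text into lowercase terms and record the document under each term.
    public void add(String docName, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docName);
            }
        }
    }

    // Return the sorted names of the documents containing the term (case-insensitive).
    public Set<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new TreeSet<>() : new TreeSet<>(hits);
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("memo.txt", "Merger contract signed in March");
        index.add("mail.eml", "Please review the contract draft");
        System.out.println(index.search("Contract")); // prints [mail.eml, memo.txt]
    }
}
```

In a full engine, the text passed to add would come from the file-processing libraries that extract content from each document format, and the index would be split across machines to scale out.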