GReAT’s KLara project
To hunt for malware efficiently, one needs a large collection of samples to search over. Researchers usually need to fire a Yara rule over a collection of malicious files and then get the results back. In some cases, the rule needs adjusting.
Unfortunately, scanning a large collection of files takes time. With a custom architecture, however, scanning 10TB of files can take around 30 minutes.
Klara, a distributed system written in Python, allows researchers to run one or more Yara rules over collections of samples and to be notified by e-mail, as well as in the web interface, when scan results are ready.
- Modern web interface, allowing researchers to “fire and forget” their rules, getting back results by e-mail / API
- Powerful API, allowing for automatic submission of Yara jobs, checking their status and getting back results. API Documentation will be released soon.
- Distributed system, running on commodity hardware.
Klara leverages Yara’s power, distributing scans using a dispatcher-worker model.
Each worker server connects to a dispatcher to check whether new jobs are available. If a new job is available, the worker checks whether the required scan repository is present on its own filesystem and, if it is, starts the Yara scan with the rules submitted by the researcher.
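The worker behaviour described above can be illustrated with a minimal sketch. This is not Klara's actual code: `Job`, `Dispatcher`, `poll_once` and `list_samples` are hypothetical names, the dispatcher is an in-memory stand-in for a networked service, and the Yara scan itself is replaced by a placeholder that only lists the files that would be scanned.

```python
import os
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Job:
    job_id: int
    rules: str        # Yara rule text submitted by the researcher
    repository: str   # path of the sample collection to scan


class Dispatcher:
    """Hypothetical in-memory stand-in for the dispatcher's job queue."""

    def __init__(self) -> None:
        self._pending: List[Job] = []

    def submit(self, job: Job) -> None:
        self._pending.append(job)

    def pending(self) -> List[Job]:
        # Return a copy so callers can iterate while jobs are claimed.
        return list(self._pending)

    def claim(self, job: Job) -> None:
        self._pending.remove(job)


def list_samples(rules: str, repository: str) -> List[str]:
    # Placeholder for the real Yara scan (assumption): it ignores the rules
    # and only reports which files in the repository would be scanned.
    return sorted(
        name for name in os.listdir(repository)
        if os.path.isfile(os.path.join(repository, name))
    )


def poll_once(
    dispatcher: Dispatcher,
    scan: Callable[[str, str], List[str]] = list_samples,
) -> Optional[List[str]]:
    """One polling cycle of a worker: take the first job it can serve."""
    for job in dispatcher.pending():
        # Only accept jobs whose repository exists on this worker's filesystem.
        if os.path.isdir(job.repository):
            dispatcher.claim(job)
            return scan(job.rules, job.repository)
    return None  # nothing this worker can run right now
```

A real worker would call something like `poll_once` in a loop with a delay between attempts. Jobs whose repository is missing locally stay in the queue, so another worker holding that collection can pick them up.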
The main issue Klara tries to solve is running Yara jobs over a large collection of malware samples (>1TB) in a reasonable amount of time.
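To put the figure of 10TB in around 30 minutes quoted earlier into perspective, the short calculation below estimates the aggregate read throughput it implies. It assumes decimal units (1TB = 10^12 bytes); the per-worker throughput of 200 MB/s is a purely hypothetical value chosen for illustration, not a measurement of Klara.

```python
import math

TB = 10**12  # decimal terabyte, in bytes (assumption)
MB = 10**6


def required_throughput(total_bytes: float, seconds: float) -> float:
    """Aggregate bytes per second needed to scan total_bytes in the given time."""
    return total_bytes / seconds


def workers_needed(total_bytes: float, seconds: float, per_worker_bps: float) -> int:
    """Smallest number of workers that together reach the required throughput."""
    return math.ceil(required_throughput(total_bytes, seconds) / per_worker_bps)


aggregate = required_throughput(10 * TB, 30 * 60)
print(f"{aggregate / 10**9:.2f} GB/s")                 # 5.56 GB/s
print(workers_needed(10 * TB, 30 * 60, 200 * MB))      # 28 (hypothetical 200 MB/s per worker)
```

A sustained rate of several gigabytes per second is generally more than a single commodity machine can read and scan, which is why the work is spread across several workers.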
Please refer to instructions outlined here