
YARA for Python
===============

This is a Python extension that gives you access to YARA's powerful features from 
your own Python scripts. 


HOW TO BUILD
============


yara-python depends on libyara, a library that implements YARA's core functions. You
must build and install YARA in your system before building yara-python. The latest
YARA version can be downloaded from:

http://yara.googlecode.com/files/yara-1.6.tar.gz


After installing YARA you can build yara-python this way:

$ tar xzvf yara-python-1.6.tar.gz
$ cd yara-python-1.6
$ python setup.py build
$ sudo python setup.py install

You can test your installation by invoking Python and importing the YARA module:

$ python
Python 2.7.1 (r271:86832, Jun 16 2011, 16:59:05) 
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import yara
>>>

In some operating systems (e.g: Ubuntu) you can get an error message like this one:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: libyara.so.0: cannot open shared object file: No such file or directory


If you get the previous error you should add the path /usr/local/lib to the loader
configuration file:

$ sudo su
$ echo "/usr/local/lib" >> /etc/ld.so.conf
$ ldconfig


HOW TO USE
==========

Once yara-python is built and installed on your system you can use it as shown below:

import yara

Then you will need to compile your YARA rules before applying them to your data, the
rules can be compiled from a file path:

rules = yara.compile(filepath='/foo/bar/myrules')

The default argument is 'filepath', so you don't need to explicitly specify its name:

rules = yara.compile('/foo/bar/myrules')

You can also compile your rules from a file object:

fh = open('/foo/bar/myrules')
rules = yara.compile(file=fh)
fh.close()

Or you can compile them directly from a Python string:

rules = yara.compile(source='rule dummy { condition: true }')

If you want to compile a group of files or strings at the same time you can do it by using
the 'filepaths' or 'sources' named arguments:

rules = yara.compile(filepaths={

	'namespace1':'/my/path/rules1',
	'namespace2':'/my/path/rules2'
})

rules = yara.compile(sources={

	'namespace1':'rule dummy { condition: true }',
	'namespace2':'rule dummy { condition: false }'
})

Notice that both 'filepaths' and 'sources' must be dictionaries with keys of string type. The dictionary
keys are used as a namespace identifier, allowing to differentiate between rules with the same name in
different sources, as occurs in the second example with the “dummy” name.

The compile method also have an optional boolean parameter 'includes' which allows you to control
whether or not the include directive should be accepted in the source files, for example:

rules = yara.compile('/foo/bar/myrules', includes=False)

If the source file contains include directives the previous line would raise an exception.

If you are using external variables in your rules you must define those externals variables either while
compiling the rules, or while applying the rules to some file. To define your variables at the moment of
compilation you should pass the 'externals' parameter to the compile method. For example:

rules = yara.compile( '/foo/rules', 
				   externals= {
						'var1': 'some string',
						'var2': 4,
						'var3': True
				   })

The 'externals' parameter must be a dictionary with the names of the variables as keys and an associated
value of either string, integer or boolean type.

In all cases compile returns an instance of the class Rules, which in turn has a match method:

matches = rules.match('/foo/bar/myfile')

But you can also apply the rules to a Python string:

f = fopen('/foo/bar/myfile', 'rb')

matches = rules.match(data=f.read())

As in the case of compile, the 'match' method can receive definitions for externals variables in the externals
parameter.

matches = rules.match( '/foo/bar/myfile', 
				   externals= {
						'var1': 'some other string',
						'var4': 100,
				   })

Externals variables defined during compile-time don’t need to be defined again in subsequent invocations of
'match' method. However you can redefine any variable as needed, or provide additional definitions that weren’t
provided during compilation.

You can also specify a callback function when invoking match method. The provided function will be called for
every rule, no matter if matching or not. Your callback function should expect a single parameter of dictionary
type, and should return CALLBACK_CONTINUE to proceed to the next rule or CALLBACK_ABORT to stop applying rules to
your data.


Here is an example:

import yara

def mycallback(data):
	print data
	yara.CALLBACK_CONTINUE

matches = rules.match('/foo/bar/myfile', callback=mycallback)

The passed dictionary will be something like this:

{
	'tags': ['foo', 'bar'], 
	'matches': True, 
	'namespace': 'default', 
	'rule': 'my_rule', 
	'meta': {}, 
	'strings': [(81, '$a', 'abc'), (141, '$b', 'def')]
}

The 'matches' field indicates if the rules matches the data or not.

The 'strings' fields is a list of matching strings, with vectors of the form:

(<offset>, <string identifier>, <string data>)

The 'match' method returns a list of instances of the class Match. The instances of this class can be treated as text
strings containing the name of the matching rule. For example you can print them:

for m in matches:
	print "%s" % m

In some circumstances you may need to explicitly convert the instance of Match to string, for example when comparing
it with another string:

if str(matches[0]) == 'SomeRuleName':
	...

The Match class has the same attributes as the dictionary passed to the callback function:

-rule
-namespace
-meta
-tags
-strings
