Checking Linux Kernel

There are some specifics about the kernel code. Here is described what is special on kernel handling in Stanse.

For methods of how to actually run Stanse on the kernel, see Tutorials.

Preprocessed function names

Sometimes it's useful to catch macros by XML patterns. However the macros may expand to different function names or different constants, depending on the kernel configuration. This is unwanted and we workaround this in Stanse.

Since our preprocessor is run to speed up the compilation phase, the preprocessor is also used to avoid the macro expansion. To some predefined set of symbols is prepended __st_ and appended _st__ while preprocessing. The symbols in question are currently:

GFP_KERNEL
get_cpu
put_cpu
get_user
put_user
likely
unlikely
local_irq_disable
local_irq_enable
local_irq_restore
local_irq_save
lock_kernel
unlock_kernel
mutex_lock
mutex_lock_interruptible
mutex_lock_interruptible_nested
mutex_lock_killable
mutex_lock_killable_nested
mutex_lock_nested
mutex_trylock
mutex_unlock
preempt_disable
preempt_enable
rcu_read_lock
rcu_read_unlock
spin_lock
spin_lock_bh
spin_lock_irq
spin_lock_irq_nested
spin_lock_irqsave
spin_lock_irqsave_nested
spin_lock_nested
spin_trylock
spin_trylock_irq
spin_trylock_irqsave
spin_unlock
spin_unlock_bh
spin_unlock_irq
spin_unlock_irqrestore
      

It means that patterns for these functions/macros must be set down even with the 'st' encapsulation. For example to match a spin_lock function call in the code like this:

void fun(spinlock_t *lock)
{
	spin_lock(lock);
	…
	spin_unlock(lock);
}
      

The following XML pattern needs to be used:

<pattern name="lock">
  <functionCall>
    <id>__st_spin_lock_st__</id>
    <var name="A" />
  </functionCall>
</pattern>
      

Predefined symbols

Some of the symbols are predefined either because they are defined internally in the C compiler or they allow better pattern matching (e.g. likely/unlikely removal). The symbols are defined as follows:
__st_unlikely_st__(x)=x
__st_likely_st__(x)=x
__attribute__(x)=
__attribute(x)=
va_list='int'
va_arg(ap,t)=(t)0
__builtin_va_list=int
__builtin_offsetof(x,y)=1
__builtin_types_compatible_p(x,y)=0
__builtin_va_arg(ap,t)=(t)0
      

Bundled kernel checker definitions

Currently there are four XML checker definitions for the AutomatonChecker in the Stanse package. They were tested on the kernel code and found many programming errors. If you find any bugs in the XML files themselves, please report back to us. We will appreciate even changes lowering false positive rate.

kernel_locking_irq.xml

It takes care of proper locking discipline. Especially locking is taken into account here, but it aims at irq handling too.

Since we do not execute symbolically, some false positives may occur for trylocks. When a developer assigns the trylock return value to a variable and tests it afterwards, we currently don't catch it by patterns now.

Hence there are patterns for if (trylock()) direct conditions only.

kernel_pairing.xml

This one tries to catch wrong pairing of functions such as tty_port_tty_get/tty_kref_put, pci_dev_get/pci_dev_put, get_cpu/put_cpu, etc.

The automaton is defined as 3-state. Variables might be in put (initial state), gotten (resource requested) or assigned state. Depending on the function to be matches, 2 or 3 of them are used.

Functions whose return value needn't be checked against NULL use only 2 states. E.g. get_cpu/put_cpu suits this category.

On the contrary some function return values are checked against NULL. Hence the automaton needs to go through another state called assigned. tty_port_tty_get is a good example of this case. See the following example:

	void fun(void)
put	{
		void *tty;
		int cpu;

get		cpu = get_cpu();
		print("we have to be running on cpu: ", cpu);
put		put_cpu();
		…
assign[tty]	tty = tty_port_tty_get(&port);
		if (tty)
get[tty]		print("we have tty");
		else
put[tty]		print("we do NOT have tty");
put[tty]	tty_kref_put(tty);
	}
      

The first column shows a state change happening on the line it precedes.

To distinguish more cases, automata are parametrized by a variable whenever it is possible. This is performed for the tty case in our example and depicted in brackets.

kernel_atomic.xml

It checks whether non-atomic operations are executed in an atomic context. It follows up to 5 nested make-atomic functions. Those are for example spin locks and irq disabling functions.

The XML defines many non-atomic operations. They include allocations, sleeps, mapping and others.

kernel_memory.xml

The memory automaton follows memory allocations, frees and usage and tries to prove any of these is not violated. Namely it watches correct NULL pointer testing after allocations, memory leaks, a pointer dereference after free, double free, etc.