This is a bit of a follow-up to the previous post about Ada SPARK. In it I mentioned that SPARK has some limitations, that every program must terminate and it can't perform dynamic memory allocation (update: this is not quite accurate, see the comments section in the Ada SPARK post). A few months after writing about SPARK I came across Frama-C, which is a tool for doing formal verification of C code using the ANSI/ISO C Specification Language (ACSL). With ACSL you can add contracts to C functions, and Frama-C can use some of its plugins to verify that said contracts are fulfilled. This is nice since C is more powerful than SPARK, and having contracts in C fixes one of its most glaring flaws: programs tending to crash, often and spectacularly.
In order to provide a fair comparison with Ada SPARK, I have decided to implement the same integer square root functions in ACSL. Let's start with the function that uses a for loop to compute the correct value:
/*@ requires 0 <= num <= INT_MAX;
ensures 0 <= \result <= 46340; //WP doesn't need this, but it's nice for users to know
ensures \result*\result <= num < (\result+1)*(\result+1);
assigns \nothing;
*/
int isqrt(int num) {
if (num < 2) {
return num;
}
/*@ loop assigns i;
loop invariant i*i <= num;
loop variant 46340-i;
*/
for (int i = 1; i < 46340; i++) {
if (num < (i+1)*(i+1)) {
return i;
}
}
return 46340;
}
We can see that ACSL annotations start with /*@ (or //@).
For this function we have some requirements, some statements on what the function ensures for the result,
and an assigns clause.
The assigns clause is something we didn't see in Ada SPARK.
It is used to say whether the function has any side effects.
This function doesn't, so we tell frama-c
that it assigns \nothing.
The rest of the function is similar to the SPARK example,
except we have to deal with some off-by-one issues due to differences in how C and Ada loops work.
The loop has an assigns clause, just like the function.
A loop invariant allows the Weakest Predicate (WP) plugin to reason about the behavior of i
,
and the variant clause lets WP know that the loop terminates, and how it terminates.
For simple loops it can often be omitted.
Next up is the recursive version, isqrt2
:
/*@ requires 1 <= lo < hi <= 46340;
requires lo*lo <= num < hi*hi;
ensures lo <= \result < hi;
ensures \result*\result <= num < (\result+1)*(\result+1);
assigns \nothing;
*/
int isqrt2_inner(int num, int lo, int hi) {
if (lo == hi-1) {
return lo;
}
int mid = (lo + hi) / 2;
if (mid*mid > num) {
return isqrt2_inner(num, lo, mid);
} else {
return isqrt2_inner(num, mid, hi);
}
}
/*@ requires 0 <= num <= INT_MAX;
ensures 0 <= \result <= 46340;
ensures \result*\result <= num < (\result+1)*(\result+1);
assigns \nothing;
*/
int isqrt2(int num) {
if (num < 2) {
return num;
}
if (num >= 46340*46340) {
return 46340;
}
return isqrt2_inner(num, 1, 46340);
}
Again we see the similarities with SPARK.
There are no annotations inside the functions,
suggesting WP is better able to reason about the code than gnatprove
.
Now that we have some code, we'll want to verify it.
For many tasks, just using the WP plugin is enough.
It is invoked via -wp
, and options to it are given via -wp-optionname
.
Help is provided with -wp-h
.
To invoke WP via Frama-C on our example isqrt implementations, we do: frama-c -wp -wp-rte isqrt.c
.
The -wp-rte
option suppresses "[wp] Warning: Missing RTE guards",
and is typically a good idea to include.
To quote the Frama-C WP tutorial:
[The RTE] plug-in generates assertions in the program wherever a runtime error might occur. Then, the WP plug-in can try to discharge the generated assertions.
For the full program listed in the appendix, the output of frama-c
(version Chlorine-20180502) is this:
$ frama-c -wp -wp-rte isqrt.c
[kernel] Parsing isqrt.c (with preprocessing)
[rte] annotating function isqrt
[rte] annotating function isqrt2
[rte] annotating function isqrt2_inner
[rte] annotating function main
[wp] 89 goals scheduled
[wp] Proved goals: 89 / 89
Qed: 42 (4ms-96ms-392ms)
Alt-Ergo: 47 (12ms-262ms-1.1s) (131)
This program is quite short and simple, so proving it is not such a huge task.
For larger programs things can quickly become tricky.
See for example
this paper
from the people behind ACSL by Example,
which formally proves a specialization of std::unique_copy()
.
Ghost code
Just like with SPARK, it is possible to use so-called "ghost code" to perform special kinds of tasks during verification. One thing I have found convenient is using ghost code to implement unit tests:
int main(void) {
// This ghost code acts like a unit test
/*@ ghost
int a0 = isqrt(0) - isqrt2(0);
int a1 = isqrt(1) - isqrt2(1);
int a2 = isqrt(2) - isqrt2(2);
int a3 = isqrt(3) - isqrt2(3);
int a4 = isqrt(4) - isqrt2(4);
int a5 = isqrt(46340*46340-1) - isqrt2(46340*46340-1);
int a6 = isqrt(46340*46340) - isqrt2(46340*46340);
int a7 = isqrt(46340*46340+1) - isqrt2(46340*46340+1);
int a8 = isqrt(INT_MAX) - isqrt2(INT_MAX);
*/
//@ assert a0 == 0;
//@ assert a1 == 0;
//@ assert a2 == 0;
//@ assert a3 == 0;
//@ assert a4 == 0;
//@ assert a5 == 0;
//@ assert a6 == 0;
//@ assert a7 == 0;
//@ assert a8 == 0;
return 0;
}
A more comprehensive test might ensure that \forall integer x; 0 <= x <= INT_MAX ==> isqrt(x) == isqrt2(x), using a function like this:
/*@ requires 0 <= num <= INT_MAX;
ensures \result == 0;
assigns \nothing;
*/
int diff(int num) {
return isqrt(num) - isqrt2(num);
}
How to actually prove the postcondition in this contract is left as an exercise for the reader. There might also be a smarter way of putting a "test" like this in the code without having to have an actual function, but I do not know how yet.
Dealing with unproven goals
If you have a problem proving something, you can tell WP to output its reasoning for each goal into a directory with -wp-out
:
$ frama-c -wp -wp-rte -wp-out goals isqrt.c
[kernel] Parsing isqrt.c (with preprocessing)
[rte] annotating function isqrt
[rte] annotating function isqrt2
[rte] annotating function isqrt2_inner
[rte] annotating function main
[wp] 46 goals scheduled
[wp] Proved goals: 46 / 46
Qed: 24 (4ms-16ms-68ms)
Alt-Ergo: 22 (16ms-34ms-148ms) (62)
$ ls goals/typed/
isqrt2_call_isqrt2_inner_pre_2_Alt-Ergo.mlw isqrt2_post_Alt-Ergo.mlw
isqrt2_call_isqrt2_inner_pre_2_Alt-Ergo.out isqrt2_post_Alt-Ergo.out
isqrt2_call_isqrt2_inner_pre_2.ergo isqrt2_post.ergo
isqrt2_inner_assert_rte_signed_overflow_2_Alt-Ergo.mlw isqrt_assert_rte_signed_overflow_2_Alt-Ergo.mlw
isqrt2_inner_assert_rte_signed_overflow_2_Alt-Ergo.out isqrt_assert_rte_signed_overflow_2_Alt-Ergo.out
isqrt2_inner_assert_rte_signed_overflow_2.ergo isqrt_assert_rte_signed_overflow_2.ergo
[...]
isqrt2_post_2_Alt-Ergo.mlw isqrt_post_Alt-Ergo.mlw
isqrt2_post_2_Alt-Ergo.out isqrt_post_Alt-Ergo.out
isqrt2_post_2.ergo isqrt_post.ergo
We can see that WP made use of Qed and Alt-Ergo to successfully prove all goals.
Let's take a look at isqrt_post.ergo
:
(* ---------------------------------------------------------- *)
(* --- Post-condition (file isqrt.c, line 4) in 'isqrt' --- *)
(* ---------------------------------------------------------- *)
goal isqrt_post:
forall i_3,i_2,i_1,i : int.
let x = (1 + i_1) : int in
let x_1 = (i_1 * i_1) : int in
let x_2 = to_sint32(x) : int in
let x_3 = (x_2 * x_2) : int in
(0 <= i) ->
(i <= 2147483647) ->
is_sint32(i) ->
is_sint32(i_1) ->
is_sint32(i_2) ->
is_sint32(i_3) ->
(((i <= 1) -> (i_1 = i)) and
((2 <= i) ->
(((i_3 * i_3) <= i) and
(((46340 <= i_3) -> (i_1 = 46340)) and
((i_3 <= 46339) ->
((i_2 = i_1) and (i_3 = i_1) and (i_3 = i_2) and (i < (x * x)) and
(i_1 <= 2147483646) and ((-2147483648) <= x_1) and
(x_1 <= 2147483647) and ((-2147483648) <= x_3) and
(x_3 <= 2147483647))))))) ->
((0 <= i_1) and (i_1 <= 46340))
In many cases it is possible, if you squint just right, to see why Alt-Ergo is not "connecting the dots". It might be a cast that's confusing it, or it might not be able to rule out that two pointers don't alias. In this case everything is fine.
Issues
There are some pain points I have discovered, which I will list in this section.
Strings and aliasing
Lately I've been struggling with string manipulation in ACSL, for example proving that a function which tokenizes a string into an array of tokens behaves correctly. Part of that work is hampered by WP (or Alt-Ergo?) seemingly not being able to keep track of whether char pointers alias or not. Despite putting a precondition that all pointers given must not alias (\separated), string manipulations inside the function leads to WP/Alt-Ergo thinking that they may suddenly possibly have become aliased. It also commonly thinks pointers to the heap, .bss or .data areas may alias data on the stack, which is highly annoying.
It is possible I'm doing something wrong of course, but I felt I should at least put here that aliasing can be a bit of a pain, which makes manipulating C strings difficult.
Termination
By default Frama-C assumes all functions terminate. It is possible to specify under which conditions a function might not terminate, which is useful in some cases. It is also possible to fulfill a contract without the function ever actually returning. For example, this is valid:
/*@ requires 1 <= lo < hi <= 46340;
requires lo*lo <= num < hi*hi;
ensures lo <= \result < hi;
ensures \result*\result <= num < (\result+1)*(\result+1);
assigns \nothing;
*/
int isqrt2_inner(int num, int lo, int hi) {
//infinite recursion!
return isqrt2_inner(num, lo, hi);
}
I have not yet found any way to have frama-c
complain about possible infinite recursion.
Verification times
One thing to be aware of is that compared to compilation, running Frama-C can take quite a while:
$ time gcc isqrt.c
real 0m0,086s
user 0m0,060s
sys 0m0,016s
$ time frama-c -wp -wp-rte -wp-out goals isqrt.c > /dev/null
real 0m12,766s
user 0m20,500s
sys 0m1,300s
Even for small programs this quickly adds up to minutes. Putting functions into their own compilation/verification units and using a competent build system is therefore a good idea.
Conclusions
Frama-C can be used to verify C code, which is useful in many cases. Doing so takes considerable developer effort, but provides peace of mind. Libraries with simple APIs that are used by many people are a good fit, such as libc.
Frama-C is not a panacea, since it says nothing about how long a program might take to execute. Infinite recursion is one example of this. Another could be certain degenerate files generated by fuzzers, which might cause a program to do much more work than usual.
Finally, it might not be feasible to define the entire behavior of a program or sets of programs using ACSL. For example, that compressing and decompressing a file results in the same output as input. For that it is probably better to use functional tests.
Comments
I got some comments when I submitted this post to the [Frama-c-discuss] mailing list, click here for my initial post and here to see the entire thread. Virgile Prevosto had probably the most relevant comment:
Hello Tomas,
Thanks for sharing your experience. It is always interesting for us to have
feedback on how Frama-C gets used in order to understand where improvements
would be the most useful(*)
In particular, your point about separation of strings warrant some thinking
about \separated hypotheses that can be safely made by the memory models of
WP. Regarding termination of recursive functions, this is indeed not yet
implemented by WP, but ACSL introduces a decreases clause, which is similar
to a loop variant: any recursive call must be done on a strictly smaller
'decreases' term than its direct caller, and that term must stay positive.
Currently, WP will emit a warning message if it encounters a decreases
clause, saying that this is unsupported. However, for single recursive
functions, a simple syntactic transformation should be able to generate an
ACSL assertion at each recursive call from the content of the decreases
clause. For mutual recursion, things are slightly more tricky (you have to
identify all functions involved in the recursion), but this should be
doable.
Finally, the point about verification time is just a matter of perspective:
just wait until Frama-C get a proper C++ front-end, and I'm sure we will be
able to have a smaller execution time than the (C++) compiler 😝. On a more
serious note, as was highlighted by one of our partners, the comparison
should rather be done with the time taken by running a test suite with an
extremely high level of coverage, a task which is not exactly free. Of
course, this does not preclude us from trying to achieve better
performances, but formal verification is unlikely to become costless at
some point.
Thanks again for your post, and feel free to ask further questions on
Frama-C.
Best regards,
Appendix
isqrt.c
:
#include <limits.h>
/*@ requires 0 <= num <= INT_MAX;
ensures 0 <= \result <= 46340; //WP doesn't need this, but it's nice for users to know
ensures \result*\result <= num < (\result+1)*(\result+1);
assigns \nothing;
*/
int isqrt(int num) {
if (num < 2) {
return num;
}
/*@ loop assigns i;
loop invariant i*i <= num;
loop variant 46340-i;
*/
for (int i = 1; i < 46340; i++) {
if (num < (i+1)*(i+1)) {
return i;
}
}
return 46340;
}
/*@ requires 1 <= lo < hi <= 46340;
requires lo*lo <= num < hi*hi;
ensures lo <= \result < hi;
ensures \result*\result <= num < (\result+1)*(\result+1);
assigns \nothing;
*/
int isqrt2_inner(int num, int lo, int hi) {
if (lo == hi-1) {
return lo;
}
int mid = (lo + hi) / 2;
if (mid*mid > num) {
return isqrt2_inner(num, lo, mid);
} else {
return isqrt2_inner(num, mid, hi);
}
}
/*@ requires 0 <= num <= INT_MAX;
ensures 0 <= \result <= 46340;
ensures \result*\result <= num < (\result+1)*(\result+1);
assigns \nothing;
*/
int isqrt2(int num) {
if (num < 2) {
return num;
}
if (num >= 46340*46340) {
return 46340;
}
return isqrt2_inner(num, 1, 46340);
}
int main(void) {
// This ghost code acts like a unit test
/*@ ghost
int a0 = isqrt(0) - isqrt2(0);
int a1 = isqrt(1) - isqrt2(1);
int a2 = isqrt(2) - isqrt2(2);
int a3 = isqrt(3) - isqrt2(3);
int a4 = isqrt(4) - isqrt2(4);
int a5 = isqrt(46340*46340-1) - isqrt2(46340*46340-1);
int a6 = isqrt(46340*46340) - isqrt2(46340*46340);
int a7 = isqrt(46340*46340+1) - isqrt2(46340*46340+1);
int a8 = isqrt(INT_MAX) - isqrt2(INT_MAX);
*/
//@ assert a0 == 0;
//@ assert a1 == 0;
//@ assert a2 == 0;
//@ assert a3 == 0;
//@ assert a4 == 0;
//@ assert a5 == 0;
//@ assert a6 == 0;
//@ assert a7 == 0;
//@ assert a8 == 0;
return 0;
}