../
2025-05-31
In the realm of system programming, memory safety has always been a nightmare for C/C++ programmers. A simple malloc or free can bury hidden dangers leading to program crashes, data corruption, or even security vulnerabilities. Today, we will explore how to use Haskell to tame these memory beasts, achieving memory safety guarantees similar to Rust.
First, let’s start with some unsafe C code. Below are three functions: creating a matrix, filling the matrix with random floating-point numbers, and matrix multiplication. Since the matrix is created using malloc, it can be destroyed directly using free.
#include <stdlib.h>
#include <time.h>
#include <openblas/cblas.h>
#define N 1000
double* new_matrix() {
double* mat = (double*)malloc(N * N * sizeof(double));
return mat;
}
void fill_matrix(double* mat) {
for (int i = 0; i < N; i++) {
for (int j = 0; j < N; j++) {
mat[i*N + j] = (double)rand() / RAND_MAX;
}
}
}
// Calculate c = a * b
void mat_mul(double* c, double* a, double* b) {
cblas_dgemm(
CblasRowMajor,
CblasNoTrans,
CblasNoTrans,
N, N, N, 1.0,
a, N, b, N, 1.0, c, N);
}
Then we write a main function:
int main() {
double *a, *b, *c;
srand(time(NULL));
a = new_matrix();
b = new_matrix();
c = new_matrix();
fill_matrix(a);
fill_matrix(b);
mat_mul(c, a, b);
free(a);
free(b);
free(c);
return 0;
}
Although this is a very simple example, because we are using unsafe C, there are countless ways to crash the program.
For example, Double Free:
free(a);
free(a);
Or, Use After Free:
free(a);
fill_matrix(a);
Or Memory Leaks due to forgetting to free, and so on.
Through Haskell’s FFI (Foreign Function Interface), we can call these functions within Haskell.
First, enable the FFI extension:
{-# LANGUAGE ForeignFunctionInterface #-}
Import some necessary modules:
import Foreign.Ptr (Ptr)
import Foreign.C.Types (CDouble)
import Foreign.Marshal.Alloc (free)
Define the FFI functions:
foreign import ccall "new_matrix"
cNewMatrix :: IO (Ptr CDouble)
foreign import ccall "fill_matrix"
cFillMatrix :: Ptr CDouble -> IO ()
foreign import ccall "mat_mul"
cMatMul :: Ptr CDouble -> Ptr CDouble -> Ptr CDouble -> IO ()
Then we can call these functions:
main = do
a <- cNewMatrix
b <- cNewMatrix
c <- cNewMatrix
cFillMatrix a
cFillMatrix b
cMatMul c a b
free a
free b
free c
return ()
The main function here corresponds almost line-by-line to the C version. However, this means the same safety issues from C follow us like a shadow; all the C-related safety problems mentioned above can occur here.
But unlike C, this time, we can solve this problem.
Haskell’s Foreign.ForeignPtr provides a mechanism similar to RAII, avoiding the vast majority of memory safety issues.
Specific to the example above, first, we import some library functions:
import Foreign.ForeignPtr
Then change the definition of the pointers from raw pointers to safe ForeignPtr:
a <- cNewMatrix >>= newForeignPtr free
b <- cNewMatrix >>= newForeignPtr free
c <- cNewMatrix >>= newForeignPtr free
When using them, we use withForeignPtr to create a scope, ensuring these pointers are used safely within that scope. The complete code is as follows:
main = do
a <- cNewMatrix >>= newForeignPtr finalizerFree
b <- cNewMatrix >>= newForeignPtr finalizerFree
c <- cNewMatrix >>= newForeignPtr finalizerFree
withForeignPtr a $ \a ->
withForeignPtr b $ \b ->
withForeignPtr c $ \c -> do
cFillMatrix a
cFillMatrix b
cMatMul c a b
This way, we don’t need to worry about memory safety.
However, the method above still has limitations; the lifecycle and ownership management it achieves is very coarse.
For example, we cannot implement two objects with partially overlapping lifecycles like this:
a <- cNewMatrix
-- ..
b <- cNewMatrix
free a
-- ...
free b
Nor can we implement operations like this:
a <- cNewMatrix
if foo then
free a
else
sendToAnotherThread a
If we want fine-grained lifecycle control, we need the latest and coolest Linear Types extension:
{-# LANGUAGE LinearTypes #-}
{-# LANGUAGE QualifiedDo #-}
We also enabled the QualifiedDo extension here to use Linear.do for the linear IO monad, which will be mentioned later.
In linear types, we can define a function like this:
func :: a %1 -> b
This function signature means that in func, if the return value b is used exactly once, then the argument a must be used—and used only once. Otherwise, the compiler will report an error.
Thus, the safety issues mentioned above will trigger errors at compile time:
If we need to perform read/write operations on an object, we just need to return the object as-is after the operation completes, “pretending” the object was never consumed. This is very similar to the borrow checker rules in Rust.
For example, the matrix filling above can be written as:
a <- fillMatrix a
And matrix multiplication can be written like this:
(c, a, b) <- matMul c a b
Based on this idea, we can encapsulate our FFI matrix operations. We need to use the linear-base library, which is effectively the standard library for linear types in the Haskell community.
For IO operations, we must use the specialized Linear IO Monad. The Linear IO Monad can be interconverted with the standard Monad.
import Prelude (IO, (>>), (>>=), fmap, return)
import Prelude.Linear
import qualified System.IO.Linear as Linear
import qualified Control.Functor.Linear as Linear
data Mat where Mat :: (Ptr CDouble) -> Mat
-- Consumes the resource but returns it, effectively not consuming it
fillMat :: Mat %1-> Linear.IO Mat
fillMat (Mat ptr) = Linear.fromSystemIO $
cFillMatrix ptr >>
return (Mat ptr)
-- Same as above, looks like it consumes resources, but they are returned
matMul :: Mat %1-> Mat %1-> Mat %1-> Linear.IO (Mat, Mat, Mat)
matMul (Mat a) (Mat b) (Mat c) = Linear.fromSystemIO $
cMatMul a b c >>
return (Mat a, Mat b, Mat c)
deleteMat :: Mat %1 -> Linear.IO ()
deleteMat (Mat ptr) = Linear.fromSystemIO $ free ptr
Then we rewrite the main function. Here we use Linear.do to perform monadic operations on Linear.IO:
main = Linear.withLinearIO $ Linear.do
a <- newMatrix
b <- newMatrix
c <- newMatrix
a <- fillMat a
b <- fillMat b
(c,a,b) <- matMul c a b
deleteMat a
deleteMat b
deleteMat c
Linear.return $ Ur ()
We can try to intentionally write some memory-unsafe code inside main. We will find that no matter how we try, the compiler will report an error.
Here we used Ur, which stands for “Unrestricted”. It indicates that the value () is not linearly consumed. This is a bit complex to explain fully; if interested, you can check the documentation and tutorials for linear-base.
However, passing resources back and forth like this with linear types—writing arguments twice for every function call (once for input, once for output)—looks clumsy. Rust’s solution is to borrow a reference, perform various operations on this reference (during which the original variable is unusable), and have the original variable return when the reference goes out of scope.
In Haskell, we can achieve a similar “borrowing” effect via a function similar to withForeignPtr. Let’s name this function borrow.
To distinguish between ownership and usage rights, we introduce a new type MatRef. The Mat type indicates we have full ownership of the matrix memory, while MatRef indicates we are only temporarily “borrowing” the matrix; we can read/write to it, but cannot free it or transfer its ownership.
data Mat where Mat :: (Ptr CDouble) -> Mat
-- MatRef allows us to operate on the matrix but won't "consume" the matrix resource
data MatRef where MatRef :: (Ptr CDouble) -> MatRef
newMatrix :: Linear.IO Mat
newMatrix = Linear.fromSystemIO $ fmap Mat cNewMatrix
-- Frees matrix resource, so it consumes a Mat type value
deleteMat :: Mat %1 -> Linear.IO ()
deleteMat (Mat ptr) = Linear.fromSystemIO $ free ptr
-- Fill matrix operation, doesn't consume resource, so it accepts MatRef
fillMat :: MatRef -> IO ()
fillMat (MatRef ptr) = cFillMatrix ptr
-- Matrix multiplication, doesn't consume resource, so it accepts MatRef
matMul :: MatRef -> MatRef -> MatRef -> IO ()
matMul (MatRef a) (MatRef b) (MatRef c) = cMatMul a b c
Then we implement the borrow function, using a bit of polymorphism tech:
class Borrow io b where
borrow :: Mat %1 -> (MatRef -> io b) %1-> Linear.IO (Mat, b)
instance Borrow Linear.IO a where
borrow :: Mat %1 -> (MatRef -> Linear.IO b) %1-> Linear.IO (Mat, b)
borrow (Mat ptr) body =
body (MatRef ptr) Linear.>>= \x->
Linear.return (Mat ptr, x)
instance (a ~ ()) => Borrow IO a where
borrow :: Mat %1 -> (MatRef -> IO b) %1-> Linear.IO (Mat, b)
borrow (Mat ptr) body =
Linear.fromSystemIO (body (MatRef ptr)) Linear.>>= \x->
Linear.return (Mat ptr, x)
By adding a type constraint, we limit the IO operation inside borrow to only return unit, ensuring no reference values “escape” outside of borrow, which could lead to unsafe memory operations.
The final result is as follows:
main = Linear.withLinearIO $ Linear.do
a <- newMatrix
b <- newMatrix
c <- newMatrix
(a, (b, (c, ()))) <-
borrow a $ \a ->
borrow b $ \b ->
borrow c $ \c -> do
fillMat a
fillMat b
matMul c a b
deleteMat a
deleteMat b
deleteMat c
Linear.return (Ur ())
Inside the borrow function, we can perform operations on MatRef as many times as we want without worrying about resource issues, and the code is concise and intuitive. Outside the borrow function, we can use linear types to manage the lifecycle of resources with fine-grained safety, achieving safe zero-cost abstractions.
Through linear types, Haskell can achieve memory safety comparable to Rust, even when interacting with unsafe C code.
Up to this point, we have used purely imperative programming. Haskell also has incredibly powerful functional programming features for us to choose from, as well as type-level programming that is easier to use and more powerful than C++ template metaprogramming. In short, Haskell is truly an excellent imperative programming language.
Mistivia - https://mistivia.com