Jul 5, 2025

From 'It Might Work' to 'It Will Work': Typestate in Rust

The Problem: Runtime Invariants vs Compile-Time Guarantees
Enter the Typestate Pattern
Seeing the Compiler Catch the Bug
Advanced Patterns: Session Types and Protocol Enforcement
Alternative Approach: Multiple State Types
Combining with Result Types for Fallible Transitions
Builder Pattern with Typestate
Performance Considerations
When to Use This Pattern
Limitations and Trade-offs
Conclusion

One of Rust’s greatest strengths lies not just in memory safety, but in its ability to encode business logic and invariants directly into the type system. Today, we’ll explore how to design APIs that make invalid states literally impossible to represent, pushing error detection from runtime to compile time.

The Problem: Runtime Invariants vs Compile-Time Guarantees

Consider a simple file handle API. A direct port of how most languages do it might look like this:

struct FileHandle {
    path: String,
    is_open: bool,
    content: Option<String>,
}

impl FileHandle {
    fn open(path: String) -> Result<Self, std::io::Error> {
        // Open file logic
        Ok(FileHandle {
            path,
            is_open: true,
            content: None,
        })
    }

    fn read(&mut self) -> Result<&str, std::io::Error> {
        if !self.is_open {
            return Err(std::io::Error::new(
                std::io::ErrorKind::InvalidInput,
                "Cannot read from closed file"
            ));
        }
        // Read logic...
        Ok("file content")
    }

    fn close(&mut self) {
        self.is_open = false;
        self.content = None;
    }
}

This approach has several problems:

Runtime checks for every operation
Possible inconsistent state (nothing prevents is_open from being false while content is still Some)
Easy to forget checks, leading to bugs
No compile-time guarantees about correct usage
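
To make the last point concrete, here is a minimal sketch of misuse that the compiler happily accepts. It uses a trimmed-down copy of the handle above: the content field is dropped and open never touches the disk, both simplifications for illustration. The read_after_close helper is hypothetical.

```rust
use std::io;

// Trimmed-down version of the flag-based handle, for illustration only.
struct FileHandle {
    path: String,
    is_open: bool,
}

impl FileHandle {
    // Placeholder: no real file is opened.
    fn open(path: &str) -> Self {
        FileHandle { path: path.to_string(), is_open: true }
    }

    fn read(&mut self) -> Result<&str, io::Error> {
        if !self.is_open {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "Cannot read from closed file",
            ));
        }
        Ok("file content")
    }

    fn close(&mut self) {
        self.is_open = false;
    }
}

// Hypothetical helper: compiles without complaint, fails only when run.
fn read_after_close() -> Result<String, io::Error> {
    let mut handle = FileHandle::open("test.txt");
    handle.close();
    let text = handle.read()?.to_string(); // the bug surfaces here, at runtime
    Ok(text)
}
```

Nothing in the signature of read tells the caller that it must not be called after close; the only feedback is an Err value once the program is already running.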

Enter the Typestate Pattern

The typestate pattern leverages Rust’s type system to encode object states as distinct types. Let’s redesign our file handle:

use std::fs::File;
use std::io::{self, Read};
use std::marker::PhantomData;

// State markers
struct Closed;
struct Open;

struct FileHandle<State> {
    path: String,
    file: Option<File>,
    _state: PhantomData<State>,
}

// Only closed files can be opened
impl FileHandle<Closed> {
    fn new(path: String) -> Self {
        FileHandle {
            path,
            file: None,
            _state: PhantomData,
        }
    }

    fn open(self) -> Result<FileHandle<Open>, (Self, io::Error)> {
        match File::open(&self.path) {
            Ok(file) => Ok(FileHandle {
                path: self.path,
                file: Some(file),
                _state: PhantomData,
            }),
            // `file` is already None in the Closed state, so the handle
            // can be handed back unchanged alongside the error
            Err(e) => Err((self, e)),
        }
    }
}

// Only open files can be read or closed
impl FileHandle<Open> {
    fn read(&mut self) -> Result<String, io::Error> {
        // No need to check if the file is open - it's guaranteed by the type!
        // FileHandle<Open> is only ever built by open(), which always stores Some(file)
        let file = self.file.as_mut().expect("File must exist in Open state");
        let mut contents = String::new();
        file.read_to_string(&mut contents)?;
        Ok(contents)
    }

    fn close(self) -> FileHandle<Closed> {
        // Only `path` is moved into the new handle; the old `file` is dropped
        // when this function returns, which closes the underlying OS handle
        FileHandle {
            path: self.path,
            file: None,
            _state: PhantomData,
        }
    }
}

Now, attempting to read from a closed file is a compile-time error, and we get true resource safety:

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = FileHandle::new("test.txt".to_string());

    // This won't compile!
    // file.read(); // Error: no method named `read` found for `FileHandle<Closed>`
    // open() returns (handle, error) on failure; keep only the error for `?`
    let mut open_file = file.open().map_err(|(_, e)| e)?;
    let content = open_file.read()?; // This works and actually reads the file!
    println!("File content: {}", content);
    let closed_file = open_file.close(); // File resource is properly cleaned up

    // This won't compile either!
    // closed_file.read(); // Error: no method named `read` found for `FileHandle<Closed>`

    Ok(())
}
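
One variation worth sketching: the Option<File> field and the expect call can be avoided entirely by storing the File inside the state type itself. The layout below (a state: S field and an Open struct that owns the file) is an assumption made for illustration, not the design used above.

```rust
use std::fs::File;
use std::io::{self, Read};

// States carry their own data: only `Open` owns a `File`.
struct Closed;
struct Open {
    file: File,
}

struct FileHandle<S> {
    path: String,
    state: S,
}

impl FileHandle<Closed> {
    fn new(path: String) -> Self {
        FileHandle { path, state: Closed }
    }

    // On failure, hand the closed handle back together with the error.
    fn open(self) -> Result<FileHandle<Open>, (Self, io::Error)> {
        match File::open(&self.path) {
            Ok(file) => Ok(FileHandle { path: self.path, state: Open { file } }),
            Err(e) => Err((self, e)),
        }
    }
}

impl FileHandle<Open> {
    fn read(&mut self) -> io::Result<String> {
        let mut contents = String::new();
        // No Option to unwrap: an `Open` state always contains a file.
        self.state.file.read_to_string(&mut contents)?;
        Ok(contents)
    }

    // The `File` inside the old state is dropped when this function
    // returns, which closes it.
    fn close(self) -> FileHandle<Closed> {
        FileHandle { path: self.path, state: Closed }
    }
}
```

With this shape there is no field that is only sometimes valid, so the struct cannot disagree with its own type parameter.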

Seeing the Compiler Catch the Bug

Let’s see exactly what happens when we try to misuse our API. If we uncomment that first read() call:

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = FileHandle::new("test.txt".to_string());
    file.read(); // Oops!
    Ok(())
}

The compiler immediately catches this:

error[E0599]: no method named `read` found for struct `FileHandle<Closed>` in the current scope
 --> src/main.rs:4:10
  |
4 |     file.read();
  |          ^^^^ method not found in `FileHandle<Closed>`
  |
  = note: the method was found for
          - `FileHandle<Open>`

Similarly, if we try to read after closing:

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = FileHandle::new("test.txt".to_string());
    let open_file = file.open().map_err(|(_, e)| e)?;
    let closed_file = open_file.close();
    closed_file.read(); // Oops!
    Ok(())
}

We get:

error[E0599]: no method named `read` found for struct `FileHandle<Closed>` in the current scope
 --> src/main.rs:5:17
  |
5 |     closed_file.read();
  |                 ^^^^ method not found in `FileHandle<Closed>`
  |
  = note: the method was found for
          - `FileHandle<Open>`

The compiler not only tells us what’s wrong, but helpfully points out that read does exist—just not for the state we’re in!

Advanced Patterns: Session Types and Protocol Enforcement

Let’s explore a more complex example: a TCP connection state machine.

// Connection states
struct Disconnected;
struct Connected;
struct Authenticated;

struct TcpConnection<State> {
    address: String,
    _state: PhantomData<State>,
}

impl TcpConnection<Disconnected> {
    fn new(address: String) -> Self {
        TcpConnection {
            address,
            _state: PhantomData,
        }
    }

    fn connect(self) -> Result<TcpConnection<Connected>, (Self, std::io::Error)> {
        // Connection logic
        Ok(TcpConnection {
            address: self.address,
            _state: PhantomData,
        })
    }
}

impl TcpConnection<Connected> {
    fn authenticate(self, credentials: &str) -> Result<TcpConnection<Authenticated>, Self> {
        // Authentication logic
        if credentials == "valid" {
            Ok(TcpConnection {
                address: self.address,
                _state: PhantomData,
            })
        } else {
            Err(self)
        }
    }

    fn disconnect(self) -> TcpConnection<Disconnected> {
        TcpConnection {
            address: self.address,
            _state: PhantomData,
        }
    }
}

impl TcpConnection<Authenticated> {
    fn send_data(&self, data: &[u8]) -> Result<(), std::io::Error> {
        // Can only send data when authenticated
        println!("Sending {} bytes", data.len());
        Ok(())
    }

    fn disconnect(self) -> TcpConnection<Disconnected> {
        TcpConnection {
            address: self.address,
            _state: PhantomData,
        }
    }
}

This design enforces a strict protocol:

Must connect before authenticating
Must authenticate before sending data
Can disconnect from any connected state
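
Here is a small sketch that walks this protocol end to end. It restates the connection type compactly so it can run on its own, and makes two changes that are assumptions for illustration: a hypothetical Online marker trait lets disconnect be written once for both connected states, and send_data returns the byte count instead of printing. The into_state and run_session helpers are also hypothetical.

```rust
use std::io;
use std::marker::PhantomData;

struct Disconnected;
struct Connected;
struct Authenticated;

// Hypothetical marker trait: states in which a live connection exists.
trait Online {}
impl Online for Connected {}
impl Online for Authenticated {}

struct TcpConnection<State> {
    address: String,
    _state: PhantomData<State>,
}

impl<State> TcpConnection<State> {
    // Private helper: re-tag the same data with a different state.
    fn into_state<Next>(self) -> TcpConnection<Next> {
        TcpConnection { address: self.address, _state: PhantomData }
    }
}

impl TcpConnection<Disconnected> {
    fn new(address: String) -> Self {
        TcpConnection { address, _state: PhantomData }
    }

    fn connect(self) -> Result<TcpConnection<Connected>, (Self, io::Error)> {
        Ok(self.into_state()) // placeholder: no real socket is opened
    }
}

impl TcpConnection<Connected> {
    fn authenticate(self, credentials: &str) -> Result<TcpConnection<Authenticated>, Self> {
        if credentials == "valid" { Ok(self.into_state()) } else { Err(self) }
    }
}

impl TcpConnection<Authenticated> {
    fn send_data(&self, data: &[u8]) -> Result<usize, io::Error> {
        Ok(data.len()) // placeholder: report the byte count instead of sending
    }
}

// Written once, available in every `Online` state.
impl<S: Online> TcpConnection<S> {
    fn disconnect(self) -> TcpConnection<Disconnected> {
        self.into_state()
    }
}

// Walks the protocol end to end; returns the number of bytes "sent".
fn run_session(credentials: &str) -> Option<usize> {
    let conn = TcpConnection::new("127.0.0.1:8080".to_string());
    let conn = conn.connect().ok()?;
    match conn.authenticate(credentials) {
        Ok(session) => {
            let sent = session.send_data(b"hello").ok()?;
            let _closed = session.disconnect();
            Some(sent)
        }
        Err(still_connected) => {
            // Authentication failed, but the connection is handed back.
            let _closed = still_connected.disconnect();
            None
        }
    }
}
```

Calling send_data on the value returned by connect would not compile, because that method only exists on TcpConnection<Authenticated>.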

Alternative Approach: Multiple State Types

For more complex state machines, we can additionally bound the state parameter with a marker trait, so that only types we designate as states can be plugged into the parser:

trait ParserState {}

struct Initial;
struct ReadingString;
struct ReadingNumber;
struct Complete;
struct Error;

impl ParserState for Initial {}
impl ParserState for ReadingString {}
impl ParserState for ReadingNumber {}
impl ParserState for Complete {}
impl ParserState for Error {}

struct Parser<S: ParserState> {
    input: String,
    position: usize,
    _state: PhantomData<S>,
}

impl Parser<Initial> {
    fn new(input: String) -> Self {
        Parser {
            input,
            position: 0,
            _state: PhantomData,
        }
    }

    fn start_string(self) -> Parser<ReadingString> {
        Parser {
            input: self.input,
            position: self.position,
            _state: PhantomData,
        }
    }

    fn start_number(self) -> Parser<ReadingNumber> {
        Parser {
            input: self.input,
            position: self.position,
            _state: PhantomData,
        }
    }
}

impl Parser<ReadingString> {
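    fn read_char(&mut self) -> Option<char> {
        // `position` is a byte offset, so result() can slice on it safely
        // even when the input contains multi-byte characters
        let c = self.input[self.position..].chars().next()?;
        self.position += c.len_utf8();
        Some(c)
    }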

    fn finish(self) -> Parser<Complete> {
        Parser {
            input: self.input,
            position: self.position,
            _state: PhantomData,
        }
    }
}

impl Parser<Complete> {
    fn result(&self) -> &str {
        &self.input[..self.position]
    }
}
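
Because ParserState is an ordinary trait, code in another crate could implement it for a type of its own. If that matters for your API, one common way to close the door is the sealed-trait idiom, sketched below on a reduced version of the parser. The sealed module and the consume helper are hypothetical names, and position is treated as a byte offset here.

```rust
use std::marker::PhantomData;

mod sealed {
    // Lives in a private module, so code outside this crate cannot name it
    // and therefore cannot implement `ParserState`.
    pub trait Sealed {}
}

// Public trait, but only types that are also `Sealed` can implement it.
pub trait ParserState: sealed::Sealed {}

pub struct Initial;
pub struct ReadingString;
pub struct Complete;

impl sealed::Sealed for Initial {}
impl sealed::Sealed for ReadingString {}
impl sealed::Sealed for Complete {}

impl ParserState for Initial {}
impl ParserState for ReadingString {}
impl ParserState for Complete {}

pub struct Parser<S: ParserState> {
    input: String,
    position: usize, // byte offset into `input`
    _state: PhantomData<S>,
}

impl Parser<Initial> {
    pub fn new(input: String) -> Self {
        Parser { input, position: 0, _state: PhantomData }
    }

    pub fn start_string(self) -> Parser<ReadingString> {
        Parser { input: self.input, position: self.position, _state: PhantomData }
    }
}

impl Parser<ReadingString> {
    pub fn read_char(&mut self) -> Option<char> {
        let c = self.input[self.position..].chars().next()?;
        self.position += c.len_utf8();
        Some(c)
    }

    pub fn finish(self) -> Parser<Complete> {
        Parser { input: self.input, position: self.position, _state: PhantomData }
    }
}

impl Parser<Complete> {
    pub fn result(&self) -> &str {
        &self.input[..self.position]
    }
}

// Hypothetical helper: read up to `n` characters and return what was consumed.
fn consume(input: &str, n: usize) -> String {
    let mut parser = Parser::new(input.to_string()).start_string();
    for _ in 0..n {
        if parser.read_char().is_none() {
            break;
        }
    }
    parser.finish().result().to_string()
}
```

The set of states is now fixed by the crate that defines the parser, which keeps the state machine closed.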

Combining with Result Types for Fallible Transitions

Real-world state machines often have fallible transitions. We can combine our typestate pattern with Result types:

#[derive(Debug)]
enum ConnectionError {
    NetworkError,
    AuthenticationFailed,
    Timeout,
}

impl TcpConnection<Connected> {
    fn authenticate_fallible(
        self,
        credentials: &str
    ) -> Result<TcpConnection<Authenticated>, (Self, ConnectionError)> {
        if credentials.is_empty() {
            return Err((self, ConnectionError::AuthenticationFailed));
        }

        // Simulate network failure
        if credentials == "network_fail" {
            return Err((self, ConnectionError::NetworkError));
        }

        Ok(TcpConnection {
            address: self.address,
            _state: PhantomData,
        })
    }
}

This pattern ensures that:

On success, we get the desired state
On failure, we get back the original state and can retry or handle the error
We never lose our connection object
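
Getting the original state back is what makes retries straightforward. The sketch below restates the relevant types so it can run on its own; the connected constructor and the authenticate_with_retry helper are hypothetical, and the Timeout variant is omitted for brevity.

```rust
use std::marker::PhantomData;

struct Connected;
struct Authenticated;

struct TcpConnection<State> {
    address: String,
    _state: PhantomData<State>,
}

#[derive(Debug, PartialEq)]
enum ConnectionError {
    NetworkError,
    AuthenticationFailed,
}

impl TcpConnection<Connected> {
    // Stand-in constructor so the sketch is self-contained.
    fn connected(address: &str) -> Self {
        TcpConnection { address: address.to_string(), _state: PhantomData }
    }

    fn authenticate_fallible(
        self,
        credentials: &str,
    ) -> Result<TcpConnection<Authenticated>, (Self, ConnectionError)> {
        if credentials.is_empty() {
            return Err((self, ConnectionError::AuthenticationFailed));
        }
        if credentials == "network_fail" {
            return Err((self, ConnectionError::NetworkError));
        }
        Ok(TcpConnection { address: self.address, _state: PhantomData })
    }
}

// Hypothetical helper: try each credential in turn, reusing the connection
// that every failed attempt hands back. With no candidates at all it
// reports AuthenticationFailed.
fn authenticate_with_retry(
    mut conn: TcpConnection<Connected>,
    candidates: &[&str],
) -> Result<TcpConnection<Authenticated>, (TcpConnection<Connected>, ConnectionError)> {
    let mut last_error = ConnectionError::AuthenticationFailed;
    for credentials in candidates {
        match conn.authenticate_fallible(credentials) {
            Ok(session) => return Ok(session),
            Err((returned, error)) => {
                conn = returned; // take ownership back and go around again
                last_error = error;
            }
        }
    }
    Err((conn, last_error))
}
```

Had the transition returned only the error, the first failed attempt would have consumed the connection and the loop could not be written.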

Builder Pattern with Typestate

The typestate pattern works excellently with builders, ensuring required fields are set:

struct HttpRequestBuilder<HasUrl, HasMethod> {
    url: Option<String>,
    method: Option<String>,
    headers: Vec<(String, String)>,
    _has_url: PhantomData<HasUrl>,
    _has_method: PhantomData<HasMethod>,
}

struct Yes;
struct No;

impl HttpRequestBuilder<No, No> {
    fn new() -> Self {
        HttpRequestBuilder {
            url: None,
            method: None,
            headers: Vec::new(),
            _has_url: PhantomData,
            _has_method: PhantomData,
        }
    }
}

impl<HasMethod> HttpRequestBuilder<No, HasMethod> {
    fn url(self, url: String) -> HttpRequestBuilder<Yes, HasMethod> {
        HttpRequestBuilder {
            url: Some(url),
            method: self.method,
            headers: self.headers,
            _has_url: PhantomData,
            _has_method: PhantomData,
        }
    }
}

impl<HasUrl> HttpRequestBuilder<HasUrl, No> {
    fn method(self, method: String) -> HttpRequestBuilder<HasUrl, Yes> {
        HttpRequestBuilder {
            url: self.url,
            method: Some(method),
            headers: self.headers,
            _has_url: PhantomData,
            _has_method: PhantomData,
        }
    }
}

impl<HasUrl, HasMethod> HttpRequestBuilder<HasUrl, HasMethod> {
    fn header(mut self, key: String, value: String) -> Self {
        self.headers.push((key, value));
        self
    }
}

// Only builders with both URL and method can build
impl HttpRequestBuilder<Yes, Yes> {
    fn build(self) -> HttpRequest {
        HttpRequest {
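            // Both fields are always Some here: build() only exists on <Yes, Yes>
            url: self.url.expect("url is set in the Yes state"),
            method: self.method.expect("method is set in the Yes state"),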
            headers: self.headers,
        }
    }
}

struct HttpRequest {
    url: String,
    method: String,
    headers: Vec<(String, String)>,
}

Usage:

fn main() {
    let request = HttpRequestBuilder::new()
        .url("https://api.example.com".to_string())
        .method("GET".to_string())
        .header("Authorization".to_string(), "Bearer token".to_string())
        .build(); // This compiles!

    // This won't compile - missing method:
    // let invalid = HttpRequestBuilder::new()
    //     .url("https://api.example.com".to_string())
    //     .build(); // Error!
}
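
The builder above still stores Option fields and unwraps them in build. As a possible variation, the type parameters can hold the data themselves, so there is nothing left to unwrap. The Missing and Set types and the example_request helper below are hypothetical names chosen for this sketch.

```rust
// Marker for a required field that has not been set yet.
struct Missing;
// Wrapper for a required field that has been set.
struct Set(String);

struct HttpRequestBuilder<U, M> {
    url: U,
    method: M,
    headers: Vec<(String, String)>,
}

struct HttpRequest {
    url: String,
    method: String,
    headers: Vec<(String, String)>,
}

impl HttpRequestBuilder<Missing, Missing> {
    fn new() -> Self {
        HttpRequestBuilder { url: Missing, method: Missing, headers: Vec::new() }
    }
}

impl<M> HttpRequestBuilder<Missing, M> {
    fn url(self, url: &str) -> HttpRequestBuilder<Set, M> {
        HttpRequestBuilder {
            url: Set(url.to_string()),
            method: self.method,
            headers: self.headers,
        }
    }
}

impl<U> HttpRequestBuilder<U, Missing> {
    fn method(self, method: &str) -> HttpRequestBuilder<U, Set> {
        HttpRequestBuilder {
            url: self.url,
            method: Set(method.to_string()),
            headers: self.headers,
        }
    }
}

impl<U, M> HttpRequestBuilder<U, M> {
    fn header(mut self, key: &str, value: &str) -> Self {
        self.headers.push((key.to_string(), value.to_string()));
        self
    }
}

// Only reachable once both required fields are `Set`; no unwrap needed.
impl HttpRequestBuilder<Set, Set> {
    fn build(self) -> HttpRequest {
        HttpRequest { url: self.url.0, method: self.method.0, headers: self.headers }
    }
}

// Required fields can be supplied in either order.
fn example_request() -> HttpRequest {
    HttpRequestBuilder::new()
        .method("GET")
        .url("https://api.example.com")
        .header("Accept", "application/json")
        .build()
}
```

The trade-off is a slightly less familiar struct definition in exchange for removing the last runtime check from build.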

Performance Considerations

One of the beautiful aspects of these patterns is that the state markers cost nothing at runtime. PhantomData is a zero-sized type, so it adds no bytes to the struct, and a state transition is just a move of the remaining fields. Let’s verify the size claim:

use std::mem;

fn main() {
    println!("Size of FileHandle<Closed>: {}", mem::size_of::<FileHandle<Closed>>());
    println!("Size of FileHandle<Open>: {}", mem::size_of::<FileHandle<Open>>());
    println!("Size of PhantomData<Closed>: {}", mem::size_of::<PhantomData<Closed>>());
}

The first two lines print the same number - the combined size of the String and Option<File> fields, since the marker adds nothing - and the third prints 0.
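
The same check can be written as assertions. The sketch below uses a hypothetical Tagged struct whose only real field is a String, so that the marker’s contribution is easy to isolate; the equality with the size of String reflects how current compilers lay out such a struct.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

struct Closed;
struct Open;

// Hypothetical handle with a single real field plus the state marker.
struct Tagged<State> {
    path: String,
    _state: PhantomData<State>,
}

// Returns (marker size, tagged handle size, plain String size).
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<PhantomData<Open>>(),
        size_of::<Tagged<Closed>>(),
        size_of::<String>(),
    )
}
```

The first value is zero and the other two are equal, which is the size claim in concrete form.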

When to Use This Pattern

The typestate pattern is particularly valuable when:

State transitions are well-defined and finite
Invalid operations would be serious bugs
API misuse is a common source of errors
Performance is critical (zero-cost abstractions)
Documentation through types is valuable

Consider using it for:

Network protocols and connection states
File handles and resource management
Parser state machines
API clients with authentication flows
Game state management
Hardware driver interfaces

Limitations and Trade-offs

While powerful, this pattern has some limitations:

Increased compile times due to more complex type checking
Code duplication if states share many methods
Learning curve for API users unfamiliar with the pattern
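Inflexibility - when the state is only known at runtime (for example, handles in mixed states stored in one collection), the types alone cannot express it and you need an extra layer such as an enum over the states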
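
For the last limitation, one possible workaround is to wrap the typestates in an enum at the boundary where the state stops being statically known. The sketch below is a reduced, self-contained version of the connection type; AnyConnection, ensure_connected, and connect_all are hypothetical names, and connect is infallible here to keep the example short.

```rust
use std::marker::PhantomData;

struct Disconnected;
struct Connected;

struct TcpConnection<State> {
    address: String,
    _state: PhantomData<State>,
}

impl TcpConnection<Disconnected> {
    fn new(address: &str) -> Self {
        TcpConnection { address: address.to_string(), _state: PhantomData }
    }

    // Placeholder transition: no real socket is opened.
    fn connect(self) -> TcpConnection<Connected> {
        TcpConnection { address: self.address, _state: PhantomData }
    }
}

// Hypothetical wrapper: one variant per typestate, for places where the
// state is only known at runtime (a pool, a config-driven flow, ...).
enum AnyConnection {
    Disconnected(TcpConnection<Disconnected>),
    Connected(TcpConnection<Connected>),
}

impl AnyConnection {
    // Connects if needed; already-connected values pass through unchanged.
    fn ensure_connected(self) -> AnyConnection {
        match self {
            AnyConnection::Disconnected(c) => AnyConnection::Connected(c.connect()),
            connected => connected,
        }
    }

    fn is_connected(&self) -> bool {
        matches!(self, AnyConnection::Connected(_))
    }
}

// A mixed-state collection, which plain typestates cannot express.
fn connect_all(pool: Vec<AnyConnection>) -> Vec<AnyConnection> {
    pool.into_iter().map(AnyConnection::ensure_connected).collect()
}
```

Inside each match arm the value has a precise type again, so the compile-time guarantees still apply wherever the state is known.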

Conclusion

The typestate pattern is one of the most powerful ways to obtain compile-time guarantees in Rust. By encoding state into the type system, we move from “it might work” to “it will work” - transforming runtime errors into compile-time impossibilities.

When designing APIs, consider how you can make invalid states unrepresentable. Your future self (and your users) will thank you.

The next time you find yourself writing runtime checks for object state, ask: “Could I encode this in the type system instead?” Often, the answer is yes, and the result is more robust, self-documenting code that catches bugs before they happen.