Tuesday, January 19, 2021

How do I model composable stochastic processes in Rust?

I'm writing a family of Markov Chain Monte Carlo (MCMC) algorithms to use as inference methods for a program induction system. I'm doing this by translating the MCMC algorithms implemented here in C++ into Rust.

The rough dynamic of MCMC is that we have some process which randomly moves around a space of objects and generates a stream of the objects which it discovers. The randomness is essential. Also, this stream never terminates (though you can stop taking objects from the stream). More complex forms of MCMC can be defined as compositions of this basic process.
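
As a toy picture of that dynamic, here is a minimal sketch. The `Walk` type and `first_samples` function are hypothetical names, and the hand-rolled linear congruential generator only stands in for a real source of randomness. It is a plain random walk rather than a real MCMC kernel, but it shows the shape: a stream that never ends on its own and a caller who decides when to stop.

```rust
/// Hypothetical toy process: a random walk on the integers.
pub struct Walk {
    state: i64,
    seed: u64, // state of a small linear congruential generator (illustration only)
}

impl Walk {
    pub fn new(seed: u64) -> Self {
        Walk { state: 0, seed }
    }
}

impl Iterator for Walk {
    type Item = i64;

    // Never returns `None`: the stream is infinite by construction.
    fn next(&mut self) -> Option<i64> {
        // Knuth's MMIX LCG constants; wrapping arithmetic avoids overflow panics.
        self.seed = self
            .seed
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Use the top bit, since the low bits of an LCG are weak.
        if (self.seed >> 63) == 0 {
            self.state += 1;
        } else {
            self.state -= 1;
        }
        Some(self.state)
    }
}

/// The caller decides when to stop pulling from the stream.
pub fn first_samples(n: usize) -> Vec<i64> {
    Walk::new(42).take(n).collect()
}
```

Because the samples here are plain integers, yielding them by value costs nothing; the difficulty arises once each sample is an expensive structure.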

My question: how do I translate these algorithms idiomatically? More specifically, how should I think about this problem from a Rustacean perspective? What are the right tools/techniques to bring to the job?

Here's what I've considered:

  1. A direct translation seemed awkward because the C++ code makes heavy use of callbacks, which I found hard to express with my limited Rust knowledge. It seems more in keeping with other Rust code I've read to make a lazy stream of samples available for the user to consume as they see fit.

  2. I then thought that iterators might make sense, but:

  • The objects I'm sampling are complex data structures representing mini DSLs. It'd be nice to hand out a reference that the user can clone if they decide it's a useful sample. My understanding is that an Iterator cannot yield items that borrow from the iterator itself, which is a known limitation given Rust's ownership model.
  • These algorithms all need a source of randomness (i.e. &mut R where R: Rng) and access to a simple control structure for keeping track of various statistics. Baking these directly into whatever struct implements Iterator then means that I can't compose the algorithms in a way that would share the control structure or source of randomness, right?

  3. Generators/coroutines seem to solve these problems, but they seem like a relatively fringe/new area of Rust. I began to wonder if I was making things harder than they need to be.

  4. I currently provide the struct for each algorithm with a function similar to the following, where H is a hypothesis and C is the control structure:

    impl<C, H> MCMCChain<C, H> {
        // ...
        pub fn next_sample<R: Rng>(&mut self, control: &mut C, rng: &mut R) -> Option<Self>
        // ...
    }
    

    Ideally, next_sample would return Option<&H>, but this is what I have working right now.
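
To make the Option<&H> idea concrete, here is a minimal sketch of one possible shape. It is not the C++ design, and every name in it (`Sampler`, `Control`, `RandomWalk`, `Thinned`, `XorShift`, `collect_samples`) is hypothetical. The small `Rng` trait only stands in for `rand::Rng` so that the example compiles without external crates. The RNG and control structure are passed in on every call, so composed samplers can share them, and the returned reference borrows from the sampler itself.

```rust
/// Stand-in for `rand::Rng` (assumption: kept tiny so the sketch needs no crates).
pub trait Rng {
    fn next_u64(&mut self) -> u64;
}

/// Hypothetical xorshift generator, for illustration only. Seed must be nonzero.
pub struct XorShift(pub u64);

impl Rng for XorShift {
    fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Hypothetical control structure shared by every chain.
#[derive(Default)]
pub struct Control {
    pub steps: usize,
}

/// Lending-style sampler: the returned reference borrows from `self`,
/// so it must be dropped (or cloned) before the next call.
pub trait Sampler {
    type Hypothesis;
    fn next_sample<R: Rng>(
        &mut self,
        control: &mut Control,
        rng: &mut R,
    ) -> Option<&Self::Hypothesis>;
}

/// Toy chain: a random walk over integers.
pub struct RandomWalk {
    pub state: i64,
}

impl Sampler for RandomWalk {
    type Hypothesis = i64;
    fn next_sample<R: Rng>(&mut self, control: &mut Control, rng: &mut R) -> Option<&i64> {
        control.steps += 1;
        if rng.next_u64() % 2 == 0 {
            self.state += 1;
        } else {
            self.state -= 1;
        }
        Some(&self.state)
    }
}

/// Composition: expose only every `every`-th sample of an inner sampler.
pub struct Thinned<S> {
    pub inner: S,
    pub every: usize,
}

impl<S: Sampler> Sampler for Thinned<S> {
    type Hypothesis = S::Hypothesis;
    fn next_sample<R: Rng>(
        &mut self,
        control: &mut Control,
        rng: &mut R,
    ) -> Option<&S::Hypothesis> {
        // Discard `every - 1` samples, stopping early if the inner chain ends.
        for _ in 1..self.every {
            if self.inner.next_sample(control, rng).is_none() {
                return None;
            }
        }
        self.inner.next_sample(control, rng)
    }
}

/// Usage: one RNG and one control structure drive the composed chain.
pub fn collect_samples(n: usize) -> (Vec<i64>, usize) {
    let mut rng = XorShift(0x2545_F491_4F6C_DD1D);
    let mut control = Control::default();
    let mut chain = Thinned { inner: RandomWalk { state: 0 }, every: 3 };
    let mut kept = Vec::new();
    while kept.len() < n {
        match chain.next_sample(&mut control, &mut rng) {
            Some(h) => kept.push(*h), // copy/clone only the samples worth keeping
            None => break,
        }
    }
    (kept, control.steps)
}
```

The trade-off in this sketch is that `Sampler` is not `Iterator`, so adapters such as `take` and `filter` are unavailable and the caller writes an explicit loop. In exchange, nothing is cloned unless the caller asks for it, and the wrapper `Thinned` never needs to own the RNG or the statistics.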

I'm new to Stack Overflow and happy to revise the question to be more helpful if you tell me how :-)



