Ventisrv and more

some news on the ventivac front. this post is mostly about how to use ventisrv.b, the (still work in progress) venti server.

most of the code has been cleaned up a bit (except vacfs.b, that one is next), and the manual pages have been tidied up as well. the biggest change since my previous post is that ventisrv is now closer to being a useful venti server. it should not be considered stable yet, and the file format might still change, but i think it’s worth a try now!

so, where to start?

the ventisrv manual page (snapshot version) should offer a decent introduction. if anything is unclear after reading it, or it raises questions, please let me know so i can clarify the manual page.

in short, if you want to run a venti server with a data file of 8 gigabytes and an expected mean blocksize of 8 kilobytes, executing the following commands starts a fresh ventisrv:

echo -n > data
echo -n > index # these first two only the first time ;)
ventisrv -v 8g 8k

data is used to store the data blocks, index keeps the index. ventisrv calculates how much memory it needs based on the arguments passed at startup. specify the verbose option to get it printed.
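to get a feel for the numbers behind those two arguments, here is a rough sketch in python. it only does the obvious division (data size over mean block size) to estimate how many blocks the index has to track. the 1024-based units and the ENTRY_BYTES value are my assumptions for illustration, not ventisrv's actual parsing or bookkeeping; the verbose output of ventisrv is the authoritative number.

```python
# rough sizing arithmetic for the "8g 8k" example.
# assumption: suffixes are 1024-based; ENTRY_BYTES is a made-up placeholder,
# not ventisrv's real per-block memory cost.

UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}

def parse_size(s):
    """turn a string like '8g' or '8k' into a number of bytes."""
    s = s.lower()
    if s[-1] in UNITS:
        return int(s[:-1]) * UNITS[s[-1]]
    return int(s)

def expected_blocks(datasize, blocksize):
    """how many blocks of the mean size fit in the data file."""
    return parse_size(datasize) // parse_size(blocksize)

ENTRY_BYTES = 16  # hypothetical in-memory cost per block

nblocks = expected_blocks("8g", "8k")
mib = nblocks * ENTRY_BYTES // 1024**2
print(nblocks, "blocks,", mib, "MiB at", ENTRY_BYTES, "bytes per entry")
```

a smaller mean blocksize means more blocks for the same data file, so the memory estimate grows accordingly.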

now vacput your current working directory to the venti:

vacput .

and use the resulting score to read the files in the vac archive again:

vacget -t yourscore

it should be as easy as that. if something fails along the way, please let me know so i can update this post with better descriptions, or fix the bug. also, please read the README in the ventivac hg repository: it has the instructions for patching inferno’s appl/lib/venti.b to make ventisrv work.

a few quick tests with randtest from plan9port (src/cmd/venti/randtest.c) showed a write performance (for new blocks) of 15MB/s and a sequential read performance of over 18MB/s. the machine was quite fast though, and i’m not sure it is representative of the average machine ventisrv might run on. oh, and if you want to run a few performance tests yourself, do not specify -D, it kills performance.

what’s next?

well, there is no authentication for connections. that does not really bother me, the scores are a nice and simple capability system. the only real worry i have is someone filling up a public venti server. so listening on a separate address for read-only connections may do the trick.
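to illustrate why a score works as a capability, and what a read-only address would protect against, here is a toy sketch in python. a venti score is the sha1 hash of the block's contents, so you can only ask for a block if you already know its score. the Store class, its readonly flag and the in-memory dict are all hypothetical stand-ins, not how ventisrv is implemented.

```python
import hashlib

def score(block):
    """venti-style score: the sha1 of the block contents, as hex."""
    return hashlib.sha1(block).hexdigest()

class Store:
    """toy in-memory block store (hypothetical, for illustration only)."""

    def __init__(self, readonly=False, blocks=None):
        self.readonly = readonly
        self.blocks = {} if blocks is None else blocks

    def write(self, block):
        # a read-only connection can never fill up the store
        if self.readonly:
            raise PermissionError("read-only connection")
        s = score(block)
        self.blocks[s] = block
        return s

    def read(self, s):
        # without the right score there is nothing to ask for
        return self.blocks.get(s)

rw = Store()
ro = Store(readonly=True, blocks=rw.blocks)  # same data, second "address"
s = rw.write(b"hello venti")
print(s, ro.read(s))
```

anyone who knows a score can read the block through either store, but only the read-write one accepts new data.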

compression is missing as well. the easiest would be per-block compression. a “smarter” thing might be to compress larger blocks. i don’t have a well-developed intuition for data compression methods, but i have a hunch that compressing larger blocks (say 128kb) gives the compressor more history to work with and thus better compression ratios. writing would still be append-only (no problems there), but reading would need special care: for a given file offset, some accounting is needed to find the location of the 128kb block in the compressed file. if this is a viable option, it may even be done as a sys->file2chan program.
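the accounting for the larger-block idea could look roughly like the python sketch below. it is only a sketch under assumptions: zlib stands in for whatever compressor would be used, the ChunkedLog class and its fields are invented for illustration, and a bytearray plays the role of the append-only compressed file. the point is the table that maps an uncompressed offset to the compressed chunk holding it.

```python
import zlib

CHUNK = 128 * 1024  # uncompressed bytes per compressed chunk

class ChunkedLog:
    """append-only log compressed in fixed-size chunks (hypothetical sketch)."""

    def __init__(self):
        self.data = bytearray()     # stand-in for the compressed file
        self.index = []             # (compressed offset, compressed length) per chunk
        self.pending = bytearray()  # tail that has not filled a chunk yet
        self.length = 0             # total uncompressed bytes appended

    def append(self, buf):
        """append bytes, return the uncompressed offset they start at."""
        off = self.length
        self.pending += buf
        self.length += len(buf)
        while len(self.pending) >= CHUNK:
            z = zlib.compress(bytes(self.pending[:CHUNK]))
            self.index.append((len(self.data), len(z)))
            self.data += z
            del self.pending[:CHUNK]
        return off

    def _chunk(self, i):
        """uncompressed contents of chunk i (the last one may be the raw tail)."""
        if i < len(self.index):
            o, n = self.index[i]
            return zlib.decompress(bytes(self.data[o:o + n]))
        return bytes(self.pending)

    def read(self, off, n):
        """read up to n uncompressed bytes starting at uncompressed offset off."""
        out = bytearray()
        end = min(off + n, self.length)
        while off < end:
            i, skip = divmod(off, CHUNK)  # which chunk, and where inside it
            piece = self._chunk(i)[skip:skip + (end - off)]
            out += piece
            off += len(piece)
        return bytes(out)

log = ChunkedLog()
first = log.append(bytes(range(256)) * 1000)
second = log.append(b"tail")
print(first, second, len(log.index), log.read(second, 4))
```

every read decompresses a whole chunk even for a small block, so a real version would probably want to cache recently used chunks. that is the price for the better compression ratio.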

other open issues are listed in the bugs section of the manual page referenced above. solutions (and more problems) are welcome. thanks for listening!