Ventisrv and more
some news on the ventivac front. this post is mostly about how to use
ventisrv.b, the (still work in progress) venti server.
most of the code has been cleaned up a bit. not vacfs.b yet though, that
one is next. the manual pages have been clarified a bit. the biggest
difference since my previous post has been that ventisrv is now closer to
being a useful venti server. it should not be considered stable yet,
and the file format might change as well, but i think it’s worth a
try now!
so, where to start?
the ventisrv manual page (snapshot version) should offer a decent introduction. if anything is unclear after reading it, or if it raises questions, please let me know so i can clarify the manual page.
in short, if you want to run a venti server with a data file of 8 gigabytes and an expected mean blocksize of 8 kilobytes, executing the following commands starts a fresh ventisrv:
echo -n > data
echo -n > index # these first two only the first time ;)
ventisrv -v 8g 8k
data is used to store the data blocks, index keeps the index.
ventisrv calculates how much memory it needs based on the arguments
passed at startup. specify the verbose option to get it printed.
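to get a feel for the numbers involved, here is a back-of-the-envelope sketch in python. it is not ventisrv's actual calculation: it assumes the size suffixes are binary (k = 1024), and bytes_per_entry is a hypothetical parameter standing in for whatever ventisrv really keeps in memory per block. the only solid part is the division: an 8g data file with 8k mean blocks holds about a million blocks.

```python
# rough sizing sketch; NOT ventisrv's real memory calculation.
# assumption: suffixes are binary multiples (k = 1024, g = 1024**3).
SUFFIXES = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text):
    """parse a size such as '8g' or '8k' into a number of bytes."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)


def expected_blocks(datasize, blocksize):
    """how many blocks of the mean size fit in the data file."""
    return parse_size(datasize) // parse_size(blocksize)


def index_memory(datasize, blocksize, bytes_per_entry):
    """memory for an in-memory index; bytes_per_entry is hypothetical."""
    return expected_blocks(datasize, blocksize) * bytes_per_entry


if __name__ == "__main__":
    print(expected_blocks("8g", "8k"))   # 1048576 blocks
    print(index_memory("8g", "8k", 8))   # with a made-up 8 bytes per entry
```

the verbose output of ventisrv is the authoritative source for the real figure; this only shows why both arguments matter.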
now vacput your current working directory to the venti:
vacput .
and use the resulting score to read the files in the vac archive again:
vacget -t yourscore
it should be as easy as that. if something fails along the way, please
let me know so i can update this post with better descriptions, or fix
the bug. please read the README in the ventivac hg repository; it has
the instructions to patch inferno’s appl/lib/venti.b to make ventisrv work.
a few quick tests with randtest from plan9port (src/cmd/venti/randtest.c)
showed write performance (of new blocks) of 15MB/s and a sequential read
performance of over 18MB/s. but the machine was quite fast and i’m not
sure it represents an average machine ventisrv might run on. oh, if you
want to do a few performance tests yourself, do not specify -D, it kills
performance.
what’s next?
well, there is no authentication for connections. that does not really bother me: the scores are a nice and simple capability system. the only real worry i have is someone filling up a public venti server. thus, a separate listen address for read-only connections may do the trick.
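a toy model in python of why this works. a venti score is the sha1 of the block contents, so you can only ask for a block if you already know its score. the ToyStore and Conn classes below are made up for illustration and have nothing to do with ventisrv's code; the readonly flag stands in for the idea of a second, read-only listen address.

```python
import hashlib


class ToyStore:
    """in-memory stand-in for a venti data file (hypothetical)."""

    def __init__(self):
        self.blocks = {}


class Conn:
    """one client connection; readonly models a read-only listen address."""

    def __init__(self, store, readonly=False):
        self.store = store
        self.readonly = readonly

    def write(self, data):
        if self.readonly:
            raise PermissionError("read-only connection")
        # the score is the sha1 of the block contents
        score = hashlib.sha1(data).hexdigest()
        self.store.blocks[score] = data
        return score

    def read(self, score):
        # without the score there is nothing to ask for: no listing, no guessing
        return self.store.blocks[score]


if __name__ == "__main__":
    store = ToyStore()
    rw = Conn(store)
    ro = Conn(store, readonly=True)
    score = rw.write(b"hello venti")
    print(score, ro.read(score))
```

a client on the read-only connection can fetch anything it holds a score for, but cannot fill up the disk.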
compression is missing as well. the easiest would be per-block
compression. a “smarter” thing might be to compress larger blocks.
i don’t have a well-developed intuition for data compression methods,
but i have a hunch that compressing larger blocks (say 128kb) gives the
compressor more history to work with and better compression ratios.
writing would still be append-only (no problems there), but reading would
need special care: for a given file offset, accounting needs to be done
to find the location of the 128kb block in the compressed file. if this
is a viable option, it may even be done as a sys->file2chan program.
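the accounting could look roughly like this. a minimal sketch in python, not limbo and not a design decision: it assumes fixed 128kb chunks of uncompressed data, uses zlib as a stand-in compressor, and keeps the "compressed file" in memory. ChunkedLog and its fields are hypothetical names. because every full chunk holds the same amount of uncompressed data, the chunk number for a file offset is a plain division, and a small table maps chunk numbers to positions in the compressed file.

```python
import zlib

CHUNK = 128 * 1024  # uncompressed bytes per compressed chunk (assumption)


class ChunkedLog:
    """append-only log, compressed in fixed-size chunks (hypothetical)."""

    def __init__(self, chunk=CHUNK):
        self.chunk = chunk
        self.compressed = bytearray()  # stands in for the compressed file
        self.offsets = []              # offsets[i]: start of chunk i in compressed
        self.pending = bytearray()     # tail that does not fill a chunk yet
        self.size = 0                  # total uncompressed bytes appended

    def append(self, data):
        """append data, return its uncompressed file offset."""
        offset = self.size
        self.pending += data
        self.size += len(data)
        while len(self.pending) >= self.chunk:
            blob = zlib.compress(bytes(self.pending[:self.chunk]))
            self.offsets.append(len(self.compressed))
            self.compressed += blob
            del self.pending[:self.chunk]
        return offset

    def _chunk(self, i):
        """uncompressed contents of chunk i (the pending tail is the last one)."""
        if i == len(self.offsets):
            return bytes(self.pending)
        start = self.offsets[i]
        if i + 1 < len(self.offsets):
            end = self.offsets[i + 1]
        else:
            end = len(self.compressed)
        return zlib.decompress(bytes(self.compressed[start:end]))

    def read(self, offset, n):
        """read up to n uncompressed bytes starting at offset."""
        out = bytearray()
        end = min(offset + n, self.size)
        while offset < end:
            i, skip = divmod(offset, self.chunk)
            piece = self._chunk(i)[skip:skip + (end - offset)]
            out += piece
            offset += len(piece)
        return bytes(out)
```

a real version would cache the most recently decompressed chunk, since sequential reads hit the same chunk many times, and would have to persist the offsets table (or rebuild it at startup).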
other open issues are listed in the bugs section of the manual page referenced above. solutions (and more problems) are welcome. thanks for listening!