whateverblog.
Mailstore performance tests
Thursday, February 13, 2003 05:56 PM
So far I've got only the most low-level of performance/space statistics, but the results are quite encouraging.

For a single inbox of approximately 100,000 "average-sized" e-mails, the metadata takes 3.4MB on disk. The real size of the messages is 541MB, but because I am storing each message as its own file and the file system allocates space in 4K units (the cluster size), the on-disk size is 732MB. I knew there would be inefficiency doing it this way, but paying an extra 35% on top of the real message size is way higher than I expected. But really, who cares? Hard drive capacity is cheap, and the size of e-mail messages stays relatively constant (or if it tracks anything, it's bandwidth).
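As a sanity check on that 35%, here is the arithmetic as a small Python sketch. It assumes the 4K figure is the file system's allocation granularity and that "MB" means 2^20 bytes; the function name `on_disk_size` is mine, not part of the mailstore. Each file wastes half an allocation unit on average, so about 2KB of slack per message is roughly what you'd expect.

```python
ALLOC_UNIT = 4096  # bytes; assumed allocation granularity (the 4K figure quoted above)

def on_disk_size(size_bytes, unit=ALLOC_UNIT):
    """Round a file size up to a whole number of allocation units."""
    return ((size_bytes + unit - 1) // unit) * unit

# Figures quoted in the post
real_mb, disk_mb, messages = 541, 732, 100_000

overhead_mb = disk_mb - real_mb                      # 191 MB of slack
overhead_pct = 100 * overhead_mb / real_mb           # about 35%
slack_per_msg = overhead_mb * 1024 * 1024 / messages  # about 2,000 bytes per message

print(f"overhead: {overhead_mb} MB ({overhead_pct:.1f}%)")
print(f"average slack per message: {slack_per_msg:.0f} bytes")
print(f"a 5,000-byte message occupies {on_disk_size(5000)} bytes")
```

The measured slack lands very close to half of 4,096 bytes per file, which suggests the overhead really is just rounding up to the allocation unit and nothing more exotic.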

Task                                          | Time      | Grows linearly with
----------------------------------------------|-----------|----------------------------------------
Open mailstore                                | 60 ms     | mailbox count
Open mailbox containing 100,000 messages      | 1462 ms   | message count
Add 30,000(!!) messages to mailbox*           | 42761 ms  | number and size of messages to deliver
Get mailbox message count (total and unseen)  | 40 ms     | message count
Close mailbox                                 | 1793 ms   | message count

* This assumes 30,000 messages sitting in an INBOUND directory, one file per message. The operation for each message involves getting the length, getting the timestamp, creating a new message metadata object, and moving the message file to the mailbox directory.
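For the curious, the per-message delivery steps look roughly like the Python sketch below. This is an illustration only, not the actual mailstore code: `MessageMeta`, `deliver`, and the directory layout in the demo are hypothetical names I made up for the example. For scale, 42761 ms over 30,000 messages works out to about 1.4 ms per message.

```python
import os
import shutil
import tempfile
from dataclasses import dataclass

@dataclass
class MessageMeta:
    """Hypothetical metadata record; the real one is not shown in this post."""
    filename: str
    size: int
    timestamp: float
    seen: bool = False

def deliver(inbound_dir, mailbox_dir, metadata):
    """Move each file from inbound_dir to mailbox_dir and record its metadata."""
    delivered = 0
    for name in sorted(os.listdir(inbound_dir)):
        src = os.path.join(inbound_dir, name)
        if not os.path.isfile(src):
            continue
        st = os.stat(src)  # one call gives both the length and the timestamp
        meta = MessageMeta(name, st.st_size, st.st_mtime)
        shutil.move(src, os.path.join(mailbox_dir, name))
        metadata.append(meta)  # only record messages that were actually moved
        delivered += 1
    return delivered

# Small demo with three fake messages in a throwaway directory
with tempfile.TemporaryDirectory() as root:
    inbound = os.path.join(root, "INBOUND")
    mailbox = os.path.join(root, "mailbox")
    os.makedirs(inbound)
    os.makedirs(mailbox)
    for i in range(3):
        with open(os.path.join(inbound, f"{i}.eml"), "w") as f:
            f.write("x" * (i + 1))
    metadata = []
    count = deliver(inbound, mailbox, metadata)
    remaining = os.listdir(inbound)

print(count, "delivered;", len(remaining), "left in INBOUND")
```

When both directories are on the same volume, the move is a rename rather than a copy, which is why the cost per message is dominated by file system bookkeeping rather than by message size.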

Anyway, these times are from an IBM ThinkPad T30 laptop, P4M 2.0GHz. I believe the hard drive is of the 5400RPM 60GB variety, but it doesn't feel any faster than any of the 4200RPM notebook drives I've used. In any case, the average new desktop IDE drive should post much better times.

I think this level of performance is more than sufficient, at least with clients like Outlook, Outlook Express, and Mozilla that want to download all the message flags as soon as they open a mailbox. Response times with those clients, for very large mailboxes, will be completely dominated by the time it takes to download all of those flags.

What is really scary is how long searching will take unless I do some far, far more drastic indexing. I'm going to do a little more research on how Outlook, Outlook Express, and Mozilla do searches before I commit to that, and even then optimized searching may not be a part of the first release.
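To make the trade-off concrete, here is a toy Python sketch contrasting a brute-force scan with a simple inverted index (word to message IDs). This is just one possible form of "drastic indexing", not something I've settled on, and every name in it (`tokenize`, `linear_search`, `build_index`, `indexed_search`) is made up for the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into a set of lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def linear_search(messages, word):
    """Brute force: read and tokenize every message on every search."""
    word = word.lower()
    return sorted(mid for mid, body in messages.items() if word in tokenize(body))

def build_index(messages):
    """Build the index once: map each word to the IDs of messages containing it."""
    index = defaultdict(set)
    for mid, body in messages.items():
        for token in tokenize(body):
            index[token].add(mid)
    return index

def indexed_search(index, word):
    """Answer a single-word query with one dictionary lookup."""
    return sorted(index.get(word.lower(), set()))

messages = {
    1: "Lunch on Friday?",
    2: "Friday status report",
    3: "Invoice attached",
}
index = build_index(messages)
print(linear_search(messages, "friday"))
print(indexed_search(index, "Friday"))
```

The brute-force version has to touch every message file for every query, which on a 541MB mailbox means reading 541MB from disk. The index pays that cost once up front (plus a little on each delivery) and takes extra disk space, which is exactly the commitment I'm hesitant to make before knowing what the clients actually ask for.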