Hi, Michael. Yes, VoltDB is very fast, but they self-admittedly do not perform well if transactions span records across multiple nodes. That is the key feature difference between InfiniSQL and VoltDB (along with the fact, of course, that their project is functionally much further along).
If you want more details about how things work when performing transactions, I think that the overview I created would be a good starting point. It probably doesn't have everything you'd ask for, but I hope that it answers some of your questions: http://www.infinisql.org/docs/overview/
And to answer your question about performance degradation as a function of the number of nodes each transaction touches: I have not done comprehensive benchmarks of InfiniSQL measuring that particular item. However, I do believe that as multi-node communication increases, throughput will tend to decrease--I expect the degradation to be graceful, but further testing is required. The benchmark I've performed and referenced has 3 updates and a select in each transaction, all very likely to be on 3 different nodes.
I'd like to invite you to benchmark InfiniSQL in your own environment. I've included the scripts in the source distribution, along with a guide on how I benchmarked and details on the specific benchmarking I've done so far. All at http://www.infinisql.org
I'd be glad to assist in any way, give pointers, and so on, if there are tests that you'd like to do. I also plan to do further benchmarking over time, and I'll update the site's blog (and Twitter, etc.) as I do so.
Please communicate further with me if you're curious.
So basically, VoltDB acknowledges that cross-partition transactions are Hard and has put a lot of effort into minimizing them. (This is basically the entire point of the original HStore paper.)
InfiniSQL says don't worry, we'll just use 2PC. But not just yet, we're still working on the lock manager.
I look forward to your exegesis of how you plan to overcome the well-documented scaling problems with 2PC. Preferably after you have working code. :)
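For readers following along, here is a minimal textbook sketch of the two-phase commit protocol under discussion (a generic illustration in Python, not InfiniSQL's or VoltDB's actual code; the timeouts and coordinator-failure handling that cause the well-documented scaling problems are deliberately not modeled):

```python
# Minimal two-phase commit coordinator sketch (illustrative only).
# Phase 1 (prepare): every participant votes; phase 2 (commit/abort):
# the coordinator commits only on a unanimous "yes".

class Participant:
    def __init__(self, name, will_vote_yes=True):
        self.name = name
        self.will_vote_yes = will_vote_yes
        self.state = "idle"

    def prepare(self):
        # Phase 1: the participant durably records its vote, then replies.
        self.state = "prepared" if self.will_vote_yes else "aborted"
        return self.will_vote_yes

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: collect votes. A single "no" (or, in a real system,
    # a timeout) aborts the whole transaction.
    if all(p.prepare() for p in participants):
        # Phase 2: everyone voted yes -> commit everywhere.
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"
```

The scaling objection above lives in what this sketch omits: every participant holds its locks between prepare and the coordinator's decision, so a slow or failed coordinator blocks all of them.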
Yes, InfiniSQL uses two-phase locking. The Achilles' heel is deadlock management. By necessity, deadlock management will need to be single-threaded (at least as far as I can figure). No arguments there, so deadlock-prone usage patterns may definitely be problematic.
I don't think there's a perfect concurrency control protocol. MVCC is limited by transaction ID generation. Predictive locking can't be rolled back once it starts, or it limits practical throughput to single partitions.
2PL works. It's not ground-breaking (though incorporating it in the context of inter-thread messaging might be unique). And it will scale fine other than for a type of workload that tends to bring about a lot of deadlocks.
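One common way to detect the deadlocks mentioned above is a waits-for graph: an edge A → B means transaction A is waiting on a lock held by B, and a deadlock is exactly a cycle in that graph. A sketch (illustrative only; the thread doesn't say this is InfiniSQL's mechanism, and the fact that this check wants a consistent global view of the graph is one reason it's typically centralized and single-threaded):

```python
# Deadlock detection via a waits-for graph (a common approach under
# two-phase locking; illustrative, not InfiniSQL code).
# waits_for maps each transaction to the transactions it waits on.

def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / in progress / done
    color = {t: WHITE for t in waits_for}
    stack = []

    def dfs(t):
        color[t] = GRAY
        stack.append(t)
        for u in waits_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:        # back edge: cycle found
                return stack[stack.index(u):]
            if color.get(u, WHITE) == WHITE:
                cycle = dfs(u)
                if cycle:
                    return cycle
        stack.pop()
        color[t] = BLACK
        return None

    for t in list(waits_for):
        if color[t] == WHITE:
            cycle = dfs(t)
            if cycle:
                return cycle
    return None
```

The detector only finds a victim; the lock manager still has to choose which transaction in the cycle to abort and roll back.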
HAT (http://www.bailis.org/papers/hat-vldb2014.pdf) looks pretty promising in terms of multi-item transactions. It turns out that you can push the problem off to garbage collection in order to make transaction ID generation easy, and garbage collection is easier to be sloppy and heuristic about. The only problems are that it isn't clear yet whether HATs are as rich as 2PL-based approaches, and that nobody's built an industrial-strength implementation yet.
Transaction ID generation, as you mentioned, is probably limited by the incredible expense of cross-core/socket/etc. sync.
Go single-threaded and divide a single hardware node (server) into one node per core, and your performance should go way up. You'd want to do something like this anyway, just to avoid NUMA penalties. And treating every core as a separate node is easy and clean, conceptually. I/O might go into a shared pool - you'd need to experiment.
I've seen this improvement on general purpose software. Running n instances where n=number of cores greatly outperformed running 1 instance across all cores.
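As one concrete illustration of why the per-core split removes the transaction-ID bottleneck: if each core owns its IDs, there is no contended global counter at all. A sketch (a snowflake-style scheme of my own invention here, not something the thread specifies; the bit widths are arbitrary):

```python
# Sketch of per-shard transaction ID generation (illustrative).
# Instead of one global atomic counter (which serializes every core
# on a contended cache line), each core/shard owns a purely local
# counter and stamps its shard number into the high bits. IDs stay
# unique machine-wide with zero cross-core synchronization.

SHARD_BITS = 10           # up to 1024 shards (cores)
SEQ_BITS = 54             # local sequence number in the low bits

class ShardedTxIdGenerator:
    def __init__(self, shard_id):
        assert 0 <= shard_id < (1 << SHARD_BITS)
        self.shard_id = shard_id
        self.seq = 0      # touched only by this shard's single thread

    def next_id(self):
        self.seq += 1
        return (self.shard_id << SEQ_BITS) | self.seq

    @staticmethod
    def shard_of(tx_id):
        # Recover which shard minted a given ID.
        return tx_id >> SEQ_BITS
```

The trade-off is that IDs are no longer globally ordered across shards, which matters for MVCC schemes that use IDs as timestamps - one reason the thread treats ID generation as a real design constraint rather than a detail.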
The only major design change from one node per process is that your replicas need to be aware of node placement, so they end up on separate hardware. You may even consider taking this a level further, so you can make sure replicas are on separate racks. Some sort of "availability group" concept might be an easy way to wrap it up.
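The "availability group" idea can be sketched as a placement constraint: given nodes tagged with a failure domain (rack, in this case), never put two replicas of the same partition in the same domain. A minimal sketch under those assumptions (names and layout are hypothetical, not anything InfiniSQL defines):

```python
# Sketch of rack-aware replica placement (the "availability group"
# idea; illustrative only). Each node is tagged with its rack, and
# replicas of a partition must land on distinct racks.

def place_replicas(nodes, n_replicas):
    """nodes: list of (node_id, rack_id) pairs.
    Returns n_replicas node ids, all on distinct racks;
    raises ValueError if there aren't enough racks."""
    by_rack = {}
    for node_id, rack_id in nodes:
        by_rack.setdefault(rack_id, []).append(node_id)
    if len(by_rack) < n_replicas:
        raise ValueError("not enough distinct racks for replica count")
    # Take one node from each of the first n_replicas racks
    # (sorted for a deterministic sketch; a real placer would
    # balance load across racks too).
    racks = sorted(by_rack)
    return [by_rack[r][0] for r in racks[:n_replicas]]
```

With one node per core, "node_id" here would identify a core-sized node, while "rack_id" could generalize to any failure domain: server, rack, or power zone.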
Also: your docs page clearly says 2PC was chosen (it's in the footnote). Maybe I'm misreading what "basis" means.
Hey Mark. The overview seems to be much of the same. There's a lot of excited talk, which is fine, but it should be limited to a leading paragraph. The fundamental issue is performance in the face of transactions that need to do 2PC among multiple nodes (which also need to sync with their replicas).
I'm not much of an expert at all, but I like reading papers on databases. It seems to me that if you really did discover a breakthrough like this, you should be able to distill it to some basic algorithms and math. And a breakthrough of this scale would be quite notable.
If I'm reading correctly, there's no replica code even involved ATM. So 500Ktx/s really boils down to ~83Ktx/sec per node, on an in-memory database. Is it possible on modern hardware that this is just what to expect?
I am curious, and I'm not trying to be dismissive, but the copy sounds overly promising without explaining how, even in theory, this will actually work. I'd suggest explaining that part first, and letting the engineering come second.
Your advice that I create a formal academic-style paper is reasonable, and I agree that it's something I should pursue. Will you follow me somehow (via the links at http://www.infinisql.org) so that when such a paper is produced, you'll see it? I can't guarantee getting the front page here again, and I don't want to be missed in the future, especially as you (and others) have been asking for this type of information.
And, yes, many thousands of transactions per node in memory is what should be expected. But the scalability of ACID (minus durability, as discussed) transactions across multiple nodes--that's the unique part. I'll try to distill that into a paper.
It doesn't have to be formal and academic enough to be published. Just something that explains how performance is going to be achieved - any sort of analysis.