projects
Detail
Publication date: 1 de June, 2021Byzantium: Efficient Byzantine fault-tolerant database replication
The goal of this project is to develop an efficient middleware system to support byzantine fault-tolerant database replication.
Relational database systems are employed for supporting a wide range
of uses, from simple applications used by members of a small
organization to complex applications used by thousands of users. For
example, most web applications built upon the popular three-tier
model rely on a database system to store all manipulated data.
In many applications, performance and availability are crucial,
which implies that the underlying database engine must also offer
good performance and high availability.
Replication is a technique that can be used to improve availability
by maintaining multiple data replicas and relying on the replicas of
the surviving nodes to continue operation in the presence of
failures. Replication algorithms can be divided broadly into eager
and lazy algorithms. Lazy replication algorithms [SS05]
asynchronously propagate replica updates to other replicas after the
updating transaction commits. This approach allows good performance
by minimizing the coordination overhead among replicas when
transactions are executed. However, in the presence of concurrent
transactions, this approach may lead to temporary replica staleness,
and even replica divergence (depending on the protocol used). In
this case, maintaining consistency is usually left to the the
application, requiring application-specific solutions. Thus, when
consistency is important, as it is the case in many database-based
applications, it is simpler to rely on eager replication schemes.
Another important choice replication algorithms make are the
assumptions about the behavior of faulty replicas. Most earlier
research has focused on techniques that tolerate benign faults:
these algorithms assume components fail by stopping or by omitting
some steps and may not provide correct service if a faulty component
violates this assumption. This assumption can be problematic as
malicious attacks, operator mistakes, and software errors can cause
faulty nodes to exhibit arbitrary behavior. Techniques that tolerate
Byzantine faults [LSP82] address this problem since they make no
assumptions about the behavior of faulty components.
The central goal of this project is to build a replicated database
system that tolerates Byzantine faults. The main challenge resides
in the fact that Byzantine fault tolerance replication introduces a
non-negligible performance overhead. This fact, combined with the
necessity of the use of eager replication, has led to a low adoption
of Byzantine fault tolerance in database replication. This project
aims at developing novel techniques for improving the performance of
Byzantine fault tolerant replicated databases.
The first technique this project will explore is to relax the
isolation level [BBG+95] for transaction execution. This approach
has the potential to improve scalability by allowing a higher degree
of concurrency among transactions, as it is shown in recent papers
[LKPMJP05, EZP05]. Moreover, it does not pose any major problem for
applications that require strict serializability, as it has already
been shown how to transform an application program so that its
execution under snapshot isolation is equivalent to strict
serializability [FLO+05]. In this project we will design novel
database replication algorithms that combine eager replication,
Byzantine fault tolerance, and provide snapshot-isolation semantics.
The second mechanism that we intend to integrate in our solution is
support for speculative execution [NCF05] in the client. In such a
scheme, the client program continues its execution before the final
result of an operation is established, based on a tentative result.
This approach will allow to partially mask delay times required for
coordination in Byzantine eager replication algorithms, leading to
better perceived performance from clients. This speculative
execution will tend to use computing resources that would be
otherwise wasted, as it is executed in the client when the client is
waiting for the operation result, thus imposing no (or minimal)
overhead in the clients. Additionally, as this approach reduces the
importance of providing an immediate reply to clients, it also opens
the opportunity to improve the performance of Byzantine replication
by optimizing the replication protocol . e.g., the distinct primary
replica, usually responsible for ordering the requests, will now be
able to group more operations into a single request and execute them
in batches.
Our solution will be built as a middleware layer. We intend to
reduce to the minimum the dependencies on the underlying database
engine, thus improving the portability of our solution and making it
easily usable with different database engines.
Sname | Byzantium |
---|---|
Funding Total | 130 |
Funding Center | 94 |
State | Concluded |
Startdate | 01/01/2008 |
Enddate | 30/06/2011 |