Another great example of randomized algorithms in the introductory chapters of the book "Probability and Computing" is verifying matrix multiplication. Suppose we have three matrices A, B, and C of compatible dimensions, and we want to verify that
A\cdot B = C
For simplicity, let's assume that all the matrices are square and of dimension n \times n. The straightforward way to do the verification is to explicitly multiply the matrices A and B together, an operation that takes on the order of O(n^3) scalar multiplications.
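To make that baseline concrete, here is a minimal sketch of the schoolbook multiplication (the function name matmul_naive is my own, for illustration):

```python
import numpy as np

def matmul_naive(A, B):
    """Schoolbook multiplication of two n x n matrices:
    three nested loops, n^3 scalar multiplications."""
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C
```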
As an aside, we can do better than O(n^3) for matrix multiplication for larger matrices. Wikipedia has an excellent writeup on Strassen's algorithm, which accomplishes matrix multiplication in O(n^{2.8074}) time through divide and conquer. First, the matrices A, B, and C are padded with zero columns and rows so that their dimensions are of the form 2^m \times 2^m. The matrices are then divided into equally sized block matrices of the form:
A = \left[ \begin{array}{cc} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{array} \right]
B = \left[ \begin{array}{cc} B_{1,1} & B_{1,2} \\ B_{2,1} & B_{2,2} \end{array} \right]
C = \left[ \begin{array}{cc} C_{1,1} & C_{1,2} \\ C_{2,1} & C_{2,2} \end{array} \right]
Then
C_{i,j} = \sum_{k=1}^2 A_{i,k} B_{k,j}
The trick of Strassen's algorithm is to decrease the number of required block multiplications (8 in the formula above) by defining the auxiliary matrices M_i:
\begin{eqnarray} M_1 & = & (A_{1,1}+A_{2,2})(B_{1,1}+B_{2,2}) \\ M_2 & = & (A_{2,1}+A_{2,2}) B_{1,1} \\ M_3 & = & A_{1,1}(B_{1,2}-B_{2,2}) \\ M_4 & = & A_{2,2}(B_{2,1}-B_{1,1}) \\ M_5 & = & (A_{1,1}+A_{1,2})B_{2,2}\\ M_6 & = & (A_{2,1} - A_{1,1})(B_{1,1}+B_{1,2}) \\ M_7 & = & (A_{1,2} - A_{2,2})(B_{2,1}+B_{2,2}) \end{eqnarray}
which require only 7 multiplications, and then expressing the result in terms of the M_i matrices as:
\begin{eqnarray} C_{1,1} & = & M_1 + M_4 - M_5 + M_7\\ C_{1,2} & = & M_3 + M_5 \\ C_{2,1} & = & M_2 + M_4 \\ C_{2,2} & = & M_1 - M_2 + M_3 + M_6 \end{eqnarray}
The algorithm continues to divide the matrices A, B, and C into smaller blocks in the same manner until the blocks contain a single element each, at which point the multiplication becomes straightforward.
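A direct transcription of the scheme above might look like the following sketch. It assumes the padding has already been done, so n is a power of two, and it omits the usual optimization of switching to the naive method below some block size:

```python
import numpy as np

def strassen(A, B):
    """Strassen multiplication of two n x n matrices, n a power of 2."""
    n = A.shape[0]
    if n == 1:
        return A * B  # a single element: plain scalar multiplication
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # the seven products M_1 ... M_7 from the equations above
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    # reassemble the four blocks of C
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])
```

Each level of the recursion performs 7 multiplications of half-sized matrices plus a constant number of additions, giving the recurrence T(n) = 7T(n/2) + O(n^2) and hence the O(n^{\log_2 7}) \approx O(n^{2.8074}) bound.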
There are faster algorithms for matrix multiplication, such as the Coppersmith–Winograd algorithm with complexity O(n^{2.375477}), and yet faster ones with O(n^{2.3728639}), but what if these are not fast enough?
Randomized algorithms come to the rescue. The idea is to choose a random vector r of dimension n, whose elements r_i \in \{0,1\}, and compute the quantities ABr and Cr. By associativity, ABr can be evaluated as A(Br), i.e. two matrix-vector products costing O(n^2) each. If ABr \ne Cr then AB \ne C and the verification is done.
On the other hand, if ABr = Cr then either AB = C or we have a false positive. The false positive occurs when (AB-C)r = 0 even though AB-C is not the zero matrix. The book shows that the probability of this happening is at most \frac{1}{2}. The argument goes as follows.
Define the matrix D = AB - C. Since D \ne 0 in the false positive case, it must contain a non-zero element somewhere; assume without loss of generality that this element is d_{11}. We can then write Dr = 0 as the set of n simultaneous equations:
\sum_{j=1}^{n} d_{ij}r_j = 0, \qquad i = 1, \dots, n
Solving the first of these equations for r_1 in the false positive case gives
r_1 = -\frac{\sum_{j=2}^{n} d_{1j}r_j}{d_{11}}
Using the principle of deferred decisions, we can bound the probability of the false positive. We start by selecting r_i for i \in \{2, \dots, n\} randomly and independently from the set \{0,1\}, and we pause before selecting r_1 from the same set. Once the other r_i are fixed, the right-hand side of the equation above is a single number, and at most one of the two possible values of r_1 can equal it. The probability of selecting an r_1 that satisfies the equation, giving us the false positive, is therefore at most \frac{1}{2}.
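A quick Monte Carlo check makes the bound tangible. Here D is a hypothetical example, chosen with a single non-zero element d_{11} so that Dr = 0 exactly when r_1 = 0:

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 8, 100_000
D = np.zeros((n, n))
D[0, 0] = 1.0  # D != 0, yet Dr = 0 whenever r_1 = 0
false_positives = sum(
    np.all(D @ rng.integers(0, 2, size=n) == 0) for _ in range(trials)
)
print(false_positives / trials)  # hovers around 0.5, the worst case
```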
Since a false positive probability of \frac{1}{2} is not good enough, we can repeat the algorithm with a new, independently chosen random vector r. After k repetitions, the false positive probability drops to \frac{1}{2^k}, which we can make as small as we like. The cost of the verification algorithm is on the order of O(kn^2) multiplications which, for modest k, is better than the fastest known algorithm for explicit matrix multiplication.
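Putting the pieces together, here is a minimal sketch of the whole verification routine (the name verify_product and the default k = 20 are my own choices; the exact comparison assumes integer matrices):

```python
import numpy as np

def verify_product(A, B, C, k=20, rng=None):
    """Return False if AB != C is detected. A return value of True
    means AB = C with false positive probability at most 2**-k."""
    if rng is None:
        rng = np.random.default_rng()
    n = C.shape[0]
    for _ in range(k):
        r = rng.integers(0, 2, size=n)  # random 0/1 vector
        # A @ (B @ r) is two matrix-vector products: O(n^2), not O(n^3)
        if not np.array_equal(A @ (B @ r), C @ r):
            return False  # definitely AB != C
    return True  # probably AB = C

# usage: a correct product passes, a corrupted one is (almost surely) caught
A = np.arange(16).reshape(4, 4)
B = A.T
C = A @ B
assert verify_product(A, B, C)
C[0, 0] += 1
assert not verify_product(A, B, C)
```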