|
Post by baanish on Apr 18, 2023 5:45:28 GMT
I'm coding this book in Rust. I have found that allocating matrices as a 4x4 f64 array on the stack doesn't break anything (yet) and results in a HUGE performance improvement. Would doing this break anything later on? Are matrices larger than 4x4 used anywhere in this book? I'm just starting chapter 6.
|
|
|
Post by ulfben on Apr 18, 2023 14:56:43 GMT
4x4s are all you need throughout this book, and keeping them on the stack is a safe and performant strategy as you've found out. Even if you cache the inverse matrix too, stack space is unlikely to become an issue.
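For anyone curious what the stack-allocated approach looks like, here's a minimal sketch. The `Mat4` type and method names are illustrative, not from the book or from baanish's code:

```rust
// Minimal sketch of a stack-allocated 4x4 matrix.
// 16 f64s = 128 bytes, so the type is trivially Copy -- no heap allocation,
// no pointer indirection, and the whole thing fits in a couple of cache lines.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Mat4 {
    data: [[f64; 4]; 4],
}

impl Mat4 {
    fn identity() -> Self {
        let mut data = [[0.0; 4]; 4];
        for i in 0..4 {
            data[i][i] = 1.0;
        }
        Mat4 { data }
    }

    fn mul(&self, rhs: &Mat4) -> Mat4 {
        let mut out = [[0.0; 4]; 4];
        for i in 0..4 {
            for j in 0..4 {
                for k in 0..4 {
                    out[i][j] += self.data[i][k] * rhs.data[k][j];
                }
            }
        }
        Mat4 { data: out }
    }
}

fn main() {
    let id = Mat4::identity();
    // Identity times identity is identity -- everything lives on the stack.
    assert_eq!(id.mul(&id), id);
    println!("ok");
}
```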
|
|
|
Post by baanish on Apr 18, 2023 21:22:53 GMT
Moving to the stack, using LU decomposition, multithreading, and a few other optimizations have brought my code from about 6 minutes down to under 100 ms.
|
|
|
Post by ulfben on Apr 23, 2023 6:17:55 GMT
Interesting! Do you have timings from before and after implementing LU decomposition? My gut feeling was that 4x4 matrices are too small to benefit much from the more complicated calculation, i.e. that the overhead of LU decomposition would never be earned back on such small matrices. I'd love to see your implementation, and timings if you have them.
|
|
|
Post by baanish on Apr 23, 2023 23:50:31 GMT
LU is about 40% faster for me.
pub fn lu_decomposition(&self) -> (Matrix, Matrix) {
    let mut lower = Matrix::identity(self.rows);
    let mut upper = Matrix::new(self.rows, self.cols);
    for i in 0..self.rows {
        // Upper triangular
        for j in i..self.cols {
            let mut sum = 0.0;
            for k in 0..i {
                sum += lower.data[i][k] * upper.data[k][j];
            }
            upper.data[i][j] = self.data[i][j] - sum;
        }
        // Lower triangular
        for j in i..self.cols {
            if i == j {
                lower.data[i][i] = 1.0;
            } else {
                let mut sum = 0.0;
                for k in 0..i {
                    sum += lower.data[j][k] * upper.data[k][i];
                }
                lower.data[j][i] = (self.data[j][i] - sum) / upper.data[i][i];
            }
        }
    }
    (lower, upper)
}
pub fn inverse(&self) -> Matrix {
    // Cofactor method from the book:
    // let det = self.determinant();
    // if det == 0.0 {
    //     panic!("Matrix is not invertible");
    // }
    // let mut m = Matrix::new(self.rows, self.cols);
    // for i in 0..self.rows {
    //     for j in 0..self.cols {
    //         let c = self.cofactor(i, j);
    //         m.data[j][i] = c / det;
    //     }
    // }
    // m

    // LU decomposition: solve LUx = e_i for each column of the identity.
    // Note: the [0.0; 4] buffers assume a 4x4 matrix.
    let (l, u) = self.lu_decomposition();
    let mut x = Matrix::new(self.rows, self.cols);
    for i in 0..self.rows {
        let mut col = [0.0; 4];
        col[i] = 1.0;
        // Forward substitution for Ly = b
        let mut y = [0.0; 4];
        for j in 0..self.rows {
            y[j] = col[j] - (0..j).map(|k| l.data[j][k] * y[k]).sum::<f64>();
        }
        // Backward substitution for Ux = y
        for j in (0..self.rows).rev() {
            let sum = (j + 1..self.cols).map(|k| u.data[j][k] * x.data[k][i]).sum::<f64>();
            x.data[j][i] = (y[j] - sum) / u.data[j][j];
        }
    }
    x
}
|
|