Made JNI modifications to expose the faster function, made the API use
the typesafe Matrix API, and synchronized the documentation with C++.
Sped up C++ LTV diff drive test from 20 ms to 15 ms.
Sped up C++ LTV unicycle test from 15 ms to 10 ms.
Both seem to work, but the structure-preserving doubling algorithm (SDA)
is specifically recommended for solving DAREs as opposed to periodic
DAREs (P-DAREs).
The QR decomposition was replaced with a partial-pivoting LU
decomposition, as recommended in section 2.4 of the paper.
More tests and a separate JNI function for each DARE solver variant were
added.
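
For readers unfamiliar with SDA, here is a minimal scalar sketch of one
common formulation of the doubling iteration. It is an illustration
under stated assumptions, not the solver added in this change: the
function name, tolerance, and iteration cap are hypothetical. In the
matrix case, the division by w becomes a linear solve against
W = I + G H, which is the step where a partial-pivoting LU decomposition
would apply.

```cpp
#include <cmath>

// Scalar sketch of a structure-preserving doubling iteration for
//   x = a^2 x - (a b x)^2 / (r + b^2 x) + q
// Matrix form of the recurrences (one common formulation):
//   W       = I + G_k H_k
//   A_{k+1} = A_k W^-1 A_k
//   G_{k+1} = G_k + A_k W^-1 G_k A_k^T
//   H_{k+1} = H_k + A_k^T H_k W^-1 A_k
// H_k converges to the DARE solution X.
// SolveScalarDARE is a hypothetical name, not part of any library API.
double SolveScalarDARE(double a, double b, double q, double r,
                       int maxIterations = 50, double tolerance = 1e-12) {
  double g = b * b / r;  // G_0 = B R^-1 B^T
  double h = q;          // H_0 = Q

  for (int i = 0; i < maxIterations; ++i) {
    // In the matrix case this is the system solved with LU
    double w = 1.0 + g * h;

    double aNext = a * a / w;
    double gNext = g + a * a * g / w;
    double hNext = h + a * a * h / w;

    // Stop once the relative change in H is below the tolerance
    bool converged = std::abs(hNext - h) <= tolerance * std::abs(hNext);

    a = aNext;
    g = gNext;
    h = hNext;

    if (converged) {
      break;
    }
  }

  return h;
}
```

With a = b = q = r = 1 the equation reduces to x^2 - x - 1 = 0, so
SolveScalarDARE(1.0, 1.0, 1.0, 1.0) should approach (1 + sqrt(5)) / 2.
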
I timed the DARE unit tests, and the new solver is 0 to 100% faster than
Drake's in every case (that is, it's never slower and is up to twice as
fast).
The new solver is also much simpler, takes less time to compile, and
drops the libwpimath.so size from 325 MB to 301 MB.
I think most of the compilation time comes from the eigenvalue
decompositions used to enforce argument preconditions.