response to CB

2020-06-30 14:37:58 +02:00 · 2020-06-30 14:37:58 +02:00 · 2170984cc8
commit 2170984cc8
parent 60abe4ca7d
2 changed files with 224 additions and 0 deletions
--- a/doc/audit/report-202005.pdf
+++ b/doc/audit/report-202005.pdf
--- a/doc/audit/response-202005.tex
+++ b/doc/audit/response-202005.tex
@ -0,0 +1,224 @@
 \documentclass[11pt]{article}
 \oddsidemargin=0in \evensidemargin=0in
 \textwidth=6.2in \textheight=8.7in
 %\topmargin=-0.2in
 \usepackage[ansinew]{inputenc}
 \usepackage{makeidx,amsmath,amssymb,exscale,multicol,epsfig,graphics}
 \begin{document}
 \pagestyle{headings}
 \title{Preliminary response to the \\ GNU Taler security audit in Q2/Q3 2020}
 \author{Christian Grothoff \and Florian Dold}
 \maketitle
 \section{Abstract}
 This is the preliminary response to the source code audit report CodeBlau
 created for GNU Taler in Q2/Q3 2020.  A final response with more details is
 expected later this year.
 \section{Management Summary}
 We thank CodeBlau for their detailed report and thorough analysis. We are
 particularly impressed that they reported issues against components that were
 not even in-scope, and also that they found an {\em interesting} new corner
 case we had not previously considered. Finally, we also find several of their
 architectural recommendations to strengthen security to be worthwhile, and
 while some were already on our long-term roadmap, we will reprioritize our
 roadmap given their recommendations.
 Given our extensive discussions with CodeBlau, we also have the impression
 that they really took the time to understand the system, and look forward
 to working with CodeBlau as a competent auditor for GNU Taler in the future.
 \section{Issues in the exchange}
 We agree with the issues CodeBlau discovered and both parties believe that
 they have all been addressed.
 \section{Issues in the auditor}
 We appreciate CodeBlau's extensive list of checks the Taler auditor performs,
 which was previously not documented adequately by us. We agree that the
 auditor still needs more comprehensive documentation.
 As for issue \#6416, we agree with the analysis and the proposed fix, even if
 the implications are not fully clear. It has not yet been implemented as we
 want to carefully review all of the SQL statements implicated in the
 resolution and ensure we fully understand the implications.
 \section{Issues in GNUnet}
 We agree with the issues CodeBlau discovered and both parties believe that
 they have all been addressed.
 \section{General remarks on the code}
 We understand that writing the code in another programming language may make
 certain checks for the auditor less work. However, other programming languages
 also have disadvantages (from the complexity of the languages to the
 complexity of the compilers to tool support).  We believe creating a parallel
 implementation in other languages would provide advantages, especially with
 respect to avoiding ``the implementation is the specification''-style issues.
 However, given limited resources will not make this a priority.
 We disagree that all modern software development has embraced the idea that
 memory errors are to be handled in ways other than terminating or restarting
 the process. Many programming languages (Erlang, Java) hardly offer any other
 means of handling out-of-memory situations than to terminate the process. We
 also insist that Taler {\em does} handle out-of-memory as it does have code
 that terminates the process (we do {\em not} simply ignore the return value
 from {\tt malloc()} or other allocation functions!). We simply consider that
 terminating the process (which is run by a hypervisor that will restart the
 service) is the correct way to handle out-of-memory situations.  We also have
 limits in place that should prevent attackers from causing large amounts of
 memory to be consumed, and also have code to automatically preemptively
 restart the process to guard against memory exhaustion from memory
 fragmentation.  Finally, a common problem with abrupt termination may be
 corrupted files.  However, the code mostly only reads from files and limits
 writing to the Postgres database. Hence, there is no possibility of corrupt
 files being left behind even in the case of abnormal termination.
 \section{More specs and documentation code}
 We agree with the recommendation that the documentation should be improved,
 and will try to improve it along the lines recommended by CodeBlau.
 \section{Protocol change: API for uniformuly distributed seeds}
 We agree with the suggestion, have made the necessary changes, and both
 parties believe that the suggestion has been implemented.
 \section{Reduce code complexity}
 \subsection{Reduce global variables}
 While we do not disagree with the general goal to have few global variables,
 we also believe that there are cases where global variables make sense.
 We have already tried to minimize the scope of variables. The remaining few
 global variables are largely ``read-only'' configuration data. The report does
 not point out specific instances that would be particularly beneficial to
 eliminate. As we continue to work on the code, we will of course evaluate
 whether the removal of a particular global variable would make the code
 cleaner.
 Also, we want to point out that all global variables we introduce
 in the exchange are indicated with a prefix {\tt TEH\_} in the code, so they
 are easy to identify as such.
 \subsection{Callbacks, type p(r)unning}
 We understand that higher order functions in C can be confusing, but this
 is also a common pattern to enable code re-use and asynchronous execution
 which is essential for network applications. We do not believe that we
 use callbacks {\em excessively}.  Rewriting the code in another language
 may indeed make this part easier to understand, alas would have other
 disadvantages as pointed out previously.
 \subsection{Initializing structs with memset}
 Using {\tt memset()} first prevents compiler (or valgrind) warnings about
 using uninitialized memory, possibly hiding bugs. We also do use struct
 initialization in many cases.
 The GNUnet-wrappers are generally designed to be ``safer'' or ``stricter''
 variants of the corresponding libc functions, and not merely ``the same''.
 Hence we do not believe that renaming {\tt GNUNET\_malloc} is indicated.
 The argument that {\tt memset()}ing first makes the code inherently more
 obvious also seems fallacious, as it would commonly result in dead stores,
 which can confuse developers and produce false-positive warnings from static
 analysis tools.
 \subsection{NULL pointer handling}
 The problem with the ``goto fail'' style error handling is that it rarely
 results in specific error handling where diagnostics are created that are
 specific to the error. Using this style of programming encourages developers
 to create simplistic error handling, which can result in inappropriate error
 handling logic and also makes it harder to attribute errors to the specific
 cause.
 However, we have no prohibition on using this style of error handling either:
 if it is appropriate, develpers should make a case-by-case decision as to how
 to best handle a specific error.
 We have made some first changes to how {\tt GNUNET\_free()} works in response
 to the report, and will discuss further changes with the GNUnet development
 team.
 \subsection{Hidden security assumptions}
 We disagree that the assumptions stated are ``hidden'', as (1) the Taler code
 has its own checks to warrant that the requirements of the {\tt
  GNUNET\_malloc()} API are satisfied (so enforcement is not limited to the
 abstraction layer), and (2) the maximum allocation size limit is quite clearly
 specified in the GNUnet documentation.  Also, the GNUnet-functions are not
 merely an abstraction layer for portability, but they provided extended
 semantics that we rely upon. So it is not like it is possible to swap this
 layer and expect anything to continue to work.
 When we use the libjansson library, it is understood that it does not use
 the GNUnet operations, and the code is careful about this distinction.
 \subsection{Get rid of boolean function arguments}
 We agree that this can make the code more readable, and have in some places
 already changed the code in this way.
 \section{Structural Recommendation}
 \subsection{Least privilege}
 It is wrong to say that GNU Taler has ``no work done'' on privilege separation.
 For example, the {\tt taler-exchange-dbinit} tool is the only tool that requires
 CREATE, ALTER and DROP rights on database tables, thus enusring that the ``main''
 process does not need these rights.
 We also already had the {\tt taler-exchange-keyup} tool responsible for
 initializing keys. In response to the audit, we already changed the GNUnet API
 to make sure that tools do not create keys as a side-effect of trying to read
 non-existent key files.
 We agree with the recommendation on further privilege separation for access
 to cryptographic keys, and intend to implement this in the near future.
 \subsection{File system access}
 The auditor helpers actually only read from the file system, only the LaTeX
 invocation to compile the final report to PDF inherently needs write
 access. We do not predict that we will retool LaTeX.  Also, the file system
 access is completely uncritical, as the auditor by design runs on a system
 that is separate from the production exchange system.
 Because that system will not have {\em any} crypto keys (not even the one of
 the auditor!), CodeBlau is wrong to assume that reading from or writing to the
 file system represents a security threat.
 We have started to better document the operational requirements on running the
 auditor.
 \subsection{Avoid dlopen}
 Taler actually uses {\tt ltdlopen()} from GNU libtool, which provides
 compiler flags to conver the dynamic linkage into static linkage.  For
 development, dynamic linkage has many advantages.
 We plan to test and document how to build GNU Taler with only static
 linkage, and will recommend this style of deployment for the Taler
 exchange for production.
 \subsection{Reduce reliance on PostgreSQL}
 Using other mechanisms beyond the database as a ``Plan B'' would create
 serious availability and cost concerns, as now either mechanism may create
 serialization issues and require database rollbacks.  Also, any such
 append-only logging mechanism would itself have a similar complexity as the
 primary database.  Thus, we do not believe that the drastic complexity
 increase from the combined solution represents a valid security trade-off.
 \end{document}