From 7ebde15709591796f37311e3f815eb41fc33cad8 Mon Sep 17 00:00:00 2001 From: John Winans Date: Tue, 18 Aug 2020 16:04:52 -0500 Subject: [PATCH] Cleanup signed, unsigned, adding, & overflow --- book/binary/chapter.tex | 223 ++++++++++++++++++++++++++++------------ 1 file changed, 157 insertions(+), 66 deletions(-) diff --git a/book/binary/chapter.tex b/book/binary/chapter.tex index 7612242..f82a97f 100644 --- a/book/binary/chapter.tex +++ b/book/binary/chapter.tex @@ -564,46 +564,42 @@ Binary: 1 1 1 1 1 1 1 1 \ldots because: $-128+64+32+16+8+4+2+1=-1$. +This format has the virtue of allowing the same addition logic discussed above to be +used to calculate the sums of signed numbers as unsigned numbers. -Calculating $4+5 = 9$ +Calculating the signed addition: $4+5 = 9$ \begin{verbatim} - 1 <== carries - 000100 <== 4 - +000101 <== 5 - ------ - 001001 <== 9 + 1 <== carries + 000100 <== 4 = 0 + 0 + 0 + 4 + 0 + 0 + +000101 <== 5 = 0 + 0 + 0 + 4 + 0 + 1 + ------- + 001001 <== 9 = 0 + 0 + 8 + 0 + 0 + 1 \end{verbatim} -Calculating $-4+ -5 = -9$ +Calculating the signed addition: $-4+ -5 = -9$ \begin{verbatim} - 1 11 <== carries - 111100 <== -4 - +111011 <== -5 + 1 11 <== carries + 111100 <== -4 = -32 + 16 + 8 + 4 + 0 + 0 + +111011 <== -5 = -32 + 16 + 8 + 0 + 2 + 1 --------- - 1 110111 <== -9 (with a truncation) - - -32 16 8 4 2 1 - 1 1 0 1 1 1 - -32 + 16 + 4 + 2 + 1 = -9 + 1 110111 <== -9 (with a truncation) = -32 + 16 + 4 + 2 + 1 = -9 \end{verbatim} -This format has the virtue of allowing the same addition logic -discussed above to be used to calculate $-1+1=0$. +Calculating the signed addition: $-1+1=0$ \begin{verbatim} -128 64 32 16 8 4 2 1 <== place value - 1 1 1 1 1 1 1 1 0 <== carries + 1 1 1 1 1 1 1 1 <== carries 1 1 1 1 1 1 1 1 <== addend (-1) + 0 0 0 0 0 0 0 1 <== addend (1) ---------------------- 1 0 0 0 0 0 0 0 0 <== sum (0 with a truncation) \end{verbatim} -In order for this to work, the carry out of the sum of the MSBs is -ignored. 
+{\em In order for this to work, the carry out of the sum of the MSBs {\bfseries must} be discarded.} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsubsection{Converting between Positive and Negative} @@ -612,8 +608,9 @@ Changing the sign on two's complement numbers can be described as inverting all of the bits (which is also known as the {\em one's complement}) and then add one. -For example, inverting the number four: +For example, negating the number four: +\begin{minipage}{\textwidth} \begin{verbatim} -128 64 32 16 8 4 2 1 0 0 0 0 0 1 0 0 <== 4 @@ -624,23 +621,24 @@ For example, inverting the number four: ---------------------- 1 1 1 1 1 1 0 0 <== -4 \end{verbatim} +\end{minipage} This can be verified by adding 5 to the result and observe that the sum is 1: \begin{verbatim} -128 64 32 16 8 4 2 1 - 1 1 1 1 1 <== carries + 1 1 1 1 1 1 <== carries 1 1 1 1 1 1 0 0 <== -4 + 0 0 0 0 0 1 0 1 <== 5 ---------------------- - 1 0 0 0 0 0 0 0 1 + 1 0 0 0 0 0 0 0 1 <== 1 (with a truncation) \end{verbatim} Note that the changing of the sign using this method is symmetric in that it is identical when converting from negative to positive -and when converting from positive to negative: flip the bits and -add 1. +and when converting from positive to negative: {\em flip the bits and +add 1.} For example, changing the value -4 to 4 to illustrate the reverse of the conversion above: @@ -661,45 +659,56 @@ reverse of the conversion above: \subsection{Subtraction of Binary Numbers} -Subtraction of binary numbers is performed by first negating -the subtrahend and then adding the two numbers. Due to the -nature of two's complement numbers this will work for both -signed and unsigned numbers. +Subtraction% \enote{This section needs more examples of subtracting signed and unsigned numbers and a discussion on how signedness is not relevant until the results are interpreted. 
For example, adding $-4+ -8=-12$ using two 8-bit numbers is the same as adding $252+248=500$ and truncating the result to 244.} +of binary numbers is performed by first negating +the subtrahend and then adding the two numbers. Due to the +nature of two's complement numbers this method will work for both +signed and unsigned numbers! -To calculate $-4-8 = -12$ +Observation: Since we always have a carry-in of zero into the LSB when +adding, we can take advantage of that fact by (ab)using that carry input +to add the extra 1 that is needed when changing the sign of the +subtrahend, as is done in the examples below. -\enote{This example is unclear. That the adding of one to the subtrahend -has to be done as part of the same operation as the sum of the two values. -otherwise adding 1000 to 0001 will {\em not} result in a proper overflow -staus as discussed below.} +An example showing the subtraction of two {\em signed} binary numbers: $-4-8 = -12$ \begin{verbatim} -128 64 32 16 8 4 2 1 1 1 1 1 1 1 0 0 <== -4 (minuend) - 0 0 0 0 1 0 0 0 <== 8 (subtrahend) + ------------------------ - 1 1 1 <== carries - 1 1 1 1 0 1 1 1 <== one's complement of -8 - + 0 0 0 0 0 0 0 1 <== plus 1 - ---------------------- - 1 1 1 1 1 0 0 0 <== -8 - - - 1 1 1 1 <== carries + 1 1 1 1 1 1 1 1 1 <== carries 1 1 1 1 1 1 0 0 <== -4 - + 1 1 1 1 1 0 0 0 <== -8 - ---------------------- - 1 1 1 1 1 0 1 0 0 < == -12 + + 1 1 1 1 0 1 1 1 <== one's complement of 8 + ------------------------ + 1 1 1 1 1 0 1 0 0 <== -12 \end{verbatim} +%An example showing the subtraction of two {\em unsigned} binary numbers: $252+248=500$ +% +%\begin{verbatim} +% 128 64 32 16 8 4 2 1 +% +% 1 1 1 1 1 <== carries +% 1 1 1 1 1 1 0 0 <== 252 +% + 1 1 1 1 1 0 0 0 <== 248 +% ---------------------- +% 1 1 1 1 1 0 1 0 0 < == 500 (if we do NOT truncate the MSB) +%\end{verbatim} +% +%An example showing the subtraction of two {\em unsigned} binary numbers: $252+248=500$ + + + 
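+The carry-in trick described above can be checked with ordinary arithmetic.
+As a sketch (the symbols $a$, $b$, $\overline{b}$ and $n$ are introduced only for this illustration),
+let $a$ and $b$ be $n$-bit values and let $\overline{b} = (2^n - 1) - b$ be the one's complement of $b$.
+Then:
+
+\[
+a - b \equiv a + \overline{b} + 1 \pmod{2^n}
+\]
+
+Reading the bit patterns from the $-4-8$ example as unsigned 8-bit values gives
+$a = 252$, $b = 8$ and $\overline{b} = 247$:
+
+\[
+252 + 247 + 1 = 500 \equiv 244 \pmod{2^8}
+\]
+
+The truncated result, $11110100_2$, is 244 when interpreted as unsigned and $-12$
+when interpreted as signed.
+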
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Truncation} @@ -707,10 +716,64 @@ staus as discussed below.} \index{overflow} \index{carry} -So far we have been ignoring (truncating) the carries that can come from -the MSBs when adding and subtracting. We have also been ignoring the -potential impact of a carry causing a signed number to change its sign in -a destructive way. +Discarding the carry bit that can be generated from the MSB is called {\em truncation}. + +So far we have been ignoring the carries that can come from the MSBs when adding and subtracting. +We have also been ignoring the potential impact of a carry causing a signed number to change +its sign in an unexpected way. + +In the examples above, truncating the results either 1) had no impact on the calculated sums +or 2) was absolutely necessary to correct the sum in cases such as: $-4 + 5$. + +However, note what happens when we try to subtract 1 from the most +negative value that we can represent in a 4-bit two's complement number: + +\begin{verbatim} + -8 4 2 1 + 1 0 0 0 <== -8 (minuend) + - 0 0 0 1 <== 1 (subtrahend) + ------------ + + + 1 1 <== carries + 1 0 0 0 <== -8 + + 1 1 1 0 <== one's complement of 1 + ---------- + 1 0 1 1 1 <== this SHOULD be -9 but with truncation it is 7 +\end{verbatim} + +The problem with this example is that we cannot represent $-9_{10}$ using a 4-bit +two's complement number. + +Granted, if we had used 5-bit numbers, then the ``answer'' would have fit OK. +But the same problem would return when trying to calculate $-16 - 1$. +So simply ``making more room'' does not solve this problem. + +%However, as calculating $-1+1=0$ has demonstrated above, it was necessary for that +%case to discard the carry out of the MSB to get the correct result. + +%In the case of calculating $-1+1=0$ the addends and result all fit into same-sized +%(8-bit) values. 
When calculating $-8-1=-9$ the addends each can fit into 4-bit +%two's complement numbers but the result would require a 5-bit number. + +This is not just a problem when subtracting, nor is it just a problem with +signed numbers. + +The same situation can happen with {\em unsigned} numbers. +For example: + +\begin{verbatim} + 8 4 2 1 + 1 1 1 0 0 <== carries + 1 1 1 0 <== 14 (addend) + + 0 0 1 1 <== 3 (addend) + ------------ + 1 0 0 0 1 <== this SHOULD be 17 but with truncation it is 1 +\end{verbatim} + + +How to handle such a truncation depends on whether the {\em original} values +being added are signed or unsigned. The RV ISA refers to the discarding the carry out of the MSB after an add (or subtract) of two {\em unsigned} numbers as an {\em unsigned overflow}% @@ -728,38 +791,66 @@ When adding {\em unsigned} numbers, an overflow only occurs when there is a carry out of the MSB resulting in a sum that is truncated to fit into the number of bits allocated to contain the result. -When subtracting {\em unsigned} numbers, an overflow only occurs when the -difference is negative (because there are no negative unsigned numbers.) - -\autoref{sum:240+17} illustrates an unsigned overflow. +\autoref{sum:240+17} illustrates an unsigned overflow during addition: \begin{figure}[H] \centering \begin{BVerbatim} - 1 1 1 1 0 0 0 0 <== carries - 1 1 1 1 0 0 0 0 <== 240 - + 0 0 0 1 0 0 0 1 <== 17 + 1 1 1 1 0 0 0 0 0 <== carries + 1 1 1 1 0 0 0 0 <== 240 + + 0 0 0 1 0 0 0 1 <== 17 --------------------- - 0 0 0 0 0 0 0 1 <== sum = 1 + 1 0 0 0 0 0 0 0 1 <== sum = 1 \end{BVerbatim} %{\captionof{figure}{$240+16=0$ (overflow)}\label{sum:240+17}} \caption{$240+17=1$ (overflow)} \label{sum:240+17} \end{figure} -\enote{Need to add an example of an unsigned overflow after a subtraction. 
-When subtracting by adding the two's complement of the subtrahend, the unsigned -overflow status is represented by a 0 carry out of the most significant bit!} Some times an overflow like this is referred to as a {\em wrap around} because of the way that successive additions will result in a value that increases until it {\em wraps} back {\em around} to zero and then returns to increasing in value until it, again, wraps around again. \begin{tcolorbox} -An {\em unsigned overflow} occurs when ever there is a carry +When adding, {\em unsigned overflow} occurs whenever there is a carry {\em out of} the most significant bit. \end{tcolorbox} + + +When subtracting {\em unsigned} numbers, an overflow only occurs when the +subtrahend is greater than the minuend (because in those cases the +difference would have to be negative and there are no negative values +that can be represented with an unsigned binary number.) + +\autoref{sum:3-4} illustrates an unsigned overflow during subtraction: + +\begin{figure}[H] +\centering +\begin{BVerbatim} + 0 0 0 0 0 0 1 1 <== 3 (minuend) + - 0 0 0 0 0 1 0 0 <== 4 (subtrahend) + ----------------- + + + 0 0 0 0 0 0 1 1 1 <== carries + 0 0 0 0 0 0 1 1 <== 3 + + 1 1 1 1 1 0 1 1 <== one's complement of 4 + ----------------- + 1 1 1 1 1 1 1 1 <== 255 (overflow) +\end{BVerbatim} +\caption{$3-4=255$ (overflow)} +\label{sum:3-4} +\end{figure} + +\begin{tcolorbox} +When subtracting, {\em unsigned overflow} occurs whenever there is {\em not} a carry +{\em out of} the most significant bit (IFF the carry-in on the LSB is used to add the +extra 1 to the subtrahend when changing its sign.) +\end{tcolorbox} + + %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsubsection{Signed Overflow} \index{overflow!signed} @@ -781,7 +872,7 @@ while looking more closely at the carry values. 
\begin{figure}[H] \centering \begin{BVerbatim} - 0 1 0 0 0 0 0 0 <== carries + 0 1 0 0 0 0 0 0 0 <== carries 0 1 0 0 0 0 0 0 <== 64 + 0 1 0 0 0 0 0 0 <== 64 --------------------- @@ -811,7 +902,7 @@ We say that this result has been {\em truncated}. \begin{figure}[H] \centering \begin{BVerbatim} - 1 0 0 0 0 0 0 0 <== carries + 1 0 0 0 0 0 0 0 0 <== carries 1 0 0 0 0 0 0 0 <== -128 + 1 0 0 0 0 0 0 0 <== -128 --------------------- @@ -830,7 +921,7 @@ do not have the same sign. \begin{figure}[H] \centering \begin{BVerbatim} - 1 1 1 1 1 1 1 1 <== carries + 1 1 1 1 1 1 1 1 0 <== carries 1 1 1 1 1 1 0 1 <== -3 + 1 1 1 1 1 0 1 1 <== -5 --------------------- @@ -843,7 +934,7 @@ do not have the same sign. \begin{figure}[H] \centering \begin{BVerbatim} - 1 1 1 1 1 1 1 0 <== carries + 1 1 1 1 1 1 1 0 0 <== carries 1 1 1 1 1 1 1 0 <== -2 + 0 0 0 0 1 0 1 0 <== 10 --------------------- @@ -862,7 +953,7 @@ the most negative value as shown in \autoref{sum:127+1}. \begin{figure}[H] \centering \begin{BVerbatim} - 0 1 1 1 1 1 1 1 <== carries + 0 1 1 1 1 1 1 1 0 <== carries 0 1 1 1 1 1 1 1 <== 127 + 0 0 0 0 0 0 0 1 <== 1 ---------------------