From caaa6b3599b13f1857884f3f7842b4901e8eecd6 Mon Sep 17 00:00:00 2001 From: John Winans Date: Thu, 24 May 2018 05:42:18 -0500 Subject: [PATCH] Truth tables, msb/lsb, alignment, sign-extend. --- book/binary/chapter.tex | 357 +++++++++++++++++++++++++++++++--------- book/rv32/chapter.tex | 44 +---- 2 files changed, 277 insertions(+), 124 deletions(-) diff --git a/book/binary/chapter.tex b/book/binary/chapter.tex index a709ab5..82cdc3b 100644 --- a/book/binary/chapter.tex +++ b/book/binary/chapter.tex @@ -11,6 +11,17 @@ RISC-V assembly language uses binary to represent all values, be they boolean or numeric. It is the context within which they are used that determines whether they are boolean or numeric. +\enote{Add some diagrams here showing bits, bytes and the MSB, +LSB,\ldots\ perhaps relocated from the RV32I chapter?} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{Boolean Functions} + +Boolean functions apply on a per-bit basis. +When applied to multi-bit values, each bit position is operated upon +independently of the other bits. + RISC-V assembly language uses zero to represent {\em false} and one to represent {\em true}. In general, however, it is useful to relax this and define zero {\bf and only zero} to be {\em false} and anything @@ -19,27 +30,11 @@ that is not {\em false} is therefore {\em true}.% many other languages as well as the common assembly language idioms discussed in this text.} -\enote{Add some diagrams here showing bits, bytes and the MSB, -LSB,\ldots\ perhaps relocated from the RV32I chapter?}% The reason for this relaxation is because, while a single binary digit (\gls{bit}) can represent the two values zero and one, the vast majority of the time data is processed by the CPU in groups of bits. These -groups have names like \gls{byte}, \gls{halfword} and \gls{fullword}. 
- - - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\section{Boolean Functions} - -\enote{Probably should add basic truth table diagrams.}% -Boolean functions apply on a per-bit basis. -%in that they do not impact neighboring bits. -%by generating things like a carry or a borrow. -When applied to multi-bit values, each bit position is operated upon -independently of the other bits. - - +groups have names like \gls{byte} (8 bits), \gls{halfword} (16 bits) +and \gls{fullword} (32 bits). %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -54,9 +49,35 @@ If the input is 1 then the output is 0. If the input is 0 then the output is 1. In other words, the output value is {\em not} that of the input value. -This text will use the operator used in the C language when discussing +Expressing the {\em not} function in the form of a truth table: + +\begin{center} +\begin{tabular}{c|c} +A & $\overline{\mbox{A}}$\\ +\hline +0 & 1 \\ +1 & 0 \\ +\end{tabular} +\end{center} + +A truth table is drawn by indicating all of the possible input values on +the left of the vertical bar with each row displaying the output values +that correspond to the input for that row. The column headings are used +to define the illustrated operation expressed using a mathematical +notation. The {\em not} operation is indicated by the presence of +an {\em overline}. + +In computer programming languages, things like an overline cannot be +efficiently expressed using a standard keyboard. Therefore it is common +to use a notation such as that used by the C language when discussing the {\em NOT} operator in symbolic form. Specifically the tilde: `\verb@~@'. +It is also uncommon for programming languages to express boolean operations +on single-bit input(s). 
A more generalized operation is used that applies +to a set of bits all at once. For example, performing a {\em not} operation +on eight bits at once can be illustrated as: + + \begin{verbatim} ~ 1 1 1 1 0 1 0 1 <== A ----------------- @@ -73,12 +94,30 @@ The boolean {\em and} function has two or more inputs and the output is a single bit. The output is 1 if and only if all of the input values are 1. Otherwise it is 0. +This function works like it does in spoken language. For example +if A is 1 {\em AND} B is 1 then the output is 1 (true). +Otherwise the output is 0 (false). + +In mathematical notation, the {\em and} operator is expressed the same way +as is {\em multiplication}. That is by a raised dot between, or by +juxtaposition of, two variable names. It is also worth noting that, +in base-2, the {\em and} operation actually {\em is} multiplication! + +\begin{center} +\begin{tabular}{cc|c} +A & B & AB \\ +\hline +0 & 0 & 0 \\ +0 & 1 & 0 \\ +1 & 0 & 0 \\ +1 & 1 & 1 \\ +\end{tabular} +\end{center} + This text will use the operator used in the C language when discussing the {\em AND} operator in symbolic form. Specifically the ampersand: `\verb@&@'. -This function works like it does in spoken language. For example -if A is 1 {\em AND} B is 1 then the output is 1 (true). -Otherwise the output is 0 (false). For example: +An eight-bit example: \begin{verbatim} 1 1 1 1 0 1 0 1 <== A @@ -96,12 +135,28 @@ In a line of code the above might read like this: \verb@output = A & B@ The boolean {\em or} function has two or more inputs and the output is a single bit. The output is 1 if at least one of the input values are 1. +This function works like it does in spoken language. For example +if A is 1 {\em OR} B is 1 then the output is 1 (true). +Otherwise the output is 0 (false). + +In mathematical notation, the {\em or} operator is expressed using the plus +sign ($+$). 
+ +\begin{center} +\begin{tabular}{cc|c} +A & B & A$+$B \\ +\hline +0 & 0 & 0 \\ +0 & 1 & 1 \\ +1 & 0 & 1 \\ +1 & 1 & 1 \\ +\end{tabular} +\end{center} + This text will use the operator used in the C language when discussing the {\em OR} operator in symbolic form. Specifically the pipe: `\verb@|@'. -This function works like it does in spoken language. For example -if A is 1 {\em OR} B is 1 then the output is 1 (true). -Otherwise the output is 0 (false). For example: +An eight-bit example: \begin{verbatim} 1 1 1 1 0 1 0 1 <== A @@ -120,14 +175,29 @@ The boolean {\em exclusive or} function has two or more inputs and the output is a single bit. The output is 1 if only an odd number of inputs are 1. Otherwise the output will be 0. -This text will use the operator used in the C language when discussing -the {\em XOR} operator in symbolic form. Specifically the carrot: `\verb@^@'. - Note that when {\em XOR} is used with two inputs, the output is set to 1 (true) when the inputs have different values and 0 (false) when the inputs both have the same value. -For example: +In mathematical notation, the {\em xor} operator is expressed using the plus +in a circle ($\oplus$). + +\begin{center} +\begin{tabular}{cc|c} +A & B & A$\oplus{}$B \\ +\hline +0 & 0 & 0 \\ +0 & 1 & 1 \\ +1 & 0 & 1 \\ +1 & 1 & 0 \\ +\end{tabular} +\end{center} + +This text will use the operator used in the C language when discussing +the {\em XOR} operator in symbolic form. Specifically the caret: `\verb@^@'. + + +An eight-bit example: \begin{verbatim} 1 1 1 1 0 1 0 1 <== A @@ -200,9 +270,9 @@ $10^2$ & $10^1$ & $10^0$ & $2^7$ & $2^6$ & $2^5$ & $2^4$ & $2^3$ & $2^2$ & $2^1$ One way to look at this table is on a per-row basis where each place value is represented by the base raised to the power of the place value -position (shown in the column headings.) This is useful when -converting arbitrary values between bases. 
For example to interpret -the decimal value on the fourth row: +position (shown in the column headings.) +%This is useful when converting arbitrary numeric values between bases. +For example to interpret the decimal value on the fourth row: \begin{equation} 0 \times 10^2 + 0 \times 10^1 + 3 \times 10^0 = 3_{10} @@ -220,6 +290,15 @@ Interpreting the hexadecimal value on the fourth row by converting it to decimal 0 \times 16^1 + 3 \times 16^0 = 3_{10} \end{equation} +\index{Most significant bit}\index{MSB|see {Most significant bit}}% +\index{Least significant bit}\index{LSB|see {Least significant bit}}% +We refer to the place value with the largest exponent (the one furthest to the +left for any given base) as the {\em most significant} digit and the place value +with the lowest exponent as the {\em least significant} digit. For binary +numbers these are the \acrfull{msb} and \acrfull{lsb} respectively.% +\footnote{Changing the value of the MSB will have a more {\em significant} +impact on the numeric value than changing the value of the LSB.} + Another way to look at this table is on a per-column basis. When tasked with drawing such a table by hand, it might be useful @@ -242,8 +321,8 @@ for readability. The relationship between binary and hex values is also worth taking note. Because $2^4 = 16$, there is a clean and simple grouping -of 4 \gls{bit}s to 1 \gls{hit}. There is no such relationship -between binary and decimal. +of 4 \gls{bit}s to 1 \gls{hit} (aka \gls{nybble}). +There is no such relationship between binary and decimal. Writing and reading numbers in binary that are longer than 8 bits is cumbersome and prone to error. 
The simple conversion between @@ -283,8 +362,9 @@ To convert from binary to decimal, put the decimal value of the place values {\ldots8 4 2 1} over the binary digits like this: \begin{verbatim} - 128 64 32 16 8 4 2 1 - 0 0 0 1 1 0 1 1 +Base-2 place values: 128 64 32 16 8 4 2 1 +Binary: 0 0 0 1 1 0 1 1 +Decimal: 16 +8 +2 +1 = 27 \end{verbatim} Now sum the place-values that are expressed in decimal for each @@ -303,9 +383,9 @@ extend the binary to the left with zeros to make it so. Grouping the bits into sets of four and summing: \begin{verbatim} -Place: 8 4 2 1 8 4 2 1 8 4 2 1 8 4 2 1 -Binary: 0 1 1 0 1 1 0 1 1 0 1 0 1 1 1 0 -Decimal: 4+2 =6 8+4+ 1=13 8+ 2 =10 8+4+2 =14 +Base-2 place values: 8 4 2 1 8 4 2 1 8 4 2 1 8 4 2 1 +Binary: 0 1 1 0 1 1 0 1 1 0 1 0 1 1 1 0 +Decimal: 4+2 =6 8+4+ 1=13 8+ 2 =10 8+4+2 =14 \end{verbatim} After the summing, convert each decimal value to hex. The decimal @@ -339,10 +419,9 @@ method discussed in \autoref{section:bindec}. For example: \begin{verbatim} -Hex: 4 C -Binary: 0 1 0 0 1 1 0 0 -Decimal: 128 64 32 16 8 4 2 1 -Sum: 64+ 8+4 = 76 +Hex: 7 C +Decimal Sum: 4+2+1=7 8+4 =12 +Binary: 0 1 1 1 1 1 0 0 \end{verbatim} @@ -356,8 +435,9 @@ of the place values that would yield a non-negative result. For example, to convert $1234_{10}$ to binary: + \begin{verbatim} -Place values: 2048-1024-512-256-128-64-32-16-8-4-2-1 +Base-2 place values: 2048-1024-512-256-128-64-32-16-8-4-2-1 0 2048 (too big) 1 1234 - 1024 = 210 @@ -429,20 +509,46 @@ negate the place value of the \acrshort{msb}. For example, the number one is represented the same as discussed before: \begin{verbatim} - -128 64 32 16 8 4 2 1 - 0 0 0 0 0 0 0 1 +Base-2 place values: -128 64 32 16 8 4 2 1 +Binary: 0 0 0 0 0 0 0 1 \end{verbatim} The \acrshort{msb} of any negative number in this format will always be 1. 
For example the value $-1_{10}$ is: \begin{verbatim} - -128 64 32 16 8 4 2 1 - 1 1 1 1 1 1 1 1 +Base-2 place values: -128 64 32 16 8 4 2 1 +Binary: 1 1 1 1 1 1 1 1 \end{verbatim} \ldots because: $-128+64+32+16+8+4+2+1=-1$. + +Calculating $4+5 = 9$ + +\begin{verbatim} + 1 <== carries + 000100 <== 4 + +000101 <== 5 + ------ + 001001 <== 9 +\end{verbatim} + +Calculating $-4+ -5 = -9$ + +\begin{verbatim} + 1 11 <== carries + 111100 <== -4 + +111011 <== -5 + --------- + 1 110111 <== -9 (with a truncation) + + -32 16 8 4 2 1 + 1 1 0 1 1 1 + -32 + 16 + 4 + 2 + 1 = -9 +\end{verbatim} + + This format has the virtue of allowing the same addition logic discussed above to be used to calculate $-1+1=0$. @@ -452,11 +558,11 @@ discussed above to be used to calculate $-1+1=0$. 1 1 1 1 1 1 1 1 <== addend (-1) + 0 0 0 0 0 0 0 1 <== addend (1) ---------------------- - 1 0 0 0 0 0 0 0 0 <== sum (0 with an overflow) + 1 0 0 0 0 0 0 0 0 <== sum (0 with a truncation) \end{verbatim} -In order for this to work, the \gls{overflow} carry out of the -sum of the MSBs is ignored. +In order for this to work, the carry out of the sum of the MSBs is +ignored. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsubsection{Converting between Positive and Negative} @@ -623,10 +729,78 @@ the results of ADD and ADDW on the operands. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Sign and Zero Extension} -\enote{Refactor the sx() and zx() discussion in the RV32I chapter -and locate the details here.}% -Seems like a good place to discuss extension. 
+\index{sign extension} +\label{SignExtension} +Due to the nature of the two's complement encoding scheme, the following +numbers all represent the same value: +\begin{verbatim} + 1111 <== -1 + 11111111 <== -1 + 11111111111111111111 <== -1 + 1111111111111111111111111111 <== -1 +\end{verbatim} +As do these: +\begin{verbatim} + 01100 <== 12 + 0000001100 <== 12 + 00000000000000000000000000000001100 <== 12 +\end{verbatim} +The phenomenon illustrated here is called {\em sign extension}. That is, +any signed number can have any quantity of additional MSBs added to it, +provided that they repeat the value of the sign bit. + +\autoref{Figure:SignExtendNegative} illustrates extending the negative sign +bit of {\em val} to the left by replicating it. +When {\em val} is negative, its \acrshort{msb} (bit 19 in this example) will +be set to 1. Extending this value to the left will set all the new bits +to the left of it to 1 as well. + +\begin{figure}[ht] +\centering +\DrawBitBoxSignExtendedPicture{32}{10100000000000000010} +\captionof{figure}{Sign-extending a negative integer from 20 bits to 32 bits.} +\label{Figure:SignExtendNegative} +\end{figure} + +\autoref{Figure:SignExtendPositive} illustrates extending the positive sign +bit of {\em val} to the left by replicating it. +When {\em val} is positive, its \acrshort{msb} will be set to 0. Extending this +value to the left will set all the new bits to the left of it to 0 as well. + +\begin{figure}[ht] +\centering +\DrawBitBoxSignExtendedPicture{32}{01000000000000000010} +\captionof{figure}{Sign-extending a positive integer from 20 bits to 32 bits.} +\label{Figure:SignExtendPositive} +\end{figure} + + +\label{ZeroExtension} +In a similar vein, any {\em unsigned} number also may have any quantity of +additional MSBs added to it provided that they are all zero. 
For example, +the following all represent the same value: +\begin{verbatim} + 1111 <== 15 + 01111 <== 15 + 00000000000000000000000001111 <== 15 +\end{verbatim} + +The observation here is that any {\em unsigned} number may be +{\em zero extended} to any size. + +\autoref{Figure:ZeroExtend} illustrates zero-extending a 20-bit {\em val} to the +left to form a 32-bit fullword. + +\begin{figure}[ht] +\centering +\DrawBitBoxZeroExtendedPicture{32}{10000000000000000010} +\captionof{figure}{Zero-extending an unsigned integer from 20 bits to 32 bits.} +\label{Figure:ZeroExtend} +\end{figure} + +%Sign- and zero-extending binary numbers are common operations used to +%fit a byte or halfword into a fullword. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -694,28 +868,59 @@ value and TRUE if it is a conditional. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Alignment} -Draw a diagram showing the overlapping data types when they are all aligned. +\enote{Include the obligatory diagram showing the overlapping data types +when they are all aligned.}% +With respect to memory and storage, {\em \gls{alignment}} refers to the +{\em location} of a data element when the address at which it is stored is +a precise multiple of a power-of-2. + +The primary alignments of concern are typically 2 (a halfword), +4 (a fullword), 8 (a double word) and 16 (a quad-word) bytes. + +For example, any data element that is aligned to a 2-byte boundary +must have a (hex) address that ends in any of: 0, 2, 4, 6, 8, A, +C or E. +Any 4-byte aligned element must be located at an address ending +in 0, 4, 8 or C. An 8-byte aligned element at an address ending +with 0 or 8, and 16-byte aligned elements must be located at +addresses ending in zero. 
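The address endings listed above all follow from one rule: an address is aligned to a power-of-2 boundary when its low-order bits are all zero. As a minimal sketch in C, the check can be written with the {\em and} operator discussed earlier. The helper name \verb@is_aligned@ is hypothetical (it does not come from this text), and the sketch assumes 32-bit addresses and an alignment that is a nonzero power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when addr is a multiple of align.
 * Assumes align is a nonzero power of two (2, 4, 8, 16, ...), so that
 * align - 1 is a mask covering exactly the low-order bits that must
 * all be zero in an aligned address. */
static bool is_aligned(uint32_t addr, uint32_t align)
{
    return (addr & (align - 1u)) == 0u;
}
```

Under these assumptions, an address such as 0x1004 passes the 4-byte check, while 0x1006 passes only the 2-byte check because it ends in 6.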
+ +Such alignments are important when exchanging data between the CPU +and memory because the hardware implementations are optimized to +transfer aligned data. Therefore, a program whose data is aligned +will generally run faster. + +An element of data is considered to be {\em aligned to its natural size} +when its address is an exact multiple of the number of bytes used to +represent the data. Note that the ISA we are concerned with {\em only} +operates on elements that have sizes that are powers of two. + +For example, a 32-bit integer consumes one full word. If the four bytes +are stored in main memory at an address that is a multiple of 4 then +the integer is considered to be naturally aligned. + +The same would apply to 16-bit, 64-bit, 128-bit and other such values +as they fit into 2, 8 and 16 byte elements respectively. + +Some CPUs can deliver four (or more) bytes at the same time while others +might only be capable of delivering one or two bytes at a time. Such +differences in hardware typically impact the cost and performance of a +system.% +\footnote{The design and implementation +choices that determine how any given system operates are part of what is +called a system's {\em organization} and is beyond the scope of this text. +See~\cite{codriscv:2017} for more information on computer organization.} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Instruction Alignment} -\enote{Rewrite this section for data rather than instructions and then -note here that instructions must be naturally aligned. For RV32 that -is on a 4-byte boundary}% +The RISC-V ISA requires that all instructions be aligned to their +natural boundaries. + Every possible instruction that an RV32I CPU can execute contains -exactly 32 bits. Therefore each one must be stored in four bytes -of the main memory. 
- -To simplify the hardware, each instruction must be placed into four -adjacent bytes whose numeric address sequence begins with a multiple -four. For example, an instruction might be located in bytes -4, 5, 6 and 7 (but not in 5, 6, 7 and 8 nor in 9, 3, 1, and 0\ldots). - -This sort of addressing requirement is common and is referred to as -\gls{alignment}. An aligned instruction begins at a memory address -that is a multiple of four. An {\em unaligned} instruction would -be one beginning at any other address and is {\em illegal}. +exactly 32 bits. Therefore they are always stored on a full word +boundary. Any {\em unaligned} instruction is {\em illegal}. An attempt to fetch an instruction from an unaligned address will result in an error referred to as an alignment {\em \gls{exception}}. @@ -724,15 +929,3 @@ current instruction and start executing a different set of instructions that are prepared to handle the problem. Often an exception is handled by completely stopping the program in a way that is commonly referred to as a system or application {\em crash}. - -Given a properly aligned instruction address, the CPU can request -that the main memory locate and deliver the values of the four bytes -in the address sequence to the CPU using what is called a memory -read operation. Some systems can deliver four (or more) bytes at the -same time while others might only be capable of delivering one or -two bytes at a time. These differences in hardware typically impact the -cost and performance of a system.\footnote{The design and implementation -choices that determine how any given system operates are part of what is -called a system's {\em organization} and is beyond the scope of this text. 
-See~\cite{codriscv:2017} for more information on computer organization.} - diff --git a/book/rv32/chapter.tex b/book/rv32/chapter.tex index f40c2d7..6e03e11 100644 --- a/book/rv32/chapter.tex +++ b/book/rv32/chapter.tex @@ -27,37 +27,8 @@ This is used to convert a signed integer value expressed using some number of bits to a larger number of bits by adding more bits to the left. In doing so, the sign will be preserved. In this case {\em val} represents the least \acrshort{msb}s of the value. -For more on binary numbers see \autoref{chapter:NumberSystems}. - -\autoref{Figure:SignExtendNegative} illustrates extending the negative sign -bit of {\em val} to the left by replicating it. -When {\em val} is negative, its \acrshort{msb} (bit 19 in this example) will -be set to 1. Extending this value to the left will set all the new bits -to the left of it to 1 as well. - - -\begin{figure}[ht] -\centering -\DrawBitBoxSignExtendedPicture{32}{10100000000000000010} -\captionof{figure}{Sign-extending a negative integer from 20 bits to 32 bits.} -\label{Figure:SignExtendNegative} -\end{figure} - - -\autoref{Figure:SignExtendPositive} illustrates extending the positive sign -bit of {\em val} to the left by replicating it. -When {\em val} is positive, its \acrshort{msb} will be set to 0. Extending this -value to the left will set all the new bits to the left of it to 0 as well. - - -\begin{figure}[ht] -\centering -\DrawBitBoxSignExtendedPicture{32}{01000000000000000010} -\captionof{figure}{Sign-extending a positive integer from 20 bits to 32 bits.} -\label{Figure:SignExtendPositive} -\end{figure} - +For more on sign-extension see \autoref{SignExtension}. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{zx(val)} @@ -69,19 +40,8 @@ This is used to convert an unsigned integer value expressed using some number of bits to a larger number of bits by adding more bits to the left. In doing so, the new bits added will all be set to zero. 
As is the case with \verb@sx(val)@, {\em val} represents the \acrshort{lsb}s of the final value. -\autoref{Figure:ZeroExtend} illustrates zero-extending a 20-bit {\em val} to the -left to form a 32-bit fullword. - -For more on binary numbers see \autoref{chapter:NumberSystems}. - -\begin{figure}[ht] -\centering -\DrawBitBoxZeroExtendedPicture{32}{10000000000000000010} -\captionof{figure}{Zero-extending an unsigned integer from 20 bits to 32 bits.} -\label{Figure:ZeroExtend} -\end{figure} - +For more on zero-extension see \autoref{ZeroExtension}. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
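The behavior described for \verb@sx(val)@ and \verb@zx(val)@ can be sketched in C. This is only an illustration: the function names \verb@sign_extend32@ and \verb@zero_extend32@, the explicit \verb@bits@ parameter, and the fixed 32-bit result width are assumptions made here and are not definitions taken from the text or from the RISC-V specification.

```c
#include <stdint.h>

/* Hypothetical helper: keep the low `bits` bits of val and set every
 * bit to the left of them to zero. Assumes 1 <= bits <= 32. */
static uint32_t zero_extend32(uint32_t val, unsigned bits)
{
    /* Shifting a 32-bit value by 32 is undefined in C, so the
     * full-width case is handled separately. */
    uint32_t mask = (bits >= 32u) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    return val & mask;
}

/* Hypothetical helper: treat bit (bits - 1) of val as the sign bit and
 * replicate it into every bit to its left. Assumes 1 <= bits <= 32. */
static uint32_t sign_extend32(uint32_t val, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1u);   /* mask selecting the sign bit */
    val = zero_extend32(val, bits);      /* discard any stray upper bits */
    /* Flipping the sign bit and then subtracting it leaves a positive
     * value unchanged and fills the upper bits of a negative one with 1s
     * (unsigned arithmetic wraps modulo 2^32). */
    return (val ^ sign) - sign;
}
```

With the 20-bit patterns used in the sign- and zero-extension figures, this sketch maps 0xA0002 to 0xFFFA0002 and 0x40002 to 0x00040002 under sign extension, and 0x80002 to 0x00080002 under zero extension.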