diff --git a/book/Makefile b/book/Makefile index cd39769..f52fb38 100644 --- a/book/Makefile +++ b/book/Makefile @@ -1,7 +1,7 @@ TOP=.. include $(TOP)/Make.rules -TEXPATH=./numbers:./intro:./rv32:./copyright:./license +TEXPATH=./float:./intro:./rv32:./copyright:./license:./elements SUBDIRS= diff --git a/book/numbers/chapter.tex b/book/binary/chapter.tex similarity index 68% rename from book/numbers/chapter.tex rename to book/binary/chapter.tex index 9a5aaa7..aef151b 100644 --- a/book/numbers/chapter.tex +++ b/book/binary/chapter.tex @@ -1,11 +1,137 @@ -\chapter{Number Systems} -\label{chapter:NumberSystems} +\chapter{Numbers and Storage Systems} +\label{chapter:numbers} -RISC-V systems represent information using binary values stored in -little-endian order.\footnote{See\cite{IEN137} for some history of -the big/little-endian ``controversy.''} +This chapter discusses how data are represented and stored in a computer. -\section{Integers} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%\section{Context} +% +%Numbers can be interpreted differently depending on the context in +%which they are used. For example a number may represent the quantity +%of millimeters between two points. It may enumerate a +%a letter of the alphabet -- ie. $01000001=A$, $01000010=B$, +%$01000011=C$\ldots\ In fact, any finite set of items can be identified +%(enumerated) by a assigning a code number to each element in this fashon. + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{Logical/Boolean Functions} + +\enote{This is unclear. Need to define bit positions and probably +should add basic truth table diagrams.}% +Unlike addition and subtraction, boolean functions apply +on a per-bit basis. 
+%in that they do not impact neighboring bits. +%by generating things like a carry or a borrow. +When applied to multi-bit values, each bit position is operated upon +independently of the other bits. +\enote{Need to define 1 as true and 0 as false somewhere.} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\subsection{NOT} + +The {\em NOT} operator applies to a single operand and represents the +opposite of the input. +\enote{Need to define unary, binary and ternary operators without +confusing binary operators with binary numbers.} + +If the input is 1 then the output is 0. If the input is 0 then the +output is 1. In other words, the output value is {\em not} that of the +input value. + +This text will use the operator used in the C language when discussing +the {\em NOT} operator in symbolic form. Specifically the tilde: `\verb@~@'. + +\begin{verbatim} + ~ 1 1 1 1 0 1 0 1 <== A + ----------------- + 0 0 0 0 1 0 1 0 <== output +\end{verbatim} + +In a line of code the above might read like this: \verb@output = ~A@ + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\subsection{AND} + +The boolean {\em and} function has two or more inputs and the output is a +single bit. The output is 1 if and only if all of the input values are 1. +Otherwise it is 0. + +This text will use the operator used in the C language when discussing +the {\em AND} operator in symbolic form. Specifically the ampersand: `\verb@&@'. + +This function works like it does in spoken language. For example +if A is 1 {\em AND} B is 1 then the output is 1 (true). +Otherwise the output is 0 (false). 
For example: + +\begin{verbatim} + 1 1 1 1 0 1 0 1 <== A + & 1 0 0 1 0 0 1 1 <== B + ----------------- + 1 0 0 1 0 0 0 1 <== output +\end{verbatim} + +In a line of code the above might read like this: \verb@output = A & B@ + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\subsection{OR} + +The boolean {\em or} function has two or more inputs and the output is a +single bit. The output is 1 if at least one of the input values is 1. + +This text will use the operator used in the C language when discussing +the {\em OR} operator in symbolic form. Specifically the pipe: `\verb@|@'. + +This function works like it does in spoken language. For example +if A is 1 {\em OR} B is 1 then the output is 1 (true). +Otherwise the output is 0 (false). For example: + +\begin{verbatim} + 1 1 1 1 0 1 0 1 <== A + | 1 0 0 1 0 0 1 1 <== B + ----------------- + 1 1 1 1 0 1 1 1 <== output +\end{verbatim} + +In a line of code the above might read like this: \verb@output = A | B@ + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\subsection{XOR} + +The boolean {\em exclusive or} function has two or more inputs and the +output is a single bit. The output is 1 if and only if an odd number of inputs +are 1. Otherwise the output will be 0. + +This text will use the operator used in the C language when discussing +the {\em XOR} operator in symbolic form. Specifically the caret: `\verb@^@'. + +Note that when {\em XOR} is used with two inputs, the output +is set to 1 (true) when the inputs have different values and 0 +(false) when the inputs both have the same value. 
+ +For example: + +\begin{verbatim} + 1 1 1 1 0 1 0 1 <== A + ^ 1 0 0 1 0 0 1 1 <== B + ----------------- + 0 1 1 0 0 1 1 0 <== output +\end{verbatim} + +In a line of code the above might read like this: \verb@output = A ^ B@ + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{Integers and Counting} A binary integer is constructed with only 1s and 0s in the same manner as decimal numbers are constructed with values from 0 to 9. @@ -408,343 +534,85 @@ Disscuss the details of truncation and overflow here. {\em truncation} and {\em overflow} as occur with signed and unsigned addition and subtraction.} -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Logical/Boolean Functions} -Unlike addition and subtraction, boolean functions apply -on a per-bit basis. -%in that they do not impact neighboring bits. -%by generating things like a carry or a borrow. -When applied to multi-bit values, each bit position is operated upon -independantly of the other bits. -\enote{This is unclear. Need to define bit positions and probably -should add basic truth table diagrams.} -\enote{Need to define 1 as true and 0 as false somewhere.} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsubsection{NOT} - -The {\em NOT} operator applies to a single operand and represents the -opposite of the input. -\enote{Need to define unary, binary and ternary operators without -confusing binary operators with binary numbers.} - -If the input is 1 then the output is 0. If the input is 0 then the -output is 1. In other words, the output value is {\em not} that of the -input value. 
- -This text will use the operator used in the C language when discussing -the {\em NOT} operator in symbolic form. Specifically the tilde: `\verb@~@'. - -\begin{verbatim} - ~ 1 1 1 1 0 1 0 1 <== A - ----------------- - 0 0 0 0 1 0 1 0 <== output -\end{verbatim} - -In a line of code the above might read like this: \verb@output = ~A@ - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsubsection{AND} - -The boolean {\em and} function has two or more inputs and the output is a -single bit. The output is 1 if and only if all of the input values are 1. -Otherwise it is 0. - -This text will use the operator used in the C language when discussing -the {\em AND} operator in symbolic form. Specifically the ampersand: `\verb@&@'. - -This function works like it does in spoken language. For example -if A is 1 {\em AND} B is 1 then the output is 1 (true). -Otherwise the output is 0 (false). For example: - -\begin{verbatim} - 1 1 1 1 0 1 0 1 <== A - & 1 0 0 1 0 0 1 1 <== B - ----------------- - 1 0 0 1 0 0 0 1 <== output -\end{verbatim} - -In a line of code the above might read like this: \verb@output = A & B@ - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsubsection{OR} - -The boolean {\em or} function has two or more inputs and the output is a -single bit. The output is 1 if at least one of the input values are 1. - -This text will use the operator used in the C language when discussing -the {\em OR} operator in symbolic form. Specifically the pipe: `\verb@|@'. - -This function works like it does in spoken language. For example -if A is 1 {\em OR} B is 1 then the output is 1 (true). -Otherwise the output is 0 (false). 
For example: - -\begin{verbatim} - 1 1 1 1 0 1 0 1 <== A - | 1 0 0 1 0 0 1 1 <== B - ----------------- - 1 1 1 1 0 1 1 1 <== output -\end{verbatim} - -In a line of code the above might read like this: \verb@output = A | B@ - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsubsection{XOR} - -The boolean {\em exclusive or} function has two or more inputs and the -output is a single bit. The output is 1 if only an odd number of inputs -are 1. Otherwise the output will be 0. - -This text will use the operator used in the C language when discussing -the {\em XOR} operator in symbolic form. Specifically the carrot: `\verb@^@'. - -Note that when {\em XOR} is used with two inputs, the output -is set to 1 (true) when the inputs have different values and 0 -(false) when the inputs both have the same value. - -For example: - -\begin{verbatim} - 1 1 1 1 0 1 0 1 <== A - ^ 1 0 0 1 0 0 1 1 <== B - ----------------- - 0 1 1 0 0 1 1 0 <== output -\end{verbatim} - -In a line of code the above might read like this: \verb@output = A ^ B@ - - - -%\section{Context} -% -%Numbers can be interpreted differently depending on the context in -%which they are used. For example a number may represent the quantity -%of millimeters between two points. It may enumerate a -%a letter of the alphabet -- ie. $01000001=A$, $01000010=B$, -%$01000011=C$\ldots\ In fact, any finite set of items can be identified -%(enumerated) by a assigning a code number to each element in this fashon. 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\section{IEEE-754 Floating Point Number Representation} -\label{chapter::floatingpoint} +\section{Main Memory Storage} -This section provides an overview of the IEEE-754 32-bit binary floating -point format. +\enote{Refactor this section and the memory discussion in RV32 reference chapter}% +When transferring data between its registers and main memory a +RISC-V system uses the little-endian byte order.\footnote{ +See\cite{IEN137} for some history of the big/little-endian ``controversy.''} -\begin{itemize} -\item Recall that the place values for integer binary numbers are: -\begin{verbatim} - ... 128 64 32 16 8 4 2 1 -\end{verbatim} -\item We can extend this to the right in binary similar to the way we do for -decimal numbers: -\begin{verbatim} - ... 128 64 32 16 8 4 2 1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128 ... -\end{verbatim} -The `.' in a binary number is a binary point, not a decimal point. +\enote{Discuss byte ordering, addressing and character strings.} -\item We use scientific notation as in $2.7 \times 10^{-47}$ to express either -small fractions or large numbers when we are not concerned every last digit -needed to represent the entire, exact, value of a number. +\subsection{Memory Dump} -\item The format of a number in scientific notation is $mantissa \times base^{exponent}$ +Introduce the memory dump and how to read them here. -\item In binary we have $mantissa \times 2^{exponent}$ +Discuss the pitfalls of assuming what a set of bytes is used for based +on their contents! 
-\item IEEE-754 format requires binary numbers to be {\em normalized} to -$1.significand \times 2^{exponent}$ where the {\em significand} -is the portion of the {\em mantissa} that is to the right of the binary-point. +\subsection{Big Endian Representation} -\begin{itemize} -\item The unnormalized binary value of $-2.625$ is $10.101$ -\item The normalized value of $-2.625$ is $1.0101 \times 2^1$ -\end{itemize} +Using the memory dump contents in prior section, discuss how +big endian values are stored. -\item We need not store the `1.' because {\em all} normalized floating -point numbers will start that way. Thus we can save memory when storing -normalized values by adding 1 to the significand. +\subsection{Little Endian Representation} -{ -\small -\setlength{\unitlength}{.15in} -\begin{picture}(32,4)(0,0) - \put(0,1){\line(1,0){32}} % bottom line - \put(0,2){\line(1,0){32}} % top line +Using the memory dump contents in prior section, discuss how +little endian values are stored. - \put(0,1){\line(0,1){2}} % left vertical - \put(0,2){\makebox(1,1){\tiny 31}} % left end bit number marker +\subsection{Character Strings and Arrays} - \put(32,1){\line(0,1){2}} % vertical right end - \put(31,2){\makebox(1,1){\tiny 0}} % right end bit number marker +Define character strings and arrays. - \put(0,0){\makebox(1,1){\small sign}} - \put(1,0){\makebox(8,1){\small exponent}} - \put(9,0){\makebox(23,1){\small significand}} +Using the prior memory dump, discuss how and where things are stored and +retrieved. - \put(0,1){\makebox(1,1){1}} % sign +\subsection{Alignment} - \put(1,1){\line(0,1){2}} % seperator - \put(1,2){\makebox(1,1){\tiny 30}} % bit marker +Draw a diagram showing the overlapping data types when they are all aligned. 
- \put(1,1){\makebox(1,1){1}} % exponent - \put(2,1){\makebox(1,1){0}} - \put(3,1){\makebox(1,1){0}} - \put(4,1){\makebox(1,1){0}} - \put(5,1){\makebox(1,1){0}} - \put(6,1){\makebox(1,1){0}} - \put(7,1){\makebox(1,1){0}} - \put(8,1){\makebox(1,1){0}} - \put(8,2){\makebox(1,1){\tiny 23}} % bit marker - \put(9,1){\line(0,1){2}} % seperator - \put(9,2){\makebox(1,1){\tiny 22}} % bit marker +\subsection{Instruction Alignment} - \put(9,1){\makebox(1,1){0}} - \put(10,1){\makebox(1,1){1}} - \put(11,1){\makebox(1,1){0}} - \put(12,1){\makebox(1,1){1}} - \put(13,1){\makebox(1,1){0}} - \put(14,1){\makebox(1,1){0}} - \put(15,1){\makebox(1,1){0}} - \put(16,1){\makebox(1,1){0}} - \put(17,1){\makebox(1,1){0}} - \put(18,1){\makebox(1,1){0}} - \put(19,1){\makebox(1,1){0}} - \put(20,1){\makebox(1,1){0}} - \put(21,1){\makebox(1,1){0}} - \put(22,1){\makebox(1,1){0}} - \put(23,1){\makebox(1,1){0}} - \put(24,1){\makebox(1,1){0}} - \put(25,1){\makebox(1,1){0}} - \put(26,1){\makebox(1,1){0}} - \put(27,1){\makebox(1,1){0}} - \put(28,1){\makebox(1,1){0}} - \put(29,1){\makebox(1,1){0}} - \put(30,1){\makebox(1,1){0}} - \put(31,1){\makebox(1,1){0}} -\end{picture} -} +\enote{Rewrite this section for data rather than instructions and then +note here that instructions must be naturally aligned. For RV32 that +is on a 4-byte boundary}% +Every possible instruction that an RV32I CPU can execute contains +exactly 32 bits. Therefore each one must be stored in four bytes +of the main memory. -%\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -(1 \frac{5}{16} \times 2^{1}) = -(1.3125 \times 2^{1}) = -2.625$ -\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -((1 + \frac{1}{4} + \frac{1}{16}) \times 2^1) = -(2 + \frac{1}{2} + \frac{1}{8}) = -(2 + .5 + .125) = -2.625$ +To simplify the hardware, each instruction must be placed into four +adjacent bytes whose numeric address sequence begins with a multiple of +four. 
For example, an instruction might be located in bytes +4, 5, 6 and 7 (but not in 5, 6, 7 and 8 nor in 9, 3, 1, and 0\ldots). -\item IEEE754 formats: +This sort of addressing requirement is common and is referred to as +\gls{alignment}. An aligned instruction begins at a memory address +that is a multiple of four. An {\em unaligned} instruction would +be one beginning at any other address and is {\em illegal}. -\begin{tabular}{|l|l|l|} -\hline - & IEEE754 32-bit & IEEE754 64-bit \\ -\hline -sign & 1 bit & 1 bit \\ -exponent & 8 bits (excess-127) & 11 bits (excess-1023) \\ -mantissa & 23 bits & 52 bits \\ -max exponent & 127 & 1023 \\ -min exponent & -126 & -1022 \\ -\hline -\end{tabular} +An attempt to fetch an instruction from an unaligned address +will result in an error referred to as an alignment {\em \gls{exception}}. +This and other exceptions cause the CPU to stop executing the +current instruction and start executing a different set of instructions +that are prepared to handle the problem. Often an exception is +handled by completely stopping the program in a way that is commonly +referred to as a system or application {\em crash}. -\item When the exponent is all ones, the mantissa is all zeros, and -the sign is zero, the number represents positive infinity. +Given a properly aligned instruction address, the CPU can request +that the main memory locate and deliver the values of the four bytes +in the address sequence to the CPU using what is called a memory +read operation. Some systems can deliver four (or more) bytes at the +same time while others might only be capable of delivering one or +two bytes at a time. These differences in hardware typically impact the +cost and performance of a system.\footnote{The design and implementation +choices that determine how any given system operates are part of what is +called a system's {\em organization} and are beyond the scope of this text. 
+See~\cite{codriscv:2017} for more information on computer organization.} -\item When the exponent is all ones, the mantissa is all zeros, and -the sign is one, the number represents negative infinity. - -\item Note that the binary representation of an IEEE754 number in memory -can be compared for magnitude with another one using the same logic as for -comparing two's complement signed integers because the magnitude of an -IEEE number grows upward and downward in the same fashion as signed integers. -This is why we use excess notation and locate the significand's sign bit on -the left of the exponent. - -\item Note that zero is a special case number. Recall that a normalized -number has an implied 1-bit to the left of the significand\ldots\ which -means that there is no way to represent zero! -Zero is represented by an exponent of all-zeros and a significand of -all-zeros. This definition allows for a positive and a negative zero -if we observe that the sign can be either 1 or 0. - -\item On the number-line, numbers between zero and the smallest fraction in -either direction are in the {\em \gls{underflow}} areas. -\enote{Need to add the standard lecture numberline diagram showing -where the over/under-flow areas are and why.} - -\item On the number line, numbers greater than the mantissa of all-ones and the -largest exponent allowed are in the {\em \gls{overflow}} areas. - -\item Note that numbers have a higher resolution on the number line when the -exponent is smaller. -\end{itemize} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Floating Point Number Accuracy} -Due to the finite number of bits used to store the value of a floating point -number, it is not possible to represent every one of the infinite values -on the real number line. The following C programs illustrate this point. 
- -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsubsection{Powers Of Two} -Just like the integer numbers, the powers of two that have bits to represent -them can be represented perfectly\ldots\ as can their sums (provided that the -significand requires no more than 23 bits.) - -\listing{powersoftwo.c}{Precise Powers of Two} -\listing{powersoftwo.out}{Output from {\tt powersoftwo.c}} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsubsection{Clean Decimal Numbers} -When dealing with decimal values, you will find that they don't map simply -into binary floating point values. -% (the same holds true for binary integer numbers). - -Note how the decimal numbers are not accurately represented as they get larger. -The decimal number on line 10 of \listingRef{cleandecimal.out} -can be perfectly represented in IEEE format. However, a problem arises in -the 11Th loop iteration. It is due to the fact that the -binary number can not be represented accurately in IEEE format. Its least -significant bits were truncated in a best-effort attempt at rounding the value -off in order to fit the value into the bits provided. This is an example of -{\em low order truncation}. Once this happens, the value of \verb@x.f@ is -no longer as precise as it could be given more bits in which to save its value. 
- -\listing{cleandecimal.c}{Print Clean Decimal Numbers} -\listing{cleandecimal.out}{Output from {\tt cleandecimal.c}} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsubsection{Accumulation of Error} -These rounding errors can be exaggerated when the number we multiply -the \verb@x.f@ value by is, itself, something that can not be accurately -represented in IEEE -form.\footnote{Applications requiring accurate decimal values, such as -financial accounting systems, can use a packed-decimal numeric format -to avoid unexpected oddities caused by the use of binary numbers.} -\enote{In a lecture one would show that one tenth is a repeating -non-terminating binary number that gets truncated. This discussion -should be reproduced here in text form.} - -For example, if we multiply our \verb@x.f@ value by $\frac{1}{10}$ each time, -we can never be accurate and we start accumulating errors immediately. - -\listing{erroraccumulation.c}{Accumulation of Error} -\listing{erroraccumulation.out}{Output from {\tt erroraccumulation.c}} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{Reducing Error Accumulation} -In order to use floating point numbers in a program without causing -excessive rounding problems an algorithm can be redesigned such that the -accumulation is eliminated. -This example is similar to the previous one, but this time we recalculate the -desired value from a known-accurate integer value. -Some rounding errors remain present, but they can not accumulate. 
- -\listing{errorcompensation.c}{Accumulation of Error} -\listing{errorcompensation.out}{Output from {\tt erroraccumulation.c}} diff --git a/book/book.tex b/book/book.tex index 1718ddf..6f10514 100644 --- a/book/book.tex +++ b/book/book.tex @@ -59,7 +59,8 @@ %\part{Introduction} \include{intro/chapter} -\include{numbers/chapter} +\include{binary/chapter} +\include{elements/chapter} \include{toolchain/chapter} \include{rv32/chapter} @@ -67,6 +68,8 @@ % These 'chapters' are lettered rather than numbered \appendix +\include{install/chapter} +\include{float/chapter} \include{license/chapter} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% diff --git a/book/elements/chapter.tex b/book/elements/chapter.tex new file mode 100644 index 0000000..5339791 --- /dev/null +++ b/book/elements/chapter.tex @@ -0,0 +1,149 @@ +\chapter{The Elements of an Assembly Language Program} +\label{chapter:elements} + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{A Sample Program Source Listing} + +A simple program that illustrates how this text presents +program source code is seen in \listingRef{zero4regs.S}. +This program will place a zero in each of the 4 registers +named x28, x29, x30 and x31. + +\listing{zero4regs.S}{Setting four registers to zero.} + +This program listing illustrates a number of things: +\begin{itemize} +\item Listings are identified by the name of the file within which + they are stored. This listing is from a file named: \verb@zero4regs.S@. +\item The assembly language programs discussed in this text will be saved + in files that end with: \verb@.S@ (Alternately you can use \verb@.sx@ + on systems that don't understand the difference between upper and + lowercase letters.\footnote{The author of this text prefers to avoid + using such systems.}) +\item A description of the listing's purpose appears under the name of the + file. 
The description of \listingRef{zero4regs.S} is + {\em Setting four registers to zero.} +\item The lines of the listing are numbered on the left margin for + easy reference. +\item An assembly program consists of lines of plain text. +\item The RISC-V ISA does not provide an operation that will simply + set a register to a numeric value. To accomplish our goal this + program will add zero to zero and place the sum in each of the + four registers. +\item The lines that start with a dot `.' (on lines 1, 2 and 3) are + called {\em assembler directives} as they tell the assembler itself + how we want it to translate the following {\em assembly language instructions} + into {\em machine language instructions.} +\item Line 4 shows a {\em label} named {\em \_start}. The colon + at the end is the indicator to the assembler that causes it to + recognize the preceding characters as a label. +\item Lines 5-8 are the four assembly language instructions that + make up the program. Each instruction in this program + consists of four {\em fields}. (Different instructions can have + a different number of fields.) The fields on line 5 are: + + \begin{itemize} + \item [addi] The instruction mnemonic. It indicates the operation + that the CPU will perform. + \item [x28] The {\em destination} register that will receive the + sum when the {\em addi} instruction is finished. The names of + the 32 registers are expressed as x0 -- x31. + \item [x0] One of the addends of the sum operation. (The x0 register + will always contain the value zero. It can never be changed.) + \item [0] The second addend is the number zero. + \item [\# set \ldots] Any text anywhere in a RISC-V assembly language + program that starts with the pound-sign is ignored by the assembler. + Such text is used to place a {\em comment} in the program to help + the reader better understand the motive of the programmer. 
+ \end{itemize} +\end{itemize} + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{Running a Program With rvddt} +\index{rvddt} + +To illustrate what a CPU does when it executes instructions this text +will use the \gls{rvddt} simulator to display the sequence of events +and the binary values involved. This simulator supports the RV32I ISA +and has a configurable amount of memory.% +\footnote{The {\em rvddt} simulator was written to generate the listings for +this text. It is similar to the fancier {\em spike} simulator. +Given the simplicity of the RV32I ISA, rvddt is less than 1700 lines of C++ +and was written in one (long) afternoon.} + +\listingRef{zero4regs.out} shows the operation of the four +{\em addi} instructions from \listingRef{zero4regs.S} when the program is executed +in trace-mode. + +\listing{zero4regs.out}{Running a program with the rvddt simulator} + +\begin{itemize} +\item [$\ell$ 1] This listing includes the command-line that shows how the simulator + was executed to load a file containing the machine instructions (aka + machine code) from the assembler. +\item [$\ell$ 2] A message from the simulator indicating that it loaded the machine + code into simulated memory at address 0. +\item [$\ell$ 3] This line shows the prompt from the debugger and the command + \verb@t4@ that the user entered to request that the simulator trace + the execution of four instructions. +\item [$\ell$ 4-8] Prior to executing the first instruction, the state of the + CPU registers is displayed. +\item [$\ell$ 4] The values in registers 0, 1, 2, 3, 4, 5, 6 and 7 are printed + from left to right in \gls{bigendian}, \gls{hexadecimal} form. + The dash `\verb@-@' character in the middle of the line is a reference + to make it easier to visually navigate across the line without being + forced to count the values from the far left when seeking the value + of, say, x5. 
+\item [$\ell$ 5-7] The values of registers 8--31 are printed. +\item [$\ell$ 8] The {\em program counter} (\reg{pc}) register is printed. + It contains the address of the instruction that the CPU will execute. + After each instruction, the \reg{pc} will either advance four bytes + ahead or be set to another value by a branch instruction as discussed above. +\item [$\ell$ 9] A four-byte instruction is fetched from memory at the address + in the \reg{pc} register, is decoded and printed. From left to right + the fields shown on this line are: + + \begin{itemize} + + \item [00000000] The memory address from which the instruction was + fetched. This address is displayed in \gls{bigendian}, + \gls{hexadecimal} form. + \item [00000e13] The machine code of the instruction displayed in + \gls{bigendian}, \gls{hexadecimal} form. + \item [addi] The mnemonic for the machine instruction. + \item [x28] The \reg{rd} field of the addi instruction. + \item [x0] The \reg{rs1} field of the addi instruction that + holds one of the two addends of the operation. + \item [0] The \reg{imm} field of the addi instruction that + holds the second of the two addends of the operation. + \item [\# \ldots] A simulator-generated comment that explains + what the instruction is doing. For this instruction it indicates + that \reg{x28} will have the value zero stored into it as a result + of performing the addition: $0+0$. + \end{itemize} + +\item [$\ell$ 10-14] These lines are printed as the prelude while tracing the + second instruction. Lines 7 and 13 show that \reg{x28} has changed + from \verb@f0f0f0f0@ to \verb@00000000@ as a result of executing the + first instruction and lines 8 and 14 show that the \reg{pc} has + advanced from zero (the location of the first instruction) to + four, where the second instruction will be fetched. None of the + rest of the registers have changed values. +\item [$\ell$ 15] The second instruction is decoded, executed and described. 
+ This time register \reg{x29} will be assigned a value. +\item [$\ell$ 16-27] The third and fourth instructions are traced. +\item [$\ell$ 28] Tracing has completed. The simulator prints its prompt + and the user enters the `r' command to see the register state + after the fourth instruction has completed executing. +\item [$\ell$ 29-33] Following the fourth instruction it can be observed + that registers \reg{x28}, \reg{x29}, \reg{x30} and \reg{x31} + have been set to zero and that the \reg{pc} has advanced from + zero to four, then eight, then 12 (the hex value for 12 is c) + and then to 16 (which, in hex, is 10). +\item [$\ell$ 34] The simulator exit command `x' is entered by the user and + the terminal displays the shell prompt. + +\end{itemize} diff --git a/book/intro/zero4regs.S b/book/elements/zero4regs.S similarity index 100% rename from book/intro/zero4regs.S rename to book/elements/zero4regs.S diff --git a/book/intro/zero4regs.out b/book/elements/zero4regs.out similarity index 100% rename from book/intro/zero4regs.out rename to book/elements/zero4regs.out diff --git a/book/float/chapter.tex b/book/float/chapter.tex new file mode 100644 index 0000000..f835ba4 --- /dev/null +++ b/book/float/chapter.tex @@ -0,0 +1,218 @@ +\chapter{Floating Point Numbers} +\label{chapter:NumberSystems} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{IEEE-754 Floating Point Number Representation} +\label{chapter::floatingpoint} + +This section provides an overview of the IEEE-754 32-bit binary floating +point format. + +\begin{itemize} +\item Recall that the place values for integer binary numbers are: +\begin{verbatim} + ... 128 64 32 16 8 4 2 1 +\end{verbatim} +\item We can extend this to the right in binary similar to the way we do for +decimal numbers: +\begin{verbatim} + ... 
128 64 32 16 8 4 2 1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128 ... +\end{verbatim} +The `.' in a binary number is a binary point, not a decimal point. + +\item We use scientific notation as in $2.7 \times 10^{-47}$ to express either +small fractions or large numbers when we are not concerned with every last digit +needed to represent the entire, exact, value of a number. + +\item The format of a number in scientific notation is $mantissa \times base^{exponent}$ + +\item In binary we have $mantissa \times 2^{exponent}$ + +\item IEEE-754 format requires binary numbers to be {\em normalized} to +$1.significand \times 2^{exponent}$ where the {\em significand} +is the portion of the {\em mantissa} that is to the right of the binary-point. + +\begin{itemize} +\item The unnormalized binary value of $-2.625$ is $-10.101$ +\item The normalized value of $-2.625$ is $-1.0101 \times 2^1$ +\end{itemize} + +\item We need not store the `1.' because {\em all} normalized floating +point numbers will start that way. Thus we can save memory by storing only +the significand and treating the leading `1.' as implied when the value is used.
+ +{ +\small +\setlength{\unitlength}{.15in} +\begin{picture}(32,4)(0,0) + \put(0,1){\line(1,0){32}} % bottom line + \put(0,2){\line(1,0){32}} % top line + + \put(0,1){\line(0,1){2}} % left vertical + \put(0,2){\makebox(1,1){\tiny 31}} % left end bit number marker + + \put(32,1){\line(0,1){2}} % vertical right end + \put(31,2){\makebox(1,1){\tiny 0}} % right end bit number marker + + \put(0,0){\makebox(1,1){\small sign}} + \put(1,0){\makebox(8,1){\small exponent}} + \put(9,0){\makebox(23,1){\small significand}} + + \put(0,1){\makebox(1,1){1}} % sign + + \put(1,1){\line(0,1){2}} % seperator + \put(1,2){\makebox(1,1){\tiny 30}} % bit marker + + \put(1,1){\makebox(1,1){1}} % exponent + \put(2,1){\makebox(1,1){0}} + \put(3,1){\makebox(1,1){0}} + \put(4,1){\makebox(1,1){0}} + \put(5,1){\makebox(1,1){0}} + \put(6,1){\makebox(1,1){0}} + \put(7,1){\makebox(1,1){0}} + \put(8,1){\makebox(1,1){0}} + + \put(8,2){\makebox(1,1){\tiny 23}} % bit marker + \put(9,1){\line(0,1){2}} % seperator + \put(9,2){\makebox(1,1){\tiny 22}} % bit marker + + \put(9,1){\makebox(1,1){0}} + \put(10,1){\makebox(1,1){1}} + \put(11,1){\makebox(1,1){0}} + \put(12,1){\makebox(1,1){1}} + \put(13,1){\makebox(1,1){0}} + \put(14,1){\makebox(1,1){0}} + \put(15,1){\makebox(1,1){0}} + \put(16,1){\makebox(1,1){0}} + \put(17,1){\makebox(1,1){0}} + \put(18,1){\makebox(1,1){0}} + \put(19,1){\makebox(1,1){0}} + \put(20,1){\makebox(1,1){0}} + \put(21,1){\makebox(1,1){0}} + \put(22,1){\makebox(1,1){0}} + \put(23,1){\makebox(1,1){0}} + \put(24,1){\makebox(1,1){0}} + \put(25,1){\makebox(1,1){0}} + \put(26,1){\makebox(1,1){0}} + \put(27,1){\makebox(1,1){0}} + \put(28,1){\makebox(1,1){0}} + \put(29,1){\makebox(1,1){0}} + \put(30,1){\makebox(1,1){0}} + \put(31,1){\makebox(1,1){0}} +\end{picture} +} + +%\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -(1 \frac{5}{16} \times 2^{1}) = -(1.3125 \times 2^{1}) = -2.625$ +\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -((1 + \frac{1}{4} + 
\frac{1}{16}) \times 2^1) = -(2 + \frac{1}{2} + \frac{1}{8}) = -(2 + .5 + .125) = -2.625$ + +\item IEEE754 formats: + +\begin{tabular}{|l|l|l|} +\hline + & IEEE754 32-bit & IEEE754 64-bit \\ +\hline +sign & 1 bit & 1 bit \\ +exponent & 8 bits (excess-127) & 11 bits (excess-1023) \\ +mantissa & 23 bits & 52 bits \\ +max exponent & 127 & 1023 \\ +min exponent & -126 & -1022 \\ +\hline +\end{tabular} + +\item When the exponent is all ones, the mantissa is all zeros, and +the sign is zero, the number represents positive infinity. + +\item When the exponent is all ones, the mantissa is all zeros, and +the sign is one, the number represents negative infinity. + +\item Note that the binary representation of an IEEE754 number in memory +can be compared for magnitude with another one using the same logic as for +comparing two's complement signed integers because the magnitude of an +IEEE number grows upward and downward in the same fashion as signed integers. +This is why we use excess notation and locate the significand's sign bit on +the left of the exponent. + +\item Note that zero is a special case number. Recall that a normalized +number has an implied 1-bit to the left of the significand\ldots\ which +means that there is no way to represent zero! +Zero is represented by an exponent of all-zeros and a significand of +all-zeros. This definition allows for a positive and a negative zero +if we observe that the sign can be either 1 or 0. + +\item On the number-line, numbers between zero and the smallest fraction in +either direction are in the {\em \gls{underflow}} areas. +\enote{Need to add the standard lecture numberline diagram showing +where the over/under-flow areas are and why.} + +\item On the number line, numbers greater than the mantissa of all-ones and the +largest exponent allowed are in the {\em \gls{overflow}} areas. + +\item Note that numbers have a higher resolution on the number line when the +exponent is smaller. 
+\end{itemize} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\subsection{Floating Point Number Accuracy} +Due to the finite number of bits used to store the value of a floating point +number, it is not possible to represent every one of the infinite values +on the real number line. The following C programs illustrate this point. + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\subsubsection{Powers Of Two} +Just like the integer numbers, the powers of two that have bits to represent +them can be represented perfectly\ldots\ as can their sums (provided that the +significand requires no more than 23 bits.) + +\listing{powersoftwo.c}{Precise Powers of Two} +\listing{powersoftwo.out}{Output from {\tt powersoftwo.c}} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\subsubsection{Clean Decimal Numbers} +When dealing with decimal values, you will find that they don't map simply +into binary floating point values. +% (the same holds true for binary integer numbers). + +Note how the decimal numbers are not accurately represented as they get larger. +The decimal number on line 10 of \listingRef{cleandecimal.out} +can be perfectly represented in IEEE format. However, a problem arises in +the 11th loop iteration. It is due to the fact that the +binary number cannot be represented accurately in IEEE format. Its least +significant bits were truncated in a best-effort attempt at rounding the value +off in order to fit the value into the bits provided. This is an example of +{\em low order truncation}. Once this happens, the value of \verb@x.f@ is +no longer as precise as it could be given more bits in which to save its value.
+ +\listing{cleandecimal.c}{Print Clean Decimal Numbers} +\listing{cleandecimal.out}{Output from {\tt cleandecimal.c}} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\subsubsection{Accumulation of Error} +These rounding errors can be exaggerated when the number we multiply +the \verb@x.f@ value by is, itself, something that can not be accurately +represented in IEEE +form.\footnote{Applications requiring accurate decimal values, such as +financial accounting systems, can use a packed-decimal numeric format +to avoid unexpected oddities caused by the use of binary numbers.} +\enote{In a lecture one would show that one tenth is a repeating +non-terminating binary number that gets truncated. This discussion +should be reproduced here in text form.} + +For example, if we multiply our \verb@x.f@ value by $\frac{1}{10}$ each time, +we can never be accurate and we start accumulating errors immediately. + +\listing{erroraccumulation.c}{Accumulation of Error} +\listing{erroraccumulation.out}{Output from {\tt erroraccumulation.c}} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\subsection{Reducing Error Accumulation} +In order to use floating point numbers in a program without causing +excessive rounding problems an algorithm can be redesigned such that the +accumulation is eliminated. +This example is similar to the previous one, but this time we recalculate the +desired value from a known-accurate integer value. +Some rounding errors remain present, but they can not accumulate. 
+ +\listing{errorcompensation.c}{Reducing Error Accumulation} +\listing{errorcompensation.out}{Output from {\tt errorcompensation.c}} diff --git a/book/numbers/cleandecimal.c b/book/float/cleandecimal.c similarity index 100% rename from book/numbers/cleandecimal.c rename to book/float/cleandecimal.c diff --git a/book/numbers/cleandecimal.out b/book/float/cleandecimal.out similarity index 100% rename from book/numbers/cleandecimal.out rename to book/float/cleandecimal.out diff --git a/book/numbers/erroraccumulation.c b/book/float/erroraccumulation.c similarity index 100% rename from book/numbers/erroraccumulation.c rename to book/float/erroraccumulation.c diff --git a/book/numbers/erroraccumulation.out b/book/float/erroraccumulation.out similarity index 100% rename from book/numbers/erroraccumulation.out rename to book/float/erroraccumulation.out diff --git a/book/numbers/errorcompensation.c b/book/float/errorcompensation.c similarity index 100% rename from book/numbers/errorcompensation.c rename to book/float/errorcompensation.c diff --git a/book/numbers/errorcompensation.out b/book/float/errorcompensation.out similarity index 100% rename from book/numbers/errorcompensation.out rename to book/float/errorcompensation.out diff --git a/book/numbers/powersoftwo.c b/book/float/powersoftwo.c similarity index 100% rename from book/numbers/powersoftwo.c rename to book/float/powersoftwo.c diff --git a/book/numbers/powersoftwo.out b/book/float/powersoftwo.out similarity index 100% rename from book/numbers/powersoftwo.out rename to book/float/powersoftwo.out diff --git a/book/glossary.tex b/book/glossary.tex index d808f23..06c5afb 100644 --- a/book/glossary.tex +++ b/book/glossary.tex @@ -165,6 +165,13 @@ so the programmer need not memorize the biary values of each machine instruction} } +\newglossaryentry{thread} +{ + name={thread}, + description={A stream of instructions.
When plural, it is + used to refer to the ability of a CPU to execute multiple + instruction streams at the same time} +} \newacronym{hart}{hart}{Hardware Thread} \newacronym{msb}{MSB}{Most Significant Bit} diff --git a/book/install/chapter.tex b/book/install/chapter.tex new file mode 100644 index 0000000..d213aff --- /dev/null +++ b/book/install/chapter.tex @@ -0,0 +1,72 @@ +\chapter{Installing a RISC-V Toolchain} +\label{chapter:install} + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{The GNU Toolchain} + +Discuss the GNU toolchain elements used to experiment with the +material in this book. + +\enote{It would be good to find some Mac and Windows users to write +and test proper variations on this section to address those systems. +Pull requests, welcome!}% +The instructions and examples here were all implemented on Ubuntu 16.04 LTS. + +Install custom code in a location that will not cause interference with +other applications and allow for easy cleanup. These instructions +install the toolchain in \verb@/usr/local/riscv@. At any time +you can remove the lot and start over by executing the following +command: + +\begin{verbatim} +rm -rf /usr/local/riscv/* +\end{verbatim} + + +Tested on Ubuntu 16.04 LTS. +18.04 was just released\ldots\ update accordingly. + +These are the only commands that you should perform as root when installing +the toolchain: + +\begin{verbatim} +sudo apt-get install autoconf automake autotools-dev curl libmpc-dev \ + libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf \ + libtool patchutils bc zlib1g-dev libexpat-dev +sudo mkdir -p /usr/local/riscv/ +sudo chmod 777 /usr/local/riscv/ +\end{verbatim} + +All other commands should be executed as a regular user. 
This will eliminate the +possibility of clobbering system files that should not be touched when tinkering with +the toolchain applications. + +To download, compile and ``install'' the toolchain: + +\begin{verbatim} +# riscv toolchain: +# +# https://riscv.org/software-tools/risc-v-gnu-compiler-toolchain/ + +git clone --recursive https://github.com/riscv/riscv-gnu-toolchain +cd riscv-gnu-toolchain +./configure --prefix=/usr/local/riscv/rv32i --with-arch=rv32i --with-abi=ilp32 +make +make install +\end{verbatim} + +Need to discuss augmenting the PATH environment variable. + +Discuss the choice of ilp32 as well as what the other variations would do. + +Discuss rv32im and note that the details are found in \autoref{chapter:RV32}. + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +\section{rvddt} + +Discuss installing the rvddt simulator here. diff --git a/book/intro/chapter.tex b/book/intro/chapter.tex index 2bfccb3..ca68643 100644 --- a/book/intro/chapter.tex +++ b/book/intro/chapter.tex @@ -57,6 +57,11 @@ the data and instructions that can not fit into the CPU registers. Typically, a CPU's registers can hold tens of data values while the main memory can contain many billions of data values. +To keep track of the data values, each register is assigned a number and +the main memory is broken up into small blocks called \gls{byte}s that +are also each assigned a number called an \gls{address} +(an address is often referred to as a {\em location}). + A CPU can process data in a register at a speed that can be an order of magnitude faster than the rate that it can process (specifically, transfer data and instructions to and from) the main memory. @@ -81,15 +86,72 @@ more slowly than its main memory. This text is not particularly concerned with non-volatile storage.
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{CPU} - \index{CPU} -The \acrshort{cpu} is a collection of registers and circuitry designed -to read data and instructions from the storage system. The instructions -tell the CPU to perform various mathamatical and logical operations on -the data in its registers and where to save the results of those operations. +\enote{Add a block diagram of the CPU components described here.} +The \acrshort{cpu} is a collection of registers and circuitry designed +to manipulate the register data and to exchange data and instructions with the +storage system. The instructions that it reads from the main memory tell +the CPU to perform various mathematical and logical operations on the data +in its registers and where to save the results of those operations. + +\subsubsection{Execution Unit} + +The part of a CPU that coordinates all aspects of the operations of each +instruction is called the {\em execution unit.} It is what performs the transfers +of instructions and data between the CPU and the main memory and tells the +registers when they are supposed to either store or recall data being transferred. +The execution unit also controls the ALU (Arithmetic and Logic Unit). + +\subsubsection{Arithmetic and Logic Unit} +\index{ALU} + +When an instruction manipulates data by performing things like an {\em addition}, +{\em subtraction}, {\em comparison} or other similar operations, the ALU is what +will calculate the sum, difference, and so on. + +\subsubsection{Registers} +\index{register} + +In the RV32 CPU there are 31 general purpose registers that each contain 32 \gls{bit}s +(where each bit is one \gls{binary} digit value of one or zero) and a number +of special-purpose registers.
+Each of the general purpose registers is given a name such as \reg{x1}, \reg{x2}, +\ldots\ on up to \reg{x31} ({\em general purpose} refers to the fact that the CPU +itself does not prescribe any particular function to any of these registers.) +Two important special-purpose registers are \reg{x0} and \reg{pc}. + +Register \reg{x0} will always represent the value zero or logical {\em false} +no matter what. If any instruction tries to change the value of \reg{x0}, the +attempt will be ignored. The need for {\em zero} is so common that, other than the +fact that it is hard-wired to zero, the \reg{x0} register is made available as +if it were otherwise a general purpose register.% +\footnote{Having a special +{\em zero} register allows the total set of instructions that the CPU can execute +to be simplified, thus reducing its complexity, power consumption and cost.} + +The \reg{pc} register is called the {\em program counter}. The CPU uses it to +remember the memory address where its program instructions are located. + +The number of bits in each register is defined by the \acrfull{isa}. + +\subsubsection{Harts} +\index{hart} + +Analogous to a {\em core} in other types of CPUs, a {\em \acrshort{hart}} +(hardware \gls{thread}) in a RISC-V CPU refers to the collection of 32 registers, +instruction execution unit and ALU. + +When more than one hart is present in a CPU, a different stream of instructions can +be executed on each hart all at the same time. +Programs that are written to take advantage of this are called {\em multithreaded}. + +This text will primarily focus on CPUs that have only one hart. + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Peripherals} @@ -106,8 +168,8 @@ instructions are used to initiate, execute and/or synchronize data transfers.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Instruction Set Architecture} - \index{ISA} + The catalog of rules that describes the details of the instructions and features that a given CPU provides is called its \acrfull{isa}. @@ -125,80 +187,58 @@ modules and zero or more of the {\em extension} modules. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{RV Base Modules} \index{RV32I} + The base modules are RV32I (32-bit general purpose), RV32E (32-bit embedded), RV64I (64-bit general purpose) and RV128I (128-bit general purpose). These base modules provide the minimal functional set of integer operations -needed to execute an application. The differing bit-widths address +needed to execute a useful application. The differing bit-widths address the needs of different main-memory sizes. This text primairly focuses on the RV32I base module and how to program it. +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Extension Modules} -\index{RV32M} -\index{RV32A} -\index{RV32F} -\index{RV32D} -\index{RV32Q} -\index{RV32C} -\index{RV32G} RISC-V extension modules may be included by an implementor interested in optimizing a design for one or more purposes. +\index{RV32M}% +\index{RV32A}% +\index{RV32F}% +\index{RV32D}% +\index{RV32Q}% +\index{RV32C}% Available extension modules include M (integer math), A (atomic), F (32-bit floating point), D (64-bit floating point), Q (128-bit floating point), C (compressed size instructions) and others. +\index{RV32G}% The extension name {\em G} is used to represent the combined set of IMAFD extensions as it is expected to be a common combination. 
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\section{An Example Computer} - -\enote{Need a block diagram and description of the virtual machine -that is used in this text.}% -The machine used to execute the programs presented in this text -has one RV32I CPU with 32 registers, one \acrshort{hart} -(analogous to what is called a {\em core} on other CPUs such as an ARM) -and 65536 bytes of memory. - - %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\section{Executing a Program} +\section{How the CPU Executes a Program} -To observe the operation of our example computer an RV32I simulator -will be used that will print a message describing the status of the -CPU and the instructions that it executes as it goes along. - -The process of executing an instruction is called an -\index{instruction cycle}{\em instruction cycle} and it is comprised +The process of executing a program is a continuously repeating series of +\index{instruction cycle}{\em instruction cycles} that are each comprised of an {\em instruction fetch} and an {\em instruction execute} phase. -The status of the CPU is entirely embodied in the data values that -are stored in its registers at any moment in time. The simulator -can print all of the register values before it executes an instruction -for reference. +The current status of a CPU is entirely embodied in the data values that +are stored in its registers at any moment in time. Of particular interest +to an executing program is the \reg{pc} register. The \reg{pc} contains +the memory address of the instruction that the CPU will execute next.
-When an instruction is executed the simulator can print a message -describing where in main memory it came from, its numeric machine code -value, its mneumonic, a description of any associated parameters, -the values of those parameters and then carry out the operation as -defined by the ISA. - -For this to work, the instructions to be executed will have been -previously stored in a list in the main memory and any parameters that -an instruction specifies will either be part of the instruction itself -or read from (or stored into) one or more of the registers. +For this to work, the instructions to be executed must have been previously +stored in adjacent main memory locations and the address of the first instruction +placed into the \reg{pc} register. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -209,54 +249,22 @@ In order to {\em fetch} an instruction from the main memory the CPU must have a method to identify which instruction should be fetched and a method to fetch it. -To make this possible the main memory is broken up into small blocks -called \gls{byte}s that are each given a unique identifying number -called an \gls{address}. The process of identifying which instruction -to fetch is therefore a matter of knowing what address it is stored in. +Given that the main memory is broken up into bytes and that each of its bytes is +assigned an address, the \reg{pc} is used to hold the address of the +location where the next instruction to execute is located. -A byte is comprised of eight binary digits called \gls{bit}s. - -Every possible instruction that the RV32I can execute contains -exactly 32 bits. Therefore each instruction must be stored in -four bytes of the main memory. - -To simplify the hardware, each instruction -must be placed into four adjacent bytes whose numeric address sequence -begins with a multiple four.
For example, an instruction might be -located in bytes 12, 13, 14 and 15 (but not in 15, 16, 17 and 18 -nor 8, 207, 5, and 1073\ldots). - -This sort of addressing requirement is common and is referred to as -\gls{alignment}. An aligned instruction begins at a memory address -that is a multiple of four. An {\em unaligned} instruction would -be one beginning at any other address and is {\em illegal}. - -An attempt to fetch an instruction from an unaligned address -will result in an error referred to as an alignment {\em \gls{exception}}. -This and other exceptions cause the CPU to stop executing the -curent instruction and start executing a different set of instructions -that are prepared to handle the problem. Often an exception is -handled by completely stopping the program in a way that is commonly -refered to as a system or application {\em crash}. - -Given a properly aligned instruction address, the CPU can request -that the main memory locate and deliver the values of the four bytes -in the address sequence to the CPU using what is called a memory -read operation. Some systems can deliver four (or more) bytes at the -same time while others might only be capable of delivering one or -two bytes at a time. These differences in hardware typically impact the -cost and performance of a system.\footnote{The design and implementation -choices that determine how any given system operates are part of what is -called a system's {\em organization} and is beyond the scope of this text. 
-See~\cite{codriscv:2017} for more information on computer organization.} +Given an instruction address, the CPU can request that the main memory +locate and return the value of the data stored there using what is called +a {\em memory read} operation and then the CPU can treat that {\em fetched} +value as an instruction and execute it.\footnote{RV32I instructions are +more than one byte in size, but this general description is suitable for now.} +Once an instruction has been fetched, it can be executed. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \subsection{Instruction Execute} \index{instruction execute} -Once an instruction has been fetched by the CPU, it can be executed. - Typical instructions do things like add a number to the value currently stored in one of the registers or store the contents of a register into the main memory at some given address. @@ -265,14 +273,19 @@ Also part of every instruction is a notion of what should be done next. Most of the time an instruction will be complete by indicating that the CPU should proceed to fetch and execute the instruction at the next -larger main memory address. +larger main memory address. In these cases the \reg{pc} is incremented +to point to the memory address after the current instruction. + +Any parameters that an instruction requires must either be part of +the instruction itself or read from (or stored into) one or more of the +general purpose registers. Some instructions can specify that the CPU proceed to execute an instruction at an address other than the one that follows itself. This class of instructions have names like {\em jump} and {\em branch} and are available in a variety of different styles. -The RV ISA uses the word {\em jump} to refer to an {\em unconditional} +The RISC-V ISA uses the word {\em jump} to refer to an {\em unconditional} change in the sequential processing of instructions and the word {\em branch} to refer to a {\em conditional} change. 
@@ -285,143 +298,5 @@ one of two different actions pending the resulting {\em condition} of the comparison.\footnote{This is the fundamental method used by a CPU to make decisions.} - -%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% -\subsection{A Sample Program Source Listing} - -A simple program that illustrates how this text presents -program source code is seen in \listingRef{zero4regs.S}. -This program will place a zero in each of the 4 registers -named x28, x29, x30 and x31. - -\listing{zero4regs.S}{Setting four registers to zero.} - -This program listing illustrates a number of things: -\begin{itemize} -\item Listings are identified by the name of the file within which - they are stored. This listing is from a file named: \verb@zero4regs.S@. -\item The assembly language programs discussed in this text will be saved - in files that end with: \verb@.S@ (Alternately you can use \verb@.sx@ - on systems that don't understand the difference between upper and - lowercase letters.\footnote{The author of this text prefers to avoid - using such systems.}) -\item A description of the listing's purpose appears under the name of the - file. The description of \listingRef{zero4regs.S} is - {\em Setting four registers to zero.} -\item The lines of the listing are numberd on the left margin for - easy reference. -\item An assembly program consists of lines of plain text. -\item The RISC-V ISA does not provide an operation that will simply - set a register to a numeric value. To accomplish our goal this - program will add zero to zero and place the sum in in each of the - four registers. -\item The lines that start with a dot `.' (on lines 1, 2 and 3) are - called {\em assembler directives} as they tell the assembler itself - how we want it to translate the following {\em assembly language instructions} - into {\em machine language instructions.} -\item Line 4 shows a {\em label} named {\em \_start}. 
The colon - at the end is the indicator to the assembler that causes it to - recognize the preceeding characters as a label. -\item Lines 5-8 are the four assembly language instructions that - make up the program. Each instruction in this program - consists of four {\em fields}. (Different instructions can have - a different number of fields.) The fields on line 5 are: - - \begin{itemize} - \item [addi] The instruction mneumonic. It indicates the operation - that the CPU will perform. - \item [x28] The {\em destination} register that will receive the - sum when the {\em addi} instruction is finished. The names of - the 32 registers are expressed as x0 -- x31. - \item [x0] One of the addends of the sum operation. (The x0 register - will always contain the vlaue zero. It can never be changed.) - \item [0] The second addend is the number zero. - \item [\# set \ldots] Any text anywhere in a RISC-V assembly language - program that starts with the pound-sign is ignored by the assembler. - They are used to place a {\em comment} in the program to help - the reader better understand the motive of the programmer. - \end{itemize} -\end{itemize} - - -\subsection{Running a Program With rvddt} -\index{rvddt} - -To illustrate what a CPU does when it executes instructions this text -will use a simulator that shows sequence of events and the binary values -involved. \listingRef{zero4regs.out} shows the operation of the four -{\em addi} instructions from \listingRef{zero4regs.S} when executed using the -\gls{rvddt} simulator.\footnote{The {\em rvddt} application was written to -generate the listings for this text. It is similar to the fancier -{\em spike} simulator. 
Given the simplicity of the RV32I ISA, rvddt -is less than 1700 lines of C++ and was written in one (long) afternoon.} - -\listing{zero4regs.out}{Running a program with the rvddt simulator} - -\begin{itemize} -\item [$\ell$ 1] This listing includes the command-line that shows how the simulator - was executed to load a file containing the machine instructions (aka - machine code) from the assembler. -\item [$\ell$ 2] A message from the simulator indicating that it loaded the machine - code into simulated memory at address 0. -\item [$\ell$ 3] This line shows the prompt from the debugger and the command - \verb@t4@ that the user entered to request that the simulator trace - the execution of four extructions. -\item [$\ell$ 4-8] Prior to executing the first instruction, the state of the - CPU registers is displayed. -\item [$\ell$ 4] The values in registers 0, 1, 2, 3, 4, 5, 6 and 7 are printed - from left to right in \gls{bigendian}, \gls{hexadecimal} form. - The dash `\verb@-@' character in the middle of the line is a reference - to make it easier to visually navigate across the line without being - forced to count the values from the far left when seeking the value - of, say, x5. -\item [$\ell$ 5-7] The values of registers 8--31 are printed. -\item [$\ell$ 8] The {\em program counter} (\reg{pc}) register is printed. - It contains the address of the instruction that the CPU will execute. - After each instruction, the \reg{pc} will either advance four bytes - ahead or be set to another value by a branch instruction as discussed above. -\item [$\ell$ 9] A four-byte instruction is fetched from memory at the address - in the \reg{pc} register, is decoded and printed. From left to right - the fields shown on this line are: - - \begin{itemize} - - \item [00000000] The memory address from which the instruction was - fetched. This address is displayed in \gls{bigendian}, - \gls{hexadecimal} form. 
- \item [00000e13] The machine code of the instruction displayed in - \gls{bigendian}, \gls{hexadecimal} form. - \item [addi] The mneumonic for the machine instruction. - \item [x28] The \reg{rd} field of the addi instruction. - \item [x0] The \reg{rs1} field of the addi instruction that - holds one of the two addends of the operation. - \item [0] The \reg{imm} field of the addi instruction that - holds the second of the two addends of the operation. - \item [\# \ldots] A simulator-generated comment that exaplains - what the instruction is doing. For this instruction it indicates - that \reg{x28} will have the value zero stored into it as a result - of performing the addition: $0+0$. - \end{itemize} - -\item [$\ell$ 10-14] These lines are printed as the prelude while tracing the - second instruction. Lines 7 and 13 show that \reg{x28} has changed - from \verb@f0f0f0f0@ to \verb@00000000@ as a result of executing the - first instruction and lines 8 and 14 show that the \reg{pc} has - advanced from zero (the location of the first instruction) to - four, where the second instruction will be fetched. None of the - rest of the registers have changed values. -\item [$\ell$ 15] The second instruction decoded executed and described. - This time register \reg{x29} will be assigned a value. -\item [$\ell$ 16-27] The third and fourth instructions are traced. -\item [$\ell$ 28] Tracing has completed. The simulator prints its prompt - and the user enters the `r' command to see the register state - after the fourth instruction has completed executing. -\item [$\ell$ 29-33] Following the fourth instruction it can be observed - that registers \reg{x28}, \reg{x29}, \reg{x30} and \reg{x31} - have been set to zero and that the \reg{pc} has advanced from - zero to four, then eight, then 12 (the hex value for 12 is c) - and then to 16 (which, in hex, is 10). -\item [$\ell$ 34] The simulator exit command `x' is entered by the user and - the terminal displays the shell prompt. 
- -\end{itemize} +Once the instruction execution phase has completed, the next instruction +cycle will be performed using the new \reg{pc} register address. diff --git a/book/toolchain/chapter.tex b/book/toolchain/chapter.tex index c27c550..3ec2b68 100644 --- a/book/toolchain/chapter.tex +++ b/book/toolchain/chapter.tex @@ -1,57 +1,11 @@ -\chapter{The RISC-V GNU Toolchain} +\chapter{Using The RISC-V GNU Toolchain} -This chapter discusses the GNU toolchain elements used to +This chapter discusses using the GNU toolchain elements to experiment with the material in this book. -The\enote{It would be good to find some Mac and Windows users to write -and test proper variations on this section to address those systems. -Pull requests, welcome!} -instructions and examples here were all implemented on Ubuntu 16.04 LTS. +See \autoref{chapter:install} if you do not already have the +GNU crosscompiler toolchain available on your system. -Install custom code in a location that will not cause interference with -other applications and allow for easy cleanup. These instructions -install the toolchain in \verb@/usr/local/riscv@. At any time -you can remove the lot and start over by executing the following -command: - -\begin{verbatim} -rm -rf /usr/local/riscv/* -\end{verbatim} - - -Tested on Ubuntu 16.04 LTS. -18.04 was just released\ldots\ update accordingly. - -These are the only commands that you should perform as root when installing -the toolchain: - -\begin{verbatim} -sudo apt-get install autoconf automake autotools-dev curl libmpc-dev \ - libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf \ - libtool patchutils bc zlib1g-dev libexpat-dev -sudo mkdir -p /usr/local/riscv/ -sudo chmod 777 /usr/local/riscv/ -\end{verbatim} - -All other commands should be executed as a regular user. This will eliminate the -possibility of clobbering system files that should not be touched when tinkering with -the toolchain applicaitons.
- -To download, compile and ``install'' the toolchain: - -\begin{verbatim} -# riscv toolchain: -# -# https://riscv.org/software-tools/risc-v-gnu-compiler-toolchain/ - -git clone --recursive https://github.com/riscv/riscv-gnu-toolchain -cd riscv-gnu-toolchain -./configure --prefix=/usr/local/riscv/rv32i --with-arch=rv32i --with-abi=ilp32 -make -make install -\end{verbatim} - -Need to discuss augmenting the PATH environment variable. Discuss the choice of ilp32 as well as what the other variations would do.