Refactor, reorganize, rephrase,...

This commit is contained in:
John Winans 2018-05-12 16:55:21 -05:00
parent 3276b58be4
commit 4ad6e160c8
19 changed files with 747 additions and 601 deletions


@ -1,7 +1,7 @@
TOP=..
include $(TOP)/Make.rules
TEXPATH=./numbers:./intro:./rv32:./copyright:./license
TEXPATH=./float:./intro:./rv32:./copyright:./license:./elements
SUBDIRS=


@ -1,11 +1,137 @@
\chapter{Number Systems}
\label{chapter:NumberSystems}
\chapter{Numbers and Storage Systems}
\label{chapter:numbers}
RISC-V systems represent information using binary values stored in
little-endian order.\footnote{See~\cite{IEN137} for some history of
the big/little-endian ``controversy.''}
This chapter discusses how data are represented and stored in a computer.
\section{Integers}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\section{Context}
%
%Numbers can be interpreted differently depending on the context in
%which they are used. For example a number may represent the quantity
%of millimeters between two points. It may enumerate
%a letter of the alphabet -- i.e., $01000001=A$, $01000010=B$,
%$01000011=C$\ldots\ In fact, any finite set of items can be identified
%(enumerated) by assigning a code number to each element in this fashion.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Logical/Boolean Functions}
\enote{This is unclear. Need to define bit positions and probably
should add basic truth table diagrams.}%
Unlike addition and subtraction, boolean functions apply
on a per-bit basis.
%in that they do not impact neighboring bits.
%by generating things like a carry or a borrow.
When applied to multi-bit values, each bit position is operated upon
independently of the other bits.
\enote{Need to define 1 as true and 0 as false somewhere.}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{NOT}
The {\em NOT} operator applies to a single operand and represents the
opposite of the input.
\enote{Need to define unary, binary and ternary operators without
confusing binary operators with binary numbers.}
If the input is 1 then the output is 0. If the input is 0 then the
output is 1. In other words, the output value is {\em not} that of the
input value.
This text will use the operator used in the C language when discussing
the {\em NOT} operator in symbolic form. Specifically the tilde: `\verb@~@'.
\begin{verbatim}
~ 1 1 1 1 0 1 0 1 <== A
-----------------
0 0 0 0 1 0 1 0 <== output
\end{verbatim}
In a line of code the above might read like this: \verb@output = ~A@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{AND}
The boolean {\em and} function has two or more inputs and the output is a
single bit. The output is 1 if and only if all of the input values are 1.
Otherwise it is 0.
This text will use the operator used in the C language when discussing
the {\em AND} operator in symbolic form. Specifically the ampersand: `\verb@&@'.
This function works like it does in spoken language. For example
if A is 1 {\em AND} B is 1 then the output is 1 (true).
Otherwise the output is 0 (false). For example:
\begin{verbatim}
1 1 1 1 0 1 0 1 <== A
& 1 0 0 1 0 0 1 1 <== B
-----------------
1 0 0 1 0 0 0 1 <== output
\end{verbatim}
In a line of code the above might read like this: \verb@output = A & B@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{OR}
The boolean {\em or} function has two or more inputs and the output is a
single bit. The output is 1 if at least one of the input values is 1.
This text will use the operator used in the C language when discussing
the {\em OR} operator in symbolic form. Specifically the pipe: `\verb@|@'.
This function works like it does in spoken language. For example
if A is 1 {\em OR} B is 1 then the output is 1 (true).
Otherwise the output is 0 (false). For example:
\begin{verbatim}
1 1 1 1 0 1 0 1 <== A
| 1 0 0 1 0 0 1 1 <== B
-----------------
1 1 1 1 0 1 1 1 <== output
\end{verbatim}
In a line of code the above might read like this: \verb@output = A | B@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{XOR}
The boolean {\em exclusive or} function has two or more inputs and the
output is a single bit. The output is 1 if and only if an odd number of inputs
are 1. Otherwise the output will be 0.
This text will use the operator used in the C language when discussing
the {\em XOR} operator in symbolic form. Specifically the caret: `\verb@^@'.
Note that when {\em XOR} is used with two inputs, the output
is set to 1 (true) when the inputs have different values and 0
(false) when the inputs both have the same value.
For example:
\begin{verbatim}
1 1 1 1 0 1 0 1 <== A
^ 1 0 0 1 0 0 1 1 <== B
-----------------
0 1 1 0 0 1 1 0 <== output
\end{verbatim}
In a line of code the above might read like this: \verb@output = A ^ B@
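The four operators described above can be tried directly in C. The following is a minimal sketch and is not one of this text's listings; the helper names (\verb@bits_not@ and so on) are hypothetical, and the operands are assumed to be 8 bits wide to match the diagrams.

```c
#include <stdint.h>

/* Each helper applies one boolean function to every bit position of an
 * 8-bit value.  The casts matter: C promotes uint8_t operands to int
 * before applying the operator, so ~a would otherwise carry more than
 * eight bits. */
uint8_t bits_not(uint8_t a)            { return (uint8_t)~a; }
uint8_t bits_and(uint8_t a, uint8_t b) { return (uint8_t)(a & b); }
uint8_t bits_or(uint8_t a, uint8_t b)  { return (uint8_t)(a | b); }
uint8_t bits_xor(uint8_t a, uint8_t b) { return (uint8_t)(a ^ b); }
```

With A set to \verb@0xF5@ (11110101) and B set to \verb@0x93@ (10010011), these helpers return \verb@0x0A@, \verb@0x91@, \verb@0xF7@ and \verb@0x66@ respectively, which are the output rows shown in the diagrams above.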
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Integers and Counting}
A binary integer is constructed with only 1s and 0s in the same
manner as decimal numbers are constructed with values from 0 to 9.
@ -408,343 +534,85 @@ Disscuss the details of truncation and overflow here.
{\em truncation} and {\em overflow} as occur with signed and unsigned
addition and subtraction.}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Logical/Boolean Functions}
Unlike addition and subtraction, boolean functions apply
on a per-bit basis.
%in that they do not impact neighboring bits.
%by generating things like a carry or a borrow.
When applied to multi-bit values, each bit position is operated upon
independently of the other bits.
\enote{This is unclear. Need to define bit positions and probably
should add basic truth table diagrams.}
\enote{Need to define 1 as true and 0 as false somewhere.}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{NOT}
The {\em NOT} operator applies to a single operand and represents the
opposite of the input.
\enote{Need to define unary, binary and ternary operators without
confusing binary operators with binary numbers.}
If the input is 1 then the output is 0. If the input is 0 then the
output is 1. In other words, the output value is {\em not} that of the
input value.
This text will use the operator used in the C language when discussing
the {\em NOT} operator in symbolic form. Specifically the tilde: `\verb@~@'.
\begin{verbatim}
~ 1 1 1 1 0 1 0 1 <== A
-----------------
0 0 0 0 1 0 1 0 <== output
\end{verbatim}
In a line of code the above might read like this: \verb@output = ~A@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{AND}
The boolean {\em and} function has two or more inputs and the output is a
single bit. The output is 1 if and only if all of the input values are 1.
Otherwise it is 0.
This text will use the operator used in the C language when discussing
the {\em AND} operator in symbolic form. Specifically the ampersand: `\verb@&@'.
This function works like it does in spoken language. For example
if A is 1 {\em AND} B is 1 then the output is 1 (true).
Otherwise the output is 0 (false). For example:
\begin{verbatim}
1 1 1 1 0 1 0 1 <== A
& 1 0 0 1 0 0 1 1 <== B
-----------------
1 0 0 1 0 0 0 1 <== output
\end{verbatim}
In a line of code the above might read like this: \verb@output = A & B@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{OR}
The boolean {\em or} function has two or more inputs and the output is a
single bit. The output is 1 if at least one of the input values is 1.
This text will use the operator used in the C language when discussing
the {\em OR} operator in symbolic form. Specifically the pipe: `\verb@|@'.
This function works like it does in spoken language. For example
if A is 1 {\em OR} B is 1 then the output is 1 (true).
Otherwise the output is 0 (false). For example:
\begin{verbatim}
1 1 1 1 0 1 0 1 <== A
| 1 0 0 1 0 0 1 1 <== B
-----------------
1 1 1 1 0 1 1 1 <== output
\end{verbatim}
In a line of code the above might read like this: \verb@output = A | B@
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{XOR}
The boolean {\em exclusive or} function has two or more inputs and the
output is a single bit. The output is 1 if and only if an odd number of inputs
are 1. Otherwise the output will be 0.
This text will use the operator used in the C language when discussing
the {\em XOR} operator in symbolic form. Specifically the caret: `\verb@^@'.
Note that when {\em XOR} is used with two inputs, the output
is set to 1 (true) when the inputs have different values and 0
(false) when the inputs both have the same value.
For example:
\begin{verbatim}
1 1 1 1 0 1 0 1 <== A
^ 1 0 0 1 0 0 1 1 <== B
-----------------
0 1 1 0 0 1 1 0 <== output
\end{verbatim}
In a line of code the above might read like this: \verb@output = A ^ B@
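The rule for more than two inputs (the output is 1 when an odd number of inputs are 1) can be checked by feeding every bit of a byte into a running {\em XOR}. This is only an illustrative sketch; the function name \verb@xor_of_bits@ is hypothetical.

```c
#include <stdint.h>

/* XOR together all eight bits of v, treating each bit as one input.
 * Returns 1 when an odd number of the bits are 1, otherwise 0. */
unsigned xor_of_bits(uint8_t v)
{
    unsigned result = 0;
    for (int i = 0; i < 8; ++i) {
        result ^= (v >> i) & 1u;   /* fold bit i into the running XOR */
    }
    return result;
}
```

For example, \verb@0x07@ has three 1 bits, so the result is 1, while \verb@0xF5@ has six 1 bits, so the result is 0.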
%\section{Context}
%
%Numbers can be interpreted differently depending on the context in
%which they are used. For example a number may represent the quantity
%of millimeters between two points. It may enumerate
%a letter of the alphabet -- i.e., $01000001=A$, $01000010=B$,
%$01000011=C$\ldots\ In fact, any finite set of items can be identified
%(enumerated) by assigning a code number to each element in this fashion.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{IEEE-754 Floating Point Number Representation}
\label{chapter::floatingpoint}
\section{Main Memory Storage}
This section provides an overview of the IEEE-754 32-bit binary floating
point format.
\enote{Refactor this section and the memory discussion in RV32 reference chapter}%
When transferring data between its registers and main memory a
RISC-V system uses the little-endian byte order.\footnote{
See~\cite{IEN137} for some history of the big/little-endian ``controversy.''}
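Little-endian order means that the least significant byte of a multi-byte value is stored at the lowest address. The following is a minimal sketch in C, assuming a hypothetical byte array \verb@mem@ that stands in for main memory; the function names are illustrative only. Shifts are used instead of pointer casts so that the result does not depend on the byte order of the machine running the sketch.

```c
#include <stdint.h>

/* Store a 32-bit value into mem[0..3] in little-endian order:
 * the least significant byte goes to the lowest address. */
void store_le32(uint8_t *mem, uint32_t value)
{
    mem[0] = (uint8_t)(value & 0xffu);
    mem[1] = (uint8_t)((value >> 8) & 0xffu);
    mem[2] = (uint8_t)((value >> 16) & 0xffu);
    mem[3] = (uint8_t)((value >> 24) & 0xffu);
}

/* Reassemble a 32-bit value from four little-endian bytes. */
uint32_t load_le32(const uint8_t *mem)
{
    return (uint32_t)mem[0]
         | ((uint32_t)mem[1] << 8)
         | ((uint32_t)mem[2] << 16)
         | ((uint32_t)mem[3] << 24);
}
```

Storing \verb@0x12345678@ with \verb@store_le32@ leaves the bytes \verb@78 56 34 12@ in ascending address order.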
\begin{itemize}
\item Recall that the place values for integer binary numbers are:
\begin{verbatim}
... 128 64 32 16 8 4 2 1
\end{verbatim}
\item We can extend this to the right in binary similar to the way we do for
decimal numbers:
\begin{verbatim}
... 128 64 32 16 8 4 2 1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128 ...
\end{verbatim}
The `.' in a binary number is a binary point, not a decimal point.
\enote{Discuss byte ordering, addressing and character strings.}
\item We use scientific notation as in $2.7 \times 10^{-47}$ to express either
small fractions or large numbers when we are not concerned with every last digit
needed to represent the entire, exact, value of a number.
\subsection{Memory Dump}
\item The format of a number in scientific notation is $mantissa \times base^{exponent}$
Introduce the memory dump and how to read them here.
\item In binary we have $mantissa \times 2^{exponent}$
Discuss the pitfalls of assuming what a set of bytes is used for based
on their contents!
\item IEEE-754 format requires binary numbers to be {\em normalized} to
$1.significand \times 2^{exponent}$ where the {\em significand}
is the portion of the {\em mantissa} that is to the right of the binary-point.
\subsection{Big Endian Representation}
\begin{itemize}
\item The unnormalized binary value of $-2.625$ is $-10.101$
\item The normalized value of $-2.625$ is $-1.0101 \times 2^1$
\end{itemize}
Using the memory dump contents in prior section, discuss how
big endian values are stored.
\item We need not store the `1.' because {\em all} normalized floating
point numbers will start that way. Thus we can save memory by storing
only the significand and treating the leading `1.' as implied.
\subsection{Little Endian Representation}
{
\small
\setlength{\unitlength}{.15in}
\begin{picture}(32,4)(0,0)
\put(0,1){\line(1,0){32}} % bottom line
\put(0,2){\line(1,0){32}} % top line
Using the memory dump contents in prior section, discuss how
little endian values are stored.
\put(0,1){\line(0,1){2}} % left vertical
\put(0,2){\makebox(1,1){\tiny 31}} % left end bit number marker
\subsection{Character Strings and Arrays}
\put(32,1){\line(0,1){2}} % vertical right end
\put(31,2){\makebox(1,1){\tiny 0}} % right end bit number marker
Define character strings and arrays.
\put(0,0){\makebox(1,1){\small sign}}
\put(1,0){\makebox(8,1){\small exponent}}
\put(9,0){\makebox(23,1){\small significand}}
Using the prior memory dump, discuss how and where things are stored and
retrieved.
\put(0,1){\makebox(1,1){1}} % sign
\subsection{Alignment}
\put(1,1){\line(0,1){2}} % separator
\put(1,2){\makebox(1,1){\tiny 30}} % bit marker
Draw a diagram showing the overlapping data types when they are all aligned.
\put(1,1){\makebox(1,1){1}} % exponent
\put(2,1){\makebox(1,1){0}}
\put(3,1){\makebox(1,1){0}}
\put(4,1){\makebox(1,1){0}}
\put(5,1){\makebox(1,1){0}}
\put(6,1){\makebox(1,1){0}}
\put(7,1){\makebox(1,1){0}}
\put(8,1){\makebox(1,1){0}}
\put(8,2){\makebox(1,1){\tiny 23}} % bit marker
\put(9,1){\line(0,1){2}} % separator
\put(9,2){\makebox(1,1){\tiny 22}} % bit marker
\subsection{Instruction Alignment}
\put(9,1){\makebox(1,1){0}}
\put(10,1){\makebox(1,1){1}}
\put(11,1){\makebox(1,1){0}}
\put(12,1){\makebox(1,1){1}}
\put(13,1){\makebox(1,1){0}}
\put(14,1){\makebox(1,1){0}}
\put(15,1){\makebox(1,1){0}}
\put(16,1){\makebox(1,1){0}}
\put(17,1){\makebox(1,1){0}}
\put(18,1){\makebox(1,1){0}}
\put(19,1){\makebox(1,1){0}}
\put(20,1){\makebox(1,1){0}}
\put(21,1){\makebox(1,1){0}}
\put(22,1){\makebox(1,1){0}}
\put(23,1){\makebox(1,1){0}}
\put(24,1){\makebox(1,1){0}}
\put(25,1){\makebox(1,1){0}}
\put(26,1){\makebox(1,1){0}}
\put(27,1){\makebox(1,1){0}}
\put(28,1){\makebox(1,1){0}}
\put(29,1){\makebox(1,1){0}}
\put(30,1){\makebox(1,1){0}}
\put(31,1){\makebox(1,1){0}}
\end{picture}
}
\enote{Rewrite this section for data rather than instructions and then
note here that instructions must be naturally aligned. For RV32 that
is on a 4-byte boundary}%
Every possible instruction that an RV32I CPU can execute contains
exactly 32 bits. Therefore each one must be stored in four bytes
of the main memory.
%\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -(1 \frac{5}{16} \times 2^{1}) = -(1.3125 \times 2^{1}) = -2.625$
\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -((1 + \frac{1}{4} + \frac{1}{16}) \times 2^1) = -(2 + \frac{1}{2} + \frac{1}{8}) = -(2 + .5 + .125) = -2.625$
To simplify the hardware, each instruction must be placed into four
adjacent bytes whose numeric address sequence begins with a multiple
of four. For example, an instruction might be located in bytes
4, 5, 6 and 7 (but not in 5, 6, 7 and 8 nor in 9, 3, 1, and 0\ldots).
\item IEEE754 formats:
This sort of addressing requirement is common and is referred to as
\gls{alignment}. An aligned instruction begins at a memory address
that is a multiple of four. An {\em unaligned} instruction would
be one beginning at any other address and is {\em illegal}.
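The multiple-of-four rule can be expressed as a one-line test. The sketch below is an illustration in C and is not how any particular CPU implements the check; the name \verb@is_aligned4@ is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* An address is a multiple of four exactly when its two
 * low-order bits are both zero. */
bool is_aligned4(uint32_t addr)
{
    return (addr & 3u) == 0;
}
```

An instruction starting at address 4 passes this test, while one starting at address 5 does not.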
\begin{tabular}{|l|l|l|}
\hline
& IEEE754 32-bit & IEEE754 64-bit \\
\hline
sign & 1 bit & 1 bit \\
exponent & 8 bits (excess-127) & 11 bits (excess-1023) \\
mantissa & 23 bits & 52 bits \\
max exponent & 127 & 1023 \\
min exponent & -126 & -1022 \\
\hline
\end{tabular}
An attempt to fetch an instruction from an unaligned address
will result in an error referred to as an alignment {\em \gls{exception}}.
This and other exceptions cause the CPU to stop executing the
current instruction and start executing a different set of instructions
that are prepared to handle the problem. Often an exception is
handled by completely stopping the program in a way that is commonly
referred to as a system or application {\em crash}.
\item When the exponent is all ones, the mantissa is all zeros, and
the sign is zero, the number represents positive infinity.
Given a properly aligned instruction address, the CPU can request
that the main memory locate and deliver the values of the four bytes
in the address sequence to the CPU using what is called a memory
read operation. Some systems can deliver four (or more) bytes at the
same time while others might only be capable of delivering one or
two bytes at a time. These differences in hardware typically impact the
cost and performance of a system.\footnote{The design and implementation
choices that determine how any given system operates are part of what is
called a system's {\em organization} and are beyond the scope of this text.
See~\cite{codriscv:2017} for more information on computer organization.}
\item When the exponent is all ones, the mantissa is all zeros, and
the sign is one, the number represents negative infinity.
\item Note that the binary representation of an IEEE754 number in memory
can be compared for magnitude with another one using the same logic as for
comparing two's complement signed integers because the magnitude of an
IEEE number grows upward and downward in the same fashion as signed integers.
This is why we use excess notation and locate the significand's sign bit on
the left of the exponent.
\item Note that zero is a special case number. Recall that a normalized
number has an implied 1-bit to the left of the significand\ldots\ which
means that there is no way to represent zero!
Zero is represented by an exponent of all-zeros and a significand of
all-zeros. This definition allows for a positive and a negative zero
if we observe that the sign can be either 1 or 0.
\item On the number-line, numbers between zero and the smallest fraction in
either direction are in the {\em \gls{underflow}} areas.
\enote{Need to add the standard lecture numberline diagram showing
where the over/under-flow areas are and why.}
\item On the number line, numbers greater than the mantissa of all-ones and the
largest exponent allowed are in the {\em \gls{overflow}} areas.
\item Note that numbers have a higher resolution on the number line when the
exponent is smaller.
\end{itemize}
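The field layout described in the list above can be inspected from C. This is a minimal sketch that assumes the C type \verb@float@ is stored in the IEEE-754 32-bit format (true of mainstream compilers, but not guaranteed by the C language); the structure and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

struct ieee754_fields {
    uint32_t sign;        /* 1 bit                       */
    uint32_t exponent;    /* 8 bits, stored excess-127   */
    uint32_t significand; /* 23 bits, leading 1. implied */
};

/* Copy the raw bits of a float into an integer without changing them. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Split the raw bits into the three fields shown in the diagram. */
struct ieee754_fields split_float(float f)
{
    uint32_t bits = float_bits(f);
    struct ieee754_fields r;
    r.sign        = bits >> 31;
    r.exponent    = (bits >> 23) & 0xffu;
    r.significand = bits & 0x7fffffu;
    return r;
}
```

For $-2.625$ the raw bits are \verb@0xC0280000@: a sign of 1, a stored exponent of 128 (that is, $128-127=1$) and a significand of \verb@0x280000@, which is the bit pattern $0101$ followed by zeros.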
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Floating Point Number Accuracy}
Due to the finite number of bits used to store the value of a floating point
number, it is not possible to represent every one of the infinite values
on the real number line. The following C programs illustrate this point.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Powers Of Two}
Just like the integer numbers, the powers of two that have bits to represent
them can be represented perfectly\ldots\ as can their sums (provided that the
significand requires no more than 23 bits.)
\listing{powersoftwo.c}{Precise Powers of Two}
\listing{powersoftwo.out}{Output from {\tt powersoftwo.c}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Clean Decimal Numbers}
When dealing with decimal values, you will find that they don't map simply
into binary floating point values.
% (the same holds true for binary integer numbers).
Note how the decimal numbers are not accurately represented as they get larger.
The decimal number on line 10 of \listingRef{cleandecimal.out}
can be perfectly represented in IEEE format. However, a problem arises in
the 11th loop iteration. It is due to the fact that the
binary number can not be represented accurately in IEEE format. Its least
significant bits were truncated in a best-effort attempt at rounding the value
off in order to fit the value into the bits provided. This is an example of
{\em low order truncation}. Once this happens, the value of \verb@x.f@ is
no longer as precise as it could be given more bits in which to save its value.
\listing{cleandecimal.c}{Print Clean Decimal Numbers}
\listing{cleandecimal.out}{Output from {\tt cleandecimal.c}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Accumulation of Error}
These rounding errors can be exaggerated when the number we multiply
the \verb@x.f@ value by is, itself, something that can not be accurately
represented in IEEE
form.\footnote{Applications requiring accurate decimal values, such as
financial accounting systems, can use a packed-decimal numeric format
to avoid unexpected oddities caused by the use of binary numbers.}
\enote{In a lecture one would show that one tenth is a repeating
non-terminating binary number that gets truncated. This discussion
should be reproduced here in text form.}
For example, if we multiply our \verb@x.f@ value by $\frac{1}{10}$ each time,
we can never be accurate and we start accumulating errors immediately.
\listing{erroraccumulation.c}{Accumulation of Error}
\listing{erroraccumulation.out}{Output from {\tt erroraccumulation.c}}
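The reason that $\frac{1}{10}$ can not be stored exactly is visible in its bit pattern: in binary it is $0.000110011001100\ldots$ with the group $1100$ repeating forever, so it must be cut off to fit. The following sketch, which is separate from the listings above and assumes that \verb@float@ is an IEEE-754 32-bit value, extracts the stored bits; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Copy the raw bits of a float into an integer without changing them. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Return only the 23 stored significand bits. */
uint32_t significand_of(float f)
{
    return float_bits(f) & 0x7fffffu;
}
```

The raw bits of \verb@0.1f@ are \verb@0x3DCCCCCD@. Its significand is \verb@0x4CCCCD@: the repeating $1100$ groups appear as the hex digit C, and the final D is where the endless pattern was rounded off to fit into 23 bits.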
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Reducing Error Accumulation}
In order to use floating point numbers in a program without causing
excessive rounding problems an algorithm can be redesigned such that the
accumulation is eliminated.
This example is similar to the previous one, but this time we recalculate the
desired value from a known-accurate integer value.
Some rounding errors remain present, but they can not accumulate.
\listing{errorcompensation.c}{Reducing Error Accumulation}
\listing{errorcompensation.out}{Output from {\tt errorcompensation.c}}


@ -59,7 +59,8 @@
%\part{Introduction}
\include{intro/chapter}
\include{numbers/chapter}
\include{binary/chapter}
\include{elements/chapter}
\include{toolchain/chapter}
\include{rv32/chapter}
@ -67,6 +68,8 @@
% These 'chapters' are lettered rather than numbered
\appendix
\include{install/chapter}
\include{float/chapter}
\include{license/chapter}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

149
book/elements/chapter.tex Normal file

@ -0,0 +1,149 @@
\chapter{The Elements of an Assembly Language Program}
\label{chapter:elements}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{A Sample Program Source Listing}
A simple program that illustrates how this text presents
program source code is seen in \listingRef{zero4regs.S}.
This program will place a zero in each of the 4 registers
named x28, x29, x30 and x31.
\listing{zero4regs.S}{Setting four registers to zero.}
This program listing illustrates a number of things:
\begin{itemize}
\item Listings are identified by the name of the file within which
they are stored. This listing is from a file named: \verb@zero4regs.S@.
\item The assembly language programs discussed in this text will be saved
in files that end with: \verb@.S@ (Alternately you can use \verb@.sx@
on systems that don't understand the difference between upper and
lowercase letters.\footnote{The author of this text prefers to avoid
using such systems.})
\item A description of the listing's purpose appears under the name of the
file. The description of \listingRef{zero4regs.S} is
{\em Setting four registers to zero.}
\item The lines of the listing are numbered on the left margin for
easy reference.
\item An assembly program consists of lines of plain text.
\item The RISC-V ISA does not provide an operation that will simply
set a register to a numeric value. To accomplish our goal this
program will add zero to zero and place the sum in each of the
four registers.
\item The lines that start with a dot `.' (on lines 1, 2 and 3) are
called {\em assembler directives} as they tell the assembler itself
how we want it to translate the following {\em assembly language instructions}
into {\em machine language instructions.}
\item Line 4 shows a {\em label} named {\em \_start}. The colon
at the end is the indicator to the assembler that causes it to
recognize the preceding characters as a label.
\item Lines 5-8 are the four assembly language instructions that
make up the program. Each instruction in this program
consists of four {\em fields}. (Different instructions can have
a different number of fields.) The fields on line 5 are:
\begin{itemize}
\item [addi] The instruction mnemonic. It indicates the operation
that the CPU will perform.
\item [x28] The {\em destination} register that will receive the
sum when the {\em addi} instruction is finished. The names of
the 32 registers are expressed as x0 -- x31.
\item [x0] One of the addends of the sum operation. (The x0 register
will always contain the value zero. It can never be changed.)
\item [0] The second addend is the number zero.
\item [\# set \ldots] Any text anywhere in a RISC-V assembly language
program that starts with the pound-sign is ignored by the assembler.
Such text is used to place a {\em comment} in the program to help
the reader better understand the motive of the programmer.
\end{itemize}
\end{itemize}
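The effect of the four {\em addi} instructions described above can be modeled in a few lines of C. This is only a sketch of the arithmetic, not the simulator used by this text; the array \verb@regs@ and the function names are hypothetical stand-ins for the 32 registers.

```c
#include <stdint.h>

#define NUM_REGS 32

/* Model of: addi rd, rs1, imm   (rd = rs1 + imm).
 * A write to x0 is discarded because x0 always contains zero. */
void addi(uint32_t regs[NUM_REGS], unsigned rd, unsigned rs1, int32_t imm)
{
    uint32_t sum = regs[rs1] + (uint32_t)imm;
    if (rd != 0) {
        regs[rd] = sum;
    }
}

/* The four instructions of the sample program: each one adds
 * zero to x0 and places the sum in one of x28..x31. */
void zero4regs(uint32_t regs[NUM_REGS])
{
    regs[0] = 0;            /* x0 is hard-wired to zero */
    addi(regs, 28, 0, 0);
    addi(regs, 29, 0, 0);
    addi(regs, 30, 0, 0);
    addi(regs, 31, 0, 0);
}
```

Whatever values the registers held beforehand, x28 through x31 are zero afterward and the other registers are unchanged.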
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Running a Program With rvddt}
\index{rvddt}
To illustrate what a CPU does when it executes instructions this text
will use the \gls{rvddt} simulator to display the sequence of events
and the binary values involved. This simulator supports the RV32I ISA
and has a configurable amount of memory.%
\footnote{The {\em rvddt} simulator was written to generate the listings for
this text. It is similar to the fancier {\em spike} simulator.
Given the simplicity of the RV32I ISA, rvddt is less than 1700 lines of C++
and was written in one (long) afternoon.}
\listingRef{zero4regs.out} shows the operation of the four
{\em addi} instructions from \listingRef{zero4regs.S} when it is executed
in trace-mode.
\listing{zero4regs.out}{Running a program with the rvddt simulator}
\begin{itemize}
\item [$\ell$ 1] This listing includes the command-line that shows how the simulator
was executed to load a file containing the machine instructions (aka
machine code) from the assembler.
\item [$\ell$ 2] A message from the simulator indicating that it loaded the machine
code into simulated memory at address 0.
\item [$\ell$ 3] This line shows the prompt from the debugger and the command
\verb@t4@ that the user entered to request that the simulator trace
the execution of four instructions.
\item [$\ell$ 4-8] Prior to executing the first instruction, the state of the
CPU registers is displayed.
\item [$\ell$ 4] The values in registers 0, 1, 2, 3, 4, 5, 6 and 7 are printed
from left to right in \gls{bigendian}, \gls{hexadecimal} form.
The dash `\verb@-@' character in the middle of the line is a reference
to make it easier to visually navigate across the line without being
forced to count the values from the far left when seeking the value
of, say, x5.
\item [$\ell$ 5-7] The values of registers 8--31 are printed.
\item [$\ell$ 8] The {\em program counter} (\reg{pc}) register is printed.
It contains the address of the instruction that the CPU will execute.
After each instruction, the \reg{pc} will either advance four bytes
ahead or be set to another value by a branch instruction as discussed above.
\item [$\ell$ 9] A four-byte instruction is fetched from memory at the address
in the \reg{pc} register, is decoded and printed. From left to right
the fields shown on this line are:
\begin{itemize}
\item [00000000] The memory address from which the instruction was
fetched. This address is displayed in \gls{bigendian},
\gls{hexadecimal} form.
\item [00000e13] The machine code of the instruction displayed in
\gls{bigendian}, \gls{hexadecimal} form.
\item [addi] The mnemonic for the machine instruction.
\item [x28] The \reg{rd} field of the addi instruction.
\item [x0] The \reg{rs1} field of the addi instruction that
holds one of the two addends of the operation.
\item [0] The \reg{imm} field of the addi instruction that
holds the second of the two addends of the operation.
\item [\# \ldots] A simulator-generated comment that explains
what the instruction is doing. For this instruction it indicates
that \reg{x28} will have the value zero stored into it as a result
of performing the addition: $0+0$.
\end{itemize}
\item [$\ell$ 10-14] These lines are printed as the prelude while tracing the
second instruction. Lines 7 and 13 show that \reg{x28} has changed
from \verb@f0f0f0f0@ to \verb@00000000@ as a result of executing the
first instruction and lines 8 and 14 show that the \reg{pc} has
advanced from zero (the location of the first instruction) to
four, where the second instruction will be fetched. None of the
rest of the registers have changed values.
\item [$\ell$ 15] The second instruction is decoded, executed and described.
This time register \reg{x29} will be assigned a value.
\item [$\ell$ 16-27] The third and fourth instructions are traced.
\item [$\ell$ 28] Tracing has completed. The simulator prints its prompt
and the user enters the `r' command to see the register state
after the fourth instruction has completed executing.
\item [$\ell$ 29-33] Following the fourth instruction it can be observed
that registers \reg{x28}, \reg{x29}, \reg{x30} and \reg{x31}
have been set to zero and that the \reg{pc} has advanced from
zero to four, then eight, then 12 (the hex value for 12 is c)
and then to 16 (which, in hex, is 10).
\item [$\ell$ 34] The simulator exit command `x' is entered by the user and
the terminal displays the shell prompt.
\end{itemize}
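The machine code value \verb@00000e13@ seen in the trace can be reproduced by packing the \reg{rd}, \reg{rs1} and \reg{imm} fields into a 32-bit word. The sketch below assumes the I-type field positions and the {\em addi} opcode value as given in the RISC-V base ISA specification; the function name \verb@encode_addi@ is hypothetical.

```c
#include <stdint.h>

/* Pack the fields of an addi instruction into its 32-bit machine code.
 * Assumed I-type layout, from most to least significant bits:
 *   imm[11:0] (12 bits) | rs1 (5) | funct3 (3) | rd (5) | opcode (7) */
uint32_t encode_addi(unsigned rd, unsigned rs1, int32_t imm)
{
    const uint32_t opcode = 0x13u; /* integer register-immediate group */
    const uint32_t funct3 = 0x0u;  /* selects addi within that group   */

    return (((uint32_t)imm & 0xfffu) << 20)
         | (((uint32_t)rs1 & 0x1fu) << 15)
         | (funct3 << 12)
         | (((uint32_t)rd & 0x1fu) << 7)
         | opcode;
}
```

Calling \verb@encode_addi(28, 0, 0)@ returns \verb@0x00000e13@, the value shown for the first instruction. Because each encoded instruction occupies four bytes, the four instructions sit at addresses 0, 4, 8 and 12, which is why the \reg{pc} advances by four after each one.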

218
book/float/chapter.tex Normal file

@ -0,0 +1,218 @@
\chapter{Floating Point Numbers}
\label{chapter:NumberSystems}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{IEEE-754 Floating Point Number Representation}
\label{chapter::floatingpoint}
This section provides an overview of the IEEE-754 32-bit binary floating
point format.
\begin{itemize}
\item Recall that the place values for integer binary numbers are:
\begin{verbatim}
... 128 64 32 16 8 4 2 1
\end{verbatim}
\item We can extend this to the right in binary similar to the way we do for
decimal numbers:
\begin{verbatim}
... 128 64 32 16 8 4 2 1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128 ...
\end{verbatim}
The `.' in a binary number is a binary point, not a decimal point.
\item We use scientific notation as in $2.7 \times 10^{-47}$ to express either
small fractions or large numbers when we are not concerned with every last digit
needed to represent the entire, exact, value of a number.
\item The format of a number in scientific notation is $mantissa \times base^{exponent}$
\item In binary we have $mantissa \times 2^{exponent}$
\item IEEE-754 format requires binary numbers to be {\em normalized} to
$1.significand \times 2^{exponent}$ where the {\em significand}
is the portion of the {\em mantissa} that is to the right of the binary-point.
\begin{itemize}
\item The unnormalized binary value of $-2.625$ is $-10.101$
\item The normalized value of $-2.625$ is $-1.0101 \times 2^1$
\end{itemize}
\item We need not store the `1.' because {\em all} normalized floating
point numbers will start that way. Thus we can save memory when storing
normalized values by storing only the significand and adding the implied 1
back to it whenever the value is used.
{
\small
\setlength{\unitlength}{.15in}
\begin{picture}(32,4)(0,0)
\put(0,1){\line(1,0){32}} % bottom line
\put(0,2){\line(1,0){32}} % top line
\put(0,1){\line(0,1){2}} % left vertical
\put(0,2){\makebox(1,1){\tiny 31}} % left end bit number marker
\put(32,1){\line(0,1){2}} % vertical right end
\put(31,2){\makebox(1,1){\tiny 0}} % right end bit number marker
\put(0,0){\makebox(1,1){\small sign}}
\put(1,0){\makebox(8,1){\small exponent}}
\put(9,0){\makebox(23,1){\small significand}}
\put(0,1){\makebox(1,1){1}} % sign
\put(1,1){\line(0,1){2}} % seperator
\put(1,2){\makebox(1,1){\tiny 30}} % bit marker
\put(1,1){\makebox(1,1){1}} % exponent
\put(2,1){\makebox(1,1){0}}
\put(3,1){\makebox(1,1){0}}
\put(4,1){\makebox(1,1){0}}
\put(5,1){\makebox(1,1){0}}
\put(6,1){\makebox(1,1){0}}
\put(7,1){\makebox(1,1){0}}
\put(8,1){\makebox(1,1){0}}
\put(8,2){\makebox(1,1){\tiny 23}} % bit marker
\put(9,1){\line(0,1){2}} % seperator
\put(9,2){\makebox(1,1){\tiny 22}} % bit marker
\put(9,1){\makebox(1,1){0}}
\put(10,1){\makebox(1,1){1}}
\put(11,1){\makebox(1,1){0}}
\put(12,1){\makebox(1,1){1}}
\put(13,1){\makebox(1,1){0}}
\put(14,1){\makebox(1,1){0}}
\put(15,1){\makebox(1,1){0}}
\put(16,1){\makebox(1,1){0}}
\put(17,1){\makebox(1,1){0}}
\put(18,1){\makebox(1,1){0}}
\put(19,1){\makebox(1,1){0}}
\put(20,1){\makebox(1,1){0}}
\put(21,1){\makebox(1,1){0}}
\put(22,1){\makebox(1,1){0}}
\put(23,1){\makebox(1,1){0}}
\put(24,1){\makebox(1,1){0}}
\put(25,1){\makebox(1,1){0}}
\put(26,1){\makebox(1,1){0}}
\put(27,1){\makebox(1,1){0}}
\put(28,1){\makebox(1,1){0}}
\put(29,1){\makebox(1,1){0}}
\put(30,1){\makebox(1,1){0}}
\put(31,1){\makebox(1,1){0}}
\end{picture}
}
%\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -(1 \frac{5}{16} \times 2^{1}) = -(1.3125 \times 2^{1}) = -2.625$
\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -((1 + \frac{1}{4} + \frac{1}{16}) \times 2^1) = -(2 + \frac{1}{2} + \frac{1}{8}) = -(2 + .5 + .125) = -2.625$
\item IEEE754 formats:
\begin{tabular}{|l|l|l|}
\hline
& IEEE754 32-bit & IEEE754 64-bit \\
\hline
sign & 1 bit & 1 bit \\
exponent & 8 bits (excess-127) & 11 bits (excess-1023) \\
significand & 23 bits & 52 bits \\
max exponent & 127 & 1023 \\
min exponent & -126 & -1022 \\
\hline
\end{tabular}
\item When the exponent is all ones, the significand is all zeros, and
the sign is zero, the number represents positive infinity.
\item When the exponent is all ones, the significand is all zeros, and
the sign is one, the number represents negative infinity.
\item Note that the binary representation of an IEEE754 number in memory
can be compared for magnitude with another one using much the same logic as for
comparing signed integers because, with the exponent stored to the left of
the significand, a larger magnitude always has a larger bit pattern.
(Unlike two's complement integers, IEEE754 is a sign-magnitude format, so when
both numbers are negative the ordering of the bit patterns is reversed.)
This is why we use excess notation and locate the sign bit on
the left of the exponent.
\item Note that zero is a special case number. Recall that a normalized
number has an implied 1-bit to the left of the significand\ldots\ which
means that there is no way to represent zero!
Zero is represented by an exponent of all-zeros and a significand of
all-zeros. This definition allows for a positive and a negative zero
if we observe that the sign can be either 1 or 0.
\item On the number-line, numbers between zero and the smallest fraction in
either direction are in the {\em \gls{underflow}} areas.
\enote{Need to add the standard lecture numberline diagram showing
where the over/under-flow areas are and why.}
\item On the number line, numbers whose magnitude is greater than that of a
significand of all-ones with the largest allowed exponent are in the
{\em \gls{overflow}} areas.
\item Note that numbers have a higher resolution on the number line when the
exponent is smaller.
\end{itemize}
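The field layout described above can be sketched in C. The following is a
minimal illustration, not one of the listings from this text. The names
\verb@f32_fields@, \verb@f32_split@ and \verb@f32_value@ are hypothetical,
and the code assumes a normalized value (zero, the infinities and other
special cases are not handled).
\begin{verbatim}
#include <stdint.h>

/* Hypothetical helper type: the three fields of a 32-bit IEEE754 value. */
struct f32_fields {
    uint32_t sign;          /* bit 31 */
    uint32_t exponent;      /* bits 30-23, stored in excess-127 */
    uint32_t significand;   /* bits 22-0, the implied 1 is not stored */
};

/* Separate a 32-bit pattern into its sign, exponent and significand. */
struct f32_fields f32_split(uint32_t bits)
{
    struct f32_fields f;
    f.sign = bits >> 31;
    f.exponent = (bits >> 23) & 0xff;
    f.significand = bits & 0x7fffff;
    return f;
}

/* Rebuild the value of a normalized number from its fields. */
double f32_value(struct f32_fields f)
{
    /* Put the implied 1 back: 1 + significand / 2^23 */
    double v = 1.0 + f.significand / 8388608.0;
    int e = (int)f.exponent - 127;      /* remove the excess-127 bias */

    while (e > 0) { v *= 2.0; e--; }
    while (e < 0) { v /= 2.0; e++; }
    return f.sign ? -v : v;
}
\end{verbatim}
For example, splitting the bit pattern \verb@0xc0280000@ (the value drawn in
the diagram above) yields a sign of 1, an exponent of 128 and a significand of
\verb@0x280000@, from which \verb@f32_value@ computes $-2.625$.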
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Floating Point Number Accuracy}
Due to the finite number of bits used to store the value of a floating point
number, it is not possible to represent every one of the infinite values
on the real number line. The following C programs illustrate this point.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Powers Of Two}
Just like the integer numbers, the powers of two that have bits to represent
them can be represented perfectly\ldots\ as can their sums (provided that the
significand requires no more than 23 bits.)
\listing{powersoftwo.c}{Precise Powers of Two}
\listing{powersoftwo.out}{Output from {\tt powersoftwo.c}}
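One rough way to experiment with this limit is to convert an integer to a
\verb@float@ and back to see whether it survives unchanged. This is only a
sketch and is not the \verb@powersoftwo.c@ program; the function name
\verb@fits_in_float@ is made up for illustration.
\begin{verbatim}
#include <stdint.h>

/* Return 1 if n can be stored in a 32-bit float with no loss, else 0. */
int fits_in_float(uint32_t n)
{
    /* volatile forces the value to really be rounded to 32 bits */
    volatile float f = (float)n;

    /* Compare using 64 bits so that a value rounded up to 2^32 is safe. */
    return (uint64_t)f == (uint64_t)n;
}
\end{verbatim}
With this sketch $2^{24} = 16777216$ fits, while $2^{24}+1 = 16777217$ does
not because it would need 24 bits to the right of the implied `1.'.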
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Clean Decimal Numbers}
When dealing with decimal values, you will find that they don't map simply
into binary floating point values.
% (the same holds true for binary integer numbers).
Note how the decimal numbers are not accurately represented as they get larger.
The decimal number on line 10 of \listingRef{cleandecimal.out}
can be perfectly represented in IEEE format. However, a problem arises in
the 11th loop iteration. It is due to the fact that the
binary number can not be represented accurately in IEEE format. Its least
significant bits were truncated in a best-effort attempt at rounding the value
off in order to fit the value into the bits provided. This is an example of
{\em low order truncation}. Once this happens, the value of \verb@x.f@ is
no longer as precise as it could be given more bits in which to save its value.
\listing{cleandecimal.c}{Print Clean Decimal Numbers}
\listing{cleandecimal.out}{Output from {\tt cleandecimal.c}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsubsection{Accumulation of Error}
These rounding errors can be exaggerated when the number we multiply
the \verb@x.f@ value by is, itself, something that can not be accurately
represented in IEEE
form.\footnote{Applications requiring accurate decimal values, such as
financial accounting systems, can use a packed-decimal numeric format
to avoid unexpected oddities caused by the use of binary numbers.}
\enote{In a lecture one would show that one tenth is a repeating
non-terminating binary number that gets truncated. This discussion
should be reproduced here in text form.}
For example, if we multiply our \verb@x.f@ value by $\frac{1}{10}$ each time,
we can never be accurate and we start accumulating errors immediately.
\listing{erroraccumulation.c}{Accumulation of Error}
\listing{erroraccumulation.out}{Output from {\tt erroraccumulation.c}}
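To see why $\frac{1}{10}$ can never be stored exactly, its binary digits can
be generated one at a time by repeated doubling. The following sketch is an
illustration only (the function \verb@binary_fraction@ is hypothetical and is
not part of the listings above). It uses integer arithmetic so that the
digits it produces are exact.
\begin{verbatim}
/* Write the first n binary digits to the right of the binary point
 * of the fraction num/den (where 0 <= num < den) into out.
 * out must have room for n+1 characters. */
void binary_fraction(unsigned num, unsigned den, char *out, int n)
{
    unsigned r = num;                   /* the remainder so far */

    for (int i = 0; i < n; i++) {
        r *= 2;                         /* shift one place to the left */
        if (r >= den) {
            out[i] = '1';
            r -= den;
        } else {
            out[i] = '0';
        }
    }
    out[n] = '\0';
}
\end{verbatim}
For $\frac{1}{10}$ the first twelve digits are \verb@000110011001@. The
remainder returns to the same value every four digits, so the pattern
\verb@0011@ repeats forever and must be cut off to fit into a 23-bit
significand.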
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Reducing Error Accumulation}
In order to use floating point numbers in a program without causing
excessive rounding problems an algorithm can be redesigned such that the
accumulation is eliminated.
This example is similar to the previous one, but this time we recalculate the
desired value from a known-accurate integer value.
Some rounding errors remain present, but they can not accumulate.
\listing{errorcompensation.c}{Accumulation of Error}
\listing{errorcompensation.out}{Output from {\tt errorcompensation.c}}
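The difference between the two approaches can be summarized with a pair of
small functions. This is a simplified sketch under assumed names
(\verb@accumulate_tenths@ and \verb@recompute_tenths@) and is not the code
from the listings above; it uses repeated addition of one tenth as the
accumulating operation.
\begin{verbatim}
/* Add 0.1f n times. Every addition can contribute a rounding error
 * and those errors are carried forward into the next addition. */
float accumulate_tenths(int n)
{
    float x = 0.0f;

    for (int i = 0; i < n; i++)
        x += 0.1f;
    return x;
}

/* Recalculate n/10 from the exact integer n. The result is rounded
 * once and no error is carried over from any earlier calculation. */
float recompute_tenths(int n)
{
    return (float)n / 10.0f;
}
\end{verbatim}
Because \verb@recompute_tenths@ starts from an exact integer each time, any
value that can be represented, such as $\frac{10}{10}$ or $\frac{5}{10}$,
comes out exact, while the result of \verb@accumulate_tenths@ depends on how
many roundings came before it.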

View File

@ -165,6 +165,13 @@
so the programmer need not memorize the binary values of each
machine instruction}
}
\newglossaryentry{thread}
{
name={thread},
description={A stream of instructions. When plural, it is
used to refer to the ability of a CPU to execute multiple
instruction streams at the same time}
}
\newacronym{hart}{hart}{Hardware Thread}
\newacronym{msb}{MSB}{Most Significant Bit}

book/install/chapter.tex Normal file

@ -0,0 +1,72 @@
\chapter{Installing a RISC-V Toolchain}
\label{chapter:install}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{The GNU Toolchain}
Discuss the GNU toolchain elements used to experiment with the
material in this book.
\enote{It would be good to find some Mac and Windows users to write
and test proper variations on this section to address those systems.
Pull requests, welcome!}%
The instructions and examples here were all implemented on Ubuntu 16.04 LTS.
Install custom code in a location that will not cause interference with
other applications and allow for easy cleanup. These instructions
install the toolchain in \verb@/usr/local/riscv@. At any time
you can remove the lot and start over by executing the following
command:
\begin{verbatim}
rm -rf /usr/local/riscv/*
\end{verbatim}
Tested on Ubuntu 16.04 LTS.
18.04 was just released\ldots\ update accordingly.
These are the only commands that you should perform as root when installing
the toolchain:
\begin{verbatim}
sudo apt-get install autoconf automake autotools-dev curl libmpc-dev \
libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf \
libtool patchutils bc zlib1g-dev libexpat-dev
sudo mkdir -p /usr/local/riscv/
sudo chmod 777 /usr/local/riscv/
\end{verbatim}
All other commands should be executed as a regular user. This will eliminate the
possibility of clobbering system files that should not be touched when tinkering with
the toolchain applications.
To download, compile and ``install'' the toolchain:
\begin{verbatim}
# riscv toolchain:
#
# https://riscv.org/software-tools/risc-v-gnu-compiler-toolchain/
git clone --recursive https://github.com/riscv/riscv-gnu-toolchain
cd riscv-gnu-toolchain
./configure --prefix=/usr/local/riscv/rv32i --with-arch=rv32i --with-abi=ilp32
make
make install
\end{verbatim}
Need to discuss augmenting the PATH environment variable.
Discuss the choice of ilp32 as well as what the other variations would do.
Discuss rv32im and note that the details are found in \autoref{chapter:RV32}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{rvddt}
Discuss installing the rvddt simulator here.

View File

@ -57,6 +57,11 @@ the data and instructions that can not fit into the CPU registers.
Typically, a CPU's registers can hold tens of data values while
the main memory can contain many billions of data values.
To keep track of the data values, each register is assigned a number and
the main memory is broken up into small blocks called \gls{byte}s that
are also each assigned a number called an \gls{address}
(an address is often referred to as a {\em location.})
A CPU can process data in a register at a speed that can be an order
of magnitude faster than the rate that it can process (specifically,
transfer data and instructions to and from) the main memory.
@ -81,15 +86,72 @@ more slowly than its main memory.
This text is not particularly concerned with non-volatile storage.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{CPU}
\index{CPU}
The \acrshort{cpu} is a collection of registers and circuitry designed
to read data and instructions from the storage system. The instructions
tell the CPU to perform various mathematical and logical operations on
the data in its registers and where to save the results of those operations.
\enote{Add a block diagram of the CPU components described here.}
The \acrshort{cpu} is a collection of registers and circuitry designed
to manipulate the register data and to exchange data and instructions with the
storage system. The instructions that it reads from the main memory tell
the CPU to perform various mathematical and logical operations on the data
in its registers and where to save the results of those operations.
\subsubsection{Execution Unit}
The part of a CPU that coordinates all aspects of the operations of each
instruction is called the {\em execution unit.} It is what performs the transfers
of instructions and data between the CPU and the main memory and tells the
registers when they are supposed to either store or recall data being transferred.
The execution unit also controls the ALU (Arithmetic and Logic Unit).
\subsubsection{Arithmetic and Logic Unit}
\index{ALU}
When an instruction manipulates data by performing things like an {\em addition},
{\em subtraction}, {\em comparison} or other similar operations, the ALU is what
will calculate the sum, difference, and so on.
\subsubsection{Registers}
\index{register}
In the RV32 CPU there are 31 general purpose registers that each contain 32 \gls{bit}s
(where each bit is one \gls{binary} digit value of one or zero) and a number
of special-purpose registers.
Each of the general purpose registers is given a name such as \reg{x1}, \reg{x2},
\ldots\ on up to \reg{x31} ({\em general purpose} refers to the fact that the CPU
itself does not prescribe any particular function to any of these registers.)
Two important special-purpose registers are \reg{x0} and \reg{pc}.
Register \reg{x0} will always represent the value zero or logical {\em false}
no matter what. If any instruction tries to change the value of \reg{x0} the
operation will fail. The need for {\em zero} is so common that, other than the
fact that it is hard-wired to zero, the \reg{x0} register is made available as
if it were otherwise a general purpose register.%
\footnote{Having a special
{\em zero} register allows the total set of instructions that the CPU can execute
to be simplified, thus reducing its complexity, power consumption and cost.}
The \reg{pc} register is called the {\em program counter}. The CPU uses it to
remember the memory address where its program instructions are located.
The number of bits in each register is defined by the \acrfull{isa}.
\subsubsection{Harts}
\index{hart}
Analogous to a {\em core} in other types of CPUs, a {\em \acrshort{hart}}
(hardware \gls{thread}) in a RISC-V CPU refers to the collection of 32 registers,
instruction execution unit and ALU.
When more than one hart is present in a CPU, a different stream of instructions can
be executed on each hart all at the same time.
Programs that are written to take advantage of this are called {\em multithreaded}.
This text will primarily focus on CPUs that have only one hart.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Peripherals}
@ -106,8 +168,8 @@ instructions are used to initiate, execute and/or synchronize data transfers.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Instruction Set Architecture}
\index{ISA}
The catalog of rules that describes the details of the instructions
and features that a given CPU provides is called its \acrfull{isa}.
@ -125,80 +187,58 @@ modules and zero or more of the {\em extension} modules.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{RV Base Modules}
\index{RV32I}
The base modules are RV32I (32-bit general purpose),
RV32E (32-bit embedded), RV64I (64-bit general purpose)
and RV128I (128-bit general purpose).
These base modules provide the minimal functional set of integer operations
needed to execute an application. The differing bit-widths address
needed to execute a useful application. The differing bit-widths address
the needs of different main-memory sizes.
This text primarily focuses on the RV32I base module and how to program it.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Extension Modules}
\index{RV32M}
\index{RV32A}
\index{RV32F}
\index{RV32D}
\index{RV32Q}
\index{RV32C}
\index{RV32G}
RISC-V extension modules may be included by an implementor interested
in optimizing a design for one or more purposes.
\index{RV32M}%
\index{RV32A}%
\index{RV32F}%
\index{RV32D}%
\index{RV32Q}%
\index{RV32C}%
Available extension modules include M (integer math), A (atomic),
F (32-bit floating point), D (64-bit floating point),
Q (128-bit floating point), C (compressed size instructions) and others.
\index{RV32G}%
The extension name {\em G} is used to represent the combined set of IMAFD
extensions as it is expected to be a common combination.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{An Example Computer}
\enote{Need a block diagram and description of the virtual machine
that is used in this text.}%
The machine used to execute the programs presented in this text
has one RV32I CPU with 32 registers, one \acrshort{hart}
(analogous to what is called a {\em core} on other CPUs such as an ARM)
and 65536 bytes of memory.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Executing a Program}
\section{How the CPU Executes a Program}
To observe the operation of our example computer an RV32I simulator
will be used that will print a message describing the status of the
CPU and the instructions that it executes as it goes along.
The process of executing an instruction is called an
\index{instruction cycle}{\em instruction cycle} and it is comprised
The process of executing a program is a continuously repeating series of
\index{instruction cycle}{\em instruction cycles} that are each comprised
of an {\em instruction fetch} and an {\em instruction execute} phase.
The status of the CPU is entirely embodied in the data values that
are stored in its registers at any moment in time. The simulator
can print all of the register values before it executes an instruction
for reference.
The current status of a CPU is entirely embodied in the data values that
are stored in its registers at any moment in time. Of particular interest
to an executing program is the \reg{pc} register. The \reg{pc} contains
the memory address containing the instruction that the CPU will execute next.
When an instruction is executed the simulator can print a message
describing where in main memory it came from, its numeric machine code
value, its mnemonic, a description of any associated parameters,
the values of those parameters and then carry out the operation as
defined by the ISA.
For this to work, the instructions to be executed will have been
previously stored in a list in the main memory and any parameters that
an instruction specifies will either be part of the instruction itself
or read from (or stored into) one or more of the registers.
For this to work, the instructions to be executed must have been previously
stored in adjacent main memory locations and the address of the first instruction
placed into the \reg{pc} register.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -209,54 +249,22 @@ In order to {\em fetch} an instruction from the main memory the CPU
must have a method to identify which instruction should be fetched and
a method to fetch it.
To make this possible the main memory is broken up into small blocks
called \gls{byte}s that are each given a unique identifying number
called an \gls{address}. The process of identifying which instruction
to fetch is therefore a matter of knowing what address it is stored in.
Given that the main memory is broken up and that each of its bytes is
assigned an address, the \reg{pc} is used to hold the address of the
location where the next instruction to execute is located.
A byte is comprised of eight binary digits called \gls{bit}s.
Every possible instruction that the RV32I can execute contains
exactly 32 bits. Therefore each instruction must be stored in
four bytes of the main memory.
To simplify the hardware, each instruction
must be placed into four adjacent bytes whose numeric address sequence
begins with a multiple of four. For example, an instruction might be
located in bytes 12, 13, 14 and 15 (but not in 15, 16, 17 and 18
nor 8, 207, 5, and 1073\ldots).
This sort of addressing requirement is common and is referred to as
\gls{alignment}. An aligned instruction begins at a memory address
that is a multiple of four. An {\em unaligned} instruction would
be one beginning at any other address and is {\em illegal}.
An attempt to fetch an instruction from an unaligned address
will result in an error referred to as an alignment {\em \gls{exception}}.
This and other exceptions cause the CPU to stop executing the
current instruction and start executing a different set of instructions
that are prepared to handle the problem. Often an exception is
handled by completely stopping the program in a way that is commonly
referred to as a system or application {\em crash}.
Given a properly aligned instruction address, the CPU can request
that the main memory locate and deliver the values of the four bytes
in the address sequence to the CPU using what is called a memory
read operation. Some systems can deliver four (or more) bytes at the
same time while others might only be capable of delivering one or
two bytes at a time. These differences in hardware typically impact the
cost and performance of a system.\footnote{The design and implementation
choices that determine how any given system operates are part of what is
called a system's {\em organization} and is beyond the scope of this text.
See~\cite{codriscv:2017} for more information on computer organization.}
Given an instruction address, the CPU can request that the main memory
locate and return the value of the data stored there using what is called
a {\em memory read} operation and then the CPU can treat that {\em fetched}
value as an instruction and execute it.\footnote{RV32I instructions are
more than one byte in size, but this general description is suitable for now.}
Once an instruction has been fetched, it can be executed.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Instruction Execute}
\index{instruction execute}
Once an instruction has been fetched by the CPU, it can be executed.
Typical instructions do things like add a number to the value
currently stored in one of the registers or store the contents of a
register into the main memory at some given address.
@ -265,14 +273,19 @@ Also part of every instruction is a notion of what should be done next.
Most of the time an instruction will be complete by indicating that
the CPU should proceed to fetch and execute the instruction at the next
larger main memory address.
larger main memory address. In these cases the \reg{pc} is incremented
to point to the memory address after the current instruction.
Any parameters that an instruction requires must either be part of
the instruction itself or read from (or stored into) one or more of the
general purpose registers.
Some instructions can specify that the CPU proceed to execute an
instruction at an address other than the one that follows itself.
This class of instructions have names like {\em jump} and {\em branch}
and are available in a variety of different styles.
The RV ISA uses the word {\em jump} to refer to an {\em unconditional}
The RISC-V ISA uses the word {\em jump} to refer to an {\em unconditional}
change in the sequential processing of instructions and the word
{\em branch} to refer to a {\em conditional} change.
@ -285,143 +298,5 @@ one of two different actions pending the resulting {\em condition} of
the comparison.\footnote{This is the fundamental method used by a CPU
to make decisions.}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{A Sample Program Source Listing}
A simple program that illustrates how this text presents
program source code is seen in \listingRef{zero4regs.S}.
This program will place a zero in each of the 4 registers
named x28, x29, x30 and x31.
\listing{zero4regs.S}{Setting four registers to zero.}
This program listing illustrates a number of things:
\begin{itemize}
\item Listings are identified by the name of the file within which
they are stored. This listing is from a file named: \verb@zero4regs.S@.
\item The assembly language programs discussed in this text will be saved
in files that end with: \verb@.S@ (Alternately you can use \verb@.sx@
on systems that don't understand the difference between upper and
lowercase letters.\footnote{The author of this text prefers to avoid
using such systems.})
\item A description of the listing's purpose appears under the name of the
file. The description of \listingRef{zero4regs.S} is
{\em Setting four registers to zero.}
\item The lines of the listing are numbered on the left margin for
easy reference.
\item An assembly program consists of lines of plain text.
\item The RISC-V ISA does not provide an operation that will simply
set a register to a numeric value. To accomplish our goal this
program will add zero to zero and place the sum in each of the
four registers.
\item The lines that start with a dot `.' (on lines 1, 2 and 3) are
called {\em assembler directives} as they tell the assembler itself
how we want it to translate the following {\em assembly language instructions}
into {\em machine language instructions.}
\item Line 4 shows a {\em label} named {\em \_start}. The colon
at the end is the indicator to the assembler that causes it to
recognize the preceding characters as a label.
\item Lines 5-8 are the four assembly language instructions that
make up the program. Each instruction in this program
consists of four {\em fields}. (Different instructions can have
a different number of fields.) The fields on line 5 are:
\begin{itemize}
\item [addi] The instruction mnemonic. It indicates the operation
that the CPU will perform.
\item [x28] The {\em destination} register that will receive the
sum when the {\em addi} instruction is finished. The names of
the 32 registers are expressed as x0 -- x31.
\item [x0] One of the addends of the sum operation. (The x0 register
will always contain the value zero. It can never be changed.)
\item [0] The second addend is the number zero.
\item [\# set \ldots] Any text anywhere in a RISC-V assembly language
program that starts with the pound-sign is ignored by the assembler.
They are used to place a {\em comment} in the program to help
the reader better understand the motive of the programmer.
\end{itemize}
\end{itemize}
\subsection{Running a Program With rvddt}
\index{rvddt}
To illustrate what a CPU does when it executes instructions this text
will use a simulator that shows the sequence of events and the binary values
involved. \listingRef{zero4regs.out} shows the operation of the four
{\em addi} instructions from \listingRef{zero4regs.S} when executed using the
\gls{rvddt} simulator.\footnote{The {\em rvddt} application was written to
generate the listings for this text. It is similar to the fancier
{\em spike} simulator. Given the simplicity of the RV32I ISA, rvddt
is less than 1700 lines of C++ and was written in one (long) afternoon.}
\listing{zero4regs.out}{Running a program with the rvddt simulator}
\begin{itemize}
\item [$\ell$ 1] This listing includes the command-line that shows how the simulator
was executed to load a file containing the machine instructions (aka
machine code) from the assembler.
\item [$\ell$ 2] A message from the simulator indicating that it loaded the machine
code into simulated memory at address 0.
\item [$\ell$ 3] This line shows the prompt from the debugger and the command
\verb@t4@ that the user entered to request that the simulator trace
the execution of four instructions.
\item [$\ell$ 4-8] Prior to executing the first instruction, the state of the
CPU registers is displayed.
\item [$\ell$ 4] The values in registers 0, 1, 2, 3, 4, 5, 6 and 7 are printed
from left to right in \gls{bigendian}, \gls{hexadecimal} form.
The dash `\verb@-@' character in the middle of the line is a reference
to make it easier to visually navigate across the line without being
forced to count the values from the far left when seeking the value
of, say, x5.
\item [$\ell$ 5-7] The values of registers 8--31 are printed.
\item [$\ell$ 8] The {\em program counter} (\reg{pc}) register is printed.
It contains the address of the instruction that the CPU will execute.
After each instruction, the \reg{pc} will either advance four bytes
ahead or be set to another value by a branch instruction as discussed above.
\item [$\ell$ 9] A four-byte instruction is fetched from memory at the address
in the \reg{pc} register, is decoded and printed. From left to right
the fields shown on this line are:
\begin{itemize}
\item [00000000] The memory address from which the instruction was
fetched. This address is displayed in \gls{bigendian},
\gls{hexadecimal} form.
\item [00000e13] The machine code of the instruction displayed in
\gls{bigendian}, \gls{hexadecimal} form.
\item [addi] The mnemonic for the machine instruction.
\item [x28] The \reg{rd} field of the addi instruction.
\item [x0] The \reg{rs1} field of the addi instruction that
holds one of the two addends of the operation.
\item [0] The \reg{imm} field of the addi instruction that
holds the second of the two addends of the operation.
\item [\# \ldots] A simulator-generated comment that explains
what the instruction is doing. For this instruction it indicates
that \reg{x28} will have the value zero stored into it as a result
of performing the addition: $0+0$.
\end{itemize}
\item [$\ell$ 10-14] These lines are printed as the prelude while tracing the
second instruction. Lines 7 and 13 show that \reg{x28} has changed
from \verb@f0f0f0f0@ to \verb@00000000@ as a result of executing the
first instruction and lines 8 and 14 show that the \reg{pc} has
advanced from zero (the location of the first instruction) to
four, where the second instruction will be fetched. None of the
rest of the registers have changed values.
\item [$\ell$ 15] The second instruction is decoded, executed and described.
This time register \reg{x29} will be assigned a value.
\item [$\ell$ 16-27] The third and fourth instructions are traced.
\item [$\ell$ 28] Tracing has completed. The simulator prints its prompt
and the user enters the `r' command to see the register state
after the fourth instruction has completed executing.
\item [$\ell$ 29-33] Following the fourth instruction it can be observed
that registers \reg{x28}, \reg{x29}, \reg{x30} and \reg{x31}
have been set to zero and that the \reg{pc} has advanced from
zero to four, then eight, then 12 (the hex value for 12 is c)
and then to 16 (which, in hex, is 10).
\item [$\ell$ 34] The simulator exit command `x' is entered by the user and
the terminal displays the shell prompt.
\end{itemize}
Once the instruction execution phase has completed, the next instruction
cycle will be performed using the new \reg{pc} register address.
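The fetch and execute phases can be sketched in C as a single function that
performs one instruction cycle. This is a greatly simplified illustration
under assumed names (\verb@struct hart@, \verb@fetch@ and \verb@step@) and is
not how any particular simulator is implemented. It understands only the
{\em addi} instruction, assumes that the caller has set \verb@x[0]@ to zero
and does not check alignment or memory bounds.
\begin{verbatim}
#include <stdint.h>

/* Hypothetical, minimal hart: 32 registers and a program counter. */
struct hart {
    uint32_t x[32];     /* x[0] must be initialized to zero */
    uint32_t pc;
};

/* Fetch four adjacent bytes stored least significant byte first. */
uint32_t fetch(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]
         | (uint32_t)mem[addr + 1] << 8
         | (uint32_t)mem[addr + 2] << 16
         | (uint32_t)mem[addr + 3] << 24;
}

/* Perform one instruction cycle.
 * Returns 0 on success or -1 if the instruction is not an addi. */
int step(struct hart *h, const uint8_t *mem)
{
    uint32_t insn = fetch(mem, h->pc);      /* instruction fetch */
    uint32_t opcode = insn & 0x7f;
    uint32_t rd = (insn >> 7) & 0x1f;
    uint32_t funct3 = (insn >> 12) & 0x7;
    uint32_t rs1 = (insn >> 15) & 0x1f;
    uint32_t imm = insn >> 20;

    if (imm & 0x800)                        /* sign-extend 12 bits */
        imm |= 0xfffff000;

    if (opcode != 0x13 || funct3 != 0)
        return -1;

    if (rd != 0)                            /* x0 can never be changed */
        h->x[rd] = h->x[rs1] + imm;         /* instruction execute */
    h->pc += 4;                             /* advance to next instruction */
    return 0;
}
\end{verbatim}
Calling \verb@step@ four times on a memory image holding the machine codes
\verb@00000e13@, \verb@00000e93@, \verb@00000f13@ and \verb@00000f93@ sets
\reg{x28} through \reg{x31} to zero and leaves the \reg{pc} at 16.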

View File

@ -1,57 +1,11 @@
\chapter{The RISC-V GNU Toolchain}
\chapter{Using The RISC-V GNU Toolchain}
This chapter discusses the GNU toolchain elements used to
This chapter discusses using the GNU toolchain elements to
experiment with the material in this book.
The\enote{It would be good to find some Mac and Windows users to write
and test proper variations on this section to address those systems.
Pull requests, welcome!}
instructions and examples here were all implemented on Ubuntu 16.04 LTS.
See \autoref{chapter:install} if you do not already have the
GNU crosscompiler toolchain available on your system.
Install custom code in a location that will not cause interference with
other applications and allow for easy cleanup. These instructions
install the toolchain in \verb@/usr/local/riscv@. At any time
you can remove the lot and start over by executing the following
command:
\begin{verbatim}
rm -rf /usr/local/riscv/*
\end{verbatim}
Tested on Ubuntu 16.04 LTS.
18.04 was just released\ldots\ update accordingly.
These are the only commands that you should perform as root when installing
the toolchain:
\begin{verbatim}
sudo apt-get install autoconf automake autotools-dev curl libmpc-dev \
libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf \
libtool patchutils bc zlib1g-dev libexpat-dev
sudo mkdir -p /usr/local/riscv/
sudo chmod 777 /usr/local/riscv/
\end{verbatim}
All other commands should be executed as a regular user. This will eliminate the
possibility of clobbering system files that should not be touched when tinkering with
the toolchain applications.
To download, compile and ``install'' the toolchain:
\begin{verbatim}
# riscv toolchain:
#
# https://riscv.org/software-tools/risc-v-gnu-compiler-toolchain/
git clone --recursive https://github.com/riscv/riscv-gnu-toolchain
cd riscv-gnu-toolchain
./configure --prefix=/usr/local/riscv/rv32i --with-arch=rv32i --with-abi=ilp32
make
make install
\end{verbatim}
Need to discuss augmenting the PATH environment variable.
Discuss the choice of ilp32 as well as what the other variations would do.