mirror of
https://github.com/johnwinans/rvalp.git
synced 2025-09-27 21:22:44 -04:00
Refactor, reorganize, rephrase,...
This commit is contained in:
parent
3276b58be4
commit
4ad6e160c8
@ -1,7 +1,7 @@
|
||||
TOP=..
|
||||
include $(TOP)/Make.rules
|
||||
|
||||
TEXPATH=./numbers:./intro:./rv32:./copyright:./license
|
||||
TEXPATH=./float:./intro:./rv32:./copyright:./license:./elements
|
||||
|
||||
SUBDIRS=
|
||||
|
||||
|
@ -1,11 +1,137 @@
|
||||
\chapter{Number Systems}
|
||||
\label{chapter:NumberSystems}
|
||||
\chapter{Numbers and Storage Systems}
|
||||
\label{chapter:numbers}
|
||||
|
||||
RISC-V systems represent information using binary values stored in
|
||||
little-endian order.\footnote{See\cite{IEN137} for some history of
|
||||
the big/little-endian ``controversy.''}
|
||||
This chapter discusses how data are represented and stored in a computer.
|
||||
|
||||
\section{Integers}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%\section{Context}
|
||||
%
|
||||
%Numbers can be interpreted differently depending on the context in
|
||||
%which they are used. For example a number may represent the quantity
|
||||
%of millimeters between two points. It may enumerate a
|
||||
%a letter of the alphabet -- ie. $01000001=A$, $01000010=B$,
|
||||
%$01000011=C$\ldots\ In fact, any finite set of items can be identified
|
||||
%(enumerated) by assigning a code number to each element in this fashion.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Logical/Boolean Functions}
|
||||
|
||||
\enote{This is unclear. Need to define bit positions and probably
|
||||
should add basic truth table diagrams.}%
|
||||
Unlike addition and subtraction, boolean functions apply
|
||||
on a per-bit basis.
|
||||
%in that they do not impact neighboring bits.
|
||||
%by generating things like a carry or a borrow.
|
||||
When applied to multi-bit values, each bit position is operated upon
|
||||
independently of the other bits.
|
||||
\enote{Need to define 1 as true and 0 as false somewhere.}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{NOT}
|
||||
|
||||
The {\em NOT} operator applies to a single operand and represents the
|
||||
opposite of the input.
|
||||
\enote{Need to define unary, binary and ternary operators without
|
||||
confusing binary operators with binary numbers.}
|
||||
|
||||
If the input is 1 then the output is 0. If the input is 0 then the
|
||||
output is 1. In other words, the output value is {\em not} that of the
|
||||
input value.
|
||||
|
||||
This text will use the operator used in the C language when discussing
|
||||
the {\em NOT} operator in symbolic form. Specifically the tilde: `\verb@~@'.
|
||||
|
||||
\begin{verbatim}
|
||||
~ 1 1 1 1 0 1 0 1 <== A
|
||||
-----------------
|
||||
0 0 0 0 1 0 1 0 <== output
|
||||
\end{verbatim}
|
||||
|
||||
In a line of code the above might read like this: \verb@output = ~A@
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{AND}
|
||||
|
||||
The boolean {\em and} function has two or more inputs and the output is a
|
||||
single bit. The output is 1 if and only if all of the input values are 1.
|
||||
Otherwise it is 0.
|
||||
|
||||
This text will use the operator used in the C language when discussing
|
||||
the {\em AND} operator in symbolic form. Specifically the ampersand: `\verb@&@'.
|
||||
|
||||
This function works like it does in spoken language. For example
|
||||
if A is 1 {\em AND} B is 1 then the output is 1 (true).
|
||||
Otherwise the output is 0 (false). For example:
|
||||
|
||||
\begin{verbatim}
|
||||
1 1 1 1 0 1 0 1 <== A
|
||||
& 1 0 0 1 0 0 1 1 <== B
|
||||
-----------------
|
||||
1 0 0 1 0 0 0 1 <== output
|
||||
\end{verbatim}
|
||||
|
||||
In a line of code the above might read like this: \verb@output = A & B@
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{OR}
|
||||
|
||||
The boolean {\em or} function has two or more inputs and the output is a
|
||||
single bit. The output is 1 if at least one of the input values is 1.
|
||||
|
||||
This text will use the operator used in the C language when discussing
|
||||
the {\em OR} operator in symbolic form. Specifically the pipe: `\verb@|@'.
|
||||
|
||||
This function works like it does in spoken language. For example
|
||||
if A is 1 {\em OR} B is 1 then the output is 1 (true).
|
||||
Otherwise the output is 0 (false). For example:
|
||||
|
||||
\begin{verbatim}
|
||||
1 1 1 1 0 1 0 1 <== A
|
||||
| 1 0 0 1 0 0 1 1 <== B
|
||||
-----------------
|
||||
1 1 1 1 0 1 1 1 <== output
|
||||
\end{verbatim}
|
||||
|
||||
In a line of code the above might read like this: \verb@output = A | B@
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{XOR}
|
||||
|
||||
The boolean {\em exclusive or} function has two or more inputs and the
|
||||
output is a single bit. The output is 1 if and only if an odd number of inputs
|
||||
are 1. Otherwise the output will be 0.
|
||||
|
||||
This text will use the operator used in the C language when discussing
|
||||
the {\em XOR} operator in symbolic form. Specifically the caret: `\verb@^@'.
|
||||
|
||||
Note that when {\em XOR} is used with two inputs, the output
|
||||
is set to 1 (true) when the inputs have different values and 0
|
||||
(false) when the inputs both have the same value.
|
||||
|
||||
For example:
|
||||
|
||||
\begin{verbatim}
|
||||
1 1 1 1 0 1 0 1 <== A
|
||||
^ 1 0 0 1 0 0 1 1 <== B
|
||||
-----------------
|
||||
0 1 1 0 0 1 1 0 <== output
|
||||
\end{verbatim}
|
||||
|
||||
In a line of code the above might read like this: \verb@output = A ^ B@
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Integers and Counting}
|
||||
|
||||
A binary integer is constructed with only 1s and 0s in the same
|
||||
manner as decimal numbers are constructed with values from 0 to 9.
|
||||
@ -408,343 +534,85 @@ Disscuss the details of truncation and overflow here.
|
||||
{\em truncation} and {\em overflow} as occur with signed and unsigned
|
||||
addition and subtraction.}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Logical/Boolean Functions}
|
||||
|
||||
Unlike addition and subtraction, boolean functions apply
|
||||
on a per-bit basis.
|
||||
%in that they do not impact neighboring bits.
|
||||
%by generating things like a carry or a borrow.
|
||||
When applied to multi-bit values, each bit position is operated upon
|
||||
independently of the other bits.
|
||||
\enote{This is unclear. Need to define bit positions and probably
|
||||
should add basic truth table diagrams.}
|
||||
\enote{Need to define 1 as true and 0 as false somewhere.}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsubsection{NOT}
|
||||
|
||||
The {\em NOT} operator applies to a single operand and represents the
|
||||
opposite of the input.
|
||||
\enote{Need to define unary, binary and ternary operators without
|
||||
confusing binary operators with binary numbers.}
|
||||
|
||||
If the input is 1 then the output is 0. If the input is 0 then the
|
||||
output is 1. In other words, the output value is {\em not} that of the
|
||||
input value.
|
||||
|
||||
This text will use the operator used in the C language when discussing
|
||||
the {\em NOT} operator in symbolic form. Specifically the tilde: `\verb@~@'.
|
||||
|
||||
\begin{verbatim}
|
||||
~ 1 1 1 1 0 1 0 1 <== A
|
||||
-----------------
|
||||
0 0 0 0 1 0 1 0 <== output
|
||||
\end{verbatim}
|
||||
|
||||
In a line of code the above might read like this: \verb@output = ~A@
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsubsection{AND}
|
||||
|
||||
The boolean {\em and} function has two or more inputs and the output is a
|
||||
single bit. The output is 1 if and only if all of the input values are 1.
|
||||
Otherwise it is 0.
|
||||
|
||||
This text will use the operator used in the C language when discussing
|
||||
the {\em AND} operator in symbolic form. Specifically the ampersand: `\verb@&@'.
|
||||
|
||||
This function works like it does in spoken language. For example
|
||||
if A is 1 {\em AND} B is 1 then the output is 1 (true).
|
||||
Otherwise the output is 0 (false). For example:
|
||||
|
||||
\begin{verbatim}
|
||||
1 1 1 1 0 1 0 1 <== A
|
||||
& 1 0 0 1 0 0 1 1 <== B
|
||||
-----------------
|
||||
1 0 0 1 0 0 0 1 <== output
|
||||
\end{verbatim}
|
||||
|
||||
In a line of code the above might read like this: \verb@output = A & B@
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsubsection{OR}
|
||||
|
||||
The boolean {\em or} function has two or more inputs and the output is a
|
||||
single bit. The output is 1 if at least one of the input values is 1.
|
||||
|
||||
This text will use the operator used in the C language when discussing
|
||||
the {\em OR} operator in symbolic form. Specifically the pipe: `\verb@|@'.
|
||||
|
||||
This function works like it does in spoken language. For example
|
||||
if A is 1 {\em OR} B is 1 then the output is 1 (true).
|
||||
Otherwise the output is 0 (false). For example:
|
||||
|
||||
\begin{verbatim}
|
||||
1 1 1 1 0 1 0 1 <== A
|
||||
| 1 0 0 1 0 0 1 1 <== B
|
||||
-----------------
|
||||
1 1 1 1 0 1 1 1 <== output
|
||||
\end{verbatim}
|
||||
|
||||
In a line of code the above might read like this: \verb@output = A | B@
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsubsection{XOR}
|
||||
|
||||
The boolean {\em exclusive or} function has two or more inputs and the
|
||||
output is a single bit. The output is 1 if and only if an odd number of inputs
|
||||
are 1. Otherwise the output will be 0.
|
||||
|
||||
This text will use the operator used in the C language when discussing
|
||||
the {\em XOR} operator in symbolic form. Specifically the caret: `\verb@^@'.
|
||||
|
||||
Note that when {\em XOR} is used with two inputs, the output
|
||||
is set to 1 (true) when the inputs have different values and 0
|
||||
(false) when the inputs both have the same value.
|
||||
|
||||
For example:
|
||||
|
||||
\begin{verbatim}
|
||||
1 1 1 1 0 1 0 1 <== A
|
||||
^ 1 0 0 1 0 0 1 1 <== B
|
||||
-----------------
|
||||
0 1 1 0 0 1 1 0 <== output
|
||||
\end{verbatim}
|
||||
|
||||
In a line of code the above might read like this: \verb@output = A ^ B@
|
||||
|
||||
|
||||
|
||||
%\section{Context}
|
||||
%
|
||||
%Numbers can be interpreted differently depending on the context in
|
||||
%which they are used. For example a number may represent the quantity
|
||||
%of millimeters between two points. It may enumerate a
|
||||
%a letter of the alphabet -- ie. $01000001=A$, $01000010=B$,
|
||||
%$01000011=C$\ldots\ In fact, any finite set of items can be identified
|
||||
%(enumerated) by assigning a code number to each element in this fashion.
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{IEEE-754 Floating Point Number Representation}
|
||||
\label{chapter::floatingpoint}
|
||||
\section{Main Memory Storage}
|
||||
|
||||
This section provides an overview of the IEEE-754 32-bit binary floating
|
||||
point format.
|
||||
\enote{Refactor this section and the memory discussion in RV32 reference chapter}%
|
||||
When transferring data between its registers and main memory a
|
||||
RISC-V system uses the little-endian byte order.\footnote{
|
||||
See~\cite{IEN137} for some history of the big/little-endian ``controversy.''}
|
||||
|
||||
\begin{itemize}
|
||||
\item Recall that the place values for integer binary numbers are:
|
||||
\begin{verbatim}
|
||||
... 128 64 32 16 8 4 2 1
|
||||
\end{verbatim}
|
||||
\item We can extend this to the right in binary similar to the way we do for
|
||||
decimal numbers:
|
||||
\begin{verbatim}
|
||||
... 128 64 32 16 8 4 2 1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128 ...
|
||||
\end{verbatim}
|
||||
The `.' in a binary number is a binary point, not a decimal point.
|
||||
\enote{Discuss byte ordering, addressing and character strings.}
|
||||
|
||||
\item We use scientific notation as in $2.7 \times 10^{-47}$ to express either
|
||||
small fractions or large numbers when we are not concerned with every last digit
|
||||
needed to represent the entire, exact, value of a number.
|
||||
\subsection{Memory Dump}
|
||||
|
||||
\item The format of a number in scientific notation is $mantissa \times base^{exponent}$
|
||||
Introduce the memory dump and how to read them here.
|
||||
|
||||
\item In binary we have $mantissa \times 2^{exponent}$
|
||||
Discuss the pitfalls of assuming what a set of bytes is used for based
|
||||
on their contents!
|
||||
|
||||
\item IEEE-754 format requires binary numbers to be {\em normalized} to
|
||||
$1.significand \times 2^{exponent}$ where the {\em significand}
|
||||
is the portion of the {\em mantissa} that is to the right of the binary-point.
|
||||
\subsection{Big Endian Representation}
|
||||
|
||||
\begin{itemize}
|
||||
\item The unnormalized binary value of $-2.625$ is $-10.101$
|
||||
\item The normalized value of $-2.625$ is $-1.0101 \times 2^1$
|
||||
\end{itemize}
|
||||
Using the memory dump contents in prior section, discuss how
|
||||
big endian values are stored.
|
||||
|
||||
\item We need not store the `1.' because {\em all} normalized floating
|
||||
point numbers will start that way. Thus we can save memory when storing
|
||||
normalized values by omitting the leading 1 and restoring it when the value is used.
|
||||
\subsection{Little Endian Representation}
|
||||
|
||||
{
|
||||
\small
|
||||
\setlength{\unitlength}{.15in}
|
||||
\begin{picture}(32,4)(0,0)
|
||||
\put(0,1){\line(1,0){32}} % bottom line
|
||||
\put(0,2){\line(1,0){32}} % top line
|
||||
Using the memory dump contents in prior section, discuss how
|
||||
little endian values are stored.
|
||||
|
||||
\put(0,1){\line(0,1){2}} % left vertical
|
||||
\put(0,2){\makebox(1,1){\tiny 31}} % left end bit number marker
|
||||
\subsection{Character Strings and Arrays}
|
||||
|
||||
\put(32,1){\line(0,1){2}} % vertical right end
|
||||
\put(31,2){\makebox(1,1){\tiny 0}} % right end bit number marker
|
||||
Define character strings and arrays.
|
||||
|
||||
\put(0,0){\makebox(1,1){\small sign}}
|
||||
\put(1,0){\makebox(8,1){\small exponent}}
|
||||
\put(9,0){\makebox(23,1){\small significand}}
|
||||
Using the prior memory dump, discuss how and where things are stored and
|
||||
retrieved.
|
||||
|
||||
\put(0,1){\makebox(1,1){1}} % sign
|
||||
\subsection{Alignment}
|
||||
|
||||
\put(1,1){\line(0,1){2}} % separator
|
||||
\put(1,2){\makebox(1,1){\tiny 30}} % bit marker
|
||||
Draw a diagram showing the overlapping data types when they are all aligned.
|
||||
|
||||
\put(1,1){\makebox(1,1){1}} % exponent
|
||||
\put(2,1){\makebox(1,1){0}}
|
||||
\put(3,1){\makebox(1,1){0}}
|
||||
\put(4,1){\makebox(1,1){0}}
|
||||
\put(5,1){\makebox(1,1){0}}
|
||||
\put(6,1){\makebox(1,1){0}}
|
||||
\put(7,1){\makebox(1,1){0}}
|
||||
\put(8,1){\makebox(1,1){0}}
|
||||
|
||||
\put(8,2){\makebox(1,1){\tiny 23}} % bit marker
|
||||
\put(9,1){\line(0,1){2}} % separator
|
||||
\put(9,2){\makebox(1,1){\tiny 22}} % bit marker
|
||||
\subsection{Instruction Alignment}
|
||||
|
||||
\put(9,1){\makebox(1,1){0}}
|
||||
\put(10,1){\makebox(1,1){1}}
|
||||
\put(11,1){\makebox(1,1){0}}
|
||||
\put(12,1){\makebox(1,1){1}}
|
||||
\put(13,1){\makebox(1,1){0}}
|
||||
\put(14,1){\makebox(1,1){0}}
|
||||
\put(15,1){\makebox(1,1){0}}
|
||||
\put(16,1){\makebox(1,1){0}}
|
||||
\put(17,1){\makebox(1,1){0}}
|
||||
\put(18,1){\makebox(1,1){0}}
|
||||
\put(19,1){\makebox(1,1){0}}
|
||||
\put(20,1){\makebox(1,1){0}}
|
||||
\put(21,1){\makebox(1,1){0}}
|
||||
\put(22,1){\makebox(1,1){0}}
|
||||
\put(23,1){\makebox(1,1){0}}
|
||||
\put(24,1){\makebox(1,1){0}}
|
||||
\put(25,1){\makebox(1,1){0}}
|
||||
\put(26,1){\makebox(1,1){0}}
|
||||
\put(27,1){\makebox(1,1){0}}
|
||||
\put(28,1){\makebox(1,1){0}}
|
||||
\put(29,1){\makebox(1,1){0}}
|
||||
\put(30,1){\makebox(1,1){0}}
|
||||
\put(31,1){\makebox(1,1){0}}
|
||||
\end{picture}
|
||||
}
|
||||
\enote{Rewrite this section for data rather than instructions and then
|
||||
note here that instructions must be naturally aligned. For RV32 that
|
||||
is on a 4-byte boundary}%
|
||||
Every possible instruction that an RV32I CPU can execute contains
|
||||
exactly 32 bits. Therefore each one must be stored in four bytes
|
||||
of the main memory.
|
||||
|
||||
%\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -(1 \frac{5}{16} \times 2^{1}) = -(1.3125 \times 2^{1}) = -2.625$
|
||||
\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -((1 + \frac{1}{4} + \frac{1}{16}) \times 2^1) = -(2 + \frac{1}{2} + \frac{1}{8}) = -(2 + .5 + .125) = -2.625$
|
||||
To simplify the hardware, each instruction must be placed into four
|
||||
adjacent bytes whose numeric address sequence begins with a multiple
|
||||
of four. For example, an instruction might be located in bytes
|
||||
4, 5, 6 and 7 (but not in 5, 6, 7 and 8 nor in 9, 3, 1, and 0\ldots).
|
||||
|
||||
\item IEEE754 formats:
|
||||
This sort of addressing requirement is common and is referred to as
|
||||
\gls{alignment}. An aligned instruction begins at a memory address
|
||||
that is a multiple of four. An {\em unaligned} instruction would
|
||||
be one beginning at any other address and is {\em illegal}.
|
||||
|
||||
\begin{tabular}{|l|l|l|}
|
||||
\hline
|
||||
& IEEE754 32-bit & IEEE754 64-bit \\
|
||||
\hline
|
||||
sign & 1 bit & 1 bit \\
|
||||
exponent & 8 bits (excess-127) & 11 bits (excess-1023) \\
|
||||
mantissa & 23 bits & 52 bits \\
|
||||
max exponent & 127 & 1023 \\
|
||||
min exponent & -126 & -1022 \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
An attempt to fetch an instruction from an unaligned address
|
||||
will result in an error referred to as an alignment {\em \gls{exception}}.
|
||||
This and other exceptions cause the CPU to stop executing the
|
||||
current instruction and start executing a different set of instructions
|
||||
that are prepared to handle the problem. Often an exception is
|
||||
handled by completely stopping the program in a way that is commonly
|
||||
referred to as a system or application {\em crash}.
|
||||
|
||||
\item When the exponent is all ones, the mantissa is all zeros, and
|
||||
the sign is zero, the number represents positive infinity.
|
||||
Given a properly aligned instruction address, the CPU can request
|
||||
that the main memory locate and deliver the values of the four bytes
|
||||
in the address sequence to the CPU using what is called a memory
|
||||
read operation. Some systems can deliver four (or more) bytes at the
|
||||
same time while others might only be capable of delivering one or
|
||||
two bytes at a time. These differences in hardware typically impact the
|
||||
cost and performance of a system.\footnote{The design and implementation
|
||||
choices that determine how any given system operates are part of what is
|
||||
called a system's {\em organization} and are beyond the scope of this text.
|
||||
See~\cite{codriscv:2017} for more information on computer organization.}
|
||||
|
||||
\item When the exponent is all ones, the mantissa is all zeros, and
|
||||
the sign is one, the number represents negative infinity.
|
||||
|
||||
\item Note that the binary representation of an IEEE754 number in memory
|
||||
can be compared for magnitude with another one using the same logic as for
|
||||
comparing two's complement signed integers because the magnitude of an
|
||||
IEEE number grows upward and downward in the same fashion as signed integers.
|
||||
This is why we use excess notation and locate the significand's sign bit on
|
||||
the left of the exponent.
|
||||
|
||||
\item Note that zero is a special case number. Recall that a normalized
|
||||
number has an implied 1-bit to the left of the significand\ldots\ which
|
||||
means that there is no way to represent zero!
|
||||
Zero is represented by an exponent of all-zeros and a significand of
|
||||
all-zeros. This definition allows for a positive and a negative zero
|
||||
if we observe that the sign can be either 1 or 0.
|
||||
|
||||
\item On the number-line, numbers between zero and the smallest fraction in
|
||||
either direction are in the {\em \gls{underflow}} areas.
|
||||
\enote{Need to add the standard lecture numberline diagram showing
|
||||
where the over/under-flow areas are and why.}
|
||||
|
||||
\item On the number line, numbers greater than the mantissa of all-ones and the
|
||||
largest exponent allowed are in the {\em \gls{overflow}} areas.
|
||||
|
||||
\item Note that numbers have a higher resolution on the number line when the
|
||||
exponent is smaller.
|
||||
\end{itemize}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Floating Point Number Accuracy}
|
||||
Due to the finite number of bits used to store the value of a floating point
|
||||
number, it is not possible to represent every one of the infinite values
|
||||
on the real number line. The following C programs illustrate this point.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsubsection{Powers Of Two}
|
||||
Just like the integer numbers, the powers of two that have bits to represent
|
||||
them can be represented perfectly\ldots\ as can their sums (provided that the
|
||||
significand requires no more than 23 bits.)
|
||||
|
||||
\listing{powersoftwo.c}{Precise Powers of Two}
|
||||
\listing{powersoftwo.out}{Output from {\tt powersoftwo.c}}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsubsection{Clean Decimal Numbers}
|
||||
When dealing with decimal values, you will find that they don't map simply
|
||||
into binary floating point values.
|
||||
% (the same holds true for binary integer numbers).
|
||||
|
||||
Note how the decimal numbers are not accurately represented as they get larger.
|
||||
The decimal number on line 10 of \listingRef{cleandecimal.out}
|
||||
can be perfectly represented in IEEE format. However, a problem arises in
|
||||
the 11th loop iteration. It is due to the fact that the
|
||||
binary number can not be represented accurately in IEEE format. Its least
|
||||
significant bits were truncated in a best-effort attempt at rounding the value
|
||||
off in order to fit the value into the bits provided. This is an example of
|
||||
{\em low order truncation}. Once this happens, the value of \verb@x.f@ is
|
||||
no longer as precise as it could be given more bits in which to save its value.
|
||||
|
||||
\listing{cleandecimal.c}{Print Clean Decimal Numbers}
|
||||
\listing{cleandecimal.out}{Output from {\tt cleandecimal.c}}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsubsection{Accumulation of Error}
|
||||
These rounding errors can be exaggerated when the number we multiply
|
||||
the \verb@x.f@ value by is, itself, something that can not be accurately
|
||||
represented in IEEE
|
||||
form.\footnote{Applications requiring accurate decimal values, such as
|
||||
financial accounting systems, can use a packed-decimal numeric format
|
||||
to avoid unexpected oddities caused by the use of binary numbers.}
|
||||
\enote{In a lecture one would show that one tenth is a repeating
|
||||
non-terminating binary number that gets truncated. This discussion
|
||||
should be reproduced here in text form.}
|
||||
|
||||
For example, if we multiply our \verb@x.f@ value by $\frac{1}{10}$ each time,
|
||||
we can never be accurate and we start accumulating errors immediately.
|
||||
|
||||
\listing{erroraccumulation.c}{Accumulation of Error}
|
||||
\listing{erroraccumulation.out}{Output from {\tt erroraccumulation.c}}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Reducing Error Accumulation}
|
||||
In order to use floating point numbers in a program without causing
|
||||
excessive rounding problems an algorithm can be redesigned such that the
|
||||
accumulation is eliminated.
|
||||
This example is similar to the previous one, but this time we recalculate the
|
||||
desired value from a known-accurate integer value.
|
||||
Some rounding errors remain present, but they can not accumulate.
|
||||
|
||||
\listing{errorcompensation.c}{Reducing Error Accumulation}
|
||||
\listing{errorcompensation.out}{Output from {\tt errorcompensation.c}}
|
@ -59,7 +59,8 @@
|
||||
%\part{Introduction}
|
||||
|
||||
\include{intro/chapter}
|
||||
\include{numbers/chapter}
|
||||
\include{binary/chapter}
|
||||
\include{elements/chapter}
|
||||
\include{toolchain/chapter}
|
||||
\include{rv32/chapter}
|
||||
|
||||
@ -67,6 +68,8 @@
|
||||
% These 'chapters' are lettered rather than numbered
|
||||
|
||||
\appendix
|
||||
\include{install/chapter}
|
||||
\include{float/chapter}
|
||||
\include{license/chapter}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
|
149
book/elements/chapter.tex
Normal file
@ -0,0 +1,149 @@
|
||||
\chapter{The Elements of an Assembly Language Program}
|
||||
\label{chapter:elements}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{A Sample Program Source Listing}
|
||||
|
||||
A simple program that illustrates how this text presents
|
||||
program source code is seen in \listingRef{zero4regs.S}.
|
||||
This program will place a zero in each of the 4 registers
|
||||
named x28, x29, x30 and x31.
|
||||
|
||||
\listing{zero4regs.S}{Setting four registers to zero.}
|
||||
|
||||
This program listing illustrates a number of things:
|
||||
\begin{itemize}
|
||||
\item Listings are identified by the name of the file within which
|
||||
they are stored. This listing is from a file named: \verb@zero4regs.S@.
|
||||
\item The assembly language programs discussed in this text will be saved
|
||||
in files that end with: \verb@.S@ (Alternately you can use \verb@.sx@
|
||||
on systems that don't understand the difference between upper and
|
||||
lowercase letters.\footnote{The author of this text prefers to avoid
|
||||
using such systems.})
|
||||
\item A description of the listing's purpose appears under the name of the
|
||||
file. The description of \listingRef{zero4regs.S} is
|
||||
{\em Setting four registers to zero.}
|
||||
\item The lines of the listing are numbered on the left margin for
|
||||
easy reference.
|
||||
\item An assembly program consists of lines of plain text.
|
||||
\item The RISC-V ISA does not provide an operation that will simply
|
||||
set a register to a numeric value. To accomplish our goal this
|
||||
program will add zero to zero and place the sum in each of the
|
||||
four registers.
|
||||
\item The lines that start with a dot `.' (on lines 1, 2 and 3) are
|
||||
called {\em assembler directives} as they tell the assembler itself
|
||||
how we want it to translate the following {\em assembly language instructions}
|
||||
into {\em machine language instructions.}
|
||||
\item Line 4 shows a {\em label} named {\em \_start}. The colon
|
||||
at the end is the indicator to the assembler that causes it to
|
||||
recognize the preceding characters as a label.
|
||||
\item Lines 5-8 are the four assembly language instructions that
|
||||
make up the program. Each instruction in this program
|
||||
consists of four {\em fields}. (Different instructions can have
|
||||
a different number of fields.) The fields on line 5 are:
|
||||
|
||||
\begin{itemize}
|
||||
\item [addi] The instruction mnemonic. It indicates the operation
|
||||
that the CPU will perform.
|
||||
\item [x28] The {\em destination} register that will receive the
|
||||
sum when the {\em addi} instruction is finished. The names of
|
||||
the 32 registers are expressed as x0 -- x31.
|
||||
\item [x0] One of the addends of the sum operation. (The x0 register
|
||||
will always contain the value zero. It can never be changed.)
|
||||
\item [0] The second addend is the number zero.
|
||||
\item [\# set \ldots] Any text anywhere in a RISC-V assembly language
|
||||
program that starts with the pound-sign is ignored by the assembler.
|
||||
Such text is used to place a {\em comment} in the program to help
|
||||
the reader better understand the motive of the programmer.
|
||||
\end{itemize}
|
||||
\end{itemize}
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Running a Program With rvddt}
|
||||
\index{rvddt}
|
||||
|
||||
To illustrate what a CPU does when it executes instructions this text
|
||||
will use the \gls{rvddt} simulator to display the sequence of events
|
||||
and the binary values involved. This simulator supports the RV32I ISA
|
||||
and has a configurable amount of memory.%
|
||||
\footnote{The {\em rvddt} simulator was written to generate the listings for
|
||||
this text. It is similar to the fancier {\em spike} simulator.
|
||||
Given the simplicity of the RV32I ISA, rvddt is less than 1700 lines of C++
|
||||
and was written in one (long) afternoon.}
|
||||
|
||||
\listingRef{zero4regs.out} shows the operation of the four
|
||||
{\em addi} instructions from \listingRef{zero4regs.S} when it is executed
|
||||
in trace-mode.
|
||||
|
||||
\listing{zero4regs.out}{Running a program with the rvddt simulator}
|
||||
|
||||
\begin{itemize}
|
||||
\item [$\ell$ 1] This listing includes the command-line that shows how the simulator
|
||||
was executed to load a file containing the machine instructions (aka
|
||||
machine code) from the assembler.
|
||||
\item [$\ell$ 2] A message from the simulator indicating that it loaded the machine
|
||||
code into simulated memory at address 0.
|
||||
\item [$\ell$ 3] This line shows the prompt from the debugger and the command
|
||||
\verb@t4@ that the user entered to request that the simulator trace
|
||||
the execution of four instructions.
|
||||
\item [$\ell$ 4-8] Prior to executing the first instruction, the state of the
|
||||
CPU registers is displayed.
|
||||
\item [$\ell$ 4] The values in registers 0, 1, 2, 3, 4, 5, 6 and 7 are printed
|
||||
from left to right in \gls{bigendian}, \gls{hexadecimal} form.
|
||||
The dash `\verb@-@' character in the middle of the line is a reference
|
||||
to make it easier to visually navigate across the line without being
|
||||
forced to count the values from the far left when seeking the value
|
||||
of, say, x5.
|
||||
\item [$\ell$ 5-7] The values of registers 8--31 are printed.
|
||||
\item [$\ell$ 8] The {\em program counter} (\reg{pc}) register is printed.
|
||||
It contains the address of the instruction that the CPU will execute.
|
||||
After each instruction, the \reg{pc} will either advance four bytes
|
||||
ahead or be set to another value by a branch instruction as discussed above.
|
||||
\item [$\ell$ 9] A four-byte instruction is fetched from memory at the address
|
||||
in the \reg{pc} register, is decoded and printed. From left to right
|
||||
the fields shown on this line are:
|
||||
|
||||
\begin{itemize}
|
||||
|
||||
\item [00000000] The memory address from which the instruction was
|
||||
fetched. This address is displayed in \gls{bigendian},
|
||||
\gls{hexadecimal} form.
|
||||
\item [00000e13] The machine code of the instruction displayed in
|
||||
\gls{bigendian}, \gls{hexadecimal} form.
|
||||
\item [addi] The mnemonic for the machine instruction.
|
||||
\item [x28] The \reg{rd} field of the addi instruction.
|
||||
\item [x0] The \reg{rs1} field of the addi instruction that
|
||||
holds one of the two addends of the operation.
|
||||
\item [0] The \reg{imm} field of the addi instruction that
|
||||
holds the second of the two addends of the operation.
|
||||
\item [\# \ldots] A simulator-generated comment that explains
|
||||
what the instruction is doing. For this instruction it indicates
|
||||
that \reg{x28} will have the value zero stored into it as a result
|
||||
of performing the addition: $0+0$.
|
||||
\end{itemize}
|
||||
|
||||
\item [$\ell$ 10-14] These lines are printed as the prelude while tracing the
|
||||
second instruction. Lines 7 and 13 show that \reg{x28} has changed
|
||||
from \verb@f0f0f0f0@ to \verb@00000000@ as a result of executing the
|
||||
first instruction and lines 8 and 14 show that the \reg{pc} has
|
||||
advanced from zero (the location of the first instruction) to
|
||||
four, where the second instruction will be fetched. None of the
|
||||
rest of the registers have changed values.
|
||||
\item [$\ell$ 15] The second instruction is decoded, executed and described.
|
||||
This time register \reg{x29} will be assigned a value.
|
||||
\item [$\ell$ 16-27] The third and fourth instructions are traced.
|
||||
\item [$\ell$ 28] Tracing has completed. The simulator prints its prompt
|
||||
and the user enters the `r' command to see the register state
|
||||
after the fourth instruction has completed executing.
|
||||
\item [$\ell$ 29-33] Following the fourth instruction it can be observed
|
||||
that registers \reg{x28}, \reg{x29}, \reg{x30} and \reg{x31}
|
||||
have been set to zero and that the \reg{pc} has advanced from
|
||||
zero to four, then eight, then 12 (the hex value for 12 is c)
|
||||
and then to 16 (which, in hex, is 10).
|
||||
\item [$\ell$ 34] The simulator exit command `x' is entered by the user and
|
||||
the terminal displays the shell prompt.
|
||||
|
||||
\end{itemize}
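
The fields of the machine code value \verb@00000e13@ described above can be
picked apart with a few shifts and masks. The following C fragment is a
minimal sketch and is not taken from rvddt; the names \verb@itype@ and
\verb@decode_itype@ are invented for this illustration. The bit positions
are those of the RV32I I-type instruction format that {\em addi} uses.

```c
#include <stdint.h>

/* Fields of an RV32I I-type instruction (the format used by addi).
   The struct and function names are hypothetical, chosen for this sketch. */
struct itype {
    uint32_t opcode;  /* bits 6..0   */
    uint32_t rd;      /* bits 11..7  */
    uint32_t funct3;  /* bits 14..12 */
    uint32_t rs1;     /* bits 19..15 */
    int32_t  imm;     /* bits 31..20, sign-extended */
};

/* Split a 32-bit instruction word into its I-type fields. */
struct itype decode_itype(uint32_t insn)
{
    struct itype f;
    uint32_t imm12 = insn >> 20;   /* the 12-bit immediate, not yet signed */

    f.opcode = insn & 0x7f;
    f.rd     = (insn >> 7) & 0x1f;
    f.funct3 = (insn >> 12) & 0x7;
    f.rs1    = (insn >> 15) & 0x1f;
    /* Sign-extend without relying on a signed right shift. */
    f.imm = (imm12 & 0x800) ? (int32_t)imm12 - 4096 : (int32_t)imm12;
    return f;
}
```

Passing \verb@0x00000e13@ to this function gives an opcode of \verb@0x13@,
an \reg{rd} of 28, an \reg{rs1} of 0 and an \reg{imm} of 0, which agrees
with the {\em addi x28, x0, 0} shown on line 9 of the trace.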
|
218
book/float/chapter.tex
Normal file
@ -0,0 +1,218 @@
|
||||
\chapter{Floating Point Numbers}
|
||||
\label{chapter:NumberSystems}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{IEEE-754 Floating Point Number Representation}
|
||||
\label{chapter::floatingpoint}
|
||||
|
||||
This section provides an overview of the IEEE-754 32-bit binary floating
|
||||
point format.
|
||||
|
||||
\begin{itemize}
|
||||
\item Recall that the place values for integer binary numbers are:
|
||||
\begin{verbatim}
|
||||
... 128 64 32 16 8 4 2 1
|
||||
\end{verbatim}
|
||||
\item We can extend this to the right in binary similar to the way we do for
|
||||
decimal numbers:
|
||||
\begin{verbatim}
|
||||
... 128 64 32 16 8 4 2 1 . 1/2 1/4 1/8 1/16 1/32 1/64 1/128 ...
|
||||
\end{verbatim}
|
||||
The `.' in a binary number is a binary point, not a decimal point.
|
||||
|
||||
\item We use scientific notation as in $2.7 \times 10^{-47}$ to express either
|
||||
small fractions or large numbers when we are not concerned with every last digit
|
||||
needed to represent the entire, exact, value of a number.
|
||||
|
||||
\item The format of a number in scientific notation is $mantissa \times base^{exponent}$
|
||||
|
||||
\item In binary we have $mantissa \times 2^{exponent}$
|
||||
|
||||
\item IEEE-754 format requires binary numbers to be {\em normalized} to
|
||||
$1.significand \times 2^{exponent}$ where the {\em significand}
|
||||
is the portion of the {\em mantissa} that is to the right of the binary-point.
|
||||
|
||||
\begin{itemize}
|
||||
\item The unnormalized binary value of $-2.625$ is $-10.101$
|
||||
\item The normalized value of $-2.625$ is $-1.0101 \times 2^1$
|
||||
\end{itemize}
|
||||
|
||||
\item We need not store the `1.' because {\em all} normalized floating
point numbers will start that way. Thus we can save memory by storing only
the significand and adding the implied 1 back when the value is used.
|
||||
|
||||
{
|
||||
\small
|
||||
\setlength{\unitlength}{.15in}
|
||||
\begin{picture}(32,4)(0,0)
|
||||
\put(0,1){\line(1,0){32}} % bottom line
|
||||
\put(0,2){\line(1,0){32}} % top line
|
||||
|
||||
\put(0,1){\line(0,1){2}} % left vertical
|
||||
\put(0,2){\makebox(1,1){\tiny 31}} % left end bit number marker
|
||||
|
||||
\put(32,1){\line(0,1){2}} % vertical right end
|
||||
\put(31,2){\makebox(1,1){\tiny 0}} % right end bit number marker
|
||||
|
||||
\put(0,0){\makebox(1,1){\small sign}}
|
||||
\put(1,0){\makebox(8,1){\small exponent}}
|
||||
\put(9,0){\makebox(23,1){\small significand}}
|
||||
|
||||
\put(0,1){\makebox(1,1){1}} % sign
|
||||
|
||||
\put(1,1){\line(0,1){2}} % separator
|
||||
\put(1,2){\makebox(1,1){\tiny 30}} % bit marker
|
||||
|
||||
\put(1,1){\makebox(1,1){1}} % exponent
|
||||
\put(2,1){\makebox(1,1){0}}
|
||||
\put(3,1){\makebox(1,1){0}}
|
||||
\put(4,1){\makebox(1,1){0}}
|
||||
\put(5,1){\makebox(1,1){0}}
|
||||
\put(6,1){\makebox(1,1){0}}
|
||||
\put(7,1){\makebox(1,1){0}}
|
||||
\put(8,1){\makebox(1,1){0}}
|
||||
|
||||
\put(8,2){\makebox(1,1){\tiny 23}} % bit marker
|
||||
\put(9,1){\line(0,1){2}} % separator
|
||||
\put(9,2){\makebox(1,1){\tiny 22}} % bit marker
|
||||
|
||||
\put(9,1){\makebox(1,1){0}}
|
||||
\put(10,1){\makebox(1,1){1}}
|
||||
\put(11,1){\makebox(1,1){0}}
|
||||
\put(12,1){\makebox(1,1){1}}
|
||||
\put(13,1){\makebox(1,1){0}}
|
||||
\put(14,1){\makebox(1,1){0}}
|
||||
\put(15,1){\makebox(1,1){0}}
|
||||
\put(16,1){\makebox(1,1){0}}
|
||||
\put(17,1){\makebox(1,1){0}}
|
||||
\put(18,1){\makebox(1,1){0}}
|
||||
\put(19,1){\makebox(1,1){0}}
|
||||
\put(20,1){\makebox(1,1){0}}
|
||||
\put(21,1){\makebox(1,1){0}}
|
||||
\put(22,1){\makebox(1,1){0}}
|
||||
\put(23,1){\makebox(1,1){0}}
|
||||
\put(24,1){\makebox(1,1){0}}
|
||||
\put(25,1){\makebox(1,1){0}}
|
||||
\put(26,1){\makebox(1,1){0}}
|
||||
\put(27,1){\makebox(1,1){0}}
|
||||
\put(28,1){\makebox(1,1){0}}
|
||||
\put(29,1){\makebox(1,1){0}}
|
||||
\put(30,1){\makebox(1,1){0}}
|
||||
\put(31,1){\makebox(1,1){0}}
|
||||
\end{picture}
|
||||
}
|
||||
|
||||
%\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -(1 \frac{5}{16} \times 2^{1}) = -(1.3125 \times 2^{1}) = -2.625$
|
||||
\item $-((1 + \frac{1}{4} + \frac{1}{16}) \times 2^{128-127}) = -((1 + \frac{1}{4} + \frac{1}{16}) \times 2^1) = -(2 + \frac{1}{2} + \frac{1}{8}) = -(2 + .5 + .125) = -2.625$
|
||||
|
||||
\item IEEE754 formats:
|
||||
|
||||
\begin{tabular}{|l|l|l|}
|
||||
\hline
|
||||
& IEEE754 32-bit & IEEE754 64-bit \\
|
||||
\hline
|
||||
sign & 1 bit & 1 bit \\
|
||||
exponent & 8 bits (excess-127) & 11 bits (excess-1023) \\
|
||||
mantissa & 23 bits & 52 bits \\
|
||||
max exponent & 127 & 1023 \\
|
||||
min exponent & -126 & -1022 \\
|
||||
\hline
|
||||
\end{tabular}
|
||||
|
||||
\item When the exponent is all ones, the mantissa is all zeros, and
|
||||
the sign is zero, the number represents positive infinity.
|
||||
|
||||
\item When the exponent is all ones, the mantissa is all zeros, and
|
||||
the sign is one, the number represents negative infinity.
|
||||
|
||||
\item Note that the binary representation of an IEEE754 number in memory
can be compared for magnitude with another one using integer comparison
logic because, ignoring the sign bit, a larger exponent or significand
always produces a larger integer bit pattern.
Unlike two's complement integers, IEEE754 numbers are stored in
sign-magnitude form, so when both values are negative the result of the
integer comparison is reversed.
This is why we use excess notation and locate the sign bit to
the left of the exponent.
|
||||
|
||||
\item Note that zero is a special case number. Recall that a normalized
|
||||
number has an implied 1-bit to the left of the significand\ldots\ which
|
||||
means that there is no way to represent zero!
|
||||
Zero is represented by an exponent of all-zeros and a significand of
|
||||
all-zeros. This definition allows for a positive and a negative zero
|
||||
if we observe that the sign can be either 1 or 0.
|
||||
|
||||
\item On the number-line, numbers between zero and the smallest fraction in
|
||||
either direction are in the {\em \gls{underflow}} areas.
|
||||
\enote{Need to add the standard lecture numberline diagram showing
|
||||
where the over/under-flow areas are and why.}
|
||||
|
||||
\item On the number line, numbers whose magnitude is greater than the value
with a mantissa of all-ones and the
largest exponent allowed are in the {\em \gls{overflow}} areas.
|
||||
|
||||
\item Note that numbers have a higher resolution on the number line when the
|
||||
exponent is smaller.
|
||||
\end{itemize}
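
The field layout described above can be checked on a real machine. The
following C fragment is a minimal sketch that assumes the C \verb@float@
type is the IEEE-754 32-bit format (true on most current systems, but not
guaranteed by the C language). The names \verb@f32_fields@,
\verb@f32_unpack@ and \verb@f32_true_exponent@ are invented for this
illustration.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of an IEEE-754 32-bit value.
   The struct and function names are hypothetical, chosen for this sketch. */
struct f32_fields {
    uint32_t sign;      /* bit 31: 1 means negative */
    uint32_t exponent;  /* bits 30..23, stored in excess-127 form */
    uint32_t fraction;  /* bits 22..0, the stored significand bits */
};

/* Copy the bits of a float into an integer and pick the fields apart.
   Assumes that float is the 32-bit IEEE-754 format. */
struct f32_fields f32_unpack(float value)
{
    uint32_t bits;
    struct f32_fields f;

    memcpy(&bits, &value, sizeof bits);  /* reinterpret the bits, no conversion */
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xff;
    f.fraction = bits & 0x7fffff;
    return f;
}

/* Remove the excess-127 bias to recover the true power of two. */
int f32_true_exponent(struct f32_fields f)
{
    return (int)f.exponent - 127;
}
```

Unpacking $-2.625$ this way gives a sign of 1, a stored exponent of 128
(a true exponent of 1) and a fraction of \verb@0x280000@, which is the
bit pattern $0101$ followed by zeros as drawn in the figure above.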
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Floating Point Number Accuracy}
|
||||
Due to the finite number of bits used to store the value of a floating point
|
||||
number, it is not possible to represent every one of the infinite values
|
||||
on the real number line. The following C programs illustrate this point.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsubsection{Powers Of Two}
|
||||
Just like the integer numbers, the powers of two that have bits to represent
|
||||
them can be represented perfectly\ldots\ as can their sums (provided that the
|
||||
significand requires no more than 23 bits.)
|
||||
|
||||
\listing{powersoftwo.c}{Precise Powers of Two}
|
||||
\listing{powersoftwo.out}{Output from {\tt powersoftwo.c}}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsubsection{Clean Decimal Numbers}
|
||||
When dealing with decimal values, you will find that they don't map simply
|
||||
into binary floating point values.
|
||||
% (the same holds true for binary integer numbers).
|
||||
|
||||
Note how the decimal numbers are not accurately represented as they get larger.
|
||||
The decimal number on line 10 of \listingRef{cleandecimal.out}
|
||||
can be perfectly represented in IEEE format. However, a problem arises in
|
||||
the 11th loop iteration. It is due to the fact that the
|
||||
binary number can not be represented accurately in IEEE format. Its least
|
||||
significant bits were truncated in a best-effort attempt at rounding the value
|
||||
off in order to fit the value into the bits provided. This is an example of
|
||||
{\em low order truncation}. Once this happens, the value of \verb@x.f@ is
|
||||
no longer as precise as it could be given more bits in which to save its value.
|
||||
|
||||
\listing{cleandecimal.c}{Print Clean Decimal Numbers}
|
||||
\listing{cleandecimal.out}{Output from {\tt cleandecimal.c}}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsubsection{Accumulation of Error}
|
||||
These rounding errors can be exaggerated when the number we multiply
|
||||
the \verb@x.f@ value by is, itself, something that can not be accurately
|
||||
represented in IEEE
|
||||
form.\footnote{Applications requiring accurate decimal values, such as
|
||||
financial accounting systems, can use a packed-decimal numeric format
|
||||
to avoid unexpected oddities caused by the use of binary numbers.}
|
||||
\enote{In a lecture one would show that one tenth is a repeating
|
||||
non-terminating binary number that gets truncated. This discussion
|
||||
should be reproduced here in text form.}
|
||||
|
||||
For example, if we multiply our \verb@x.f@ value by $\frac{1}{10}$ each time,
|
||||
we can never be accurate and we start accumulating errors immediately.
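
The reason that $\frac{1}{10}$ can not be stored exactly can be worked out
by hand. As a sketch, convert the fraction to binary by repeatedly doubling
it and recording the integer part of each result as the next bit to the
right of the binary point:

\begin{verbatim}
0.1 x 2 = 0.2   -> 0
0.2 x 2 = 0.4   -> 0
0.4 x 2 = 0.8   -> 0
0.8 x 2 = 1.6   -> 1
0.6 x 2 = 1.2   -> 1
0.2 x 2 = 0.4   -> 0   (the same steps now repeat forever)
\end{verbatim}

The result is the repeating binary fraction $0.0001100110011\ldots$, which
can be confirmed by summing its place values:

\[
\frac{3}{32}\left(1 + \frac{1}{16} + \frac{1}{16^2} + \cdots\right)
= \frac{3}{32} \times \frac{16}{15}
= \frac{1}{10}
\]

Normalized, this is $1.100110011\ldots \times 2^{-4}$. Because only 23
significand bits can be stored, the repeating pattern must be cut off and
rounded, so the stored value is close to, but not equal to, one tenth.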
|
||||
|
||||
\listing{erroraccumulation.c}{Accumulation of Error}
|
||||
\listing{erroraccumulation.out}{Output from {\tt erroraccumulation.c}}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Reducing Error Accumulation}
|
||||
In order to use floating point numbers in a program without causing
|
||||
excessive rounding problems an algorithm can be redesigned such that the
|
||||
accumulation is eliminated.
|
||||
This example is similar to the previous one, but this time we recalculate the
|
||||
desired value from a known-accurate integer value.
|
||||
Some rounding errors remain present, but they can not accumulate.
|
||||
|
||||
\listing{errorcompensation.c}{Reducing Error Accumulation}
|
||||
\listing{errorcompensation.out}{Output from {\tt errorcompensation.c}}
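
The difference between the two approaches can also be seen in a smaller
setting. The following C fragment is a minimal sketch and is not the
listing used above: it uses repeated addition of one tenth rather than
multiplication, and the function names are invented for this illustration.
The first function lets rounding errors build on one another; the second
recomputes each value from an exact integer so that only one rounding can
occur.

```c
/* Add one tenth to a running total, n times. Each addition can round,
   and every later addition starts from the already-rounded total.
   All three function names are hypothetical, chosen for this sketch. */
float tenths_accumulated(int n)
{
    float total = 0.0f;

    for (int i = 0; i < n; ++i)
        total += 0.1f;
    return total;
}

/* Recompute n tenths directly from the exact integer n.
   Only the single division can round, so nothing accumulates. */
float tenths_recomputed(int n)
{
    return (float)n / 10.0f;
}

/* Absolute difference between a result and the value it should have. */
float abs_error(float actual, float expected)
{
    float diff = actual - expected;

    return diff < 0.0f ? -diff : diff;
}
```

Comparing \verb@abs_error(tenths_accumulated(1000), 100.0f)@ with
\verb@abs_error(tenths_recomputed(1000), 100.0f)@ shows how much of the
error is due to accumulation alone.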
|
@ -165,6 +165,13 @@
|
||||
so the programmer need not memorize the binary values of each
|
||||
machine instruction}
|
||||
}
|
||||
\newglossaryentry{thread}
|
||||
{
|
||||
name={thread},
|
||||
description={A stream of instructions. When plural, it is
|
||||
used to refer to the ability of a CPU to execute multiple
|
||||
instruction streams at the same time}
|
||||
}
|
||||
|
||||
\newacronym{hart}{hart}{Hardware Thread}
|
||||
\newacronym{msb}{MSB}{Most Significant Bit}
|
||||
|
72
book/install/chapter.tex
Normal file
@ -0,0 +1,72 @@
|
||||
\chapter{Installing a RISC-V Toolchain}
|
||||
\label{chapter:install}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{The GNU Toolchain}
|
||||
|
||||
Discuss the GNU toolchain elements used to experiment with the
|
||||
material in this book.
|
||||
|
||||
\enote{It would be good to find some Mac and Windows users to write
|
||||
and test proper variations on this section to address those systems.
|
||||
Pull requests, welcome!}%
|
||||
The instructions and examples here were all implemented on Ubuntu 16.04 LTS.
|
||||
|
||||
Install custom code in a location that will not cause interference with
|
||||
other applications and allow for easy cleanup. These instructions
|
||||
install the toolchain in \verb@/usr/local/riscv@. At any time
|
||||
you can remove the lot and start over by executing the following
|
||||
command:
|
||||
|
||||
\begin{verbatim}
|
||||
rm -rf /usr/local/riscv/*
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
Tested on Ubuntu 16.04 LTS.
|
||||
18.04 was just released\ldots\ update accordingly.
|
||||
|
||||
These are the only commands that you should perform as root when installing
|
||||
the toolchain:
|
||||
|
||||
\begin{verbatim}
|
||||
sudo apt-get install autoconf automake autotools-dev curl libmpc-dev \
|
||||
libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf \
|
||||
libtool patchutils bc zlib1g-dev libexpat-dev
|
||||
sudo mkdir -p /usr/local/riscv/
|
||||
sudo chmod 777 /usr/local/riscv/
|
||||
\end{verbatim}
|
||||
|
||||
All other commands should be executed as a regular user. This will eliminate the
|
||||
possibility of clobbering system files that should not be touched when tinkering with
|
||||
the toolchain applications.
|
||||
|
||||
To download, compile and ``install'' the toolchain:
|
||||
|
||||
\begin{verbatim}
|
||||
# riscv toolchain:
|
||||
#
|
||||
# https://riscv.org/software-tools/risc-v-gnu-compiler-toolchain/
|
||||
|
||||
git clone --recursive https://github.com/riscv/riscv-gnu-toolchain
|
||||
cd riscv-gnu-toolchain
|
||||
./configure --prefix=/usr/local/riscv/rv32i --with-arch=rv32i --with-abi=ilp32
|
||||
make
|
||||
make install
|
||||
\end{verbatim}
|
||||
|
||||
Need to discuss augmenting the PATH environment variable.
|
||||
|
||||
Discuss the choice of ilp32 as well as what the other variations would do.
|
||||
|
||||
Discuss rv32im and note that the details are found in \autoref{chapter:RV32}.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{rvddt}
|
||||
|
||||
Discuss installing the rvddt simulator here.
|
@ -57,6 +57,11 @@ the data and instructions that can not fit into the CPU registers.
|
||||
Typically, a CPU's registers can hold tens of data values while
|
||||
the main memory can contain many billions of data values.
|
||||
|
||||
To keep track of the data values, each register is assigned a number and
|
||||
the main memory is broken up into small blocks called \gls{byte}s that
|
||||
are also each assigned a number called an \gls{address}
|
||||
(an address is often referred to as a {\em location}.)
|
||||
|
||||
A CPU can process data in a register at a speed that can be an order
|
||||
of magnitude faster than the rate that it can process (specifically,
|
||||
transfer data and instructions to and from) the main memory.
|
||||
@ -81,15 +86,72 @@ more slowly than its main memory.
|
||||
|
||||
This text is not particularly concerned with non-volatile storage.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{CPU}
|
||||
|
||||
\index{CPU}
|
||||
The \acrshort{cpu} is a collection of registers and circuitry designed
|
||||
to read data and instructions from the storage system. The instructions
|
||||
tell the CPU to perform various mathematical and logical operations on
|
||||
the data in its registers and where to save the results of those operations.
|
||||
|
||||
\enote{Add a block diagram of the CPU components described here.}
|
||||
The \acrshort{cpu} is a collection of registers and circuitry designed
|
||||
to manipulate the register data and to exchange data and instructions with the
|
||||
storage system. The instructions that it reads from the main memory tell
|
||||
the CPU to perform various mathematical and logical operations on the data
|
||||
in its registers and where to save the results of those operations.
|
||||
|
||||
\subsubsection{Execution Unit}
|
||||
|
||||
The part of a CPU that coordinates all aspects of the operations of each
|
||||
instruction is called the {\em execution unit.} It is what performs the transfers
|
||||
of instructions and data between the CPU and the main memory and tells the
|
||||
registers when they are supposed to either store or recall data being transferred.
|
||||
The execution unit also controls the ALU (Arithmetic and Logic Unit).
|
||||
|
||||
\subsubsection{Arithmetic and Logic Unit}
|
||||
\index{ALU}
|
||||
|
||||
When an instruction manipulates data by performing things like an {\em addition},
|
||||
{\em subtraction}, {\em comparison} or other similar operations, the ALU is what
|
||||
will calculate the sum, difference, and so on.
|
||||
|
||||
\subsubsection{Registers}
|
||||
\index{register}
|
||||
|
||||
In the RV32 CPU there are 31 general purpose registers that each contain 32 \gls{bit}s
|
||||
(where each bit is one \gls{binary} digit value of one or zero) and a number
|
||||
of special-purpose registers.
|
||||
Each of the general purpose registers is given a name such as \reg{x1}, \reg{x2},
|
||||
\ldots\ on up to \reg{x31} ({\em general purpose} refers to the fact that the CPU
|
||||
itself does not prescribe any particular function to any of these registers.)
|
||||
Two important special-purpose registers are \reg{x0} and \reg{pc}.
|
||||
|
||||
Register \reg{x0} will always represent the value zero or logical {\em false}
|
||||
no matter what. If any instruction tries to change the value of \reg{x0}, the
attempt has no effect. The need for {\em zero} is so common that, other than the
|
||||
fact that it is hard-wired to zero, the \reg{x0} register is made available as
|
||||
if it were otherwise a general purpose register.%
|
||||
\footnote{Having a special
|
||||
{\em zero} register allows the total set of instructions that the CPU can execute
|
||||
to be simplified, thus reducing its complexity, power consumption and cost.}
|
||||
|
||||
The \reg{pc} register is called the {\em program counter}. The CPU uses it to
|
||||
remember the memory address where its program instructions are located.
|
||||
|
||||
The number of bits in each register is defined by the \acrfull{isa}.
|
||||
|
||||
\subsubsection{Harts}
|
||||
\index{hart}
|
||||
|
||||
Analogous to a {\em core} in other types of CPUs, a {\em \acrshort{hart}}
|
||||
(hardware \gls{thread}) in a RISC-V CPU refers to the collection of 32 registers,
|
||||
instruction execution unit and ALU.
|
||||
|
||||
When more than one hart is present in a CPU, a different stream of instructions can
|
||||
be executed on each hart all at the same time.
|
||||
Programs that are written to take advantage of this are called {\em multithreaded}.
|
||||
|
||||
This text will primarily focus on CPUs that have only one hart.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Peripherals}
|
||||
|
||||
@ -106,8 +168,8 @@ instructions are used to initiate, execute and/or synchronize data transfers.
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Instruction Set Architecture}
|
||||
|
||||
\index{ISA}
|
||||
|
||||
The catalog of rules that describes the details of the instructions
|
||||
and features that a given CPU provides is called its \acrfull{isa}.
|
||||
|
||||
@ -125,80 +187,58 @@ modules and zero or more of the {\em extension} modules.
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{RV Base Modules}
|
||||
\index{RV32I}
|
||||
|
||||
The base modules are RV32I (32-bit general purpose),
|
||||
RV32E (32-bit embedded), RV64I (64-bit general purpose)
|
||||
and RV128I (128-bit general purpose).
|
||||
|
||||
These base modules provide the minimal functional set of integer operations
|
||||
needed to execute an application. The differing bit-widths address
|
||||
needed to execute a useful application. The differing bit-widths address
|
||||
the needs of different main-memory sizes.
|
||||
|
||||
This text primarily focuses on the RV32I base module and how to program it.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Extension Modules}
|
||||
|
||||
\index{RV32M}
|
||||
\index{RV32A}
|
||||
\index{RV32F}
|
||||
\index{RV32D}
|
||||
\index{RV32Q}
|
||||
\index{RV32C}
|
||||
\index{RV32G}
|
||||
RISC-V extension modules may be included by an implementor interested
|
||||
in optimizing a design for one or more purposes.
|
||||
|
||||
\index{RV32M}%
|
||||
\index{RV32A}%
|
||||
\index{RV32F}%
|
||||
\index{RV32D}%
|
||||
\index{RV32Q}%
|
||||
\index{RV32C}%
|
||||
Available extension modules include M (integer math), A (atomic),
|
||||
F (32-bit floating point), D (64-bit floating point),
|
||||
Q (128-bit floating point), C (compressed size instructions) and others.
|
||||
|
||||
\index{RV32G}%
|
||||
The extension name {\em G} is used to represent the combined set of IMAFD
|
||||
extensions as it is expected to be a common combination.
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{An Example Computer}
|
||||
|
||||
\enote{Need a block diagram and description of the virtual machine
|
||||
that is used in this text.}%
|
||||
The machine used to execute the programs presented in this text
|
||||
has one RV32I CPU with 32 registers, one \acrshort{hart}
|
||||
(analogous to what is called a {\em core} on other CPUs such as an ARM)
|
||||
and 65536 bytes of memory.
|
||||
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Executing a Program}
|
||||
\section{How the CPU Executes a Program}
|
||||
|
||||
To observe the operation of our example computer an RV32I simulator
|
||||
will be used that will print a message describing the status of the
|
||||
CPU and the instructions that it executes as it goes along.
|
||||
|
||||
The process of executing an instruction is called an
|
||||
\index{instruction cycle}{\em instruction cycle} and it is comprised
|
||||
The process of executing a program is a continuously repeating series of
|
||||
\index{instruction cycle}{\em instruction cycles} that are each comprised
|
||||
of an {\em instruction fetch} and an {\em instruction execute} phase.
|
||||
|
||||
The status of the CPU is entirely embodied in the data values that
|
||||
are stored in its registers at any moment in time. The simulator
|
||||
can print all of the register values before it executes an instruction
|
||||
for reference.
|
||||
The current status of a CPU is entirely embodied in the data values that
|
||||
are stored in its registers at any moment in time. Of particular interest
|
||||
to an executing program is the \reg{pc} register. The \reg{pc} contains
|
||||
the memory address containing the instruction that the CPU will execute next.
|
||||
|
||||
When an instruction is executed the simulator can print a message
|
||||
describing where in main memory it came from, its numeric machine code
|
||||
value, its mnemonic, a description of any associated parameters,
|
||||
the values of those parameters and then carry out the operation as
|
||||
defined by the ISA.
|
||||
|
||||
For this to work, the instructions to be executed will have been
|
||||
previously stored in a list in the main memory and any parameters that
|
||||
an instruction specifies will either be part of the instruction itself
|
||||
or read from (or stored into) one or more of the registers.
|
||||
For this to work, the instructions to be executed must have been previously
|
||||
stored in adjacent main memory locations and the address of the first instruction
|
||||
placed into the \reg{pc} register.
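
The repeating instruction cycle can be sketched in a few lines of C. This
is a hypothetical model written for this illustration and is not how rvddt
is implemented: the names \verb@machine@, \verb@fetch@, \verb@execute@ and
\verb@run@ are invented, the memory is only 64 bytes, and the only
instruction it understands is {\em addi}. It assumes that each instruction
occupies four adjacent bytes stored least significant byte first.

```c
#include <stdint.h>

/* A hypothetical, minimal machine state: a program counter, 32 registers
   and a small byte-addressed main memory. */
struct machine {
    uint32_t pc;
    uint32_t x[32];
    uint8_t  mem[64];
};

/* Instruction fetch: read four adjacent bytes starting at pc and
   assemble them, least significant byte first (little-endian). */
uint32_t fetch(const struct machine *m)
{
    uint32_t a = m->pc;

    return (uint32_t)m->mem[a]
         | (uint32_t)m->mem[a + 1] << 8
         | (uint32_t)m->mem[a + 2] << 16
         | (uint32_t)m->mem[a + 3] << 24;
}

/* Instruction execute: this sketch understands only addi.
   Returns 1 if the instruction was executed, 0 if it was not recognized. */
int execute(struct machine *m, uint32_t insn)
{
    uint32_t opcode = insn & 0x7f;
    uint32_t rd     = (insn >> 7) & 0x1f;
    uint32_t funct3 = (insn >> 12) & 0x7;
    uint32_t rs1    = (insn >> 15) & 0x1f;
    uint32_t imm    = insn >> 20;

    if (imm & 0x800)
        imm |= 0xfffff000u;      /* sign-extend the 12-bit immediate */

    if (opcode != 0x13 || funct3 != 0)
        return 0;                /* not addi */

    if (rd != 0)                 /* x0 always holds zero */
        m->x[rd] = m->x[rs1] + imm;
    m->pc += 4;                  /* proceed to the next instruction */
    return 1;
}

/* Repeat the instruction cycle for at most count instructions.
   Returns the number of instructions actually executed. */
int run(struct machine *m, int count)
{
    int done = 0;

    while (done < count && m->pc <= sizeof m->mem - 4
           && execute(m, fetch(m)))
        ++done;
    return done;
}
```

Loading the machine code for four {\em addi} instructions at address 0,
setting \reg{pc} to zero and calling \verb@run@ with a count of four
leaves \reg{pc} at 16, since each instruction advances it by four bytes.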
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
@ -209,54 +249,22 @@ In order to {\em fetch} an instruction from the main memory the CPU
|
||||
must have a method to identify which instruction should be fetched and
|
||||
a method to fetch it.
|
||||
|
||||
To make this possible the main memory is broken up into small blocks
|
||||
called \gls{byte}s that are each given a unique identifying number
|
||||
called an \gls{address}. The process of identifying which instruction
|
||||
to fetch is therefore a matter of knowing what address it is stored in.
|
||||
Given that the main memory is broken up and that each of its bytes is
|
||||
assigned an address, the \reg{pc} is used to hold the address of the
|
||||
location where the next instruction to execute is located.
|
||||
|
||||
A byte is comprised of eight binary digits called \gls{bit}s.
|
||||
|
||||
Every possible instruction that the RV32I can execute contains
|
||||
exactly 32 bits. Therefore each instruction must be stored in
|
||||
four bytes of the main memory.
|
||||
|
||||
To simplify the hardware, each instruction
|
||||
must be placed into four adjacent bytes whose numeric address sequence
|
||||
begins with a multiple of four. For example, an instruction might be
|
||||
located in bytes 12, 13, 14 and 15 (but not in 15, 16, 17 and 18
|
||||
nor 8, 207, 5, and 1073\ldots).
|
||||
|
||||
This sort of addressing requirement is common and is referred to as
|
||||
\gls{alignment}. An aligned instruction begins at a memory address
|
||||
that is a multiple of four. An {\em unaligned} instruction would
|
||||
be one beginning at any other address and is {\em illegal}.
|
||||
|
||||
An attempt to fetch an instruction from an unaligned address
|
||||
will result in an error referred to as an alignment {\em \gls{exception}}.
|
||||
This and other exceptions cause the CPU to stop executing the
|
||||
current instruction and start executing a different set of instructions
|
||||
that are prepared to handle the problem. Often an exception is
|
||||
handled by completely stopping the program in a way that is commonly
|
||||
referred to as a system or application {\em crash}.
|
||||
|
||||
Given a properly aligned instruction address, the CPU can request
|
||||
that the main memory locate and deliver the values of the four bytes
|
||||
in the address sequence to the CPU using what is called a memory
|
||||
read operation. Some systems can deliver four (or more) bytes at the
|
||||
same time while others might only be capable of delivering one or
|
||||
two bytes at a time. These differences in hardware typically impact the
|
||||
cost and performance of a system.\footnote{The design and implementation
|
||||
choices that determine how any given system operates are part of what is
|
||||
called a system's {\em organization} and is beyond the scope of this text.
|
||||
See~\cite{codriscv:2017} for more information on computer organization.}
|
||||
Given an instruction address, the CPU can request that the main memory
|
||||
locate and return the value of the data stored there using what is called
|
||||
a {\em memory read} operation and then the CPU can treat that {\em fetched}
|
||||
value as an instruction and execute it.\footnote{RV32I instructions are
|
||||
more than one byte in size, but this general description is suitable for now.}
|
||||
|
||||
Once an instruction has been fetched, it can be executed.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Instruction Execute}
|
||||
\index{instruction execute}
|
||||
|
||||
Once an instruction has been fetched by the CPU, it can be executed.
|
||||
|
||||
Typical instructions do things like add a number to the value
|
||||
currently stored in one of the registers or store the contents of a
|
||||
register into the main memory at some given address.
|
||||
@ -265,14 +273,19 @@ Also part of every instruction is a notion of what should be done next.
|
||||
|
||||
Most of the time an instruction will be complete by indicating that
|
||||
the CPU should proceed to fetch and execute the instruction at the next
|
||||
larger main memory address.
|
||||
larger main memory address. In these cases the \reg{pc} is incremented
|
||||
to point to the memory address after the current instruction.
|
||||
|
||||
Any parameters that an instruction requires must either be part of
|
||||
the instruction itself or read from (or stored into) one or more of the
|
||||
general purpose registers.
|
||||
|
||||
Some instructions can specify that the CPU proceed to execute an
|
||||
instruction at an address other than the one that follows itself.
|
||||
This class of instructions has names like {\em jump} and {\em branch}
|
||||
and are available in a variety of different styles.
|
||||
|
||||
The RV ISA uses the word {\em jump} to refer to an {\em unconditional}
|
||||
The RISC-V ISA uses the word {\em jump} to refer to an {\em unconditional}
|
||||
change in the sequential processing of instructions and the word
|
||||
{\em branch} to refer to a {\em conditional} change.
|
||||
|
||||
@ -285,143 +298,5 @@ one of two different actions pending the resulting {\em condition} of
|
||||
the comparison.\footnote{This is the fundamental method used by a CPU
|
||||
to make decisions.}
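
The choice of which instruction to execute next can be sketched as follows. This is a simplified model, not a description of any real CPU: it assumes that every instruction is four bytes long, and the \verb@next_pc@ function and its parameter names are hypothetical.

```python
def next_pc(pc: int, kind: str, target: int = 0, condition: bool = False) -> int:
    """Return the address of the next instruction to fetch.

    Hypothetical model: kind is "jump", "branch" or anything else.
    """
    if kind == "jump":
        # A jump changes the flow of execution unconditionally.
        return target
    if kind == "branch" and condition:
        # A branch changes the flow only when its condition is true.
        return target
    # Otherwise proceed to the next larger address (four-byte instructions assumed).
    return pc + 4


print(next_pc(8, "addi"))                # prints 12
print(next_pc(8, "jump", 32))            # prints 32
print(next_pc(8, "branch", 32, False))   # prints 12
print(next_pc(8, "branch", 32, True))    # prints 32
```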
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{A Sample Program Source Listing}
|
||||
|
||||
A simple program that illustrates how this text presents
|
||||
program source code is seen in \listingRef{zero4regs.S}.
|
||||
This program will place a zero in each of the 4 registers
|
||||
named x28, x29, x30 and x31.
|
||||
|
||||
\listing{zero4regs.S}{Setting four registers to zero.}
|
||||
|
||||
This program listing illustrates a number of things:
|
||||
\begin{itemize}
|
||||
\item Listings are identified by the name of the file within which
|
||||
they are stored. This listing is from a file named: \verb@zero4regs.S@.
|
||||
\item The assembly language programs discussed in this text will be saved
|
||||
in files that end with: \verb@.S@ (Alternately you can use \verb@.sx@
|
||||
on systems that don't understand the difference between upper and
|
||||
lowercase letters.\footnote{The author of this text prefers to avoid
|
||||
using such systems.})
|
||||
\item A description of the listing's purpose appears under the name of the
|
||||
file. The description of \listingRef{zero4regs.S} is
|
||||
{\em Setting four registers to zero.}
|
||||
\item The lines of the listing are numbered on the left margin for
|
||||
easy reference.
|
||||
\item An assembly program consists of lines of plain text.
|
||||
\item The RISC-V ISA does not provide an operation that will simply
|
||||
set a register to a numeric value. To accomplish our goal this
|
||||
program will add zero to zero and place the sum in each of the
|
||||
four registers.
|
||||
\item The lines that start with a dot `.' (on lines 1, 2 and 3) are
|
||||
called {\em assembler directives} as they tell the assembler itself
|
||||
how we want it to translate the following {\em assembly language instructions}
|
||||
into {\em machine language instructions.}
|
||||
\item Line 4 shows a {\em label} named {\em \_start}. The colon
|
||||
at the end is the indicator to the assembler that causes it to
|
||||
recognize the preceding characters as a label.
|
||||
\item Lines 5-8 are the four assembly language instructions that
|
||||
make up the program. Each instruction in this program
|
||||
consists of four {\em fields}. (Different instructions can have
|
||||
a different number of fields.) The fields on line 5 are:
|
||||
|
||||
\begin{itemize}
|
||||
\item [addi] The instruction mnemonic. It indicates the operation
|
||||
that the CPU will perform.
|
||||
\item [x28] The {\em destination} register that will receive the
|
||||
sum when the {\em addi} instruction is finished. The names of
|
||||
the 32 registers are expressed as x0 -- x31.
|
||||
\item [x0] One of the addends of the sum operation. (The x0 register
|
||||
will always contain the value zero. It can never be changed.)
|
||||
\item [0] The second addend is the number zero.
|
||||
\item [\# set \ldots] Any text anywhere in a RISC-V assembly language
|
||||
program that starts with the pound-sign is ignored by the assembler.
|
||||
Such text is used to place a {\em comment} in the program to help
|
||||
the reader better understand the motive of the programmer.
|
||||
\end{itemize}
|
||||
\end{itemize}
|
||||
|
||||
|
||||
\subsection{Running a Program With rvddt}
|
||||
\index{rvddt}
|
||||
|
||||
To illustrate what a CPU does when it executes instructions this text
|
||||
will use a simulator that shows the sequence of events and the binary values
|
||||
involved. \listingRef{zero4regs.out} shows the operation of the four
|
||||
{\em addi} instructions from \listingRef{zero4regs.S} when executed using the
|
||||
\gls{rvddt} simulator.\footnote{The {\em rvddt} application was written to
|
||||
generate the listings for this text. It is similar to the fancier
|
||||
{\em spike} simulator. Given the simplicity of the RV32I ISA, rvddt
|
||||
is less than 1700 lines of C++ and was written in one (long) afternoon.}
|
||||
|
||||
\listing{zero4regs.out}{Running a program with the rvddt simulator}
|
||||
|
||||
\begin{itemize}
|
||||
\item [$\ell$ 1] This listing includes the command-line that shows how the simulator
|
||||
was executed to load a file containing the machine instructions (aka
|
||||
machine code) from the assembler.
|
||||
\item [$\ell$ 2] A message from the simulator indicating that it loaded the machine
|
||||
code into simulated memory at address 0.
|
||||
\item [$\ell$ 3] This line shows the prompt from the debugger and the command
|
||||
\verb@t4@ that the user entered to request that the simulator trace
|
||||
the execution of four instructions.
|
||||
\item [$\ell$ 4-8] Prior to executing the first instruction, the state of the
|
||||
CPU registers is displayed.
|
||||
\item [$\ell$ 4] The values in registers 0, 1, 2, 3, 4, 5, 6 and 7 are printed
|
||||
from left to right in \gls{bigendian}, \gls{hexadecimal} form.
|
||||
The dash `\verb@-@' character in the middle of the line is a reference
|
||||
to make it easier to visually navigate across the line without being
|
||||
forced to count the values from the far left when seeking the value
|
||||
of, say, x5.
|
||||
\item [$\ell$ 5-7] The values of registers 8--31 are printed.
|
||||
\item [$\ell$ 8] The {\em program counter} (\reg{pc}) register is printed.
|
||||
It contains the address of the instruction that the CPU will execute.
|
||||
After each instruction, the \reg{pc} will either advance four bytes
|
||||
ahead or be set to another value by a branch instruction as discussed above.
|
||||
\item [$\ell$ 9] A four-byte instruction is fetched from memory at the address
|
||||
in the \reg{pc} register, is decoded and printed. From left to right
|
||||
the fields shown on this line are:
|
||||
|
||||
\begin{itemize}
|
||||
|
||||
\item [00000000] The memory address from which the instruction was
|
||||
fetched. This address is displayed in \gls{bigendian},
|
||||
\gls{hexadecimal} form.
|
||||
\item [00000e13] The machine code of the instruction displayed in
|
||||
\gls{bigendian}, \gls{hexadecimal} form.
|
||||
\item [addi] The mnemonic for the machine instruction.
|
||||
\item [x28] The \reg{rd} field of the addi instruction.
|
||||
\item [x0] The \reg{rs1} field of the addi instruction that
|
||||
holds one of the two addends of the operation.
|
||||
\item [0] The \reg{imm} field of the addi instruction that
|
||||
holds the second of the two addends of the operation.
|
||||
\item [\# \ldots] A simulator-generated comment that explains
|
||||
what the instruction is doing. For this instruction it indicates
|
||||
that \reg{x28} will have the value zero stored into it as a result
|
||||
of performing the addition: $0+0$.
|
||||
\end{itemize}
|
||||
|
||||
\item [$\ell$ 10-14] These lines are printed as the prelude while tracing the
|
||||
second instruction. Lines 7 and 13 show that \reg{x28} has changed
|
||||
from \verb@f0f0f0f0@ to \verb@00000000@ as a result of executing the
|
||||
first instruction and lines 8 and 14 show that the \reg{pc} has
|
||||
advanced from zero (the location of the first instruction) to
|
||||
four, where the second instruction will be fetched. None of the
|
||||
rest of the registers have changed values.
|
||||
\item [$\ell$ 15] The second instruction is decoded, executed and described.
|
||||
This time register \reg{x29} will be assigned a value.
|
||||
\item [$\ell$ 16-27] The third and fourth instructions are traced.
|
||||
\item [$\ell$ 28] Tracing has completed. The simulator prints its prompt
|
||||
and the user enters the `r' command to see the register state
|
||||
after the fourth instruction has completed executing.
|
||||
\item [$\ell$ 29-33] Following the fourth instruction it can be observed
|
||||
that registers \reg{x28}, \reg{x29}, \reg{x30} and \reg{x31}
|
||||
have been set to zero and that the \reg{pc} has advanced from
|
||||
zero to four, then eight, then 12 (the hex value for 12 is c)
|
||||
and then to 16 (which, in hex, is 10).
|
||||
\item [$\ell$ 34] The simulator exit command `x' is entered by the user and
|
||||
the terminal displays the shell prompt.
|
||||
|
||||
\end{itemize}
|
||||
Once the instruction execution phase has completed, the next instruction
|
||||
cycle will be performed using the new \reg{pc} register address.
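
The machine code \verb@00000e13@ shown in the trace can be reproduced by hand. The sketch below packs the fields of an {\em addi} instruction into a 32-bit word using the RV32I I-type layout (\reg{imm} in bits 31--20, \reg{rs1} in bits 19--15, a function code in bits 14--12, \reg{rd} in bits 11--7 and the opcode in bits 6--0). The \verb@encode_addi@ function is a hypothetical helper written for this illustration. It is not part of rvddt or of the assembler.

```python
def encode_addi(rd: int, rs1: int, imm: int) -> int:
    """Return the 32-bit machine code for: addi rd, rs1, imm (hypothetical helper)."""
    if not (0 <= rd < 32 and 0 <= rs1 < 32):
        raise ValueError("register numbers must be in the range 0..31")
    if not (-2048 <= imm <= 2047):
        raise ValueError("imm must fit in 12 signed bits")
    opcode = 0b0010011   # opcode shared by the register-immediate instructions
    funct3 = 0b000       # selects addi within that opcode
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


# The four instructions of the program, one every four bytes starting at address 0.
pc = 0
for rd in (28, 29, 30, 31):
    print(f"{pc:08x}: {encode_addi(rd, 0, 0):08x}")
    pc += 4
# prints:
# 00000000: 00000e13
# 00000004: 00000e93
# 00000008: 00000f13
# 0000000c: 00000f93
```

Only the \reg{rd} field differs between the four words, which is why only one hexadecimal digit pair changes from one instruction to the next.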
|
||||
|
@ -1,57 +1,11 @@
|
||||
\chapter{The RISC-V GNU Toolchain}
|
||||
\chapter{Using The RISC-V GNU Toolchain}
|
||||
|
||||
This chapter discusses the GNU toolchain elements used to
|
||||
This chapter discusses using the GNU toolchain elements to
|
||||
experiment with the material in this book.
|
||||
|
||||
The\enote{It would be good to find some Mac and Windows users to write
|
||||
and test proper variations on this section to address those systems.
|
||||
Pull requests, welcome!}
|
||||
instructions and examples here were all implemented on Ubuntu 16.04 LTS.
|
||||
See \autoref{chapter:install} if you do not already have the
|
||||
GNU cross-compiler toolchain available on your system.
|
||||
|
||||
Install custom code in a location that will not cause interference with
|
||||
other applications and allow for easy cleanup. These instructions
|
||||
install the toolchain in \verb@/usr/local/riscv@. At any time
|
||||
you can remove the lot and start over by executing the following
|
||||
command:
|
||||
|
||||
\begin{verbatim}
|
||||
rm -rf /usr/local/riscv/*
|
||||
\end{verbatim}
|
||||
|
||||
|
||||
Tested on Ubuntu 16.04 LTS.
|
||||
18.04 was just released\ldots\ update accordingly.
|
||||
|
||||
These are the only commands that you should perform as root when installing
|
||||
the toolchain:
|
||||
|
||||
\begin{verbatim}
|
||||
sudo apt-get install autoconf automake autotools-dev curl libmpc-dev \
|
||||
libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf \
|
||||
libtool patchutils bc zlib1g-dev libexpat-dev
|
||||
sudo mkdir -p /usr/local/riscv/
|
||||
sudo chmod 777 /usr/local/riscv/
|
||||
\end{verbatim}
|
||||
|
||||
All other commands should be executed as a regular user. This will eliminate the
|
||||
possibility of clobbering system files that should not be touched when tinkering with
|
||||
the toolchain applications.
|
||||
|
||||
To download, compile and ``install'' the toolchain:
|
||||
|
||||
\begin{verbatim}
|
||||
# riscv toolchain:
|
||||
#
|
||||
# https://riscv.org/software-tools/risc-v-gnu-compiler-toolchain/
|
||||
|
||||
git clone --recursive https://github.com/riscv/riscv-gnu-toolchain
|
||||
cd riscv-gnu-toolchain
|
||||
./configure --prefix=/usr/local/riscv/rv32i --with-arch=rv32i --with-abi=ilp32
|
||||
make
|
||||
make install
|
||||
\end{verbatim}
|
||||
|
||||
Need to discuss augmenting the PATH environment variable.
|
||||
|
||||
Discuss the choice of ilp32 as well as what the other variations would do.
|
||||
|
||||
|