Reorg and prepare to relocate memory details into binary chapter.

John Winans 2018-05-17 22:51:25 -05:00
parent ea1e7f1bb7
commit 62ae79c007
2 changed files with 148 additions and 23 deletions


@ -3,32 +3,43 @@
This chapter discusses how data are represented and stored in a computer.
In the context of computing, {\em boolean} refers to a condition that can
be either true or false and {\em binary} refers to the use of a base-2
numeric system to represent numbers.
RISC-V assembly language uses binary to represent all values, be they
boolean or numeric. It is the context within which they are used that
determines whether they are boolean or numeric.
RISC-V assembly language uses zero to represent {\em false} and one
to represent {\em true}. In general, however, it is useful to relax
this and define zero {\bf and only zero} to be {\em false} and anything
that is not {\em false} is therefore {\em true}.%
\footnote{This is how {\em true} and {\em false} behave in C, C++, and
many other languages as well as the common assembly language idioms
discussed in this text.}
\enote{Add some diagrams here showing bits, bytes and the MSB,
LSB,\ldots\ perhaps relocated from the RV32I chapter?}%
The reason for this relaxation is that, while a single binary digit
(\gls{bit}) can represent the two values zero and one, the vast majority
of the time data are processed by the CPU in groups of bits. These
groups have names like \gls{byte}, \gls{halfword} and \gls{fullword}.
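As a small illustration of the relaxed definition (a sketch only; the
register {\tt t0} and the label {\tt nonzero} are chosen here purely for
the example), a single branch can treat an entire register as one boolean
value:
\begin{verbatim}
    bnez    t0, nonzero     # taken when any bit in t0 is one (true)
                            # falls through when t0 is zero (false)
\end{verbatim}
Under this definition a register holding 0x76 and a register holding 0x01
are both {\em true}; only a value whose bits are all zero is {\em false}.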
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\section{Context}
%
%Numbers can be interpreted differently depending on the context in
%which they are used. For example a number may represent the quantity
%of millimeters between two points. It may enumerate
%a letter of the alphabet -- e.g. $01000001=A$, $01000010=B$,
%$01000011=C$\ldots\ In fact, any finite set of items can be identified
%(enumerated) by assigning a code number to each element in this fashion.
\section{Boolean Functions}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Logical/Boolean Functions}
\enote{This is unclear. Need to define bit positions and probably
should add basic truth table diagrams.}%
Unlike addition and subtraction, boolean functions apply
on a per-bit basis.
\enote{Probably should add basic truth table diagrams.}%
Boolean functions apply on a per-bit basis.
%in that they do not impact neighboring bits.
%by generating things like a carry or a borrow.
When applied to multi-bit values, each bit position is operated upon
independently of the other bits.
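For example, taking the two 4-bit values 1100 and 1010 as operands
(chosen here only for illustration), the AND, OR and XOR functions
give the following results, one bit position at a time:
\begin{verbatim}
      1100          1100          1100
  AND 1010       OR 1010      XOR 1010
      ----          ----          ----
      1000          1110          0110
\end{verbatim}
No bit position affects its neighbors; there is nothing comparable to
the carry or borrow that addition and subtraction can generate.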
\enote{Need to define 1 as true and 0 as false somewhere.}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@ -534,38 +545,123 @@ Discuss the details of truncation and overflow here.
{\em truncation} and {\em overflow} as occur with signed and unsigned
addition and subtraction.}
I prefer to define {\em truncation} as the loss of data as a result of
the bit-length of the destination being too small to hold the result of an
operation and {\em overflow} as when the carry into a sign bit is not
the same as the carry out of the sign bit.
Where addition and subtraction on the RV32 are concerned, the sum or difference of
two unsigned 32-bit numbers will be {\em truncated} when the operation results in
a carry out of bit 31. Unsigned operations cannot overflow (as defined above).
(show a truncation picture here)
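As a worked example of truncation (the operands are chosen only for
illustration), adding the unsigned values 0xfffffff0 and 0x00000020
produces a sum that needs 33 bits, of which only the low 32 fit in the
destination register:
\begin{verbatim}
      0xfffffff0      4294967280
    + 0x00000020    +         32
    ------------    ------------
    0x1 00000010      4294967312    (true sum needs 33 bits)
      0x00000010              16    (what a 32-bit register holds)
\end{verbatim}
The carry out of bit 31 is the bit that has been lost.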
An overflow occurs with signed numbers when the two addends are both positive and
the sum is negative or the addends are both negative and the sum is positive.
(show an overflow picture here)
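As a worked example of overflow (the operands are chosen only for
illustration), adding one to the largest positive signed 32-bit value
yields a negative result:
\begin{verbatim}
      0x7fffffff      2147483647
    + 0x00000001    +          1
    ------------    ------------
      0x80000000     -2147483648    (true sum is 2147483648)
\end{verbatim}
Here the carry into bit 31 is one and the carry out of bit 31 is zero.
Because the two carries differ, the sign bit of the result is wrong.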
(show mixed overflow and truncation situations here to drive home the need
to ignore truncation when dealing with signed numbers.)
0xffffffff + 0x00000002 has truncation but not overflow
(OK for signed, not OK for unsigned).
0xffffffff + 0xfffffffe also has truncation but not overflow.
0x40000000 + 0x40000000 has overflow but not truncation. (We care if these are signed numbers.)
0x80000000 + 0x80000000 has both overflow and truncation. (We care regardless of signedness.)
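These four cases can be tabulated by comparing the carry into bit 31
with the carry out of bit 31 (a sketch that simply applies the two
definitions given earlier in this section):
\begin{verbatim}
                                          carry   carry
    operands                  result      into    out of  over-  trun-
                                          bit 31  bit 31  flow   cation
    0xffffffff + 0x00000002   0x00000001    1       1     no     yes
    0xffffffff + 0xfffffffe   0xfffffffd    1       1     no     yes
    0x40000000 + 0x40000000   0x80000000    1       0     yes    no
    0x80000000 + 0x80000000   0x00000000    0       1     yes    yes
\end{verbatim}
Truncation follows the carry out of bit 31 alone, while overflow is present
exactly when the two carry columns differ.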
Where subtraction is concerned, the notion of a borrow is the same as that of a carry.
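For example (operands chosen only for illustration), subtracting 2 from 1
as unsigned 32-bit values requires a borrow out of bit 31, so the
difference is truncated:
\begin{verbatim}
      0x00000001               1
    - 0x00000002    -          2
    ------------    ------------
      0xffffffff      4294967295    (true difference is -1)
\end{verbatim}
Interpreted as a signed value, the same result 0xffffffff is $-1_{10}$,
which is correct, so there is no overflow.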
\enote{I think that overloading the word {\em overflow} like this can be
confusing to new programmers.}%
Page 13 of~\cite{rvismv1v22:2017} mixes these two notions
(and never mentions the word {\em truncate}) like this:
\begin{quote}
We did not include special instruction set support for overflow checks on
integer arithmetic operations in the base instruction set, as many overflow
checks can be cheaply implemented using RISC-V branches. Overflow checking for
unsigned addition requires only a single additional branch instruction after the
addition: \verb@add t0, t1, t2; bltu t0, t1, overflow@.
For signed addition, if one operand's sign is known, overflow checking requires
only a single branch after the addition:
\verb@addi t0, t1, +imm; blt t0, t1, overflow@. This covers the common
case of addition with an immediate operand.
For general signed addition, three additional instructions after the addition
are required, leveraging the observation that the sum should be less than one
of the operands if and only if the other operand is negative.
\begin{verbatim}
add t0, t1, t2
slti t3, t2, 0
slt t4, t0, t1
bne t3, t4, overflow
\end{verbatim}
In RV64, checks of 32-bit signed additions can be optimized further by comparing
the results of ADD and ADDW on the operands.
\end{quote}
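Tracing the quoted sequences with concrete values may help relate them
to the terms used in this text. The register contents below are chosen
only for illustration, and the reading of the unsigned check as a
{\em truncation} test is an interpretation rather than something the
specification states.
\begin{verbatim}
    # unsigned check: t1 = 0xfffffff0, t2 = 0x00000020
    add   t0, t1, t2        # t0 = 0x00000010 (carry out of bit 31 lost)
    bltu  t0, t1, overflow  # 0x00000010 < 0xfffffff0 unsigned: taken

    # general signed check: t1 = 0x7fffffff, t2 = 0x00000001
    add   t0, t1, t2        # t0 = 0x80000000
    slti  t3, t2, 0         # t3 = 0 because t2 is not negative
    slt   t4, t0, t1        # t4 = 1 because t0 < t1 as signed values
    bne   t3, t4, overflow  # 0 != 1: taken
\end{verbatim}
In the terms defined above, the first branch detects truncation of an
unsigned sum and the second detects overflow of a signed sum.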
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Sign and Zero Extension}
\enote{Refactor the sx() and zx() discussion in the RV32I chapter
and locate the details here.}%
Seems like a good place to discuss extension.
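A minimal illustration, assuming an 8-bit value being widened to 32 bits
(the byte values are arbitrary): zero extension fills the new high-order
bits with zeros, while sign extension fills them with copies of the
original sign bit.
\begin{verbatim}
    8-bit value    zero-extended    sign-extended
    0x76           0x00000076       0x00000076
    0x80           0x00000080       0xffffff80
    0xff           0x000000ff       0xffffffff
\end{verbatim}
On RV32I the {\tt lbu} instruction zero-extends the byte that it loads
and the {\tt lb} instruction sign-extends it.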
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Shifting}
Seems like a good place to discuss logical and arithmetic shifting.
shift left logical
shift right logical
shift right arithmetic
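The three shifts can be compared with a short sketch. The register
choices, the starting value 0x80000010 and the shift amount of 4 are
assumptions made only for this example.
\begin{verbatim}
    # assume t0 = 0x80000010
    slli  t1, t0, 4     # shift left logical:     t1 = 0x00000100
    srli  t2, t0, 4     # shift right logical:    t2 = 0x08000001
    srai  t3, t0, 4     # shift right arithmetic: t3 = 0xf8000001
\end{verbatim}
The logical shifts fill the vacated bit positions with zeros. The
arithmetic right shift fills them with copies of the original sign bit,
so a negative value remains negative.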
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Main Memory Storage}
\enote{Refactor this section and the memory discussion in RV32 reference chapter}%
\enote{Consider refactoring the memory discussion in RV32 reference chapter
and placing some of it in this section.}%
When transferring data between its registers and main memory, a
RISC-V system uses the little-endian byte order.\footnote{
See~\cite{IEN137} for some history of the big/little-endian ``controversy.''}
\enote{Discuss byte ordering, addressing and character strings.}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Memory Dump}
Introduce memory dumps and how to read them here.
Discuss the pitfalls of assuming what a set of bytes is used for based
on their contents!
\listing{rvddt_memdump.out}{{\tt rvddt} memory dump}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Big Endian Representation}
Using the memory dump contents in the prior section, discuss how
big endian values are stored.
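As a sketch of the idea, take the four bytes that the {\tt rvddt} dump
shows starting at address 0x00002600. If they are read as big-endian
values, the byte at the lowest address is the most significant:
\begin{verbatim}
    address:   0x00002600  0x00002601  0x00002602  0x00002603
    byte:          93          05          00          00

    big-endian halfword at 0x00002600:  0x9305
    big-endian fullword at 0x00002600:  0x93050000
\end{verbatim}
Whether these particular bytes were ever intended to be read this way
is a separate question; the example only shows the byte ordering.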
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Little Endian Representation}
Using the memory dump contents in the prior section, discuss how
little endian values are stored.
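As a sketch of the idea, take the four bytes that the {\tt rvddt} dump
shows starting at address 0x00002600. If they are read as little-endian
values, the byte at the lowest address is the least significant:
\begin{verbatim}
    address:   0x00002600  0x00002601  0x00002602  0x00002603
    byte:          93          05          00          00

    little-endian halfword at 0x00002600:  0x0593
    little-endian fullword at 0x00002600:  0x00000593
\end{verbatim}
This is the order a RISC-V system uses when it loads a halfword or a
fullword from main memory into a register.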
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Character Strings and Arrays}
Define character strings and arrays.
@ -573,11 +669,23 @@ Define character strings and arrays.
Using the prior memory dump, discuss how and where things are stored and
retrieved.
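For instance, the ASCII column of the {\tt rvddt} dump shows the text
{\tt val=} beginning at address 0x00002658. Assuming these bytes are a
C-style character string (an assumption, since the dump alone cannot
confirm how they are used), the characters occupy consecutive, increasing
addresses and are followed by a zero byte:
\begin{verbatim}
    address      byte    character
    0x00002658   0x76    v
    0x00002659   0x61    a
    0x0000265a   0x6c    l
    0x0000265b   0x3d    =
    0x0000265c   0x00    (null terminator)
\end{verbatim}
Because each character is a single byte, byte order plays no part in how
a string of this kind is laid out.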
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Context is Important!}
Data values can be interpreted differently depending on the context in
which they are used. Assuming what a set of bytes is used for based on
their contents can be very misleading! For example, there is a 0x76 at
address 0x00002658. This is a `v' if you use it as an ASCII
character (see~\autoref{chapter:ascii}), a $118_{10}$ if it is an integer
value, and TRUE if it is a conditional.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Alignment}
Draw a diagram showing the overlapping data types when they are all aligned.
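A rough text sketch of such a diagram follows. It assumes that a halfword
is two bytes, a fullword is four bytes, and that an item is aligned when
its address is a multiple of its size; the addresses are arbitrary.
\begin{verbatim}
    address:    0x00  0x01  0x02  0x03  0x04  0x05  0x06  0x07
    bytes:      [ b ] [ b ] [ b ] [ b ] [ b ] [ b ] [ b ] [ b ]
    halfwords:  [    h    ] [    h    ] [    h    ] [    h    ]
    fullwords:  [          w          ] [          w          ]
\end{verbatim}
Under these assumptions every aligned fullword contains exactly two
aligned halfwords, and every aligned halfword contains exactly two bytes.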
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Instruction Alignment}
\enote{Rewrite this section for data rather than instructions and then


@ -0,0 +1,17 @@
ddt> d 0x00002600
00002600: 93 05 00 00 13 06 00 00-93 06 00 00 13 07 00 00 *................*
00002610: 93 07 00 00 93 08 d0 05-73 00 00 00 63 54 05 02 *........s...cT..*
00002620: 13 01 01 ff 23 24 81 00-13 04 05 00 23 26 11 00 *....#$......#&..*
00002630: 33 04 80 40 97 00 00 00-e7 80 40 01 23 20 85 00 *3..@......@.# ..*
00002640: 6f 00 00 00 6f 00 00 00-b7 87 00 00 03 a5 07 43 *o...o..........C*
00002650: 67 80 00 00 00 00 00 00-76 61 6c 3d 00 00 00 00 *g.......val=....*
00002660: 00 00 00 00 80 84 2e 41-1f 85 45 41 80 40 9a 44 *.......A..EA.@.D*
00002670: 4f 11 f3 c3 6e 8a 67 41-20 1b 00 00 20 1b 00 00 *O...n.gA ... ...*
00002680: 44 1b 00 00 14 1b 00 00-14 1b 00 00 04 1c 00 00 *D...............*
00002690: 44 1b 00 00 14 1b 00 00-04 1c 00 00 14 1b 00 00 *D...............*
000026a0: 44 1b 00 00 10 1b 00 00-10 1b 00 00 10 1b 00 00 *D...............*
000026b0: 04 1c 00 00 54 1f 00 00-54 1f 00 00 d4 1f 00 00 *....T...T.......*
000026c0: 4c 1f 00 00 4c 1f 00 00-34 20 00 00 d4 1f 00 00 *L...L...4 ......*
000026d0: 4c 1f 00 00 34 20 00 00-4c 1f 00 00 d4 1f 00 00 *L...4 ..L.......*
000026e0: 48 1f 00 00 48 1f 00 00-48 1f 00 00 34 20 00 00 *H...H...H...4 ..*
000026f0: 00 01 02 02 03 03 03 03-04 04 04 04 04 04 04 04 *................*