From 62ae79c007b788f6aa065216d0aafb4f4fbf8ad2 Mon Sep 17 00:00:00 2001
From: John Winans
Date: Thu, 17 May 2018 22:51:25 -0500
Subject: [PATCH] Reorg and prepare to relocate memory details into binary chapter.

---
 book/binary/chapter.tex | 154 +++++++++++++++++++++++++++++-----
 book/binary/rvddt_memdump.out | 17 ++++
 2 files changed, 148 insertions(+), 23 deletions(-)
 create mode 100644 book/binary/rvddt_memdump.out

diff --git a/book/binary/chapter.tex b/book/binary/chapter.tex
index 28f745b..eaab021 100644
--- a/book/binary/chapter.tex
+++ b/book/binary/chapter.tex
@@ -3,32 +3,43 @@
 This chapter discusses how data are represented and stored in a computer.
+In the context of computing, {\em boolean} refers to a condition that can
+be either true or false and {\em binary} refers to the use of a base-2
+numeric system to represent numbers.
+
+RISC-V assembly language uses binary to represent all values, be they
+boolean or numeric. It is the context within which they are used that
+determines whether they are boolean or numeric.
+
+RISC-V assembly language uses zero to represent {\em false} and one
+to represent {\em true}. In general, however, it is useful to relax
+this and define zero {\bf and only zero} to be {\em false} and anything
+that is not {\em false} is therefore {\em true}.%
+\footnote{This is how {\em true} and {\em false} behave in C, C++, and
+many other languages as well as the common assembly language idioms
+discussed in this text.}
+
+\enote{Add some diagrams here showing bits, bytes and the MSB,
+LSB,\ldots\ perhaps relocated from the RV32I chapter?}%
+The reason for this relaxation is that, while a single binary digit
+(\gls{bit}) can represent the two values zero and one, the vast majority
+of the time data is processed by the CPU in groups of bits. These
+groups have names like \gls{byte}, \gls{halfword} and \gls{fullword}.
+
+
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%\section{Context}
-%
-%Numbers can be interpreted differently depending on the context in
-%which they are used. For example a number may represent the quantity
-%of millimeters between two points. It may enumerate a
-%a letter of the alphabet -- ie. $01000001=A$, $01000010=B$,
-%$01000011=C$\ldots\ In fact, any finite set of items can be identified
-%(enumerated) by a assigning a code number to each element in this fashon.
+\section{Boolean Functions}
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\section{Logical/Boolean Functions}
-
-\enote{This is unclear. Need to define bit positions and probably
-should add basic truth table diagrams.}%
-Unlike addition and subtraction, boolean functions apply
-on a per-bit basis.
+\enote{Probably should add basic truth table diagrams.}%
+Boolean functions apply on a per-bit basis.
 %in that they do not impact neighboring bits.
 %by generating things like a carry or a borrow.
 When applied to multi-bit values, each bit position is operated upon independently of the other bits.
-\enote{Need to define 1 as true and 0 as false somewhere.}
+
+
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -534,38 +545,123 @@ Discuss the details of truncation and overflow here.
 {\em truncation} and {\em overflow} as occur with signed and unsigned addition and subtraction.}
+I prefer to define {\em truncation} as the loss of data as a result of
+the bit-length of the destination being too small to hold the result of an
+operation and {\em overflow} as when the carry into a sign bit is not
+the same as the carry out of the sign bit.
+
+Where addition and subtraction on the RV32 are concerned, the sum or difference of
+two unsigned 32-bit numbers will be {\em truncated} when the operation results in
+a carry out of bit 31. Unsigned operations cannot overflow (as defined above).
+
+(show a truncation picture here)
+
+An overflow occurs with signed numbers when the two addends are positive and
+the sum is negative or the addends are both negative and the sum is positive.
+
+(show an overflow picture here)
+
+(show mixed overflow and truncation situations here to drive home the need
+to ignore truncation when dealing with signed numbers.)
+
+0xffffffff + 0x00000002 has truncation but not overflow
+(OK for signed, not OK for unsigned).
+
+0xffffffff + 0xfffffffe also has truncation but not overflow.
+
+0x40000000 + 0x40000000 has overflow but not truncation. (We care if these are signed numbers.)
+
+0x80000000 + 0x80000000 has both overflow and truncation. (We care regardless of signedness.)
+
+Where subtraction is concerned, the notion of a borrow is the same as that of a carry.
+\enote{I think that overloading the word {\em overflow} like this can be
+confusing to new programmers.}%
+Page 13 of~\cite{rvismv1v22:2017} mixes these two notions
+(and never mentions the word {\em truncate}) like this:
+\begin{quote}
+We did not include special instruction set support for overflow checks on
+integer arithmetic operations in the base instruction set, as many overflow
+checks can be cheaply implemented using RISC-V branches. Overflow checking for
+unsigned addition requires only a single additional branch instruction after the
+addition: \verb@add t0, t1, t2; bltu t0, t1, overflow@.
+
+For signed addition, if one operand's sign is known, overflow checking requires
+only a single branch after the addition:
+\verb@addi t0, t1, +imm; blt t0, t1, overflow@. This covers the common
+case of addition with an immediate operand.
+
+For general signed addition, three additional instructions after the addition
+are required, leveraging the observation that the sum should be less than one
+of the operands if and only if the other operand is negative.
+\begin{verbatim}
+add t0, t1, t2
+slti t3, t2, 0
+slt t4, t0, t1
+bne t3, t4, overflow
+\end{verbatim}
+In RV64, checks of 32-bit signed additions can be optimized further by comparing
+the results of ADD and ADDW on the operands.
+\end{quote}
+
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Sign and Zero Extension}
+
+\enote{Refactor the sx() and zx() discussion in the RV32I chapter
+and locate the details here.}%
+Seems like a good place to discuss extension.
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\section{Shifting}
+
+Seems like a good place to discuss logical and arithmetic shifting.
+
+shift left logical
+
+shift right logical
+
+shift right arithmetic
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{Main Memory Storage}
-\enote{Refactor this section and the memory discussion in RV32 reference chapter}%
+\enote{Consider refactoring the memory discussion in RV32 reference chapter
+and placing some of it in this section.}%
 When transferring data between its registers registers and main memory a
 RISC-V system uses the little-endian byte order.\footnote{
 See\cite{IEN137} for some history of the big/little-endian ``controversy.''}
-\enote{Discuss byte ordering, addressing and character strings.}
-
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \subsection{Memory Dump}
 Introduce the memory dump and how to read them here.
-Discuss the pitfalls of assuming what a set of bytes is used for based
-on their contents!
+\listing{rvddt_memdump.out}{{\tt rvddt} memory dump}
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \subsection{Big Endian Representation}
 Using the memory dump contents in prior section, discuss how big endian values are stored.
+
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \subsection{Little Endian Representation}
 Using the memory dump contents in prior section, discuss how little endian values are stored.
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \subsection{Character Strings and Arrays}
 Define character strings and arrays.
@@ -573,11 +669,23 @@ Define character strings and arrays.
 Using the prior memory dump, discuss how and where things are stored and retrieved.
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+\subsection{Context is Important!}
+
+Data values can be interpreted differently depending on the context in
+which they are used. Assuming what a set of bytes is used for based on
+their contents can be very misleading! For example, there is a 0x76 at
+address 0x00002658. This is a `v' if you use it as an ASCII
+(see~\autoref{chapter:ascii}) character, a $118_{10}$ if it is an integer
+value and TRUE if it is a conditional.
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \subsection{Alignment}
 Draw a diagram showing the overlapping data types when they are all aligned.
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \subsection{Instruction Alignment}
 \enote{Rewrite this section for data rather than instructions and then
diff --git a/book/binary/rvddt_memdump.out b/book/binary/rvddt_memdump.out
new file mode 100644
index 0000000..47b2334
--- /dev/null
+++ b/book/binary/rvddt_memdump.out
@@ -0,0 +1,17 @@
+ddt> d 0x00002600
+ 00002600: 93 05 00 00 13 06 00 00-93 06 00 00 13 07 00 00 *................*
+ 00002610: 93 07 00 00 93 08 d0 05-73 00 00 00 63 54 05 02 *........s...cT..*
+ 00002620: 13 01 01 ff 23 24 81 00-13 04 05 00 23 26 11 00 *....#$......#&..*
+ 00002630: 33 04 80 40 97 00 00 00-e7 80 40 01 23 20 85 00 *3..@......@.# ..*
+ 00002640: 6f 00 00 00 6f 00 00 00-b7 87 00 00 03 a5 07 43 *o...o..........C*
+ 00002650: 67 80 00 00 00 00 00 00-76 61 6c 3d 00 00 00 00 *g.......val=....*
+ 00002660: 00 00 00 00 80 84 2e 41-1f 85 45 41 80 40 9a 44 *.......A..EA.@.D*
+ 00002670: 4f 11 f3 c3 6e 8a 67 41-20 1b 00 00 20 1b 00 00 *O...n.gA ... ...*
+ 00002680: 44 1b 00 00 14 1b 00 00-14 1b 00 00 04 1c 00 00 *D...............*
+ 00002690: 44 1b 00 00 14 1b 00 00-04 1c 00 00 14 1b 00 00 *D...............*
+ 000026a0: 44 1b 00 00 10 1b 00 00-10 1b 00 00 10 1b 00 00 *D...............*
+ 000026b0: 04 1c 00 00 54 1f 00 00-54 1f 00 00 d4 1f 00 00 *....T...T.......*
+ 000026c0: 4c 1f 00 00 4c 1f 00 00-34 20 00 00 d4 1f 00 00 *L...L...4 ......*
+ 000026d0: 4c 1f 00 00 34 20 00 00-4c 1f 00 00 d4 1f 00 00 *L...4 ..L.......*
+ 000026e0: 48 1f 00 00 48 1f 00 00-48 1f 00 00 34 20 00 00 *H...H...H...4 ..*
+ 000026f0: 00 01 02 02 03 03 03 03-04 04 04 04 04 04 04 04 *................*