From 477b37fc8072e56ad7e4f4dc70abe75aad4b7ecd Mon Sep 17 00:00:00 2001 From: UnknownShadow200 Date: Thu, 16 Sep 2021 19:35:33 +1000 Subject: [PATCH] Add documentation for strings --- doc/readme.md | 4 +- doc/strings.md | 213 +++++++++++++++++++++++++++++++++++++++++++++++++ doc/style.md | 17 ++-- 3 files changed, 221 insertions(+), 13 deletions(-) create mode 100644 doc/strings.md diff --git a/doc/readme.md b/doc/readme.md index e0f9a7d59..43ec03706 100644 --- a/doc/readme.md +++ b/doc/readme.md @@ -5,6 +5,8 @@ This folder contains general information relating to the game's source code. |compile-fixes.md | Steps on how to fix some common compilation errors | |hosting-flask.md | Example website that hosts the web client using [Flask](https://flask.palletsprojects.com/)| |hosting-webclient.md | Explains how to integrate the web client into your own website | +|modules.md | Provides a summary about the modules that constitute the game's code| |plugin-dev.md | Explains how to compile a simple plugin for the game | |portability.md | Provides information about porting this game to other platforms | -|style.md | Explains the style guidelines that the source code generally follows | \ No newline at end of file +|style.md | Explains the style guidelines that the source code generally follows | +|strings.md | Provides details about the custom string type used by this game | \ No newline at end of file diff --git a/doc/strings.md b/doc/strings.md new file mode 100644 index 000000000..2567e070f --- /dev/null +++ b/doc/strings.md @@ -0,0 +1,213 @@ +## Introduction + +ClassiCube uses a custom string type rather than the standard C `char*` string in most places + +ClassiCube strings (`cc_string`) are a struct with the following fields: +- `buffer` -> Pointer to 8 bit characters (unsigned code page 437 indices) +- `length` -> Number of characters currently used +- `capacity` -> Maximum number of characters (i.e. buffer size) + +Note: This means **STRINGS MAY NOT BE NULL 
TERMINATED** (and are not in most cases) + +You should also read the **Strings** section in the [style guide](doc/style.md) + +## Memory management +Some general guidelines to keep in mind when it comes to `cc_string` strings: +- String buffers can be allocated on either the stack or heap
+(i.e. make sure you don't return strings that are using stack allocated buffers) +- Strings are fixed capacity (strings do not grow when length reaches capacity)
+(i.e. make sure you allocate a large enough buffer upfront) +- Strings are not garbage collected or reference counted
+(i.e. you are responsible for managing the lifetime of strings) + +## C String conversion + +### C string -> cc_string + +Creating a `cc_string` string from a C string is straightforward: + +#### From a constant C string +```C +void Example(void) { + cc_string str = String_FromConst("test"); +} +``` + +#### From a C string +```C +void Example(const char* c_str) { + cc_string str = String_FromReadonly(c_str); +} +``` +Note: `String_FromReadonly` can also be used with constant C strings, it's just a bit slower + +#### From a C fixed size string +```C +struct Something { int value; char name[50]; }; + +void Example(struct Something* some) { + cc_string str = String_FromRawArray(some->name); +} +``` + +### cc_string -> C string + +The `buffer` field **should not** be treated as a C string, because `cc_string` strings **MAY NOT BE NULL TERMINATED** + +The general way to achieve this is to +1. Initialise `capacity` with 1 less than actual buffer size (e.g. use `String_InitArray_NT` instead of `String_InitArray`) +2. Perform various operations on the `cc_string` string +3. Add null terminator to end (i.e. `buffer[length] = '\0';`) +4. Use `buffer` as a C string now + +For example: +```C +void PrintInt(int value) { + cc_string str; char strBuffer[128]; + String_InitArray_NT(str, strBuffer); + String_AppendInt(&str, value); + str.buffer[str.length] = '\0'; + puts(str.buffer); +} +``` + +## OS String conversion + +`cc_string` strings cannot be directly used as arguments for operating system functions and must be converted first. 
+ +The following functions are provided to convert `cc_string` strings into operating system specific encoded strings: + +### cc_string -> Windows string + +`Platform_EncodeUtf16` converts a `cc_string` into a null terminated `WCHAR` string + +#### Example +```C +void SetWorkingDir(cc_string* title) { + WCHAR buffer[NATIVE_STR_LEN]; + Platform_EncodeUtf16(buffer, title); + SetCurrentDirectoryW(buffer); +} +``` + +### cc_string -> Unix string + +`Platform_EncodeUtf8` converts a `cc_string` into a null terminated UTF8-encoded `char*` string + +#### Example +```C +void SetWorkingDir(cc_string* title) { + char buffer[NATIVE_STR_LEN]; + Platform_EncodeUtf8(buffer, title); + chdir(buffer); +} +``` + +## API + +I'm lazy so I will just link to [String.h](src/String.h) + +If you'd rather I provided a more detailed reference here, please let me know. + +# Extra details + +## C comparison + +A rough mapping of C string API to ClassiCube's string API: +``` +atof -> Convert_ParseFloat +strtof -> Convert_ParseFloat +atoi -> Convert_ParseInt +strtol -> Convert_ParseInt + +strcat -> String_AppendConst/String_AppendString +strcpy -> String_Copy +strtok -> String_UNSAFE_Split + +strlen -> str.length +strcmp -> String_Equals/String_Compare +strchr -> String_IndexOf +strrchr -> String_LastIndexOf +strstr -> String_IndexOfConst + +sprintf -> String_Format1/2/3/4 + %d -> %i + %04d -> %p4 + %i -> %i + %c -> %r + %.4f -> %f4 + %s -> %s (cc_string) + %s -> %c (char*) + %x -> %h +``` + +## C# comparison + +A rough mapping of C# string API to ClassiCube's string API: +``` +byte.Parse -> Convert_ParseUInt8 +ushort.Parse -> Convert_ParseUInt16 +float.Parse -> Convert_ParseFloat +int.Parse -> Convert_ParseInt +ulong.Parse -> Convert_ParseUInt64 +bool.Parse -> Convert_ParseBool + +a += "X"; -> String_AppendString +b = a; -> String_Copy +string.Insert -> String_InsertAt +string.Remove -> String_DeleteAt + +string.Substring -> String_UNSAFE_Substring/String_UNSAFE_SubstringAt +string.Split -> 
String_UNSAFE_Split/String_UNSAFE_SplitBy +string.TrimStart -> String_UNSAFE_TrimStart +string.TrimEnd -> String_UNSAFE_TrimEnd + +a.Length -> str.length +a == b -> String_Equals +string.Equals -> String_CaselessEquals (StringComparison.OrdinalIgnoreCase) +string.IndexOf -> String_IndexOf/String_IndexOfConst +string.LastIndexOf -> String_LastIndexOf +string.StartsWith -> String_CaselessStarts (StringComparison.OrdinalIgnoreCase) +string.EndsWith -> String_CaselessEnds (StringComparison.OrdinalIgnoreCase) +string.CompareTo -> String_Compare + +string.Format -> String_Format1/2/3/4 +``` +*Note: I modelled cc_string after C# strings, hence the similar function names* + +## C++ comparison + +A rough mapping of C++ std::string API to ClassiCube's string API: +``` +std::stof -> Convert_ParseFloat +std::stoi -> Convert_ParseInt +std::stoul -> Convert_ParseUInt64 + +string::append -> String_AppendString/String_AppendConst +b = a; -> String_Copy +string::insert -> String_InsertAt +string::erase -> String_DeleteAt + +string::substr -> String_UNSAFE_Substring/String_UNSAFE_SubstringAt +string.Split -> String_UNSAFE_Split/String_UNSAFE_SplitBy +string.TrimStart -> String_UNSAFE_TrimStart +string.TrimEnd -> String_UNSAFE_TrimEnd + +a.length() -> str.length +a == b -> String_Equals +string.Equals -> String_CaselessEquals (StringComparison.OrdinalIgnoreCase) +string::find -> String_IndexOf/String_IndexOfConst +string::rfind -> String_LastIndexOf +string::compare -> String_Compare + +std::sprintf -> String_Format1/2/3/4 +``` + + +## Lifetime examples + +Stack allocated returning example + +Mem_Alloc/Mem_Free and function example + +UNSAFE and mutating characters example \ No newline at end of file diff --git a/doc/style.md b/doc/style.md index 4cc7da753..5d27a2c20 100644 --- a/doc/style.md +++ b/doc/style.md @@ -14,27 +14,20 @@ I may not have defined the appropriate types for your compiler, so you may need to modify ```Core.h``` ### Strings -Strings are one of the most troublesome 
aspects of C. In this software, strings consist of: -- Pointer to 8 bit characters (unsigned code page 437 indices) -- Number of characters currently used (length) -- Maximum number of characters / buffer size (capacity) -Although this makes substrings / concatenating very fast, it also means -**STRINGS ARE NOT NULL TERMINATED** (and are not in most cases). - -Thus, when using or implementing a per-platform API, you must null-terminate and convert characters to native encoding. You should implement the ```Platform_ConvertString``` function and use that. +A custom string type (`cc_string`) is used rather than `char*` strings in most places (see [strings](doc/strings.md) page for more details) *Note: Several functions will take raw ```char*``` for performance, but this is not encouraged* #### String arguments String arguments are annotated to indicate storage and readonly requirements. These are: -- ```const String*``` - String is not modified at all -- ```String*``` - Characters in string may be modified -- ```STRING_REF``` - Macro annotation indicating a **reference is kept to characters** +- ```const cc_string*``` - String is not modified at all +- ```cc_string*``` - Characters in string may be modified +- ```STRING_REF``` - Macro annotation indicating a **reference is kept to the characters** To make it extra clear, functions with ```STRING_REF``` arguments usually also have ```_UNSAFE_``` as part of their name. -For example, consider the function ```String Substring_UNSAFE(STRING_REF const String* str, length)``` +For example, consider the function ```cc_string Substring_UNSAFE(STRING_REF const cc_string* str, length)``` The *input string* is not modified at all. However, the characters of the *returned string* points to the characters of the *input string*, so modifying the characters in the *input string* also modifies the *returned string*.