class RED::String
Unicode string class.
#include <REDString.h>
Inherits: Object.
Inherited by: ShaderString.
Public functions:
String ( char iChar ) | |
String ( ) | |
String ( const wchar_t * iInputString ) | |
String ( unsigned int iLength ) | |
String ( const RED::String & iInputString ) | |
String ( const char * iInputString, unsigned int iLength = 0 ) | |
virtual | ~String ( ) |
RED::String & | Add ( const RED::String & iAddedString ) |
RED::String & | AddAscii ( const char * iAscii, int iLength ) |
RED::String & | Arg ( const RED::String & iArg ) |
RED::String & | Arg ( int iArg ) |
RED::String & | Arg ( RED::int64 iArg ) |
RED::String & | Arg ( unsigned int iArg ) |
RED::String & | Arg ( RED::uint64 iArg ) |
RED::String & | Arg ( float iArg ) |
RED::String & | Arg ( double iArg ) |
virtual const void * | As ( const RED::CID & iCID ) const |
template< class T_As > const T_As * | As ( ) const |
virtual void * | As ( const RED::CID & iCID ) |
template< class T_As > T_As * | As ( ) |
const char * | Buffer ( ) const |
void | Clear ( ) |
int | Compare ( const RED::String & iOther, int iCount = -1 ) const |
int | CompareNoCase ( const RED::String & iOther, int iCount = -1 ) const |
int | Find ( const RED::String & iSearchedString, int iOffset = 0 ) const |
char * | GetChar ( int iIndex, char * iPreviousChar = NULL ) const |
int | GetCharBytes ( const char * iInputUTF8Char ) const |
unsigned int | GetIDFromString ( ) const |
int | GetStringBytes ( ) const |
int | IndexOf ( const RED::String & iChar, int iOffset = 0 ) const |
bool | IsEmpty ( ) const |
int | LastIndexOf ( const RED::String & iChar, int iOffset = 0 ) const |
RED::String | Left ( int iPosition ) const |
int | Length ( ) const |
unsigned int | MemorySize ( ) const |
RED::String | Mid ( int iPosition, int iLength = -1 ) const |
operator wchar_t * ( ) const | |
int | operator!= ( const RED::String & iTestedString ) const |
RED::String | operator+ ( const RED::String & iString ) const |
RED_RC | operator+= ( const RED::String & iAddedString ) |
bool | operator< ( const RED::String & iOther ) const |
RED_RC | operator= ( const RED::String & iInputString ) |
RED_RC | operator= ( const char * iInputString ) |
RED_RC | operator= ( const wchar_t * iInputString ) |
int | operator== ( const RED::String & iTestedString ) const |
bool | operator> ( const RED::String & iOther ) const |
RED_RC | Replace ( const RED::String & iString1, const RED::String & iString2, int iOffset = 0 ) |
void | Replace ( char iChar1, char iChar2, int iOffset = 0 ) |
RED::String | Right ( int iPosition ) const |
void | SetChar ( int iPosition, char iChar ) |
RED_RC | SetUTF8Buffer ( const char * iUTF8Buffer ) |
unsigned int | ToID ( ) const |
unsigned short * | ToUCS2 ( ) const |
unsigned int * | ToUCS4 ( ) const |
Public static functions:
static RED::CID | GetClassID ( ) |
Protected functions:
RED_RC | FindTemplates ( RED::Vector< RED::int64 > & oTemplates ) |
Protected variables:
int | _size |
char * | _string |
Detailed description:
Unicode string class.
The String class encapsulates all the usual string functions in a Unicode version. Unicode allows strings in any language to be encoded. The string internally uses the UTF-8 encoding.
UTF-8 stands for "UCS Transformation Format". A String can be accessed as a UTF-8 string, in the UCS-2 Unicode format, or through (char *) pointers, in which case the raw byte array holding the string contents is returned.
Depending on the Unicode value (left column), the UTF-8 encoded value is given in the right column of the following table:
U-00000000 - U-0000007F: | 0xxxxxxx |
U-00000080 - U-000007FF: | 110xxxxx 10xxxxxx |
U-00000800 - U-0000FFFF: | 1110xxxx 10xxxxxx 10xxxxxx |
U-00010000 - U-001FFFFF: | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |
U-00200000 - U-03FFFFFF: | 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
U-04000000 - U-7FFFFFFF: | 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
"xxxx" represents the bits of the value to be encoded in the UTF-8 string. Bits are filled from right to left. For example, the value 0xE1 (11100001 in binary; 'á' in Unicode, and the beta-like sign in some extended Ascii code pages) lies outside the 0-127 Ascii range and is represented as 110xxxxx 10xxxxxx => 11000011 10100001.
A key advantage of UTF-8 is that standard Ascii strings using western characters (Unicode 0-127) are stored "as is" in the string, on a single byte per character, without any further encoding.
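The bit layout in the table above can be sketched as a small standalone function. This is only an illustration under stated assumptions: EncodeUtf8 is a hypothetical helper, not part of RED::String, and it covers only the first four rows of the table.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not a RED::String method): encodes one code point
// using the first four rows of the UTF-8 table above.
std::string EncodeUtf8( std::uint32_t cp )
{
    std::string out;
    if( cp < 0x80 )
    {
        // 0xxxxxxx: plain Ascii, stored as is.
        out += static_cast< char >( cp );
    }
    else if( cp < 0x800 )
    {
        // 110xxxxx 10xxxxxx
        out += static_cast< char >( 0xC0 | ( cp >> 6 ) );
        out += static_cast< char >( 0x80 | ( cp & 0x3F ) );
    }
    else if( cp < 0x10000 )
    {
        // 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast< char >( 0xE0 | ( cp >> 12 ) );
        out += static_cast< char >( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
        out += static_cast< char >( 0x80 | ( cp & 0x3F ) );
    }
    else
    {
        // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast< char >( 0xF0 | ( ( cp >> 18 ) & 0x07 ) );
        out += static_cast< char >( 0x80 | ( ( cp >> 12 ) & 0x3F ) );
        out += static_cast< char >( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
        out += static_cast< char >( 0x80 | ( cp & 0x3F ) );
    }
    return out;
}
```

Calling EncodeUtf8( 0xE1 ) yields the two bytes 0xC3 0xA1, which matches the 11000011 10100001 pattern given in the text.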
Functions documentation
public RED::String::String | ( | char | iChar | ) |
String construction method from a single Ascii character.
Parameters:
iChar: | Input Ascii character. |
public RED::String::String | ( | ) |
String construction method.
Builds an empty string. The string exists and is '\0' terminated (using a terminating '\0' byte, which is the valid UTF-8 string termination character).
public RED::String::String | ( | const wchar_t * | iInputString | ) |
String construction from a wchar_t buffer.
The input buffer must be zero terminated.
Parameters:
iInputString: | Source string to copy in this. |
public RED::String::String | ( | unsigned int | iLength | ) |
String construction filled with 'iLength' spaces.
Creates a string with an allocated buffer of 'iLength' bytes plus one for the termination character.
Parameters:
iLength: | The number of spaces to set into the string. |
public RED::String::String | ( | const RED::String & | iInputString | ) |
String copy construction method.
This method duplicates iInputString contents inside This.
Parameters:
iInputString: | Input string object to copy in This. |
public RED::String::String | ( | const char * | iInputString, |
unsigned int | iLength = 0 | ||
) |
String construction method from an Ascii formatted character string.
If iLength is 0, a terminal zero is inserted marking the end of the buffer; otherwise, iLength bytes are read from the buffer and a terminal zero is added.
Parameters:
iInputString: | Input Ascii formatted string. Must be '\0' terminated. |
iLength: | Optional length of the input buffer. |
public virtual RED::String::~String | ( | ) |
String destruction method.
This method releases the stored string content.
public static RED::CID RED::String::GetClassID | ( | ) |
Reimplements: RED::Object::GetClassID.
Reimplemented by: RED::ShaderString::GetClassID.
public RED::String & RED::String::Add | ( | const RED::String & | iAddedString | ) |
Concatenates the provided string with the object content. '\0' terminated result.
This method concatenates iAddedString at the end of this. The resulting string is '\0' terminated.
Parameters:
iAddedString: | String being concatenated to This. |
Returns:
public RED::String & RED::String::AddAscii | ( | const char * | iAscii, |
int | iLength | ||
) |
Concatenates the provided ASCII string with the object contents. '\0' terminated result.
This method concatenates iAscii at the end of this. The resulting string is '\0' terminated.
Parameters:
iAscii: | String being concatenated to this. |
iLength: | Number of bytes to concatenate in iAscii. Does not include the termination character. The method does nothing if iLength is <= 0. |
Returns:
public RED::String & RED::String::Arg | ( | const RED::String & | iArg | ) |
Replaces the lowest-numbered occurrence of %1, %2, ... %n with the provided string.
Example:
RED::String( "Red %1 Development %2" ).Arg( "Software" ).Arg( "Kit" );
will return
"Red Software Development Kit"
Parameters:
iArg: | the string to insert. |
Returns:
public RED::String & RED::String::Arg | ( | int | iArg | ) |
Replaces the lowest-numbered occurrence of %1, %2, ... %n with the provided integer.
Example:
RED::String( "1 = %1 - %2" ).Arg( 4 ).Arg( 3 );
will return
"1 = 4 - 3"
Parameters:
iArg: | the integer to insert. |
Returns:
public RED::String & RED::String::Arg | ( | RED::int64 | iArg | ) |
Replaces the lowest-numbered occurrence of %1, %2, ... %n with the provided 64-bit integer.
Example:
RED::String( "1 = %1 - %2" ).Arg( 4 ).Arg( 3 );
will return
"1 = 4 - 3"
Parameters:
iArg: | the integer to insert. |
Returns:
public RED::String & RED::String::Arg | ( | unsigned int | iArg | ) |
Replaces the lowest-numbered occurrence of %1, %2, ... %n with the provided unsigned integer.
Example:
RED::String( "1 = %1 - %2" ).Arg( 4 ).Arg( 3 );
will return
"1 = 4 - 3"
Parameters:
iArg: | the unsigned integer to insert. |
Returns:
public RED::String & RED::String::Arg | ( | RED::uint64 | iArg | ) |
Replaces the lowest-numbered occurrence of %1, %2, ... %n with the provided 64-bit unsigned integer.
Example:
RED::String( "1 = %1 - %2" ).Arg( 4 ).Arg( 3 );
will return
"1 = 4 - 3"
Parameters:
iArg: | the unsigned integer to insert. |
Returns:
public RED::String & RED::String::Arg | ( | float | iArg | ) |
Replaces the lowest-numbered occurrence of %1, %2, ... %n with the provided float.
Example:
RED::String( "Pi = %1" ).Arg( 3.1415926536f );
will return
"Pi = 3.1415926536"
Parameters:
iArg: | the float to insert. |
Returns:
public RED::String & RED::String::Arg | ( | double | iArg | ) |
Replaces the lowest-numbered occurrence of %1, %2, ... %n with the provided double.
Example:
RED::String( "Pi = %1" ).Arg( 3.1415926536 );
will return
"Pi = 3.1415926536"
Parameters:
iArg: | the double to insert. |
Returns:
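The substitution rule shared by the Arg overloads can be sketched with std::string. This is a minimal illustration, not the library's implementation: ReplaceLowestPlaceholder is a hypothetical name, and the sketch assumes that every occurrence of the lowest-numbered template is replaced in a single call.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not a RED::String method): replaces every "%n"
// template carrying the lowest number n found in 'text' with 'value'.
std::string ReplaceLowestPlaceholder( const std::string& text, const std::string& value )
{
    auto isDigit = []( char c ) { return std::isdigit( static_cast< unsigned char >( c ) ) != 0; };

    // Pass 1: find the lowest template number present in the text.
    long lowest = -1;
    for( std::size_t i = 0; i < text.size(); ++i )
    {
        if( text[i] != '%' )
            continue;
        std::size_t j = i + 1;
        long n = 0;
        while( j < text.size() && isDigit( text[j] ) )
            n = n * 10 + ( text[j++] - '0' );
        if( j > i + 1 && ( lowest < 0 || n < lowest ) )
            lowest = n;
    }
    if( lowest < 0 )
        return text; // No template found: the text is returned unchanged.

    // Pass 2: rebuild the text, substituting only the lowest-numbered template.
    std::string out;
    for( std::size_t i = 0; i < text.size(); )
    {
        std::size_t j = i + 1;
        long n = 0;
        if( text[i] == '%' )
        {
            while( j < text.size() && isDigit( text[j] ) )
                n = n * 10 + ( text[j++] - '0' );
        }
        if( text[i] == '%' && j > i + 1 && n == lowest )
            out += value;
        else
            out.append( text, i, j - i );
        i = j;
    }
    return out;
}
```

Applying the helper twice to "Red %1 Development %2", first with "Software" and then with "Kit", reproduces the chained example from the documentation.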
public virtual const void * RED::String::As | ( | const RED::CID & | iCID | ) const |
Converts the object to an instance of the given type.
Parameters:
iCID: | Requested class. |
Returns:
Reimplements: RED::Object::As.
Reimplemented by: RED::ShaderString::As.
template< class T_As > public const T_As * RED::String::As | ( | ) const |
Template version of the const As method.
Simply set T_As to the class you want to retrieve an interface to.
Returns:
Reimplements: RED::Object::As.
Reimplemented by: RED::ShaderString::As.
public virtual void * RED::String::As | ( | const RED::CID & | iCID | ) |
Converts the object to an instance of the given type.
Parameters:
iCID: | Requested class. |
Returns:
Reimplements: RED::Object::As.
Reimplemented by: RED::ShaderString::As.
template< class T_As > public T_As * RED::String::As | ( | ) |
Template version of the As method.
Simply set T_As to the class you want to retrieve an interface to.
Returns:
Reimplements: RED::Object::As.
Reimplemented by: RED::ShaderString::As.
public const char * RED::String::Buffer | ( | ) const |
Returns the untranslated string buffer.
This method returns the untranslated content of the string. The encoding of the returned string is UTF-8.
Returns:
public void RED::String::Clear | ( | ) |
Clears the string content and resets it to a UTF-8 '\0' termination sequence.
This method clears the string content and sets a '\0' termination character.
public int RED::String::Compare | ( | const RED::String & | iOther, |
int | iCount = -1 | ||
) | const |
Case sensitive comparison of two strings.
Parameters:
iOther: | String to compare with. |
iCount: | Optional number of characters to compare (Default is -1 for the whole string). |
Returns:
0 if both strings are equal,
-1 if the first string is less than the second,
+1 if the first string is greater than the second.
public int RED::String::CompareNoCase | ( | const RED::String & | iOther, |
int | iCount = -1 | ||
) | const |
Case insensitive comparison of two strings.
Parameters:
iOther: | String to compare with. |
iCount: | Optional number of characters to compare (Default is -1 for the whole string). |
Returns:
0 if both strings are equal,
-1 if the first string is less than the second,
+1 if the first string is greater than the second.
public int RED::String::Find | ( | const RED::String & | iSearchedString, |
int | iOffset = 0 | ||
) | const |
Looks for the first occurrence of a string.
Parameters:
iSearchedString: | String to look for. |
iOffset: | Offset (in characters) to start the search from (default is 0). |
Returns:
The position of the first occurrence if the string is found,
-1 otherwise.
public char * RED::String::GetChar | ( | int | iIndex, |
char * | iPreviousChar = NULL | ||
) | const |
Gets a pointer to the leading byte of the n-th string UTF-8 encoded character.
Parameters:
iIndex: | Index of the character to access. |
iPreviousChar: | The position of the previous character in the string. If the provided value is not the previous character of the string, the routine may produce wrong results. |
Returns:
public int RED::String::GetCharBytes | ( | const char * | iInputUTF8Char | ) const |
Returns the number of bytes a UTF-8 encoded character is made of.
This method returns the number of bytes composing the UTF-8 character iInputUTF8Char is made of. iInputUTF8Char points to the leading byte of the character.
Parameters:
iInputUTF8Char: | UTF-8 encoded character for which we want to know the number of bytes it's encoded with. |
Returns:
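Since the byte count of a UTF-8 character is fully determined by its leading byte, the idea behind GetCharBytes can be sketched as follows. Utf8CharBytes is a hypothetical standalone helper that follows the encoding table from the detailed description; returning 0 for a byte that cannot start a character is an assumption of this sketch, not documented behavior.

```cpp
// Hypothetical helper (not a RED::String method): returns the number of
// bytes announced by a UTF-8 leading byte, following the encoding table.
// Returns 0 when the byte cannot start a character (for example a
// 10xxxxxx continuation byte); this convention is an assumption.
int Utf8CharBytes( const char* leadingByte )
{
    const unsigned char c = static_cast< unsigned char >( *leadingByte );
    if( c < 0x80 )            return 1; // 0xxxxxxx
    if( ( c & 0xE0 ) == 0xC0 ) return 2; // 110xxxxx
    if( ( c & 0xF0 ) == 0xE0 ) return 3; // 1110xxxx
    if( ( c & 0xF8 ) == 0xF0 ) return 4; // 11110xxx
    if( ( c & 0xFC ) == 0xF8 ) return 5; // 111110xx
    if( ( c & 0xFE ) == 0xFC ) return 6; // 1111110x
    return 0;
}
```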
public unsigned int RED::String::GetIDFromString | ( | ) const |
Gets a unique id from a string.
Converts the contents of this into a unique id that is returned by the call. This id can be used as the identifier on all REDObjects.
Returns:
public int RED::String::GetStringBytes | ( | ) const |
Returns the number of bytes composing the provided string.
This method returns the number of bytes composing the UTF-8 encoded string buffer. The count runs until a '\0' termination character is found.
Returns:
public int RED::String::IndexOf | ( | const RED::String & | iChar, |
int | iOffset = 0 | ||
) | const |
Gets the first occurrence of a character.
Parameters:
iChar: | Character to look for. |
iOffset: | Optional offset to start the search from. |
Returns:
The position of the first occurrence if the character is found,
-1 otherwise.
public bool RED::String::IsEmpty | ( | ) const |
Returns true if the string is empty.
Returns:
public int RED::String::LastIndexOf | ( | const RED::String & | iChar, |
int | iOffset = 0 | ||
) | const |
Gets the first occurrence of a character from the end of the string.
Parameters:
iChar: | Character to look for. |
iOffset: | Optional offset that should be negative. It is added to the position of the last character in the string to compute the search starting offset. |
Returns:
The position of the last occurrence if the character is found,
-1 otherwise.
public RED::String RED::String::Left | ( | int | iPosition | ) const |
Gets the substring made of the leftmost characters of the string.
This method returns a string built starting from the first string character and terminated by the character before the one indicated by iPosition.
If iPosition is negative, the method returns the full string. If iPosition is greater than or equal to the string length, the full string is returned.
Parameters:
iPosition: | Position of the first character not included in the sub-string. |
Returns:
public int RED::String::Length | ( | ) const |
Counts the number of characters.
Returns:
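Because the string is stored as UTF-8, the number of characters and the number of bytes differ as soon as a non-Ascii character is present. The sketch below illustrates the distinction with two hypothetical standalone helpers; whether the real GetStringBytes counts the '\0' terminator is not stated in this page, so Utf8Bytes simply excludes it.

```cpp
#include <cstring>

// Hypothetical helper: counts characters by counting every byte that is
// not a 10xxxxxx continuation byte. Assumes well-formed, '\0' terminated UTF-8.
int Utf8Length( const char* utf8 )
{
    int count = 0;
    for( ; *utf8 != '\0'; ++utf8 )
    {
        if( ( static_cast< unsigned char >( *utf8 ) & 0xC0 ) != 0x80 )
            ++count;
    }
    return count;
}

// Hypothetical helper: counts bytes up to, and excluding, the '\0' terminator.
int Utf8Bytes( const char* utf8 )
{
    return static_cast< int >( std::strlen( utf8 ) );
}
```

For the UTF-8 buffer 'h', 0xC3, 0xA1, 't', Utf8Length reports 3 characters while Utf8Bytes reports 4 bytes.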
public unsigned int RED::String::MemorySize | ( | ) const |
Returns the memory size of the internal string buffer.
Returns:
public RED::String RED::String::Mid | ( | int | iPosition, |
int | iLength = -1 | ||
) | const |
Extracts a substring of a given length from a given position.
This method returns the substring of length iLength that starts at position iPosition in the string.
If iPosition is negative, the full string is returned.
If iPosition is greater than or equal to the string length, an empty string is returned.
If iPosition + iLength is greater than the string length, the returned string ends at the string's end.
Parameters:
iPosition: | Position to start extraction from. |
iLength: | Length of the extracted string; if iLength is < 0 or bigger than the count of characters left from iPosition, the full sub-string starting at iPosition is returned. |
Returns:
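The boundary rules listed for Mid can be summarized in a short sketch. It models the string as a std::u32string, one element per character, so that positions are character positions; MidSketch is a hypothetical name and this is not the library's code.

```cpp
#include <string>

// Hypothetical model of the documented Mid rules, using one array
// element per character instead of a UTF-8 buffer.
std::u32string MidSketch( const std::u32string& s, int iPosition, int iLength = -1 )
{
    const int length = static_cast< int >( s.size() );

    if( iPosition < 0 )
        return s;                  // Negative position: full string.
    if( iPosition >= length )
        return std::u32string();   // Position past the end: empty string.

    // A negative or too large iLength means "up to the string's end".
    const int remaining = length - iPosition;
    if( iLength < 0 || iLength > remaining )
        iLength = remaining;

    return s.substr( static_cast< std::size_t >( iPosition ),
                     static_cast< std::size_t >( iLength ) );
}
```

With the input U"abcdef", MidSketch( s, 2, 3 ) gives U"cde" and MidSketch( s, 4, 10 ) is clamped to U"ef".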
public RED::String::operator wchar_t * | ( | ) const |
Converts the string to native wchar_t format.
This method converts the content of the string to UCS-2 on Windows and to UCS-4 on Linux and Mac.
A termination character is added at the string's end.
The buffer in which the string is dumped is dynamically allocated by the call using an internal 'rmalloc'. The caller must ensure that the created buffer is released using 'rfree' after usage.
Returns:
public int RED::String::operator!= | ( | const RED::String & | iTestedString | ) const |
Tests whether This differs from the right-hand operand string.
This operator checks at the byte level that the string contents are different. If they are not, the operator returns 0, and if they are different, it returns 1.
Parameters:
iTestedString: | Right-hand operand string compared with this. |
public RED::String RED::String::operator+ | ( | const RED::String & | iString | ) const |
Gets the concatenation of two strings.
Parameters:
iString: | String to be concatenated to this. |
Returns:
public RED_RC RED::String::operator+= | ( | const RED::String & | iAddedString | ) |
Concatenates two strings.
Same as method RED::String::Add.
Parameters:
iAddedString: | Reference to the string to concatenate to this. |
Returns:
RED_ALLOC_FAILURE if an internal memory allocation has failed.
public bool RED::String::operator< | ( | const RED::String & | iOther | ) const |
Sorting operator.
Compares two strings and returns true if the first is "less" than the second. "Less" means shorter and/or using characters with Ascii codes lower than those of the other string.
Returns:
public RED_RC RED::String::operator= | ( | const RED::String & | iInputString | ) |
Assigns the string content from the right-hand operand of '='.
This method replaces any old string content with the one provided as right hand argument to the '=' operator in the assignment.
Parameters:
iInputString: | String to set equal to This. |
Returns:
public RED_RC RED::String::operator= | ( | const char * | iInputString | ) |
Assigns the string content from the UTF8 string given as right-hand operand of '='.
This method replaces any old string content with the one provided as right hand argument to the '=' operator in the assignment.
Parameters:
iInputString: | String to set equal to This. |
Returns:
public RED_RC RED::String::operator= | ( | const wchar_t * | iInputString | ) |
Assigns the string content from the UCS2 string given as right-hand operand of '='.
This method replaces any old string content with the one provided as right hand argument to the '=' operator in the assignment.
Parameters:
iInputString: | String to set equal to This. |
Returns:
public int RED::String::operator== | ( | const RED::String & | iTestedString | ) const |
Tests equality of This with the right-hand operand string.
This operator checks at the byte level that the string contents are identical. If they are not, the operator returns 0, and if they are identical, it returns 1.
Parameters:
iTestedString: | Right-hand operand string compared with this. |
public bool RED::String::operator> | ( | const RED::String & | iOther | ) const |
Sorting operator.
Compares two strings and returns true if the first is "more" than the second. "More" means longer and/or using characters with Ascii codes higher than those of the other string.
Returns:
public RED_RC RED::String::Replace | ( | const RED::String & | iString1, |
const RED::String & | iString2, | ||
int | iOffset = 0 | ||
) |
Replaces all occurrences of a string by another.
Parameters:
iString1: | String to look for. |
iString2: | String to replace with. |
iOffset: | Optional offset to start from (default is 0). |
public void RED::String::Replace | ( | char | iChar1, |
char | iChar2, | ||
int | iOffset = 0 | ||
) |
Replaces all occurrences of an Ascii character by another.
Parameters:
iChar1: | Character to look for. |
iChar2: | Character to replace with. |
iOffset: | Optional offset to start from (default is 0). |
public RED::String RED::String::Right | ( | int | iPosition | ) const |
Gets the substring made of the rightmost characters of the string.
This method returns a string built starting from the character indicated by iPosition and ends at the string's end.
If iPosition is negative, the method returns the full string. If iPosition is greater than or equal to the string length, an empty string is returned.
Parameters:
iPosition: | Position of the last character not included in the sub-string. |
Returns:
public void RED::String::SetChar | ( | int | iPosition, |
char | iChar | ||
) |
Sets an Ascii character at a given position.
Note that if the character at iPosition is not an Ascii character (0-127 unicode range) then the string contents may become incorrect, as only one byte is replaced using this method, at the first byte of iPosition-th character.
Parameters:
iPosition: | Number of the character to set. |
iChar: | New character to set. |
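The warning about SetChar can be illustrated outside the library: overwriting only the leading byte of a multi-byte character leaves its continuation bytes orphaned. In this sketch, IsWellFormedUtf8 and OverwriteByte are hypothetical helpers working on a std::string; the check is structural only and ignores overlong or out-of-range encodings.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: checks that every leading byte is followed by the
// expected number of 10xxxxxx continuation bytes (1 to 4 byte forms only).
bool IsWellFormedUtf8( const std::string& bytes )
{
    std::size_t i = 0;
    while( i < bytes.size() )
    {
        const unsigned char c = static_cast< unsigned char >( bytes[i] );
        std::size_t extra = 0;
        if( c < 0x80 )                  extra = 0;
        else if( ( c & 0xE0 ) == 0xC0 ) extra = 1;
        else if( ( c & 0xF0 ) == 0xE0 ) extra = 2;
        else if( ( c & 0xF8 ) == 0xF0 ) extra = 3;
        else return false; // Stray continuation byte or unsupported lead.

        if( i + extra >= bytes.size() )
            return false;  // Character truncated by the end of the buffer.
        for( std::size_t k = 1; k <= extra; ++k )
        {
            if( ( static_cast< unsigned char >( bytes[i + k] ) & 0xC0 ) != 0x80 )
                return false;
        }
        i += extra + 1;
    }
    return true;
}

// Hypothetical helper: replaces exactly one byte, as the text describes
// for SetChar, and returns the modified copy.
std::string OverwriteByte( std::string bytes, std::size_t byteOffset, char ascii )
{
    bytes[byteOffset] = ascii;
    return bytes;
}
```

Replacing the 'a' of "hat" with 'o' keeps the buffer valid, whereas replacing the leading byte 0xC3 of the two-byte sequence 0xC3 0xA1 with an Ascii letter leaves 0xA1 behind as an invalid stray byte.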
public RED_RC RED::String::SetUTF8Buffer | ( | const char * | iUTF8Buffer | ) |
Directly assigns the string UTF-8 encoded buffer.
This method directly overrides the string contents with the provided UTF-8 buffer.
Parameters:
iUTF8Buffer: | UTF-8 encoded buffer, that must be NULL terminated. |
public unsigned int RED::String::ToID | ( | ) const |
Converts the string to a hash key using the RED::Object::GetIDFromString method.
Returns:
public unsigned short * RED::String::ToUCS2 | ( | ) const |
Converts the string to UCS2 (UTF16) format.
This method converts the content of the string to the UCS-2 "Universal Character Set" format. A termination character is added at the string's end (a double 0). All characters are now 16 bits wide.
The buffer in which the string is dumped is dynamically allocated by the call using an internal 'rmalloc'. The caller must ensure that the created buffer is released using 'rfree' after usage.
Returns:
public unsigned int * RED::String::ToUCS4 | ( | ) const |
Converts the string to UCS4 (UTF32) format.
This method converts the content of the string to the UCS-4 "Universal Character Set" format. A termination character is added at the string's end (four 0 bytes). All characters are now 32 bits wide.
The buffer in which the string is dumped is dynamically allocated by the call using an internal 'rmalloc'. The caller must ensure that the created buffer is released using 'rfree' after usage.
Returns:
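The kind of conversion ToUCS4 describes, from variable-width UTF-8 to fixed 32-bit characters followed by a zero terminator, can be sketched as below. DecodeUtf8 is a hypothetical helper that returns a std::vector rather than an 'rmalloc' buffer, handles the 1 to 4 byte forms only, and assumes well-formed input.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a RED::String method): decodes well-formed
// UTF-8 into 32-bit code points and appends a terminating 0.
std::vector< std::uint32_t > DecodeUtf8( const std::string& utf8 )
{
    std::vector< std::uint32_t > out;
    std::size_t i = 0;
    while( i < utf8.size() )
    {
        const unsigned char lead = static_cast< unsigned char >( utf8[i] );
        int extra = 0;
        std::uint32_t cp = lead; // 0xxxxxxx: the byte is the code point.

        if( ( lead & 0xE0 ) == 0xC0 )      { extra = 1; cp = lead & 0x1F; }
        else if( ( lead & 0xF0 ) == 0xE0 ) { extra = 2; cp = lead & 0x0F; }
        else if( ( lead & 0xF8 ) == 0xF0 ) { extra = 3; cp = lead & 0x07; }
        ++i;

        // Each continuation byte contributes its six low bits.
        for( ; extra > 0 && i < utf8.size(); --extra, ++i )
            cp = ( cp << 6 ) | ( static_cast< unsigned char >( utf8[i] ) & 0x3F );

        out.push_back( cp );
    }
    out.push_back( 0 ); // Termination character.
    return out;
}
```

Unlike the buffers returned by ToUCS2 and ToUCS4, the vector in this sketch releases its memory automatically; with the real methods the caller remains responsible for calling 'rfree'.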
protected RED_RC RED::String::FindTemplates | ( | RED::Vector< RED::int64 > & | oTemplates | ) |
Finds the positions of the templates '%x' (where x is a number) with the lowest value of x in the string.
The found templates are stored in the oTemplates vector. The oTemplates vector is first cleared each time the method is called.
Parameters:
oTemplates: | Returned list of the positions of the template with the lowest value (a template follows the form '%x' where x is an integer). The value of the lowest template is returned in oTemplates[0]. |
Returns:
Variables documentation
protected int RED::String::_size
Internal '_string' array memory size. This size includes the '\0' termination character.
protected char * RED::String::_string
String content array in UTF-8 format.