Strings in ‘C’
In this article, we learn what a string is and how to store one, with examples.
What is a string?
In ‘C’, a string is a sequence of characters terminated by a null character, which marks where the string ends.
For example, your name is a collection of characters and can be represented as a string in a program.
We can use an array to store strings (a name or a textual message).
Unlike C++, Java, and Python, ‘C’ does not have any dedicated data type to store string data.
In ‘C’, we take the help of an array to store and manipulate string data. So, ‘string’ manipulations are realized through arrays.
How to store a string?
Example: “Hello”;
Here, Hello is a message, which is written inside the double quotes. This is how a string is written in the program.
How do you store this in a program?
Create an array of type ‘char’. Why? Because a string is a collection of characters.
char Msg[] = "Hello";
This is how you store a string in the program using an array.
Here, Msg is an array variable. Strictly speaking, an array is not a pointer, but in most expressions the name Msg is converted to a pointer to its first element, which is why it is often loosely called a pointer or a reference. In this context, you can also call it a string variable.
When you do this, the compiler will reserve some bytes of memory and it will store the string.
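As a minimal sketch of the definition above, the following function stores "Hello" in a char array and prints it. The function name print_message is hypothetical and chosen only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: stores "Hello" in a char array and prints it.
   Returns the value printf reports, i.e. the number of characters written. */
int print_message(void)
{
    char Msg[] = "Hello";        /* the compiler sizes the array from the literal */

    /* %s prints characters starting at Msg[0] until it reaches '\0' */
    return printf("%s\n", Msg);
}
```

Calling print_message() from main prints Hello followed by a newline.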
How is the string stored in the memory?
Look at Figure 1. It shows the memory locations 0xE00, 0xE01, 0xE02, 0xE03, 0xE04, and 0xE05.
The ASCII code of ‘H’ is stored in 0xE00, ‘e’ is stored in 0xE01, ‘l’ is stored in 0xE02, the second ‘l’ is stored in 0xE03, and ‘o’ is stored in 0xE04.
After that, please note that the compiler automatically inserts the null character (\0) at 0xE05 to indicate the end of the string.
The numeric value of the null character is 0 (the number zero, not the digit character ‘0’). The null character is written as \0.
Just as you represent the newline as \n, you write \0 to denote the null character in a program.
This is how the string “Hello” is stored in memory, and ‘Msg’ is the name of the array that holds it. That means the definition char Msg[] = "Hello"; consumes 6 bytes of memory, not 5 bytes.
The required information is 5 bytes, but the definition consumes 6 bytes because the compiler terminates the string with a null character (‘\0’). You do not need to insert the null character yourself. The compiler inserts it automatically.
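The byte count described above can be checked with a small sketch. The names dump_bytes and hello_size are hypothetical helpers introduced for this example; the actual addresses on your machine will differ from those in Figure 1, so only the index and value of each byte are printed.

```c
#include <stdio.h>

/* Hypothetical helper: prints the numeric value of each of the n bytes in buf. */
void dump_bytes(const char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        printf("index %zu: value %d\n", i, buf[i]);
    }
}

/* Hypothetical helper: dumps the array and returns how many bytes it occupies. */
size_t hello_size(void)
{
    char Msg[] = "Hello";

    /* sizeof(Msg) is the size of the whole array, terminator included */
    dump_bytes(Msg, sizeof(Msg));
    return sizeof(Msg);
}
```

On a system that uses ASCII, the values printed are 72, 101, 108, 108, 111, and finally 0 for the null character, and hello_size() returns 6.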
Now we know how to store a string in a program using a character array. There is one more way to initialize the array, which is a little more cumbersome.
Instead of writing char Msg[] = "Hello";, you can initialize the array character by character: open the curly braces, list ‘H’, ‘e’, ‘l’, ‘l’, ‘o’, and close the curly braces.
char Msg[] = {'H', 'e', 'l', 'l', 'o'};
If you do this, then this is not a string. Why?
Because here you are just storing some character data into a character array. This is not terminated with a null character. That’s why this cannot be considered a string definition.
char Msg[] = "Hello"; is a proper string definition, because the compiler terminates the string with a null character.
So, if you are doing string manipulation and want to use the second method, you have to write the null character (\0) at the end yourself.
For example, char Msg[] = {'H', 'e', 'l', 'l', 'o', '\0'}; // Here you write the null character manually to finish the initialization.
The definitions char Msg[] = "Hello"; and char Msg[] = {'H', 'e', 'l', 'l', 'o', '\0'}; are identical.
Since the second method is more cumbersome and easier to get wrong, prefer the first method, char Msg[] = "Hello";, for string manipulation.
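To illustrate the comparison between the two methods, here is a short sketch. The function names same_definition and unterminated_size are hypothetical; memcmp is the standard library function from <string.h> that compares two blocks of memory byte by byte.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the string-literal form and the
   brace form with an explicit '\0' produce byte-for-byte equal arrays. */
int same_definition(void)
{
    char A[] = "Hello";
    char B[] = {'H', 'e', 'l', 'l', 'o', '\0'};

    return sizeof(A) == sizeof(B) && memcmp(A, B, sizeof(A)) == 0;
}

/* Hypothetical helper: size of the array when the '\0' is left out.
   This array holds 5 characters and is NOT a valid string. */
size_t unterminated_size(void)
{
    char C[] = {'H', 'e', 'l', 'l', 'o'};

    return sizeof(C);
}
```

Here same_definition() returns 1, while unterminated_size() returns 5 because no byte is reserved for a terminator. Passing such an unterminated array to a function that expects a string is undefined behavior, so the sketch only measures its size.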
Look at the two array definitions below.
char Msg1[10] = "Hello";
char Msg2[] = "Hello";
What distinguishes these two declarations?
The first declaration is an array of 10 elements. In this case, only a portion of the array is explicitly initialized: 5 characters + 1 null character = 6 elements. The remaining 4 elements are set to zero, that is, null characters.
In contrast, the second declaration lets the compiler work out the size of the array from the initializer at compile time. In this case, the compiler reserves only 6 bytes for the array. This is why the number of elements is usually left out when a character array is initialized with a string.
For instance, sizeof(Msg1) yields 10 bytes, while sizeof(Msg2) yields 6 bytes.
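The difference between the two declarations can be checked with a sketch like the one below. The helper names fixed_size, inferred_size, and tail_is_zero are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

/* Hypothetical helper: size of the array declared with an explicit length. */
size_t fixed_size(void)
{
    char Msg1[10] = "Hello";
    return sizeof(Msg1);
}

/* Hypothetical helper: size of the array whose length the compiler infers. */
size_t inferred_size(void)
{
    char Msg2[] = "Hello";
    return sizeof(Msg2);
}

/* Hypothetical helper: returns 1 if every element from the terminator
   (index 5) to the end of the 10-element array is zero. */
int tail_is_zero(void)
{
    char Msg1[10] = "Hello";

    for (size_t i = 5; i < sizeof(Msg1); i++) {
        if (Msg1[i] != '\0') {
            return 0;
        }
    }
    return 1;
}
```

fixed_size() returns 10 and inferred_size() returns 6, while tail_is_zero() returns 1 because the elements that are not explicitly initialized are set to zero.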
What if you call the string length function (strlen)?
strlen is a built-in function in the ‘C’ standard library.
If you call strlen(Msg1), it returns 5. The function counts only the characters of the string; it does not count the null character.
The actual length of your information is 5 characters. That’s why strlen always returns the length excluding the null character.
What do you get if you call strlen(Msg2)?
You again get 5.
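The sketch below shows why the result excludes the null character. count_chars is a hypothetical, simplified stand-in written for this example and is not how any particular library implements strlen; it simply counts characters until it meets '\0'. The helper lengths_agree is also hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlen: counts characters up to,
   but not including, the terminating '\0'. */
size_t count_chars(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Hypothetical helper: returns 1 if both arrays report a length of 5
   and the hand-written counter agrees with the library's strlen. */
int lengths_agree(void)
{
    char Msg1[10] = "Hello";
    char Msg2[] = "Hello";

    return strlen(Msg1) == 5
        && strlen(Msg2) == 5
        && count_chars(Msg1) == strlen(Msg1);
}
```

Both arrays give a length of 5 even though their sizes differ, because counting stops at the first null character in each.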
strlen is declared in the <string.h> header of the ‘C’ standard library. That concludes this introduction to how you store strings in your program.
Now I will write a simple string definition below.
char Msg1[] = "A"; //string definition
char Msg2 = 'A'; //Character
Here, I write "A" in double quotes and 'A' in single quotes.
Please note that the first definition is a string definition. "A" is a string here, not a character, so Msg1 holds 2 bytes: ‘A’ and the null character.
In the second definition, 'A' is a character. Here Msg2 is a character variable that stores the ASCII code of a single character.
That’s how we differentiate between a string and a character.
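The difference can be made concrete with a small sketch, assuming Msg2 is declared as a plain char variable, as the description of it as a character variable implies. The helper names string_a_size and char_a_size are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: size of the string "A" stored in a char array. */
size_t string_a_size(void)
{
    char Msg1[] = "A";      /* two bytes: 'A' and the terminating '\0' */
    return sizeof(Msg1);
}

/* Hypothetical helper: size of a single character variable. */
size_t char_a_size(void)
{
    char Msg2 = 'A';        /* one byte: the code for 'A' only */
    return sizeof(Msg2);
}
```

string_a_size() returns 2 and char_a_size() returns 1, so even a one-letter string costs an extra byte for the null character.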
In the next article, let’s look at one more kind of string definition, called a string literal or string constant.