A string is probably the easiest sequence data type to understand. We assign a string variable by giving it a name and placing the text we want to store in memory between quotation marks.
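As a minimal sketch of the assignment described above (the variable name `greeting` and its text are made up for illustration):

```python
# Hypothetical example value; any text between quotation marks works.
greeting = "Hello, world"

# Single quotes create exactly the same kind of string.
same_greeting = 'Hello, world'

print(greeting)        # shows the stored text
print(type(greeting))  # shows that the variable holds a str
```

Either single or double quotation marks can be used, as long as the opening and closing marks match.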
As with regular variables, calling a string by its variable name returns the text assigned to it. However, because strings are sequence data types, each character is stored as a separate element within the variable structure and is assigned an index number, starting at 0. This means we can reference and work with each individual character within the string using an index, or a substring using an index range.
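A short sketch of indexing and index ranges, using a hypothetical variable `word`:

```python
# Hypothetical example value.
word = "Python"

first = word[0]      # 'P'   (indexing starts at 0)
last = word[-1]      # 'n'   (negative indexes count from the end)
middle = word[1:4]   # 'yth' (index range: start included, end excluded)
start = word[:2]     # 'Py'  (an omitted start means "from the beginning")

print(first, last, middle, start)
```

Note that Python strings are immutable, so an assignment such as `word[0] = "J"` raises a TypeError. To change characters we build a new string instead.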
We can use Python's built-in len function to find the number of characters in any string variable.
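For instance, assuming two made-up string variables, `len` counts every character, including spaces:

```python
# Hypothetical example values.
city = "Amsterdam"
two_words = "New York"

city_length = len(city)            # 9
two_words_length = len(two_words)  # 8 (the space counts as a character)

print(city_length, two_words_length)
```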
We can also check whether a character or substring is contained within a string by using the in operator. It returns True if the substring is present in the string and False if not.
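A brief sketch of this membership check with Python's `in` operator, using a hypothetical `sentence` variable:

```python
# Hypothetical example value.
sentence = "The quick brown fox"

has_quick = "quick" in sentence   # True
has_cat = "cat" in sentence       # False
has_lower_the = "the" in sentence # False: the check is case-sensitive

print(has_quick, has_cat, has_lower_the)
```

Because the comparison is case-sensitive, it can help to convert both sides with `lower()` first when case should not matter.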
Sometimes we will find unwanted characters in our strings. This is less of a concern when we define our own string variables, but when we start working with imported data, where we have less control over the source, it occurs more often and can cause issues when processing text. To solve this, we can use the Python strip method to remove unwanted characters from the start and end of a string. Suppose the variable “country” contains an unwanted character at the start of the string. We can use strip to remove it.
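The original value of `country` is not shown here, so the sketch below assumes a stray `#` at the start purely for illustration:

```python
# Assumed value: the "#" prefix and the country name are hypothetical.
country = "#New_Zealand"

cleaned = country.strip("#")       # 'New_Zealand'

# With no argument, strip removes leading and trailing whitespace.
padded = "  New Zealand \n"
trimmed = padded.strip()           # 'New Zealand'

# strip only touches the ends; characters inside the string are kept.
inner = "New#Zealand".strip("#")   # 'New#Zealand'

print(cleaned, trimmed, inner)
```

The related methods `lstrip` and `rstrip` work the same way but trim only the left or right end respectively.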
There will be times when we want to replace characters in a string rather than just remove them. For this purpose we can use the Python replace method. Suppose the two words in the variable country are separated by an underscore. We can use replace to swap the underscore for a space.
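A minimal sketch of the replacement, again assuming a hypothetical value for `country`:

```python
# Assumed value: the country name is hypothetical.
country = "New_Zealand"

# replace returns a new string; the original is left unchanged.
spaced = country.replace("_", " ")         # 'New Zealand'

# An optional third argument limits how many occurrences are replaced.
first_only = "a_b_c".replace("_", " ", 1)  # 'a b_c'

print(spaced, first_only)
```

To keep the result, assign it back to a variable, for example `country = country.replace("_", " ")`.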