STL string iterators: what happens on insert
Before I get to my problem, let me explain how I think iterators work, so it's clear whether I've understood them:
Unlike pointers, iterators point to the position of a character in a string rather than just to an address. A string doesn't need to be stored sequentially in memory; a very large string could be scattered across memory. The iterator of that string would know the positions within it. A random access iterator could access any position in the string at once, without iterating through it sequentially.
If that is the case I could do the following:
Two iterators A & B point to the same string. Once I insert a few chars, let's say "78", into the string at the position marked by the first iterator A, the iterator B, pointing to a later position in the string, would have to be shifted by 2 to still mark the same character.
Example: A->123B->456; A points to the first and B to the 4th position in the string. Now we insert 78 at A, thus getting A->781B->23456. Instead of pointing to 4 the iterator B points now to 2; in order to correct it we would have to shift it by 2.
So my question:
-do iterators really store the position rather than the address?
-will the iterator point to the same position in the string after an insertion?
-can I correct the iterator's position by shifting it by the length of the inserted string, so that it points at the old character again?
[1456 byte] By [Recrehal] at [2007-12-17]
Recrehal wrote:
"Before I get to my problem, let me explain how I think iterators work, so it's clear whether I've understood them: Unlike pointers, iterators point to the position of a character in a string rather than just to an address. A string doesn't need to be stored sequentially in memory; a very large string could be scattered across memory. The iterator of that string would know the positions within it. A random access iterator could access any position in the string at once, without iterating through it sequentially."
Exactly how iterators refer to elements is not specified, meaning that one string implementation's iterator type may just encapsulate a pointer, or in fact may just be a pointer without any encapsulation at all, or the iterator could be a class type which encapsulates a starting address and an offset, or it could be a deque-like iterator, or anything else you could think of.
Rather than specifying exactly how an iterator is implemented, the standard puts guidelines in place which govern correct usage of iterators for a given container. One of these guidelines for string is that when you insert an element into an ::std::string (or any other basic_string template instantiation), all existing iterators into that string become invalidated. This means that once you insert an element, you should never expect to be able to use a previously acquired iterator in any manner, though with the insert overload that takes an iterator and a single character you can use the return value as an iterator to the inserted character.
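As a minimal sketch of that last point, the helper below inserts one character and hands back the fresh iterator returned by insert. The name insert_char is made up for illustration, and the sketch assumes the single-character overload of insert, which returns an iterator to the inserted character.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the standard library): inserts `c`
// before the character at index `pos` and returns a fresh iterator to
// the inserted character.
std::string::iterator insert_char(std::string& s,
                                  std::string::size_type pos,
                                  char c) {
    assert(pos <= s.size());
    // Any iterator obtained before this call must be treated as invalid
    // afterwards; only the iterator returned by insert is safe to use.
    return s.insert(
        s.begin() + static_cast<std::string::difference_type>(pos), c);
}
```

A caller would discard any iterators held before the call and continue from the returned one, for example `it = insert_char(s, 0, '7');`.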
Recrehal wrote:
"If that is the case I could do the following: Two iterators A & B point to the same string. Once I insert a few chars, let's say "78", into the string at the position marked by the first iterator A, the iterator B, pointing to a later position in the string, would have to be shifted by 2 to still mark the same character. Example: A->123B->456; A points to the first and B to the 4th position in the string. Now we insert 78 at A, thus getting A->781B->23456. Instead of pointing to 4 the iterator B points now to 2; in order to correct it we would have to shift it by 2."
Once the insertion takes place, both A and B become invalidated. Because of this, you can no longer assume that they refer to a valid location in the string and should not use them. Shifting B by two, while it may work on some compilers in some situations, is not guaranteed to work portably or even consistently on a given compiler. Even if incrementing works in one or several cases, you should never rely on such behavior.
Recrehal wrote:
"So my question: -do iterators really store the position rather than the address? -will the iterator point to the same position in the string after an insertion? -can I correct the iterator's position by shifting it by the length of the inserted string, so that it points at the old character again?"
In order:
- Depends on the container and its implementation
- Depends on the implementation
- Depends on the implementation
Despite the apparent ambiguity, this is actually a good thing as it provides more freedom to implementors. For instance, a deque-like implementation may be used for a string, or a vector-like implementation may be used, depending on what the implementor decides. If you really wish to store the position in the string, then keep an index as opposed to an iterator and use that accordingly.
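To make the index suggestion concrete, here is a rough sketch using the "123456" / "78" example from the question. The helper name insert_and_adjust is invented for illustration; it only assumes the index-based std::string::insert overload and plain size_type arithmetic.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: inserts `text` at index `at` and returns the
// adjusted value of `tracked`, an index the caller wants to keep
// pointing at the same character as before the insertion.
std::string::size_type insert_and_adjust(std::string& s,
                                         std::string::size_type at,
                                         const std::string& text,
                                         std::string::size_type tracked) {
    assert(at <= s.size());
    s.insert(at, text);
    // Characters at or after the insertion point move right by
    // text.size(); characters before it stay where they were.
    return tracked >= at ? tracked + text.size() : tracked;
}
```

With s = "123456", inserting "78" at index 0 while tracking index 3 (the character '4') gives "78123456" and an adjusted index of 5, which is the shift by 2 described in the question, done on an index instead of an iterator.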
Indeed, I wish to store the position in the string. I'm a bit worried about performance when storing the position in the string rather than setting up an iterator. Apparently that's the only way to do it properly, but I'd really like to know whether I would have to go through the string sequentially in order to get to the requested position.
So is access to a position sequential, or is there a way to speed things up, perhaps with a random access iterator that jumps to the position directly?
Recrehal wrote:
"Indeed, I wish to store the position in the string. I'm a bit worried about performance when storing the position in the string rather than setting up an iterator. Apparently that's the only way to do it properly, but I'd really like to know whether I would have to go through the string sequentially in order to get to the requested position. So is access to a position sequential, or is there a way to speed things up, perhaps with a random access iterator that jumps to the position directly?"
Accessing a string with an index is still performed in constant time, just like a vector or deque. Just do your_string[ your_index ] or use at().
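A short sketch of both access styles follows. The helper names char_at_or and iterator_at are made up for illustration. The sketch relies on at() throwing std::out_of_range for an invalid index, and on the fact that an iterator for a stored index can be rebuilt on demand with begin() plus the index, without walking the string.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: bounds-checked lookup that returns `fallback`
// instead of propagating the exception thrown by at().
char char_at_or(const std::string& s,
                std::string::size_type index,
                char fallback) {
    try {
        return s.at(index);  // throws std::out_of_range if index >= size()
    } catch (const std::out_of_range&) {
        return fallback;
    }
}

// Hypothetical helper: rebuilds an iterator from a stored index.
// String iterators are random access, so this is a direct jump.
std::string::const_iterator iterator_at(const std::string& s,
                                        std::string::size_type index) {
    assert(index <= s.size());
    return s.begin() + static_cast<std::string::difference_type>(index);
}
```

Plain operator[] does no bounds checking, so it is the cheaper choice when the index is already known to be valid; at() is the safer choice when it is not.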
Edit: Also, why is it that you need the functionality you described (keeping track of later parts of a string after inserting into the middle)? The more I look over what you are saying, the more I think that you should just be using multiple strings, concatenated simply through output or through a string concatenation when needed, though it's difficult to say for sure with the information you provided.