:timing
:sccache 1
Tips and Traps¶
- Rust has 2 popular string types, String and str (there are more string types in Rust, but they are not covered here). String can be MUTABLE (unlike strings in Java and Python) and is heap-allocated, while str is an immutable sequence of UTF-8 bytes somewhere in memory (static storage, heap or stack). String owns its memory while str does NOT. Since the size of str is unknown at compile time, one can only handle it behind a pointer. This means that str most commonly appears as &str: a reference to some UTF-8 data, normally called a string slice or just a slice. &str vs String is similar to slice vs array or Vec.
- &String is a reference to a String and is also called a borrowed type. It is nothing more than a pointer which you can pass around without giving up ownership. &String can be coerced to &str implicitly.
- If you want a read-only view of a string, &str is preferred. If you want to own and mutate a string, String should be used. For example, String should be used for returning strings created within a function or (usually) when storing strings in a struct or enum.
- Indexing into a string with an integer is not available in Rust. Rust strings are encoded in UTF-8 internally, so the concept of indexing itself would be ambiguous and people would misuse it. Byte indexing is fast but often wrong for text: when the text contains non-ASCII symbols, a byte index may land inside a character. Char indexing is not free because UTF-8 is a variable-length encoding, so you have to traverse the string from the start to find the required code point.
- There are 2 ways to get chars out of a string. First, you can call the chars method, which returns an iterator. This way is not efficient if you want random access. Second, you can get the underlying byte representation of a string by calling the as_bytes method (which returns a byte slice &[u8]). You can then index the byte slice and convert a u8 value to char using the as keyword; this is only meaningful when the byte is ASCII.
- let my_str = "Hello World"; defines a &str (not a String).
- If you have a &str and want a new String, you can copy it with either to_owned() or to_string() (they are effectively the same). Both methods copy the memory and make a new String.
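The tips above can be made concrete with a minimal sketch. The helper count_words is hypothetical and exists only for illustration. The example shows a literal as a &str, an owned copy that can grow, the implicit &String to &str coercion, and why byte length and character count can differ.

```rust
// Hypothetical helper: takes a read-only view, so it accepts both &str and &String.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

fn main() {
    let literal: &str = "Hello World"; // a string literal is a &str in static storage
    let mut owned: String = literal.to_owned(); // copies the bytes into a new heap buffer
    owned.push('!'); // a String can grow; the original slice is untouched

    // &String is coerced to &str implicitly at the call site.
    assert_eq!(count_words(&owned), 2);
    assert_eq!(count_words(literal), 2);

    // Byte length and character count differ for non-ASCII text.
    let accented = "héllo";
    assert_eq!(accented.len(), 6); // 'é' occupies 2 bytes in UTF-8
    assert_eq!(accented.chars().count(), 5);

    println!("{}", owned);
}
```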
Convert &str to String¶
There are many ways to convert a &str to a String.
- &str.to_string
- &str.to_owned
- String::from
- &str.into (Into is the reciprocal of From)
- String.push_str (appends the &str to an existing String)
The first 4 ways are equivalent. For more detailed discussions, please refer to How do I convert a &str to a String in Rust? .
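As a rough sketch of the five approaches listed above (the function name conversions is made up for this example), each of the following produces an equal String:

```rust
// Hypothetical helper (not from the original text): builds the same String five ways.
fn conversions(slice: &str) -> [String; 5] {
    let a: String = slice.to_string();
    let b: String = slice.to_owned();
    let c: String = String::from(slice);
    let d: String = slice.into(); // the annotation tells into() which type to produce

    // push_str appends to an existing String rather than creating one.
    let mut e = String::new();
    e.push_str(slice);

    [a, b, c, d, e]
}

fn main() {
    let all = conversions("how are you");
    assert!(all.iter().all(|s| s.as_str() == "how are you"));
    println!("{:?}", all);
}
```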
Convert String to &str¶
Assume s is a value of String. There are (at least) 4 ways to convert it to &str.
- &s[..]
- &*s
- s.as_str()
- s.as_ref()
I personally prefer s.as_str() or s.as_ref().
let s = "how are you".to_string();
s
s.as_str()
{
let s2: &str = s.as_ref();
s2
}
&str vs String vs AsRef<str> for Function Parameters¶
&str is preferred over String for function parameters unless you really need to own a string value in the function (in which case you need String).
If you need an even more generic string parameter, or a generic item type for a collection of strings, you can use AsRef<str>. For more discussions, please refer to AsRef .
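The three parameter styles can be compared in a short sketch. All names here (shout, User, make_user, total_len) are hypothetical and chosen only to illustrate the guideline.

```rust
// Read-only access: &str accepts string literals, &String and slices.
fn shout(text: &str) -> String {
    text.to_uppercase()
}

// Ownership is needed because the value is stored in the struct.
struct User {
    name: String,
}

fn make_user(name: String) -> User {
    User { name }
}

// Generic over anything viewable as a str, e.g. a slice of &str or of String.
fn total_len<S: AsRef<str>>(items: &[S]) -> usize {
    items.iter().map(|s| s.as_ref().len()).sum()
}

fn main() {
    let owned = String::from("ben");
    assert_eq!(shout(&owned), "BEN");
    assert_eq!(shout("ben"), "BEN");

    let user = make_user(owned); // owned is moved into the struct here
    assert_eq!(user.name, "ben");

    assert_eq!(total_len(&["a", "bc"]), 3);
    assert_eq!(total_len(&[String::from("abc")]), 3);
}
```

The AsRef<str> version costs a generic parameter, so it is probably only worth it when callers really do hold both &str and String collections.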
&str¶
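str is a primitive type. A &str is immutable and has a fixed length.
let mut s: &str = "how are you";
s
let s2 = String::from("abc");
s2[0] // compile error: a String cannot be indexed by an integer
s[0] // compile error: a str cannot be indexed by an integer
s + 'a' // compile error: + is not defined for &str and char
s.chars()
s.chars().nth(4)
s.push('c2') // compile error: 'c2' is not a valid char literal, and &str has no push method
s.is_empty()
s.len()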
String¶
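let s1: String = "Hello World!"; // compile error: expected String, found &str
s1
let mut s2: String = String::from("Hello World!");
s2
s2 + 'a' // compile error: + expects a &str on the right-hand side, not a char
s2.push('a')
s2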
Construct Strings¶
String::new¶
String::new creates a new empty string.
String::new()
String::with_capacity creates a new empty string with the given capacity.
let my_str = String::with_capacity(2);
my_str
my_str.capacity()
Cases of String¶
The to_*case methods return a new String (mainly because changing the case of a non-ASCII character might change the length of the string). The make_ascii_*case methods change case in place (changing the case of ASCII characters never changes the length of the string). The to_*case methods change the case of all characters, while the to_ascii_*case methods (which also return a new String) only change the case of ASCII characters and leave non-ASCII characters unchanged.
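A small sketch of the difference between the three families, using a word with non-ASCII letters. The helper upper_variants is hypothetical and only groups the three calls together.

```rust
// Hypothetical helper: returns the results of to_uppercase, to_ascii_uppercase
// and make_ascii_uppercase for the same input.
fn upper_variants(s: &str) -> (String, String, String) {
    let full = s.to_uppercase(); // new String, Unicode-aware
    let ascii_only = s.to_ascii_uppercase(); // new String, ASCII letters only
    let mut in_place = s.to_owned();
    in_place.make_ascii_uppercase(); // modifies the existing buffer
    (full, ascii_only, in_place)
}

fn main() {
    let (full, ascii_only, in_place) = upper_variants("Grüße");
    assert_eq!(full, "GRÜSSE"); // 'ß' expands to "SS", so the char count grows
    assert_eq!(ascii_only, "GRüßE"); // non-ASCII letters are left unchanged
    assert_eq!(in_place, "GRüßE");
    assert_eq!(full.chars().count(), 6);
}
```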
to_lowercase and to_uppercase¶
to_ascii_lowercase and to_ascii_uppercase¶
make_ascii_lowercase and make_ascii_uppercase¶
chars¶
contains¶
get¶
let s: String = String::from("Hello World!");
s.get(0..3)
let s: String = String::from("Hello World!");
let ss = s.get(0..3).unwrap().to_string();
ss
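Unlike slicing with [..], get returns an Option instead of panicking. The following sketch (first_bytes is a hypothetical helper) shows the None cases: a range that is out of bounds, and a range that ends in the middle of a multi-byte character.

```rust
// Hypothetical helper: the first n bytes of s as a string slice, if that is a valid slice.
fn first_bytes(s: &str, n: usize) -> Option<&str> {
    s.get(0..n)
}

fn main() {
    assert_eq!(first_bytes("Hello World!", 3), Some("Hel"));
    assert_eq!(first_bytes("Hello", 10), None); // range is out of bounds
    assert_eq!(first_bytes("héllo", 2), None); // byte 2 is inside 'é'
    assert_eq!(first_bytes("héllo", 3), Some("hé"));
}
```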
join¶
["a", "b"].join("")
['a', 'b'].join("") // compile error: join is not implemented for a slice of char
vec!["a", "b"].join("")
vec![String::from("a"), String::from("b")].join("")
len¶
matches¶
"abcXXXabcYYYabc".matches("abc").collect::<Vec<_>>()
char::is_numeric('a')
char::is_numeric('1')
"1abc2abc3".matches(char::is_numeric).collect::<Vec<_>>()
replace¶
parse (Convert String to Other Types)¶
Convert an integer to string.
let s = 123.to_string();
s
1.to_string()
Convert a string to bytes.
"1".as_bytes()
1.to_string().as_bytes()
1i32.to_be_bytes()
Convert the string back to integer.
s.parse::<i32>()
s.parse::<i32>().unwrap()
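parse returns a Result, so unwrap panics on bad input. A minimal sketch of handling the error instead, assuming a hypothetical parse_port helper:

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses a port number, tolerating surrounding whitespace.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

fn main() {
    assert_eq!(parse_port(" 8080\n"), Ok(8080));
    assert!(parse_port("http").is_err()); // not a number
    assert!(parse_port("70000").is_err()); // does not fit in a u16

    match parse_port("abc") {
        Ok(port) => println!("port = {}", port),
        Err(e) => println!("invalid port: {}", e),
    }
}
```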
push¶
You cannot append a char to a String using the + operator. However, you can use the String::push method to add a char to the end of a String.
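A brief sketch of the appending options, assuming a hypothetical exclaim helper: push takes a char, push_str takes a &str, and + takes a String on the left and a &str on the right (consuming the left operand).

```rust
// Hypothetical helper that appends to a copy of base in three different ways.
fn exclaim(base: &str) -> String {
    let mut s = String::from(base);
    s.push('!'); // append a single char
    s.push_str(" Bye"); // append a string slice
    s + "." // + moves s and appends the &str on the right
}

fn main() {
    assert_eq!(exclaim("Hello"), "Hello! Bye.");
    println!("{}", exclaim("Hello"));
}
```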
push_str¶
is_empty¶
split¶
"".split(",").collect::<Vec<&str>>()
"".split(" ").collect::<Vec<&str>>()
"1,2,3".split(",")
let mut it = "1,2,3".split(",");
it
it.next()
it.next()
it.next()
it.next()
let v: Vec<&str> = "1,2,3".split(",").collect();
v
v.capacity()
let v: Vec<i8> = "1,2,3".split(",").map(|x| x.parse::<i8>().unwrap()).collect();
v
v.capacity()
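The unwrap in the last cell panics on malformed input. One possible alternative, sketched here with a hypothetical parse_csv_ints helper, is to collect into a Result so that the first parse failure is returned as an error.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parses comma-separated integers, stopping at the first error.
fn parse_csv_ints(line: &str) -> Result<Vec<i8>, ParseIntError> {
    line.split(',').map(|x| x.trim().parse::<i8>()).collect()
}

fn main() {
    assert_eq!(parse_csv_ints("1,2,3"), Ok(vec![1, 2, 3]));
    assert!(parse_csv_ints("1,x,3").is_err());
    // Splitting an empty string yields one empty piece, which fails to parse.
    assert!(parse_csv_ints("").is_err());
}
```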
split_whitespace¶
"how are you".split_whitespace()
for word in "how are you".split_whitespace() {
println!("{}", word);
}
trim(&self) -> &str¶
Returns a string slice with leading and trailing whitespace removed.
" how\n".trim()
trim_end(&self) -> &str¶
Returns a string slice with trailing whitespace removed.
" how\n".trim_end()
trim_start(&self) -> &str¶
Returns a string slice with leading whitespace removed.
" how\n".trim_start()
with_capacity¶
let ss = String::with_capacity(3);
ss
Print Strings¶
You cannot pass an integer to println! directly because its first argument must be a format string literal.
It is suggested that you use println!("{}", var); to print a variable to the terminal so that you do not have to worry about its type (as long as the type implements Display).
println!(5) // compile error: format argument must be a string literal
println!("{}", 5)
println!("My name is {} and I'm {}", "Ben", 34);
println!("{0} * {0} = {1}", 3, 9);
println!("{x} * {x} = {y}", x=3, y=9);
Placeholder Traits¶
println!("Binary: {v:b}, Hex: {v:x}, Octal: {v:o}", v = 64);
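The same placeholder traits work with format!, which returns a String instead of printing. A small sketch (radix_views is a hypothetical helper) that also shows the # flag for a prefix and zero-padding to a fixed width:

```rust
// Hypothetical helper: renders v as binary, prefixed hex, octal and zero-padded binary.
fn radix_views(v: u32) -> (String, String, String, String) {
    (
        format!("{:b}", v),   // binary
        format!("{:#x}", v),  // hex with the 0x prefix
        format!("{:o}", v),   // octal
        format!("{:08b}", v), // binary, zero-padded to 8 digits
    )
}

fn main() {
    let (bin, hex, oct, padded) = radix_views(64);
    assert_eq!(bin, "1000000");
    assert_eq!(hex, "0x40");
    assert_eq!(oct, "100");
    assert_eq!(padded, "01000000");
}
```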
Print an Iterable¶
println!("{:?}", ("Hello", "World"));
The concat! Macro¶
Concatenates literals into a static string slice.
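Because the result is built at compile time, it can be used in a const. A minimal sketch (the constant name VERSION is made up for this example); note that concat! accepts string, integer, float, char and bool literals, but not variables:

```rust
// The literals are joined at compile time into a single &'static str.
const VERSION: &str = concat!("v", 1, ".", 2, "-", true);

fn main() {
    assert_eq!(VERSION, "v1.2-true");
    println!("{}", VERSION);
}
```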
Concatenate a String and a Char¶
let mut my_str = String::from("Hello World");
my_str.push('!');
my_str
Concatenate Several Strings Together¶
The GitHub repo dclong/conccatenation_benchmarks-rs has a summary of different ways of joining strings and their corresponding performance.
Concatenate Strings in an Array/Vector¶
["how", "are", "you"].join(" ")
vec!["how", "are", "you"].join(" ")
Concatenate Strings in an Iterator¶
let v = vec!["how", "are", "you"];
v.into_iter().collect::<String>()
let arr = ["how", "are", "you"];
arr.into_iter().collect::<String>()
let arr = ["how", "are", "you"];
arr.into_iter().copied().collect::<String>()
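let v = vec!["how", "are", "you"];
v.into_iter().intersperse(" ") // Iterator::intersperse is unstable in std (nightly only); the itertools crate offers an equivalent on stable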
Indexing a String¶
Indexing into a string with an integer is not available in Rust. Rust strings are encoded in UTF-8 internally, so the concept of indexing itself would be ambiguous and people would misuse it. Byte indexing is fast but often wrong for text: when the text contains non-ASCII symbols, a byte index may land inside a character. Char indexing is not free because UTF-8 is a variable-length encoding, so you have to traverse the string from the start to find the required code point.
There are 2 ways to get chars out of a string. First, you can call the chars method, which returns an iterator. This way is not efficient if you want random access. Second, you can get the underlying byte representation of a string by calling the as_bytes method (which returns a byte slice &[u8]). You can then index the byte slice and convert a u8 value to char using the as keyword; this is only meaningful when the byte is ASCII.
let s = String::from("how are you");
s[0] // compile error: a String cannot be indexed by an integer
let s = String::from("how are you");
s.chars().next()
let s = String::from("how are you");
s.as_bytes()[2] as char
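The cells above assume ASCII text. A hedged sketch of char-aware alternatives follows; the helper names nth_char, ascii_byte_as_char and first_n_chars are hypothetical. The first walks the chars iterator, the second guards the byte-to-char cast, and the third uses char_indices to find a byte offset that is safe to slice at.

```rust
// O(n): walks the string from the start to find the n-th code point.
fn nth_char(s: &str, n: usize) -> Option<char> {
    s.chars().nth(n)
}

// O(1): reads one byte, but only returns it as a char when it is ASCII.
fn ascii_byte_as_char(s: &str, i: usize) -> Option<char> {
    s.as_bytes()
        .get(i)
        .copied()
        .filter(|b| b.is_ascii())
        .map(|b| b as char)
}

// Slices the first n chars by locating the byte offset of the (n+1)-th char.
fn first_n_chars(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s, // fewer than n + 1 chars: return everything
    }
}

fn main() {
    assert_eq!(nth_char("héllo", 1), Some('é'));
    assert_eq!(ascii_byte_as_char("how are you", 2), Some('w'));
    assert_eq!(ascii_byte_as_char("héllo", 1), None); // first byte of 'é' is not ASCII
    assert_eq!(first_n_chars("héllo", 2), "hé");
    assert_eq!(first_n_chars("hi", 5), "hi");
}
```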
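Slicing a String¶
Slice ranges are byte offsets and must fall on char boundaries, otherwise the slice panics at runtime.
&"how are you"[..]
&"how are you"[..3]
&"how are you"[4..]
&"how are you"[4..7]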
Format Strings¶
Please refer to Format Strings in Rust 1.58 for detailed discussions.
Third-party Libraries for String Manipulation¶
indoc
https://github.com/dtolnay/indoc
This crate provides a procedural macro for indented string literals. The indoc!() macro takes a multiline string literal and un-indents it at compile time so the leftmost non-space character is in the first column.
compact_str
https://crates.io/crates/compact_str
A memory-efficient string type that transparently stores strings on the stack when possible.