Strings¶
Strings are immutable, iterable “lists” of characters in Python. As strings are incredibly important, we will pay a little more attention to them before moving on to the remaining datatypes.
Basics¶
In Python, there is no difference between a single-quoted and a double-quoted string:
>>> s1 = 'abc'
>>> s2 = "abc"
>>> s1 == s2
True
A triple-quoted string, written with either single or double quotes, can span multiple lines:
>>> s = '''this is
... a
... multiline
... string
... '''
>>> s
'this is\na \nmultiline\nstring\n'
>>> print s
this is
a
multiline
string
>>>
Raw Strings¶
Raw strings, written with an r prefix, leave backslashes uninterpreted, which makes them very useful in Regular Expressions.
s4 = r"hello \n world" # <-- Raw string
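As a small illustration of the difference, the sketch below compares a normal and a raw literal and then uses a raw string as a regular expression pattern. The variable names and the sample text are hypothetical, and the code is written with print() calls so that it runs on Python 3.

```python
import re

normal = "hello \n world"   # \n is interpreted as one newline character
raw = r"hello \n world"     # the backslash and the 'n' stay two characters

print(len(normal))  # 13
print(len(raw))     # 14

# With a raw string the backslash reaches the regex engine unchanged.
pattern = r"\d+"    # one or more digits
numbers = re.findall(pattern, "order 66, aisle 7")
print(numbers)      # ['66', '7']
```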
Unicode Strings¶
In Python 2.x, plain string literals are byte strings (ASCII by default), and a u prefix creates a unicode string. Starting in Python 3.x, all strings are unicode by default.
>>> s = u'Hello\u0020World !'
>>> s
u'Hello World !'
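To make the text-versus-bytes distinction concrete, here is a minimal sketch that encodes a unicode string to UTF-8 bytes and decodes it back. The sample word is an assumption chosen for illustration; the u prefix is accepted on Python 2 and on Python 3.3 and later.

```python
word = u'caf\u00e9'               # four characters, the last one is non-ASCII
encoded = word.encode('utf-8')    # text -> bytes
decoded = encoded.decode('utf-8') # bytes -> text

print(len(word))      # 4 characters
print(len(encoded))   # 5 bytes, because the accented letter needs two bytes in UTF-8
print(decoded == word)
```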
string module¶
There is also a string module. Most functions in the module are accessible as instance methods on string types, although there are a few helper attributes that can’t be found elsewhere.
import string
print string.ascii_letters
#'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
print string.digits
#'0123456789'
print string.punctuation
#'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
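One possible use of these helper attributes is filtering characters. The sketch below removes punctuation from a string; strip_punctuation is a hypothetical helper name, not part of the string module, and the code is written to run on Python 3.

```python
import string


def strip_punctuation(text):
    """Return text with every character found in string.punctuation removed."""
    return ''.join(ch for ch in text if ch not in string.punctuation)


cleaned = strip_punctuation("Hello, world!")
print(cleaned)  # Hello world
```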
Operations¶
Strings can be concatenated with the + operator:
>>> word = 'Help' + 'A'
>>> word
'HelpA'
Strings can be repeated with the * operator:
>>> '<' + word*5 + '>'
'<HelpAHelpAHelpAHelpAHelpA>'
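The two operators combine naturally. As a rough sketch, the snippet below builds an underlined heading; the title text and variable names are made up for the example.

```python
title = 'Operations'
underline = '=' * len(title)          # repeat '=' once per character in the title
heading = title + '\n' + underline    # concatenate the two lines
print(heading)
```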
Loops¶
Accessing a string as an iterable in a for loop:
>>> for ch in s1:
...     print ch
...
a
b
c
Index Notation¶
>>> word = 'HelpA'
>>> word[4]
'A'
>>> word[0:2]
'He'
>>> word[2:4]
'lp'
>>> word[:2] # The first two characters
'He'
>>> word[2:] # Everything except the first two characters
'lpA'
>>> word[-1] # The last character
'A'
>>> word[-0] # (since -0 equals 0)
'H'
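A few slice properties follow from the examples above and can be checked directly. This is a small sketch written for Python 3; the variable names are illustrative only.

```python
word = 'HelpA'

# Splitting at any index and gluing the halves back gives the original string.
for i in range(len(word) + 1):
    assert word[:i] + word[i:] == word

last_two = word[-2:]    # negative indices count from the end
clipped = word[1:100]   # out-of-range slice bounds are clipped, not an error

print(last_two)  # pA
print(clipped)   # elpA
```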
Strings are immutable, so individual characters cannot be reassigned:
>>> word[0] = 'x'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
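Since a string cannot be changed in place, the usual approach is to build a new one. The sketch below shows two ways to do that, using slicing and the replace method; the variable names are hypothetical.

```python
word = 'HelpA'

changed = 'x' + word[1:]            # new string: 'x' followed by everything after index 0
replaced = word.replace('H', 'x')   # new string with every 'H' swapped for 'x'

print(changed)   # xelpA
print(replaced)  # xelpA
print(word)      # HelpA, the original is untouched
```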
Formatting¶
Justification:
>>> "42".rjust(10)
'        42'
>>> "42".center(10)
'    42    '
>>> "42".ljust(10)
'42        '
Zero fill padding:
>>> '12'.zfill(5)
'00012'
>>> '-3.14'.zfill(7)
'-003.14'
Removing extraneous white space:
>>> ' Get Rid of Whitespace, including newlines \n'.strip()
'Get Rid of Whitespace, including newlines'
>>> ' Get Rid of Whitespace, including newlines \n'.rstrip()
' Get Rid of Whitespace, including newlines'
>>> ' Get Rid of Whitespace, including newlines \n'.lstrip()
'Get Rid of Whitespace, including newlines \n'
Various methods:
>>> "hello, world".split(", ")
['hello', 'world']
>>> ", ".join(["hello", "world"])
'hello, world'
# Or the same thing, written as a function call with the string module imported (Python 2 only; string.join was removed in Python 3).
>>> string.join(["hello", "world"], ", ")
'hello, world'
>>> 'The happy cat ran home.'.upper()
'THE HAPPY CAT RAN HOME.'
>>> 'The happy cat ran home.'.find('cat')
10
>>> 'The happy cat ran home.'.find('kitten')
-1
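These methods are often chained. As a minimal sketch, the helpers below collapse whitespace with split and join, and wrap the find convention of returning -1 for "not found". Both function names are hypothetical, and the code targets Python 3.

```python
def normalize(text):
    """Collapse runs of whitespace into single spaces and upper-case the result."""
    # split() with no argument splits on any whitespace and drops empty pieces
    return ' '.join(text.split()).upper()


def contains(text, needle):
    """Return True if needle occurs in text, using find's -1 convention."""
    return text.find(needle) != -1


shout = normalize('  The happy   cat \n ran home. ')
print(shout)  # THE HAPPY CAT RAN HOME.
print(contains('The happy cat ran home.', 'kitten'))  # False
```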
Modulus (%) Operator¶
Python uses the % (modulus) operator to format strings. Within the string to format, a % character marks a placeholder, and the letter that follows it is the conversion type. To insert a string, use s, as in %s.
>>> state = 'California'
>>> 'It never rains in sunny %s.' % state
'It never rains in sunny California.'
With multiple inputs, we wrap in a tuple:
>>> "%s %s!" % ("hello", "world")
'hello world!'
Formatting a floating point output:
- zero filled
- at least 6 total characters (including the decimal point)
- precision of 3 significant digits (counted across the whole number, not only after the decimal point)
>>> "%06.3g" % 10.5
'0010.5'
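To see how the precision is interpreted, it can help to compare the g conversion with the f conversion. In the sketch below the sample values are assumptions picked for illustration; with f the precision counts digits after the decimal point, while with g it counts significant digits.

```python
general = "%06.3g" % 10.5     # 3 significant digits, zero filled to width 6
fixed = "%06.3f" % 10.5       # 3 digits after the decimal point, already 6 wide
large = "%.3g" % 1234.5       # g switches to exponent form when 3 digits are not enough

print(general)  # 0010.5
print(fixed)    # 10.500
print(large)    # 1.23e+03
```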
Referencing values by name from a mapping instead of by position from a tuple. The right-hand operand can be a tuple or a mapping, not both:
>>> pets = {'dog': 'Fido', 'cat': 'Claude'}
>>> 'My dog is named %(dog)s, and my cat %(cat)s.' % pets
'My dog is named Fido, and my cat Claude.'
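One advantage of named placeholders is that a key can appear more than once in the template. A short sketch, with a made-up template string:

```python
pets = {'dog': 'Fido', 'cat': 'Claude'}

# The same key may be referenced several times; each use looks it up in the mapping.
template = '%(dog)s! Here, %(dog)s! Leave %(cat)s alone.'
message = template % pets
print(message)  # Fido! Here, Fido! Leave Claude alone.
```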
.format() Method¶
Referencing the first positional argument more than once:
>>> "First, thou shalt count to {0} then to {0}".format(10)
'First, thou shalt count to 10 then to 10'
References keyword argument food:
>>> "I like {food}".format(food='pizza')
'I like pizza'
Accessing class attributes:
>>> class Elephant(object):
...     weight = 325
...
>>> e = Elephant()
>>> "Weight in tons {0.weight}".format(e)
'Weight in tons 325'
First element of keyword argument players:
>>> "Units destroyed: {players[0]}".format(players=[80])
'Units destroyed: 80'
Referencing items in a dict:
>>> d = {"dog": "cat"}
>>> "{0[dog]}".format(d)
'cat'
Or, more concisely, by unpacking the dict into keyword arguments:
>>> "{dog}".format(**d)
'cat'
Implicitly references the first positional argument:
>>> "Bring me a {}".format("shoe")
'Bring me a shoe'
New style formatting:
>>> '{0:<30}'.format('left aligned')
'left aligned                  '
>>> '{0:>30}'.format('right aligned')
'                 right aligned'
>>> '{0:^30}'.format('centered')
'           centered           '
# use '*' as a fill char
>>> '{0:*^30}'.format('centered')
'***********centered***********'
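Alignment specs can be combined in one template, for example to line up columns of a small report. The sketch below is one way to do it; the rows, column widths, and variable names are assumptions made for the example.

```python
rows = [('apple', 3, 0.5), ('kiwi', 12, 0.25)]

# name: left aligned in 8 columns, quantity: right aligned in 4,
# price: right aligned in 8 with two digits after the decimal point
template = '{0:<8}{1:>4}{2:>8.2f}'
lines = [template.format(name, qty, price) for name, qty, price in rows]

for line in lines:
    print(line)
```

Each formatted row is 20 characters wide, so the columns stay aligned regardless of the individual values.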