Python is an easy to learn, powerful programming language. It has efficient high-level data structures and a simple approach to object-oriented programming. To learn Python, first we need to learn how Python treats with data. In this article we will learn about data types and structures in Python 3 through several examples. You may find more about Python programming at:
Most common types in Python are:
str
int
, float
, complex
bool
NoneType
[]
()
{}
{key:value}
for element in ['a', True, None, 123, 0.777, 8j, [1,2], (1,2), {1,2}, {'key':1}]:
print(type(element))
## <class 'str'>
## <class 'bool'>
## <class 'NoneType'>
## <class 'int'>
## <class 'float'>
## <class 'complex'>
## <class 'list'>
## <class 'tuple'>
## <class 'set'>
## <class 'dict'>
To learn more about the built-in types, review Python standard types in here.
Strings, tuples and lists can be concatenated:
'some text ' + 'MORE TEXT'
## some text MORE TEXT
'repetition ' * 3
## repetition repetition repetition
# Lists
[1,3,'five'] + [7,9]
## [1, 3, 'five', 7, 9]
[1,3,'five'] * 2
## [1, 3, 'five', 1, 3, 'five']
# Tuples
(2,4,'six') + (8,10)
## (2, 4, 'six', 8, 10)
(2,4,'six') * 2
## (2, 4, 'six', 2, 4, 'six')
Lists, dictionaries and sets are mutable. Mutable objects can change their value but keep the same object (same id()
):
Numbers, strings and tuples are immutable. An object with a fixed value:
Strings, tuples, lists and dictionaries are subscriptable objects:
sw = 'Python'
sw[0]
## 'P'
sw[0:1]
## 'Py'
sw[::2]
## 'Pto'
names = ['turtle','polar bear','elephant','penguin']
names[:2]
## ['turtle', 'polar bear']
In general, numbers in the indexing square brackets can be in one of the following formats:
When negative numbers can be interpreted as inverse actions:
sw[-1] # last element
## 'n'
sw[::-1] # step 1 but in inverse order
## 'nohtyP'
names[:-2] # everything before the last two elements
## ['turtle', 'polar bear']
names[::-1]
## ['penguin', 'elephant', 'polar bear', 'turtle']
And empty clones could interpret as all:
As a summary review the following table:
Type | Concatenate | Subscriptable | Mutable |
---|---|---|---|
Number | No | No | No |
String | Yes | Yes | No |
Tuple | Yes | Yes | No |
List | Yes | Yes | Yes |
Dict | No | Yes | Yes |
Set | No | No | Yes |
We can use the following commands to convert objects to other types:
str()
create a stringint()
create an integerfloat()
create a floatcomplex()
create a complex numberbool()
create a booleanlist()
create a listtuple()
create a tupleset()
create a setdict()
create a dictionarystr(5) + ' five'
## 5 five
int('5') + 5
## 10
float('5') + 5
## 10.0
list((2,4,6))
## [2, 4, 6]
list({'a':1,'b':2})
## ['a', 'b']
list({'a':1,'b':2}.values())
## [1, 2]
tuple([1,3,5])
## (1,3,5)
set([1,3,5,1,3,5])
## {1, 3, 5}
dict([('a',1),('b',2),('c',3)])
## {'a': 1, 'b': 2, 'c': 3}
There are five major data structures in Python:
srt()
, ' '
list()
, []
tuple()
, ()
set()
, {}
dict()
, {key:value}
One of the way that data can be stored in Python is strings and as we discussed they are immutable and subscriptable objects that can be concatenated together. In python, any thing inside single or double quotes (' '
or " "
) considers as string. There are several methods available for strings and the following are some of the main methods for strings:
str.capitalize()
capitalizestr.title()
titlecasedstr.lower()
lowercasestr.upper()
uppercasestr.find(x)
find index of character xstr.index(x)
index of character x (similar to .find(x)
if x is in the string)str.count(x)
count how many times x repeatedstr.replace(x,y)
replace character x with ystr.split(x)
split an string to a list of strings based on the separator x (can be empty)str.join(x)
join list of strings or string x to make an string by a separator - opposite of .split()
str.startswith(x)
True if the string starts with x characterstr.endswith(x)
True if the string ends with x characterstr.strip()
removing whitespace from the beginning and endingstr.center('chr', num)
see an example in the belowFor example:
name = 'python'
name.startswith('p')
## True
name.capitalize()
## 'Python'
name.upper()
## 'PYTHON'
name.index('p')
## 0
name2 = name.replace('n', 'n3').replace('p', ',p')
name2
## ',python3'
nm = name + name2
nm
## 'python,python3'
nm_split = nm.split(',')
nm_split
## ['python', 'python3']
','.join(nm_split)
## 'python,python3'
'new york'.title()
## 'New York'
' This is a test '.center(30,'=')
## '======= This is a test ======='
A list is a set of objects enclosed by a set of square brackets ([]
). Lists are mutable, and their elements are usually homogeneous and are accessed by iterating over the list.
ls = [1,3,5,7]
ls[0] = 100 # Lists are mutable
ls
## [100, 3, 5, 7]
# Lists can hold any type of item
example = [1,True,None,['word',123],'test',(0,1),{'name id': 7}]
# Indexing
ls[1:3]
## [3, 5]
example[3][1]
## 123
Here are main lists methods:
list.append(x)
append xlist.extend(x)
or +=
extend/add xlist.insert(i,x)
insert x to index ilist.remove(x)
remove xlist.pop(i)
pop out and remove item at index i (similar to del(list[i])
)list.pop()
pop out and remove the last itemlist.sort()
sortlist.reverse()
reverse the orderlist.count(x)
count number of times x is repeatedlist.index(x)
find index of item xlist.copy()
copy listlist.clear()
clear lista = [1,4,5]
a += [2,3]
a
## [1, 4, 5, 2, 3]
a.append([6,7])
a
## [1, 4, 5, 2, 3, [6, 7]]
a.remove([6,7])
a
## [1, 4, 5, 2, 3]
a.extend([6,7])
a
## [1, 4, 5, 2, 3, 6, 7]
a.sort()
a
## [1, 2, 3, 4, 5, 6, 7]
a.reverse()
a
## [7, 6, 5, 4, 3, 2, 1]
a.count(5)
## 1
a.index(5)
## 2
a.clear()
a
## []
Be cautious when set a list equal to another list. It might change list unintentionally, see the example in below:
list1 = [1, 2, 3]
list2 = list1
list2 += [4, 5]
list1
## [1, 2, 3, 4, 5]
list2
## [1, 2, 3, 4, 5]
id(list1)
## 139724935851336
id(list2)
## 139724935851336
As we can see, by changing list2
, list1
is changing automatically since list1 = list2
- they have a same ID as well.
We can work on list2
without changing list1
by using .copy()
or clone [:]
. For instance:
list1 = [1,2,3]
list2 = list1.copy() # or list2 = list1[:]
list2 += [4,5]
list1
## [1, 2, 3]
list2
## [1, 2, 3, 4, 5]
id(list1)
## 139724935851400
id(list2)
## 139724935851208
Iterating through lists:
[x**2 for x in range(5)]
## [0, 1, 4, 9, 16]
[x**2 for x in range(5) if x**2 < 10]
## [0, 1, 4, 9]
nested = []
p = [1,2,3]
for i in p:
nested.append([(x,i) for x in p])
nested
## [[(1, 1), (2, 1), (3, 1)],
## [(1, 2), (2, 2), (3, 2)],
## [(1, 3), (2, 3), (3, 3)]]
A tuple consists of a number of values separated by commas. Though tuples may seem similar to lists, they are often used in different situations and for different purposes. Tuples are immutable, and usually contain a heterogeneous sequence of elements that are accessed via unpacking or indexing. Tuples may be input with or without surrounding parentheses.
tp = 1399,'hello',1400 # It might include parentheses or not
type(tp)
## <class 'tuple'>
tp
## (1399, 'hello', 1400)
tp[0]
## 1399
tp[0] = 1390 # Tuples are immutable
## Traceback (most recent call last):
## File "<stdin>", line 1, in <module>
## TypeError: 'tuple' object does not support item assignment
singleton = 'hello', # A tuple should include at least a comma (,)
singleton
## ('hello',)
Since tuples are immutable, there are only two methods:
tuple.count(x)
count number of times x repeatedtuple.index(x)
find index of item xBy tuples we can change the variables at the same time. Let assume we have two variables a = 10
and b = 20
and want a = b = 20
and b = a + b = 30
. For example:
a = 10
b = 20
a = b
b = a + b
(a,b)
## (20, 40) # we wanted (20, 30)
## Using tuple
a = 10
b = 20
a, b = (b, a + b) # or a, b = b, a + b
(a,b)
## (20, 30)
Dictionaries (also called dicts) are key data structure including a set of keys and values. Unlike sequences (e.g. lists and tuples) which are indexed by a range of numbers, dictionaries are indexed by unique and immutable keys. At the same time, values of the list could be any type (mutable or immutable) and duplicated. The main operations on a dictionary are storing a value with some key and extracting value by given key.
example = {}
type(example)
## <class 'dict'>
example['first key'] = 'value'
example[2] = 'two'
example['third key'] = 3
example
## {'first key': 'value', 2: 'two', 'third key': 3}
example['first key']
## 'value'
Here are some of dictionaries methods:
dict.update()
update/add itemsdict.popitem()
remove the last itemdict.pop(k)
remove item with key kdict.keys()
return keysdict.values()
return valuesdict.items()
return itemsdict.get(k)
return value for key kdict.copy()
copy dictdict.clear()
clear dictFor example:
state = {'new york': 'NY', 'missouri': 'MS', 'california': 'CA'}
list(state.items())
## [('new york', 'NY'), ('missouri', 'MS'), ('california', 'CA')]
list(state.keys())
## ['new york', 'missouri', 'california']
list(state.values())
## ['NY', 'MS', 'CA']
state.get('missouri')
## 'MS'
state.update({'missouri': 'MO', 'Texas': 'TX' })
state
## {'new york': 'NY', 'missouri': 'MO', 'california': 'CA', 'Texas': 'TX'}
state.get('missouri')
## 'MO'
Iterating through dicts:
## Example 1: counting and sorting
color = ['blue', 'red', 'white', 'green', 'green', 'red', 'red', 'white', 'red', 'green']
col = {x:color.count(x) for x in color}
col
## {'blue': 1, 'red': 4, 'white': 2, 'green': 3}
c = {k:col[k] for k in sorted(col,key = col.get,reverse = True)}
c
## {'red': 4, 'green': 3, 'white': 2, 'blue': 1}
## Example 2: make dict elements uppercase
{x.upper():y for x,y in c.items()}
## {'RED': 4, 'GREEN': 3, 'WHITE': 2, 'BLUE': 1}
## Example 3: filtering dicts by value
d = {'a': 10, 'b': 12, 'c': 20}
{x:y for x,y in d.items() if y >= 12}
## {'b': 12, 'c': 20}
For another example, let’s consider a list of dictionaries including production rate of two products, id 23 and id 35, in years 2005 and 2010:
production = [{'id':23,'year':2005,'rate':2305},{'id':35,'year':2005,'rate':3505},{'id':23,'year':2010,'rate':2310},{'id':35,'year':2010,'rate':3510}]
We can make a dictionary of production rates for each id_year
combination such that:
annual_rates = {}
for i in production:
annual_rates['%s_%s' % (i['id'],i['year'])] = i['rate']
annual_rates
## {'23_2005': 2305, '35_2005': 3505, '23_2010': 2310, '35_2010': 3510}
To make a dictionary of list of the production rates over years, we can use collections
module. defaultdict(list)
makes a default dictionary of lists such that:
import collections
annual_rates = collections.defaultdict(list)
for i in production:
annual_rates[i['id']].append(i['rate'])
dict(annual_rates)
## {23: [2305, 2310], 35: [3505, 3510]}
Now let’s find total production over years:
annual_rates_total = {}
for i in annual_rates:
annual_rates_total[i] = sum(annual_rates[i])
annual_rates_total
## {23: 4615, 35: 7015}
Also, the dict()
constructor can accept an iterator that returns a finite stream of (key, value)
tuples:
L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')]
dict(L)
## {'Italy': 'Rome', 'France': 'Paris', 'US': 'Washington DC'}
A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
st = {12,13,13,15,12,15}
st
## {12, 13, 15}
st[1]
## Traceback (most recent call last):
## File "<stdin>", line 1, in <module>
## TypeError: 'set' object does not support indexing
Note: to create empty sets use set()
because {}
considered as empty dictionary in python.
for element in [[], (), {}]:
print(type(element))
## <class 'list'>
## <class 'tuple'>
## <class 'dict'>
Sets have several methods including set operations such as:
set.add(x)
: add a member to the setset.update(x)
: add a set/list to a setset.remove(x)
: remove a member of the setset.pop()
: pop out and remove the first memberset.union(x)
: union of a set/list to a setset.intersection(x)
: intersection of a set/list to a setset.difference(x)
: difference of a set/list to a sets = {1,2,3}
t = {3,4,5}
s.union(t)
## {1, 2, 3, 4, 5}
s.intersection(t)
## {3}
s.difference(t)
## {1,2}
Iterating through sets: