What is a Data Structure ?
A Data structure is basically a mean of storing data in a organized manner that enables efficient access and modification. It can be one of the two following types :
- Primitive data structures.
- User-defined data structures.
Here is the list of Data Structures in Python :
As you can see, there are a lot of types. The challenge in this article, will be to as easiest as I can, explain each one ! With if I can with an example and a little of Python code.
“Bad programmers worry about the code. Good programmers worry about data structures and their relationships.” — Linus Torvalds, creator of Linux
Primitive Data Structures
Integers
Used to represent numeric data, and whole numbers from negative infinity to infinity, like 2, 4, or -1.
Float
Used for rational numbers, usually ending with a decimal figure, such as 1.11 or 3.14.
Strings
Collections of alphabets, words or other characters. (i.e “Lionel”)
Boolean
Built-in data type that can take up the values: True and False.
Non-Primitive Data Structures
Arrays
Arrays in Python are a compact way of collecting basic data types, all the entries in an array must be of the same data type.
Most of the time, people refers to list when they say Array in Python… There is a little difference as explained below.
Lists – Overview
Lists in Python are used to store collection of heterogeneous items.
With arrays, you can perform an operations on all its item individually easily, which may not be the case with lists. Thus, arrays can be very useful when dealing with a large collection of homogeneous data types. Moreover, as there is only one type in arrays, it can be more efficient to use Arrays in certain cases.
Lists – Stacks
A stack is a container of objects that are inserted and removed according to the Last-In-First-Out (LIFO) concept. They are declared as lists in Python.
Example : to conceptualize Stacks, think about a deck of cards. You will always draw from the top of the deck.
Lists – Queues
A queue is a container of objects that are inserted and removed according to the First-In-First-Out (FIFO) principle.
Example : The line at a ticket counter where people are catered according to their arrival sequence and hence the person who arrives first is also the first to leave.
Lists – Graphs
A graph in mathematics and computer science are networks consisting of nodes, also called vertices which may or may not be connected to each other. The lines or the path that connects two nodes is called an edge.
Example :Think about a mobile app which gives you the best route to follow. It will use that kind of Data Structure and will use algorithms to determine which route is the fastest.
Another example : Social network. The nodes are people and the edges are friendship.
Lists – Trees
A tree is a data structure composed of nodes It has the following characteristics:
- Each tree has a root node (at the top).
- The root node has zero or more child nodes.
- Each child node has zero or more child nodes, and so on.
Trees help in defining real world scenarios and are used everywhere from the gaming world to designing XML parsers and also the PDF design principle is based on trees.
Binary search trees allow fast lookup, addition and removal of items. Trees help in defining real world scenarios and are used everywhere from the gaming world to designing XML parsers and also the PDF design principle is based on trees.
Usage : Games like chess build a huge tree with all possible moves to analyse and apply heuristics to decide on an optimal move.
Tuples
Tuples are really close to list. The difference between tuples and list is that tuples are immutable, which means once defined you cannot delete, add or edit any values inside it.
Dictionary
Dictionaries are made up of key-value pairs. Key is used to identify the item and the value holds as the name suggests, the value of the item.
Example : a phone book
Sets
Sets are a collection of distinct (unique) objects. These are useful to create lists that only hold unique values in the dataset. Besides being able to add and remove elements to a set, there are a few other important set functions that work with two sets at once.
- Union — This combines all the items from two different sets and returns this as a new set (with no duplicates).
- Intersection — Given two sets, this function returns another set that has all items that are part of both sets.
- Difference — This returns a list of items that are in one set but NOT in a different set.
- Subset — This returns a boolean value that shows if all the elements in one set are included in a different set.
Files
Files are traditionally a part of data structures. And although big data is commonplace in the data science industry, a programming language without the capability to store and retrieve previously stored information would hardly be useful.