Insertion sort
Idea
The basic concept is similar to Selection Sort.
Considering there are two lists. One is already sorted,
and the other is unsorted, denoted Lsorted and Lunsorted respectively.
The key idea is to pick the element from Lunsorted one by one
and then insert them into the correct positions of Lsorted.
Suppose we have Lsorted=[3,8,34]
and Lunsorted=[23,2,67,34,97]:
- Step 1
- Pick 23 (which is the first element) from Lunsorted,
and insert it into Lsorted
- Find a position in Lsorted such that
all elements before it is less than or equal to 23
and all elements after it is greater than 23
- Start comparing it from the last(maximal) element
to the first(minimal) one in Lsorted
(Or you can do same thing from the first element to the last one)
- 34 is greater than 23, so we keep moving
- Next, we found that 8 is less than 23
- A-ha! 23 should be inserted between 8 and 34
- The Lsorted and Lunsorted are updated to [3,8,23,34]
and [2,67,34,97] respectively.
- Step 2
- Pick the current first element of Lunsorted, 2,
and insert it into Lsorted
- Same as the previous step, we start comparing 2 from the maximal element
of Lsorted to find the position to insert
- 34 is obviously larger than 2, so we should keep moving
- In this step, we can not find any element less than or equal to 2 after
the all elements in Lsorted are checked
- Thus, the 2 is the minimal value among these elements
- We should put 2 as the first element in Lsorted
- The Lsorted and Lunsorted are updated to [2,3,8,23,34]
and [67,34,97] respectively
- Step 3
- pick the current first element of Lunsorted, 67,
and then insert it into Lsorted
- Start comparing 67 with 34, we found 67 is greater
- It means that 67 is the maximal value among these elements
- Therefore, 67 should be inserted at the last position of Lsorted
- The Lsorted and Lunsorted are updated to
[2,3,8,23,34,67] and [34,97] respectively
- Step 4
- 34 is picked to compare with the elements in Lsorted.
- 67 is greater than 34, so go next
- 34 is equal to 34, so we stop here
- The picked 34 should be inserted between the existed 34 and 67
- so the Lsorted and Lunsorted are updated to
[2,3,8,23,34,34,67] and [97] respectively.
- Step 5
- 97 is picked to insert.
- 97 is greater than 67,
- so it should be put to the last position of Lsorted
- Finally, Lunsorted is empty
and Lsorted=[2,3,8,23,34,34,67,97].
How to find the inserted position
We can use the following method to find the first element
whose value is less than or equal to the picked element:
Position(L,x): i←N while i>0 and L[i]>x: i←i−1 return i
,where x is the element needs to be inserted,
L[i] is the ith element in the sorted list L,
and N=∣L∣ is the length of L.
After getting the position p=Position(L,x) given the element x,
we need to insert x between L[p] and L[p+1].
(If p=0, then we insert x as the first element L[1].
If p=N, then we insert x as the last element L[p+1].)
Dividing one list into unsorted list and sorted list
In implementation, we usually divide the source list L into two parts.
One is sorted, the other is unsorted.
They are denoted Lsorted and Lunsorted respectively.
This is better for memory usage than
creating another list to put the sorted results.
Suppose we have L=[73,24,37,9,97,29]=Lsorted∪Lunsorted,
where Lsorted and Lunsorted are initialized to []
and [73,24,37,9,97,29].
- First round
- 73 is picked, but there is nothing could be compared
- so we just put it into Lsorted
- Lsorted=[73] and Lunsorted=[24,37,9,97,29]
- now L=Lsorted∪Lunsorted=[73∣24,37,9,97,29]
- Second round
- 24 is picked and p=Position(Lsorted,24)=0
- so, we should insert 24 as the first element and update lists
- then Lsorted=[24,73] and Lunsorted=[37,9,97,29]
- now L=Lsorted∪Lunsorted=[24,73∣37,9,97,29]
- Third round
- 37 is picked and p=Position(Lsorted,37)=1
- so we should insert 37 between L[p]=L[1]=24 and L[p+1]=L[2]=73
- then Lsorted=[24,37,73] and Lunsorted=[9,97,29]
- now L=Lsorted∪Lunsorted=[24,37,73∣9,97,29]
- Fourth round
- 9 is picked and p=Position(Lsorted,9)=0
- Thus, Lsorted,Lunsorted are updated to
[9,24,37,73] and [97,29].
- now L=Lsorted∪Lunsorted=[9,24,37,73∣97,29]
- Fifth round
- 97 is picked and p=Position(Lsorted,97)=4=∣Lsorted∣
- so we should put 97 as the last element of the Lsorted
- then Lsorted=[9,24,37,73,97] and Lunsorted=[29]
- now L=Lsorted∪Lunsorted=[9,24,37,73,97∣29].
- Final round
- 29 is picked and p=Position(Lsorted,29)=2
- so we should insert 29 between L[p]=L[2]=24 and L[p+1]=L[3]=37
- then Lsorted=[9,24,29,37,73,97] and Lunsorted=[] is empty
- now L=Lsorted∪Lunsorted=[9,24,29,37,73,97].
Algorithm
InsertionSort(L): for i←2 to ∣L∣: j←i while j>1 and L[j−1]>L[j]: swap L[j−1] and L[j] j←j−1
The above method will divide L into two parts.
L[1...i−1]=Lsorted is sorted, and L[i...N]=Lunsorted is unsorted,
where N=∣L∣ is the length of L.
The L[i] will be picked to insert into Lsorted iteratively.
- When i=2
- Lsorted=L[1] and Lunsorted=L[2...N]
- The goal in this round is to insert the L[2] into Lsorted
- The L[2] is picked and compare with L[1]
- If L[2]<L[1], then we swap them
- Otherwise, do nothing
- Then, Lsorted=L[1...2] is sorted and Lunsorted=L[3...N]
- When i=3
- Lsorted=L[1...2] and Lunsorted=L[3...N]
- The goal in this round is to insert the L[3] into Lsorted
- The L[3] is picked
- If L[3]>=L[2], it means that L[1...3] is sorted, so we don't need to do anything
- Otherwise(L[3]<L[2]), swap L[3] and L[2]
and check whether it needs to swap again if L[2]<L[1]
- After finishing checking, Lsorted=L[1...3] is sorted
and Lunsorted=L[4...N]
- When i=k
- Lsorted=L[1...k−1] and Lunsorted=L[k...N]
- The goal in this round is to insert the L[k] into Lsorted
- The L[k] is picked to compare with the elements one by one in Lsorted,
from the maximal(L[k−1]) to minimal item(L[1]), to find a place to insert
- After finishing checking, Lsorted=L[1...k] is sorted
and Lunsorted is L[k+1...N]
- When i=N
- Lsorted=L[1...N−1] and Lunsorted=L[N]
- The goal in this round is to insert the L[N] into Lsorted
- In the same way, the L[1...N] is sorted after finishing the procedure
- so Lsorted is updated to L[1...N] and Lunsorted=[] is empty
Another method without swapping
InsertionSort(L): for i←2 to ∣L∣: c←L[i] j←i while j>1 and L[j−1]>c: L[j]=L[j−1] j=j−1 L[j]=c
Proof
Proof by mathematical induction
After each iteration for i in InsertionSort,
the L[1...i] is sorted array.
We need to prove this statement is true.
- when i=2:
- Same as the above explanation
- The assumption is hold
- when i=k:
- Assume the statement is hold when i=k
- L[1...k] is sorted array
- when i=k+1
- If L[k+1]>L[k], then L[1...k+1] is naturally sorted
so the proof is done
- Otherwise, the L[k+1] is swapped with L[k]
- Now L[1...k−1] is sorted and L[k+1]>L[k](after swapping!)
- Next, we apply this algorithm to L=L[1...k−1]∪L[k] and now i=k
- The statement is hold when i=k,
so L[1...k] is sorted after applying the algorithm
- Now L[1...k] is sorted and L[k+1]>L[k], so the proof is done
Complexity
The time complexity depends on the speed to find the inserted position.
The more iterations to find the value of Position(L,x) need,
the more time it takes.
The worst case is that we need to go through whole Lsorted to find correct
positions to insert. It happens when the list is arranged from maximal to
minimal values(e.g.,[5,4,3,2,1]).
In this case, if the length of list is N, we need to search
0+1+2+...+(N−1)=2N⋅(N−1)=21⋅N2−21N
times to move all the items into Lsorted.
Thus, the complexity is O(N2).
Implementation
See the files on gist here.