
xiaowang1105 / algorithms-in-python

Licence: MIT License
Some famous algorithms implemented in Python


algorithms-in-python

In this repository, I use Python to implement some famous algorithms. The algorithms are arranged according to the design strategy they use. Each algorithm has an explanation of the problem it attempts to solve and some relevant resources.
(The goal of this repository is to create a beautiful README for all of these brilliant algorithms that I have reviewed.)

Content:

1.0 - Divide and Conquer

In this section, I will talk about the famous divide-and-conquer strategy and show some applications of it.

1.1 - Integer Multiplication Problem

The standard procedure for multiplying two n-digit numbers requires a number of elementary operations proportional to O(n^2). The Karatsuba algorithm reduces the running time to at most O(n^log2(3)), roughly O(n^1.585).

Key idea
The basic step of Karatsuba's algorithm is a formula that allows one to compute the product of two large numbers x and y using three multiplications of smaller numbers, each with about half as many digits as x or y, plus some additions and digit shifts.
Properties
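As a sketch of the three-multiplication step, here is a minimal Karatsuba multiplication in Python (the digit-splitting details and function name are my own choices):

```python
def karatsuba(x, y):
    """Multiply two non-negative integers with Karatsuba's three-product trick."""
    if x < 10 or y < 10:                  # base case: a single-digit factor
        return x * y
    n = max(len(str(x)), len(str(y)))
    base = 10 ** (n // 2)
    a, b = divmod(x, base)                # x = a*base + b
    c, d = divmod(y, base)                # y = c*base + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    # (a+b)(c+d) - ac - bd = ad + bc, so only three recursive products are needed
    ad_plus_bc = karatsuba(a + b, c + d) - ac - bd
    return ac * base * base + ad_plus_bc * base + bd
```

The digit shifts correspond to the multiplications by `base` and `base*base` in the last line.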

1.2 - Merge Sort (and Insertion Sort)

  • Useful Link
    Before we discuss merge sort, we will analyze the insertion sort algorithm and its running time, to better appreciate merge sort. The picture below shows the intuitive idea behind insertion sort: it imitates how playing cards are arranged by size in one's hand. When we receive a card, we immediately place it in order relative to the cards we already hold.

insert

The worst-case running time for this algorithm is O(n^2).
The gif shows how merge sort works: merge

Key idea
Divide the unsorted list into n sublists, each containing one element (a list of one element is considered sorted), and repeatedly merge sublists to produce new sorted sublists until only one sublist remains. This is the sorted list.
Properties
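The divide-and-merge idea above can be sketched in a few lines of Python (a functional version that returns a new sorted list):

```python
def merge_sort(a):
    """Sort a list in O(n log n) by recursively sorting halves and merging."""
    if len(a) <= 1:
        return a[:]
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # take the smaller head each time
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])                   # one of these two is empty
    merged.extend(right[j:])
    return merged
```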

1.3 - Count Inversions

  • Useful Link
    In fact, this can be treated as an application of Merge Sort: every time we perform the merge operation in merge sort, we can implicitly count the inversions.
    inversion

Key idea
Like the figure above, whenever we take an element from the right sub-array during the merge operation, that element forms an inversion with each of the (length of left sub-array − index of left element) elements remaining in the left sub-array.
As the algorithm progresses, adding up all these counts gives us the total number of inversions.
Properties
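A sketch of this merge-based counting; the `len(left) - i` increment is exactly the "remaining left elements" observation above:

```python
def sort_and_count(a):
    """Return (sorted copy of a, number of inversions) in O(n log n)."""
    if len(a) <= 1:
        return a[:], 0
    mid = len(a) // 2
    left, inv_left = sort_and_count(a[:mid])
    right, inv_right = sort_and_count(a[mid:])
    merged, i, j, cross = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
            cross += len(left) - i    # right element jumps over all remaining left elements
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv_left + inv_right + cross
```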

1.4 - Maximum Subarray

  • Useful Link
    The maximum subarray problem is the task of finding the contiguous subarray within a one-dimensional array of numbers, a[1..n], that has the largest sum.
    Key idea
    If we use the divide-and-conquer strategy on an array A[low..high] with middle point mid, the subarray A[i..j] that we want to find must fall into one of three cases:
  • A[i..j] lies entirely in A[low..mid]
  • A[i..j] lies entirely in A[mid+1..high]
  • A[i..j] crosses the midpoint, i.e. i <= mid < j
    So, our job is to find the largest subarray crossing the midpoint and then choose the largest result among these three cases. cross

Properties
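A minimal divide-and-conquer sketch of the three-case analysis, returning (start index, end index, sum):

```python
def max_crossing(a, low, mid, high):
    """Best subarray forced to cross the midpoint: grow left and right from mid."""
    left_sum, s, best_left = float('-inf'), 0, mid
    for i in range(mid, low - 1, -1):
        s += a[i]
        if s > left_sum:
            left_sum, best_left = s, i
    right_sum, s, best_right = float('-inf'), 0, mid + 1
    for j in range(mid + 1, high + 1):
        s += a[j]
        if s > right_sum:
            right_sum, best_right = s, j
    return best_left, best_right, left_sum + right_sum

def max_subarray(a, low=0, high=None):
    if high is None:
        high = len(a) - 1
    if low == high:                          # single element: trivially the best
        return low, high, a[low]
    mid = (low + high) // 2
    left = max_subarray(a, low, mid)         # case 1: entirely left
    right = max_subarray(a, mid + 1, high)   # case 2: entirely right
    cross = max_crossing(a, low, mid, high)  # case 3: crosses mid
    return max(left, right, cross, key=lambda t: t[2])
```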

2.0 - Randomized Algorithms

In this section, I will talk about two algorithms that use randomness internally.

2.1 - Quick Sort

Key idea
Quicksort first divides a large array into two smaller sub-arrays relative to a randomly chosen pivot element: the low elements and the high elements. Quicksort then recursively sorts the sub-arrays. So the key point in quicksort is choosing the partition (pivot) element.
Properties

  • Worst case performance O(n^2)
  • Best case performance O(n log n) or O(n) with three-way partition
  • Average case performance O(n log n)
    Back to Top
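A sketch of randomized quicksort with a Lomuto-style in-place partition (the partition details are my own choice; only the random pivot matters for the average-case analysis above):

```python
import random

def quicksort(a, lo=0, hi=None):
    """In-place quicksort with a uniformly random pivot; returns a for convenience."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    p = random.randint(lo, hi)            # random pivot defeats adversarial inputs
    a[p], a[hi] = a[hi], a[p]             # move pivot to the end
    pivot, store = a[hi], lo
    for k in range(lo, hi):               # Lomuto partition
        if a[k] < pivot:
            a[k], a[store] = a[store], a[k]
            store += 1
    a[store], a[hi] = a[hi], a[store]     # pivot lands in its final position
    quicksort(a, lo, store - 1)
    quicksort(a, store + 1, hi)
    return a
```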

2.2 - Karger Algorithm

  • Useful Link
    The idea of the algorithm is based on the concept of contraction of an edge (u,v) in an undirected graph G=(V,E). Informally speaking, the contraction of an edge merges the nodes u and v into one, reducing the total number of nodes of the graph by one. The figure below shows how contraction works: in the left subfigure, two bold black nodes are fused into one (shown in the right subfigure). karger

Key idea
By repeating the contraction algorithm O(n^2 log n) times with independent random choices and returning the smallest cut found, the probability of not finding a minimum cut is at most 1/n.
Properties

  • With high probability, we can find all min cuts within O(n^2 log n) runs of the contraction algorithm
  • The probability of missing a min cut after these repetitions is at most 1/n
    Back to Top
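A compact sketch of repeated random contraction, using a union-find to merge endpoints (the trial count and the edge-list representation are my own choices; only the random contraction itself is Karger's idea):

```python
import random

def karger_min_cut(edges, trials=None):
    """Estimate the min cut of a connected undirected graph given as (u, v) edge pairs."""
    nodes = {u for e in edges for u in e}
    n = len(nodes)
    if trials is None:
        trials = n * n                       # enough repetitions for high success probability
    best = float('inf')
    for _ in range(trials):
        parent = {v: v for v in nodes}
        def find(v):
            while parent[v] != v:
                parent[v] = parent[parent[v]]  # path halving
                v = parent[v]
            return v
        groups = n
        pool = edges[:]
        random.shuffle(pool)                 # contract edges in random order
        for u, v in pool:
            if groups == 2:                  # stop when two super-nodes remain
                break
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                groups -= 1
        cut = sum(1 for u, v in edges if find(u) != find(v))
        best = min(best, cut)
    return best
```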

3.0 - Data Structures

Presenting data structures as an independent section may be misleading; nevertheless, I will introduce perplexing problems that are elegantly solved by data structures. Some of these data structures rely on algorithm design strategies that have not been reviewed yet.

3.1 - Queue and Breadth First Search

  • Useful Link1 Link2
    A queue, also known as a FIFO, is organized by the principle of first in, first out: a method for organizing and manipulating a data buffer in which the oldest (first) entry, the 'head' of the queue, is processed first. queue

Key idea
BFS is used to count the nodes reachable from a source node in an undirected or directed graph. Reachable nodes are printed in the order they are found. A queue is used to store the nodes colored grey (see the gif below). The term "breadth" in BFS means that the search always discovers reachable nodes at the shortest distance first: the frontier expands the border between the visited nodes and the undiscovered nodes.

bfs

Properties

  • Running Time: O(V+E), where V is the number of vertices and E the number of edges
    Back to Top
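The queue-driven search above can be sketched with `collections.deque` (the adjacency-dict representation is my own choice):

```python
from collections import deque

def bfs(graph, source):
    """Return the nodes reachable from source in discovery order.
    graph: adjacency dict, e.g. {u: [v1, v2, ...]}."""
    visited, order = {source}, [source]
    queue = deque([source])
    while queue:
        u = queue.popleft()                # oldest (greyed earliest) node first
        for v in graph.get(u, []):
            if v not in visited:
                visited.add(v)             # colour it grey: seen but not expanded
                order.append(v)
                queue.append(v)
    return order
```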

3.2 - Stack and Depth First Search

  • Useful Link1 Link2
    A stack, also known as a LIFO, has the property of last in, first out: the most recently added entry is processed first.
    stack

Key idea
DFS is used on a directed graph: it tells how many nodes a source node can reach and prints them in the order they are found. We use a stack to store the nodes we classify as start points for further exploration. The "depth" in its name means that the algorithm goes as deep as it can from a given source, and when it reaches a dead end, it backtracks toward the start node.

depth

Properties

  • Running Time: O(V+E), where V is the number of vertices and E the number of edges
    Back to Top
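An iterative sketch that makes the stack explicit (a recursive version would hide the same stack inside the call stack):

```python
def dfs(graph, source):
    """Return the nodes reachable from source in depth-first discovery order.
    graph: adjacency dict, e.g. {u: [v1, v2, ...]}."""
    visited, order, stack = set(), [], [source]
    while stack:
        u = stack.pop()                    # most recently pushed node first
        if u in visited:
            continue
        visited.add(u)
        order.append(u)
        # push neighbours in reverse so the first-listed neighbour is explored first
        for v in reversed(graph.get(u, [])):
            if v not in visited:
                stack.append(v)
    return order
```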

3.3 - Heap and Median Maintenance

  • Useful Link1 Link2
    A heap is a specialized tree-based data structure that satisfies the heap property: if P is a parent node of C, then the key (the value) of P is either greater than or equal to (in a max heap) or less than or equal to (in a min heap) the key of C. The node at the "top" of the heap (with no parent) is called the root node.

heap

The Median Maintenance problem: as integers are read from a data stream, find the median of the elements read so far in an efficient way. For simplicity, assume there are no duplicates.

Key idea
We can use a max heap on the left side to hold elements that are less than the effective median, and a min heap on the right side to hold elements that are greater than the effective median. Whenever the difference between the sizes of the two heaps reaches 2, we move the root of the larger heap into the smaller heap.

Properties
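A sketch of the two-heap scheme with Python's `heapq` (which only provides min-heaps, so the low half stores negated values; reporting the lower of the two middle elements for even sizes is my own convention):

```python
import heapq

class MedianMaintenance:
    """Running median: max-heap (negated) for the low half, min-heap for the high half."""
    def __init__(self):
        self.low = []    # max-heap via negation: -low[0] is the largest low element
        self.high = []   # min-heap: high[0] is the smallest high element

    def add(self, x):
        if not self.low or x <= -self.low[0]:
            heapq.heappush(self.low, -x)
        else:
            heapq.heappush(self.high, x)
        # rebalance so the size difference never reaches 2
        if len(self.low) > len(self.high) + 1:
            heapq.heappush(self.high, -heapq.heappop(self.low))
        elif len(self.high) > len(self.low) + 1:
            heapq.heappush(self.low, -heapq.heappop(self.high))

    def median(self):
        if len(self.low) >= len(self.high):
            return -self.low[0]
        return self.high[0]
```

Both heaps stay within one element of each other, so each `add` is O(log n) and `median` is O(1).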

3.4 - Strongly Connected Component

  • Useful Link
    A directed graph is called strongly connected if there is a path in each direction between each pair of vertices of the graph. In a directed graph G that may not itself be strongly connected, a pair of vertices u and v are said to be strongly connected to each other if there is a path in each direction between them.

scc

Key idea
Through simple observation, we find that the transpose of a graph has the same SCCs as the original graph. So we run DFS twice (Kosaraju's algorithm). The first time, we run it on G and compute the finishing time of each vertex. Then we run DFS on G^T, but in the main loop of DFS we consider the vertices in order of decreasing finishing time.
Properties

  • Running Time: O(V+E), where V is the number of vertices and E the number of edges
    Back to Top
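A sketch of the two-pass procedure described above, for graphs given as adjacency dicts (the recursive helpers are my own structuring):

```python
def kosaraju_scc(graph):
    """Kosaraju: DFS on G for finishing order, then DFS on G^T in decreasing order."""
    order, visited = [], set()
    def dfs1(u):
        visited.add(u)
        for v in graph.get(u, []):
            if v not in visited:
                dfs1(v)
        order.append(u)                       # record finishing time
    for u in graph:
        if u not in visited:
            dfs1(u)
    gt = {u: [] for u in graph}               # build the transpose G^T
    for u in graph:
        for v in graph[u]:
            gt.setdefault(v, []).append(u)
    sccs, assigned = [], set()
    def dfs2(u, comp):
        assigned.add(u)
        comp.append(u)
        for v in gt.get(u, []):
            if v not in assigned:
                dfs2(v, comp)
    for u in reversed(order):                 # decreasing finishing time
        if u not in assigned:
            comp = []
            dfs2(u, comp)
            sccs.append(comp)                 # each second-pass tree is one SCC
    return sccs
```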

3.5 - Disjoint-set and SCC

  • Useful Link
    A disjoint-set data structure, also called a union–find data structure or merge–find set, is a data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets.
    A naive disjoint-set supports two main operations, Make-Set and Union. Make-Set makes every vertex an independent group; Union puts two vertices into one group.

union

Key idea
In an undirected graph, we can use this data structure to find out how many connected components there are. The figure below shows how it works.

scc_uf

Properties

  • We can use the weighted-union heuristic to save time when calling the Find-Set operation
  • Path compression can greatly improve the time efficiency of union-find
    Back to Top
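A sketch of union-find with union by rank and path compression, plus a small helper that counts connected components:

```python
class DisjointSet:
    """Union-find with union by rank and path compression."""
    def __init__(self, elements):
        self.parent = {x: x for x in elements}   # Make-Set: everyone is its own root
        self.rank = {x: 0 for x in elements}

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])   # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False                          # already in the same group
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx                       # attach the shallower tree
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

def count_components(vertices, edges):
    """Number of connected components of an undirected graph."""
    ds = DisjointSet(vertices)
    for u, v in edges:
        ds.union(u, v)
    return len({ds.find(v) for v in ds.parent})
```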

4.0 - Greedy Algorithms

In this section, I'm going to introduce greedy algorithms, a powerful algorithm design strategy.
From Wikipedia, a greedy algorithm is an algorithmic paradigm that follows the problem solving heuristic of making the locally optimal choice at each stage[1] with the hope of finding a global optimum. In many problems, a greedy strategy does not in general produce an optimal solution, but nonetheless a greedy heuristic may yield locally optimal solutions that approximate a global optimal solution in a reasonable time.

4.1 - Schedule Activities

In this activity scheduling problem, every activity has its own weight and length, and our goal is to minimize the weighted sum of completion times, i.e. the sum of weight × completion time over all activities. It is an easy yet great example of how a greedy algorithm works, and it admits an elegant proof using the exchange argument technique.
Key idea
If we sort activities by the ratio weight/length in decreasing order, we can prove that no other arrangement can do better. act

As the figure above shows, consider the cost contributed by two adjacent activities (i, j) that are ordered differently in the two arrangements. The cost in the greedy arrangement is smaller than in the alternative arrangement by the value wi*lj − wj*li, which is greater than or equal to 0.

Properties
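The greedy rule above can be sketched as follows: sort jobs by decreasing weight/length ratio and accumulate the weighted completion times (representing jobs as (weight, length) pairs is my own choice):

```python
def schedule(jobs):
    """jobs: list of (weight, length) pairs.
    Returns (greedy ordering, weighted sum of completion times)."""
    order = sorted(jobs, key=lambda wl: wl[0] / wl[1], reverse=True)
    t, cost = 0, 0
    for w, l in order:
        t += l              # completion time of this job
        cost += w * t       # weighted completion time
    return order, cost
```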

4.2 - Activity selection

  • Useful Link
    In this problem, every activity has its own start time and finish time. Our goal is to select a maximum-size subset of mutually compatible activities.

selection

Key idea
We sort the activities by finish time. The algorithm then repeatedly picks the next activity whose start time is no earlier than the previously selected activity's finish time.
Properties

  • The recursive activity selection runs in Θ(n) time (after sorting by finish time)
    Back to Top
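The earliest-finish-time rule above can be sketched as:

```python
def select_activities(activities):
    """Greedy activity selection: sort by finish time, then take each activity
    whose start time is at least the last selected finish time.
    activities: list of (start, finish) pairs."""
    chosen, last_finish = [], float('-inf')
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```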

4.3 - Huffman Coding

  • Useful Link
    A Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. For example, the table below shows the letters' frequencies in a "book".

freq

One way to encode this book is to use fixed length coding. As shown below:

fixed

With Huffman coding, the actual tree structure looks like this:

huff

Key idea
We maintain a collection of binary trees and repeatedly create a new node as the parent of the two least-frequent remaining nodes; the key of this new node is the sum of the keys of its two children. We repeat this until a single tree covering the whole "book" remains.

huff_works

Properties

  • Procedure HUFFMAN produces an optimal prefix code
    Back to Top
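A sketch of the merging procedure using a min-heap of partial code tables (storing code strings instead of explicit tree nodes is my own shortcut; the counter field only breaks ties so that dicts are never compared):

```python
import heapq

def huffman_codes(freq):
    """Build prefix codes from a {symbol: frequency} map with repeated merging."""
    heap = [(f, i, {s: ''}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:                           # degenerate single-symbol alphabet
        return {s: '0' for s in heap[0][2]}
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)          # two least-frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: '0' + code for s, code in c1.items()}   # left child gets 0
        merged.update({s: '1' + code for s, code in c2.items()})  # right gets 1
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]
```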

4.4 - Dijkstra Algorithm

Dijkstra's algorithm is an algorithm for finding the shortest paths between nodes in a graph. However, it has one prerequisite: all edge weights have to be greater than or equal to 0.

dij

Key idea
Separate the nodes into two groups, one of which is marked as explored. At each step, we move the unexplored node with the smallest tentative distance into the explored group and update the distances of its unexplored neighbors.

Properties

  • Running time O(E + V log V), based on a min-priority queue implemented with a Fibonacci heap
    Back to Top
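A sketch using `heapq` as the priority queue (a binary heap, which gives O(E log V) rather than the Fibonacci-heap bound; stale queue entries are skipped lazily instead of using decrease-key):

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from source.
    graph: adjacency dict {u: [(v, weight), ...]} with non-negative weights."""
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue                          # stale entry: u already finalized
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd                  # relax edge (u, v)
                heapq.heappush(pq, (nd, v))
    return dist
```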

4.5 - Prim Algorithm

  • Useful Link
    Prim's algorithm is a greedy algorithm that finds a minimum spanning tree for a weighted undirected graph. Very similar to Dijkstra's algorithm, we maintain two sets, an explored one and an unexplored one. At every step, we absorb only the vertex that is connected to the explored set by the smallest-weight edge. This is shown very clearly in the figure below:

prim

Key idea
The algorithm may informally be described as performing the following steps:

1. Initialize a tree with a single vertex, chosen arbitrarily from the graph.
2. Grow the tree by one edge: of the edges that connect the tree to vertices not yet in the tree, find the minimum-weight edge, and transfer it to the tree.
3. Repeat step 2 until all vertices are in the tree.

Properties

  • Running time:
  • adjacency matrix, linear search: O(V^2)
  • binary heap and adjacency list: O(E log V)
  • Fibonacci heap and adjacency list: O(E + V log V)
    Back to Top
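The three steps above can be sketched with a binary heap of candidate edges leaving the tree:

```python
import heapq

def prim_mst(graph, start):
    """MST of a connected undirected weighted graph {u: [(v, w), ...]}.
    Returns (list of tree edges (u, v, w), total weight)."""
    in_tree, mst, total = {start}, [], 0
    edges = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(edges)
    while edges:
        w, u, v = heapq.heappop(edges)   # lightest edge leaving the tree
        if v in in_tree:
            continue                     # both endpoints already inside: skip
        in_tree.add(v)
        mst.append((u, v, w))
        total += w
        for x, wx in graph[v]:           # new candidate edges from the new vertex
            if x not in in_tree:
                heapq.heappush(edges, (wx, v, x))
    return mst, total
```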

4.6 - Kruskal Algorithm and Clustering Problem

  • Useful Link
    Kruskal's algorithm is another greedy algorithm that finds a minimum spanning tree for a weighted undirected graph. Instead of maintaining a single tree like Prim's algorithm, it maintains a forest.

kruskal

Key idea
Very similar to the connected-components use of union-find above, we can stop the algorithm early to control the number of groups in the graph, which is to say we can use it to cluster the graph.

Properties
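A sketch of Kruskal with early stopping at k clusters; returning the spacing (the lightest remaining cross-cluster edge, a common clustering score) is my own choice of output:

```python
def kruskal_clusters(vertices, edges, k):
    """Kruskal with early stopping once k clusters remain.
    edges: list of (weight, u, v). Returns (cluster count, spacing)."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]    # path halving
            v = parent[v]
        return v
    clusters = len(vertices)
    edges = sorted(edges)                    # cheapest edges first
    i = 0
    while clusters > k and i < len(edges):
        w, u, v = edges[i]
        i += 1
        ru, rv = find(u), find(v)
        if ru != rv:                         # merging two trees of the forest
            parent[ru] = rv
            clusters -= 1
    spacing = min((w for w, u, v in edges if find(u) != find(v)), default=None)
    return clusters, spacing
```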

5.0 - Dynamic Programming

In this section, I'm going to introduce dynamic programming, a powerful algorithm design strategy.
From Wikipedia, dynamic programming (also known as dynamic optimization) is a method for solving a complex problem by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions.

5.1 - Rod Cutting

  • Useful Link
    Given a rod of length n inches and an array of prices that contains the prices of all pieces of size at most n, determine the maximum value obtainable by cutting up the rod and selling the pieces.
    The following table shows the relationship between price and length.

len_price

So, if the length of the rod is 8 and the values of the different pieces are as given in the table, then the maximum obtainable value is 22 (by cutting the rod into two pieces of lengths 2 and 6).

Key idea
We view a decomposition as consisting of a first piece of length i cut off the left-hand end, and a right-hand remainder of length n − i, giving the recurrence r_n = max over 1 <= i <= n of (p_i + r_(n-i)). So, the pseudocode looks like:

rod_code

The recursion tree showing recursive calls resulting from a call CUT_ROD(p, n) looks like:

rod_tree

In order to avoid repeating the computation for small sub-problems, we memoize their values in an array.

Properties

  • Time Complexity of the above implementation is O(n^2)
    Back to Top
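The recurrence above, memoized with `functools.lru_cache` instead of an explicit array:

```python
from functools import lru_cache

def cut_rod(prices, n):
    """Maximum revenue from a rod of length n.
    prices[i] is the price of a piece of length i + 1."""
    @lru_cache(maxsize=None)           # memoization: each sub-length solved once
    def best(length):
        if length == 0:
            return 0
        # try every first-piece length i; the remainder is a smaller sub-problem
        return max(prices[i - 1] + best(length - i)
                   for i in range(1, length + 1))
    return best(n)
```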

5.2 - Matrix Chain Multiplication

  • Useful Link
    Matrix chain multiplication (or Matrix Chain Ordering Problem, MCOP) is an optimization problem that can be solved using dynamic programming. Given a sequence of matrices, the goal is to find the most efficient way to multiply these matrices. The problem is not actually to perform the multiplications, but merely to decide the sequence of the matrix multiplications involved.

matrix_mul

Key idea
Optimal structure:

mul_opt

Properties

  • Time Complexity of the above implementation is O(n^3)
    Back to Top
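A bottom-up sketch of the optimal structure above, filling the cost table by increasing chain length:

```python
def matrix_chain_order(dims):
    """Minimum number of scalar multiplications to multiply a chain of matrices.
    Matrix i has dimensions dims[i-1] x dims[i], for i = 1..n (n = len(dims) - 1)."""
    n = len(dims) - 1
    # m[i][j] = cheapest cost of multiplying matrices i..j (1-indexed)
    m = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):             # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            # try every split point k between i and j
            m[i][j] = min(m[i][k] + m[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                          for k in range(i, j))
    return m[1][n]
```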

5.3 - Longest Common Subsequence

  • Useful Link
    The longest common subsequence (LCS) problem is the problem of finding the longest subsequence common to all sequences in a set of sequences (often just two sequences).

Key idea
From CLRS, the optimal structure for this problem is:

long_op

Properties

  • Time Complexity of the above implementation is Θ(mn)
    Back to Top
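A sketch of the CLRS table, plus a backtrack that recovers one longest common subsequence:

```python
def lcs(x, y):
    """Return (length of LCS, one LCS string) of sequences x and y."""
    m, n = len(x), len(y)
    # c[i][j] = LCS length of x[:i] and y[:j]
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1     # extend a common subsequence
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    out, i, j = [], m, n                          # backtrack through the table
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1]); i -= 1; j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return c[m][n], ''.join(reversed(out))
```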

5.4 - Floyd-Warshall Algorithm

  • Useful Link
    Floyd–Warshall algorithm is an algorithm for finding shortest paths in a weighted graph with positive or negative edge weights (but with no negative cycles).

Key idea
This algorithm is based on a very intuitive observation. Suppose we have a subset {1, 2, 3, ..., k} (denoted V(k)) of the original vertex set {1, 2, 3, ..., n}. If p is a shortest path from i to j using only intermediate vertices in V(k), there are two cases. First, if k is not on p, then p must also be a shortest path using only V(k-1). Second, if k is on p, then we can separate the path into two: P1 from i to k and P2 from k to j. P1 must be a shortest path from i to k using V(k-1), and P2 must be a shortest path from k to j using V(k-1).

Floyd-Warshall_example

Properties

  • Time Complexity of the above implementation is Θ(n^3)
    Back to Top
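The k-as-intermediate-vertex observation translates almost directly into three nested loops:

```python
def floyd_warshall(n, edges):
    """All-pairs shortest paths for vertices 0..n-1.
    edges: list of (u, v, w) directed edges; no negative cycles assumed."""
    INF = float('inf')
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)
    for k in range(n):            # allow k as an intermediate vertex (grow V(k))
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```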

6.0 - Approximation Algorithms for NPC

From Wikipedia, an NP-complete decision problem is one belonging to both the NP and the NP-hard complexity classes. In this context, NP stands for "nondeterministic polynomial time". The set of NP-complete problems is often denoted by NP-C or NPC.

In this section, I am going to introduce three very famous NPC problems and explain approximation algorithms to approach them.

6.1 - Vertex Cover

  • Useful Link
    A vertex cover (sometimes node cover) of a graph is a set of vertices such that each edge of the graph is incident to at least one vertex of the set. The figure below shows a minimum vertex cover (the cover must contain at least two vertices; zero or one vertex would not suffice).

cover

Key idea
It is very difficult to find a minimum vertex cover, but we can find an approximate cover with at most twice as many vertices in polynomial time.

cover_app

Properties

  • APPROX-VERTEX-COVER is a polynomial-time 2-approximation algorithm.
    Back to Top
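APPROX-VERTEX-COVER is only a few lines: take both endpoints of any uncovered edge (iterating edges in list order is my simplification of "choose an arbitrary edge"):

```python
def approx_vertex_cover(edges):
    """2-approximate vertex cover: for each uncovered edge, add both endpoints.
    Every min cover must contain one endpoint of each chosen edge, so the
    result has at most twice as many vertices as the minimum cover."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge still uncovered
            cover.update((u, v))
    return cover
```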

6.2 - Travelling Salesman Problem

  • Useful Link
    The travelling salesman problem (TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city and returns to the origin city?" The gif shows a brute-force search for TSP.

bruce

Key idea
The approximation algorithm for TSP is a greedy algorithm (CLRS P1114). Here, I also want to introduce an exact algorithm for TSP using dynamic programming.

Subproblem: for every destination j ∈ {1,2,...,n} and every subset S ⊆ {1,2,...,n} that contains 1 and j, let L(S, j) = the minimum length of a path from 1 to j that visits precisely the vertices of S (exactly once each). The corresponding recurrence is:

exact_tsp

Properties
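A sketch of the subset recurrence above (the Held–Karp algorithm), indexing subsets with frozensets for readability rather than bitmasks, and using 0-based city labels with city 0 as the start:

```python
from itertools import combinations

def held_karp(dist):
    """Exact TSP by DP over subsets; dist is an n x n distance matrix.
    Returns the length of the shortest tour starting and ending at city 0."""
    n = len(dist)
    # best[(S, j)] = shortest path from 0 to j visiting exactly the cities in S
    best = {(frozenset([0, j]), j): dist[0][j] for j in range(1, n)}
    for size in range(3, n + 1):
        for subset in combinations(range(1, n), size - 1):
            S = frozenset(subset) | {0}
            for j in subset:
                # last hop k -> j; the rest is a smaller solved subproblem
                best[(S, j)] = min(best[(S - {j}, k)] + dist[k][j]
                                   for k in subset if k != j)
    full = frozenset(range(n))
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))
```

This is O(2^n n^2) time, exponentially better than the O(n!) brute force shown in the gif.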

6.3 - Boolean Satisfiability Problem

  • Useful Link
    Boolean satisfiability problem (sometimes called propositional satisfiability problem and abbreviated as SATISFIABILITY or SAT) is the problem of determining if there exists an interpretation that satisfies a given Boolean formula.

3-SAT is one of Karp's 21 NP-complete problems, and it is used as a starting point for proving that other problems are also NP-hard.

Herein, I introduce Papadimitriou's algorithm for 2-SAT. (This is a very interesting algorithm, so I decided to introduce it even though 2-SAT is not NP-complete.)

Key idea
Choose a random initial assignment, then repeatedly pick an arbitrary unsatisfied clause and flip the value of one of its variables. Here is the pseudocode:

2sat

Such a casual treatment of clauses surprisingly gives us a very high probability of finding a satisfying assignment. The mechanism lies in the theory of random walks, a concept borrowed from physics. If you are interested, I strongly recommend going through how random walks relate to Papadimitriou's algorithm.

Properties

  • For a satisfiable 2-SAT instance with n variables, Papadimitriou’s algorithm produces a satisfying assignment with probability ≥ 1 − 1/n
  • Runs in polynomial time
  • Always correct on unsatisfiable instances
    Back to Top
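A sketch of the random walk described above. The restart and step counts follow the usual analysis (about 2n^2 flips per trial, repeated a logarithmic number of times); the clause encoding with signed 1-indexed integers is my own choice:

```python
import math
import random

def papadimitriou_2sat(n, clauses, seed=0):
    """Randomized local search for 2-SAT over variables 1..n.
    clauses: list of (l1, l2); literal i means x_i is true, -i means x_i is false.
    Returns a satisfying assignment dict, or None (probably unsatisfiable)."""
    rng = random.Random(seed)
    def satisfied(lit, assign):
        return assign[abs(lit)] == (lit > 0)
    for _ in range(int(math.log2(n)) + 10):        # independent random restarts
        assign = {i: rng.choice([True, False]) for i in range(1, n + 1)}
        for _ in range(2 * n * n):                  # random-walk steps per restart
            unsat = [c for c in clauses
                     if not (satisfied(c[0], assign) or satisfied(c[1], assign))]
            if not unsat:
                return assign                       # all clauses satisfied
            lit = rng.choice(rng.choice(unsat))     # flip one variable of one unsat clause
            assign[abs(lit)] = not assign[abs(lit)]
    return None
```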