Machine learning (ML) and complexity are not synonymous. In fact, machine learning is largely a way to wade through the complexity posed by data. But what if we examine the algorithmic complexity of ML algorithms before applying them?
Most ML algorithms are well-defined recipes for cutting through complexity and arriving at an optimal solution. Yet some algorithms are themselves complex enough that examining them, using them, and getting the best results from them demands real expertise.
In general, an algorithm is a sequence of precise instructions for finding one or more solutions to a problem. Wherever big data is involved, harnessing the power of algorithms is a necessity. From computer games to music recommendations, and from GPS mapping to online financial transactions, numerous algorithms are at play at various points in an AI application or program. The AI field is evolving constantly, and more advanced machine learning methodologies keep being invented; self-programming approaches are emerging as new alternatives, in which an AI application effectively writes its own algorithm to find a solution.
What is algorithmic complexity?
Algorithmic complexity is often called running time. It measures how long an algorithm takes to compute a result given an input of size n. To be considered scalable, an algorithm should be able to compute results within a reasonable time even for large values of n. Algorithmic complexity is measured in both time and space, the latter corresponding to the memory an algorithm requires during computation. The study of these costs is also referred to as computational complexity, a branch of computer science.
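As a minimal sketch of the idea (the function name here is illustrative, not from any library), we can count the basic operations an algorithm performs as n grows, rather than timing it:

```python
def sum_with_count(items):
    """Sum a list while counting the basic operations performed.

    The operation count grows linearly with len(items), i.e. O(n):
    doubling the input size doubles the work.
    """
    total = 0
    ops = 0
    for x in items:
        total += x
        ops += 1
    return total, ops

# Doubling n doubles the operation count -- the signature of linear growth.
for n in (1_000, 2_000, 4_000):
    _, ops = sum_with_count(list(range(n)))
    print(n, ops)
```

Counting operations this way gives a machine-independent view of growth, which is exactly what asymptotic notation formalizes.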
Several notations are used to represent algorithmic complexity, and an algorithm's complexity may be linear, logarithmic, quadratic, and so on. Big Theta, Big O, and Big Omega are asymptotic notations for describing the growth of an algorithm's cost.
Among these notations, Big O is the one most often used to describe the best-case, worst-case, and average-case running time of an algorithm. Big O makes clear how an algorithm scales as the input size increases. In evaluating algorithmic complexity, best-case, worst-case, and average-case time complexity are all considered.
Suppose we search for an item sequentially in an unsorted list. If the item is found at the start of the list, that is the best case; if it sits at the end of the list (or is absent), that is the worst case. In terms of time complexity, the best case (item at the start) is O(1), the worst case (item at the end) is O(n), and the average case is usually also taken as O(n).
For example, for sequential search:
Worst case: Big O — O(n)
Best case: Big Omega — Ω(1)
Average case: Big Theta — Θ(n)
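The sequential search described above can be sketched as follows (the function name is illustrative); returning the number of comparisons alongside the result makes the best and worst cases visible:

```python
def sequential_search(items, target):
    """Linear search that also reports the number of comparisons made."""
    comparisons = 0
    for i, item in enumerate(items):
        comparisons += 1
        if item == target:
            return i, comparisons  # found: best case is a single comparison
    return -1, comparisons  # not found: worst case is n comparisons

data = [7, 3, 9, 1, 5]
print(sequential_search(data, 7))  # target at the start: 1 comparison, O(1)
print(sequential_search(data, 5))  # target at the end: n comparisons, O(n)
print(sequential_search(data, 2))  # absent: n comparisons, O(n)
```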
Big O is the most commonly used notation because it indicates clearly whether or not an algorithm can scale with the input size; this is also referred to as the order of growth. An algorithm can take different amounts of time for different inputs of the same size. Algorithmic complexity is vital to the efficiency of a machine learning model, and through the time and memory budget it imposes it can also significantly affect the model's accuracy.
Techniques to figure out algorithmic complexity
To understand and figure out the complexity of an algorithm, it is better to count the number of instructions it requires as a function of input size, rather than the wall-clock time it takes to execute.
For instance, if a search algorithm splits the input in two and discards one half at each iteration, it is a logarithmic algorithm. Merge sort divides the input in two at each level and also performs a merge operation in linear time at each level, so it is regarded as an algorithm with log-linear, O(n log n), complexity. Similarly, to recognize a quadratic algorithm, check whether a loop with a linear number of iterations contains another loop that also iterates over the input.
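The halving idea can be sketched with a binary search over a sorted list (names here are illustrative): each step discards half the remaining range, so the number of steps grows roughly as log₂ n.

```python
def binary_search(sorted_items, target):
    """Search a sorted list, halving the candidate range each step: O(log n)."""
    lo, hi = 0, len(sorted_items) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid, steps
        if sorted_items[mid] < target:
            lo = mid + 1   # discard the lower half
        else:
            hi = mid - 1   # discard the upper half
    return -1, steps

# For n = 1024 items, even a worst-case search needs only about 10 steps.
data = list(range(1024))
print(binary_search(data, 1023))
```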
For reference, the complexity of some well-known algorithms is as follows:
- Matrix multiplication by the Coppersmith–Winograd algorithm: O(n^2.496)
- GCD(a, b) by Euclid's algorithm: O(log(a+b)) division steps, which amounts to O((log(a+b))^2) bit operations
- Fast Fourier Transform: O(n log n)
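Euclid's algorithm from the list above fits in a few lines; the step counter is added here for illustration, and it shows the number of division steps growing only logarithmically in the inputs:

```python
def gcd_steps(a, b):
    """Euclid's algorithm; returns (gcd, number of division steps).

    The number of division steps is O(log(a + b)); each step is one
    modulo operation, which itself costs more for large integers,
    which is where the O((log(a + b))^2) bit-operation bound comes from.
    """
    steps = 0
    while b:
        a, b = b, a % b
        steps += 1
    return a, steps

print(gcd_steps(1071, 462))  # gcd is 21, reached in 3 division steps
```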
Relevance of data structures
Since every algorithm operates on data, the next step is to consider how data structures affect an algorithm's performance. Working with a data set requires operations such as insertion, search, and deletion, and the choice of data structure determines how cheap those operations are for the algorithm chosen. So it is fair to say that data structures matter to the complexity of an algorithm, since they can vastly affect its performance. For example, accessing an element by position in a linked list takes linear time, while accessing an element by index in an array takes constant time. Some complexities of frequently used sorting algorithms are:
- Bubble sort: O(n²) average and worst case
- Merge sort: O(n log n) in all cases
- Quicksort: O(n log n) on average, O(n²) in the worst case
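The effect of the data structure on lookup cost can be illustrated in Python. Python has no built-in linked list, so as an analogous sketch a list stands in for sequential access and a set for hashed access; the same membership question costs O(n) in one structure and O(1) on average in the other:

```python
n = 100_000
as_list = list(range(n))   # membership test scans sequentially: O(n)
as_set = set(as_list)      # membership test hashes once: O(1) on average

target = n - 1
# Both answer the same question, but the list may examine every
# element while the set jumps straight to the right hash bucket.
print(target in as_list, target in as_set)
```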
The main purpose of knowing algorithmic complexity is to improve an algorithm and to compare how efficiently different algorithms solve a problem. Questioning an algorithm's performance at a time when fast memory and processors are so accessible may raise a few eyebrows. But what an algorithm can do is, at its core, a design concern, more an "idea" than a hardware requirement. Hardware certainly counts in executing an algorithm and producing results faster, yet tuning the algorithm's efficiency is what yields a precise solution without an ever-expanding hardware requirement. In short, algorithmic complexity is what matters when working against large data structures and data sets to reach a conclusive result.