The Dawn of AI (Machine Learning Tribes | Deep Learning | What Is Machine Learning)



In the past videos in this AI series we have delved quite deep into the field of machine learning, discussing both supervised and unsupervised learning. The focus of this video, then, is to consolidate many of the topics we’ve discussed and answer the question posed at the start of this machine learning series: what is the difference between artificial intelligence and machine learning? As a quick recap, over the past two videos in this series we have discussed both supervised and unsupervised learning, both of which are subsets of the field of machine learning. Supervised learning is when we have labeled, structured data, and the algorithms we’re using determine the output based on the input data. Unsupervised learning, on the other hand, is for unlabeled, unstructured data, where our algorithms of choice are tasked with deriving structure from the data so that they can predict outputs from inputs. Both supervised and unsupervised learning are then further subdivided:

1) Regression – A supervised learning approach where the output is the value of a feature predicted from its correlation with another feature, read off the continuous line of best fit our algorithm determines.

2) Classification – A supervised learning approach where the output is the label of a data point based on the category it falls in. There are a number of discrete categories whose decision boundaries are determined by the algorithm we choose.

3) Clustering – An unsupervised learning approach where we must discover the categories various data points lie in based on the relationships between their features.

4) Association – An unsupervised learning approach where we must discover the correlations between features in a data set.

As stated in the past, while it is nice to view these topics in their own little bubbles, there is often a lot of crossover between techniques, for instance in the case of semi-supervised learning.
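To make the regression case above concrete, here is a minimal sketch in plain Python (no ML library) that fits the "continuous line of best fit" by ordinary least squares. The data points are made up for illustration, not taken from the video.

```python
# Ordinary least squares for one feature: fit y = m*x + b,
# the continuous line of best fit used in regression.

def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope m = covariance(x, y) / variance(x)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    m = cov / var
    b = mean_y - m * mean_x
    return m, b

def predict(m, b, x):
    # Inference: read the output value off the fitted line.
    return m * x + b

# Toy labeled data: the output is exactly 2*x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
m, b = fit_line(xs, ys)
print(m, b)                 # 2.0 1.0
print(predict(m, b, 5.0))   # 11.0
```

The same data could equally be fed to a classification, clustering or association method; what changes between the four subtypes is the question asked of the data, not the data itself.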
Semi-supervised learning wasn’t discussed previously, but it is essentially when our data set contains both labeled and unlabeled data. In that case we may first cluster the data and then run classification algorithms on it, or use any of a multitude of other combinations of techniques. So, with the recap out of the way and a general understanding of the types of machine learning and the terminology we have covered in past videos, we can now begin to decipher what the term machine learning really means and how it relates to artificial intelligence and other fields! As stated in the first video in this series, the term machine learning was coined by computing pioneer Arthur Samuel, who defined it as “a field of study that gives computers the ability to learn without being explicitly programmed”. With such a broad definition, one could argue, correctly, that all ‘useful’ programs ‘learn’ something; however, the level of true learning varies, and it depends on the algorithms the programs incorporate. Taking a few steps back, the algorithm is a concept that has existed since the dawn of human civilization: a process or set of rules to be followed in calculations or other problem-solving operations. While almost anything can be described as an algorithm, such as the recipe for a food dish or the steps needed to start a fire, the term is most commonly used to describe our understanding of mathematics and how it relates to the world around us, the informational fabric of reality. Progressing forward, the rise of computing, a field essentially built on the premise of speeding up mathematical calculations, gave way to the birth of computer science, in which algorithms now define the processing, storage and communication of digital information.
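The "cluster first, then classify" idea behind semi-supervised learning can be sketched with a toy example: a tiny one-dimensional k-means groups the unlabeled points, a handful of labeled points name the resulting clusters, and new points are then classified by their nearest cluster centre. The numbers and the "low"/"high" labels are invented for illustration.

```python
# Semi-supervised sketch: cluster unlabeled 1-D points with a tiny
# k-means (k=2), name each cluster using the few labeled points we
# have, then classify new points by their nearest cluster centre.

def kmeans_1d(points, c0, c1, iters=10):
    for _ in range(iters):
        a = [p for p in points if abs(p - c0) <= abs(p - c1)]
        b = [p for p in points if abs(p - c0) > abs(p - c1)]
        c0 = sum(a) / len(a)  # recentre each cluster on its mean
        c1 = sum(b) / len(b)
    return c0, c1

unlabeled = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
labeled = {1.2: "low", 8.7: "high"}   # the small labeled portion

c0, c1 = kmeans_1d(unlabeled, 0.0, 10.0)

# Attach a label to each centre from the nearest labeled example.
names = {}
for x, label in labeled.items():
    centre = c0 if abs(x - c0) <= abs(x - c1) else c1
    names[centre] = label

def classify(x):
    centre = c0 if abs(x - c0) <= abs(x - c1) else c1
    return names[centre]

print(classify(2.2))  # low
print(classify(7.5))  # high
```

Here the unsupervised step (clustering) does most of the work, and the two labeled points only have to name the structure it found.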
The ability to iterate through algorithms at the lightning-fast speeds computers have operated at over the past century has led to the implementation and discovery of many algorithms: to list a few, sorting algorithms like Bubble Sort and Quick Sort, shortest-path algorithms like Dijkstra’s and A*, and so on for a wide variety of problems. These algorithms, while able to perform tasks that appear to be learning, are really just iteratively performing pre-programmed steps to achieve their results, in stark contrast to the definition of machine learning: to learn without explicit programming. Reflecting on the past few videos in this series, in which we discussed both supervised and unsupervised machine learning, there is one common thread running through them both: using a variety of techniques, approaches and algorithms to form decision boundaries over a data set’s decision space. This divided-up decision space is referred to as the machine learning model, and the process of forming the model, that being the decision boundaries in the data set, is referred to as training. Training a model draws parallels to the first primary type of knowledge we as humans display, declarative knowledge, in other words memorization, the accumulation of individual facts. Once we have a trained model exhibiting good accuracy on training data, we can use it for the next step, inference: the ability to predict the output, whether a value or a category, for new data. Machine learning inference draws parallels to the second primary type of knowledge we exhibit, imperative knowledge, in other words generalization, the ability to deduce new facts from old facts. Additionally, as the model encounters new data, it can use it to train further, refining its decision boundaries to become better at inferring from future data.
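The training/inference distinction above can be shown in a few lines. Unlike Bubble Sort, whose steps are fixed in advance, the decision boundary below is not hard-coded: it is derived from whatever labeled data is supplied (here, the midpoint between the two class means, a deliberately simple rule invented for this sketch).

```python
# Training vs. inference sketch: "training" forms a decision boundary
# from labeled data; "inference" predicts the category of unseen inputs.

def train(data):
    # data: list of (value, label) pairs with labels "A" / "B".
    a = [v for v, lab in data if lab == "A"]
    b = [v for v, lab in data if lab == "B"]
    # The learned boundary: midpoint between the two class means.
    return (sum(a) / len(a) + sum(b) / len(b)) / 2

def infer(boundary, value):
    return "A" if value < boundary else "B"

training_data = [(1.0, "A"), (2.0, "A"), (3.0, "A"),
                 (7.0, "B"), (8.0, "B"), (9.0, "B")]
model = train(training_data)   # boundary = (2 + 8) / 2 = 5.0
print(model)                   # 5.0
print(infer(model, 4.2))       # A
print(infer(model, 6.1))       # B
```

Feed it different training data and the boundary moves, with no change to the program itself, which is the sense in which the behaviour is learned rather than explicitly programmed.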
Now, this whole process is captured in the second most widely used definition of machine learning, stated by Dr. Tom Mitchell of Carnegie Mellon University: “a computer is said to learn from experience ‘E’, with respect to some class of tasks ‘T’ and performance measure ‘P’, if its performance at tasks in ‘T’, as measured by ‘P’, improves with experience ‘E’.” So, while it is correct to say that all ‘useful’ programs ‘learn’ something from data, I hope the distinction between the level of learning of machine learning models and typical algorithms is now clearer. The rise of machine learning, or domain-specific weak artificial intelligence as it is referred to, has been decades in the making. But first, what is artificial intelligence? As I hope you’ve learnt from past videos in this series, AI refers to any model that can mimic, develop or demonstrate human thinking, perception or actions; in our case this refers to computing-based AI. In the first two videos in this AI series, the History and Birth of AI, we saw the development of the field of artificial intelligence, from trying to develop a more general AI, also called strong AI, to focusing on acquiring domain-specific expertise in various fields. This turning point in the field was due to the expert systems of the 80s: essentially complex conditional logic, that being if-then-else statements, tailored for a respective field of knowledge by experts in that field. At the end of the Birth of AI video, the time period we left off at was the AI bust at the start of the 90s, a low point in the AI hype cycle due to over-promises made about what expert systems could really do.
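Mitchell's definition can be made concrete with a deliberately trivial learner, invented here for illustration: the task T is recalling an item's label, the performance measure P is accuracy on a fixed test set, and the experience E is the stream of labeled examples seen so far. Pure memorization, the declarative knowledge discussed above, already satisfies the letter of the definition, because P improves as E grows.

```python
# Mitchell's E/T/P definition made concrete with a memorization learner.
# Task T: recall the label of an item.
# Performance P: accuracy on a fixed test set.
# Experience E: labeled examples seen so far.

examples = [("cat", "animal"), ("oak", "plant"),
            ("dog", "animal"), ("fern", "plant")]
test_set = examples  # evaluate recall over the same items

def accuracy(memory):
    correct = sum(1 for item, label in test_set
                  if memory.get(item) == label)
    return correct / len(test_set)

memory = {}
performance = []
for item, label in examples:              # gaining experience E ...
    memory[item] = label
    performance.append(accuracy(memory))  # ... improves P at tasks in T

print(performance)  # [0.25, 0.5, 0.75, 1.0]
```

A real model would of course be judged on data it has never seen, which is where generalization, the imperative knowledge from the previous section, comes in.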
After this point, the development of intelligent systems faded into the background due to a lack of funding and mainstream interest in the field, while rapid technological progress was made in so many other areas: the invention of the internet, the commercialization of computers, mobile phones – the list can go on and on. During the 90s, expert systems and algorithms originally developed by AI researchers began to appear as parts of larger systems. These algorithms had solved a lot of very difficult problems, and their solutions proved useful throughout the technology industry: data mining, industrial robotics, logistics, speech recognition, banking software, medical diagnosis and Google’s search engine, to list a few. However, the field of AI received little or no credit for these successes in the 1990s and early 2000s; many of its greatest innovations had been reduced to the status of just another item in the tool chest of computer science. As Nick Bostrom, author of Superintelligence, stated in 2006, “a lot of cutting-edge AI has filtered into general applications, often without being called AI, because once something becomes useful enough and common enough it’s not labeled AI anymore”. This is similar to what John McCarthy, the father of AI, said back in the 80s. So what started changing in the late 2000s and at the start of this decade that propelled the field of AI to the forefront once again? First off, we can thank the increases in computing power and storage, infinite computing, big data and various other topics we’ve covered in videos past. These advances provided large amounts of data to train on, along with the computing power and storage needed to do so. One can say that finding structure in data is the human condition, it’s how we’ve come so far, and these advances gave computers what they required to do the same.
Now, as you can see here, the gap between various AI breakthroughs and the dates their algorithms were initially proposed is nearly two decades; however, on average, a breakthrough happens just three years after the dataset for a given problem becomes available, meaning that data was a huge bottleneck in the advancement of the field of AI. The next reason for the rise of machine learning is the rise of a particular tribe of machine learning, connectionism, or as many commonly know it, deep learning. Before we delve into deep learning, let’s first discuss the other tribes of AI. There are 5 primary tribes of machine learning, with ‘tribes’ referring to groups of people who hold different philosophies on how to tackle AI-based problems. We have discussed many of these tribes in past videos, but the list below should make them more concrete.

The first tribe is the Symbolists. They focus on the premise of inverse deduction: rather than starting with a premise and working towards conclusions, they use a set of premises and conclusions and work backwards to fill in the gaps. We discussed this in the History of AI video and will focus on it more heavily in a future video on Artificial Human Intelligence.

The second tribe is the Connectionists. They mostly try to digitally re-engineer the brain and all of its connections in a neural network. The most famous example of the connectionist approach is what is commonly known as ‘Deep Learning’. We discussed parts of the rise of connectionism in the Birth of AI video.

The third tribe is the Evolutionaries. Their focus lies in applying the ideas of genomes and DNA in the evolutionary process to data processing. Their algorithms constantly evolve and adapt to unknown conditions and processes. You have probably seen this style of approach used to beat games such as Mario, and we will discuss it much more in an upcoming video on ‘Reinforcement Learning’.

The fourth tribe is the Bayesians.
Bayesian models take a hypothesis and apply a type of ‘a priori’ thinking, believing that some outcomes will be more probable; they then update their hypothesis as they see more data. We discussed a bit more about this line of thinking in our video on Quantum Computing. The fifth and final tribe is the Analogizers. This machine learning tribe focuses on techniques for matching bits of data to each other. We have been discussing this approach quite a bit in the past few videos, with many core concepts of supervised and unsupervised learning tied to it. I think the best way to represent these tribes of artificial intelligence and machine learning is in a bubble-diagram format. To start with, we have our primary AI bubble and machine learning bubble; we showed this relationship in the first video in our machine learning series. After this we can add the tribe bubbles. They are constantly moving and overlapping with each other to produce novel ideas, and shrinking and growing in popularity. Once a tribe hits mainstream popularity, as connectionism has, it pops, so to speak, producing a new field in its wake; in the case of connectionism, that field was deep learning. Keep in mind that just because connectionism grew into deep learning doesn’t mean the entire tribe of connectionism is centered around deep learning: the connectionism bubble and many connectionists will continue researching new approaches utilizing connectionist theory. Also, deep learning isn’t all connectionism; many symbolist and analogizer philosophies are incorporated within it as well. You can learn more about the 5 tribes of machine learning in Pedro Domingos’ book, The Master Algorithm, and you can get that e-book for free along with a 30-day free trial membership to Audible by signing up with the link below! Coming back on topic: what, then, is the difference between machine learning and artificial intelligence? Nothing, and everything.
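Before going further, the Bayesian tribe's "update the hypothesis as more data arrives" idea from above can be sketched with a coin-bias example. The prior, observations and Beta-distribution bookkeeping here are a standard textbook illustration, not something from the video.

```python
# Bayesian-tribe sketch: start with an 'a priori' belief about a coin's
# bias (a uniform Beta(1, 1) prior), then update the hypothesis as each
# new observation arrives.

from fractions import Fraction

heads, tails = 1, 1  # Beta(1, 1): every bias equally probable a priori

def update(observation):
    # Conjugate update: each flip just increments one pseudo-count.
    global heads, tails
    if observation == "H":
        heads += 1
    else:
        tails += 1

def estimated_bias():
    # Posterior mean of the Beta(heads, tails) distribution.
    return Fraction(heads, heads + tails)

print(estimated_bias())        # 1/2 before seeing any data
for obs in ["H", "H", "T", "H"]:
    update(obs)
print(estimated_bias())        # 2/3 after three heads and one tail
```

The belief never jumps to a conclusion: each observation nudges the hypothesis, which is the philosophical core that separates the Bayesians from the other tribes.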
While machine learning is classified as a type of AI, since it exhibits the ability to match and even exceed human-level perception and actions in various tasks, it is, as stated earlier, a weak AI, since these tasks are often isolated from one another, in other words, domain-specific. As we’ve seen, machine learning can mean many things, from millions of lines of code with complex rules and decision trees, to statistical models, symbolist theories, connectionist and evolution-based approaches and much more, all with the goal of modeling the complexities of life, just as our brains try to do. The advent of big data, the increases in computing power and storage, and the other factors we discussed earlier and in videos past took these models from simple iterative algorithms to ones involving many complex domains of mathematics and science working in unison: knot theory, game theory, linear algebra and statistics, to list a few. One important note about these models, no matter how advanced the algorithms used, is best put in a quote by the famous statistician George Box: “all models are wrong, but some are useful”. By this he meant that every model makes abstractions and simplifications, so it will never model reality with 100% fidelity. However, simplifications of reality can often be quite useful in solving many complex problems. For machine learning, this means we will never have a model with 100% accuracy in predicting the output of most real-world problems, especially more ambiguous ones. Two of the major assumptions made in the field of machine learning that cause this are: 1) that the past, that being the patterns of the past, predicts the future, and 2) that mathematics can truly model the entire universe.
Regardless of these assumptions, these models can still be very useful in a broad array of applications; we will cover the grander societal impacts of weak intelligence in an upcoming video on the Evolution of AI. Additionally, a method that has been credited with a major rise in the accuracy of models, and something we mentioned earlier, is deep learning, which we will cover in the next set of videos in this AI series! Before concluding, one important fact I want to reiterate, as stated in the disclaimer at the start of all my AI videos, is that my goal here is to simplify what are in reality very complex topics. I urge you to seek out additional resources on this platform and various others if you wish to learn more on a much deeper level! One such resource I use and highly recommend is Brilliant. If you want to learn more about machine learning, and I mean really learn how these algorithms work, from supervised methodologies such as regression and classification to unsupervised learning and more, then Brilliant.org is the place for you! In a world where automation through algorithms will increasingly replace more jobs, it is up to us as individuals to keep our brains sharp and think of creative solutions to multidisciplinary problems, and Brilliant is a platform that allows you to do so. For instance, every day there is a Daily Challenge covering a wide variety of courses in the STEM domain. These challenges are crafted in such a way that they draw you in and then allow you to learn a new concept through intuitive explanations. My primary goal with this channel is to inspire and educate about the various technologies and innovations that are changing the world, but engaging with them at a higher level requires going a step beyond these videos and actually learning the mathematics and science behind the concepts I discuss.
Brilliant does this by making math and science learning exciting, cultivating curiosity by showing the interconnectedness of a variety of different topics! To support Singularity Prosperity and learn more about Brilliant, go to Brilliant.org/singularity and sign up for free; additionally, the first 200 people who go to that link will get 20% off their annual premium subscription! At this point the video has come to a conclusion, and I’d like to thank you for taking the time to watch it! If you enjoyed it, consider supporting me on Patreon or YouTube Membership to keep this channel growing, check out our website for more information, consider subscribing for more content and like our Facebook page for more bite-sized chunks of content. This has been Ankur, you’ve been watching Singularity Prosperity, and I’ll see you again soon!

