Introduction to Machine Learning with Scikit Learn: Glossary

Key Points

Introduction	Machine learning is a set of tools and techniques that use data to make predictions. Artificial intelligence is a broader term that refers to making computers show human-like intelligence. Deep learning is a subset of machine learning. All machine learning systems have limitations to be aware of.
Supervised methods - Regression	Scikit-Learn is a Python library with lots of useful machine learning functions. Scikit-Learn includes a linear regression function. Scikit-Learn can perform polynomial regressions to model non-linear data.
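The regression points above can be illustrated with a minimal sketch. It assumes synthetic, noise-free quadratic data invented for this example (not the course's dataset) and uses `PolynomialFeatures` together with `LinearRegression`, which is one common way to fit a polynomial in Scikit-Learn and not necessarily the approach taken in the lesson.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic, noise-free data made up for illustration: y = 2x^2 - 3x + 1
x = np.linspace(-3, 3, 30).reshape(-1, 1)  # Scikit-Learn expects a 2D feature array
y = 2 * x.ravel() ** 2 - 3 * x.ravel() + 1

# Straight-line fit: cannot follow the curvature of the data
linear = LinearRegression().fit(x, y)
linear_r2 = linear.score(x, y)

# Polynomial fit: expand x into the columns [x, x^2], then fit a linear model on them
poly = PolynomialFeatures(degree=2, include_bias=False)
x_poly = poly.fit_transform(x)
quadratic = LinearRegression().fit(x_poly, y)
quadratic_r2 = quadratic.score(x_poly, y)

print(f"linear R^2:    {linear_r2:.3f}")
print(f"quadratic R^2: {quadratic_r2:.3f}")
print("quadratic coefficients:", quadratic.coef_, "intercept:", quadratic.intercept_)
```

The polynomial model is still a linear regression; only the input features change, which is why the same `LinearRegression` class is reused.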
Supervised methods - Classification	Classification requires labelled data, which makes it a supervised method.
Unsupervised methods - Clustering	Clustering is a form of unsupervised learning. Unsupervised learning algorithms don’t need labelled training data. K-means is a popular clustering algorithm. K-means is less useful when one cluster exists within another, such as concentric circles. Spectral clustering can overcome some of the limitations of K-means. Spectral clustering is much slower than K-means. Scikit-Learn has functions to create example data.
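The concentric-circles limitation can be sketched as follows. This example assumes data generated with `make_circles` and parameter values chosen only for illustration (sample count, `factor`, `noise`, and the nearest-neighbours affinity are not taken from the course). The adjusted Rand index is used here as one possible way to compare cluster assignments against the known circle membership.

```python
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_circles
from sklearn.metrics import adjusted_rand_score

# Example data: a small circle inside a larger one (illustrative parameters)
X, true_labels = make_circles(n_samples=400, factor=0.4, noise=0.04, random_state=0)

# k-means splits the plane around two centroids, so it tends to cut both circles in half
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Spectral clustering groups points that are connected through near neighbours.
# Scikit-Learn may warn that the neighbour graph is not fully connected here.
spectral_labels = SpectralClustering(
    n_clusters=2,
    affinity="nearest_neighbors",
    n_neighbors=10,
    random_state=0,
).fit_predict(X)

# Adjusted Rand index: 1.0 means perfect agreement with the true circles, about 0 means random
kmeans_ari = adjusted_rand_score(true_labels, kmeans_labels)
spectral_ari = adjusted_rand_score(true_labels, spectral_labels)
print(f"k-means ARI:  {kmeans_ari:.2f}")
print(f"spectral ARI: {spectral_ari:.2f}")
```

Note that the true labels are used only to score the result; neither algorithm sees them while clustering.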
Ensemble methods	Ensemble methods can be used to reduce underfitting or overfitting to the training data.
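As a rough illustration of the overfitting point, the sketch below compares a single decision tree with a random forest, which is one kind of ensemble. The dataset (`make_moons`), the split, and every parameter value are assumptions made for this example, and the exact scores will depend on them.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy two-class example data (illustrative only)
X, y = make_moons(n_samples=500, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A single unpruned tree can memorise the training set, noise included
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A random forest averages many trees, each trained on a resampled copy of the data
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

tree_train, tree_test = tree.score(X_train, y_train), tree.score(X_test, y_test)
forest_train, forest_test = forest.score(X_train, y_train), forest.score(X_test, y_test)
print(f"single tree   train={tree_train:.2f}  test={tree_test:.2f}")
print(f"random forest train={forest_train:.2f}  test={forest_test:.2f}")
```

A large gap between training and test accuracy is a sign of overfitting; averaging over many trees typically narrows that gap, although this is not guaranteed for every dataset.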
Unsupervised methods - Dimensionality reduction	PCA is a linear dimensionality reduction technique for tabular data. t-SNE is another dimensionality reduction technique for tabular data; it is non-linear, which makes it more general than PCA.
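Both techniques can be tried with a few lines of Scikit-Learn. The sketch below assumes the handwritten digits dataset bundled with Scikit-Learn and a 500-row subset to keep t-SNE quick; the dataset choice and the `perplexity` value are illustrative assumptions, not settings taken from the course.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# 8x8 digit images flattened to 64 columns; a subset keeps the example fast
digits = load_digits()
X = digits.data[:500]

# PCA: project onto the two directions of greatest variance (a linear map)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
explained = pca.explained_variance_ratio_.sum()

# t-SNE: non-linear embedding that tries to keep neighbouring points close together
tsne = TSNE(n_components=2, perplexity=30, init="pca", random_state=0)
X_tsne = tsne.fit_transform(X)

print("PCA output shape:  ", X_pca.shape)
print("t-SNE output shape:", X_tsne.shape)
print(f"variance explained by 2 principal components: {explained:.2f}")
```

Unlike PCA, a fitted `TSNE` object has no `transform` method for new rows, so it is mainly used for visualising a fixed dataset.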
Ethics and the Implications of Machine Learning	The results of machine learning reflect biases in the training and input data. Many machine learning algorithms can’t explain how they arrived at a decision. Machine learning can be used for unethical purposes. Consider the implications of false positives and false negatives.
Find out more	This course has only touched on a few areas of machine learning and is designed to teach you just enough to do something useful. Machine learning is a rapidly evolving field and new tools and techniques are constantly appearing.