Learn how to select the right variables for a machine learning model by building a predictive model that identifies loans likely to default. In the process, encounter oversampling, classification error types, and the LASSO in Python’s scikit-learn (sklearn) environment.
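As a rough illustration of how these pieces can fit together, the sketch below oversamples the minority class, fits an L1-penalized (LASSO-style) logistic regression, and reads off which variables keep nonzero coefficients. Everything specific here is an assumption for illustration: the data is synthetic (generated with `make_classification`, not real loan records), and the penalty strength `C=0.05` is an arbitrary choice rather than a tuned value.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import resample

# Synthetic stand-in for loan data (assumption): roughly 10% "defaults" (class 1),
# 20 candidate variables of which only 5 carry signal.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=5, n_redundant=0,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Oversample the minority class in the training set only, so the test set
# keeps the original class balance.
majority = X_train[y_train == 0]
minority = X_train[y_train == 1]
minority_up = resample(minority, replace=True, n_samples=len(majority), random_state=0)
X_bal = np.vstack([majority, minority_up])
y_bal = np.concatenate([
    np.zeros(len(majority), dtype=int),
    np.ones(len(minority_up), dtype=int),
])

# Standardize so the L1 penalty treats all variables on the same scale.
scaler = StandardScaler().fit(X_bal)
model = LogisticRegression(penalty="l1", solver="liblinear", C=0.05, random_state=0)
model.fit(scaler.transform(X_bal), y_bal)

# Variables whose coefficients were not shrunk to exactly zero are "selected".
selected = np.flatnonzero(model.coef_[0])
print("selected variable indices:", selected)

# The two error types: false positives (good loans flagged as defaults)
# and false negatives (defaults that were missed).
tn, fp, fn, tp = confusion_matrix(y_test, model.predict(scaler.transform(X_test))).ravel()
print(f"false positives: {fp}, false negatives: {fn}")
```

Lowering `C` strengthens the penalty and tends to drop more variables; in practice it would usually be chosen by cross-validation.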
Analyze geospatial data within the R and ggplot2 ecosystem to develop efficient routing networks. Solve the Traveling Salesman Problem to minimize the total length of a route through a large set of locations, and visualize the results on an interactive map.
Explore clustering in the S&P 500 index by identifying stocks that move together. Predict stock price movements using Python’s pandas, numpy, and matplotlib libraries. Implement Prim’s and Kruskal’s minimum spanning tree algorithms to cluster stocks and discover patterns in the underlying structure of equities.
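One common way to turn a spanning tree into clusters is sketched below: build a distance matrix from return correlations, run Kruskal's algorithm to get the minimum spanning tree, then remove the longest tree edges so the tree falls apart into groups. This is a minimal sketch under stated assumptions, not the project's actual method: the returns are simulated (two hypothetical groups of three stocks, each driven by its own common factor), the distance `sqrt(2 * (1 - correlation))` is one conventional choice, and the helper names `kruskal_mst` and `clusters_from_mst` are made up for this example.

```python
import numpy as np

def _find(parent, i):
    # Union-find lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def kruskal_mst(dist):
    """Return the MST of a dense distance matrix as a list of (i, j, weight)."""
    n = dist.shape[0]
    parent = list(range(n))
    edges = sorted((dist[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    mst = []
    for w, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:  # the edge joins two separate components, so keep it
            parent[ri] = rj
            mst.append((i, j, w))
            if len(mst) == n - 1:
                break
    return mst

def clusters_from_mst(mst, n, k):
    """Drop the k - 1 longest MST edges and label the resulting components."""
    kept = sorted(mst, key=lambda e: e[2])[: n - k]
    parent = list(range(n))
    for i, j, _ in kept:
        parent[_find(parent, i)] = _find(parent, j)
    return [_find(parent, i) for i in range(n)]

# Simulated daily returns (assumption): stocks 0-2 follow factor A, 3-5 follow factor B.
rng = np.random.default_rng(0)
factor_a, factor_b = rng.normal(size=(2, 250))
returns = np.column_stack(
    [factor_a + 0.3 * rng.normal(size=250) for _ in range(3)]
    + [factor_b + 0.3 * rng.normal(size=250) for _ in range(3)]
)

corr = np.corrcoef(returns, rowvar=False)
dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))  # highly correlated = close

mst = kruskal_mst(dist)
labels = clusters_from_mst(mst, n=dist.shape[0], k=2)
print(labels)
```

Prim's algorithm would produce a tree of the same total weight by growing it outward from a single starting node instead of sorting all edges first.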