Learn Data Science, Analytics & Business Intelligence

Learn Alteryx

Learn the fundamentals of building Alteryx workflows to process and analyse data before diving into more advanced features

Learn Python

Learn how to use Python for Data Science. We cover Python Basics, Data Wrangling with Pandas, Visualisation and Machine Learning.

Articles

Machine Learning
Evaluation Metric Tunnel Vision
As data scientists, we pride ourselves on how accurately our machine learning models predict the future, striving for ever more accurate results that make our predictions look like voodoo to the uninitiated. As the progression of machine learning...
Python Machine Learning
Feature Importance
Random Forest Feature Importance Plot
A big part of analysing our models after training is establishing whether the features we used actually helped to predict the target, and by how much. Tree-based machine learning algorithms such as Random Forest and XGBoost come with a feature importance attribute...
Python Machine Learning
Stratified KFold Tutorial
In this short tutorial we are going to look at stratified k-fold cross-validation: what it is, why we need it and when we should use it. We’ll then walk through how to split data into 5 stratified folds using the StratifiedKFold class in scikit-learn and use those folds to train...
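To make the idea of stratified folds concrete, here is a minimal pure-Python sketch. It is not scikit-learn's StratifiedKFold implementation: the helper name `stratified_folds`, the round-robin assignment and the example labels are all assumptions made for illustration, and no shuffling is done.

```python
from collections import Counter, defaultdict


def stratified_folds(labels, n_splits=5):
    """Hypothetical helper: assign each sample index to one of n_splits folds
    so that every fold keeps roughly the same class proportions."""
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        # Deal each class's samples out to the folds in turn (round-robin).
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    return folds


# Hypothetical imbalanced target: 40 samples of class 0, 10 of class 1.
labels = [0] * 40 + [1] * 10
folds = stratified_folds(labels, n_splits=5)

for fold_number, test_idx in enumerate(folds):
    held_out = set(test_idx)
    # Everything not held out in this fold is used for training.
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    class_counts = Counter(labels[i] for i in test_idx)
    print(fold_number, len(train_idx), len(test_idx), dict(class_counts))
```

Each held-out fold keeps the 80/20 class split of the full dataset, which is the property that plain k-fold splitting does not guarantee on imbalanced targets.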
Alteryx
Alteryx Download Tool Tutorial
As you’re probably aware, the internet contains a rich collection of data stored in files that can be used for analysis or to build tools. Files containing data on weather, economic indicators and sports statistics, to name just a few, are scattered around the web as free resources...
Data Wrangling
Join Types
We use a join when we want to blend fields from one data source with another to create a new dataset with fields from both original data sources side by side. We can create joins when there is a relationship between one or more fields in both data sources. In practice, this means that at least one...
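As a rough illustration of blending fields from two data sources on a shared field, the sketch below joins two small lists of records in plain Python. The data, the `customer_id` key and the `join` helper are hypothetical, and only inner and left joins are shown.

```python
# Hypothetical example data: two small data sources sharing a customer_id field.
customers = [
    {"customer_id": 1, "name": "Ana"},
    {"customer_id": 2, "name": "Ben"},
    {"customer_id": 3, "name": "Cara"},
]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.5},
    {"customer_id": 3, "amount": 12.0},
    {"customer_id": 4, "amount": 99.0},
]


def join(left, right, key, how="inner"):
    """Hypothetical helper: join two lists of dicts on `key`.

    how="inner" keeps only left rows with a match in `right`;
    how="left" keeps every left row, filling missing right fields with None.
    """
    # Index the right-hand rows by key value so lookups are fast.
    right_index = {}
    for row in right:
        right_index.setdefault(row[key], []).append(row)
    right_fields = sorted({field for row in right for field in row if field != key})

    result = []
    for left_row in left:
        matches = right_index.get(left_row[key], [])
        if matches:
            # One output row per matching pair, with fields side by side.
            for right_row in matches:
                result.append({**left_row, **right_row})
        elif how == "left":
            result.append({**left_row, **{field: None for field in right_fields}})
    return result


inner_rows = join(customers, orders, "customer_id", how="inner")
left_rows = join(customers, orders, "customer_id", how="left")

for row in left_rows:
    print(row)
```

The inner join drops Ben, who has no orders, while the left join keeps him with an empty amount. The order for customer 4 appears in neither result because no customer record matches it.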
Machine Learning
XGBoost Parameter Tuning
XGBoost has many parameters that can be adjusted to achieve greater accuracy or generalisation for our models. Here we’ll look at just a few of the most common and influential parameters that we’ll need to pay most attention...
Data Science
3 Productivity Tools for Data Scientists
Three Tools That Help Beginners and Experienced Data Scientists Progress Faster and Work More Efficiently. OK, so this might seem like an obvious one, but GitHub is much more than just version control. Aside from being...
Data Science
Is Data Science for Me?
Let’s not lie, data science can be hard. Whether you’re embarking on a period of self-study or making your first tentative steps into the field, it can feel daunting. If you’re reading this then I’m assuming you’re either thinking of entering the field or have some analytical experience and...
Machine Learning
CatBoost with Python: A Simple Tutorial
In this tutorial we will see how to implement the CatBoost machine learning algorithm in Python. We will give a brief overview of what CatBoost is and what it can be used for before walking step by step through training a simple model, including how to tune parameters and...