摘要: You probably used random forest for regression and classification before, but time series forecasting? Hold up you’re going to say; time series data is special! And you’re right. When it comes to data that has a time dimension, applying machine learning (ML) methods becomes a little tricky.....
摘要: In this article, the author will discuss 7 common loss functions used in machine learning and explain where each of them is used. We have a lot to cover in this article so let’s begin!
摘要: Bayesian Target Encoding is a feature engineering technique used to map categorical variables into numeric variables. The Bayesian framework requires only minimal updates as new data is acquired and is thus well-suited for online learning. Furthermore, the Bayesian approach makes choosing and interpreting hyperparameters intuitive. I developed this technique in the recent Avito Kaggle Competition, where my team and I took 14th place out of 1,917 teams. We found that the Bayesian target encoding outperforms the built-in categorical encoding provided by the LightGBM package.
摘要: Imbalanced classes are a common problem in machine learning classification where there are a disproportionate ratio of observations in each class. Class imbalance can be found in many different areas including medical diagnosis, spam filtering, and wildfire detection.
摘要: Machine learning is increasingly moving from hand-designed models to automatically optimized pipelines using tools such as H20, TPOT, and auto-sklearn. These libraries, along with methods such as random search, aim to simplify the model selection and tuning parts of machine learning by finding the best model for a dataset with little to no manual intervention. However, feature engineering, an arguably more valuable aspect of the machine learning pipeline, remains almost entirely a human labor.
摘要: It is important to actually work on different kinds of data and projects along with learning the data science concepts. Some datasets are very popular and a lot more are easily available on the web