# Feature Engineering

## General
   - [What works in feature engineering](https://www.kaggle.com/c/ieee-fraud-detection/discussion/220878)
   + [Basic Feature Engineering With Time Series Data in Python](http://machinelearningmastery.com/basic-feature-engineering-time-series-data-python/)
   + [Zillow Prize - EDA, Data Cleaning & Feature Engineering](https://www.kaggle.com/lauracozma/eda-data-cleaning-feature-engineering)
   + [Feature-wise transformations](https://distill.pub/2018/feature-wise-transformations)
   + tsfresh: used to extract characteristics from time series: [github](https://github.com/blue-yonder/tsfresh) | [Introduction](https://tsfresh.readthedocs.io/en/latest/text/introduction.html) | [Docs](https://tsfresh.readthedocs.io/en/latest/)
   + [featuretools](https://github.com/featuretools/featuretools/) - an open-source Python framework for automated feature engineering
   + [5 Steps to correctly prepare your data for your machine learning model](https://towardsdatascience.com/5-steps-to-correctly-prep-your-data-for-your-machine-learning-model-c06c24762b73?gi=6b4a6895ab1)
   + [scikit-learn's SelectKBest](https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.SelectKBest.html)
   + [mlbox's Feature selection](https://mlbox.readthedocs.io/en/latest/features.html)
   + Chi2 test: Feature selection: [Quora](https://www.quora.com/How-is-chi-test-used-for-feature-selection-in-machine-learning) | [NLP Stanford Group](https://nlp.stanford.edu/IR-book/html/htmledition/feature-selectionchi2-feature-selection-1.html) | [Learn for Master](http://www.learn4master.com/machine-learning/chi-square-test-for-feature-selection)
   + [Accelerating TSNE with GPUs: From hours to seconds](https://www.linkedin.com/posts/montrealai_machinelearning-datavisualization-datascience-activity-6628828524566331392-Cua_)
   + [Feature Engineering and Feature Selection](https://media.licdn.com/dms/document/C511FAQF45u2wk4WYKQ/feedshare-document-pdf-analyzed/0?e=1570834800&v=beta&t=lNVqtm3JJYvvPHpsl0uc6mZJjVGWgJ8Toz29tNJA4GI) [deadlink]
   + [Hands-on Guide to Automated Feature Engineering - Prateek Joshi](https://www.linkedin.com/posts/vipulppatel_hands-on-guide-to-automated-feature-engineering-ugcPost-6612564773705924608-Utyb)
   + [Feature Engineering and Selection](https://www.linkedin.com/posts/nabihbawazir_feature-engineering-and-selection-ugcPost-6603534412548280320-XTIX)
   + [What is feature engineering and why do we need it?](https://www.linkedin.com/posts/srivatsan-srinivasan-b8131b_datascience-machinelearning-ml-activity-6623556433189363712-O7c4)
   + [Feature-engine: an open source Python package to create reproducible feature engineering steps and smooth model deployment](https://www.trainindata.com/feature-engine)
   + [Feature Engineering with Tidyverse](https://www.datasciencecentral.com/profiles/blogs/feature-engineering-with-tidyverse) | [LinkedIn Post](https://www.linkedin.com/posts/data-science-central_feature-engineering-with-tidyverse-activity-6645714064209166337-4szB)
   + [ML topics expanded by Chris Albon](https://chrisalbon.com/#machine_learning) - look for topics: Feature Engineering • Feature Selection
   + [Feature Engineering: Data scientist's Secret Sauce](https://www.linkedin.com/posts/vincentg_feature-engineering-data-scientists-secret-activity-6657351483786358784-L7Mc)
   - [Python Feature Engineering Cookbook by Dr Soledad Galli](https://www.linkedin.com/posts/ajitjaokar_python-feature-engineering-cookbook-activity-6671226001567100928-Wfxn)
   - [How to Use Polynomial Feature Transforms for Machine Learning](https://machinelearningmastery.com/polynomial-features-transforms-for-machine-learning/)
   - [Transforming Quantitative Data to Qualitative Data](https://www.linkedin.com/feed/update/urn:li:activity:6674858845854019584/)
   - [Feature Engineering and Selection: A Practical Approach for Predictive Models](https://www.feat.engineering/)
   - [Feature Engineering for Machine Learning: A Comprehensive Overview](https://trainindata.medium.com/feature-engineering-for-machine-learning-a-comprehensive-overview-a7ad04c896f8)
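
As a rough illustration of the SelectKBest and chi-squared links above, here is a minimal sketch of univariate feature selection with scikit-learn. The synthetic data (two class-dependent count features plus three noise features) is an assumption made up for this example and does not come from any of the linked resources. Note that `chi2` expects non-negative feature values such as counts or frequencies.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)
n_samples = 200
y = rng.integers(0, 2, size=n_samples)  # binary class labels

# Hypothetical data: columns 0-1 are counts whose rate depends on the class,
# columns 2-4 are counts drawn independently of the class (pure noise).
informative = rng.poisson(lam=np.where(y[:, None] == 1, 8.0, 2.0), size=(n_samples, 2))
noise = rng.poisson(lam=3.0, size=(n_samples, 3))
X = np.hstack([informative, noise])

# Keep the k features with the highest chi-squared statistic against y.
selector = SelectKBest(score_func=chi2, k=2)
X_selected = selector.fit_transform(X, y)

selected_idx = selector.get_support(indices=True)  # column indices that were kept
print("chi2 scores:", np.round(selector.scores_, 2))
print("selected columns:", selected_idx)
```

On data like this the class-dependent columns should receive much larger scores than the noise columns. On real data, `k` is a tuning choice and is usually validated together with the downstream model.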

## Dimensionality Reduction

  - [Feature engineering and Dimensionality reduction](https://towardsdatascience.com/dimensionality-reduction-for-machine-learning-80a46c2ebb7e)
  - [Seven Techniques for Data Dimensionality Reduction](https://www.kdnuggets.com/2015/05/7-methods-data-dimensionality-reduction.html)
  - [Linear Discriminant Analysis](https://www.youtube.com/watch?v=D2HArUvOQaw&feature=youtu.be): a simple yet intuitive technique. At first glance it is very similar to PCA, where the eigenvalues and eigenvectors are computed from the covariance matrix | [Other PCA tutorials](https://youtu.be/D2HArUvOQaw)
  - [Principal Component Analysis for Dimensionality Reduction in Python](https://www.linkedin.com/posts/jasonbrownlee_principal-component-analysis-for-dimensionality-activity-6664240738139799552-gCqp)
  - Principal Component Analysis is a beautiful tool: [original thread](https://www.facebook.com/groups/mathfordatascience/permalink/1178371322496956/?__cft__[0]=AZUKcr9SXK7J6g5tJgW9ItNFc6z7qNJWmThqcyh-aCjwjRrVJ6ecPBdFIRUwOCLXNAnOf5W9v1-ZlKaLjeJ4bo1wH2mYXLTCOBcAvjy5_JL7ggNubGZoApyTcHjXdeA0j4wTGNcjdbtfd0xoPdBjkRCJ5nbXGlpQm_lpwkcfIusz8g&__tn__=%2CO%2CP-R) | [video](https://www.youtube.com/watch?v=otv4AUIp9HQ&feature=youtu.be)
  - [Lecture 38 Principal Component Analysis](https://www.youtube.com/watch?v=C6fH5Nfoj40&feature=youtu.be)
  - Kernel PCA methods
    - Notable papers are:
        - [Nonlinear Component Analysis as a Kernel Eigenvalue Problem](https://www.face-rec.org/algorithms/Kernel/kernelPCA_scholkopf.pdf)
        - [Kernel PCA for novelty detection](https://www.researchgate.net/publication/222828640_Kernel_PCA_for_novelty_detection)
    - Additional resources:
      - [Section 3.3.3 for the equivalent kernel of linear regression in Bishop](http://users.isr.ist.utl.pt/~wurmd/Livros/school/Bishop%20-%20Pattern%20Recognition%20And%20Machine%20Learning%20-%20Springer%20%202006.pdf)
      - [A short introduction to the kernel trick on Medium](https://medium.com/@zxr.nju/what-is-the-kernel-trick-why-is-it-important-98a98db0961d)
      - [A mini lecture on the kernel trick](https://www.youtube.com/watch?v=JiM_LXpAtLc)
      - [Kernel PCA Notebook](https://scikit-learn.org/stable/auto_examples/decomposition/plot_kernel_pca.html#sphx-glr-auto-examples-decomposition-plot-kernel-pca-py)
      - [Kernel Interpolation for Scalable Structured Gaussian Processes](https://arxiv.org/abs/1503.01057)
  + Hands-on ML with Python: Clustering, Dim Reduction, Time Series Analysis:
      [GitHub](https://resources.oreilly.com/binderhub/machine-learning-with-python-clustering) | [Jupyter notebook](https://learning.oreilly.com/jupyter-notebooks/hands-on-machine-learning/9781492063179/)
  + UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction: [PyPI](https://pypi.org/project/umap-learn/) | [Docs](https://umap-learn.readthedocs.io/en/latest/)
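
The PCA entries above mention computing eigenvalues and eigenvectors from the covariance matrix. The following is a minimal NumPy sketch of that idea, not a replacement for a library implementation. The function name `pca_via_covariance` and the synthetic data are assumptions introduced for this example.

```python
import numpy as np


def pca_via_covariance(X, n_components):
    """Project X onto the top eigenvectors of its sample covariance matrix."""
    X_centered = X - X.mean(axis=0)          # PCA operates on mean-centered data
    cov = np.cov(X_centered, rowvar=False)   # (n_features, n_features) covariance
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1][:n_components]  # largest variance first
    components = eigenvectors[:, order]
    explained_variance = eigenvalues[order]
    return X_centered @ components, components, explained_variance


rng = np.random.default_rng(42)
# Hypothetical 3-D data that is almost 2-D: the third column is nearly
# the sum of the first two, plus a little noise.
latent = rng.normal(size=(500, 2))
third = latent[:, 0] + latent[:, 1] + 0.01 * rng.normal(size=500)
X = np.column_stack([latent[:, 0], latent[:, 1], third])

projected, components, explained_variance = pca_via_covariance(X, n_components=2)
explained_ratio = explained_variance.sum() / np.trace(np.cov(X, rowvar=False))
print("projected shape:", projected.shape)
print("fraction of variance kept:", round(float(explained_ratio), 4))
```

In practice `sklearn.decomposition.PCA` is the more common choice. It obtains the same components through a singular value decomposition rather than an explicit covariance matrix.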

# Contributing

Contributions are very welcome. Please share back with the wider community (and get credited for it)!

Please have a look at the [CONTRIBUTING](../CONTRIBUTING.md) guidelines and read about our [licensing](../LICENSE.md) policy.

---

Back to [Data page (table of contents)](README.md)<br/>
Back to [main page (table of contents)](../README.md)