The importance of domain knowledge for successful and robust predictive modelling
Abstract: Domain knowledge helps to build more precise and robust predictive models and thus obtain better insights. In the course of preparatory work, it helps inform what questions to ask, define the key fields to examine more closely, and identify where and how the insights from the analysis can support business goals. As this paper will discuss, it is also of great benefit when it comes to selecting or reducing variables, supplementing missing data, handling outliers or applying specific binning techniques. This paper argues that data scientists cannot rely on technical knowledge alone; rather, they must acquire relevant domain knowledge and familiarise themselves with pertinent rules of thumb. The paper also highlights the importance of maintaining close contact with the people who collect and prepare the data.
Keywords: predictive modelling, domain knowledge, binning, dummy variables, data preparation, missing data, data mining
Andrea Ahlemeyer-Stubbe is Director Strategic Analytics at servicepro GmbH, and former President of the European Network of Business and Industrial Statistics. She is co-author (with Shirley Coleman) of ‘Monetising Data — How to Uplift Your Business’ and ‘A Practical Guide to Data Mining for Business and Industry’, as well as a frequent lecturer at various universities and speaker at industry conferences.
Agnes Müller is Senior Analytical Consultant at servicepro GmbH. She is involved in projects and workshops for customers both large and small from different industries. Combining technical insight with writing skills, she focuses on making complex analytical cases and results understandable for customers and others who are unfamiliar with analytics.