Data Mining Techniques in the Analysis of the Causal Factors Regarding Innovation in the Private Sector at the Level of Europe

Abstract:

The rapid growth in data volumes and the need to make sense of the resulting information chaos have led to the introduction of a new concept that encompasses both aspects. In a world rich in data and information, data mining is a useful tool for analyzing large data sets. In this article we analyze a data set to observe how different causal factors drive innovation in the private sector. The analysis covers several European countries. The first step was data processing and descriptive statistics. The second step was principal component analysis, used to reduce the dimensionality of the data set. The third step was cluster analysis, used to group the variables into classes with minimum variability within each class and maximum variability between classes. The final step was the use of classification trees to identify which characteristics influence an individual's decision to become an entrepreneur.