Applied Computational Economics: Identifying “User Innovation” Through Machine Learning
A large body of empirical research has shown that so-called “lead users”, and not manufacturers, are the source of many functionally novel products and services. Although the lead user phenomenon is of great research interest because of its sheer importance, it is at odds with the dominant paradigm, in which public funding and innovation systems account only for manufacturer innovation. Today’s vast amount of open source data can help us close this gap.
The objective of this project is therefore to apply novel search and machine learning algorithms to develop an automated approach for identifying lead users and their inventions in large bodies of unstructured data. Our main goal thus lies in the classification of information. We use existing text (written project descriptions and spoken text in videos) from projects on the Kickstarter platform, the largest platform for financing innovation through the crowd. We apply a variety of methods to learn the concept of lead-user-driven and successful crowdfunding projects at the time of project launch. More specifically, we apply word-vector models, sentiment analyses, penalized logistic regressions and supervised machine learning approaches (support vector machines). Our work will be based on a largely open source software stack of Python libraries, Stanford CoreNLP, R, IBM Watson and TensorFlow, Google’s open source machine learning library.
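To make the classification step concrete, the following is a minimal sketch of an L2-penalized logistic regression over bag-of-words features, written in plain Python so that it runs without additional libraries. It is an illustration under stated assumptions, not the project's actual pipeline: the four training sentences, their labels, the function names and the hyperparameters (`l2`, `lr`, `epochs`) are all invented for this example and are not taken from the Kickstarter data.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    # Lowercase, split on whitespace, strip surrounding punctuation.
    tokens = (w.strip(PUNCT).lower() for w in text.split())
    return [t for t in tokens if t]

def build_vocab(texts):
    # Map every token seen in training to a column index.
    vocab = sorted({tok for t in texts for tok in tokenize(t)})
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(text, vocab):
    # Sparse bag-of-words counts; tokens outside the vocabulary are ignored.
    counts = Counter(tokenize(text))
    return {vocab[t]: c for t, c in counts.items() if t in vocab}

def predict_proba(x, weights, bias):
    z = bias + sum(weights[i] * v for i, v in x.items())
    return 1.0 / (1.0 + math.exp(-z))

def train(texts, labels, l2=0.01, lr=0.1, epochs=500):
    # Batch gradient descent on the mean log-loss plus an L2 penalty
    # on the weights (the bias is not penalized).
    vocab = build_vocab(texts)
    X = [vectorize(t, vocab) for t in texts]
    weights, bias, n = [0.0] * len(vocab), 0.0, len(X)
    for _ in range(epochs):
        grad_w = [l2 * w for w in weights]
        grad_b = 0.0
        for x, y in zip(X, labels):
            err = predict_proba(x, weights, bias) - y
            grad_b += err / n
            for i, v in x.items():
                grad_w[i] += err * v / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return vocab, weights, bias

# Invented toy sentences (NOT Kickstarter data); 1 = lead-user-like wording.
texts = [
    "I built this for myself because no existing product solved my problem",
    "As a cyclist I needed this tool and made a prototype in my garage",
    "Our company brings you the newest model in our product line",
    "Our brand is launching its latest collection this season",
]
labels = [1, 1, 0, 0]

vocab, weights, bias = train(texts, labels)

def score(text):
    # Estimated probability that a description uses lead-user-like wording.
    return predict_proba(vectorize(text, vocab), weights, bias)
```

Calling `score("I made this prototype because I needed it myself")` returns the model's estimated probability for an unseen description. In practice, the same idea would be run with an established library and with far richer features (word vectors, sentiment scores) than raw word counts.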
Applying these methods, we will proceed in three steps. (1) We will first identify lead users and map the corresponding phrases that highlight lead user characteristics. (2) Subsequently, we will train the learning algorithm using this information. (3) The findings can then be corroborated with additional data collected from the Kickstarter website, for example by converting spoken text from videos. After fine-tuning the algorithm, we will use it to find other, similar user-entrepreneur projects among the 210,000 projects in the database. Lastly, we will analyze the returned list of possible lead user projects with respect to their performance (success and amount pledged). To the best of our knowledge, all three approaches are novel in crowdfunding and lead user research.
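The final performance analysis could be sketched roughly as follows: projects are split by classifier score into flagged (possible lead user) projects and all others, and the two groups are compared on success rate and median amount pledged. The field names (`score`, `successful`, `pledged`), the 0.5 threshold and the five records are hypothetical placeholders chosen for illustration; they are not results from the Kickstarter database.

```python
from statistics import median

def summarize(projects, threshold=0.5):
    # Split projects by classifier score, then describe each group.
    groups = {"flagged": [], "other": []}
    for p in projects:
        key = "flagged" if p["score"] >= threshold else "other"
        groups[key].append(p)

    summary = {}
    for name, members in groups.items():
        if not members:
            # Empty group: no statistics can be computed.
            summary[name] = {"n": 0, "success_rate": None, "median_pledged": None}
            continue
        summary[name] = {
            "n": len(members),
            "success_rate": sum(p["successful"] for p in members) / len(members),
            "median_pledged": median(p["pledged"] for p in members),
        }
    return summary

# Hypothetical placeholder records (NOT real Kickstarter outcomes).
projects = [
    {"score": 0.9, "successful": True, "pledged": 1200.0},
    {"score": 0.7, "successful": False, "pledged": 300.0},
    {"score": 0.2, "successful": True, "pledged": 800.0},
    {"score": 0.1, "successful": False, "pledged": 100.0},
    {"score": 0.4, "successful": False, "pledged": 50.0},
]

summary = summarize(projects)
```

The median is used rather than the mean because pledged amounts on crowdfunding platforms tend to be highly skewed; a full analysis would additionally control for category, funding goal and launch date before drawing conclusions.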