|
Valle, M. A., Ruz, G. A., & Morras, R. (2018). Market basket analysis: Complementing association rules with minimum spanning trees. Expert Syst. Appl., 97, 146–162.
Abstract: This study proposes a methodology for market basket analysis based on minimum spanning trees, which complements the search for significant association rules among the vast set of rules that usually characterize such an analysis. Thanks to the hierarchical tree structure of the subdominant ultrametric distances of the MST, the association network allows us to find strong interdependencies between products in the same category, and to find products that serve as accesses or bridges to a set of other products with a high correlation among themselves. One relevant aspect of this graph-based methodology is the ease with which pairs and groups of products susceptible to carrying out marketing actions can be identified. The application of our methodology to a real transactional database succeeded in: 1. revealing product interdependencies with the greatest strengths, 2. revealing products of high importance with access to another product set, 3. determining high quality association rules, and 4. detecting clusters and taxonomic relations among supermarket subcategories. This is highly beneficial for a retail manager or for a retail analyst who must propose different promotion and offer activities in order to maximize the sales volume and increase the effectiveness of promotion campaigns. (C) 2017 Elsevier Ltd. All rights reserved.
|
|
|
Ruz, G. A. (2016). Improving the performance of inductive learning classifiers through the presentation order of the training patterns. Expert Syst. Appl., 58, 1–9.
Abstract: Although the development of new supervised learning algorithms for machine learning techniques is mostly oriented to improve the predictive power or classification accuracy, the capacity to understand how the classification process is carried out is of great interest for many applications in business and industry. Inductive learning algorithms, like the Rules family, induce semantically interpretable classification rules in the form of if-then rules. Although the effectiveness of the Rules family has been studied thoroughly and new and improved versions are constantly being developed, one important drawback is the effect of the presentation order of the training patterns, which has not been studied in depth previously. In this paper this issue is addressed, first by studying empirically the effect of random presentation orders on the number of rules and the generalization power of the resulting classifier. Then a presentation order method for the training examples is proposed which combines a clustering stage with a new density measure developed specifically for this problem. The results using benchmark datasets and a real application of wood defect classification show the effectiveness of the proposed method. Also, since the presentation order method is employed as a preprocessing stage, the simplicity of the Rules family is not affected but instead it enables the generation of fewer and more accurate rules, which can have a direct impact on the performance and usefulness of the Rules family in an expert system context. (C) 2016 Elsevier Ltd. All rights reserved.
|
|
|
Ruz, G. A., Varas, S., & Villena, M. (2013). Policy making for broadband adoption and usage in Chile through machine learning. Expert Syst. Appl., 40(17), 6728–6734.
Abstract: For developing countries, such as Chile, we study the influential factors for adoption and usage of broadband services. In particular, subsidies on the broadband price are analyzed to see if this initiative has a significant effect on broadband penetration. To carry out this study, machine learning techniques are used to identify different household profiles using the data obtained from a survey on access, use, and users of broadband Internet from Chile. Different policies are proposed for each group found, which were then evaluated empirically through Bayesian networks. Results show that an unconditional subsidy for the Internet price does not seem to be very appropriate for everyone since it is only significant for some household groups. The evaluation using Bayesian networks showed that other policies should be considered as well, such as the incorporation of computers, Internet applications development, and digital literacy training. (C) 2013 Elsevier Ltd. All rights reserved.
|
|
|
Valle, M. A., Varas, S., & Ruz, G. A. (2012). Job performance prediction in a call center using a naive Bayes classifier. Expert Syst. Appl., 39(11), 9939–9945.
Abstract: This study presents an approach to predict the performance of sales agents of a call center dedicated exclusively to sales and telemarketing activities. This approach is based on a naive Bayesian classifier. The objective is to know what levels of the attributes are indicative of individuals who perform well. A sample of 1037 sales agents was taken during the period between March and September of 2009 on campaigns related to insurance sales and service pre-paid phone services, to build the naive Bayes network. It has been shown that socio-demographic attributes are not suitable for predicting performance. Alternatively, operational records were used to predict production of sales agents, achieving satisfactory results. In this case, the classifier training and testing are done through a stratified tenfold cross-validation. It classified the instances correctly 80.60% of the time, with a proportion of false positives of 18.1% for the class no (does not achieve minimum) and 20.8% for the class yes (achieves equal or above minimum acceptable). These results suggest that socio-demographic attributes have no predictive power on performance, while the operational information of the activities of the sales agent can predict the future performance of the agent. (C) 2012 Elsevier Ltd. All rights reserved.
|
|