
Bertossi, L. (2022). Declarative Approaches to Counterfactual Explanations for Classification. Theory Pract. Log. Program., Early Access.
Abstract: We propose answer-set programs that specify and compute counterfactual interventions on entities that are input on a classification model. In relation to the outcome of the model, the resulting counterfactual entities serve as a basis for the definition and computation of causality-based explanation scores for the feature values in the entity under classification, namely responsibility scores. The approach and the programs can be applied with black-box models, and also with models that can be specified as logic programs, such as rule-based classifiers. The main focus of this study is on the specification and computation of best counterfactual entities, that is, those that lead to maximum responsibility scores. From them one can read off the explanations as maximum responsibility feature values in the original entity. We also extend the programs to bring into the picture semantic or domain knowledge. We show how the approach could be extended by means of probabilistic methods, and how the underlying probability distributions could be modified through the use of constraints. Several examples of programs written in the syntax of the DLV ASP-solver, and run with it, are shown.



Gaskins, J. T., Fuentes, C., & De la Cruz, R. (2022). A Bayesian nonparametric model for classification of longitudinal profiles. Biostatistics, Early Access.
Abstract: Across several medical fields, developing an approach for disease classification is an important challenge. The usual procedure is to fit a model for the longitudinal response in the healthy population, a different model for the longitudinal response in the diseased population, and then apply Bayes' theorem to obtain disease probabilities given the responses. Unfortunately, when substantial heterogeneity exists within each population, this type of Bayes classification may perform poorly. In this article, we develop a new approach by fitting a Bayesian nonparametric model for the joint outcome of disease status and longitudinal response, and then we perform classification through the clustering induced by the Dirichlet process. This approach is highly flexible and allows for multiple subpopulations of healthy, diseased, and possibly mixed membership. In addition, we introduce a Markov chain Monte Carlo sampling scheme that facilitates the assessment of the inference and prediction capabilities of our model. Finally, we demonstrate the method by predicting pregnancy outcomes using longitudinal profiles on the human chorionic gonadotropin beta subunit hormone levels in a sample of Chilean women being treated with assisted reproductive therapy.



Henderson, R. G., Verougstraete, V., Anderson, K., Arbildua, J. J., Brock, T. O., Brouwers, T., et al. (2014). Interlaboratory validation of bioaccessibility testing for metals. Regul. Toxicol. Pharmacol., 70(1), 170–181.
Abstract: Bioelution assays are fast, simple alternatives to in vivo testing. In this study, the intra- and inter-laboratory variability in bioaccessibility data generated by bioelution tests was evaluated in synthetic fluids relevant to oral, inhalation, and dermal exposure. Using one defined protocol, five laboratories measured metal release from cobalt oxide, cobalt powder, copper concentrate, Inconel alloy, leaded brass alloy, and nickel sulfate hexahydrate. Standard deviations of repeatability (s_r) and reproducibility (s_R) were used to evaluate the intra- and inter-laboratory variability, respectively. Examination of the s_R:s_r ratios demonstrated that, while gastric and lysosomal fluids had reasonably good reproducibility, other fluids did not show as good concordance between laboratories. Relative standard deviation (RSD) analysis showed more favorable reproducibility outcomes for some data sets; overall results varied more between than within laboratories. RSD analysis of s_r showed good within-laboratory variability for all conditions except some metals in interstitial fluid. In general, these findings indicate that absolute bioaccessibility results in some biological fluids may vary between different laboratories. However, for most applications, measures of relative bioaccessibility are needed, diminishing the requirement for high inter-laboratory reproducibility in absolute metal releases. The inter-laboratory exercise suggests that the degrees of freedom within the protocol need to be addressed.



Henriquez, P. A., & Ruz, G. A. (2017). Extreme learning machine with a deterministic assignment of hidden weights in two parallel layers. Neurocomputing, 226, 109–116.
Abstract: Extreme learning machine (ELM) is a machine learning technique based on competitive single-hidden layer feedforward neural network (SLFN). However, traditional ELM and its variants are only based on random assignment of hidden weights using a uniform distribution, and then the calculation of the output weights using the least-squares method. This paper proposes a new architecture based on a nonlinear layer in parallel with another nonlinear layer and with entries of independent weights. We explore the use of a deterministic assignment of the hidden weight values using low-discrepancy sequences (LDSs). The simulations are performed with Halton and Sobol sequences. The results for regression and classification problems confirm the advantages of using the proposed method called PL-ELM algorithm with the deterministic assignment of hidden weights. Moreover, the PL-ELM algorithm with the deterministic generation using LDSs can be extended to other modified ELM algorithms.
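
Illustrative note (not part of the abstract): the deterministic assignment of hidden weights from a low-discrepancy sequence can be sketched with a small Halton generator. This is a minimal sketch only; the function names, the choice of bases, and the rescaling to the interval [-1, 1] are assumptions made for illustration and are not taken from the paper.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a non-negative integer in `base`."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_weights(n_points, bases, low=-1.0, high=1.0):
    """Return n_points rows of Halton values, one column per base.

    Each row could serve as the input weights of one hidden neuron.
    The rescaling from [0, 1) to [low, high) is an assumption here.
    """
    rows = []
    for i in range(1, n_points + 1):  # start at 1 to skip the all-zero point
        rows.append([low + (high - low) * radical_inverse(i, b) for b in bases])
    return rows


if __name__ == "__main__":
    # Four hypothetical hidden neurons with two inputs each (bases 2 and 3).
    for row in halton_weights(4, (2, 3)):
        print(row)
```

Unlike uniform random draws, the same call always returns the same weights, which is the property a deterministic assignment relies on.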



Henriquez, P. A., & Ruz, G. A. (2018). A non-iterative method for pruning hidden neurons in neural networks with random weights. Appl. Soft. Comput., 70, 1109–1121.
Abstract: Neural networks with random weights have the advantage of fast computational time in both training and testing. However, one of the main challenges of single layer feedforward neural networks is the selection of the optimal number of neurons in the hidden layer, since few/many neurons lead to problems of underfitting/overfitting. Adapting Garson's algorithm, this paper introduces a new efficient and fast non-iterative algorithm for the selection of neurons in the hidden layer for randomization-based neural networks. The proposed approach is divided into three steps: (1) train the network with h hidden neurons, (2) apply Garson's algorithm to the matrix of the hidden layer, and (3) perform pruning reducing hidden layer neurons based on the harmonic mean. Our experiments in regression and classification problems confirmed that the combination of the pruning technique with these types of neural networks improved their predictive performance in terms of mean square error and accuracy. Additionally, we tested our proposed pruning method with neural networks trained under sequential learning algorithms, where Random Vector Functional Link obtained, in general, the best predictive performance compared to online sequential versions of extreme learning machines and single hidden layer neural network with random weights.



Henriquez, P. A., & Ruz, G. A. (2019). Noise reduction for near-infrared spectroscopy data using extreme learning machines. Eng. Appl. Artif. Intell., 79, 13–22.
Abstract: The near infrared (NIR) spectra technique is an effective approach to predict chemical properties and it is typically applied in petrochemical, agricultural, medical, and environmental sectors. NIR spectra are usually of very high dimensions and contain huge amounts of information. Most of the information is irrelevant to the target problem and some is simply noise. Thus, it is not an easy task to discover the relationship between NIR spectra and the predictive variable. However, this kind of regression analysis is one of the main topics of machine learning. Thus machine learning techniques play a key role in NIR-based analytical approaches. Preprocessing of NIR spectral data has become an integral part of chemometrics modeling. The objective of the preprocessing is to remove physical phenomena (noise) in the spectra in order to improve the regression or classification model. In this work, we propose to reduce the noise using extreme learning machines which have shown good predictive performances in regression applications as well as in large dataset classification tasks. For this, we use a novel algorithm called C-PL-ELM, which has an architecture in parallel based on a nonlinear layer in parallel with another nonlinear layer. Using the soft margin loss function concept, we incorporate two Lagrange multipliers with the objective of including the noise of spectral data. Six real-life datasets were analyzed to illustrate the performance of the developed models. The results for regression and classification problems confirm the advantages of using the proposed method in terms of root mean square error and accuracy.



Hughes, S., Moreno, S., Yushimito, W. F., & Huerta-Canepa, G. (2019). Evaluation of machine learning methodologies to predict stop delivery times from GPS data. Transp. Res. Pt. C-Emerg. Technol., 109, 289–304.
Abstract: In last mile distribution, logistics companies typically arrange and plan their routes based on broad estimates of stop delivery times (i.e., the time spent at each stop to deliver goods to final receivers). If these estimates are not accurate, the level of service is degraded, as the promised time window may not be satisfied. The purpose of this work is to assess the feasibility of machine learning techniques to predict stop delivery times. This is done by testing a wide range of machine learning techniques (including different types of ensembles) to (1) predict the stop delivery time and (2) determine whether the total stop delivery time will exceed a predefined time threshold (classification approach). For the assessment, all models are trained using information generated from GPS data collected in Medellin, Colombia and compared to hazard duration models. The results are threefold. First, the assessment shows that regression-based machine learning approaches are not better than conventional hazard duration models concerning absolute errors of the prediction of the stop delivery times. Second, when the problem is addressed by a classification scheme in which the prediction is aimed to guide whether a stop time will exceed a predefined time, a basic K-nearest-neighbor model outperforms hazard duration models and other machine learning techniques both in accuracy and F1 score (harmonic mean between precision and recall). Third, the prediction of the exact duration can be improved by combining the classifiers and prediction models or hazard duration models in a two-level scheme (first classification then prediction). However, the improvement depends largely on the correct classification (first level).
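
Illustrative note (not part of the abstract): the F1 score mentioned above, the harmonic mean of precision and recall, can be computed as in the following minimal sketch. The labels are made-up toy data (1 meaning "stop time exceeds the threshold"), and the function name is hypothetical; none of it comes from the study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one positive class in a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean; defined as 0 when both precision and recall are 0.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy labels only: 1 = stop exceeds the time threshold, 0 = it does not.
    y_true = [1, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 1, 0]
    print(precision_recall_f1(y_true, y_pred))
```

Because the harmonic mean is dominated by the smaller of its two arguments, a classifier cannot reach a high F1 by maximizing precision or recall alone.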



Lobos, F., Goles, E., Ruivo, E. L. P., de Oliveira, P. P. B., & Montealegre, P. (2018). Mining a Class of Decision Problems for One-dimensional Cellular Automata. J. Cell. Autom., 13(5-6), 393–405.
Abstract: Cellular automata are locally defined, homogeneous dynamical systems, discrete in space, time and state variables. Within the context of one-dimensional, binary, cellular automata operating on cyclic configurations of odd length, we consider the general decision problem: if the initial configuration satisfies a given property, the lattice should converge to the fixed point of all 1s ($\vec{1}$), or to $\vec{0}$ otherwise. Two problems in this category have been widely studied in the literature, the parity problem [1] and the density classification task [4]. We are interested in determining all cellular automata rules with neighborhood sizes of 2, 3, 4 and 5 cells (i.e., radius r of 0.5, 1, 1.5 and 2) that solve decision problems of the previous type. We have demonstrated a theorem that, for any given rule in those spaces, ensures the non-existence of fixed points other than $\vec{0}$ and $\vec{1}$ for configurations of size larger than $2^{2r}$, provided that the rule does not support different fixed points for any configuration with size smaller than or equal to $2^{2r}$. In addition, we have a proposition that ensures the convergence to only $\vec{0}$ or $\vec{1}$ of any initial configuration, if the rule complies with given conditions. By means of theoretical and computational approaches, we determined that: for the rule spaces defined by radius r = 0.5 and r = 1, only 1 and 2 rules, respectively, converge to $\vec{1}$ or $\vec{0}$ for any initial configuration, and both recognize the same language; and for the rule space defined by radius r = 1.5, 40 rules satisfy this condition and recognize 4 different languages. Finally, for the radius 2 space, out of the 4,294,967,296 different rules, we were able to significantly filter it out, down to 40,941 candidate rules. We hope that such an extensive mining will unveil new decision problems of the type widely studied in the literature.
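
Illustrative note (not part of the abstract): the setting described above, a binary radius-1 rule applied synchronously to a cyclic configuration and checked for fixed points, can be sketched as follows. This is a minimal sketch under stated assumptions: the Wolfram rule-number encoding is assumed as the convention, and rule 232 (local majority) is used purely as a familiar example, not as one of the rules identified in the paper.

```python
def ca_step(config, rule_number):
    """One synchronous update of a radius-1 binary CA on a cyclic lattice.

    Assumes the Wolfram encoding: bit k of rule_number is the output for the
    neighborhood (left, center, right) read as the binary number k.
    """
    n = len(config)
    nxt = []
    for i in range(n):
        k = 4 * config[(i - 1) % n] + 2 * config[i] + config[(i + 1) % n]
        nxt.append((rule_number >> k) & 1)
    return nxt


def is_fixed_point(config, rule_number):
    """True when the configuration is unchanged by one update."""
    return ca_step(config, rule_number) == config


def run(config, rule_number, max_steps):
    """Iterate until a fixed point is reached or max_steps updates are done."""
    for _ in range(max_steps):
        nxt = ca_step(config, rule_number)
        if nxt == config:
            break
        config = nxt
    return config


if __name__ == "__main__":
    rule = 232  # local majority, chosen only for illustration
    print(run([0, 1, 0, 0, 0], rule, 10))
    print(is_fixed_point([0, 0, 1, 1, 1], rule))
```

With this example rule, a configuration such as 00111 is a fixed point different from the all-0s and all-1s configurations, which is the kind of behavior a rule solving the decision problem must not exhibit.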



Montalva-Medel, M., de Oliveira, P. P. B., & Goles, E. (2018). A portfolio of classification problems by one-dimensional cellular automata, over cyclic binary configurations and parallel update. Nat. Comput., 17(3), 663–671.
Abstract: Decision problems addressed by cellular automata have been historically expressed either as determining whether initial configurations would belong to a given language, or as classifying the initial configurations according to a property in them. Unlike traditional approaches in language recognition, classification problems have typically relied upon cyclic configurations and fully parallel (two-way) update of the cells, which render the action of the cellular automaton relatively less controllable and difficult to analyse. Although the notion of cyclic languages has been studied in the wider realm of formal languages, only recently a more systematic attempt has come into play in respect to cellular automata with fully parallel update. With the goal of contributing to this effort, we propose a unified definition of classification problem for one-dimensional, binary cellular automata, from which various known problems are couched in and novel ones are defined, and analyse the solvability of the new problems. Such a unified perspective aims at increasing existing knowledge about classification problems by cellular automata over cyclic configurations and parallel update.



Pham, D. T., & Ruz, G. A. (2009). Unsupervised training of Bayesian networks for data clustering. Proc. R. Soc. A-Math. Phys. Eng. Sci., 465(2109), 2927–2948.
Abstract: This paper presents a new approach to the unsupervised training of Bayesian network classifiers. Three models have been analysed: the Chow and Liu (CL) multinets; the tree-augmented naive Bayes; and a new model called the simple Bayesian network classifier, which is more robust in its structure learning. To perform the unsupervised training of these models, the classification maximum likelihood criterion is used. The maximization of this criterion is derived for each model under the classification expectation-maximization (EM) algorithm framework. To test the proposed unsupervised training approach, 10 well-known benchmark datasets have been used to measure their clustering performance. Also, for comparison, the results for the k-means and the EM algorithm, as well as those obtained when the three Bayesian network classifiers are trained in a supervised way, are analysed. A real-world image processing application is also presented, dealing with clustering of wood board images described by 165 attributes. Results show that the proposed learning method, in general, outperforms traditional clustering algorithms and, in the wood board image application, the CL multinets obtained a 12 per cent increase, on average, in clustering accuracy when compared with the k-means method and a 7 per cent increase, on average, when compared with the EM algorithm.



Rozas Andaur, J. M., Ruz, G. A., & Goycoolea, M. (2021). Predicting Out-of-Stock Using Machine Learning: An Application in a Retail Packaged Foods Manufacturing Company. Electronics, 10(22), 2787.
Abstract: For decades, Out-of-Stock (OOS) events have been a problem for retailers and manufacturers. In grocery retailing, an OOS event is used to characterize the condition in which customers do not find a certain commodity while attempting to buy it. This paper focuses on addressing this problem from a manufacturer’s perspective, conducting a case study in a retail packaged foods manufacturing company located in Latin America. We developed two machine-learning-based systems to detect OOS events automatically. The first is based on a single Random Forest classifier with balanced data, and the second is an ensemble of six different classification algorithms. We used transactional data from the manufacturer information system and physical audits. The novelty of this work is our use of new predictor variables of OOS events. The system was successfully implemented and tested in a retail packaged foods manufacturer company. By incorporating the new predictive variables in our Random Forest and Ensemble classifier, we were able to improve their system’s predictive power. In particular, the Random Forest classifier presented the best performance in a real-world setting, achieving a detection precision of 72% and identifying 68% of the total OOS events. Finally, the incorporation of our new predictor variables allowed us to improve the performance of the Random Forest by 0.24 points in the F-measure.



Ruz, G. A. (2016). Improving the performance of inductive learning classifiers through the presentation order of the training patterns. Expert Syst. Appl., 58, 1–9.
Abstract: Although the development of new supervised learning algorithms for machine learning techniques is mostly oriented to improve the predictive power or classification accuracy, the capacity to understand how the classification process is carried out is of great interest for many applications in business and industry. Inductive learning algorithms, like the Rules family, induce semantically interpretable classification rules in the form of if-then rules. Although the effectiveness of the Rules family has been studied thoroughly and new and improved versions are constantly being developed, one important drawback is the effect of the presentation order of the training patterns, which has not been studied in depth previously. In this paper this issue is addressed, first by studying empirically the effect of random presentation orders on the number of rules and the generalization power of the resulting classifier. Then a presentation order method for the training examples is proposed which combines a clustering stage with a new density measure developed specifically for this problem. The results using benchmark datasets and a real application of wood defect classification show the effectiveness of the proposed method. Also, since the presentation order method is employed as a preprocessing stage, the simplicity of the Rules family is not affected but instead it enables the generation of fewer and more accurate rules, which can have a direct impact on the performance and usefulness of the Rules family in an expert system context.



Sanchez-Saez, P., Lira, H., Marti, L., Sanchez-Pi, N., Arredondo, J., Bauer, F. E., et al. (2021). Searching for Changing-state AGNs in Massive Data Sets. I. Applying Deep Learning and Anomaly-detection Techniques to Find AGNs with Anomalous Variability Behaviors. Astron. J., 162(5), 206.
Abstract: The classic classification scheme for active galactic nuclei (AGNs) was recently challenged by the discovery of the so-called changing-state (changing-look) AGNs. The physical mechanism behind this phenomenon is still a matter of open debate and the samples are too small and of serendipitous nature to provide robust answers. In order to tackle this problem, we need to design methods that are able to detect AGNs right in the act of changing state. Here we present an anomaly-detection technique designed to identify AGN light curves with anomalous behaviors in massive data sets. The main aim of this technique is to identify CSAGN at different stages of the transition, but it can also be used for more general purposes, such as cleaning massive data sets for AGN variability analyses. We used light curves from the Zwicky Transient Facility data release 5 (ZTF DR5), containing a sample of 230,451 AGNs of different classes. The ZTF DR5 light curves were modeled with a Variational Recurrent Autoencoder (VRAE) architecture, that allowed us to obtain a set of attributes from the VRAE latent space that describes the general behavior of our sample. These attributes were then used as features for an Isolation Forest (IF) algorithm that is an anomaly detector for a “one class” kind of problem. We used the VRAE reconstruction errors and the IF anomaly score to select a sample of 8809 anomalies. These anomalies are dominated by bogus candidates, but we were able to identify 75 promising CSAGN candidates.

