|
Araya-Diaz, P., Ruz, G. A., & Palomino, H. M. (2013). Discovering Craniofacial Patterns Using Multivariate Cephalometric Data for Treatment Decision Making in Orthodontics. Int. J. Morphol., 31(3), 1109–1115.
Abstract: The aim was to find craniofacial morphology patterns in a multivariate cephalometric database using a clustering technique. Cephalometric analysis was performed in a sample of 100 teleradiographs collected from Chilean orthodontic patients. Thirty cephalometric measurements were taken from commonly used analysis. The computed variables were used to perform a clustering analysis with the k-means algorithm to identify patterns of craniofacial morphology. The J48 decision tree was used to analyze each cluster, and the ANOVA test to determine the statistical differences between the clusters. Four clusters were found that had significant differences (P<0.001) in 24 of the 30 variables studied, suggesting that they represent different patterns of craniofacial form. Using the decision tree, 8 of the 30 variables appeared to be relevant for describing the clusters. The clustering analysis is effective in identifying different craniofacial patterns based on a multivariate database. The distinct clusters appear to be caused by differences in the compensation process of the facial structure responding to a genetically determined cranial and mandible form. The proposed method can be applied to several databases, creating specific classifications for each one of them.
|
|
|
Arias-Garzón, D., Tabares-Soto, R., Bernal-Salcedo, J., & Ruz, G. A. (2023). Biases associated with database structure for COVID-19 detection in X-ray images. Sci. Rep., 13, 3477.
Abstract: Several artificial intelligence algorithms have been developed for COVID-19-related topics. One that has been common is the COVID-19 diagnosis using chest X-rays, where the eagerness to obtain early results has triggered the construction of a series of datasets where bias management has not been thorough from the point of view of patient information, capture conditions, class imbalance, and careless mixtures of multiple datasets. This paper analyses 19 datasets of COVID-19 chest X-ray images, identifying potential biases. Moreover, computational experiments were conducted using one of the most popular datasets in this domain, which obtains a 96.19% of classification accuracy on the complete dataset. Nevertheless, when evaluated with the ethical tool Aequitas, it fails on all the metrics. Ethical tools enhanced with some distribution and image quality considerations are the keys to developing or choosing a dataset with fewer bias issues. We aim to provide broad research on dataset problems, tools, and suggestions for future dataset developments and COVID-19 applications using chest X-ray images.
|
|
|
Billi, M., Mascareno, A., Henriquez, P. A., Rodriguez, I., Padilla, F., & Ruz, G. A. (2022). Learning from crises? The long and winding road of the salmon industry in Chiloe Island, Chile. Mar. Pol., 140, 105069.
Abstract: The rapid development of salmon aquaculture worldwide and the growing criticism of the activity in recent decades have raised doubts about the capacity of the sector to learn from its own crises. In this article, we assess the discursive, behavioral and outcome performance dimensions of the industry to identify actual learning and lessons to be learned. We focus on the case of Chiloe Island, Chile, a global center of salmon production since 1990 that has gone through two severe crises in the last 15 years (2007-2009 ISAV crisis and 2016 red tide crisis). On the basis of a multi-method approach combining qualitative analysis of interviews and statistical data analysis, we observe that the industry has discursively learned the relevance of both self-regulation and the wellbeing of communities. However, at the behavioral and outcome performance levels, the data show a highly heterogeneous conduct that questions the ability of the sector as a whole to learn from crises. We conclude that detrimental effects for ecosystems and society will increase if learning remains at the level of discourses. Without significant changes in operational practices and market performance there are no real perspectives for the sustainability of the industry. This intensifies when considering the uneven responses to governance mechanisms. The sector needs to adapt its factual performance to sustainable goals and reflexively monitor this process. The first step for achieving this is to produce reliable data to make evidence-based decisions that align the operational dynamics of the entire sector with a more sustainable trajectory in the near future, as well as advancing towards hybrid and more reflexive governance arrangements.
|
|
|
Canals, C., Goles, E., Mascareno, A., Rica, S., & Ruz, G. A. (2018). School Choice in a Market Environment: Individual versus Social Expectations. Complexity, 3793095, 11 pp.
Abstract: School choice is a key factor connecting personal preferences (beliefs, desires, and needs) and school offer in education markets. While it is assumed that preferences are highly individualistic forms of expectations by means of which parents select schools satisfying their internal moral standards, this paper argues that a better matching between parental preferences and school offer is achieved when individuals take into account their relevant network vicinity, thereby constructing social expectations regarding school choice. We develop two related models (individual expectations and social expectations) and prove that they are driven by a Lyapunov function, obtaining that both models converge to fixed points. Also, we assess their performance by conducting computational simulations. While the individual expectations model shows a probabilistic transition and a critical threshold below which preferences concentrate in a few schools and a significant amount of students is left unattended by the school offer, the social expectations model presents a smooth dynamics in which most of the schools have students all the time and no students are left out. We discuss our results considering key topics of the empirical research on school choice in educational market environments and conclude that social expectations contribute to improve information and lead to a better matching between school offer and parental preferences.
|
|
|
Chetty, M., Hallinan, J., Ruz, G. A., & Wipat, A. (2022). Computational intelligence and machine learning in bioinformatics and computational biology. Biosystems, 222, 104792.
Abstract: Bioinformatics and computational biology are major beneficiaries of the current innovations in artificial intelligence and machine learning. While Bioinformatics applies principles of computer science and technique to help understand the vast, diverse, and complex life sciences data and thus make it more useful, in contrast, Computational Biology applies computational approaches to address theoretical and experimental questions in biology. This Special Issue on Computational Intelligence and Machine Learning in Bioinformatics and Computational Biology comprises extended versions of the key papers from the 18th IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB, 2021), which is a major event in the field of computational intelligence and its applications to problems in bioinformatics, computational biology, and biomedical engineering. The conference, run annually, provides a global forum for academic and industrial scientists from computer science, biology, chemistry, medicine, mathematics, statistics, and engineering, to discuss and present their latest research findings from theory to applications.
|
|
|
Cho, A. D., Carrasco, R. A., & Ruz, G. A. (2022). A RUL Estimation System from Clustered Run-to-Failure Degradation Signals. Sensors, 22(14), 5323.
Abstract: The prognostics and health management disciplines provide an efficient solution to improve a system's durability, taking advantage of its lifespan in functionality before a failure appears. Prognostics are performed to estimate the system or subsystem's remaining useful life (RUL). This estimation can be used as a supply in decision-making within maintenance plans and procedures. This work focuses on prognostics by developing a recurrent neural network and a forecasting method called Prophet to measure the performance quality in RUL estimation. We apply this approach to degradation signals, which do not need to be monotonical. Finally, we test our system using data from new generation telescopes in real-world applications.
|
|
|
Cho, A. D., Carrasco, R. A., & Ruz, G. A. (2022). Improving Prescriptive Maintenance by Incorporating Post-Prognostic Information Through Chance Constraints. IEEE Access, 10, 55924–55932.
Abstract: Maintenance is one of the critical areas in operations in which a careful balance between preventive costs and the effect of failures is required. Thanks to the increasing data availability, decision-makers can now use models to better estimate, evaluate, and achieve this balance. This work presents a maintenance scheduling model which considers prognostic information provided by a predictive system. In particular, we developed a prescriptive maintenance system based on run-to-failure signal segmentation and a Long Short Term Memory (LSTM) neural network. The LSTM network returns the prediction of the remaining useful life when a fault is present in a component. We incorporate such predictions and their inherent errors in a decision support system based on a stochastic optimization model, incorporating them via chance constraints. These constraints control the number of failed components and consider the physical distance between them to reduce sparsity and minimize the total maintenance cost. We show that this approach can compute solutions for relatively large instances in reasonable computational time through experimental results. Furthermore, the decision-maker can identify the correct operating point depending on the balance between costs and failure probability.
|
|
|
Cho, A. D., Carrasco, R. A., Ruz, G. A., & Ortiz, J. L. (2020). Slow Degradation Fault Detection in a Harsh Environment. IEEE Access, 8, 175904–175920.
Abstract: The ever increasing challenges posed by the science projects in astronomy have skyrocketed the complexity of the new generation telescopes. Due to the climate and sky requirements, these high precision instruments are generally located in remote areas, suffering from the harsh environments around it. These modern telescopes not only produce massive amounts of scientific data, but they also generate an enormous amount of operational information. The Atacama Large Millimeter/submillimeter Array (ALMA) is one of these unique instruments, generating more than 50 Gb of operational data every day while functioning in conditions of extreme dryness and altitude. To maintain the array working under extreme conditions, the engineering teams must check over 130,000 monitoring points, combing through the massive datasets produced every day. To make this possible, predictive tools are needed to identify, hopefully beforehand, the occurrence of failures in all the different subsystems.
This work presents a novel fault detection scheme for one of these subsystems, the Intermediate Frequency Processors (IFP). This subsystem is critical to process the information gathered by each antenna and communicate it, reliably, to the correlator for processing. Our approach is based on echo state networks, a configuration of artificial neural networks, used to learn and predict the signal patterns. These patterns are later compared to the actual signal, to identify failure modes. Additional preprocessing techniques were also added since the signal-to-noise ratio of the data used was very low.
The proposed scheme was tested in over seven years of data from 132 IFPs at ALMA, showing an accuracy of over 70%. Furthermore, the detection was done several months earlier, on average, when compared to what human operators did. These results help the maintenance procedures, increasing reliability while reducing humans' exposure to the harsh environment where the antennas are. Although applied to a specific fault, this technique is broad enough to be applied to other types of faults and settings.
|
|
|
Cillero, J. I., Henriquez, P. A., Ledger, T. W., Ruz, G. A., & Gonzalez, B. (2022). Individual competence predominates over host nutritional status in Arabidopsis root exudate-mediated bacterial enrichment in a combination of four Burkholderiaceae species. BMC Microbiol., 22(1), 218.
Abstract: Background: Rhizosphere microorganisms play a crucial role in plant health and development. Plant root exudates (PRE) are a complex mixture of organic molecules and provide nutritional and signaling information to rhizosphere microorganisms. Burkholderiaceae species are non-abundant in the rhizosphere but exhibit a wide range of plant-growth-promoting and plant-health-protection effects. Most of these plant-associated microorganisms have been studied in isolation under laboratory conditions, whereas in nature, they interact in competition or cooperation with each other. To improve our understanding of the factors driving growth dynamics of low-abundant bacterial species in the rhizosphere, we hypothesized that the growth and survival of four Burkholderiaceae strains (Paraburkholderia phytofirmans PsJN, Cupriavidus metallidurans CH34, C. pinatubonensis JMP134 and C. taiwanensis LMG19424) in Arabidopsis thaliana PRE is affected by the presence of each other. Results: Differential growth abilities of each strain were found depending on plant age and whether PRE was obtained after growth on N limitation conditions. The best-adapted strain to grow in PRE was P. phytofirmans PsJN, with C. pinatubonensis JMP134 growing better than the other two Cupriavidus strains. Individual strain behavior changed when they succeeded in combinations. Clustering analysis showed that the 4-member co-culture grouped with one of the best-adapted strains, either P. phytofirmans PsJN or C. pinatubonensis JMP134, depending on the PRE used. Sequential transference experiments showed that the behavior of the 4-member co-culture relies on the type of PRE provided for growth. Conclusions: The results suggest that individual strain behavior changed when they grew in combinations of two, three, or four members, and those changes are determined first by the inherent characteristics of each strain and secondly by the environment.
|
|
|
Concha, M., & Ruz, G. A. (2023). Evaluation of Atmospheric Environmental Regulations: The Case of Thermoelectric Power Plants. Atmosphere, 14(2), 358.
Abstract: In Chile, the concept of sacrifice zones corresponds to those land surfaces in which industrial development was prioritized over the environmental impact that it caused. A high number of industries that emit pollutants into the environment are concentrated in these zones. This paper studies the atmospheric component of the Environmental Impact Declaration and Assessment's (EID and EIA, respectively) environmental assessment instruments of the thermoelectric power plants in northern Chile, based on their consistency with current environmental quality regulations. We specify concepts on air quality, atmospheric emission regulations, and the critical parameters and factors to be considered when carrying out an environmental impact assessment. Finally, we end by presenting possible alternatives to replace the current methodologies and criteria for atmospheric regulation in areas identified as saturated or of environmental sacrifice, with an emphasis on both population health and an environmental approach.
|
|
|
Cordero, R., Mascareno, A., Henriquez, P. A., & Ruz, G. A. (2022). Drawing constitutional boundaries: A digital historical analysis of the writing process of Pinochet's 1980 authoritarian constitution. Hist. Methods, 55(3), 145–167.
Abstract: Drawing conceptual boundaries is one of the defining features of constitution-making processes. These historically situated operations of boundary making are central to the definition of what counts as “constitutional” in a political community. In this article, we study the operations of conceptual delimitation performed by the Constitutional Commission (1973-1978) that drafted the 1980 Chilean Constitution, the trademark of Augusto Pinochet's dictatorship. Using the eleven volumes of the Commission's Official Records as our textual material (10,915 pages and 80,005 distinct words), we apply vector semantics, spectral clustering and bigram graph-based analysis to explore conceptual boundaries and the behavior of specific keywords shaping the space of constitutional meanings. Our results identify the ways in which the Commission defines the normative horizon of the new social and political order by transforming old semantic references into a renewed conceptual framework. This analysis shows the immanent relations between political action and conceptual elaboration that underlie the creation of constitutional texts, as well as the potential of computational methods for the study of constitutional history and constitution-making processes.
|
|
|
de la Cruz, R., Padilla, O., Valle, M. A., & Ruz, G. A. (2021). Modeling Recidivism through Bayesian Regression Models and Deep Neural Networks. Mathematics, 9(6), 639.
Abstract: This study aims to analyze and explore criminal recidivism with different modeling strategies: one based on an explanation of the phenomenon and another based on a prediction task. We compared three common statistical approaches for modeling recidivism: the logistic regression model, the Cox regression model, and the cure rate model. The parameters of these models were estimated from a Bayesian point of view. Additionally, for prediction purposes, we compared the Cox proportional model, a random survival forest, and a deep neural network. To conduct this study, we used a real dataset that corresponds to a cohort of individuals which consisted of men convicted of sexual crimes against women in 1973 in England and Wales. The results show that the logistic regression model tends to give more precise estimations of the probabilities of recidivism both globally and with the subgroups considered, but at the expense of running a model for each moment of the time that is of interest. The cure rate model with a relatively simple distribution, such as Weibull, provides acceptable estimations, and these tend to be better with longer follow-up periods. The Cox regression model can provide the most biased estimations with certain subgroups. The prediction results show the deep neural network's superiority compared to the Cox proportional model and the random survival forest.
|
|
|
Di Genova, A., Ruz, G. A., Sagot, M. F., & Maass, A. (2018). Fast-SG: an alignment-free algorithm for hybrid assembly. GigaScience, 7(5), 15 pp.
Abstract: Background: Long-read sequencing technologies are the ultimate solution for genome repeats, allowing near reference-level reconstructions of large genomes. However, long-read de novo assembly pipelines are computationally intense and require a considerable amount of coverage, thereby hindering their broad application to the assembly of large genomes. Alternatively, hybrid assembly methods that combine short- and long-read sequencing technologies can reduce the time and cost required to produce de novo assemblies of large genomes. Results: Here, we propose a new method, called Fast-SG, that uses a new ultrafast alignment-free algorithm specifically designed for constructing a scaffolding graph using light-weight data structures. Fast-SG can construct the graph from either short or long reads. This allows the reuse of efficient algorithms designed for short-read data and permits the definition of novel modular hybrid assembly pipelines. Using comprehensive standard datasets and benchmarks, we show how Fast-SG outperforms the state-of-the-art short-read aligners when building the scaffolding graph and can be used to extract linking information from either raw or error-corrected long reads. We also show how a hybrid assembly approach using Fast-SG with shallow long-read coverage (5X) and moderate computational resources can produce long-range and accurate reconstructions of the genomes of Arabidopsis thaliana (Ler-0) and human (NA12878). Conclusions: Fast-SG opens a door to achieve accurate hybrid long-range reconstructions of large genomes with low effort, high portability, and low cost.
|
|
|
Escapil-Inchauspe, P., & Ruz, G. A. (2023). h-Analysis and data-parallel physics-informed neural networks. Sci. Rep., 13(1), 17562.
Abstract: We explore the data-parallel acceleration of physics-informed machine learning (PIML) schemes, with a focus on physics-informed neural networks (PINNs) for multiple graphics processing units (GPUs) architectures. In order to develop scale-robust and high-throughput PIML models for sophisticated applications which may require a large number of training points (e.g., involving complex and high-dimensional domains, non-linear operators or multi-physics), we detail a novel protocol based on h-analysis and data-parallel acceleration through the Horovod training framework. The protocol is backed by new convergence bounds for the generalization error and the train-test gap. We show that the acceleration is straightforward to implement, does not compromise training, and proves to be highly efficient and controllable, paving the way towards generic scale-robust PIML. Extensive numerical experiments with increasing complexity illustrate its robustness and consistency, offering a wide range of possibilities for real-world simulations.
|
|
|
Escapil-Inchauspé, P., & Ruz, G. A. (2023). Hyper-parameter tuning of physics-informed neural networks: Application to Helmholtz problems. Neurocomputing, 561, 126826.
Abstract: We consider physics-informed neural networks (PINNs) (Raissi et al., 2019) for forward physical problems. In order to find optimal PINNs configuration, we introduce a hyper-parameter optimization (HPO) procedure via Gaussian processes-based Bayesian optimization. We apply the HPO to Helmholtz equation for bounded domains and conduct a thorough study, focusing on: (i) performance, (ii) the collocation points density r and (iii) the frequency kappa, confirming the applicability and necessity of the method. Numerical experiments are performed in two and three dimensions, including comparison to finite element methods.
|
|
|
Goles, E., & Ruz, G. A. (2015). Dynamics of neural networks over undirected graphs. Neural Netw., 63, 156–169.
Abstract: In this paper we study the dynamical behavior of neural networks such that their interconnections are the incidence matrix of an undirected finite graph G = (V, E) (i.e., the weights belong to {0, 1}). The network may be updated synchronously (every node is updated at the same time), sequentially (nodes are updated one by one in a prescribed order) or in a block-sequential way (a mixture of the previous schemes). We characterize completely the attractors (fixed points or cycles). More precisely, we establish the convergence to fixed points related to a parameter alpha(G), taking into account the number of loops, edges, vertices as well as the minimum number of edges to remove from E in order to obtain a maximum bipartite graph. Roughly, alpha(G') < 0 for any G' subgraph of G implies the convergence to fixed points. Otherwise, cycles appear. Actually, for very simple networks (majority functions updated in a block-sequential scheme such that each block is of minimum cardinality two) we exhibit cycles with nonpolynomial periods.
|
|
|
Goles, E., Lobos, F., Ruz, G. A., & Sene, S. (2020). Attractor landscapes in Boolean networks with firing memory: a theoretical study applied to genetic networks. Nat. Comput., 19(2), 295–319.
Abstract: In this paper we study the dynamical behavior of Boolean networks with firing memory, namely Boolean networks whose vertices are updated synchronously depending on their proper Boolean local transition functions so that each vertex remains at its firing state a finite number of steps. We prove in particular that these networks have the same computational power than the classical ones, i.e. any Boolean network with firing memory composed of m vertices can be simulated by a Boolean network by adding vertices. We also prove general results on specific classes of networks. For instance, we show that the existence of at least one delay greater than 1 in disjunctive networks makes such networks have only fixed points as attractors. Moreover, for arbitrary networks composed of two vertices, we characterize the delay phase space, i.e. the delay values such that networks admits limit cycles or fixed points. Finally, we analyze two classical biological models by introducing delays: the model of the immune control of the lambda-phage and that of the genetic control of the floral morphogenesis of the plant Arabidopsis thaliana.
|
|
|
Goles, E., Montalva, M., & Ruz, G. A. (2013). Deconstruction and Dynamical Robustness of Regulatory Networks: Application to the Yeast Cell Cycle Networks. Bull. Math. Biol., 75(6), 939–966.
Abstract: Analyzing all the deterministic dynamics of a Boolean regulatory network is a difficult problem since it grows exponentially with the number of nodes. In this paper, we present mathematical and computational tools for analyzing the complete deterministic dynamics of Boolean regulatory networks. For this, the notion of alliance is introduced, which is a subconfiguration of states that remains fixed regardless of the values of the other nodes. Also, equivalent classes are considered, which are sets of updating schedules which have the same dynamics. Using these techniques, we analyze two yeast cell cycle models. Results show the effectiveness of the proposed tools for analyzing update robustness as well as the discovery of new information related to the attractors of the yeast cell cycle models considering all the possible deterministic dynamics, which previously have only been studied considering the parallel updating scheme.
|
|
|
Gregor, C., Ashlock, D., Ruz, G. A., MacKinnon, D., & Kribs, D. (2022). A novel linear representation for evolving matrices. Soft Comput., 26(14), 6645–6657.
Abstract: A number of problems from specifiers for Boolean networks to programs for quantum computers can be encoded as matrices. The paper presents a novel family of linear, generative representations for evolving matrices. The matrices can be general or restricted within special classes of matrices like permutation matrices, Hermitian matrices, or other groups of matrices with particular algebraic properties. These classes include unitary matrices which encode quantum programs. This representation avoids the brittleness that arises in direct representations of matrices and permits the researcher substantial control of the part of matrix space being searched. The representation is demonstrated on a relatively simple matrix problem in automatic content generation as well as Boolean map induction and automatic quantum programming. The automatic content generation problem yields interesting results; the generative matrix representation yields worse fitness but a substantially greater variety of outcomes than a direct encoding, which is acceptable when generating content. The Boolean map experiments extend and confirm results that demonstrate that the generative encoding is superior to a direct encoding for the transition matrix of a Boolean map. The quantum programming results are generally quite good, with poor performance on the simplest problems in two of the families of programming tasks studied. The viability of the new representation for evolutionary matrix induction is well supported.
|
|
|
Henriquez, P. A., & Ruz, G. A. (2017). Extreme learning machine with a deterministic assignment of hidden weights in two parallel layers. Neurocomputing, 226, 109–116.
Abstract: Extreme learning machine (ELM) is a machine learning technique based on competitive single-hidden layer feedforward neural network (SLFN). However, traditional ELM and its variants are only based on random assignment of hidden weights using a uniform distribution, and then the calculation of the weights output using the least-squares method. This paper proposes a new architecture based on a non-linear layer in parallel by another non-linear layer and with entries of independent weights. We explore the use of a deterministic assignment of the hidden weight values using low-discrepancy sequences (LDSs). The simulations are performed with Halton and Sobol sequences. The results for regression and classification problems confirm the advantages of using the proposed method called PL-ELM algorithm with the deterministic assignment of hidden weights. Moreover, the PL-ELM algorithm with the deterministic generation using LDSs can be extended to other modified ELM algorithms.
|
|