Amine Aboussalah
Amine Aboussalah is an Assistant Professor in the Department of Finance and Risk Engineering at the NYU Tandon School of Engineering. He earned his Ph.D. in Artificial Intelligence and Operations Research at the University of Toronto. His research interests lie broadly in artificial intelligence and dynamical systems: he applies theoretical mathematical concepts, such as information geometry, to develop new machine learning algorithms for practical dynamical-systems applications, and he uses finance as a challenging real-world environment in which to advance machine learning. His primary research interest is improving reinforcement learning algorithms for solving and controlling dynamical systems by exploiting topological properties of time-series data and partial differential equations. As a teacher, he likes to mix theory and practice by sharing both his research and his industry experience.
- Position: Assistant Professor, Finance and Risk Engineering
- Affiliation: New York University
- Papers: 4
- Location: New York, United States
Education
- Doctor of Philosophy (Machine Learning, Operations Research)
2017-2022, University of Toronto
Selected Experiences
- Consultant (World Bank Group)
2021-2022, Washington, D.C.
- Machine Learning Researcher (Fujitsu Co-Creation Research Laboratory)
2019-2020, Toronto
Selected Papers
A Deep Reinforcement Learning Framework for Column Generation
Column Generation (CG) is an iterative algorithm for solving linear programs (LPs) with an extremely large number of variables (columns). CG is the workhorse for tackling large-scale integer linear programs, which rely on CG to solve LP relaxations within a branch-and-price algorithm. Two canonical applications are the Cutting Stock Problem (CSP) and the Vehicle Routing Problem with Time Windows (VRPTW). In VRPTW, for example, each binary variable represents the decision to include or exclude a route, of which there are exponentially many; CG incrementally grows the subset of columns being used, ultimately converging to an optimal solution. We propose RLCG, the first Reinforcement Learning (RL) approach for CG. Unlike typical column selection rules, which myopically select a column based on local information at each iteration, we treat CG as a sequential decision-making problem: the column selected in a given iteration affects subsequent column selections. This perspective lends itself to a Deep Reinforcement Learning approach that uses Graph Neural Networks (GNNs) to represent the variable-constraint structure of the LP of interest. We perform an extensive set of experiments using the publicly available BPPLIB benchmark for CSP and the Solomon benchmark for VRPTW. RLCG converges faster and reduces the number of CG iterations by 22.4% for CSP and 40.9% for VRPTW on average, compared to a commonly used greedy policy.
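To make the sequential-selection idea concrete, here is a minimal, self-contained Python sketch of an RL-guided column generation loop. The master LP, the pricing step, and the linear scoring function are all toy stand-ins (the paper uses a real LP solver and a GNN policy); every name and number below is illustrative rather than taken from RLCG's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def solve_master_lp(columns):
    """Toy master LP: returns fake dual prices for 5 constraints.
    A real implementation would call an LP solver on the restricted master."""
    return rng.random(5)

def pricing_candidates(duals, k=8):
    """Toy pricing step: propose k candidate columns with simple features."""
    cands = []
    for _ in range(k):
        a = rng.integers(0, 2, size=5).astype(float)  # constraint coefficients
        cost = 1.0 + rng.random()
        cands.append({"a": a, "cost": cost, "reduced_cost": cost - duals @ a})
    return cands

def policy_score(cand, w):
    """Linear score over hand-made column features (stand-in for the GNN)."""
    feats = np.concatenate([[cand["reduced_cost"], cand["cost"]], cand["a"]])
    return feats @ w[: feats.size]

w = rng.normal(size=16)                       # pretend-learned policy weights
columns = [{"a": np.ones(5), "cost": 5.0}]    # initial feasible column

for it in range(20):
    duals = solve_master_lp(columns)
    improving = [c for c in pricing_candidates(duals)
                 if c["reduced_cost"] < -1e-9]
    if not improving:
        break  # no improving column: column generation has converged
    # Greedy CG would pick the minimum reduced cost; RLCG instead scores
    # candidates with a learned policy, trading a locally best column for
    # faster overall convergence.
    chosen = max(improving, key=lambda c: policy_score(c, w))
    columns.append({"a": chosen["a"], "cost": chosen["cost"]})

print(f"stopped after {it + 1} iterations with {len(columns)} columns")
```

The only structural difference from greedy CG is the final step of the loop: the column is chosen by a learned score rather than by minimum reduced cost.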
Quantum computing reduces systemic risk in financial networks
In highly connected financial networks, the failure of a single institution can cascade into additional bank failures. This systemic risk can be mitigated by adjusting the loans, share holdings, and other liabilities connecting institutions in a way that prevents the cascading of failures. We approach the systemic risk problem by optimizing the connections between institutions. To provide a more realistic simulation environment, we incorporate nonlinear/discontinuous losses in the value of the banks. To address scalability challenges, we developed a two-stage algorithm in which the network is partitioned into modules of highly interconnected banks and the modules are then individually optimized. We developed new classical and quantum partitioning algorithms for directed and weighted graphs (first stage) and a new methodology for solving Mixed Integer Linear Programming problems with constraints in the systemic risk context (second stage), and we compare the classical and quantum algorithms on the partitioning problem. Experimental results demonstrate that our two-stage optimization with quantum partitioning is more resilient to financial shocks, delays the cascade-failure phase transition, and reduces the total number of failures at convergence under systemic risk, with reduced time complexity.
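The two-stage structure can be sketched in a few lines of Python. Both stages below are deliberately simplified stand-ins: stage 1 uses a crude spectral embedding plus k-means in place of the paper's classical/quantum partitioners, and stage 2 caps intra-module exposures in place of the constrained MILP. All names and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 12
# Directed, weighted toy liability matrix: L[i, j] is bank i's exposure to j.
L = rng.random((n, n)) * (rng.random((n, n)) < 0.3)
np.fill_diagonal(L, 0.0)

def partition(L, n_modules=3, iters=10):
    """Stage 1 (toy): embed banks with two leading eigenvectors of the
    symmetrized network, then run plain k-means to form modules."""
    m = L.shape[0]
    _, vecs = np.linalg.eigh(L + L.T)
    coords = vecs[:, -2:]
    centers = coords[rng.choice(m, n_modules, replace=False)].copy()
    labels = np.zeros(m, dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(coords[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for k in range(n_modules):
            if (labels == k).any():
                centers[k] = coords[labels == k].mean(axis=0)
    return labels

def optimize_module(L, idx):
    """Stage 2 (toy): cap each intra-module exposure at the module mean,
    a stand-in for the constrained MILP that rebalances liabilities."""
    sub = L[np.ix_(idx, idx)]
    cap = sub[sub > 0].mean() if (sub > 0).any() else 0.0
    L[np.ix_(idx, idx)] = np.minimum(sub, cap)

labels = partition(L)
for k in np.unique(labels):
    optimize_module(L, np.where(labels == k)[0])
print("module sizes:", np.bincount(labels))
```

Partitioning first keeps the second-stage optimization small: each module's problem involves only its own banks, which is what makes the approach scale to larger networks.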
Recursive Time Series Data Augmentation
Time series observations can be seen as realizations of an underlying dynamical system governed by rules that we typically do not know. For time series learning tasks, we build our model from the available data, and training on a limited set of realizations often induces severe overfitting, thereby preventing generalization. To address this issue, we introduce a general recursive framework for time series augmentation, which we call the Recursive Interpolation Method (RIM). New augmented time series are generated from the original time series using a recursive interpolation function and are then used in training. We perform theoretical analysis to characterize the proposed RIM and to guarantee its performance under certain conditions. We apply RIM to diverse synthetic and real-world time series and achieve strong performance over non-augmented data on a variety of learning tasks. Our method is also computationally more efficient and leads to better performance than state-of-the-art time series data augmentation methods.
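As a concrete illustration, the sketch below implements one simple member of the recursive-interpolation family on a toy signal. The exact recursion and parameterization in RIM may differ; this version, which blends each augmented point with the previous augmented point, is an assumption made for illustration only.

```python
import numpy as np

def recursive_interpolate(x, alpha):
    """One concrete recursive interpolation (an assumed instance, not the
    paper's exact recursion): each augmented point blends the previous
    augmented point with the current original point."""
    s = np.empty(len(x))
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * s[t - 1] + (1.0 - alpha) * x[t]
    return s

def augment(x, n_copies=5, rng=None):
    """Draw several interpolation weights to produce a family of new series."""
    rng = rng or np.random.default_rng(0)
    return [recursive_interpolate(x, a) for a in rng.uniform(0.1, 0.9, n_copies)]

# Toy usage: augment a noisy sine wave before training a downstream model.
t = np.linspace(0, 4 * np.pi, 200)
x = np.sin(t) + 0.1 * np.random.default_rng(0).normal(size=t.size)
augmented = augment(x)
print(len(augmented), augmented[0].shape)
```

Each draw of the interpolation weight yields a new series that stays close to the original dynamics, which is what lets the augmented copies enlarge the training set without changing the learning task.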
What is the value of the cross-sectional approach to deep reinforcement learning?
Reinforcement learning (RL) for dynamic asset allocation is an emerging field of study. Total return, the common performance metric, is useful for comparing algorithms but does not help us determine how close an RL algorithm is to an optimal solution. In real-world financial applications, a bad decision could prove to be fatal. One of the key ideas of our work is to combine the two paradigms of the mean-variance optimization approach (Markowitz criterion) and the optimal capital growth approach (Kelly criterion) via the actor-critic approach: we balance the optimization of risk and growth by configuring the actor to optimize mean-variance while the critic is configured to maximize growth. We propose a Geometric Policy Score, used by the critic to assess the quality of the actions taken by the actor, which could allow portfolio-manager practitioners to better understand the investment RL policy. We present an extensive and in-depth study of RL algorithms for use in portfolio management (PM). We studied eight published policy-based RL algorithms, which are preferred over value-based RL because they are better suited for continuous action spaces and are considered state of the art: Deterministic Policy Gradient (DPG), Stochastic Policy Gradients (SPG), Deep Deterministic Policy Gradient (DDPG), Trust Region Policy Optimization (TRPO), Proximal Policy Optimization (PPO), Twin Delayed Deep Deterministic Policy Gradient (TD3), Soft Actor Critic (SAC), and Evolution Strategies (ES) for Policy Optimization. We implemented all eight and were able to modify all of them for PM, but our initial testing determined that their performance was not satisfactory: most algorithms showed difficulty converging during training due to the non-stationary and noisy nature of financial environments, along with other challenges. We selected the four most promising algorithms (DPG, SPG, DDPG, PPO) for further improvements. Adapting RL algorithms to finance required unconventional changes. We developed a novel approach for encoding multi-type financial data in a way that is compatible with RL, using a multi-channel convolutional neural network (CNN-RL) framework in which each channel corresponds to a specific type of data, such as high-low-open-close prices and volumes. We also designed a reward function based on concepts such as alpha, beta, and diversification that are financially meaningful while still being learnable by RL. In addition, portfolio managers will typically use a blend of time series analysis and cross-sectional analysis before making a decision; we extend our approach to incorporate, for the first time, cross-sectional deep RL in addition to time series RL. Finally, we demonstrate the performance of the RL agents and benchmark them against commonly used passive and active trading strategies, such as the uniform buy-and-hold (UBAH) index and the dynamical multi-period Mean-Variance-Optimization (MVO) model.
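The abstract's central pairing of objectives can be illustrated on toy data: a Markowitz-style mean-variance score that the actor could optimize, and a Kelly-style geometric-growth score that the critic could use to assess the actor's allocations. The paper's Geometric Policy Score may be defined differently; average log-wealth growth is used here as a natural stand-in, and all names and numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_assets = 250, 4
returns = rng.normal(0.0004, 0.01, size=(T, n_assets))  # toy daily returns
weights = np.full(n_assets, 1.0 / n_assets)              # a candidate action

def mean_variance_score(returns, w, risk_aversion=5.0):
    """Markowitz-style objective: expected return penalized by variance."""
    port = returns @ w
    return port.mean() - risk_aversion * port.var()

def geometric_growth_score(returns, w):
    """Kelly-style objective: average log growth of wealth under weights w,
    a stand-in for the paper's Geometric Policy Score."""
    port = returns @ w
    return np.mean(np.log1p(port))

print("actor objective (mean-variance):", mean_variance_score(returns, weights))
print("critic score (geometric growth):", geometric_growth_score(returns, weights))
```

Keeping the two scores separate mirrors the actor-critic split described above: the actor proposes allocations under a risk-penalized objective, while the critic evaluates them by the growth rate of wealth they would have produced.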