Adaptive stochastic lookahead policies for dynamic multi-period purchasing and inventory routing

Authors

  • Daniel Cuellar-Usaquén
  • Martin W. Ulmer
  • Camilo Gomez
  • David Álvarez-Martínez

DOI:

https://doi.org/10.24352/UB.OVGU-2023-097

Keywords:

Agri-Food Supply Chains, Dynamic Multi-period Vehicle Routing, Stochastic Dynamic Decision Making, Approximate Dynamic Programming, Two-Stage Stochastic Programming, Cost Function Approximation

Abstract

We present a problem motivated by discussions with Colombian e-commerce platforms for agri-food products. At regular time intervals (periods), the platforms collect groceries from local farmers and store them at a warehouse to distribute them to local customers. The supply quantities and prices per farmer and the cumulated customer demand can change from period to period. Thus, there is value in purchasing more than needed in one period to exploit cheap prices and consolidation opportunities, to hedge against future uncertainty, and to save routing cost in future periods. A careful balance between too much and not enough inventory needs to be found, especially since inventory perishes over time. The resulting optimization problem is a stochastic dynamic multi-period routing problem with inventory and purchasing decisions. The decision space of the problem is vast as it combines purchasing, inventory, and routing decisions. Further, the value of a decision is unknown since it depends on future developments and decisions. We propose solving the problem with a stochastic lookahead method. In every state, the method samples a set of future realizations and solves the resulting two-stage stochastic program. To cope with the complex decision space in the first and second stages, we propose a “soft” decomposition where the inventory and purchasing decisions are fully considered, but the routing decisions are simplified and their cost is approximated via a cost function approximation. As the routing cost also depends on future decisions, the approximated costs are learned iteratively via repeated simulation and adaptation of the lookahead. We show that our method outperforms a large number of benchmark policies for a variety of instances. We further analyze the functionality of our method and investigate variation in the problem dimensions in a comprehensive analysis.
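The lookahead idea in the abstract can be illustrated with a deliberately small sketch. It is not the authors' model: it assumes a single product and a single supplier, replaces the two-stage stochastic program with plain enumeration over purchase quantities, and stands in for the cost function approximation with one scalar `theta` (a flat routing cost charged whenever a purchase trip is made). The names `lookahead_purchase`, `update_theta`, `shelf`, `penalty`, and the exponential-smoothing update are all hypothetical choices made for illustration.

```python
def lookahead_purchase(inventory, price, demand, scenarios, theta,
                       max_buy=20, shelf=0.8, penalty=10.0):
    """Choose today's purchase quantity against sampled future scenarios.

    scenarios: list of (next_price, next_demand) samples (assumed given).
    theta:     assumed scalar approximation of the routing cost per trip.
    shelf:     assumed fraction of leftover stock that survives one period.
    """
    best_q, best_cost = 0, float("inf")
    for q in range(max_buy + 1):
        available = inventory + q
        shortage = max(demand - available, 0)
        # First stage: purchase cost, approximated routing cost, shortage penalty.
        cost_now = price * q + (theta if q > 0 else 0.0) + penalty * shortage
        # Part of the unsold stock perishes before the next period.
        leftover = shelf * max(available - demand, 0)
        # Second stage: average recourse cost over the sampled scenarios.
        future = 0.0
        for next_price, next_demand in scenarios:
            buy = max(next_demand - leftover, 0)
            future += next_price * buy + (theta if buy > 0 else 0.0)
        cost = cost_now + future / len(scenarios)
        if cost < best_cost:
            best_q, best_cost = q, cost
    return best_q, best_cost


def update_theta(theta, observed_routing_cost, step=0.5):
    """Assumed smoothing update: move theta toward a simulated routing cost."""
    return (1.0 - step) * theta + step * observed_routing_cost


# Cheap today, expensive tomorrow: buying ahead saves a purchase and a trip.
q, cost = lookahead_purchase(0, 1.0, 5, [(3.0, 5)], theta=2.0, shelf=1.0)
print(q, cost)
```

In this toy setting, a low current price and a positive `theta` both push the policy toward buying for two periods at once, while a lower `shelf` value pushes it back toward buying only what is needed now. Repeatedly simulating the policy and calling `update_theta` with the routing costs observed in simulation mimics, in a very reduced form, the iterative adaptation of the lookahead described above.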

Published

2023-05-05

Issue

Section

Articles