Author name cluster

Morteza Ebrahimi

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers

1 author row

AAAI Conference 2023 Conference Paper

Optimal Pathfinding on Weighted Grid Maps

Mark Carlson
Sajjad K. Moghadam
Daniel D. Harabor
Peter J. Stuckey
Morteza Ebrahimi

In many computer games up to hundreds of agents navigate in real-time across a dynamically changing weighted grid map. Pathfinding in these situations is challenging because the grids are large, traversal costs are not uniform, and because each shortest path has many symmetric permutations, all of which must be considered by an optimal online search. In this work we introduce Weighted Jump Point Search (JPSW), a new type of pathfinding algorithm which breaks weighted grid symmetries by introducing a tiebreaking policy that allows us to apply effective pruning rules in symmetric regions. We show that these pruning rules preserve at least one optimal path to every grid cell and that their application can yield large performance improvements for optimal pathfinding. We give a complete theoretical description of the new algorithm, including pseudo-code. We also conduct a wide-ranging experimental evaluation, including data from real games. Results indicate JPSW is up to orders of magnitude faster than the nearest baseline, online search using A*.

PDF Details DOI

KER Journal 2022 Journal Article

An online scalarization multi-objective reinforcement learning algorithm: TOPSIS Q-learning

Mohammad Mirzanejad
Morteza Ebrahimi
Peter Vamplew
Hadi Veisi

Abstract Conventional reinforcement learning focuses on problems with single objective. However, many problems have multiple objectives or criteria that may be independent, related, or contradictory. In such cases, multi-objective reinforcement learning is used to propose a compromise among the solutions to balance the objectives. TOPSIS is a multi-criteria decision method that selects the alternative with minimum distance from the positive ideal solution and the maximum distance from the negative ideal solution, so it can be used effectively in the decision-making process to select the next action. In this research a single-policy algorithm called TOPSIS Q-Learning is provided with focus on its performance in online mode. Unlike all single-policy methods, in the first version of the algorithm, there is no need for the user to specify the weights of the objectives. The user’s preferences may not be completely definite, so all weight preferences are combined together as decision criteria and a solution is generated by considering all these preferences at once and user can model the uncertainty and weight changes of objectives around their specified preferences of objectives. If the user only wants to apply the algorithm for a specific set of weights the second version of the algorithm efficiently accomplishes that.

Details DOI