
Assessing Humans’ Willingness to Delegate Control Tasks to a Robot in Critical Situations

User Research, Human-Robot Interaction


In this group project, we assessed the factors that affect a human’s trust in and reliance upon robots in a hazardous environment. We focused on the trust placed in a robot by its human operator, as opposed to a human follower or rescuee.


User Researcher


UI/Data Collection: Java, Amazon MTurk

Data Analysis: Excel, ANOVA test



As robots take on more prominent roles in search-and-rescue tasks, it becomes vital to understand the factors that affect a human’s trust in and reliance upon robots. Emergency situations are usually times of vastly increased stress and adrenaline for victims and rescuers alike, and robots with the requisite task capabilities could ease some of the burden on human rescuers. At the same time, the degree of human reliance on a robot must be commensurate with the robot’s competence and the difficulty of the task in question. Otherwise we risk over-reliance in situations for which the robot is under-equipped, as well as under-reliance in situations that are too dangerous or taxing for the human.


Our main research questions are threefold:



What impact, if any, does the quality of a robot’s prior performance have on a human’s willingness to delegate “critical tasks” to the robot in the future? For purposes of this experiment, “critical tasks” are defined as those involving rescue efforts in response to an emergency.


What impact, if any, does the difficulty level of the task have on the human’s willingness to delegate critical tasks to the robot?


What interactions, if any, exist between the performance of the robot and the treacherousness of the environment?

MILESTONE 1: Experiment Design

The experiment took the form of a search-and-rescue mission, run in custom simulation software that we wrote in Java. We simulated the evacuation scenario on a discrete 2D grid. The human participant’s task was to guide the virtual robot safely through this treacherous environment to find the victim, whose precise location is not known ahead of time. Our source code and README file for the simulation are located here.

To minimize the risk of harm to itself while always seeking to explore new areas, the robot works through the yes/no questions in the flowchart before making its next move. The flowchart below shows the algorithm at a high level.
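The flowchart itself is not reproduced in this text, so the following is only a hypothetical reading of the two goals stated above (avoid harm, prefer unexplored cells), not the decision logic of our actual simulation. The class name `CautiousExplorer`, the `knownHazard` and `visited` grids, and the four-neighbour movement rule are all assumptions made for illustration.

```java
class CautiousExplorer {
    // Candidate steps: up, down, left, right (assumed movement rule).
    static final int[][] STEPS = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};

    /**
     * Returns the next {row, col} for the robot, or its current position
     * if every neighbouring cell is off the grid or known to be hazardous.
     */
    static int[] nextMove(boolean[][] knownHazard, boolean[][] visited, int row, int col) {
        int[] fallback = null;
        for (int[] step : STEPS) {
            int r = row + step[0];
            int c = col + step[1];
            // Question 1: is the cell on the grid and not known to be hazardous?
            boolean inBounds = r >= 0 && r < knownHazard.length
                    && c >= 0 && c < knownHazard[0].length;
            if (!inBounds || knownHazard[r][c]) {
                continue;
            }
            // Question 2: is the cell unexplored? If so, take it immediately.
            if (!visited[r][c]) {
                return new int[] {r, c};
            }
            // Otherwise remember the first safe, already-visited cell.
            if (fallback == null) {
                fallback = new int[] {r, c};
            }
        }
        return fallback != null ? fallback : new int[] {row, col};
    }
}
```

In this sketch, each `if` plays the role of one yes/no question: safety is checked first, and novelty only breaks ties among safe cells.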

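A 2x2 between-subjects factorial design was employed for this study. The two independent variables were the robot’s autonomous performance in the practice mission (Bad AI vs. Good AI) and the difficulty level of the environment (Easy vs. Hard):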


With Bad AI (Artificial Intelligence), the robot was more likely to fail the practice mission in fully autonomous mode (or to look silly even when it succeeded), whereas the Good AI robot was more likely to succeed.


The environment came in Easy and Hard versions, with the former containing five hazardous grid cells and the latter nine.

All in all, there were four experimental conditions in this between-subjects study:

Easy environment / Bad AI

Hard environment / Bad AI

Easy environment / Good AI

Hard environment / Good AI
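As a minimal sketch, the four conditions listed above can be generated by crossing the two factors. The class and enum names are hypothetical, and the round-robin `assign` method is only an assumed allocation scheme; this write-up does not state how participants were actually assigned to conditions.

```java
import java.util.ArrayList;
import java.util.List;

class FactorialDesign {
    enum Environment { EASY, HARD }
    enum RobotAi { BAD, GOOD }

    /** One cell of the 2x2 design. */
    static final class Condition {
        final Environment environment;
        final RobotAi ai;

        Condition(Environment environment, RobotAi ai) {
            this.environment = environment;
            this.ai = ai;
        }

        @Override
        public String toString() {
            return environment + " environment / " + ai + " AI";
        }
    }

    /** Crosses both factors, in the same order as the list above. */
    static List<Condition> allConditions() {
        List<Condition> cells = new ArrayList<>();
        for (RobotAi ai : RobotAi.values()) {
            for (Environment env : Environment.values()) {
                cells.add(new Condition(env, ai));
            }
        }
        return cells;
    }

    /** Assumed scheme: participant i is placed in cell (i mod 4). */
    static Condition assign(int participantIndex) {
        List<Condition> cells = allConditions();
        return cells.get(participantIndex % cells.size());
    }
}
```

Because the design is between-subjects, each participant index maps to exactly one condition and never sees the other three.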

MILESTONE 2: Data Collection

We posted our experiment and post-experiment survey on Amazon MTurk and recruited a total of 20 participants to complete the experiment. We imposed no constraints on participants with respect to age, gender, expertise with robotics, or the like, as we deemed the simulation sufficiently intuitive and user-friendly to learn and follow. Below is what was presented to participants on Amazon MTurk. Our data sheet can be found here.

MILESTONE 3: Data Analysis & Results

For each of the three dependent variables (Frequency of Task Delegation Ratio, Confidence Level in the Robot, Reliance on the Robot), the impact of the two independent variables (i.e. autonomous performance and difficulty level of the environment) was analyzed using a two-way ANOVA test.
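Our analysis was carried out in Excel, but the arithmetic behind a two-way ANOVA can be sketched in Java. The sketch below assumes a balanced design (the same number of observations in every cell), which this write-up does not state explicitly, and it returns only the three F statistics; converting them to p-values requires the F distribution and is left out. The class name `TwoWayAnova` and the array layout are illustrative assumptions.

```java
class TwoWayAnova {
    /**
     * Balanced two-way ANOVA with replication.
     * data[i][j] holds the n observations for level i of factor A
     * and level j of factor B. Returns {F_A, F_B, F_AB}.
     */
    static double[] fStatistics(double[][][] data) {
        int a = data.length;
        int b = data[0].length;
        int n = data[0][0].length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least 2 observations per cell");
        }

        double[][] cellMean = new double[a][b];
        double[] rowMean = new double[a];
        double[] colMean = new double[b];
        double grand = 0.0;
        for (int i = 0; i < a; i++) {
            if (data[i].length != b) {
                throw new IllegalArgumentException("ragged factor B levels");
            }
            for (int j = 0; j < b; j++) {
                if (data[i][j].length != n) {
                    throw new IllegalArgumentException("design must be balanced");
                }
                double sum = 0.0;
                for (double x : data[i][j]) {
                    sum += x;
                }
                cellMean[i][j] = sum / n;
                // In a balanced design, marginal means are averages of cell means.
                rowMean[i] += cellMean[i][j] / b;
                colMean[j] += cellMean[i][j] / a;
                grand += cellMean[i][j] / (a * b);
            }
        }

        double ssA = 0.0, ssB = 0.0, ssAB = 0.0, ssE = 0.0;
        for (int i = 0; i < a; i++) {
            ssA += b * n * Math.pow(rowMean[i] - grand, 2);
        }
        for (int j = 0; j < b; j++) {
            ssB += a * n * Math.pow(colMean[j] - grand, 2);
        }
        for (int i = 0; i < a; i++) {
            for (int j = 0; j < b; j++) {
                double d = cellMean[i][j] - rowMean[i] - colMean[j] + grand;
                ssAB += n * d * d;
                for (double x : data[i][j]) {
                    ssE += Math.pow(x - cellMean[i][j], 2);
                }
            }
        }

        int dfA = a - 1;
        int dfB = b - 1;
        int dfAB = dfA * dfB;
        int dfE = a * b * (n - 1);
        double msE = ssE / dfE; // within-cell (error) mean square
        return new double[] {ssA / dfA / msE, ssB / dfB / msE, ssAB / dfAB / msE};
    }
}
```

Each F statistic is the mean square of one effect divided by the error mean square; the first two correspond to the main effects and the third to the interaction between the independent variables.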

Frequency of Task Delegation Ratio (No. of AI triggers / No. of total moves)

The evidence suggests that the difficulty of the final mission (Hard environment vs. Easy environment) was the only factor with a significant effect on how often participants delegated task control to the robot (p ≈ 0.0067). To our surprise, the robot’s autonomous performance level in the practice mission had no significant impact on this metric.

Also, we could not reject our null hypothesis that there was no interaction between the two independent variables. In other words, participants tended to perform the task themselves in easy environments while deferring more to robot automation in difficult environments, regardless of whether the robot performance was good or poor.
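For concreteness, the delegation ratio named in this subsection's heading (No. of AI triggers / No. of total moves) can be computed as below. This is a minimal sketch: the class and parameter names are hypothetical, and rejecting a session with zero moves is our assumption rather than a rule taken from the simulation.

```java
class DelegationRatio {
    /** Fraction of a participant's moves that were delegated to the robot's AI. */
    static double ratio(int aiTriggers, int totalMoves) {
        if (totalMoves <= 0) {
            throw new IllegalArgumentException("totalMoves must be positive");
        }
        if (aiTriggers < 0) {
            throw new IllegalArgumentException("aiTriggers must be non-negative");
        }
        // Cast before dividing to avoid integer division.
        return (double) aiTriggers / totalMoves;
    }
}
```

A participant who triggered the AI 3 times over 12 moves would score 0.25 on this metric; higher values indicate heavier delegation.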

Confidence Level in the Robot

There is strong evidence that the robot’s performance level was the only main effect on the subjects’ responses, with low ratings correlated with bad performance and vice versa (p ≈ 0.00027). The evidence also indicates that the environment type had no significant effect on the responses, and that there was no interaction between the two independent variables.

Reliance on the Robot

Two main effects were found. First, we can infer that superior autonomous performance was correlated with an increased reliance on the robot (p ≈ 0.033). Second, the more difficult the environment, the more inclined subjects were to rely on the robot (p ≈ 0.0085). No significant interaction was found between the two independent variables.


The most surprising of our findings was that task difficulty, and not prior robot performance, had a significant effect on the (no. of AI triggers) / (total no. of moves) ratio. The survey responses, however, introduced additional layers of complexity. The participants’ responses to the first question confirmed, as we expected, that poor performance was correlated with low confidence and vice versa. What we did not expect was the sheer degree of discrepancy between how participants felt about the robot’s ability and how they behaved in terms of delegating control tasks during the mission. When the mission was conducted in an easy environment, most participants took it upon themselves to complete it even when they gave the robot high confidence ratings, whereas in a difficult environment they relied more heavily on automation even when they gave the robot low confidence ratings.

The responses to the second survey question were likewise intriguing. The intent of this question, as indicated by its wording, was to identify whether the respondents felt as if the robot could be relied upon during the final mission, regardless of whether they ended up relying on it. Although it is possible that some respondents interpreted the question as merely asking them to recall if they relied on the robot in the final mission (in which case the findings should be similar to those regarding the frequency of task delegation), we take the finding of main effects here for both independent variables to be evidence to the contrary.


We believe these responses help place our two aforementioned findings (each of which found only one main effect) into context by repudiating one of our unstated assumptions throughout this study: that there is a one-to-one correspondence between confidence/trust level and task delegation, as well as between the sentiment that a robot is reliable and the actual behavior of relying on it. Stated otherwise, the totality of our findings seems to suggest that high confidence in a robot’s ability does not necessarily translate to robot-reliant behavior in a critical situation, nor does the lack of confidence in a robot necessarily result in self-reliant behavior.


To test conjectures (e.g., that a high frequency of task delegation in a difficult environment has less to do with trusting the robot than with a desire to avoid direct responsibility in the likely event of failure) and to address the shortcomings of this experiment, a future study might conduct similar experiments, preferably in a real-life context and with a larger group of subjects, with the nature of the tasks (i.e. critical vs. mundane) as one of the key independent variables. The dependent variables could include objective measurements of task delegation, as well as subjective measurements of confidence level, trust, and sentiment of reliance upon automation.
