A participant controls a robot arm to pick up a marshmallow.

We present HARMONIC, a large multi-modal dataset of human interactions in a shared autonomy setting. The dataset provides human, robot, and environment data streams from twenty-four people engaged in an assistive eating task with a 6 degree-of-freedom (DOF) robot arm. From each participant, we recorded video of both eyes, egocentric video from a head-mounted camera, joystick commands, electromyography from the participant's forearm used to operate the joystick, third person stereo video, and the joint positions of the 6 DOF robot arm. Also included are several data streams that come as a direct result of these recordings, namely eye gaze fixations in the egocentric camera frame and body position skeletons. This dataset could be of interest to researchers studying intention prediction, human mental state modeling, and shared autonomy. Data streams are provided in a variety of formats such as video and human-readable csv or yaml files.
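The human-readable CSV streams mentioned above can be read with nothing beyond the Python standard library. The file and column names used here (a joystick stream with "time", "axis_x", "axis_y" columns) are illustrative assumptions, not a guarantee of the dataset's layout:

```python
# Minimal sketch of loading one CSV data stream into per-row dicts.
# File name and column names are hypothetical examples.
import csv

def load_csv_stream(path):
    """Return a list of row dicts with all values parsed as floats."""
    with open(path, newline="") as f:
        return [
            {key: float(value) for key, value in row.items()}
            for row in csv.DictReader(f)
        ]
```

The YAML files (e.g. run_info.yaml) can be read analogously with a YAML parser such as PyYAML.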

Full details are available in the accompanying paper on arXiv.

Accessing the dataset

The dataset can be downloaded here. The most recent version is harmonic-0.5.0.

The files available are as follows:

  • harmonic_0.5.0_data.tar.gz: All data.
  • harmonic_0.5.0_sample.tar.gz: Contains all data from a single run.
  • harmonic_0.5.0_minimal.tar.gz: Contains only text data, statistics, and a single processed video per run.
  • harmonic_0.5.0_text.tar.gz: Contains only text data.
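Each release file is a gzipped tarball, so it can be unpacked with standard tools; a minimal sketch using Python's tarfile module (the archive name matches the sample file listed above):

```python
# Sketch: unpack a release archive (e.g. harmonic_0.5.0_sample.tar.gz)
# into a destination directory.
import tarfile

def extract_archive(archive_path, dest="."):
    """Extract a .tar.gz release archive into dest."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(path=dest)
```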

To visualize the data, you may use the code available at https://github.com/HARPLab/harmonic_playback. Code for accessing the data is coming soon.

To receive updates about this dataset (including notifications when a new version is posted), join the mailing list.


The dataset can be cited as follows:
Newman, Benjamin A., Aronson, Reuben M., Srinivasa, Siddhartha S., Kitani, Kris, and Admoni, Henny. “HARMONIC: A Multimodal Dataset of Assistive Human-Robot Collaboration.” arXiv:1807.11154 [cs.RO], July 2018.


@ARTICLE{2018arXiv180711154N,
   author = {{Newman}, B.~A. and {Aronson}, R.~M. and {Srinivasa}, S.~S. and 
	{Kitani}, K. and {Admoni}, H.},
    title = "{HARMONIC: A Multimodal Dataset of Assistive Human-Robot Collaboration}",
  journal = {ArXiv e-prints},
archivePrefix = "arXiv",
   eprint = {1807.11154},
 primaryClass = "cs.RO",
 keywords = {Computer Science - Robotics, Computer Science - Human-Computer Interaction},
     year = 2018,
    month = jul,
   adsurl = {http://adsabs.harvard.edu/abs/2018arXiv180711154N},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}



Changelog

  • Added additional ZED videos with blurred faces when subjects permitted
  • Changed all data streams to start from 0 (corresponding to start_time in run_info.yaml) rather than raw Unix time
  • Standardized export data precision across all files
  • Reencoded all videos as .mp4 / H.264 (significantly reducing the size)
  • Removed dropped frame information from ada_joy (as it has no consistent report rate, since it reports only on changes)
  • Ensured that stats files are always present, including a dummy file with no information when the stream is missing
  • Moved generated openpose files into the text directory and standardized them to match other signals


  • Fixed several morsel labels where the assignment of raw morsel data to indices was incorrect
    • This error caused the morsel_target value in morsel.yaml to be incorrect. goal_correct and any_correct were not affected by this bug.
  • Fixed bug where p_goal_* fields in assistance_info.csv did not align with morsel labels in morsel.yaml
  • Several new files and fields were added for additional transparency regarding the morsel exporting process
    • morsel.yaml now includes an additional field, morsel_perm, which is a dictionary mapping raw morsel ids to labeled morsel ids. All data is now in terms of the labeled morsel ids (i.e., the morsel perm transformation has already been applied) except when noted otherwise. Note that morsel_perm is many-to-one, i.e. multiple raw morsels were determined (manually) to be duplicate detections of the same morsel. In this case, the positions were averaged together and the probabilities of those combined morsels were averaged, then all p_goal_* probabilities were renormalized.
    • assistance_info.csv now has p_goal_*_raw as well as p_goal_*. p_goal_*_raw are the original probabilities assigned by the algorithm and correspond to the UNADJUSTED morsels; p_goal_* probabilities now align with the morsel labels found in morsel.yaml (i.e., the transformation has been applied.)
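The merging described above can be sketched as follows. This is an illustrative reimplementation, not the dataset's own export code: duplicate raw detections mapped to the same labeled id have their positions and probabilities averaged, and the combined probabilities are then renormalized. All names (apply_morsel_perm, the dict shapes) are assumptions:

```python
# Hypothetical sketch of applying a many-to-one morsel_perm mapping.
def apply_morsel_perm(perm, positions, probs):
    """perm: dict raw_id -> labeled_id (many-to-one).
    positions: dict raw_id -> (x, y, z) tuple.
    probs: dict raw_id -> p_goal probability.
    Returns (labeled positions, renormalized labeled probabilities)."""
    # Group raw ids by their labeled id.
    groups = {}
    for raw_id, labeled_id in perm.items():
        groups.setdefault(labeled_id, []).append(raw_id)

    merged_pos, merged_prob = {}, {}
    for labeled_id, raw_ids in groups.items():
        # Average positions component-wise and average probabilities.
        pts = [positions[r] for r in raw_ids]
        merged_pos[labeled_id] = tuple(sum(c) / len(pts) for c in zip(*pts))
        merged_prob[labeled_id] = sum(probs[r] for r in raw_ids) / len(raw_ids)

    # Renormalize so the labeled probabilities sum to 1.
    total = sum(merged_prob.values())
    merged_prob = {k: v / total for k, v in merged_prob.items()}
    return merged_pos, merged_prob
```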


  • Remove gaze directory
  • Remove playback info in processed directory
  • Reencode all videos as .avi with codec FFV1 for consistency
  • Add an additional field to run_info.yaml detailing when the trial starts
  • Fix many missing morsel info files


  • Add skeleton tracking info in keypoints/*
  • Edit names in morsel.yaml to be pure ASCII instead of python-specific unicode


  • Fix morsel.yaml file export to be correct.
  • Add additional visualization data in processed/playback


  • Initial dataset release.