2021 PHM Conference Data Challenge

Announcement: The Data Challenge Submission Deadline has been Extended.


This year’s data challenge addresses the problem of remaining useful life (RUL) prediction in a fleet of aircraft engines under conditions of high variability in the flight envelope and multiple failure modes. The task is to develop a data-driven model to estimate RUL using the condition monitoring data as input. The challenge uses a subset of the run-to-failure degradation trajectories of the N-CMAPSS dataset.

The details of this data challenge are available here. We provide the dataset containing full flight profile data for 100 aircraft experiencing different types of slowly developing faults that initiate at some time during the flight history. There are seven different failure modes. The task is to train a model to estimate the time to failure using the data in the development dataset, and to test it on the data in the test dataset. Validation is done with a validation dataset that will be released for one-time assessment at the end of the data challenge. Scoring of performance (train, test, and validation) is done through this web interface.

The top three teams will be asked to submit their approach as a paper to the PHM Conference and are expected to present their approach at the conference. Top teams will also be recognized at the banquet and will receive a plaque. In addition, all teams have the option to submit a poster via the conference submission website (www.phmpapers.org) to be presented at the conference.


Collaboration is encouraged, and teams may consist of one or more students and professionals. The teams judged to have the first, second, and third best scores will be awarded prizes of USD 600, USD 400, and USD 200, respectively, contingent upon:

  • Having at least one member of the team register and attend the PHM 2021 Conference.
  • Submitting a peer-reviewed conference paper. Submission of the data challenge special session papers is outside the regular paper submission process and follows its own modified schedule.
  • Presenting the analysis results and technique employed at a special session within the conference program.

The organizers of the competition reserve the right both to modify these rules and to disqualify any team for any efforts they deem inconsistent with fair and open practices. In addition, the top entries will be encouraged to submit a journal-quality paper to the International Journal of Prognostics and Health Management (ijPHM).

Data Challenge Registration

Teams may register here. Please note that registration is a two-step process:
1) You must first log in with a PHM user account to complete the form. If you do not have an account, create one here: Sign Up
2) Once logged in, complete the Data Challenge Application form. We will then grant your user account access to the Data Challenge submission area.

Please note: In the spirit of fair competition, we allow only one account per team. Please do not register multiple times under different user names, under fictitious names, or using anonymous accounts. Competition organizers reserve the right to delete multiple entries from the same person (or team) and/or to disqualify those who are trying to “game” the system or using fictitious identities.

Relevant Dates

Key PHM Data Challenge Dates

Competition Open: August 9, 2021. The following information will be posted:

  • Challenge description
  • Training data
  • Scoring definition

Final Validation Data Posted and Result Submission Website Open: September 20, 2021 (originally September 13, 2021)
Competition Closed: October 4, 2021, 11:59:59 pm PDT (originally September 27, 2021)
Preliminary Winners Announced: October 5, 2021 (originally September 28, 2021)
Winners Announced: October 7, 2021 (originally September 30, 2021)
Final Papers Due, Winners Announced: November 3, 2021 (originally October 20, 2021)
PHM Conference Dates: November 29 to December 2, 2021 (originally November 1-4, 2021)

Data Challenge Details

The details about this year’s data challenge can be found in this document.


Training and Testing Dataset

The training and test datasets are available here: Turbofan Engine Degradation Simulation Data Set-2

Validation Dataset

[New!] The validation dataset for one-time assessment of your algorithms can be downloaded here.

Data Challenge Submission

Please upload your submission in the Data Challenge submission area. Note that you will only see the submission area if you have previously applied using the application form and we have granted you access.

Please ensure that the filename is yourusername.txt. Otherwise, the automated scorer will not be able to read and score your file, and the conference organizers will not be able to consider your submission valid. Please also ensure that your results follow the format shown in this Example_Submission.txt file, which contains only 38 values in column vector format.

Also, you may submit only one yourusername.txt file. Multiple submission files from a single team will lead to disqualification.
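
As an illustration of the file format described above, the following Python sketch writes predictions to yourusername.txt and refuses to write anything unless there are exactly 38 values. The helper name write_submission, the one-value-per-line reading of "column vector format", and the plain decimal number formatting are all assumptions made for this example; Example_Submission.txt remains the authoritative reference for the exact layout.

```python
from pathlib import Path

EXPECTED_VALUES = 38  # count stated in the submission instructions


def write_submission(predictions, username, out_dir="."):
    """Write RUL predictions, one value per line, to <username>.txt.

    Hypothetical helper: the number formatting is an assumption, so
    compare the result with Example_Submission.txt before uploading.
    """
    values = [float(p) for p in predictions]
    if len(values) != EXPECTED_VALUES:
        raise ValueError(
            f"expected {EXPECTED_VALUES} values, got {len(values)}"
        )
    path = Path(out_dir) / f"{username}.txt"
    # One value per line reads back as a single column.
    path.write_text("".join(f"{v}\n" for v in values))
    return path
```

Calling write_submission(my_predictions, "yourusername") would produce a single file named yourusername.txt in the current directory, which matches the one-file-per-team rule.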

Frequently Asked Questions

    1. For the data set used to develop models, how many units are provided?
      It has 90 units.
    2. Do I have to follow the training/testing split for the model development data set?
      No, it is up to you how you split up the data for training/evaluating your model. A final validation data set will be posted near the end of the competition, so you will need to apply your best model to the validation data set.
    3. When I downloaded the data set, it also included files “N-CMAPSS_DS02-006” and “N-CMAPSS_DS08d-010”. Should I use/consider them?
      These files are not listed in the description and should not be used/considered. Please use the other files for model development.
    4. For the validation data set, what variable types/sets will be provided?
      W_dev (Scenario descriptors – flight data)
      X_s (Measurements, xs)
      A (auxiliary data)
    5. Do all the units have the same failure mode in the development data set?
      No, please see Table 2 in the description.
    6. In the development data set, are both training and test cycles full run to failure degradation data sets?
      Yes, for training and testing data sets, the last cycle is when the aircraft engine failed. Each data set starts from cycle 1 until failure.
    7. Will the validation data set contain all the cycles up until a failure occurs?
      Unlike the development data set, the validation data set will contain units that are partially degraded and have not failed yet. Participants need to estimate the remaining useful life of each unit (in cycles/flights).
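
To make the distinction in questions 6 and 7 concrete, here is a minimal Python sketch of how RUL labels could be derived for the development data, where every unit runs from cycle 1 until failure. It assumes that per-sample unit and cycle identifiers are available as plain lists and that RUL at the failure cycle is defined as 0. The function name rul_labels and that convention are assumptions for illustration, so check the challenge description for the official definition.

```python
def rul_labels(units, cycles):
    """Return the RUL (in cycles) for each sample of run-to-failure data.

    units[i] and cycles[i] identify the engine unit and flight cycle of
    sample i. Assumes the last recorded cycle of each unit is its failure
    cycle and that RUL there is 0 (an illustrative convention).
    """
    if len(units) != len(cycles):
        raise ValueError("units and cycles must have the same length")
    # Failure cycle of each unit = largest cycle number observed for it.
    last_cycle = {}
    for u, c in zip(units, cycles):
        last_cycle[u] = max(c, last_cycle.get(u, c))
    # RUL = cycles remaining until that unit's failure cycle.
    return [last_cycle[u] - c for u, c in zip(units, cycles)]


# Two toy units: unit 1 fails at cycle 3, unit 2 fails at cycle 2.
print(rul_labels([1, 1, 1, 2, 2], [1, 2, 3, 1, 2]))  # [2, 1, 0, 1, 0]
```

This computation only applies to run-to-failure trajectories. For validation units, whose records stop before failure, the last observed cycle is not the failure cycle, and the RUL is the quantity the model has to estimate.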