nicolasj92/industrial-ml-datasets

Industrial ML Datasets

The following is a curated list of publicly available datasets for machine learning research in the area of manufacturing.

For more information, please check our corresponding publication:

@inproceedings{jourdan_machine_2021,
 title = {Machine {Learning} for {Intelligent} {Maintenance} and {Quality} {Control}: {A} {Review} of {Existing} {Datasets} and {Corresponding} {Use} {Cases}},
 volume = {2},
 booktitle = {Proceedings of the 2nd Conference on Production Systems and Logistics},
 author = {Jourdan, Nicolas and Longard, Lukas and Biegel, Tobias and Metternich, Joachim},
 year = {2021},
}

Some additional datasets may be found here: Link

✔️ indicates a preset split between training and testing data.

🌐 indicates that the test set labels are hidden behind an evaluation server.
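
The Instances column in the tables below groups digits with a dot (the wood veneer entry lists its 5158 images as 5.158), so reading a value with `float()` would misinterpret it as a decimal. The following is a minimal sketch for converting such cells, assuming they have been copied out as strings; `parse_instances` is a hypothetical helper name and not part of this repository. It deliberately rejects cells that are not plain counts, such as "> 10 Mio.".

```python
import re


def parse_instances(text: str) -> int:
    """Convert a dot-grouped count such as '1.062.912' to an int.

    Assumption: the dot is a thousands separator, as in the tables of this list.
    """
    cleaned = text.strip()
    # Accept 1-3 leading digits followed by zero or more '.ddd' groups.
    if not re.fullmatch(r"\d{1,3}(\.\d{3})*", cleaned):
        raise ValueError(f"not a plain dot-grouped count: {text!r}")
    return int(cleaned.replace(".", ""))


if __name__ == "__main__":
    for cell in ["5.158", "1.062.912", "361"]:
        print(cell, "->", parse_instances(cell))
```

Cells with a different notation (for example "34×1,200") raise a `ValueError` and need to be handled case by case.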

Predictive Maintenance and Condition Monitoring

Name Year Feature Type Feature Count Target Variable Instances Official Train/Test Split Data Source Format License Access
Wood veneers before and after drying
This dataset consists of 2579 image pairs (5158 images in total) of wood veneers before and after drying.
2021 Image >4000x4000 - 5.158 Real PNG CC BY 4.0 Link
Diesel Engine Faults Features
Fault detection based on pressure curves and vibration.
2020 Signal 84 C (4) 3.500 Synthetic MAT CC BY 4.0 Link
Degradation of a Cutting Blade
Wrapping machine process data over 12 months with a degrading cutting tool.
2019 Signal 9 C (8) / R 1.062.912 Real CSV CC BY-SA 3.0 Link
CNC Mill Tool Wear
CNC process data of wax milling with worn/unworn tools.
2018 Signal 48 C (3*2) 25.286 Real CSV CC0: Public Domain Link
Condition Monitoring of Hydraulic Systems
Test rig process data of multiple load cycles with various fault types and severity levels.
2018 Signal 17 C (5*(24)) 2.205 Real Non-Standard ? Link
Production Plant Data for Condition Monitoring
Anonymized process data of component run-to-failure experiments.
2018 Signal 26 - 228.414 Real CSV CC BY-SA 3.0 Link
Versatile Production
Popcorn production process data with multiple process steps.
2018 Signal 5-85 - 80.000 Real CSV CC BY-NC-SA 4.0 Link
Degradation Measurement of Robot Arm Position Accuracy
Target and actual values of robotic arm tool position, velocity and current for health assessment.
2017 Signal 73 - 155.000 Real CSV ? Link
APS Failure at Scania Trucks
Anonymized counters and histograms for air pressure system fault detection.
2016 Signal 170 C (2) 76.000 ✔️ Real CSV GNU General Public License Link
Maintenance of Naval Propulsion Plants
Gas turbine process data for component decay state prediction.
2016 Signal 16 R 11.934 Synthetic Non-Standard
More Information: Use of this dataset in publications must be acknowledged by referencing the following publication:
A. Coraddu, L. Oneto, A. Ghio, S. Savio, D. Anguita, M. Figari, Machine Learning Approaches for Improving Condition-Based Maintenance of Naval Propulsion Plants, Journal of Engineering for the Maritime Environment, 2014, DOI: 10.1177/1475090214540874, (In Press)
Link
Plant Fault Detection
Anonymized process data for plant fault detection.
2015 Signal 10 C (6) 8.938.370 Real CSV ? Link
Asset Failure and Replacement
Anonymized data for asset fault detection.
2014 Signal 1 C (2) 447.341 ✔️ 🌐 Real CSV ? Link
Maintenance Action Recommendation
Anonymized process and maintenance data of an industrial asset for maintenance action recommendation.
2013 Signal 32 C (14) 2.097.152 ✔️ 🌐 Real CSV ? Link
Anemometer Fault Detection
Anemometer measurements for fault detection.
2011 Signal 16 / 16-20 - 345.700 / 208.800 ✔️ 🌐 Real Non-Standard ? Link
Gearbox Fault Detection
Test rig accelerometer data for fault detection.
2009 Signal 3 - > 10 Mio. Real CSV ? Link
Li-Ion Battery Aging
Battery test rig data during charge and discharge cycles for degradation detection.
2008 Signal 12 - 2.167 Real MAT N/A Link
Turbofan Engine Degradation Simulation
C-MAPSS simulation sensor data of various conditions and fault modes.
2008 Signal 26 - 262.256 ✔️ Synthetic Non-Standard ? Link
Bearing
Bearing test rig accelerometer data of run-to-failure experiments.
2007 Signal 4-8 - 61.440 Real CSV ? Link
Milling
Milling process and external sensor data for tool wear detection.
2007 Signal 13 R 1.503.000 Real MAT ? Link
CWRU Bearing Data
Bearing test rig accelerometer data for fault detection.
n.A. Signal 5 C (2) > 10 Mio. Real MAT ? Link

Process Monitoring

Name Year Feature Type Feature Count Target Variable Instances Official Train/Test Split Data Source Format License Access
Skoltech Anomaly Benchmark (SKAB)
Time-series data from water circulation loop testbed for evaluating anomaly detection algorithms.
2020 Signal 8 C (2) 34×1,200 ✔️ Real CSV GNU GPL v3.0 Link
High Storage System Anomaly Detection
Storage test rig process data for anomaly detection.
2018 Signal 20 C (2) 91.000 Synthetic CSV CC-BY-NC-SA 4.0 Link
Genesis Pick-and-Place Demonstrator
Material sorting test rig process data for anomaly detection.
2018 Signal 23 C (3) 32.440 Real CSV CC-BY-NC-SA 4.0 Link
Tennessee Eastman Process Simulation Dataset
Simulated chemical process data for anomaly detection with different fault types.
2017 Signal 51 C (21) / R > 10 Mio. ✔️ Synthetic RData
More Information: The person who owns, created, or contributed a work to the data or work provided here dedicated the work to the public domain and has waived his or her rights to the work worldwide under copyright law. You can copy, modify, distribute, and perform the work, for any lawful purpose, without asking permission.
Link
Robot Execution Failures
Force and torque measurements of an industrial robot with different erroneous operating conditions.
1999 Signal 89 C (13) 463 Real Non-Standard ? Link
Mechanical Analysis
Vibration measurements of electromechanical devices with different erroneous operating conditions.
1990 Signal 7 C (6) 209 ✔️ Real MAT ? Link
CWRU Bearing Data
Bearing test rig accelerometer data for anomaly detection.
n.A. Signal 5 C (2) > 10 Mio. Real MAT ? Link

Predictive Quality and Quality Inspection

Name Year Feature Type Feature Count Target Variable Instances Official Train/Test Split Data Source Format License Access
Casting Product Quality Inspection
Grayscale images of pump impeller castings with and without defects.
2020 Image 300x300 / 512x512 C (2) 7.348 ✔️ Real JPG CC-BY-NC-ND 4.0 Link
GC10-DET Defect Location for Metal Surface
Grayscale images of metal surfaces with various defect types and corresponding bounding box annotations.
2020 Image Varying C (10) 3.570 Real JPG, XML ? Link
Mechanic Component Images
Grayscale images of air conditioner pistons with various defect types.
2020 Image 86x90 C (3) 285 Real PNG ? Link
Multi-Stage Continuous Flow Process
Anonymized process data of a production line with quality measurements of part dimensions.
2020 Signal 116 - 14.088 Real CSV ? Link
Plastic Extrusion Defects
Process data of a plastic extrusion process.
2020 Signal 470 - 226.536 Real CSV CC BY-NC-ND 4.0 Link
AITEX
Grayscale images of textile fabrics with various defect types and corresponding segmentation masks.
2019 Image 4096x256 C (13) 245 Real PNG, Mask ? Link
Deep PCB
Grayscale images of circuit boards with various defect types and corresponding bounding box annotations.
2019 Image 640x640 C (7) 1.500 ✔️ Real JPG, Mask Research purposes only Link
Severstal Steel Defect Detection
Grayscale images of steel surfaces with various defect types and corresponding segmentation polygons.
2019 Image 1600x256 C (5) 18.074 ✔️ 🌐 Real JPG, CSV ? Link
Turning Dataset for Chatter Diagnosis
Sensor data of a turning test rig with varying strengths of chatter.
2019 Signal 8 C (4) > 10 Mio. Real MAT CC BY 4.0 Link
Magnetic Tile Defect
Grayscale images of magnetic tile surfaces with various defect types and corresponding segmentation masks.
2018 Image 248x373 C (6) 1.344 Real JPG, PNG ? Link
TIG Welding
Grayscale images of a welding process with various defect types.
2018 Image 800x974 C (6) 33.254 ✔️ Real PNG, JSON CC BY-SA 4.0 Link
Mining Process
Process data of a mining process for impurity prediction in ore concentrate.
2017 Signal 24 R 737.454 Real CSV CC0: Public Domain Link
Bosch Production Line Performance
Anonymized process data of production lines with and without defects.
2016 Signal 4264 C (2) 2.368.435 ✔️ 🌐 Real CSV ? Link
WM811K Wafer Maps
Defect matrices of semiconductor wafers with various defect types.
2014 2D Defect Matrix Varying C (9) 811.457 Real MAT ? Link
NEU Surface Defect Database
Grayscale images of metal surfaces with various defect types and corresponding bounding box annotations.
2013 Image 200x200 C (6) 1.800 Real BMP, XML ? Link
Steel Plate Faults
Geometric measurements of steel plates with various defect types.
2010 Signal 27 C (7) 1.941 Real CSV ? Link
HCI Industrial Optical Inspection
Synthetic grayscale images of textured surfaces with corresponding defect ellipses.
2007 Image 512x512 C (2) 16.100 ✔️ Synthetic PNG, Non-Standard ? Link

Process Parameter Optimization

Name Year Feature Type Feature Count Instances Official Train/Test Split Data Source Format License Access
Laser Welding
Process parameter recordings for correlation with weld quality indicators such as weld depth and geometrical dimensions.
2020 Signal 13 361 Real XLS CC BY 4.0 Link
3D Printer
Process parameters of a 3D printer for correlation with print quality indicators such as roughness, tension and elongation.
2018 Signal 12 50 Real CSV ? Link
Tool Path Generation
Shape deviation measurements and corresponding simulated cutting conditions.
2018 Signal 9 4.968 Real CSV CC BY 4.0 Link
Mercedes-Benz Greener Manufacturing
Car feature configurations to be correlated with the required test time of the configurations.
2017 Signal 378 8.420 ✔️ 🌐 Real CSV ? Link
SECOM
Semiconductor process measurements and corresponding yields for determination of key factors to yield.
2008 Signal 591 1.567 Real Non-Standard ? Link
