Physical activity monitoring with mobilised dataset

neurogeriatricskiel · Aug 9, 2024 · 9b48041
1 parent 13585c3
commit 9b48041
Showing 4 changed files with 145 additions and 32 deletions.
diff --git a/docs/examples/modules_03_pam.md b/docs/examples/modules_03_pam.md
@@ -0,0 +1,131 @@
+# Tutorial: Physical Activity Monitoring
+
+**Author:** Masoud Abedinifar
+
+**Last update:** Fri 09 August 2024
+
+## Learning objectives  
+By the end of this tutorial, you will be able to:  
+
+- Load accelerometer data from a raw recording.
+- Apply the Physical Activity Monitoring algorithm to classify activity intensity levels.  
+- Interpret the results of activity classification.  
+
+# Pam Physical Activity Monitoring
+
+This example serves as a reference on how to use the physical activity monitoring algorithm. This example can be cited by referencing the package.
+
+The example illustrates how the physical activity monitoring algorithm classifies physical activity as sedentary, light, moderate, or vigorous using body acceleration recorded with a triaxial accelerometer worn on the lower back. The physical activity monitoring algorithm is implemented in the main module [`kielmat.modules.pam._pam`](https://github.com/neurogeriatricskiel/KielMAT/tree/main/kielmat/modules/pam/_pam.py).
+
+The algorithm determines the intensity level of physical activities based on the following steps:
+
+1. **Loading Data:** Start by loading the data, which consists of a time index and accelerometer data of shape (N, 3) for the x, y, and z axes. The other inputs are the sampling frequency of the data (`sampling_freq_Hz`), defaulting to 100 Hz, and the thresholds (`thresholds_mg`), provided as a dictionary of threshold values for physical activity detection in mg. The epoch duration (`epoch_duration_sec`) is given in seconds and defaults to 5 seconds. The last input, `plot`, defaults to True; when set to True, it generates a plot showing the average Euclidean Norm Minus One (ENMO) per hour for each date.
+
+2. **Preprocessing:** The input signal is preprocessed by calculating the sample-level Euclidean norm (EN) of the acceleration signal across the x, y, and z axes. A fourth-order Butterworth low-pass filter with a cut-off frequency of 20 Hz is then applied to the EN signal to remove noise. The ENMO index is calculated by subtracting 1 g from the filtered EN, which separates the activity-related component of the acceleration signal from gravity. Negative ENMO values are truncated to zero. Finally, the values are multiplied by 1000 to convert units from g to mg.
+
+3. **Classification:** The algorithm classifies the intensity of physical activities based on the calculated ENMO values. The `classify_physical_activity` function averages the ENMO time series over 5-second epochs. Each epoch is then categorized using the following thresholds: sedentary activity < 45 mg, light activity 45–100 mg, moderate activity 100–400 mg, vigorous activity > 400 mg.
+
+4. **Results:** The algorithm reports the detected activity levels along with the time spent at each level for each day. If `plot` is set to True, the function generates a plot showing the average ENMO per hour for each day.
+
+#### References
+[`1`] Doherty, Aiden, et al. (2017). Large scale population assessment of physical activity using wrist-worn accelerometers: the UK biobank study. PloS one 12.2. [https://doi.org/10.1371/journal.pone.0169649](https://doi.org/10.1371/journal.pone.0169649)
+
+[`2`] Van Hees, Vincent T., et al. (2013). Separating movement and gravity components in an acceleration signal and implications for the assessment of human daily physical activity. PloS one 8.4. [https://doi.org/10.1371/journal.pone.0061691](https://doi.org/10.1371/journal.pone.0061691)
+
+## Import Libraries
+The required libraries are imported: pandas, the `PhysicalActivityMonitoring` class, and the `mobilised` dataset module. Make sure the `kielmat` library and its dependencies are installed before running this code.
+
+
+```python
+import pandas as pd
+import os
+from pathlib import Path
+from kielmat.modules.pam import PhysicalActivityMonitoring
+from kielmat.datasets import mobilised
+```
+
+## Data Preparation
+
+To demonstrate the physical activity monitoring algorithm, we load example data from a participant who wore a lower back IMU sensor for several hours during a day while performing daily life activities at home.
+
+The accelerometer data (N, 3) for the x, y, and z axes are loaded as a pandas DataFrame.
+
+```python
+# Set the dataset path
+dataset_path = Path(os.getcwd()) / "_mobilised"
+
+# Fetch and load the dataset
+mobilised.fetch_dataset(dataset_path=dataset_path)
+```
+
+```python
+# In this example, we use "SU" as tracking_system and "LowerBack" as tracked points.
+tracking_sys = "SU"
+tracked_points = {tracking_sys: ["LowerBack"]}
+```
+
+
+```python
+# The 'mobilised.load_recording' function is used to load the data from the specified file_path
+recording = mobilised.load_recording(
+    cohort="PFF",  # Choose the cohort
+    file_name="data.mat", 
+    dataset_path=dataset_path)
+
+# Load lower back acceleration data
+accel_data = recording.data[tracking_sys][
+    ["LowerBack_ACCEL_x", "LowerBack_ACCEL_y", "LowerBack_ACCEL_z"]
+]
+
+# Get the corresponding sampling frequency directly from the recording
+sampling_frequency = recording.channels[tracking_sys][
+    recording.channels[tracking_sys]["name"] == "LowerBack_ACCEL_x"
+]["sampling_frequency"].values[0]
+
+# Get the acceleration data unit from the recording
+acceleration_unit = recording.channels[tracking_sys][
+    recording.channels[tracking_sys]["name"] == "LowerBack_ACCEL_x"
+]["units"].values[0]
+```
+
+## Apply Physical Activity Monitoring Algorithm
+Now we run the physical activity monitoring algorithm from the main module [`kielmat.modules.pam._pam`](https://github.com/neurogeriatricskiel/KielMAT/tree/main/kielmat/modules/pam/_pam.py). The inputs of the algorithm are as follows:
+
+- **Input Data:** `data` is a pandas DataFrame with a time index and accelerometer data (N, 3) for the x, y, and z axes.
+- **Acceleration Unit:** `acceleration_unit` is the unit of the acceleration data.
+- **Sampling Frequency:** `sampling_freq_Hz` is the sampling frequency of the acceleration data, defined in Hz, with a default value of 100 Hz.
+- **Thresholds:** `thresholds_mg` is a dictionary containing threshold values for physical activity detection in milli-g (mg).
+- **Epoch Duration:** `epoch_duration_sec` is the epoch length in seconds, with a default value of 5 seconds.
+- **Plot Results:** `plot`, if set to True, generates a plot showing the average Euclidean Norm Minus One (ENMO) per hour for each day.
+
+To apply the physical activity monitoring algorithm, an instance of the PhysicalActivityMonitoring class is created using the constructor, `PhysicalActivityMonitoring()`. The `pam` variable holds the instance, allowing us to access its methods. The output of the algorithm, stored in `pam.physical_activities_`, reports for each date the mean ENMO and the time spent in minutes for sedentary, light, moderate, and vigorous activity.
+
+
+```python
+# Initialize the PhysicalActivityMonitoring class
+pam = PhysicalActivityMonitoring()
+
+# Detect physical activity
+pam.detect(
+    data=accel_data,
+    acceleration_unit=acceleration_unit,
+    sampling_freq_Hz=sampling_frequency,
+    thresholds_mg={
+        "sedentary_threshold": 45,
+        "light_threshold": 100,
+        "moderate_threshold": 400,
+    },
+    epoch_duration_sec=5,
+    plot=False
+)
+
+# Print detected physical activities
+print(pam.physical_activities_)
+```
+        date        sedentary_mean_enmo  sedentary_time_min  light_mean_enmo     light_time_min  moderate_mean_enmo  moderate_time_min      vigorous_time_min  vigorous_mean_enmo  
+    0  2023-01-01   0.824444             115.583333          NaN                 0               NaN                 0                      NaN                0           
+
+
+
+
+
diff --git a/kielmat/modules/pam/_pam.py b/kielmat/modules/pam/_pam.py
@@ -103,18 +103,20 @@ def detect(
         if not isinstance(data, pd.DataFrame):
             raise ValueError("Input data must be a DataFrame.")
 
-        # check if index column is datetime
-        if not isinstance(data.index, pd.DatetimeIndex):
-            raise ValueError("Index column must be a datetime index.")
+        # Check if data has at least 3 columns
+        if data.shape[1] < 3:
+            raise ValueError("Input data must have at least 3 columns.")
+
+        # Create a time index if data does not have a timestamp column
+        if data.index.name != "timestamp" or not isinstance(data.index, pd.DatetimeIndex):
+            # Create a timestamp index with the correct frequency if not already present
+            data.index = pd.date_range(start="2023-01-01 00:00:00", periods=len(data), freq=f"{1/sampling_freq_Hz}s")
+            data.index.name = "timestamp"
 
         # check if index column in named timestamp
         if data.index.name != "timestamp":
             raise ValueError("Index column must be named timestamp.")
 
-        # Check if data has at least 3 columns
-        if data.shape[1] < 3:
-            raise ValueError("Input data must have at least 3 columns.")
-
         if not isinstance(sampling_freq_Hz, (int, float)) or sampling_freq_Hz <= 0:
             raise ValueError("Sampling frequency must be a positive float.")
 
@@ -130,10 +132,12 @@ def detect(
         # Check unit of acceleration data if it is in g or m/s^2
         if acceleration_unit == "m/s^2":
             # Convert acceleration data from m/s^2 to g (if not already is in g)
+            data = data.copy()
             data /= 9.81
 
         # Calculate Euclidean Norm (EN)
-        data["en"] = np.linalg.norm(data, axis=1)
+        data = data.copy()
+        data["en"] = np.linalg.norm(data.values, axis=1)
 
         # Apply 4th order low-pass Butterworth filter with the cutoff frequency of 20Hz
         data["en"] = preprocessing.lowpass_filter(
@@ -155,7 +159,7 @@ def detect(
 
         # Create a final DataFrame with time index and processed ENMO values
         processed_data = pd.DataFrame(
-            data=data["enmo"], index=data.index, columns=["enmo"]
+            data={"enmo": data["enmo"]}, index=data.index
         )
 
         # Classify activities based on thresholds using activity_classification
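
Note on the index handling added to `detect`: the sketch below isolates the same idea, building a synthetic `DatetimeIndex` named `timestamp` from the sampling frequency when the input has none. The helper `with_timestamp_index` is hypothetical and not part of KielMAT. As far as the hunk shows, the committed code assigns the new index to the DataFrame it was given, whereas this version works on a copy so the caller's object is left untouched. The fixed start of 2023-01-01 mirrors the value in the diff, which would explain the date shown in the tutorial output.

```python
import pandas as pd


def with_timestamp_index(data: pd.DataFrame, sampling_freq_Hz: float,
                         start: str = "2023-01-01 00:00:00") -> pd.DataFrame:
    """Return `data` indexed by a DatetimeIndex named 'timestamp' (hypothetical helper)."""
    # Nothing to do if the frame already has the expected index
    if isinstance(data.index, pd.DatetimeIndex) and data.index.name == "timestamp":
        return data
    # Work on a copy so the caller's DataFrame keeps its original index
    out = data.copy()
    out.index = pd.date_range(
        start=start,
        periods=len(out),
        freq=pd.Timedelta(seconds=1 / sampling_freq_Hz),  # sample period
        name="timestamp",
    )
    return out


# Three samples at 100 Hz with a plain integer index
raw = pd.DataFrame({"x": [0.0, 0.1, 0.2], "y": [0.0, 0.0, 0.0], "z": [1.0, 1.0, 1.0]})
indexed = with_timestamp_index(raw, sampling_freq_Hz=100)
print(indexed.index)
```

Passing the sample period as a `pd.Timedelta` rather than a formatted string is a design choice of this sketch; it avoids putting a long decimal such as `0.03333333333333333s` into a frequency string when the sampling rate does not divide one second evenly.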

diff --git a/kielmat/test/test_modules.py b/kielmat/test/test_modules.py
@@ -659,28 +659,6 @@ def test_invalid_input_data_type():
         )
 
 
-def test_invalid_index_name():
-    # Initialize the class
-    pam = PhysicalActivityMonitoring()
-
-    # Test with invalid index name
-    data_with_wrong_index_name = acceleration_data.copy()
-    data_with_wrong_index_name.index.name = "wrong_name"
-    with pytest.raises(ValueError):
-        pam.detect(
-            data=data_with_wrong_index_name,
-            acceleration_unit="m/s^2",
-            sampling_freq_Hz=sampling_frequency,
-            thresholds_mg={
-                "sedentary_threshold": 45,
-                "light_threshold": 100,
-                "moderate_threshold": 400,
-            },
-            epoch_duration_sec=5,
-            plot=False,
-        )
-
-
 def test_insufficient_columns():
     # Initialize the class
     pam = PhysicalActivityMonitoring()

diff --git a/kielmat/utils/preprocessing.py b/kielmat/utils/preprocessing.py
@@ -1004,7 +1004,7 @@ def classify_physical_activity(
         raise ValueError("Epoch_duration must be a positive integer.")
 
     # Group data by time in epochs and calculate the mean
-    processed_data = input_data.groupby(pd.Grouper(freq=f"{epoch_duration}S")).mean()
+    processed_data = input_data.groupby(pd.Grouper(freq=f"{epoch_duration}s")).mean()
 
     # Classify activity levels based on threshold values
     processed_data["sedentary"] = (processed_data["enmo"] < sedentary_threshold).astype(