Package 'animovement'

Title: An R toolbox for analysing animal movement across space and time
Description: An R toolbox for analysing animal movement across space and time.
Authors: Mikkel Roald-Arbøl [aut, cre]
Maintainer: Mikkel Roald-Arbøl <[email protected]>
License: MIT + file LICENSE
Version: 0.6.0
Built: 2025-02-02 02:40:56 UTC
Source: https://github.com/roaldarbol/animovement

Help Index


Add Centroid to Movement Data

Description

Calculates and adds a centroid point to movement tracking data. The centroid represents the mean position of selected keypoints at each time point.

Usage

add_centroid(
  data,
  include_keypoints = NULL,
  exclude_keypoints = NULL,
  centroid_name = "centroid"
)

Arguments

data

A data frame containing movement tracking data with the following required columns:

  • individual: Identifier for each tracked subject

  • keypoint: Factor specifying tracked points

  • time: Time values

  • x: x-coordinates

  • y: y-coordinates

  • confidence: Confidence values for tracked points

include_keypoints

Optional character vector specifying which keypoints to use for centroid calculation. If NULL (default), all keypoints are used unless exclude_keypoints is specified.

exclude_keypoints

Optional character vector specifying which keypoints to exclude from centroid calculation. If NULL (default), no keypoints are excluded unless include_keypoints is specified.

centroid_name

Character string specifying the name for the centroid keypoint (default: "centroid")

Details

The function calculates the centroid as the mean x and y position of the selected keypoints at each time point for each individual. Keypoints can be selected either by specifying which ones to include (include_keypoints) or which ones to exclude (exclude_keypoints). The resulting centroid is added as a new keypoint to the data frame.

Value

A data frame with the same structure as the input, but with an additional keypoint representing the centroid. The centroid's confidence values are set to NA.

See Also

convert_nan_to_na() for NaN handling in the centroid calculation

Examples

## Not run: 
# Add centroid using all keypoints
add_centroid(movement_data)

# Calculate centroid using only specific keypoints
add_centroid(movement_data,
            include_keypoints = c("head", "thorax", "abdomen"))

# Calculate centroid excluding certain keypoints
add_centroid(movement_data,
            exclude_keypoints = c("antenna_left", "antenna_right"),
            centroid_name = "body_centroid")

## End(Not run)

Align a time series with a reference series using cross-correlation

Description

This function aligns two time series by shifting one series relative to the reference based on their cross-correlation. It first finds the optimal lag using find_lag, then applies the shift by padding with NA values as needed.

Usage

align_timeseries(signal, reference, max_lag = 5000, normalize = TRUE)

Arguments

signal

Time series to align (numeric vector)

reference

Reference time series to align against (numeric vector)

max_lag

Maximum lag to consider in both directions, in number of samples. If NULL, uses (length of series - 1)

normalize

Logical; if TRUE, z-score normalizes both series before computing cross-correlation (recommended for series with different scales)

Value

A numeric vector of the same length as the input signal, shifted to align with the reference series. NA values are used to pad the beginning or end depending on the direction of the shift.

Examples

# Create two artificially shifted sine waves
t <- seq(0, 10, 0.1)
reference <- sin(t)
signal <- sin(t - 0.5)  # Signal delayed by 0.5 units

# Align the delayed signal with the reference
aligned <- align_timeseries(signal, reference)

# Plot to verify alignment
plot(t, reference, type = "l", col = "black")
lines(t, aligned, col = "red", lty = 2)

Calculate kinematics from position data

Description

[Experimental]

Calculates kinematic measurements including translational and rotational motion from position data. The function computes velocities, accelerations, and angular measurements from x-y coordinate time series data.

Usage

calculate_kinematics(data, by = NULL)

Arguments

data

A data frame containing at minimum:

  • time (numeric): Time points of measurements

  • x (numeric): X-coordinates

  • y (numeric): Y-coordinates

by

Character vector specifying additional grouping variables (optional). If the input data frame is already grouped, those groups will be preserved and any additional groups specified in by will be added.

Value

A data frame containing the original data plus calculated kinematics:

  • distance: Distance traveled between consecutive points

  • v_translation: Translational velocity

  • a_translation: Translational acceleration

  • direction: Movement direction in radians

  • rotation: Angular change between consecutive points

  • v_rotation: Angular velocity

  • a_rotation: Angular acceleration

Warning

Time points should be regularly sampled for accurate derivatives.

Examples

# Basic usage with just x-y coordinates
df <- data.frame(
  time = 1:10,
  x = runif(10),
  y = runif(10)
)
calculate_kinematics(df)

# Using with grouping variables
df_grouped <- data.frame(
  time = rep(1:5, 2),
  x = runif(10),
  y = runif(10),
  individual = rep(c("A", "B"), each = 5)
)
calculate_kinematics(df_grouped, by = "individual")

Calculate Speed from Position Data

Description

Calculates the instantaneous speed from x, y coordinates and time data. Speed is computed as the absolute magnitude of velocity (change in position over time).

Usage

calculate_speed(x, y, time)

Arguments

x

Numeric vector of x coordinates

y

Numeric vector of y coordinates

time

Numeric vector of time values

Value

Numeric vector of speeds. The first value will be NA since speed requires two positions to calculate.

Examples

## Not run: 
# Inside dplyr pipeline
data |>
  group_by(keypoint) |>
  mutate(speed = calculate_speed(x, y, time))

## End(Not run)
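
A minimal sketch with plain vectors; the values are illustrative.

x <- c(0, 1, 3, 6)
y <- c(0, 0, 0, 0)
time <- c(0, 1, 2, 3)
calculate_speed(x, y, time)  # NA, then speeds 1, 2, 3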

Calculate summary statistics

Description

[Experimental]

Calculate summary statistics for tracks

Usage

calculate_statistics(
  data,
  measures = "median_mad",
  straightness = c("A", "B", "C", "D")
)

Arguments

data

A kinematics data frame, e.g. as returned by calculate_kinematics()

measures

Measures of central tendency and dispersion. Options are "median_mad" (default) and "mean_sd".

straightness

Which method(s) to use for calculating path straightness. Choose "A" (default), "B", "C", "D", or a combination (e.g. c("A", "B")).

Value

A data frame of summary statistics for the track kinematics.
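
Examples

A minimal sketch, assuming a kinematics data frame as produced by calculate_kinematics(); the parameter values are illustrative.

## Not run: 
df <- data.frame(
  time = 1:10,
  x = runif(10),
  y = runif(10)
)
kinematics <- calculate_kinematics(df)

# Median/MAD summaries (default)
calculate_statistics(kinematics)

# Mean/SD summaries with selected straightness measures
calculate_statistics(kinematics, measures = "mean_sd",
                     straightness = c("A", "B"))

## End(Not run)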


Visualize the distribution of confidence values for each keypoint

Description

This function generates histograms showing the distribution of confidence values for each keypoint in the dataset.

Usage

check_confidence(data)

Arguments

data

A data frame containing at least the columns keypoint and confidence.

Details

  • Each keypoint in the dataset is assigned its own histogram, showing the frequency of different confidence values.

  • Confidence values are grouped and visualized using the subplot_confidence function.

  • The combined plots use patchwork for alignment and styling.

Value

A patchwork object combining histograms for each keypoint, visualizing the confidence value distributions.

Examples

library(dplyr)
library(patchwork)
data <- dplyr::tibble(
  keypoint = rep(c("head", "arm", "leg", "torso"), each = 10),
  confidence = runif(40, min = 0, max = 1)
)
# Generate histograms of confidence distributions
check_confidence(data)

Visualize the occurrence of gap sizes in the data

Description

This function generates a plot showing the distribution of gap sizes (consecutive NA values) in the data, either aggregated or broken down by keypoints.

Usage

check_na_gapsize(data, limit = 10, include_total = TRUE, by_keypoint = TRUE)

Arguments

data

A data frame containing at least the columns x and keypoint.

limit

An integer specifying the maximum gap size to include in the plot. Default is 10.

include_total

Logical. If TRUE, includes the total count of gaps of each size in the plot. Default is TRUE.

by_keypoint

Logical. If TRUE, generates a separate plot for each keypoint. If FALSE, creates a single aggregated plot for all keypoints. Default is TRUE.

Details

  • The plot highlights the most common gap sizes in the data, ordered by frequency.

  • Different colors represent the occurrence (indianred), total counts (steelblue), and border outlines (black).

  • The function uses patchwork to combine multiple plots when by_keypoint = TRUE.

Value

A patchwork object combining one or more ggplots that visualize the occurrence of gap sizes (consecutive NAs) in the data.

Examples

library(dplyr)
library(ggplot2)
library(patchwork)
data <- dplyr::tibble(
  x = c(NA, NA, 3, NA, 5, 6, NA, NA, NA, 10),
  keypoint = factor(rep(c("head", "arm"), each = 5))
)
check_na_gapsize(data, limit = 5, include_total = TRUE, by_keypoint = TRUE)

Visualize the timing of missing values in the data

Description

This function generates a plot to visualize where missing values (NAs) occur in the data over time. It can display separate plots for each keypoint or a single aggregated plot for all keypoints.

Usage

check_na_timing(data, by_keypoint = TRUE)

Arguments

data

A data frame containing at least the columns x, individual, and keypoint.

by_keypoint

Logical. If TRUE, generates a separate plot for each keypoint. If FALSE, creates a single aggregated plot for all keypoints. Default is TRUE.

Details

  • Missing values are highlighted using a red (indianred2) color, and non-missing values are shown in blue (steelblue).

  • The function uses the patchwork package to combine multiple plots when by_keypoint = TRUE.

Value

A patchwork object combining one or more ggplots that show the timing of missing values (NA) in the data.

Examples

library(dplyr)
library(ggplot2)
library(patchwork)
data <- dplyr::tibble(
  x = c(1, 2, NA, 4, NA, 6),
  individual = rep("A", 6),
  keypoint = factor(rep(c("head", "arm"), each = 3))
)
check_na_timing(data, by_keypoint = TRUE)

Analyze the distribution of distances from keypoints to the centroid

Description

This function generates visualizations of the distances from each keypoint to a calculated centroid in the data. By default, it produces histograms of the distance distributions, but it can also create confidence plots if specified.

Usage

check_pose(data, reference_keypoint, type = "histogram")

Arguments

data

A data frame containing at least the columns keypoint, x, and y.

reference_keypoint

The keypoint used as a reference to calculate the distance.

type

Character string specifying the type of plot to create. Options are:

  • "histogram": Histograms of the distance distributions (default)

  • "confidence": Plots showing confidence intervals for the distances

Details

The centroid is computed using the add_centroid function and distances are calculated with the calculate_distance_to_centroid function. The function automatically excludes the centroid itself from the visualizations. Histograms provide an overview of distance distributions, while confidence plots summarize variability with intervals.

Value

A patchwork object combining plots for each keypoint, visualizing the distances to the centroid.

Examples

## Not run: 
# Create sample data
data <- dplyr::tibble(
  keypoint = rep(c("head", "arm", "leg", "torso"), each = 10),
  x = rnorm(40, mean = 0, sd = 1),
  y = rnorm(40, mean = 0, sd = 1)
)

# Plot histogram of distances
check_pose(data, reference_keypoint = "head", type = "histogram")

# Plot confidence intervals
check_pose(data, reference_keypoint = "head", type = "confidence")

## End(Not run)

Classify Movement States Based on Stability Analysis

Description

This function analyzes movement tracking data to identify periods of high and low activity by detecting stable periods in the movement data. It returns a binary classification where 1 indicates high activity and 0 indicates low activity.

Usage

classify_by_stability(
  speed,
  window_size = 30,
  min_stable_period = 30,
  tolerance = 0.1,
  refine_transitions = TRUE,
  min_low_state_duration = 0,
  min_high_state_duration = 0,
  search_window = 90,
  stability_window = 10,
  stability_threshold = 0.5,
  return_type = c("numeric", "factor")
)

Arguments

speed

Numeric vector of speed or velocity measurements. If velocity is provided, absolute values will be used automatically

window_size

Number of measurements to consider when calculating variance (default: 30)

min_stable_period

Minimum length required for a stable period (default: 30)

tolerance

Tolerance for variance in stable periods (default: 0.1, must be between 0 and 1)

refine_transitions

Whether to refine state transitions using stability detection (default: TRUE)

min_low_state_duration

Minimum duration for low activity states; shorter periods are merged using majority context (default: 0, no merging)

min_high_state_duration

Minimum duration for high activity states; shorter periods are merged using majority context (default: 0, no merging)

search_window

How far to look for movement transitions when refining (default: 90)

stability_window

Window size for checking if movement has stabilized (default: 10)

stability_threshold

Maximum variance allowed in stable state (default: 0.5)

return_type

Return type: "numeric" (1/0, the default) or "factor" ("high"/"low")

Details

The classification process follows these key steps:

  1. Stability Detection:

    • Identifies stable periods in the movement data

    • Uses the longest stable period to establish a baseline for low activity

  2. State Classification:

    • Sets an activity threshold based on the baseline period

    • Classifies periods that deviate from baseline stability as high activity

  3. Optional Refinement:

    • If refine_transitions = TRUE, examines transitions between states to find precise start/end points using stability detection

    • Short duration states can be filtered based on min_low_state_duration and min_high_state_duration parameters using a majority context approach

Value

Numeric vector (or factor, when return_type = "factor") of the same length as the input:

  • 1: High activity state

  • 0: Low activity state

  • NA: Unable to classify (usually due to missing data)
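
Examples

A minimal sketch, assuming a numeric speed vector such as one returned by calculate_speed(); the simulated data and parameter values are illustrative.

## Not run: 
# Simulate a bout of activity between two quiet periods
speed <- c(rnorm(100, mean = 0.1, sd = 0.02),
           rnorm(100, mean = 2, sd = 0.3),
           rnorm(100, mean = 0.1, sd = 0.02))

# Default classification: 1 = high activity, 0 = low activity
states <- classify_by_stability(speed)

# Return a factor and merge short high-activity bouts
states_factor <- classify_by_stability(speed,
                                       min_high_state_duration = 15,
                                       return_type = "factor")

## End(Not run)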


Classify Values Into Sequences with Minimum Run Length Constraints

Description

Classifies numeric values into "high" and "low" categories based on a threshold, while enforcing minimum run lengths for both categories. Values exceeding the threshold are classified as "high", others as "low". Short runs that don't meet the minimum length requirement are reclassified into the opposite category.

Usage

classify_by_threshold(
  values,
  threshold,
  min_low_frames,
  min_high_frames,
  return_type = c("numeric", "factor")
)

Arguments

values

Numeric vector to be classified

threshold

Numeric value used as classification boundary between "high" and "low"

min_low_frames

Minimum number of consecutive frames required for a "low" sequence

min_high_frames

Minimum number of consecutive frames required for a "high" sequence

return_type

Return type: "numeric" (1/0, the default) or "factor" ("high"/"low")

Details

The classification process occurs in two steps:

  1. Initial classification based on threshold

  2. Reclassification of sequences that don't meet minimum length requirements

The function first processes "low" sequences, then "high" sequences. This order can affect the final classification when there are competing minimum length requirements.

Value

Vector of the same length as the input, with values classified as 1/0 ("numeric", the default) or "high"/"low" ("factor"), according to return_type. NA values in the input remain NA in the output.

Examples

# Basic usage
values <- c(1, 1.5, 2.8, 3.2, 3.0, 2.9, 1.2, 1.1)
result <- classify_by_threshold(values,
                                threshold = 2.5,
                                min_low_frames = 2,
                                min_high_frames = 3)

# Handling NAs
values_with_na <- c(1, NA, 3, 3.2, NA, 1.2)
result <- classify_by_threshold(values_with_na,
                                threshold = 2.5,
                                min_low_frames = 2,
                                min_high_frames = 2)

Classifies Periods of High Activity in Time Series Using Peaks and Troughs

Description

Identifies periods of high activity in a time series by analyzing peaks and troughs, returning a logical vector marking these periods. The function handles special cases like adjacent peaks and the initial/final sequences.

Usage

classify_high_periods(x, peaks, troughs)

Arguments

x

numeric vector; the time series values

peaks

logical vector; same length as x, TRUE indicates peak positions

troughs

logical vector; same length as x, TRUE indicates trough positions

Details

The function performs the following steps:

  1. Resolves adjacent peaks by keeping only the highest

  2. Handles the initial sequence before the first trough

  3. Handles the final sequence after the last event

  4. Identifies regions between troughs containing exactly one peak

Value

logical vector; TRUE indicates periods of high activity

Examples

## Not run: 
x <- c(1, 3, 2, 1, 4, 2, 1)
peaks <- c(FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)
troughs <- c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
classify_high_periods(x, peaks, troughs)

## End(Not run)

Classifies Periods of Low Activity in Time Series Using Peaks and Troughs

Description

Identifies periods of low activity in a time series by analyzing peaks and troughs, returning a logical vector marking these periods. Low activity periods are defined as regions between consecutive troughs that contain no peaks.

Usage

classify_low_periods(peaks, troughs)

Arguments

peaks

logical vector; TRUE indicates peak positions

troughs

logical vector; same length as peaks, TRUE indicates trough positions

Details

The function performs the following steps:

  1. Validates input lengths

  2. Initializes all periods as potentially low activity (TRUE)

  3. For each pair of consecutive troughs:

    • If no peaks exist between them, maintains TRUE for that period

    • If any peaks exist, marks that period as FALSE (not low activity)

Value

logical vector; TRUE indicates periods of low activity

Examples

peaks <- c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
troughs <- c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
classify_low_periods(peaks, troughs)

Apply Butterworth Highpass Filter to Signal

Description

This function applies a highpass Butterworth filter to a signal using forward-backward filtering (filtfilt) to achieve zero phase distortion. The Butterworth filter is maximally flat in the passband, making it ideal for many signal processing applications.

Usage

filter_highpass(
  x,
  cutoff_freq,
  sampling_rate,
  order = 4,
  na_action = c("linear", "spline", "stine", "locf", "value", "error"),
  keep_na = FALSE,
  ...
)

Arguments

x

Numeric vector containing the signal to be filtered

cutoff_freq

Cutoff frequency in Hz. Frequencies above this value are passed, while frequencies below are attenuated. Should be between 0 and sampling_rate/2.

sampling_rate

Sampling rate of the signal in Hz. Must be at least twice the highest frequency component in the signal (Nyquist criterion).

order

Filter order (default = 4). Controls the steepness of frequency rolloff:

  • Higher orders give sharper cutoffs but may introduce more ringing

  • Lower orders give smoother transitions but less steep rolloff

  • Common values in practice are 2-8

  • Values above 8 are rarely used due to numerical instability

na_action

Method to handle NA values before filtering. One of:

  • "linear": Linear interpolation (default)

  • "spline": Spline interpolation for smoother curves

  • "stine": Stineman interpolation preserving data shape

  • "locf": Last observation carried forward

  • "value": Replace with a constant value

  • "error": Raise an error if NAs are present

keep_na

Logical indicating whether to restore NAs to their original positions after filtering (default = FALSE)

...

Additional arguments passed to replace_na(). Common options include:

  • value: Numeric value for replacement when na_action = "value"

  • min_gap: Minimum gap size to interpolate/fill

  • max_gap: Maximum gap size to interpolate/fill

Details

The Butterworth filter response falls off at -6*order dB/octave. The cutoff frequency corresponds to the -3dB point of the filter's magnitude response.

Common Applications:

  • Removing baseline drift: Use low cutoff (0.1-1 Hz)

  • EMG analysis: Use moderate cutoff (10-20 Hz)

  • Motion artifact removal: Use application-specific cutoff

Parameter Selection Guidelines:

  • cutoff_freq: Choose based on the lowest frequency you want to preserve

  • order: Same guidelines as filter_lowpass

Common values by field:

  • ECG processing: order=2, cutoff=0.5 Hz

  • EEG analysis: order=4, cutoff=1 Hz

  • Mechanical vibrations: order=2, cutoff application-specific

Missing Value Handling: The function uses replace_na() internally for handling missing values. See ?replace_na for detailed information about each method and its parameters. NAs can optionally be restored to their original positions after filtering using keep_na = TRUE.

Value

Numeric vector containing the filtered signal

References

Butterworth, S. (1930). On the Theory of Filter Amplifiers. Wireless Engineer, 7, 536-541.

See Also

  • replace_na for details on NA handling methods

  • filter_lowpass for low-pass filtering

  • butter for Butterworth filter design

  • filtfilt for zero-phase digital filtering

Examples

# Generate example signal with drift
t <- seq(0, 1, by = 0.001)
drift <- 0.5 * t  # Linear drift
signal <- sin(2*pi*10*t)  # 10 Hz signal
x <- signal + drift

# Add some NAs
x[sample(length(x), 10)] <- NA

# Basic filtering with linear interpolation for NAs
filtered <- filter_highpass(x, cutoff_freq = 2, sampling_rate = 1000)

# Using spline interpolation with max gap constraint
filtered <- filter_highpass(x, cutoff_freq = 2, sampling_rate = 1000,
                           na_action = "spline", max_gap = 3)

# Replace NAs with zeros before filtering
filtered <- filter_highpass(x, cutoff_freq = 2, sampling_rate = 1000,
                           na_action = "value", value = 0)

# Filter but keep NAs in their original positions
filtered <- filter_highpass(x, cutoff_freq = 2, sampling_rate = 1000,
                           na_action = "linear", keep_na = TRUE)

Apply FFT-based Highpass Filter to Signal

Description

This function implements a highpass filter using the Fast Fourier Transform (FFT). It provides a sharp frequency cutoff but may introduce ringing artifacts (Gibbs phenomenon).

Usage

filter_highpass_fft(
  x,
  cutoff_freq,
  sampling_rate,
  na_action = c("linear", "spline", "stine", "locf", "value", "error"),
  keep_na = FALSE,
  ...
)

Arguments

x

Numeric vector containing the signal to be filtered

cutoff_freq

Cutoff frequency in Hz. Frequencies above this value are passed, while frequencies below are attenuated. Should be between 0 and sampling_rate/2.

sampling_rate

Sampling rate of the signal in Hz. Must be at least twice the highest frequency component in the signal (Nyquist criterion).

na_action

Method to handle NA values before filtering. One of:

  • "linear": Linear interpolation (default)

  • "spline": Spline interpolation for smoother curves

  • "stine": Stineman interpolation preserving data shape

  • "locf": Last observation carried forward

  • "value": Replace with a constant value

  • "error": Raise an error if NAs are present

keep_na

Logical indicating whether to restore NAs to their original positions after filtering (default = FALSE)

...

Additional arguments passed to replace_na(). Common options include:

  • value: Numeric value for replacement when na_action = "value"

  • min_gap: Minimum gap size to interpolate/fill

  • max_gap: Maximum gap size to interpolate/fill

Details

FFT-based filtering applies a hard cutoff in the frequency domain. This can be advantageous for:

  • Precise frequency selection

  • Batch processing of long signals

  • Cases where sharp frequency cutoffs are desired

Common Applications:

  • Removing baseline drift: Use low cutoff (0.1-1 Hz)

  • EMG analysis: Use moderate cutoff (10-20 Hz)

  • Motion artifact removal: Use application-specific cutoff

Limitations:

  • May introduce ringing artifacts

  • Assumes periodic signal (can cause edge effects)

  • Less suitable for real-time processing

Missing Value Handling: The function uses replace_na() internally for handling missing values. See ?replace_na for detailed information about each method and its parameters. NAs can optionally be restored to their original positions after filtering using keep_na = TRUE.

Value

Numeric vector containing the filtered signal

See Also

  • replace_na for details on NA handling methods

  • filter_lowpass_fft for FFT-based low-pass filtering

  • filter_highpass for Butterworth-based filtering

Examples

# Generate example signal with drift
t <- seq(0, 1, by = 0.001)
drift <- 0.5 * t  # Linear drift
signal <- sin(2*pi*10*t)  # 10 Hz signal
x <- signal + drift

# Add some NAs
x[sample(length(x), 10)] <- NA

# Basic filtering with linear interpolation for NAs
filtered <- filter_highpass_fft(x, cutoff_freq = 2, sampling_rate = 1000)

# Using spline interpolation with max gap constraint
filtered <- filter_highpass_fft(x, cutoff_freq = 2, sampling_rate = 1000,
                               na_action = "spline", max_gap = 3)

# Replace NAs with zeros before filtering
filtered <- filter_highpass_fft(x, cutoff_freq = 2, sampling_rate = 1000,
                               na_action = "value", value = 0)

# Filter but keep NAs in their original positions
filtered <- filter_highpass_fft(x, cutoff_freq = 2, sampling_rate = 1000,
                               na_action = "linear", keep_na = TRUE)

# Compare with Butterworth filter
butter_filtered <- filter_highpass(x, 2, 1000)

Kalman Filter for Regular Time Series

Description

Implements a Kalman filter for regularly sampled time series data with automatic parameter selection based on sampling rate. The filter handles missing values (NA) and provides noise reduction while preserving real signal changes.

Usage

filter_kalman(
  measurements,
  sampling_rate,
  base_Q = NULL,
  R = NULL,
  initial_state = NULL,
  initial_P = NULL
)

Arguments

measurements

Numeric vector containing the measurements to be filtered.

sampling_rate

Numeric value specifying the sampling rate in Hz (frames per second).

base_Q

Optional. Process variance. If NULL, automatically calculated based on sampling_rate. Represents expected rate of change in the true state.

R

Optional. Measurement variance. If NULL, defaults to 0.1. Represents the noise level in your measurements.

initial_state

Optional. Initial state estimate. If NULL, uses first non-NA measurement.

initial_P

Optional. Initial state uncertainty. If NULL, calculated based on sampling_rate.

Details

The function implements a simple Kalman filter with a constant position model. When parameters are not explicitly provided, they are automatically configured based on the sampling rate:

  • base_Q scales inversely with sampling rate (base_Q ≈ 0.15/sampling_rate)

  • R defaults to 0.1 (assuming moderate measurement noise)

  • initial_P scales with sampling rate uncertainty

Missing values (NA) are handled by relying on the prediction step without measurement updates.

Value

A numeric vector of the same length as measurements containing the filtered values.

Note

Parameter selection guidelines:

  • Increase R or decrease base_Q for smoother output

  • Decrease R or increase base_Q for more responsive output

  • For high-frequency data (>100 Hz), consider reducing base_Q

  • If you know your sensor's noise characteristics, set R to the square of the standard deviation

See Also

filter_kalman_irregular for handling irregularly sampled data

Examples

# Basic usage with 60 Hz data
measurements <- c(1, 1.1, NA, 0.9, 1.2, NA, 0.8, 1.1)
filtered <- filter_kalman(measurements, sampling_rate = 60)

# Custom parameters for more aggressive filtering
filtered_custom <- filter_kalman(measurements,
                                sampling_rate = 60,
                                base_Q = 0.001,
                                R = 0.2)

Kalman Filter for Irregular Time Series with Optional Resampling

Description

Implements a Kalman filter for irregularly sampled time series data with optional resampling to regular intervals. Handles variable sampling rates, missing values, and automatically adjusts process variance based on time intervals.

Usage

filter_kalman_irregular(
  measurements,
  times,
  base_Q = NULL,
  R = NULL,
  initial_state = NULL,
  initial_P = NULL,
  resample = FALSE,
  resample_freq = NULL
)

Arguments

measurements

Numeric vector containing the measurements to be filtered.

times

Numeric vector of timestamps corresponding to measurements.

base_Q

Optional. Base process variance per second. If NULL, automatically calculated.

R

Optional. Measurement variance. If NULL, defaults to 0.1.

initial_state

Optional. Initial state estimate. If NULL, uses first non-NA measurement.

initial_P

Optional. Initial state uncertainty. If NULL, calculated from median sampling rate.

resample

Logical. Whether to return regularly resampled data (default: FALSE).

resample_freq

Numeric. Desired sampling frequency in Hz for resampling (required if resample=TRUE).

Details

The function implements an adaptive Kalman filter that accounts for irregular sampling intervals. Process variance is scaled by the time difference between measurements, allowing proper uncertainty handling for variable sampling rates.

Key features:

  • Handles irregular sampling intervals

  • Scales process variance with time gaps

  • Optional resampling to regular intervals

  • Automatic parameter selection based on median sampling rate

  • Missing value (NA) handling

When resampling, the function uses linear interpolation and warns if the requested sampling frequency exceeds twice the median original sampling rate (Nyquist frequency).

Value

If resample = FALSE: a numeric vector of filtered values corresponding to the original timestamps. If resample = TRUE: a list containing:

  • time: Vector of regular timestamps

  • values: Vector of filtered values at regular timestamps

  • original_time: Original irregular timestamps

  • original_values: Filtered values at original timestamps

Note

Resampling considerations:

  • Avoid resampling above twice the median original sampling rate

  • Consider the physical meaning of your data when choosing resample_freq

  • Be cautious of creating artifacts through high-frequency resampling

Parameter selection guidelines:

  • base_Q controls the expected rate of change per second

  • R should reflect your measurement noise level

  • For slow-changing signals, reduce base_Q

  • For noisy measurements, increase R

See Also

filter_kalman for regularly sampled data

Examples

# Example with irregular sampling
measurements <- c(1, 1.1, NA, 0.9, 1.2, NA, 0.8, 1.1)
times <- c(0, 0.1, 0.3, 0.35, 0.5, 0.8, 0.81, 1.0)

# Basic filtering with irregular samples
filtered <- filter_kalman_irregular(measurements, times)

# Filtering with resampling to 50 Hz
filtered_resampled <- filter_kalman_irregular(measurements, times,
                                             resample = TRUE,
                                             resample_freq = 50)

# Plot results
plot(times, measurements, type="p", col="blue")
lines(filtered_resampled$time, filtered_resampled$values, col="red")

Apply Butterworth Lowpass Filter to Signal

Description

This function applies a lowpass Butterworth filter to a signal using forward-backward filtering (filtfilt) to achieve zero phase distortion. The Butterworth filter is maximally flat in the passband, making it ideal for many signal processing applications.

Usage

filter_lowpass(
  x,
  cutoff_freq,
  sampling_rate,
  order = 4,
  na_action = c("linear", "spline", "stine", "locf", "value", "error"),
  keep_na = FALSE,
  ...
)

Arguments

x

Numeric vector containing the signal to be filtered

cutoff_freq

Cutoff frequency in Hz. Frequencies below this value are passed, while frequencies above are attenuated. Should be between 0 and sampling_rate/2.

sampling_rate

Sampling rate of the signal in Hz. Must be at least twice the highest frequency component in the signal (Nyquist criterion).

order

Filter order (default = 4). Controls the steepness of frequency rolloff:

  • Higher orders give sharper cutoffs but may introduce more ringing

  • Lower orders give smoother transitions but less steep rolloff

  • Common values in practice are 2-8

  • Values above 8 are rarely used due to numerical instability

na_action

Method to handle NA values before filtering. One of:

  • "linear": Linear interpolation (default)

  • "spline": Spline interpolation for smoother curves

  • "stine": Stineman interpolation preserving data shape

  • "locf": Last observation carried forward

  • "value": Replace with a constant value

  • "error": Raise an error if NAs are present

keep_na

Logical indicating whether to restore NAs to their original positions after filtering (default = FALSE)

...

Additional arguments passed to replace_na(). Common options include:

  • value: Numeric value for replacement when na_action = "value"

  • min_gap: Minimum gap size to interpolate/fill

  • max_gap: Maximum gap size to interpolate/fill

Details

The Butterworth filter response falls off at -6*order dB/octave. The cutoff frequency corresponds to the -3dB point of the filter's magnitude response.

Parameter Selection Guidelines:

  • cutoff_freq: Choose based on the frequency content you want to preserve

  • sampling_rate: Should match your data collection rate

  • order:

    • order=2: Gentle rolloff, minimal ringing (~12 dB/octave)

    • order=4: Standard choice, good balance (~24 dB/octave)

    • order=6: Steeper rolloff, some ringing (~36 dB/octave)

    • order=8: Very steep, may have significant ringing (~48 dB/octave)

Note: For very low cutoff frequencies (<0.001 of Nyquist), order is automatically reduced to 2 to maintain stability.

Common values by field:

  • Biomechanics: order=2 or 4

  • EEG/MEG: order=4 or 6

  • Audio processing: order=2 to 8

  • Mechanical vibrations: order=2 to 4

Missing Value Handling: The function uses replace_na() internally for handling missing values. See ?replace_na for detailed information about each method and its parameters. NAs can optionally be restored to their original positions after filtering using keep_na = TRUE.

Value

Numeric vector containing the filtered signal

References

Butterworth, S. (1930). On the Theory of Filter Amplifiers. Wireless Engineer, 7, 536-541.

See Also

  • replace_na for details on NA handling methods

  • filter_highpass for high-pass filtering

  • butter for Butterworth filter design

  • filtfilt for zero-phase digital filtering

Examples

# Generate example signal: 2 Hz fundamental + 50 Hz noise
t <- seq(0, 1, by = 0.001)
x <- sin(2*pi*2*t) + 0.5*sin(2*pi*50*t)

# Add some NAs
x[sample(length(x), 10)] <- NA

# Basic filtering with linear interpolation for NAs
filtered <- filter_lowpass(x, cutoff_freq = 5, sampling_rate = 1000)

# Using spline interpolation with max gap constraint
filtered <- filter_lowpass(x, cutoff_freq = 5, sampling_rate = 1000,
                          na_action = "spline", max_gap = 3)

# Replace NAs with zeros before filtering
filtered <- filter_lowpass(x, cutoff_freq = 5, sampling_rate = 1000,
                          na_action = "value", value = 0)

# Filter but keep NAs in their original positions
filtered <- filter_lowpass(x, cutoff_freq = 5, sampling_rate = 1000,
                          na_action = "linear", keep_na = TRUE)

Apply FFT-based Lowpass Filter to Signal

Description

This function implements a lowpass filter using the Fast Fourier Transform (FFT). It provides a sharp frequency cutoff but may introduce ringing artifacts (Gibbs phenomenon).

Usage

filter_lowpass_fft(
  x,
  cutoff_freq,
  sampling_rate,
  na_action = c("linear", "spline", "stine", "locf", "value", "error"),
  keep_na = FALSE,
  ...
)

Arguments

x

Numeric vector containing the signal to be filtered

cutoff_freq

Cutoff frequency in Hz. Frequencies below this value are passed, while frequencies above are attenuated. Should be between 0 and sampling_rate/2.

sampling_rate

Sampling rate of the signal in Hz. Must be at least twice the highest frequency component in the signal (Nyquist criterion).

na_action

Method to handle NA values before filtering. One of:

  • "linear": Linear interpolation (default)

  • "spline": Spline interpolation for smoother curves

  • "stine": Stineman interpolation preserving data shape

  • "locf": Last observation carried forward

  • "value": Replace with a constant value

  • "error": Raise an error if NAs are present

keep_na

Logical indicating whether to restore NAs to their original positions after filtering (default = FALSE)

...

Additional arguments passed to replace_na(). Common options include:

  • value: Numeric value for replacement when na_action = "value"

  • min_gap: Minimum gap size to interpolate/fill

  • max_gap: Maximum gap size to interpolate/fill

Details

FFT-based filtering applies a hard cutoff in the frequency domain. This can be advantageous for:

  • Precise frequency selection

  • Batch processing of long signals

  • Cases where sharp frequency cutoffs are desired

Limitations:

  • May introduce ringing artifacts

  • Assumes periodic signal (can cause edge effects)

  • Less suitable for real-time processing

Missing Value Handling: The function uses replace_na() internally for handling missing values. See ?replace_na for detailed information about each method and its parameters. NAs can optionally be restored to their original positions after filtering using keep_na = TRUE.

Value

Numeric vector containing the filtered signal

See Also

  • replace_na for details on NA handling methods

  • filter_highpass_fft for FFT-based high-pass filtering

  • filter_lowpass for Butterworth-based filtering

Examples

# Generate example signal with mixed frequencies
t <- seq(0, 1, by = 0.001)
x <- sin(2*pi*2*t) + sin(2*pi*50*t)

# Add some NAs
x[sample(length(x), 10)] <- NA

# Basic filtering with linear interpolation for NAs
filtered <- filter_lowpass_fft(x, cutoff_freq = 5, sampling_rate = 1000)

# Using spline interpolation with max gap constraint
filtered <- filter_lowpass_fft(x, cutoff_freq = 5, sampling_rate = 1000,
                              na_action = "spline", max_gap = 3)

# Replace NAs with zeros before filtering
filtered <- filter_lowpass_fft(x, cutoff_freq = 5, sampling_rate = 1000,
                              na_action = "value", value = 0)

# Filter but keep NAs in their original positions
filtered <- filter_lowpass_fft(x, cutoff_freq = 5, sampling_rate = 1000,
                              na_action = "linear", keep_na = TRUE)

# Compare with Butterworth filter
butter_filtered <- filter_lowpass(x, 5, 1000)

Smooth Movement Data

Description

[Experimental]

Applies smoothing filters to movement tracking data to reduce noise.

Usage

filter_movement(
  data,
  method = c("rollmedian", "rollmean", "kalman", "sgolay", "lowpass", "highpass",
    "lowpass_fft", "highpass_fft"),
  use_derivatives = FALSE,
  ...
)

Arguments

data

A data frame containing movement tracking data with the following required columns:

  • individual: Identifier for each tracked subject

  • keypoint: Identifier for each tracked point

  • x: x-coordinates

  • y: y-coordinates

  • time: Time values

Optional columns:

  • z: z-coordinates

method

Character string specifying the smoothing method. One of "rollmedian" (default), "rollmean", "kalman", "sgolay", "lowpass", "highpass", "lowpass_fft", or "highpass_fft".

use_derivatives

Logical. If TRUE, the filter is applied to the derivative values rather than the coordinates (important for e.g. trackball or accelerometer data). Default is FALSE.

...

Additional arguments passed to the specific filter function

Details

This function is a wrapper that applies the chosen filtering method to the x and y (and z if present) coordinates. Each filtering method has its own specific parameters; see the documentation of the individual filter functions, listed under See Also, for details.

Value

A data frame with the same structure as the input, but with smoothed coordinates.

See Also

filter_rollmedian, filter_rollmean, filter_kalman, filter_sgolay, filter_lowpass, filter_highpass, filter_lowpass_fft, filter_highpass_fft

Examples

## Not run: 
# Apply rolling median with window of 5
filter_movement(tracking_data, "rollmedian", window_width = 5, min_obs = 1)
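
# Filter on derivatives instead of coordinates (e.g. trackball data);
# sampling_rate here is illustrative and is forwarded to filter_sgolay
# via the ... argument
filter_movement(tracking_data, "sgolay", use_derivatives = TRUE,
                sampling_rate = 60)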

## End(Not run)

Filter low-confidence values in a dataset

Description

This function replaces values in columns x, y, and confidence with NA if the confidence values are below a specified threshold.

Usage

filter_na_confidence(data, threshold = 0.6)

Arguments

data

A data frame containing the columns x, y, and confidence.

threshold

A numeric value specifying the minimum confidence level to retain data. Default is 0.6.

Value

A data frame with the same structure as the input, but where x, y, and confidence values are replaced with NA if the confidence is below the threshold.

Examples

library(dplyr)
data <- dplyr::tibble(
  x = 1:5,
  y = 6:10,
  confidence = c(0.5, 0.7, 0.4, 0.8, 0.9)
)
filter_na_confidence(data, threshold = 0.6)

Filter coordinates outside a region of interest (ROI)

Description

Filters out coordinates that fall outside a specified region of interest by setting them to NA. The ROI can be either rectangular (defined by min/max coordinates) or circular (defined by center and radius).

Usage

filter_na_roi(
  data,
  x_min = NULL,
  x_max = NULL,
  y_min = NULL,
  y_max = NULL,
  x_center = NULL,
  y_center = NULL,
  radius = NULL
)

Arguments

data

A data frame containing 'x' and 'y' coordinates

x_min

Minimum x-coordinate for rectangular ROI

x_max

Maximum x-coordinate for rectangular ROI

y_min

Minimum y-coordinate for rectangular ROI

y_max

Maximum y-coordinate for rectangular ROI

x_center

x-coordinate of circle center for circular ROI

y_center

y-coordinate of circle center for circular ROI

radius

Radius of circular ROI

Value

A data frame with coordinates outside ROI set to NA

Examples

# Create sample data
sample_data <- expand.grid(
  x = seq(0, 100, by = 10),
  y = seq(0, 100, by = 10)
) |> as.data.frame()

# Rectangular ROI example
sample_data |>
  filter_na_roi(x_min = 20, x_max = 80, y_min = 20, y_max = 80)

# Circular ROI example
sample_data |>
  filter_na_roi(x_center = 50, y_center = 50, radius = 25)

Filter values by speed threshold

Description

This function filters out values in a dataset where the calculated speed exceeds a specified threshold. Values for x, y, and confidence are replaced with NA if their corresponding speed exceeds the threshold. Speed is calculated using the calculate_kinematics function.

Usage

filter_na_speed(data, threshold = "auto")

Arguments

data

A data frame containing the following required columns:

  • x: x-coordinates

  • y: y-coordinates

  • time: Time values

Optional column:

  • confidence: Confidence values for each observation

threshold

A numeric value specifying the speed threshold, or "auto".

  • If numeric: Observations with speeds greater than this value will have their x, y, and confidence values replaced with NA

  • If "auto": Sets threshold at mean speed + 3 standard deviations

Details

The speed is calculated using the calculate_kinematics function, which computes translational velocity (v_translation) and other kinematic parameters. When using threshold = "auto", the function calculates the threshold as the mean speed plus three standard deviations, which assumes normally distributed speeds.

Value

A data frame with the same columns as the input data, but with values replaced by NA where the speed exceeds the threshold.

Examples

## Not run: 
data <- dplyr::tibble(
  time = 1:5,
  x = c(1, 2, 4, 7, 11),
  y = c(1, 1, 2, 3, 5),
  confidence = c(0.8, 0.9, 0.7, 0.85, 0.6)
)

# Filter data by a speed threshold of 3
filter_na_speed(data, threshold = 3)

# Use automatic threshold
filter_na_speed(data, threshold = "auto")

## End(Not run)

Apply Rolling Mean Filter

Description

[Experimental]

Applies a rolling mean filter to a numeric vector using the roll package.

Usage

filter_rollmean(x, window_width = 5, min_obs = 1, ...)

Arguments

x

Numeric vector to filter

window_width

Integer specifying window size for rolling calculation

min_obs

Minimum number of non-NA values required (default: 1)

...

Additional parameters to be passed to roll::roll_mean()

Value

Filtered numeric vector
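
Examples

A minimal sketch; the noisy trace and parameter values are illustrative.

x <- sin(seq(0, 2*pi, length.out = 100)) + rnorm(100, sd = 0.2)
smoothed <- filter_rollmean(x, window_width = 5, min_obs = 1)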


Apply Rolling Median Filter

Description

[Experimental]

Applies a rolling median filter to a numeric vector using the roll package.

Usage

filter_rollmedian(x, window_width = 5, min_obs = 1, ...)

Arguments

x

Numeric vector to filter

window_width

Integer specifying window size for rolling calculation

min_obs

Minimum number of non-NA values required (default: 1)

...

Additional parameters to be passed to roll::roll_median

Value

Filtered numeric vector
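
Examples

A minimal sketch showing how the rolling median suppresses a single outlier; the values are illustrative.

x <- c(1, 2, 100, 3, 4, 5, 2, 3)  # spike at position 3
filter_rollmedian(x, window_width = 3, min_obs = 1)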


Apply Savitzky-Golay Filter to Movement Data

Description

This function applies a Savitzky-Golay filter to smooth movement data while preserving higher moments (peaks, valleys) better than moving average filters. The implementation uses zero-phase filtering to prevent temporal shifts in the data.

Usage

filter_sgolay(
  x,
  sampling_rate,
  window_size = ceiling(sampling_rate/10) * 2 + 1,
  order = 3,
  preserve_edges = FALSE,
  na_action = "linear",
  keep_na = FALSE,
  ...
)

Arguments

x

Numeric vector containing the movement data to be filtered

sampling_rate

Sampling rate of the data in Hz. Must match your data collection rate (e.g., 60 for 60 FPS motion capture).

window_size

Window size in samples (must be odd). Controls the amount of smoothing. Larger windows give more smoothing but may over-attenuate genuine movement features. Default is automatically calculated as sampling_rate/10 (rounded up to nearest odd number).

order

Polynomial order (default = 3). Controls how well the filter preserves higher-order moments in the data:

  • order=2: Preserves position, velocity (good for smooth movements)

  • order=3: Also preserves acceleration (good for most movement data)

  • order=4: Also preserves jerk (good for quick movements)

  • order=5: Maximum preservation (may retain too much noise)

preserve_edges

Logical indicating whether to use progressively smaller windows at the beginning and end of the signal to reduce edge effects (default = FALSE). Note: This only affects the signal endpoints, not internal discontinuities.

na_action

Method to handle NA values before filtering. One of:

  • "linear": Linear interpolation (default)

  • "spline": Spline interpolation for smoother curves

  • "locf": Last observation carried forward

  • "value": Replace with a constant value

  • "error": Raise an error if NAs are present

keep_na

Logical indicating whether to restore NAs to their original positions after filtering (default = FALSE)

...

Additional arguments passed to replace_na()

Details

The Savitzky-Golay filter fits successive polynomials to sliding windows of the data. This approach preserves higher moments of the data better than simple moving averages or Butterworth filters, making it particularly suitable for movement data where preserving features like peaks and valleys is important.

Edge Handling: When preserve_edges = TRUE, the function uses progressively smaller windows near the beginning and end of the signal to reduce endpoint distortion. This only affects the signal endpoints - it does not detect or handle internal discontinuities or sharp events within the data.

Parameter Selection Guidelines:

  • window_size:

    • For 60 FPS: 5-15 frames (83-250ms) for quick movements, 15-31 for slow movements

    • For 120 FPS: 7-21 frames (58-175ms) for quick movements, 21-51 for slow movements

    • For 500 FPS: 25-75 frames (50-150ms) for quick movements, 75-151 for slow movements

The default window_size = sampling_rate/10 works well for typical human movement.

  • order:

    • order=2: Smooth movements, position analysis

    • order=3: Most movement analysis (default)

    • order=4: Quick movements, sports analysis

    • order=5: Very quick movements, impact analysis

Note: order must be less than window_size.

Common values by application:

  • Gait analysis (60 FPS): window_size=15, order=3

  • Sports biomechanics (120 FPS): window_size=21, order=4

  • Impact analysis (500 FPS): window_size=51, order=4

  • Posture analysis (60 FPS): window_size=31, order=2

Value

Numeric vector containing the filtered movement data

References

Savitzky, A., & Golay, M.J.E. (1964). Smoothing and Differentiation of Data by Simplified Least Squares Procedures. Analytical Chemistry, 36(8), 1627-1639.

See Also

  • filter_lowpass for frequency-based filtering

  • sgolayfilt for the base Savitzky-Golay implementation

  • replace_na for details on NA handling methods

Examples

# Generate example movement data: smooth motion + noise
t <- seq(0, 5, by = 1/60)  # 60 FPS data
x <- sin(2*pi*0.5*t) + rnorm(length(t), 0, 0.1)

# Basic filtering with default parameters (60 FPS)
filtered <- filter_sgolay(x, sampling_rate = 60)

# Adjusting parameters for quick movements
filtered_quick <- filter_sgolay(x, sampling_rate = 60,
                               window_size = 11, order = 4)

# High-speed camera data (500 FPS) with larger window
filtered_high <- filter_sgolay(x, sampling_rate = 500,
                              window_size = 51, order = 3)

Find optimal time lag between two time series using cross-correlation

Description

This function calculates the optimal lag between two time series by finding the lag that maximizes their cross-correlation. It's particularly useful for synchronizing recordings from different sources, such as physiological and behavioral data.

Usage

find_lag(signal, reference, max_lag = 5000, normalize = TRUE)

Arguments

signal

Time series to align (numeric vector)

reference

Reference time series to align against (numeric vector)

max_lag

Maximum lag to consider in both directions, in number of samples. If NULL, uses (length of series - 1)

normalize

Logical; if TRUE, z-score normalizes both series before computing cross-correlation (recommended for series with different scales)

Value

Integer indicating the optimal lag. A positive value means the signal needs to be shifted forward in time to align with the reference. A negative value means the signal needs to be shifted backward.

See Also

align_timeseries for applying the computed lag

Examples

# Create two artificially shifted sine waves
t <- seq(0, 10, 0.1)
reference <- sin(t)
signal <- sin(t - 0.5)  # Signal delayed by 0.5 units
lag <- find_lag(signal, reference)
print(lag)  # Should be approximately 5 samples (0.5 units)

Find Peaks in Time Series Data

Description

Identifies peaks (local maxima) in a numeric time series, with options to filter peaks based on height and prominence. The function handles missing values (NA) appropriately and is compatible with dplyr's mutate. Includes flexible handling of plateaus and adjustable window size for peak detection.

Usage

find_peaks(
  x,
  min_height = -Inf,
  min_prominence = 0,
  plateau_handling = c("strict", "middle", "first", "last", "all"),
  window_size = 3
)

Arguments

x

Numeric vector containing the time series data

min_height

Minimum height threshold for peaks (default: -Inf)

min_prominence

Minimum prominence threshold for peaks (default: 0)

plateau_handling

String specifying how to handle plateaus. One of:

  • "strict" (default): No points in plateau are peaks

  • "middle": Middle point(s) of plateau are peaks

  • "first": First point of plateau is peak

  • "last": Last point of plateau is peak

  • "all": All points in plateau are peaks

window_size

Integer specifying the size of the window to use for peak detection (default: 3). Must be odd and >= 3. Larger values detect peaks over wider ranges.

Details

The function uses a sliding window algorithm for peak detection (window size specified by window_size parameter), combined with a region-based prominence calculation method similar to that described in Palshikar (2009).

Value

A logical vector of the same length as the input where:

  • TRUE indicates a confirmed peak

  • FALSE indicates a non-peak

  • NA indicates peak status could not be determined due to missing data

Peak Detection

A point is considered a peak if it is the highest point within its window (default window_size of 3 compares each point with its immediate neighbors). The first and last (window_size-1)/2 points in the series cannot be peaks and are marked as NA. Larger window sizes will identify peaks that dominate over a wider range, typically resulting in fewer peaks being detected.

Prominence

Prominence measures how much a peak stands out relative to its surrounding values. It is calculated as the height of the peak minus the height of the highest minimum between this peak and any higher peaks (or the end of the series if no higher peaks exist).

Plateau Handling

Plateaus (sequences of identical values) are handled according to the plateau_handling parameter:

  • strict: No points in a plateau are considered peaks (traditional behavior)

  • middle: For plateaus of odd length, the middle point is marked as a peak. For plateaus of even length, the two middle points are marked as peaks.

  • first: The first point of each plateau is marked as a peak

  • last: The last point of each plateau is marked as a peak

  • all: Every point in the plateau is marked as a peak

Note that in all cases, the plateau must still qualify as a peak relative to its surrounding window (i.e., higher than all other points in the window).

Missing Values (NA) Handling

The function uses the following rules for handling NAs:

  • If a point is NA, it cannot be a peak (returns NA)

  • If any point in the window is NA, peak status cannot be determined (returns NA)

  • For prominence calculations, stretches of NAs are handled appropriately

  • A minimum of window_size points is required; shorter series return all NAs

Note

  • The function is optimized for use with dplyr's mutate

  • For noisy data, consider using a larger window_size or smoothing the series before peak detection

  • Adjust min_height and min_prominence to filter out unwanted peaks

  • Choose plateau_handling based on your specific needs

  • Larger window_size values result in more stringent peak detection

References

Palshikar, G. (2009). Simple Algorithms for Peak Detection in Time-Series. Proc. 1st Int. Conf. Advanced Data Analysis, Business Analytics and Intelligence.

See Also

  • find_troughs for finding local minima

  • findpeaks in the pracma package for alternative peak detection methods

Examples

# Basic usage with default window size (3)
x <- c(1, 3, 2, 6, 4, 5, 2)
find_peaks(x)

# With larger window size
find_peaks(x, window_size = 5)  # More stringent peak detection

# With minimum height
find_peaks(x, min_height = 4, window_size = 3)

# With plateau handling
x <- c(1, 3, 3, 3, 2, 4, 4, 1)
find_peaks(x, plateau_handling = "middle", window_size = 3)  # Middle of plateaus
find_peaks(x, plateau_handling = "all", window_size = 5)     # All plateau points

# With missing values
x <- c(1, 3, NA, 6, 4, NA, 2)
find_peaks(x)

# Usage with dplyr
library(dplyr)
tibble(
  time = 1:10,
  value = c(1, 3, 7, 4, 2, 6, 5, 8, 4, 2)
) %>%
  mutate(peaks = find_peaks(value, window_size = 3))

Find Troughs in Time Series Data

Description

Identifies troughs (local minima) in a numeric time series, with options to filter troughs based on height and prominence. The function handles missing values (NA) appropriately and is compatible with dplyr's mutate. Includes flexible handling of plateaus and adjustable window size for trough detection.

Usage

find_troughs(
  x,
  max_height = Inf,
  min_prominence = 0,
  plateau_handling = c("strict", "middle", "first", "last", "all"),
  window_size = 3
)

Arguments

x

Numeric vector containing the time series data

max_height

Maximum height threshold for troughs (default: Inf)

min_prominence

Minimum prominence threshold for troughs (default: 0)

plateau_handling

String specifying how to handle plateaus. One of:

  • "strict" (default): No points in plateau are troughs

  • "middle": Middle point(s) of plateau are troughs

  • "first": First point of plateau is trough

  • "last": Last point of plateau is trough

  • "all": All points in plateau are troughs

window_size

Integer specifying the size of the window to use for trough detection (default: 3). Must be odd and >= 3. Larger values detect troughs over wider ranges.

Details

The function uses a sliding window algorithm for trough detection (window size specified by window_size parameter), combined with a region-based prominence calculation method similar to that described in Palshikar (2009).

Value

A logical vector of the same length as the input where:

  • TRUE indicates a confirmed trough

  • FALSE indicates a non-trough

  • NA indicates trough status could not be determined due to missing data

Trough Detection

A point is considered a trough if it is the lowest point within its window (default window_size of 3 compares each point with its immediate neighbors). The first and last (window_size-1)/2 points in the series cannot be troughs and are marked as NA. Larger window sizes will identify troughs that dominate over a wider range, typically resulting in fewer troughs being detected.

Prominence

Prominence measures how much a trough stands out relative to its surrounding values. It is calculated as the height of the lowest maximum between this trough and any lower troughs (or the end of the series if no lower troughs exist) minus the height of the trough.

Plateau Handling

Plateaus (sequences of identical values) are handled according to the plateau_handling parameter:

  • strict: No points in a plateau are considered troughs (traditional behavior)

  • middle: For plateaus of odd length, the middle point is marked as a trough. For plateaus of even length, the two middle points are marked as troughs.

  • first: The first point of each plateau is marked as a trough

  • last: The last point of each plateau is marked as a trough

  • all: Every point in the plateau is marked as a trough

Note that in all cases, the plateau must still qualify as a trough relative to its surrounding window (i.e., lower than all other points in the window).

Missing Values (NA) Handling

The function uses the following rules for handling NAs:

  • If a point is NA, it cannot be a trough (returns NA)

  • If any point in the window is NA, trough status cannot be determined (returns NA)

  • For prominence calculations, stretches of NAs are handled appropriately

  • A minimum of window_size points is required; shorter series return all NAs

Note

  • The function is optimized for use with dplyr's mutate

  • For noisy data, consider using a larger window_size or smoothing the series before trough detection

  • Adjust max_height and min_prominence to filter out unwanted troughs

  • Choose plateau_handling based on your specific needs

  • Larger window_size values result in more stringent trough detection

References

Palshikar, G. (2009). Simple Algorithms for Peak Detection in Time-Series. Proc. 1st Int. Conf. Advanced Data Analysis, Business Analytics and Intelligence.

See Also

  • find_peaks for finding local maxima

  • findpeaks in the pracma package for alternative extrema detection methods

Examples

# Basic usage with default window size (3)
x <- c(5, 3, 4, 1, 4, 2, 5)
find_troughs(x)

# With larger window size
find_troughs(x, window_size = 5)  # More stringent trough detection

# With maximum height
find_troughs(x, max_height = 3, window_size = 3)
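
# With minimum prominence (min_prominence is a documented parameter;
# the threshold of 2 here is illustrative)
find_troughs(x, min_prominence = 2, window_size = 3)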

# With plateau handling
x <- c(5, 2, 2, 2, 3, 1, 1, 4)
find_troughs(x, plateau_handling = "middle", window_size = 3)  # Middle of plateaus
find_troughs(x, plateau_handling = "all", window_size = 5)     # All plateau points

# With missing values
x <- c(5, 3, NA, 1, 4, NA, 5)
find_troughs(x)

# Usage with dplyr
library(dplyr)
tibble(
  time = 1:10,
  value = c(5, 3, 1, 4, 2, 1, 3, 0, 4, 5)
) %>%
  mutate(troughs = find_troughs(value, window_size = 3))

Download example tracking data

Description

Downloads example data for different animal tracking software and returns the path to the downloaded file. The function caches the data to avoid repeated downloads.

Usage

get_example_data(source, cache_dir = tempdir())

Arguments

source

Character string specifying the tracking software. Currently supported:

  • "deeplabcut": Data from DeepLabCut tracking

cache_dir

Character string specifying the directory in which to cache the downloaded files. Defaults to a temporary directory using tempdir().

Details

The function downloads example data from a GitHub repository and caches it locally. If the file already exists in the cache directory, it will use the cached version instead of downloading it again.

The data sources are hosted at: https://github.com/roaldarbol/movement-data

Value

Character string with the path to the downloaded file.

Examples

## Not run: 
# Get path to DeepLabCut example data
path <- get_example_data("deeplabcut")

# Read the data using the preferred method
data <- read_deeplabcut(path)

## End(Not run)

Get/extract metadata

Description

[Experimental]

Usage

get_metadata(data)

Arguments

data

movement data frame

Value

the metadata associated with the movement data frame
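
Examples

## Not run: 
# Minimal sketch, assuming `data` is a movement data frame,
# e.g. from one of the read_* functions:
get_metadata(data)

## End(Not run)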


Group every N observations together

Description

[Experimental]

Sometimes your sampling rate is too high; group_every allows you to down-sample by creating "bins" which can subsequently be summarised over. When using n, the data needs to be regularly sampled; if there are gaps in time, bin durations will differ. Works well with calculate_summary() for movement data.

Usage

group_every(data, seconds = NULL, n = NULL)

Arguments

data

Input data frame

seconds

Number of seconds to bin together

n

Number of observations to include in each bin/group

Value

Grouped data frame, with new "bin" variable.

Examples

## Group by every 5 seconds
df_time <- data.frame(
  time = seq(from = 0.02, to = 100, by = 1/30), # time at 30Hz, slightly offset
  y = rnorm(3000)) # random numbers

df_time |>
  group_every(seconds = 5) |> # group for every 5 seconds
  dplyr::summarise(time = min(time), # summarise for time and y
                   mean_y = mean(y)) |>
  dplyr::mutate(time = floor(time)) # floor to get the round second number

# Group every n observations
df <- data.frame(
x = 1:1000,
  y = rnorm(1000))

df |>
  group_every(n = 30) |> # group every 30 observations together
  dplyr::summarise(mean_x = mean(x),
                   mean_y = mean(y))

Initiate movement metadata

Description

[Experimental]

Usage

init_metadata(data)

Arguments

data

movement data frame

Value

data frame with metadata


Map from polar to Cartesian coordinates

Description

Map from polar to Cartesian coordinates

Usage

map_to_cartesian(data)

Arguments

data

movement data frame with polar coordinates

Value

movement data frame with Cartesian coordinates


Map from Cartesian to polar coordinates

Description

Map from Cartesian to polar coordinates

Usage

map_to_polar(data)

Arguments

data

movement data frame with Cartesian coordinates

Value

movement data frame with polar coordinates
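
Examples

## Not run: 
# Minimal sketch, assuming `data` is a movement data frame with
# Cartesian coordinates; mapping to polar and back should recover
# the original coordinates:
data_polar <- map_to_polar(data)
data_cartesian <- map_to_cartesian(data_polar)

## End(Not run)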


Plot Time Series of Keypoint Position

Description

Creates a multi-panel visualization of keypoint position data over time. Each keypoint gets its own panel showing its x and/or y coordinates, with different colors distinguishing between x (orange) and y (blue) coordinates. Useful for visually inspecting movement patterns and identifying potential tracking issues.

Usage

plot_position_timeseries(data, reference_keypoint = NULL, dimension = "xy")

Arguments

data

A data frame containing tracked keypoint data with the following columns:

  • time: Numeric time values

  • keypoint: Factor specifying the keypoint names

  • x: x-coordinates

  • y: y-coordinates

reference_keypoint

Optional character string. If provided, all coordinates will be translated relative to this keypoint's position. Must match one of the keypoint levels in the data.

dimension

Character string specifying which coordinates to plot. Options are:

  • "xy": Plot both x and y coordinates (default)

  • "x": Plot only x coordinates

  • "y": Plot only y coordinates

Value

A ggplot object combining individual time series plots for each keypoint using patchwork. The plots are stacked vertically with shared axes and legends.

See Also

translate_coords() for the coordinate translation functionality used when reference_keypoint is specified.

Examples

## Not run: 
# Plot all coordinates
plot_position_timeseries(movement_data)

# Plot coordinates relative to "head" keypoint
plot_position_timeseries(movement_data, reference_keypoint = "head")

# Plot only x coordinates
plot_position_timeseries(movement_data, dimension = "x")

## End(Not run)

Plot Time Series of Keypoint Speed

Description

Creates a multi-panel visualization of keypoint speed data over time. Each keypoint gets its own panel showing its speed, useful for analyzing movement intensity and identifying potential tracking issues.

Usage

plot_speed_timeseries(data, y_max = NULL)

Arguments

data

A data frame containing tracked keypoint data with the following columns:

  • time: Numeric time values

  • keypoint: Factor specifying the keypoint names

  • x: x-coordinates

  • y: y-coordinates

y_max

Optional numeric value specifying the maximum value for the y-axis. If NULL (default), the y-axis limit is automatically determined from the data.

Value

A ggplot object combining individual time series plots for each keypoint using patchwork. The plots are stacked vertically with shared axes and legends.

See Also

  • plot_position_timeseries() for plotting position data

  • calculate_speed() for the speed calculation

Examples

## Not run: 
# Plot with automatic y-axis scaling
plot_speed_timeseries(movement_data)

# Plot with fixed maximum speed of 100
plot_speed_timeseries(movement_data, y_max = 100)

## End(Not run)

Read AnimalTA data

Description

Read a data frame from AnimalTA

Usage

read_animalta(path, detailed = FALSE)

Arguments

path

Path to an AnimalTA data file

detailed

Logical; AnimalTA exports either raw (default, detailed = FALSE) or detailed data files. We only have limited support for detailed data.

Value

a movement dataframe

References

  • Chiara, V., & Kim, S.-Y. (2023). AnimalTA: A highly flexible and easy-to-use program for tracking and analysing animal movement in different environments. Methods in Ecology and Evolution, 14, 1699–1707. doi:10.1111/2041-210X.14115.


Read centroid tracking data from Bonsai

Description

Read a Bonsai data frame

Usage

read_bonsai(path)

Arguments

path

Path to a Bonsai data file

Value

a movement dataframe


Read DeepLabCut data

Description

Read csv files from DeepLabCut (DLC). The function recognises whether it is a single- or multi-animal dataset.

Usage

read_deeplabcut(path, multianimal = NULL)

Arguments

path

Path to a DeepLabCut data file

multianimal

Logical TRUE/FALSE. By default (NULL), whether a file is multi-animal is detected automatically; this argument lets you set it explicitly.

Value

a movement dataframe
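
Examples

## Not run: 
# Read the bundled DeepLabCut example data (see get_example_data()):
path <- get_example_data("deeplabcut")
data <- read_deeplabcut(path)

## End(Not run)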


Read idtracker.ai data

Description

Read idtracker.ai data

Usage

read_idtracker(path, path_probabilities = NULL, version = 6)

Arguments

path

Path to an idtracker.ai data file

path_probabilities

Path to a csv file with probabilities. Only needed when reading csv files, as probabilities are already included in h5 files.

version

idtracker.ai version. Currently only v6 output is implemented

Value

a movement dataframe
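
Examples

## Not run: 
# Illustrative sketch; the file names are assumptions, not shipped data.
data <- read_idtracker("trajectories.h5")

# When reading csv files, probabilities are supplied separately:
data <- read_idtracker("trajectories.csv",
                       path_probabilities = "probabilities.csv")

## End(Not run)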


Read LightningPose data

Description

Read csv files from LightningPose (LP).

Usage

read_lightningpose(path)

Arguments

path

Path to a LightningPose data file

Value

a movement dataframe


Read movement data

Description

[Experimental]

Usage

read_movement(data)

Arguments

data

A movement data frame

Value

a movement dataframe


Read SLEAP data

Description

Read SLEAP data

Usage

read_sleap(path)

Arguments

path

Path to a SLEAP analysis file in HDF5 (.h5) format

Value

a movement dataframe


Read trackball data

Description

Read trackball data from a variety of setups and configurations.

Usage

read_trackball(
  paths,
  setup = c("of_free", "of_fixed", "fictrac"),
  sampling_rate,
  col_time = "time",
  col_dx = "x",
  col_dy = "y",
  ball_calibration = NULL,
  ball_diameter = NULL,
  distance_scale = NULL,
  distance_unit = NULL,
  verbose = FALSE
)

Arguments

paths

Two file paths, one for each sensor (a single path is allowed for a fixed setup, of_fixed).

setup

Which type of experimental setup was used. Expects one of of_free, of_fixed or fictrac (support coming soon).

sampling_rate

The sampling rate tells the function how long a time window it should integrate over. A sampling rate of 60 Hz means windows of 1/60 s are used for integration.

col_time

Which column contains the information about time. Can be specified either by the column number (numeric) or the name of the column if it has one (character). Should either be a datetime (POSIXt) or seconds (numeric).

col_dx

Column name for x-axis values

col_dy

Column name for y-axis values

ball_calibration

When running an of_fixed experiment, you may (but need not) provide a calibration factor: the number recorded after a 360-degree spin. You can use the calibrate_trackball function to obtain this number. Alternatively, provide the ball_diameter and a distance_scale (e.g. mouse dpcm).

ball_diameter

When running an of_fixed experiment, the ball diameter is needed together with either ball_calibration or distance_scale.

distance_scale

If using computer mice, you might be getting unit-less data out. However, computer mice have a factor called "dots-per-cm", which you can use to convert your estimates into centimeters.

distance_unit

Which unit should be used. If distance_scale is also used, the unit will be for the scaled data. E.g. for trackball data with optical flow sensors, you can use the mouse dots-per-cm (dpcm) of 394 by setting distance_unit = "cm" and distance_scale = 394.

verbose

If FALSE (default), suppress most warning messages.

Value

a movement dataframe
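
Examples

## Not run: 
# Illustrative sketch; the file paths are assumptions, not shipped data.
# Two optical-flow sensor files from a free-ball setup, converted to
# centimeters with the mouse dots-per-cm (dpcm) of 394:
data <- read_trackball(
  paths = c("sensor_left.csv", "sensor_right.csv"),
  setup = "of_free",
  sampling_rate = 60,
  col_time = "time",
  col_dx = "x",
  col_dy = "y",
  distance_scale = 394,
  distance_unit = "cm"
)

## End(Not run)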


Read treadmill data

Description

[Experimental]

Usage

read_treadmill(data)

Arguments

data

A treadmill data frame

Value

a movement dataframe


Read TRex Movement Tracking Data

Description

Reads and formats movement tracking data exported from TRex (Walter & Couzin, 2021). TRex is a software for tracking animal movement in videos, which exports coordinate data in CSV format. This function processes these files into a standardized movement data format.

Usage

read_trex(path)

Arguments

path

Character string specifying the path to a TRex CSV file. The file should contain columns for:

  • time

  • x and y coordinates for tracked points (e.g., x_head, y_head)

  • x and y coordinates for centroid (x_number_wcentroid_cm, y_number_wcentroid_cm)

Details

The function performs several processing steps:

  1. Validates the input file format (must be CSV)

  2. Reads the data using vroom for efficient processing

  3. Cleans column names to a consistent format

  4. Restructures the data from wide to long format

  5. Initializes metadata fields required for movement data

Value

A data frame containing movement data with the following columns:

  • time: Time values from the tracking

  • individual: Factor (set to NA, as TRex tracks one individual)

  • keypoint: Factor identifying tracked points (e.g., "head", "centroid")

  • x: x-coordinates in centimeters

  • y: y-coordinates in centimeters

  • confidence: Numeric confidence values (set to NA as TRex doesn't provide these)

References

Walter, T., & Couzin, I. D. (2021). TRex, a fast multi-animal tracking system with markerless identification, and 2D estimation of posture and visual fields. eLife, 10, e64000.

See Also

  • init_metadata() for details on metadata initialization

  • TRex software: https://trex.run

Examples

## Not run: 
# Read a TRex CSV file
data <- read_trex("path/to/trex_export.csv")

## End(Not run)

Replace Missing Values Using Various Methods

Description

A wrapper function that replaces missing values using various interpolation or filling methods.

Usage

replace_na(x, method = "linear", value = NULL, min_gap = 1, max_gap = Inf, ...)

Arguments

x

A vector containing numeric data with missing values (NAs)

method

Character string specifying the replacement method:

  • "linear": Linear interpolation (default)

  • "spline": Spline interpolation for smoother curves

  • "stine": Stineman interpolation preserving data shape

  • "locf": Last observation carried forward

  • "value": Replace with a constant value

value

Numeric value for replacement when method = "value"

min_gap

Integer specifying minimum gap size to interpolate/fill. Gaps shorter than this will be left as NA. Default is 1 (handle all gaps).

max_gap

Integer or Inf specifying maximum gap size to interpolate/fill. Gaps longer than this will be left as NA. Default is Inf (no upper limit).

...

Additional parameters passed to the underlying interpolation functions

Value

A numeric vector with NA values replaced according to the specified method where gap length criteria are met.

See Also

  • replace_na_linear() for linear interpolation details

  • replace_na_spline() for spline interpolation details

  • replace_na_stine() for Stineman interpolation details

  • replace_na_locf() for last observation carried forward details

  • replace_na_value() for constant value replacement details

Examples

## Not run: 
x <- c(1, NA, NA, 4, 5, NA, NA, NA, 9)

# Different methods
replace_na(x, method = "linear")
replace_na(x, method = "spline")
replace_na(x, method = "stine")
replace_na(x, method = "locf")
replace_na(x, method = "value", value = 0)

# With gap constraints
replace_na(x, method = "linear", min_gap = 2)
replace_na(x, method = "spline", max_gap = 2)
replace_na(x, method = "linear", min_gap = 2, max_gap = 3)

## End(Not run)

Replace Missing Values Using Linear Interpolation

Description

Replaces missing values using linear interpolation, with control over both minimum and maximum gap sizes to interpolate.

Usage

replace_na_linear(x, min_gap = 1, max_gap = Inf, ...)

Arguments

x

A vector containing numeric data with missing values (NAs)

min_gap

Integer specifying minimum gap size to interpolate. Gaps shorter than this will be left as NA. Default is 1 (interpolate all gaps).

max_gap

Integer or Inf specifying maximum gap size to interpolate. Gaps longer than this will be left as NA. Default is Inf (no upper limit).

...

Additional parameters passed to stats::approx

Details

The function applies both minimum and maximum gap criteria:

  • Gaps shorter than min_gap are left as NA

  • Gaps longer than max_gap are left as NA

  • Only gaps that meet both criteria are interpolated

If both parameters are specified, min_gap must be less than or equal to max_gap.

Value

A numeric vector with NA values replaced by interpolated values where gap length criteria are met.

Examples

## Not run: 
x <- c(1, NA, NA, 4, 5, NA, NA, NA, 9)
replace_na_linear(x)  # interpolates all gaps
replace_na_linear(x, min_gap = 2)  # only gaps >= 2
replace_na_linear(x, max_gap = 2)  # only gaps <= 2
replace_na_linear(x, min_gap = 2, max_gap = 3)  # gaps between 2 and 3

## End(Not run)

Replace Missing Values Using Last Observation Carried Forward

Description

Replaces missing values by carrying forward the last observed value, with control over both minimum and maximum gap sizes to fill.

Usage

replace_na_locf(x, min_gap = 1, max_gap = Inf)

Arguments

x

A vector containing numeric data with missing values (NAs)

min_gap

Integer specifying minimum gap size to fill. Gaps shorter than this will be left as NA. Default is 1 (fill all gaps).

max_gap

Integer or Inf specifying maximum gap size to fill. Gaps longer than this will be left as NA. Default is Inf (no upper limit).

Details

The function applies both minimum and maximum gap criteria:

  • Gaps shorter than min_gap are left as NA

  • Gaps longer than max_gap are left as NA

  • Only gaps that meet both criteria are filled

If both parameters are specified, min_gap must be less than or equal to max_gap.

Value

A numeric vector with NA values replaced by the last observed value where gap length criteria are met.

Examples

## Not run: 
x <- c(1, NA, NA, 4, 5, NA, NA, NA, 9)
replace_na_locf(x)  # fills all gaps
replace_na_locf(x, min_gap = 2)  # only gaps >= 2
replace_na_locf(x, max_gap = 2)  # only gaps <= 2
replace_na_locf(x, min_gap = 2, max_gap = 3)  # gaps between 2 and 3

## End(Not run)

Replace Missing Values Using Spline Interpolation

Description

Replaces missing values using spline interpolation, with control over both minimum and maximum gap sizes to interpolate.

Usage

replace_na_spline(x, min_gap = 1, max_gap = Inf, ...)

Arguments

x

A vector containing numeric data with missing values (NAs)

min_gap

Integer specifying minimum gap size to interpolate. Gaps shorter than this will be left as NA. Default is 1 (interpolate all gaps).

max_gap

Integer or Inf specifying maximum gap size to interpolate. Gaps longer than this will be left as NA. Default is Inf (no upper limit).

...

Additional parameters passed to stats::spline

Details

The function applies both minimum and maximum gap criteria:

  • Gaps shorter than min_gap are left as NA

  • Gaps longer than max_gap are left as NA

  • Only gaps that meet both criteria are interpolated

If both parameters are specified, min_gap must be less than or equal to max_gap.

Value

A numeric vector with NA values replaced by interpolated values where gap length criteria are met.

Examples

## Not run: 
x <- c(1, NA, NA, 4, 5, NA, NA, NA, 9)
replace_na_spline(x)  # interpolates all gaps
replace_na_spline(x, min_gap = 2)  # only gaps >= 2
replace_na_spline(x, max_gap = 2)  # only gaps <= 2
replace_na_spline(x, min_gap = 2, max_gap = 3)  # gaps between 2 and 3

## End(Not run)

Replace Missing Values Using Stineman Interpolation

Description

Replaces missing values using Stineman interpolation, with control over both minimum and maximum gap sizes to interpolate.

Usage

replace_na_stine(x, min_gap = 1, max_gap = Inf, ...)

Arguments

x

A vector containing numeric data with missing values (NAs)

min_gap

Integer specifying minimum gap size to interpolate. Gaps shorter than this will be left as NA. Default is 1 (interpolate all gaps).

max_gap

Integer or Inf specifying maximum gap size to interpolate. Gaps longer than this will be left as NA. Default is Inf (no upper limit).

...

Additional parameters passed to stinepack::stinterp

Details

The function applies both minimum and maximum gap criteria:

  • Gaps shorter than min_gap are left as NA

  • Gaps longer than max_gap are left as NA

  • Only gaps that meet both criteria are interpolated

If both parameters are specified, min_gap must be less than or equal to max_gap.

Stineman interpolation is particularly good at preserving the shape of the data and avoiding overshooting.

Value

A numeric vector with NA values replaced by interpolated values where gap length criteria are met.

Examples

## Not run: 
x <- c(1, NA, NA, 4, 5, NA, NA, NA, 9)
replace_na_stine(x)  # interpolates all gaps
replace_na_stine(x, min_gap = 2)  # only gaps >= 2
replace_na_stine(x, max_gap = 2)  # only gaps <= 2
replace_na_stine(x, min_gap = 2, max_gap = 3)  # gaps between 2 and 3

## End(Not run)

Replace Missing Values with a Constant Value

Description

Replaces missing values with a specified constant value, with control over both minimum and maximum gap sizes to fill.

Usage

replace_na_value(x, value, min_gap = 1, max_gap = Inf)

Arguments

x

A vector containing numeric data with missing values (NAs)

value

Numeric value to use for replacement

min_gap

Integer specifying minimum gap size to fill. Gaps shorter than this will be left as NA. Default is 1 (fill all gaps).

max_gap

Integer or Inf specifying maximum gap size to fill. Gaps longer than this will be left as NA. Default is Inf (no upper limit).

Details

The function applies both minimum and maximum gap criteria:

  • Gaps shorter than min_gap are left as NA

  • Gaps longer than max_gap are left as NA

  • Only gaps that meet both criteria are filled

If both parameters are specified, min_gap must be less than or equal to max_gap.

Value

A numeric vector with NA values replaced by the specified value where gap length criteria are met.

Examples

## Not run: 
x <- c(1, NA, NA, 4, 5, NA, NA, NA, 9)
replace_na_value(x, value = 0)  # fills all gaps with 0
replace_na_value(x, value = -1, min_gap = 2)  # only gaps >= 2
replace_na_value(x, value = -999, max_gap = 2)  # only gaps <= 2
replace_na_value(x, value = 0, min_gap = 2, max_gap = 3)  # gaps between 2 and 3

## End(Not run)

Rotate coordinates in Cartesian space

Description

[Experimental]

Rotates coordinates in Cartesian space based on two alignment points. The rotation aligns these points either with the 0-degree axis (parallel) or makes them perpendicular to it. This is particularly useful for creating egocentric reference frames or standardizing orientation across multiple frames or individuals.

Usage

rotate_coords(data, alignment_points, align_perpendicular = FALSE)

Arguments

data

movement data frame with columns: time, individual, keypoint, x, y

alignment_points

character vector of length 2 specifying the keypoint names to use for alignment

align_perpendicular

logical; if TRUE, alignment_points will be rotated to be perpendicular to the 0-degree axis. If FALSE (default), alignment_points will be rotated to align with the 0-degree axis

Details

The function processes each individual separately and maintains their independence. For each time point, it:

  1. Calculates the vector between the alignment points

  2. Determines the current angle of this vector

  3. Rotates all points to achieve the desired alignment

Value

movement data frame with rotated coordinates
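
Examples

## Not run: 
# Illustrative sketch; the keypoint names are assumptions.
# Align the nose-tail axis with the 0-degree axis:
rotated <- rotate_coords(data, alignment_points = c("nose", "tail"))

# Make the ear-to-ear axis perpendicular to the 0-degree axis:
rotated <- rotate_coords(data,
                         alignment_points = c("ear_left", "ear_right"),
                         align_perpendicular = TRUE)

## End(Not run)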


Adjust time values to reflect a new framerate

Description

This function modifies time values in a dataset to match a new framerate and updates the corresponding metadata. It handles both integer and non-integer time values, ensuring time series start from zero when appropriate.

Usage

set_framerate(data, framerate, old_framerate = 1)

Arguments

data

A data frame or tibble containing the time series data

framerate

The new target framerate to convert to

old_framerate

The original framerate of the data (defaults to 1)

Details

The function calculates a scaling factor based on the ratio of old to new framerates. For integer time values, it ensures they start from zero. All time values are then scaled proportionally to maintain relative temporal relationships.

Value

A modified data frame with adjusted time values and updated metadata

Examples

data <- data.frame(time = 0:10, value = rnorm(11))
result <- set_framerate(data, framerate = 60, old_framerate = 30)

Assign a new individual identifier to all rows in a dataset

Description

This function replaces any existing individual identifiers with a new specified identifier across all rows in the dataset. The data is first ungrouped to ensure consistent application of the new identifier.

Usage

set_individual(data, individual)

Arguments

data

A data frame or tibble containing the data to be modified

individual

The new identifier value to be assigned to all rows

Value

A modified data frame with the new individual identifier applied as a factor

Examples

data <- data.frame(time = 1:5, value = rnorm(5))
result <- set_individual(data, "subject_A")

Set starting datetime

Description

[Experimental]

Usage

set_start_datetime(data, start_datetime)

Arguments

data

movement data frame

start_datetime

Starting datetime, provided either as POSIXt or as a string that can be parsed by the anytime package.

Value

movement data frame with starting datetime in metadata
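
Examples

## Not run: 
# Minimal sketch; the datetime string is an assumption and is
# parsed by the anytime package:
data <- set_start_datetime(data, "2024-01-01 12:00:00")

## End(Not run)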


Set UUID

Description

[Experimental]

Adds a unique identifier (UUID) to the data frame's metadata

Usage

set_uuid(data, length = 20)

Arguments

data

movement data frame

length

Length of the identifier (default: 20)

Value

data frame with the "uuid" metadata field filled out
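
Examples

## Not run: 
# Minimal sketch, assuming `data` is a movement data frame:
data <- set_uuid(data)
get_metadata(data)

## End(Not run)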


Transform coordinates to egocentric reference frame

Description

[Experimental]

Transforms Cartesian coordinates into an egocentric reference frame through a two-step process: translation followed by rotation. First translates all coordinates relative to a reference keypoint, then rotates the coordinate system based on specified alignment points.

Usage

transform_to_egocentric(
  data,
  to_keypoint,
  alignment_points,
  align_perpendicular = FALSE
)

Arguments

data

movement data frame with columns: time, individual, keypoint, x, y

to_keypoint

character; keypoint to use as the new origin

alignment_points

character vector of length 2 specifying the keypoint names to use for alignment

align_perpendicular

logical; if TRUE, alignment_points will be rotated to be perpendicular to the 0-degree axis. If FALSE (default), alignment_points will be rotated to align with the 0-degree axis

Details

This function combines translation and rotation to create an egocentric reference frame. It:

  1. Translates all coordinates relative to the specified keypoint (to_keypoint)

  2. Rotates the coordinate system based on the alignment points

The translation makes the reference keypoint the new origin (0,0), while the rotation standardizes the orientation. This is particularly useful for:

  • Creating egocentric reference frames

  • Standardizing pose data across frames or individuals

  • Analyzing relative motion patterns

Value

movement data frame in egocentric reference frame

Examples

## Not run: 
# Transform coordinates to make nose the origin and align body axis
transformed_data <- transform_to_egocentric(
  data,
  to_keypoint = "nose",
  alignment_points = c("nose", "tail"),
  align_perpendicular = FALSE
)

# Transform to make nose origin and ears perpendicular to forward axis
transformed_data <- transform_to_egocentric(
  data,
  to_keypoint = "nose",
  alignment_points = c("ear_left", "ear_right"),
  align_perpendicular = TRUE
)

## End(Not run)

Translate coordinates (Cartesian)

Description

[Experimental]

Translates coordinates in Cartesian space. Takes either a single point (to_x and to_y), vectors with the same length as the time dimension, or a keypoint (to_keypoint); the latter can be used to transform the data into an egocentric reference frame.

Usage

translate_coords(data, to_x = 0, to_y = 0, to_z = NULL, to_keypoint = NULL)

Arguments

data

movement data frame with columns: time, individual, keypoint, x, y

to_x

x coordinates; either a single value or a time-length vector

to_y

y coordinates; either a single value or a time-length vector

to_z

z coordinates (only if 3D); either a single value or a time-length vector

to_keypoint

All other coordinates become relative to this keypoint

Value

movement data frame with translated coordinates
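
Examples

## Not run: 
# Minimal sketch; the keypoint name is an assumption.
# Translate all coordinates so the "nose" keypoint becomes the origin:
translated <- translate_coords(data, to_keypoint = "nose")

# Translate coordinates relative to a fixed point:
translated <- translate_coords(data, to_x = 100, to_y = 50)

## End(Not run)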