IoT and Machine Learning for Pure Gas Detection


In a paper published in the journal Applied Sciences, researchers integrated the Internet of Things (IoT) with machine learning (ML) to distinguish pure gases in various applications. They networked gas sensors for continuous, real-time monitoring, generating data for ML models.

Using supervised algorithms like random forests, they created accurate classification models based on spectral signatures and sensor responses. This integration enables robust, automated gas detection with high accuracy and minimal delay. The approach enhances safety, efficiency, and sustainability in industrial and commercial scenarios.

 

Background

Past work has explored refrigerants’ environmental and health impacts, emphasizing the complex trade-offs between reducing ozone depletion and global warming potential. Studies highlight the need for comprehensive policies and the transition to low global warming potential (GWP) alternatives like difluoromethane (R32). Advancements in ML and sensor technology enhance refrigerant management, while European fluorinated gas (F-gas) regulations drive market trends toward sustainable refrigeration solutions.

Gas Classification Workflow

The workflow for classifying refrigerant gases combines IoT technology and ML, focusing on 1,1,1,2-tetrafluoroethane (R134a) and R32. During the signal acquisition phase, sensors monitor the gases and collect real-time data, which is then processed on a Raspberry Pi.

Feature extraction transforms raw sensor data into a machine learning-compatible format using the ‘tsfresh’ library. A random forest (RF) classifier is then trained to distinguish between the gases based on the extracted features, enabling automated gas classification.

The mobile device integrates IoT capabilities for user identification via an external mobile application and unique quick response (QR) codes for the device and the refrigerant storage bottle. The device’s QR code is linked to its media access control (MAC) address, which is recorded and saved in a database to guard against data manipulation. The QR code on the refrigerant bottle is also scanned and stored with the measurement results in an external database, so each measurement can be traced to a specific device and bottle.

An STM32 microcontroller manages hardware peripherals and rapid data acquisition, sampling at 1 point per millisecond. The infrared (IR) source is pulsed at 10 Hz with a 62% duty cycle, so each 100 ms pulse period yields 100 data points per channel, of which 60 are used. Data is collected over 1000 pulses (about 100 seconds of acquisition) and transferred to a Raspberry Pi for analysis.
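The acquisition arithmetic above can be illustrated with a minimal sketch. It assumes a single channel arriving as one flat sample stream, and it assumes the 60 retained points are the first 60 of each pulse; the post does not say which 60 are kept, so that choice and the helper name `frame_pulses` are illustrative only.

```python
import numpy as np

SAMPLE_RATE_HZ = 1000   # 1 point per millisecond (from the text)
PULSE_RATE_HZ = 10      # IR source pulsation frequency (from the text)
POINTS_PER_PULSE = SAMPLE_RATE_HZ // PULSE_RATE_HZ  # samples in one pulse period
POINTS_USED = 60        # points retained per pulse (from the text)
N_PULSES = 1000         # pulses per measurement (from the text)


def frame_pulses(stream, n_pulses=N_PULSES):
    """Split a flat sample stream into pulses and keep a fixed window of each."""
    stream = np.asarray(stream, dtype=float)
    needed = n_pulses * POINTS_PER_PULSE
    if stream.size < needed:
        raise ValueError(f"need {needed} samples, got {stream.size}")
    pulses = stream[:needed].reshape(n_pulses, POINTS_PER_PULSE)
    # ASSUMPTION: the retained window is the first 60 samples of each pulse.
    return pulses[:, :POINTS_USED]


# Synthetic stand-in for one sensor channel (not real measurement data).
rng = np.random.default_rng(0)
stream = rng.normal(size=N_PULSES * POINTS_PER_PULSE)
frames = frame_pulses(stream)
acquisition_seconds = N_PULSES / PULSE_RATE_HZ
```

Each row of `frames` is then one pulse, which is a convenient unit to pass on to feature extraction.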

Feature extraction, crucial for signal classification, transforms raw signals into features that an ML model can use. The ‘tsfresh’ library automates this process, handling large time series datasets efficiently. In the RF classifier, each decision tree (DT) is built from a random subset of the training data and features, which helps mitigate overfitting and improves the model’s ability to generalize.

The RF model was configured with 300 trees and a maximum depth of 5, a setting that balanced robust performance against computational cost while handling dataset variability.
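A minimal sketch of such a configuration using scikit-learn is shown below. Only the tree count and maximum depth come from the text; the choice of scikit-learn, the synthetic features, and the remaining settings are assumptions. scikit-learn's defaults (bootstrap sampling and a random feature subset at each split) provide the randomization over data and features.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in features for two gases (not real measurements).
rng = np.random.default_rng(0)
n_per_class, n_features = 100, 8
X = np.vstack([
    rng.normal(loc=0.0, size=(n_per_class, n_features)),
    rng.normal(loc=1.5, size=(n_per_class, n_features)),
])
y = np.array(["R134a"] * n_per_class + ["R32"] * n_per_class)

# 300 trees with maximum depth 5, as stated in the text.
clf = RandomForestClassifier(n_estimators=300, max_depth=5, random_state=0)
clf.fit(X, y)

deepest_tree = max(tree.get_depth() for tree in clf.estimators_)
train_accuracy = clf.score(X, y)
```

Capping the depth keeps individual trees small, which limits both overfitting and prediction cost on a device such as a Raspberry Pi.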

Model Performance Analysis

The trained models were evaluated with a stratified holdout methodology, which partitions the dataset into training and test sets while preserving class proportions. The team defined true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) for a multi-class classification problem so that performance could be assessed per class.
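The sketch below illustrates both ideas with scikit-learn: a stratified split, and per-class TP, FP, FN, and TN counts derived one-vs-rest from a confusion matrix. The class labels, class sizes, 25% test fraction, and toy predictions are placeholders; the post does not give the actual split ratio.

```python
from collections import Counter

import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

LABELS = ["R134a", "R32", "R32_diluted"]  # placeholder class names

# Placeholder dataset with unequal class sizes.
y = np.array(["R134a"] * 40 + ["R32"] * 40 + ["R32_diluted"] * 20)
X = np.arange(len(y)).reshape(-1, 1)

# stratify=y keeps the class proportions the same in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
test_counts = Counter(y_test.tolist())


def one_vs_rest_counts(y_true, y_pred, labels):
    """Per-class TP/FP/FN/TN, treating each class in turn as 'positive'."""
    cm = confusion_matrix(y_true, y_pred, labels=labels)  # rows: true, cols: predicted
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp   # predicted as the class, but actually another
    fn = cm.sum(axis=1) - tp   # actually the class, but predicted as another
    tn = cm.sum() - tp - fp - fn
    return {
        label: {"TP": int(tp[i]), "FP": int(fp[i]), "FN": int(fn[i]), "TN": int(tn[i])}
        for i, label in enumerate(labels)
    }


# Toy predictions, chosen by hand to include a few misclassifications.
toy_true = ["R134a", "R134a", "R32", "R32", "R32_diluted", "R32_diluted"]
toy_pred = ["R134a", "R32", "R32", "R32", "R32_diluted", "R134a"]
counts = one_vs_rest_counts(toy_true, toy_pred, LABELS)
```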

Accuracy, precision, recall, and the F1 score were the primary metrics used to assess the model’s performance. These metrics were calculated for two distinct classification tasks: a binary classification distinguishing between refrigerant gases R32 and R134a, and a twelve-class classification differentiating among R32, R134a, and various dilution levels.
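For reference, the standard per-class definitions of these metrics are given below, where k indexes a class, TP_k, FP_k, and FN_k are its one-vs-rest counts, and N is the total number of test samples. The post does not say how per-class scores were averaged in the twelve-class task, so no averaging scheme is assumed here.

```latex
\begin{align}
\text{Accuracy} &= \frac{\sum_{k} TP_k}{N} \\
\text{Precision}_k &= \frac{TP_k}{TP_k + FP_k} \\
\text{Recall}_k &= \frac{TP_k}{TP_k + FN_k} \\
F1_k &= \frac{2 \, \text{Precision}_k \, \text{Recall}_k}{\text{Precision}_k + \text{Recall}_k}
\end{align}
```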

In the first experiment, focusing on gas type classification, a confusion matrix shows high accuracy in differentiating between R134a and R32, with the diagonal elements indicating correct predictions and off-diagonal elements showing misclassifications. The precision, recall, and F1 scores for both gases are consistently high, reflecting the model’s effectiveness in identifying and classifying these gases. This strong performance underscores the model’s suitability for precise gas detection applications, such as environmental monitoring and industrial safety.

The second experiment, addressing gas type and dilution level classification, provides detailed performance metrics across twelve distinct classes. The confusion matrix reveals variable performance, with perfect scores for certain dilution levels but challenges at lower concentrations and some mid-level dilutions. These results highlight areas where the model excels and areas needing improvement, particularly distinguishing gases at lower and mid-range dilutions. The analysis points to potential enhancements in feature extraction and sensor technology to address these issues.

Data visualization with t-distributed stochastic neighbor embedding (t-SNE) revealed distinct clustering patterns for different gases and dilution levels, validating the effectiveness of feature extraction while highlighting challenges like overlapping dilution levels. These insights are crucial for refining preprocessing methods and enhancing model accuracy for real-world gas detection systems.
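A t-SNE projection of this kind might be produced as in the sketch below, which uses scikit-learn on synthetic feature vectors. The three synthetic clusters, the perplexity value, and the PCA initialization are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in: three well-separated groups of 20-dimensional features.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 20))
X = np.vstack([center + rng.normal(size=(40, 20)) for center in centers])
labels = np.repeat(np.arange(3), 40)  # group index for colouring a scatter plot

# Project to 2-D; perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=30.0, init="pca", random_state=0)
embedding = tsne.fit_transform(X)
```

Plotting `embedding[:, 0]` against `embedding[:, 1]`, coloured by `labels`, shows whether classes form separate clusters. Distances between clusters in a t-SNE plot are not directly meaningful, so overlap is more informative than spacing.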

Conclusion

In summary, integrating IoT technology with ML algorithms effectively classified refrigerant gases R32 and R134a. The gas sensors produced a dataset that was used to train the models, with RF showing the best performance. This approach improved real-time gas detection and operational efficiency. Future work aims to expand the dataset and explore additional ML techniques to enhance model accuracy.
