Table of Contents
Intro
Preface
Organization
Abstracts of Keynote Talks
Privacy in the Era of Big Data, Machine Learning, IoT, and 5G
Don't Handicap AI without Explicit Knowledge
Extreme-Scale Model-Based Time Series Management with ModelarDB
Big Minds Sharing their Vision on the Future of AI (Panel)
Contents: Part II
Contents: Part I
Authenticity, Privacy, Security and Trust
Less is More: Feature Choosing under Privacy-Preservation for Efficient Web Spam Detection
1 Introduction
2 The PPGAFS Approach
2.1 Preselecting Privacy-Preserving Features
2.2 Generating Minimum Feature Subset Based on the Improved GA
3 Spam Detection and Verification Experiment Analysis
3.1 Web Spam Detection Procedure
3.2 Dataset and Evaluation Measures
3.3 Experiment Design and Result Analysis
4 Conclusion
References
Construction of Differentially Private Summaries Over Fully Homomorphic Encryption
1 Introduction
2 Preliminaries
2.1 Homomorphic Encryption
2.2 Differential Privacy
3 Related Work
3.1 Combination of Homomorphic Encryption and Differential Privacy
3.2 Range Queries Under Differential Privacy
4 Proposed Method
4.1 Overview
4.2 Adoption of Differential Privacy over Fully Homomorphic Encryption
4.3 Security Analysis
5 Experimental Evaluation
5.1 Experimental Setup
5.2 DP-Summary Construction Time
5.3 Accuracy of DP-Summary
6 Conclusion
References
SafecareOnto: A Cyber-Physical Security Ontology for Healthcare Systems
1 Introduction
2 Safecare Ontology
3 Knowledge Acquisition
4 Formalization and Implementation
4.1 Concepts Identification
4.2 Relationships Identification
4.3 Axioms Definition
4.4 Implementation
5 Safecare Use Cases
6 Related Work
7 Conclusion
References
Repurpose Image Identification for Fake News Detection
1 Introduction
2 Related Work
3 Proposed Framework
3.1 Event Type Classifier
3.2 Image Repurpose Detector
4 Experimental Evaluation
4.1 Experimental Datasets
4.2 Experiments on Event Type Classification
4.3 Comparative Study
4.4 Variants of RECAST
4.5 Case Study
5 Conclusion
References
Data and Information Processing
An Urgency-Aware and Revenue-Based Itemset Placement Framework for Retail Stores
1 Introduction
2 Proposed Framework of the Problem
3 URIP: Urgency-Aware Itemset Placement Scheme
4 Performance Evaluation
5 Conclusion
References
NV-QALSH: An NVM-Optimized Implementation of Query-Aware Locality-Sensitive Hashing
1 Introduction
2 Preliminaries
2.1 The c-ANN Search Problem
2.2 The QALSH Method
2.3 Non-Volatile Memory
2.4 LB-Tree and LB-QALSH
3 Optimization Designs
3.1 Three-Level Storage Architecture
3.2 Leaf Node Optimization
3.3 Collision Counting Granularity Optimization
4 Experiments
4.1 Experiment Setup
4.2 Datasets and Queries
4.3 Evaluation Metrics
4.4 Benchmark Methods
4.5 Results and Analysis
5 Conclusion
References
NCRedis: An NVM-Optimized Redis with Memory Caching
1 Introduction
2 Implementation of NCRedis
2.1 Architecture of NCRedis
2.2 Log-Free Designs of LFSlab
2.3 Handling Persistent Memory Leak by LFSlab
2.4 Log-Free Designs of NCRedis
3 Evaluation
3.1 Experimental Setup
3.2 Memtier Benchmark Test
4 Conclusions
References
A Highly Modular Architecture for Canned Pattern Selection Problem
1 Introduction
2 System Architecture
2.1 Graph Similarity Module
2.2 Graph Clustering Module
2.3 Graph Connection Module
2.4 Pattern Mining Module
3 Conclusions
References
AutoEncoder for Neuroimage
1 Introduction
2 The Proposed Approach
2.1 Variational AutoEncoder Based Regression
2.2 Supervised Linear Autoencoder
2.3 Implementation Details
3 Experiments
4 Conclusion
References
Knowledge Discovery
Towards New Model for Handling Inconsistency Issues in DL-Lite Knowledge Bases
1 Introduction
2 Related Works
3 DL-Lite Ontology and Management of Inconsistencies: An Overview
4 Most-Possible Repair Proposed Approach
4.1 Most-Possible Repair Algorithm
4.2 Experimental Study and Results Analysis
5 Conclusion and Prospects
References
ContextWalk: Embedding Networks with Context Information Extracted from News Articles
1 Introduction
2 Related Work
3 Dataset
3.1 Challenges
4 Algorithm
4.1 Context Embedding
4.2 ContextWalk
4.3 Complexity
5 Experiments
5.1 Compare Clusterings
5.2 Network and Embedding Distances
6 Discussion
References
FIP-SHA: Finding Individual Profiles Through SHared Accounts
1 Introduction
2 Background
3 Related Work
4 FIP-SHA
4.1 Session Representation
5 Experimental Evaluation Setup and Metrics
6 Results
6.1 Cut Off Sessions
6.2 Clustering
6.3 Analysis of (Weighted) User Separation
6.4 Discussion
7 Final Considerations
References
A Tag-Based Transformer Community Question Answering Learning-to-Rank Model in the Home Improvement Domain
1 Introduction
2 Related Work
3 Task Definition
4 Our Approach
4.1 Transformer Models
4.2 Input and Tag Representation
4.3 CQA Pair Matching Model
4.4 Model Optimisation
4.5 Candidate Answers Ranking
5 Dataset Building and Validation
5.1 Subjective CQA
5.2 Gold Standard Definition
6 Evaluation
6.1 Experiment Setup
6.2 Rank-Aware Evaluation Metrics
6.3 Results
7 Conclusion
References
An Autonomous Crowdsourcing System
1 Introduction
2 Related Work
3 Crowdsourcing Task
3.1 Workflow
4 Experimental Evaluation
4.1 Experimental Setup
4.2 Results
5 Conclusion
References
Machine Learning
The Effect of IoT Data Completeness and Correctness on Explainable Machine Learning Models
1 Introduction
2 Related Work
3 Method
4 Observation, Analysis and Validation
5 Conclusion
References
Analysis of Behavioral Facilitation Tweets for Large-Scale Natural Disasters Dataset Using Machine Learning
1 Introduction
2 Related Work
3 Extraction of Behavioral Facilitation Tweets
3.1 A Classifier Based on LSTM
3.2 A Classifier Based on BiLSTM
3.3 A Classifier Based on BERT
4 Experiment 1: Comparison of Models for Classification Accuracy
4.1 Data
4.2 Method
4.3 Result
5 Experiment 2: Analysis Characteristics of BF-Tweets in a Large-Scale Disaster Situation
5.1 Experimental Conditions
5.2 Results
5.3 Discussion
6 Conclusion
References
Using Cross Lingual Learning for Detecting Hate Speech in Portuguese
1 Introduction
2 Related Work
3 Methodology
4 Evaluation and Results
5 Final Remarks
References
MMEnsemble: Imbalanced Classification Framework Using Metric Learning and Multi-sampling Ratio Ensemble
1 Introduction
2 Related Work: Resampling Approaches
2.1 Oversampling
2.2 Undersampling
3 MMEnsemble
3.1 Base Ensemble Classifier: MLEnsemble
3.2 Ensemble Using Asset-Based Weighting
4 Experimental Evaluation
4.1 Settings
4.2 Results
4.3 Lessons Learned
5 Conclusion
References
Evaluate the Contribution of Multiple Participants in Federated Learning
1 Introduction
2 Method
2.1 Shapley Value for Models
2.2 Invalid Shapley Value
2.3 Method
2.4 Properties
3 Experiment
3.1 Utility Function
3.2 Noisy Labels
4 Conclusion
References
DFL-Net: Effective Object Detection via Distinguishable Feature Learning
1 Introduction
2 Related Work
3 Design of DFL-Net
3.1 High-Level Idea of DFL-Net
3.2 Full-Scale Fusion
3.3 Attention Guided Feature Refinement
4 Performance Evaluation
4.1 Settings
4.2 Results
4.3 Ablation Study
5 Conclusion and Future Work
References
Transfer Learning for Larger, Broader, and Deeper Neural-Network Quantum States
1 Introduction
2 Related Work
3 Background
3.1 Quantum Many-Body Systems
3.2 Deep Neural-Network Quantum States
4 Methodology
5 Performance Evaluation
5.1 Broader Networks
5.2 Deeper Networks
6 Conclusion
References
LGTM: A Fast and Accurate kNN Search Algorithm in High-Dimensional Spaces
1 Introduction
2 Theoretical Motivation
2.1 Preliminary
2.2 Theoretical Foundation
3 LGTM: From Theory to Practice
3.1 Pre-processing
3.2 Online (Query) Processing
4 Experiment
4.1 Comparison with AKNNG
4.2 Comparison with State-of-the-art Algorithms
5 Conclusion
References
TSX-Means: An Optimal K Search Approach for Time Series Clustering
1 Introduction
2 Notations and Definitions
3 TSX-Means: A New Method for Time Series Clustering
3.1 Principle of the Method
3.2 TSX-Means Algorithm
4 Experimental Results
5 Conclusion and Perspectives
References
A Globally Optimal Label Selection Method via Genetic Algorithm for Multi-label Classification
1 Introduction
2 Preliminaries
3 The Proposed Method
3.1 Uninformative Label Reduction via EBMD
3.2 Most Informative Label Selection via GA
3.3 Label Selection Algorithm Combining EBMD and GA
4 Experiments
4.1 Basic Experimental Settings
4.2 Experimental Results and Analysis
5 Conclusions