In this paper we present a comparison between standard computer vision techniques and a Deep Learning approach for automatic metal corrosion (rust) detection. For the classic approach, a classification based on the number of pixels containing specific red components has been utilized. The code, written in Python, used OpenCV libraries to compute and categorize the images. For the Deep Learning approach, we chose Caffe, a powerful framework developed at the “Berkeley Vision and Learning Center” (BVLC). The test has been performed by classifying images and calculating the total accuracy for the two different approaches.
for infrastructure asset managers, so is human subjectivity. Infrastructure operators are nowadays requesting methods to analyse pixel-based datasets without the need for human intervention and interpretation. The desired end result is to conclude objectively whether their assets present a fault or not. Currently, this conclusion varies according to the person doing the image interpretation and analysis. Results are therefore inconsistent, since the existence of a fault is interpreted differently depending on the individual: where one individual sees a fault, another may not. Developing an objective fault recognition system would add value to existing datasets by providing a reliable baseline for infrastructure asset managers. One of the key indicators most asset managers look for during inspections is the presence of corrosion. Therefore, this feasibility study has focused on automatic rust detection. This project created an autonomous classifier able to detect rust present in pictures or frames. The challenge associated with this approach is that rust has no defined shape or colour. Moreover, the changing landscape and the presence of misleading objects (red-coloured leaves, houses, road signs, etc.) may lead to misclassification of the images. Furthermore, the classification process should still be relatively fast, in order to be able to process a large amount of video in a reasonable time.
2. APPROACH USED
Some authors have tried to solve similar problems using “watershed segmentation” [4] for coated materials, supervised classification schemes [5-6] for cracks and corrosion in sewer pipes and metal, or Artificial Neural Networks [7] for corrosion in vessels. We decided to implement one classic computer vision classifier (based on the red colour component) and one Deep Learning model, and to perform a comparison test between the two approaches. Many different frameworks and libraries are available for both the classic computer vision techniques and the Deep Learning approach.
2.1. Classic Computer Vision Technique
For almost two decades, developers in computer vision have relied on OpenCV[8] libraries to
develop their solutions. With a user community of more than 47 thousand people and an estimated number of downloads exceeding 7 million, this set of more than 2500 algorithms [8] and useful functions can be considered a standard library for image and video applications. The library has two interfaces, C++ and Python. However, since Python-OpenCV is just a wrapper around the C++ functions (which perform the real computation-intensive work), the loss in performance when using the Python interface is often negligible. For these reasons we chose to develop our first classifier
using this set of tools. The classifier was relatively basic. Since a corroded area (rust) has no clear
shape, we decided to focus on the colours, and in particular the red component. After basic
filtering, we changed the image colour space from RGB to HSV, in order to reduce the impact of
illumination on the images[6]. After the conversion we extracted the red component from the
HSV image (in OpenCV, Hue range is [0,179], Saturation range is [0,255] and Value range is
[0,255]). Since the red component spans a non-contiguous interval (the red colour range in HSV is around 160-180 and 0-20 for the H component), we had to split the image into two masks, filter each of them and then add them back together. Moreover, because not all of the red interval was useful for rust detection, we empirically narrowed down the hue range in order to find the best interval that did not produce too many false positives. After extensive testing we found the best interval for rust detection to be 0-11 and 175-180. We also flattened the S and V components to the range 50-255. The resulting mask was then converted into black and white and the white pixels were counted. Every image having more than 0.3% of white pixels was finally classified as
“rust”, while images with less than 0.3% of white pixels were classified as “non-rust”. Below are the core snippets of the classification code, wrapped here in a small function for clarity:
# img is a BGR image as returned by cv2.imread
import cv2
import numpy as np

def is_rust(img):
    # Convert to HSV to reduce the impact of illumination
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # Red hue ranges found empirically for rust: H in 0-11 and 175-179,
    # with S and V flattened to 50-255
    lower_red = np.array([0, 50, 50])
    upper_red = np.array([11, 255, 255])
    lower_red2 = np.array([175, 50, 50])
    upper_red2 = np.array([179, 255, 255])
    # Threshold the HSV image to get only red colours
    mask1 = cv2.inRange(hsv, lower_red, upper_red)
    mask2 = cv2.inRange(hsv, lower_red2, upper_red2)
    mask = mask1 + mask2
    # Convert the combined mask to a binary (black and white) image
    ret, maskbin = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)
    # Calculate the percentage of white (red-matching) pixels
    height, width = maskbin.shape
    size = height * width
    percentage = cv2.countNonZero(maskbin) / float(size)
    # Classify as "rust" if more than 0.3% of the pixels are red
    if percentage > 0.003:
        return True
    else:
        return False
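As a usage illustration, the function above could also be applied frame by frame to inspection videos. The snippet below is a hypothetical sketch (the video file name is made up) and is not part of the original classification code:

cap = cv2.VideoCapture('inspection_video.mp4')  # hypothetical inspection video
rust_frames = 0
total_frames = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    total_frames += 1
    if is_rust(frame):
        rust_frames += 1
cap.release()
print('%d of %d frames classified as rust' % (rust_frames, total_frames))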
2.2. Deep Learning Model
The second approach was based on artificial intelligence, in particular using Deep Learning
methods. This approach is not new. The mathematical model of back-propagation was first developed in the 1970s and was later applied by Yann LeCun in [9], one of the first real applications of Deep Learning. However, a major step forward was made in 2012 when Geoff Hinton's team won the ImageNet competition using a Deep Learning network, outperforming more classic algorithms. Among the many available frameworks, such as Torch [10], the Theano library for Python, or the more recent TensorFlow [11] released by Google, we chose Caffe from the “Berkeley Vision and Learning Center” (BVLC) [12]. This framework is specifically suited for image processing, offering good speed and great flexibility. It also makes it easy to use clusters of GPUs for model training, which could be useful in the case of large networks. Furthermore, it is released under a BSD 2-Clause license. The first step was to collect a good dataset to be
used to train the network. We were able to collect around 1300 images for the “rust” class and
2200 images for the “non-rust” class. Around 80% of the images were used for the training set,
while the rest was used for the validation set. Since the dataset was relatively small, we decided
to fine-tune an existing model called “bvlc_reference_caffenet”, which is based on the AlexNet model and released under a license for unrestricted use. In fine-tuning, the framework takes an already trained network and adjusts it (resuming the training) using the new data as input. This technique provides several advantages. First of all, it allows the reuse of previously trained networks, saving a lot of time. Furthermore, since “bvlc_reference_caffenet” has already been pre-trained with 1 million images, the network has prior “knowledge” of suitable weight
parameters for the initial layers. We could thus reuse that information and avoid over-fitting problems (an excessively complex model with not enough data to constrain it). The last layer of the model was also modified to reflect the rust requirements. In particular, the definition of layer 8 was changed to:
layer {
  name: "fc8_orbiton"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_orbiton"
  param {
    lr_mult: 10
    decay_mult: 1
  }
  param {
    lr_mult: 20
    decay_mult: 0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
  }
}
Notice that the learning rate (lr) multiplier was set to 10 in order to make the last-layer weight parameters “move” more with respect to the other layers, where the learning rate multiplier was set to 1 (because they were already pre-trained). We also set the number of outputs to two to reflect our two categories, “rust”/“non-rust”. The images were resized to 256x256 pixels and the mean file for the new set of images was recalculated before we trained the model. The training was performed with a base learning rate of 0.00005 and 100,000 iterations in total, on an Ubuntu 14.04 machine with GPU CUDA support. The hardware included an Intel Core i7 (Skylake) CPU and an Nvidia GTX 980 Ti GPU. The training process took around 2 days.
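As an illustration, the fine-tuning step can be driven from the pycaffe interface. The sketch below assumes a hypothetical solver file (solver_rust.prototxt, specifying base_lr: 0.00005 and max_iter: 100000 as described above) and the standard pre-trained CaffeNet weights; it is not the exact script we used.

import caffe

caffe.set_mode_gpu()  # training was performed with CUDA GPU support

# Hypothetical solver file defining the network, base_lr: 0.00005 and max_iter: 100000
solver = caffe.SGDSolver('models/rust/solver_rust.prototxt')

# Fine-tuning: start from the weights of the pre-trained bvlc_reference_caffenet model
solver.net.copy_from('models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel')

# Resume training on the rust/non-rust dataset
solver.solve()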
3. TESTS
Testing and comparison of the trained models were performed by writing a small classification script in Python and using it to classify a new image set. This new set of images was different from the one used in the Deep Learning training and validation steps and consisted of 100 images, divided into two groups: 37 images of “rust” and 63 images of “non-rust”. The images were chosen as a mix of real-case images (pictures of bridges, metal sheets, etc.) and others added just to trick the algorithms, such as images of desert landscapes or red apple trees. Figure 1 shows some examples of the images used.
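A minimal sketch of such a classification script, using the pycaffe Classifier interface, is given below. The file names (deploy_rust.prototxt, the fine-tuned .caffemodel, the .npy mean file) and the class ordering are assumptions for illustration, not the exact script used in the tests.

import numpy as np
import caffe

caffe.set_mode_gpu()

# Hypothetical file names for the deploy definition, fine-tuned weights and image mean
net = caffe.Classifier('deploy_rust.prototxt',
                       'caffenet_rust_iter_100000.caffemodel',
                       mean=np.load('rust_mean.npy').mean(1).mean(1),  # per-channel mean
                       channel_swap=(2, 1, 0),  # model trained on BGR images
                       raw_scale=255,           # model trained on pixel values in [0, 255]
                       image_dims=(256, 256))

img = caffe.io.load_image('test_image.jpg')     # hypothetical test image, RGB in [0, 1]
probs = net.predict([img])[0]                   # probabilities for the two classes
label = 'rust' if probs.argmax() == 1 else 'non-rust'   # assuming class index 1 is "rust"
print('%s (confidence %.2f)' % (label, probs.max()))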
Figure 1: Examples of test images
4. RESULTS
Results of the test were divided into two groups:
1. False Positive: The images of “non-rust” which were wrongly classified as “rust”
2. False Negative: The images of “rust” which were wrongly classified as “non-rust”
For each algorithm, we counted the number of false positive and false negative occurrences. The partial accuracy for each class was also calculated based on the total number of images in that class. For example, the OpenCV classifier had 4 false negatives out of 37 “rust” images. This implies that 33 out of 37 images were correctly classified as “rust”, giving a partial accuracy for the “rust” class of 33/37 ≈ 89%. Similarly, for the “non-rust” class the partial accuracy is given by the correctly classified “non-rust” images (36) over the total number of “non-rust” images (63), i.e. 36/63 ≈ 57%. We also report a total accuracy, i.e. the number of correctly classified images over the total. In this case OpenCV classified correctly 69 images out of 100 (69%).
We repeated the same calculation for the Deep Learning model; the results are reported in column two of Table 1.
The Deep Learning classifier also provides a probability associated with each prediction. This
number reflects how confident the model is that the prediction is correct. Among the 100 images,
15 of those had a probability below 80%. We discarded those images and recalculated the
accuracy values (third column). This also means that for 15% of the images, the Deep Learning model was “undecided” about how to classify them. A complete summary of the results is
reported in Table 1.
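This filtering step is straightforward to express in code. The sketch below assumes probs is the two-element probability vector returned by the classification script above; the 0.8 threshold is the one used in our experiments.

CONFIDENCE_THRESHOLD = 0.8

def classify_with_confidence(probs):
    # Treat predictions whose best probability is below the threshold as "undecided"
    if probs.max() < CONFIDENCE_THRESHOLD:
        return None
    return 'rust' if probs.argmax() == 1 else 'non-rust'  # assuming class index 1 is "rust"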
Table 1. Models comparison: summary table

                                          OpenCV    Deep Learning   Deep Learning (probability > 0.8)
False Positives                           27/63     14/63           5/51
Partial accuracy for “non-rust”           57%       78%             90%
False Negatives                           4/37      8/37            7/34
Partial accuracy for “rust”               89%       78%             79.4%
Total accuracy (correct/total images)     69%       78%             88%
5. DISCUSSION
The results show a few interesting facts about the two approaches. The OpenCV-based model showed a total accuracy (over all images) of 69%. As expected, it presented a reduced accuracy (57%) for the “non-rust” classification, while it had high accuracy for the “rust” classification (almost 90%). The reason for this is clear: all the “rusty” images had red components, so it was easy for the algorithm to detect them. However, for the “non-rust” class, the presence of red pixels does not necessarily imply the presence of rust. So when we passed a red apple picture, the model simply detected the red component and misclassified it as “rust”, reducing the “non-rust” accuracy. All four pictures in Figure 1, for example, were classified by the OpenCV algorithm as “rust”, while only two of them actually contain rust. The few false negatives (where there was rust but it was not detected) seemed to be due mainly to bad illumination of the image, problems associated with colour (we also provided a few out-of-focus test images), or a rust spot that was too small (less than 0.3% of the image).
For the Deep Learning algorithm, things get more interesting. We noticed a more uniform accuracy (78% in total), with 78% for both the “rust” and the “non-rust” detection. This model is also more difficult to “trick”: for example, all the images in Figure 1 were correctly classified by the model, despite the fact that we never used any apple or desert image during the training process. We then analysed the most common pictures where it failed, to extract some useful information. Figure 2 shows a few examples of “non-rust” pictures wrongly classified as “rust” by the Deep Learning model. It is important to mention that all the pictures in Figure 2 were also misclassified by the OpenCV algorithm. We believe that in the first and last image, the presence of red leaves led the algorithm to believe that rust was present. In the second image, the rust on the concrete section was wrongly classified as rust on metal. The third image is more difficult to explain; a reasonable explanation may be the presence of the mesh pattern in the metal and a slight reddish drift of the colours.
Figure 3 shows some examples of pictures classified as “non-rust” that actually contain rust. They seem to have something in common, so the reason for the misclassification may be that the system has “never seen” anything similar. The two images on the top were correctly categorized by the OpenCV algorithm, while the two bottom ones were not.
A few considerations about the confidence level of the Deep Learning model are also interesting. We noticed that for most of the images the model gave a confidence above 80%. For 15% of the images, this confidence was less than 80%. Analysing these 15 images in detail, we discovered that 9 of them were wrongly classified, while only 6 were correct. By discarding these
images we were able to increase the total accuracy from 78% to 88%. The model therefore already provides a useful and reliable parameter that can be directly used to improve the overall accuracy.
Figure 2: Examples of pictures wrongly classified as “rust” by the Deep Learning model
Figure 3: Examples of pictures wrongly classified as “non-rust” by the Deep Learning model
Even more interesting are the results from a possible combination of the two algorithms. For 77 images, both algorithms agree on the result. Of these 77 images, only 12 (3+9) were wrong. This would have given a partial accuracy of 92% for the “rust” class and 78% for the “non-rust” class. Another interesting solution would be to use the OpenCV classifier to filter out the “non-rust” images, and then pass the possible rust images to the Deep Learning model. In this case we could potentially
create a much more accurate system, with an accuracy of 90% for “rust” and 81% for “non-rust”. More complex solutions are also possible, for example by additionally discarding the “possible rust” images for which the Deep Learning model has a confidence level below 80%; a sketch of such a cascade is given below.
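The sketch below illustrates such a cascade, reusing the is_rust function, the pycaffe classifier net and the CONFIDENCE_THRESHOLD constant from the previous sketches. It is a hypothetical illustration of the combination discussed above, not a system we deployed.

def cascade_classify(img_bgr):
    # Stage 1: the cheap OpenCV red-component filter discards obvious "non-rust" images
    if not is_rust(img_bgr):
        return 'non-rust'
    # Stage 2: remaining "possible rust" images go to the Deep Learning model.
    # Convert the OpenCV BGR uint8 frame to the RGB float image pycaffe expects.
    img_rgb = img_bgr[:, :, ::-1].astype(np.float32) / 255.0
    probs = net.predict([img_rgb])[0]
    # Optionally discard low-confidence predictions as well
    if probs.max() < CONFIDENCE_THRESHOLD:
        return None
    return 'rust' if probs.argmax() == 1 else 'non-rust'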
6. CONCLUSIONS
In this paper we presented a comparison between two different models for rust detection: one based on red-component detection using the OpenCV library, and one using a Deep Learning model. We trained the Deep Learning model with more than 3500 images and tested both approaches with a new set of 100 images, finding that the Deep Learning model performs better in a real-case scenario. However, for a real application it may be beneficial to combine both systems, with the OpenCV model used to remove obvious “non-rust” images before the remaining candidates are passed to the Deep Learning method. The OpenCV-based algorithm may also be useful for classifying images where the Deep Learning algorithm has low confidence. In future work we will refine the model and train it with a new and larger dataset of images, which we believe will improve the accuracy of the Deep Learning model. Subsequently, we will perform field testing using real-time video from real bridge inspections.
ACKNOWLEDGEMENTS
We would like to thank Innovation Norway and Norwegian Centres of Expertise Micro- and
Nanotechnology (NCE-MNT) for funding this project and Statens vegvesen and Aas-Jakobsen
for providing image datasets. Some of the pictures used were also downloaded from pixabay.com.
REFERENCES
[1] A. Leibbrandt et al., “Climbing robot for corrosion monitoring of reinforced concrete structures”, 2nd International Conference on Applied Robotics for the Power Industry (CARPI), 2012, DOI: 10.1109/CARPI.2012.6473365.
[2] Jong Seh Lee, Inho Hwang, Don-Hee Choi, Sang-Hyun Hong, “Advanced Robot System for Automated Bridge Inspection and Monitoring”, IABSE Symposium Report, 12/2008, DOI: 10.2749/222137809796205557.
[3] “Bridge blown up, to be built anew”, newsinenglish.no, http://www.newsinenglish.no/2015/02/23/bridge-blown-up-to-be-built-anew/
[4] Gang Ji, Yehua Zhu, Yongzhi Zhang, “The Corroded Defect Rating System of Coating Material Based on Computer Vision”, Transactions on Edutainment VIII, Springer, Volume 7220, pp. 210-220.
[5] F. Bonnín-Pascual, A. Ortiz, “Detection of Cracks and Corrosion for Automated Vessels Visual Inspection”, Artificial Intelligence Research and Development: Proceedings of the 13th Conference.
[6] N. Hwang, H. Son, C. Kim, and C. Kim, “Rust Surface Area Determination of Steel Bridge Component for Robotic Grit-Blast Machine”, Proceedings of ISARC 2013, Paper 305.
[7] Moselhi, O. and Shehab-Eldeen, T. (2000). "Classification of Defects in Sewer Pipes Using Neural
Networks." J. Infrastruct. Syst., 10.1061/(ASCE)1076-0342(2000)6:3(97), 97-104.
[8] OpenCV, Computer Vision Libraries: OpenCV.org
[9] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard and L. D. Jackel, “Backpropagation Applied to Handwritten Zip Code Recognition”, Neural Computation, 1(4):541-551, Winter 1989.
[10] Torch, Scientific Computing Framework, http://torch.ch/
[11] TensorFlow, an open source software library for numerical computation, https://www.tensorflow.org/
[12] Caffe Deep Learning Framework, Berkeley Vision and Learning Center (BVLC), http://caffe.berkeleyvision.org/